Research uses big data to find cancer mutations in cells

Artificial intelligence and machine learning are among the latest tools cancer researchers are using to detect and treat the disease.

One of the scientists working in this new frontier of cancer research is Ryan Layer, PhD, a member of the University of Colorado Cancer Center, who recently published a study detailing his research that uses big data to find cancerous mutations in cells.

Identifying the genetic changes that cause healthy cells to become cancerous can help doctors select therapies that target the tumor specifically. For example, about 25% of breast cancers are HER2-positive, meaning the cells in this type of tumor have mutations that make them produce more of a protein called HER2 that helps them grow. Treatments specifically targeting HER2 have dramatically increased survival rates for this type of breast cancer.


Ryan Layer, assistant professor of computer science, CU Boulder

Scientists can evaluate cell DNA to identify mutations, Layer says, but the challenge is that the human genome is huge and mutations are a normal part of evolution.

“The human genome is long enough to fill a 1.2 million page book, and any two people can have about 3 million genetic differences,” he says. “Finding one cancer-causing mutation in a tumor is like finding a needle in a pile of needles.”

Scan the data

The ideal method for determining what type of cancer mutation a patient has is to compare two samples from the same patient, one from the tumor and one from healthy tissue. However, such tests can be complicated and costly, so Layer came up with another idea: using huge public DNA databases to find common mutations that are mostly benign, so that researchers can single out the rarer mutations that may be cancerous.

“There was a project called the Broad Institute’s Genome Aggregation Database, or gnomAD, where they brought together a bunch of different studies that happened within the Broad into the largest genetic database anyone’s ever thought of,” says Layer. “It was 65,000 individuals, and now it’s about half a million individuals. At the time, I was doing research in the undiagnosed rare disease clinic at the University of Utah, and the usefulness of that database was just unbelievable.”

Even if he was able to sequence a child with cancer and her parents, Layer says, there were often so many genetic mutations that it was difficult to determine which one caused the disease. Using gnomAD, he was able to see how common a particular variant was in a larger population, greatly reducing the number of therapeutic targets.
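The filtering idea described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the variant labels, the frequency table and the 0.1% cutoff are hypothetical, and the code does not call gnomAD or reproduce Layer's actual workflow.

```python
# Minimal sketch: discard variants that are common in a reference population.
# All names and numbers here are hypothetical, chosen for illustration only.

def filter_rare_variants(candidates, population_freq, max_freq=0.001):
    """Keep variants whose population frequency is at or below max_freq.

    Variants missing from the frequency table are treated as never
    observed (frequency 0.0), so they are kept as candidates.
    """
    return [v for v in candidates if population_freq.get(v, 0.0) <= max_freq]


# Three variants found in a patient (made-up labels).
candidates = ["chr1:12345:A>G", "chr7:55500:C>T", "chr17:41200:G>A"]

# Made-up population frequencies; the third variant was never observed.
population_freq = {
    "chr1:12345:A>G": 0.32,    # seen in about a third of people, likely benign
    "chr7:55500:C>T": 0.0004,  # rare
}

rare = filter_rare_variants(candidates, population_freq)
print(rare)  # ['chr7:55500:C>T', 'chr17:41200:G>A']
```

The common variant drops out, leaving two candidates instead of three. In practice the cutoff would depend on the disease being studied.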

Verify variants

Inspired by that experience, Layer began looking at other ways to use big data to identify potentially cancerous mutations. Knowing that detection of complex DNA mutations called structural variants (SVs) can often lead to false negatives, he and his colleagues developed a process that focuses on verification rather than detection. This method searches raw data from thousands of DNA samples for evidence that supports a specific structural variant.

“We scanned the SVs identified in previous cancer studies and found that thousands of SVs previously associated with cancer also exist in normal healthy samples,” Layer says. “This indicates that these variants are benign, hereditary sequences rather than disease-causing ones.”
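To make the verification idea concrete, here is a rough Python sketch. It assumes each healthy sample has already been reduced to a list of evidence records (chromosome, approximate breakpoints and variant type), and it counts how many samples support a given SV. The record layout, the 100-base tolerance and the two-record minimum are assumptions made for illustration; this is not the team's actual software.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class SV:
    chrom: str
    start: int
    end: int
    sv_type: str  # e.g. "DEL" (deletion), "DUP" (duplication)


def supports(evidence: SV, query: SV, slop: int = 100) -> bool:
    """True if an evidence record matches the query SV within `slop` bases."""
    return (
        evidence.chrom == query.chrom
        and evidence.sv_type == query.sv_type
        and abs(evidence.start - query.start) <= slop
        and abs(evidence.end - query.end) <= slop
    )


def count_supporting_samples(query: SV, cohort: Dict[str, List[SV]],
                             slop: int = 100, min_evidence: int = 2) -> int:
    """Count samples with at least `min_evidence` records supporting the query."""
    count = 0
    for evidence_list in cohort.values():
        hits = sum(1 for e in evidence_list if supports(e, query, slop))
        if hits >= min_evidence:
            count += 1
    return count


def classify(query: SV, cohort: Dict[str, List[SV]]) -> str:
    """An SV seen in any healthy sample is treated as likely inherited."""
    if count_supporting_samples(query, cohort) > 0:
        return "likely germline"
    return "candidate somatic"


# Hypothetical healthy cohort with made-up evidence records.
cohort = {
    "healthy_1": [SV("chr5", 1000, 5000, "DEL"), SV("chr5", 1020, 4990, "DEL")],
    "healthy_2": [SV("chr5", 1010, 5050, "DEL")],
    "healthy_3": [],
}

query_a = SV("chr5", 1005, 5005, "DEL")
query_b = SV("chr9", 200, 900, "DUP")

print(classify(query_a, cohort))  # likely germline
print(classify(query_b, cohort))  # candidate somatic
```

Because the question is whether a specific variant has support, not whether an algorithm happened to detect it, a variant that a detection tool missed in healthy samples can still be recognized as inherited.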

The team also found that the method performed just as well as the traditional strategy that requires both tumor and healthy samples, opening the door to reducing costs and increasing accessibility to high-quality cancer mutation analysis.

“With all the data out there for cancer, we were able to show that this method is really powerful for identifying not necessarily the driving mutation in cancer, but which variants are unique to the tumor, versus the rest of your body,” he says. “That way, tumor treatment can become superpersonal. We can say, ‘If you have this mutation, take this drug; if you don’t have this mutation, don’t take that drug.’”

Share the research

Layer’s lab has now launched a website where doctors can enter information about structural variants found in a patient’s tumor to see how common, and how potentially dangerous, they are. He also wants to build a larger cancer-focused dataset to better understand how and where tumors form.

“Our work so far has been to take a structural variant and see how common it is in a healthy population,” he says. “But what if we create indexes that let you search our populations? Let’s say you take a sample of a tumor in a lung and you find structural variants. Now you can look for those against prostate cancer and breast cancer and all other cancers, and it can help you identify: ‘What is the origin of the tumor? Did it metastasize, or did it originate in the lung?’ We can search the tumor databases to find other matched tumors for more personalized, drug-inspired treatments.”
