Cancer has emerged as a fundamentally genetic disease, where mutations in genes cause cells to go haywire. Yet finding out exactly which mutations are responsible for a certain type of cancer is a daunting task. A recent report in Nature, which details the cataloging of tens of thousands of mutations in tens of thousands of tumors, illustrates both the merits and the dangerous pitfalls of such an approach.
The article talks about the International Cancer Genome Consortium (ICGC), formed in 2008, whose task is to coordinate an international effort in which each participating country takes responsibility for documenting significant mutations in certain types of cancer. For instance, the US is doing six types including brain and lung, China is doing gastric, India is doing oral and Australia is doing pancreatic. The process would involve sequencing tens of thousands of genes from tumors. The goal is to find all the mutations that separate cancerous cells from normal ones.
Yet this goal immediately runs into David Hume's well-known problem of induction. If a mutation in a gene is observed in, say, 5% of tumors, would it be observed in all of them? More importantly, would it be significant as a causative agent? Consider the IDH1 gene, which encodes isocitrate dehydrogenase, a key enzyme in the all-important Krebs cycle. As the article notes, IDH1 was not regarded as significant when it initially showed up in a very small subset of certain kinds of tumors. But then it consistently showed up in 12% of brain tumors of a certain kind and 8% of tumors from leukemic patients. Thus, rather than being a chance occurrence, IDH1 now seems like a significant correlative factor for cancer. It is now a hot property in cancer genomics.
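To make the "chance occurrence" question concrete, here is a minimal sketch of the kind of back-of-the-envelope test one might run: given an assumed background rate at which a gene picks up a mutation purely by chance, how surprising is it to see that gene mutated in k out of n tumors? Every number below (the cohort size, the background rate, the count of genes tested) is hypothetical and chosen only for illustration; none of them come from the article. Real analyses have to account for things like gene length and tumor-specific mutation rates, which this toy ignores.

```python
from math import comb

def binomial_tail(k, n, p):
    """Probability of seeing k or more 'hits' in n trials, each with chance p."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))

# All values below are hypothetical, for illustration only.
n_tumors = 100           # assumed number of tumors sequenced
background_rate = 0.01   # assumed chance a passenger mutation lands in this gene
genes_tested = 20_000    # assumed number of genes examined in the survey

# Testing thousands of genes at once invites false positives, so the
# significance cutoff is tightened (a simple Bonferroni correction).
threshold = 0.05 / genes_tested

for observed in (2, 12):
    p_value = binomial_tail(observed, n_tumors, background_rate)
    verdict = "recurrent beyond chance" if p_value < threshold else "could be chance"
    print(f"{observed}/{n_tumors} tumors: p = {p_value:.3g} ({verdict})")
```

Under these made-up assumptions, a gene mutated in 2 of 100 tumors is unremarkable, while one mutated in 12 of 100 clears even the corrected cutoff. Note what the test does and does not say: it flags a mutation as recurring more often than chance predicts, which is a statement about correlation, not about cause.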
But this is just the beginning, the very beginning. Evolutionary biologists are very familiar with the problem of determining which mutations, called "drivers", are causative for a given phenotype, and which ones are just "passengers". One of the biggest mistakes that "adaptationists" can make is to assume that every genotypic change somehow provides a selective advantage, when the change could just be riding on the back of another, significant one. Identifying and cataloging thousands of mutant genes says nothing about which of those are truly responsible for the cancer and which ones have just come along for the ride. As a researcher quoted in the article says, "it's going to take good old-fashioned biology to really determine what these mutations are doing".
And I think we can all agree that it's going to take a lot of good old-fashioned biology to accomplish this. First, one has to determine the function of the mutated gene by doing knockout and other experiments, an endeavor fraught with complications. Maybe it codes for a protein, maybe it does not. Even if it does, one then has to identify the function of that protein by finding a suitable system in which it can be expressed and purified. Structure determination may be another hard obstacle on the path to success. Finally, if any kind of therapeutic intervention is going to be attempted, one would first have to find out whether the target is "druggable". And only then, of course, does the long and wildly uncertain road towards finding a small molecule inhibitor begin.
Even assuming that all this happens, there is no guarantee that hitting the enzyme will produce a therapeutic response. Maybe the mutated enzyme is part of a complex signaling pathway, and maybe one really has to hit something upstream or downstream to make a difference. And of course, hitting the target may make a difference, but not one that is therapeutically significant. Thus, it's pretty clear that this project is far from curing any kind of cancer at this stage. What we just described is light years ahead of what is currently being done. Plus, it's worth noting that the data is extremely heterogeneous, collated from a variety of populations and potentially subject to the capricious standards of individual agencies and workers. It's nothing if not a statistician's nightmare.
So is the effort worth it? Undoubtedly. Sitting among those haystacks of mutations is the valuable one that may actually be causative. We are never going to identify the culprit if we never line up the suspects. But here, much more than in a police lineup, it is easy to be seduced by statistical significance. The pursuit of the wrong gene could easily mean the loss of millions of dollars and countless hours. The researchers who have descended into this quagmire need to be more careful than Ulysses on their tortuous journey toward the discovery of important cancer-causing mutations. It is all too easy to slip on a stone and chase the wrong rabbit into the wrong rabbit hole. And there are countless rabbit holes at every step.
As Yeats might have rephrased a line from one of his enduring poems, "tread softly, because you tread on my genes".
Ledford, H. (2010). Big science: The cancer genome challenge. Nature, 464(7291), 972-974. DOI: 10.1038/464972a