
Parallel worlds, parallel lives



"Parallel Worlds, Parallel Lives" is a documentary about Hugh Everett, the fascinating, brilliant and troubled physicist who conceived the idea of parallel universes, which have become a staple of science fiction ever since and are now being taken seriously even by serious scientists.

Everett received his PhD from Princeton in 1957 under John Wheeler. At the time the prevailing view of what happens when you observe a quantum system was the Copenhagen interpretation, which said that until you observe a quantum system it exists in a superposition of states; the wavefunction of such a system suddenly "collapses" when you observe it. This of course led to several dilemmas and paradoxes, the most famous one being Schrödinger's cat. Several questions arose: when exactly does the wavefunction collapse? Who can collapse it? Everett bypassed the whole problem by proposing that a quantum system simply exists in all of its different states, each in a separate universe, and that you observe only one of them. Thus the wavefunction does not collapse at all. This of course sounded fantastic, implying for instance that at every moment you split into countless copies of yourself in countless universes. However, it did seem to provide a simple way out of Schrödinger's-cat-type problems. The "many-worlds interpretation" of quantum mechanics has fascinated, troubled and intrigued scientists and laymen alike ever since.

Unfortunately, Bohr's "gospel" prevailed among physicists, and Bohr strongly disagreed with Everett in a meeting that Wheeler had set up between them. Disappointed, and with a family history of depression, Everett left academia for good. He spent the rest of his life doing top-secret work for the government, devising algorithms and computer programs for modeling nuclear war. He was apparently very influential in shaping nuclear weapons policy that the government adopted, and several of his reports are still highly classified. One of the techniques he pioneered was a generalized Lagrange multiplier method, a key tool for solving constrained optimization problems in diverse disciplines. He died suddenly of a heart attack in 1982. Everett had a drinking problem and a tragic family life. He was very distant from his children. His son, who is the main subject of the documentary, says that the only time he touched his father physically was after he died and the body had to be moved. Everett's daughter committed suicide, leaving a bizarre note saying that she was going to meet her father in a parallel universe.

The film is a NOVA documentary on PBS. It follows Everett's son, the musician Mark Everett (who seems to be quite successful with his band Eels), who sets out on a journey to Princeton, the Pentagon, Austin, Cambridge and elsewhere to find out more about his father, speaking to people such as Charles Misner and David Deutsch. Along the way he learns some quantum mechanics and gets to know his father much better. In the end he feels much closer to his father and seems to have finally found closure. It was rather touching, to be honest, and there is a sense of satisfaction in seeing him finally at peace.

Note: A rather expensive biography of Everett has just come out. A free alternative is a short Scientific American piece on him.

The origin of life cannot escape basic organic chemistry

ResearchBlogging.org
One of the key challenges facing any theory of the molecular origins of life concerns the synthesis, stability, polymerization and self-assembly of early life's molecular components. If you cannot explain the chemical origin of these components, you cannot really explain the origin of life. In the case of life as we know it, this boils down to explaining the origin of the building blocks of living organisms, namely nucleotides and amino acids.

The simplest principles and quirks of chemistry could have influenced how life evolved. A neat paper in ACS Chemical Biology offers a potential explanation, based on basic organic chemistry, for why a certain class of phosphorylated nucleotides formed in preference to others, even though 'conventional' organic chemistry would dictate the opposite.

An anhydroarabinonucleoside has been postulated as an important potential precursor to further nucleotide synthesis. A key step is the phosphorylation of this nucleoside to yield an activated cyclic nucleoside phosphate. Having an activated molecule makes all the difference since activation primes the molecule to be attacked by further nucleophiles, thus triggering polymerization and growth.

However, the phosphorylation of the arabinose nucleoside raises a fundamental question (hopefully) familiar to sophomore organic chemistry students: why does phosphorylation take place preferentially on the secondary 3'-OH when, on steric grounds, as every student of organic chemistry knows, it should be much more favorable on the primary 5'-OH?

To tackle this question, the authors obtained a crystal structure of the nucleoside. The x-ray structure shows an unusually short distance (2.7 Å) between the 2'-OH oxygen and the C2 carbon of the base.


Energy optimization using quantum chemical techniques surprisingly does not get rid of the short distance. Because of this proximity, the 2'-OH can undergo an internal attack on this carbon to generate a reactive intermediate (1), whose ring can in turn be opened by a phosphate on the 3'-OH to form the activated phosphate product. The 5'-OH also gets phosphorylated; it's just that it cannot attack the C2 carbon the way the 2'-OH can, because it is not held in proximity to that carbon the way the 2'-OH is.


The authors explain the short distance between the 2'-OH and the C2 carbon by postulating an interaction between the lone pair of the 2'-OH oxygen and the π* orbital of the C2=N bond. This kind of interaction is quite familiar to organic chemists; it is invoked in the famous Bürgi-Dunitz trajectory that enables nucleophilic attack on carbonyl carbons. Indeed, the authors perform a theoretical analysis that shows the angle of attack for the 2'-OH to be about 100 degrees, close to the classic Bürgi-Dunitz angle of roughly 107 degrees.
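
For readers who want to play with this kind of geometric criterion themselves, here is a minimal sketch of how one would compute the nucleophile-carbon distance and the O···C=N approach angle from atomic coordinates. The coordinates below are purely illustrative placeholders, not taken from the paper's crystal structure:

```python
import numpy as np

# Purely illustrative coordinates in angstroms; in practice these would come from
# the crystal structure or a quantum-chemically optimized geometry.
o_nucleophile  = np.array([2.30, 1.20, 0.80])   # 2'-OH oxygen (hypothetical position)
c_electrophile = np.array([0.00, 0.00, 0.00])   # C2 of the base (hypothetical position)
n_partner      = np.array([-0.85, 1.05, 0.00])  # N of the C2=N bond (hypothetical position)

# Distance between the attacking oxygen and the electrophilic carbon
distance = np.linalg.norm(o_nucleophile - c_electrophile)

# Approach angle O...C=N, analogous to the Burgi-Dunitz angle for carbonyls
v_oc = o_nucleophile - c_electrophile
v_nc = n_partner - c_electrophile
cos_theta = np.dot(v_oc, v_nc) / (np.linalg.norm(v_oc) * np.linalg.norm(v_nc))
angle = np.degrees(np.arccos(cos_theta))

print(f"O...C distance: {distance:.2f} A")
print(f"O...C=N approach angle: {angle:.1f} degrees")
```

With real coordinates from a deposited structure or an optimized model, these two numbers are essentially what the geometric part of the authors' analysis boils down to.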

This is a classic case of there being two competing pathways in chemistry, one of which is preferred to the other because of a subsequent low-energy route that can be traversed. It's a common theme in chemistry and biochemistry and illustrates how otherwise counter-intuitive reactions can be accelerated by putting them at the top of the right energy cliffs. No matter how complex life may be, it still cannot get around the basic laws of organic chemistry. Score one for thermodynamics.

Choudhary, A., Kamer, K., Powner, M., Sutherland, J., & Raines, R. (2010). A Stereoelectronic Effect in Prebiotic Nucleotide Synthesis ACS Chemical Biology DOI: 10.1021/cb100093g

The aesthete

I had a great time visiting Santa Fe and Los Alamos over the weekend. At Los Alamos there is a nice little museum in Fuller Lodge, where the Manhattan Project scientists used to socialize on weekends. One of the amusing artifacts there is a set of two letters sent by Oppenheimer's secretary asking for a nail to be driven into the wall so that he could hang his hat. There is the first letter...and then there is the follow-up.


It's remarkable that this intellectual aesthete did not have the practical drive to hammer a nail into the wall. One could not have imagined someone like Fermi or Feynman leaving the problem unattended for so long. In light of this it seems even more astonishing that a dyed-in-the-wool hands-off theoretician like Oppenheimer could not only direct a world-class group of Nobel Prize winning scientists and engineers to achieve the impossible in record time, but also keep the most practical details of an unimaginably vast project in his head. He even knew who was the best person in the country to manage the organic chemistry stockroom.

Physicist Victor Weisskopf of MIT said it well:
"He did not direct from the head office. He was intellectually and even physically present at each decisive step. He was present in the laboratory or in the seminar rooms, when a new effect was measured, when a new idea was conceived. It was not that he contributed so many ideas or suggestions; he did so sometimes, but his main influence came from something else. It was his continuous and intense presence, which produced a sense of direct participation in all of us; it created that unique atmosphere of enthusiasm and challenge that pervaded the place throughout its time"

Conformations of the stevastelins: A reassessment

ResearchBlogging.org
Shameless self-promotion: my paper on the conformational analysis of cyclic antiviral peptides called stevastelins is now online on the Biopolymers site. Here's a brief overview.

The stevastelins are cyclic peptides that show promising activity against the VHR (vaccinia H1-related) dual-specificity phosphatase. These peptides are phosphorylated in vivo before they can inhibit their target protein. A group in Germany previously did a meticulous analysis of four diastereomeric analogs of these peptides which included their synthesis, biological characterization and conformational analysis. However, the conformational analysis was done using conformational searches with a single force field, constrained by variables from the NMR data (coupling-constant-derived dihedral angles and NOESY-derived distances). Using such a protocol, the group concluded that each of the four diastereomers exists as a single conformational family in solution.

The problem with constrained conformational searches (or constrained molecular dynamics, for that matter) is that they constitute a rather self-fulfilling exercise, built on the assumption that there is in fact a single conformation of the molecule in question. However, as I have often discussed on this blog, any molecule with a couple of rotatable bonds is going to exist as multiple conformers in solution, so the assumption of a single conformation is suspect unless supported by additional data. NMR by itself is of scant value in determining these conformations, since its observables are averages over conformers that interconvert rapidly on the NMR timescale. Plus, analyzing conformations using a single force field can be fraught with ambiguity, since every force field comes with its own set of parameters and convergence criteria; trusting energies from force fields can be especially dangerous. In this case, the stevastelin peptides have 9 rotatable bonds each, so I thought it worthwhile to apply our previously developed NAMFIS (NMR Analysis of Molecular Flexibility In Solution) methodology, which combines NMR variables with structures from extensive conformational searches, to enumerate the conformational behavior of these interesting molecules.
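
For the curious, the essential idea behind this kind of deconvolution can be illustrated with a toy calculation: given ensemble-averaged NOE distances and a pool of candidate conformers, find the non-negative population weights that best reproduce the data. This is only a bare-bones sketch with made-up numbers, not the actual NAMFIS implementation (which also uses coupling constants and more careful weighting):

```python
import numpy as np
from scipy.optimize import nnls

# Toy illustration of the idea behind a NAMFIS-type deconvolution (NOT the actual
# NAMFIS program). Each column is a candidate conformer from a conformational
# search; each row is an NOE-derived interproton distance. All numbers are made up.
conformer_distances = np.array([   # angstroms, one column per conformer
    [2.5, 4.0, 3.2],
    [3.8, 2.6, 3.0],
    [4.5, 4.4, 2.4],
])
observed_distances = np.array([2.9, 3.1, 3.5])   # ensemble-averaged NOESY distances

# NOE intensities average as <r^-6>, so the fit is done in r^-6 space.
A = conformer_distances ** -6.0
b = observed_distances ** -6.0

# Non-negative least squares gives non-negative weights; normalize to populations.
weights, _ = nnls(A, b)
populations = weights / weights.sum()

for i, p in enumerate(populations):
    print(f"conformer {i + 1}: {100 * p:.1f}%")
```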

The paper essentially describes the conformational variability obtained for each of the diastereomers. Many of the conformations are very similar to the previously postulated families, but some are quite different. There are also some striking observations that are corroborated; for instance, the use of a D-serine truly seems to 'lock' the peptides in a single conformation. Such a lock could be effected to counter the entropic penalty that a multiconformational ensemble of molecules might have to pay. The instructive general observation is that subtle changes in stereochemistry at one or two chiral centers can dramatically affect conformational behavior, a fact that continues to surprise and confound medicinal chemists. I also note that if NMR data for the phosphorylated peptides were available, an interesting comparison of the conformational pools of the phosphorylated and unphosphorylated counterparts could be attempted. This would shed light on whether phosphorylation leads to less conformational variability or simply increases the proportion of a chosen subset of conformations of the peptides.

Comments, criticism and general feelings of chagrin are welcomed.

Jogalekar, A. (2010). Conformations of stevastelin C3 analogs: Computational deconvolution of NMR data reveals conformational heterogeneity and novel motifs Biopolymers DOI: 10.1002/bip.21504

A shot in the arm for antimalarial drug discovery?

ResearchBlogging.org
While heart disease, cancer and Alzheimer's continue to grab the headlines, malaria and tuberculosis quietly do their deadly work behind the scenes. Diseases that disproportionately affect sub-Saharan Africa are not exactly priorities for drug companies, but they pose a tremendous unmet need. Malaria especially, which kills an estimated 800,000 people every year, has fought back against almost every traditional drug. The fight against the disease has boiled down to one class of drugs: the artemisinins. If the parasite develops resistance against these, nobody knows how far and how fast that resistance will spread.

Since pharma companies often get bad press for neglecting....neglected diseases, the pair of papers in this week's issue of Nature is especially impressive. The papers describe GSK and a host of academic laboratories using phenotypic screening to discover hundreds of hits against malaria. The sheer multidisciplinary effort put into this endeavor is laudable. Phenotypic screening is an effective method for drug discovery since it does not care about the target of a drug, at least in the beginning. It's a top-down approach that complements bottom-up rational drug design. The goal is simply to watch for a particular kind of response, which could be anything from fluorescence to cell shrinkage. In this case it was 80% inhibition of growth of the parasite in its asexual stage in red blood cells. Target identification can come later.

The company screened its proprietary collection of about 2 million compounds. The compound library was chosen for diversity of scaffolds and novel chemotypes. The assay looked for 80% inhibition of growth of the P. falciparum parasite, and came up with hundreds of diverse compounds. The scientists seem to have taken due care to minimize false positives. They sought to eliminate promiscuous, lipophilic compounds from the list. They also screened their compounds against well-known targets and processes that the malarial parasite exploits to subdue its host. One of these was particularly eye-opening for me; apparently, the insidious little weasel can hack up hemoglobin molecules in the host and scavenge the amino acids to assemble its own proteins. Now that's stealth for you. More interestingly, the group then screened the selected molecules against seven novel malarial targets and found encouraging inhibition profiles against them. Infectious diseases are best treated when you can hit the causative agent in multiple places. A paucity of targets has been a particular problem for malaria and TB, and these chemotypes, along with their suggested targets, provide promising leads. As a final act, the first paper, by Guiguemde et al., also demonstrates favorable pharmacokinetic properties for one of the hits.
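
As a rough illustration of what this kind of hit triage looks like in practice, here is a minimal sketch. The column names, values and cutoffs are all hypothetical; this is not the actual workflow from either paper:

```python
import pandas as pd

# A minimal sketch of this kind of hit triage. Column names, values and cutoffs
# are hypothetical; this is not the workflow from either paper.
hits = pd.DataFrame({
    "compound_id":      ["CMPD-1", "CMPD-2", "CMPD-3", "CMPD-4"],
    "pct_inhibition":   [92.0, 85.0, 78.0, 95.0],    # parasite growth inhibition
    "clogp":            [3.1, 6.2, 2.8, 4.0],        # crude lipophilicity proxy
    "promiscuity_flag": [False, True, False, False], # frequent hitter elsewhere
})

triaged = hits[
    (hits["pct_inhibition"] >= 80.0)    # keep actives above the assay cutoff
    & (hits["clogp"] <= 5.0)            # drop very lipophilic compounds
    & (~hits["promiscuity_flag"])       # drop promiscuous compounds
]

print(triaged[["compound_id", "pct_inhibition"]])
```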

Especially interesting is the report in the second paper, by Gamo et al., in which the authors follow a similar procedure but discover that the novel target list for the purported antimalarial candidates is enriched in kinases. They take due care to show that this enrichment is not due to chance. Unlike the human genome, which encodes about 500 kinases, the malarial genome encodes about 80. But finding kinases among the targets of these novel chemotypes has rich implications, since kinases have already been intensely investigated, the targets are well understood and there are thousands of kinase inhibitors out there waiting to be tested. Testing kinase inhibitors against malaria would open up a whole new chapter for antimalarial drug discovery.
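
For readers curious about what "not due to chance" means operationally, a target-class enrichment of this kind is often assessed with something like a Fisher's exact test on a 2x2 table. This is only a toy sketch with made-up counts, not the statistical analysis from the paper:

```python
from scipy.stats import fisher_exact

# Toy illustration of testing whether kinases are over-represented among the
# predicted targets of the hits. All counts below are made up, not from the paper.
kinase_targets_in_hits = 30      # predicted targets of hits that are kinases
other_targets_in_hits  = 70      # predicted targets of hits that are not kinases
kinases_in_genome      = 80      # approximate malarial kinase count
other_genes_in_genome  = 5300    # rough size of the rest of the malarial genome

table = [[kinase_targets_in_hits, other_targets_in_hits],
         [kinases_in_genome, other_genes_in_genome]]

odds_ratio, p_value = fisher_exact(table, alternative="greater")
print(f"odds ratio = {odds_ratio:.1f}, one-sided p = {p_value:.2e}")
```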

Finally, and this is the most talked-about part, GSK has made the entire list of hits freely available to the public. This is a commendable move. In an age where corporations are routinely derided for their emphasis on secrecy and profit-making, such a decision should drive home the good work that corporations can potentially do. It also underscores the tremendous opportunities that academic-corporate collaboration opens up for drug discovery against neglected diseases. While it remains to be seen how many of these candidates become bona fide drugs, the effort provides many promising starting points. Malaria is about as insidious a disease as you can have, lurking in the shadows and waiting to pounce on you. The more hands that try to squeeze its neck, the better.

Guiguemde, W., Shelat, A., Bouck, D., Duffy, S., Crowther, G., Davis, P., Smithson, D., Connelly, M., Clark, J., Zhu, F., Jiménez-Díaz, M., Martinez, M., Wilson, E., Tripathi, A., Gut, J., Sharlow, E., Bathurst, I., Mazouni, F., Fowble, J., Forquer, I., McGinley, P., Castro, S., Angulo-Barturen, I., Ferrer, S., Rosenthal, P., DeRisi, J., Sullivan, D., Lazo, J., Roos, D., Riscoe, M., Phillips, M., Rathod, P., Van Voorhis, W., Avery, V., & Guy, R. (2010). Chemical genetics of Plasmodium falciparum Nature, 465 (7296), 311-315 DOI: 10.1038/nature09099

Gamo, F., Sanz, L., Vidal, J., de Cozar, C., Alvarez, E., Lavandera, J., Vanderwall, D., Green, D., Kumar, V., Hasan, S., Brown, J., Peishoff, C., Cardon, L., & Garcia-Bustos, J. (2010). Thousands of chemical starting points for antimalarial lead identification Nature, 465 (7296), 305-310 DOI: 10.1038/nature09107

Life anew?

The recent creation of a "synthetic organism" by Craig Venter and his colleagues has hit the headlines. By all accounts it is a thoroughly impressive piece of work, a tour de force that designed a genome from scratch, literally by writing it out the way a piece of computer code is written. The perseverance and ingenuity put into the process deserve ample applause. And it should rightly catapult the emerging field of synthetic biology into the public discourse.

But it's still not a "synthetic cell" in my opinion. The genome was inserted into an existing cell, where it started working exactly as expected. I would not hold my breath waiting for the day we can design completely synthetic genomes that do whatever we want, whether eating CO2 or producing Lipitor.

To this non-expert, the reason looks simple: everything in molecular biology that we have encountered until now has turned out to be more complex than expected. Look at what has happened with AIDS vaccines, gene therapy and treatments for Alzheimer's disease, all of which were supposed to be simpler problems than designing a synthetic organism. In each of these cases, what seemed obvious and straightforward has turned out to be a maze of unexpected challenges and unexplained observations. The fact is, designing a genome is one thing; making it produce proteins that interact with each other in a carefully orchestrated manner, find binding partners with exquisite specificity and carry out the extremely complex and often non-intuitive cascades of signal transduction is quite another. A cell is not just its genome; it's really about interactions. And I am willing to bet that we are a long way from being able to design all those countless specific interactions that give rise to the entity we call a "cell".

Perturbed by Free Energy Perturbation?

Family matters kept me away for some time, but this topic seems an apt one with which to jump back into the fray. In the Pipeline has an interesting slew of comments about the role of computational chemistry in drug design and discovery. The comments were in response to a question by Derek about how useful Free Energy Perturbation (FEP) could be in drug design. FEP is a kind of holy grail for drug hunters. If you could really predict the absolute free energy of binding of a series of diverse drug-like molecules to a protein, it would constitute an unprecedented breakthrough. It would not instantly make it possible to put two new cancer drugs a week on the market, but predicting the affinity of compounds without making them would certainly lead to enormous savings in cost and time for the pharmaceutical industry. Not surprisingly, many a would-be knight is pursuing this dream with vigor. It seems worthwhile to summarize my reading of some of the major challenges in the field. This is a personal evaluation; feel free to enlighten me in the comments section.
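
As a reminder of what FEP refers to formally, the textbook statement of the idea is the Zwanzig relation, in which the free energy difference between two states is obtained from an exponential average over configurations sampled in one of them (this is the standard formula, not anything specific to the discussion at In the Pipeline):

```latex
% Zwanzig free energy perturbation relation between states A and B
\Delta G_{A \rightarrow B}
  = -k_{\mathrm{B}} T \,
    \ln \left\langle
      \exp\!\left[-\frac{U_{B}(\mathbf{r}) - U_{A}(\mathbf{r})}{k_{\mathrm{B}} T}\right]
    \right\rangle_{A}
```

In practice the transformation is broken into many small windows precisely because this average converges miserably when the two states differ too much, which is where the sampling headaches discussed below come in.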

1. Our understanding of protein structure and conformation is still significantly inadequate: This may be the single most daunting challenge in doing FEP. We don't know how to calculate the entropic and enthalpic contributions to protein-drug binding that arise from protein motions. Induced-fit effects have long been recognized as being very important in dictating protein-ligand binding. Yet most docking programs that try to fit ligands into protein pockets do so while treating the protein as rigid. Movements of side chains, loops and sometimes even large-scale movements of helices can be significant yet subtle, and it's an uphill task to include these in a docking calculation. Some docking programs have made impressive advances in predicting induced fit, but a lot remains to be done. However, the core problem with doing any of this leads us to the biggie in the field: protein structure prediction. Legions of experimentalists and theorists have been working on it for decades. Success has been impressive, but still not general enough.

The general problem has huge implications for understanding protein folding, misfolding and, of course, protein-drug binding. It is significant and appreciated enough that at least one man, who happens to be the richest man in the world, has decided to put his money on it. Bill Gates recently announced that he is investing 10 million dollars in the computational drug design company Schrödinger, specifically with a view to supporting developments in protein structure prediction and related issues. That must mean something. In any case, unless we can capture the dance of proteins even as they bind to a drug, our dream of FEP will be a distant spot on the horizon. If an x-ray structure is available, such efforts become more feasible. And yet for some of the most important proteins, like GPCRs, only a handful of structures exist. Homology modeling can supply, and is supplying, some of the missing structures, but the process involves considerable guesswork, and the devil in the details often thwarts your best efforts. In the end, computational prediction of protein structure can only come from a deeper understanding of the basic properties of proteins, and theory and experiment will need to intertwine closely in this quest.

2. Our understanding of ligand conformations is much better, but still not perfect: Compared to protein conformation prediction, we are orders of magnitude better at ligand conformation prediction, primarily because of the small size of ligands. But even here challenges lurk. Ligands usually exist as multiple conformations in solution. One of these conformations is the bioactive one that binds to the protein. Often this conformer is populated at only 2-3%, which means it's virtually impossible to detect by NMR (a back-of-the-envelope estimate of what such a small population means in energy terms follows this list). While several methods exist for generating relevant ligand conformations, it is prima facie very difficult to say which one is the bioactive one. Plus, ligand and protein have to expend strain energy for the ligand to adopt the right conformation. One never knows how much strain energy the ligand can pay, although recent estimates have suggested a maximum cap of a few kcal/mol. Beyond all this, it's worth noting that drastic changes in activity can sometimes result from small changes in ligand conformation. Docking cannot always capture these small changes, although in some cases, as I have demonstrated before, docking can capture non-intuitive ligand conformations that only crystal structures can reveal. The bottom line is that even though we have a much better handle on ligand conformations than on protein conformations, locating the bioactive conformation is still like trying to locate a needle in a haystack of needles.

3. Water is still the big elephant in the room: The most familiar solvent is still the least well understood, especially in the context of its interactions with biomolecules. By some estimates, the displacement of water molecules by hydrophobic parts of a ligand is the single most important driver of binding affinity. Apart from the more obvious roles that water molecules can play in bridging ligand-protein interactions and serving as well-placed displaceable entities that can be kicked out by ligand extensions, with huge resulting changes in free energy, water also plays more subtle roles that we are just beginning to comprehend. Water can act as a kind of lubricant, 'massaging' proteins as they unfold and fold, gliding across hydrophobic and hydrophilic surfaces and helping them to form interactions. Proteins are also usually surrounded by a ghostly layer of bound water molecules that acts almost as a virtual extension of their structure; these water molecules can exert important influences on protein conformational changes. And the hydrophobic effect only gets more interesting every day, with recent findings suggesting that there is a 'dewetting transition' when two hydrophobic surfaces approach each other closer than a critical distance. To find out more, you can check out an excellent review of water's role in biology at the molecular level. Current methods for modeling water include implicit and explicit solvation models; the drawbacks of both are well recognized. It seems astonishing that we are trying to predict the solvation of protein-ligand assemblies when we are still struggling to get the solvation of simple organic molecules right. In the end, correct accounting of water for specific systems is going to be key to accurate FEP calculations.
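
Coming back to point 2 for a moment, here is the back-of-the-envelope estimate promised above: what relative free energy corresponds to a conformer populated at only 2-3% at room temperature? This is a generic two-state Boltzmann estimate, not tied to any particular ligand:

```python
import math

# Crude two-state estimate: the bioactive conformer (population p) versus
# everything else lumped together (population 1 - p), at room temperature.
RT = 0.593  # kcal/mol at 298 K

for p in (0.02, 0.03):
    delta_g = RT * math.log((1.0 - p) / p)   # free energy above the rest of the ensemble
    print(f"{100 * p:.0f}% population ~ {delta_g:.1f} kcal/mol uphill")
```

So a conformer sitting a mere 2 kcal/mol or so above the preferred ones is already nearly invisible to NMR, which is exactly the regime in which bioactive conformations often live.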

The real challenge in FEP comes from the exquisitely exponential dependence of the dissociation constant of a protein-ligand complex on the free energy of binding. Since a change of only about 1.4 kcal/mol in ∆G corresponds to a tenfold change in the dissociation constant at room temperature, we need to do at least this well in predicting free energies. Since hydrogen bonds are worth a few kcal/mol, hydrophobic and electrostatic interactions can contribute another few, and the errors introduced by inadequate solvation, incomplete sampling of conformations and incomplete representation of things like entropy all add up, it's pretty clear that getting things correct to about 1 kcal/mol is a decidedly uphill task. The methods just cannot include all the parameters from real life necessary to achieve this. Real-life measurements of binding affinity are frequently conducted under messy conditions with mixed solvents, ions, buffers and inhomogeneous environments. Rest assured that your grandson will be trying as hard as you are to include these factors in an FEP calculation.
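
For the record, the arithmetic behind that 1.4 kcal/mol figure is just the standard relation between binding free energy and the dissociation constant:

```latex
% Standard-state binding free energy and dissociation constant
\Delta G^{\circ}_{\mathrm{bind}} = RT \ln K_{\mathrm{d}}
\qquad\Rightarrow\qquad
\Delta\Delta G = RT \ln 10 \approx 1.4\ \mathrm{kcal\,mol^{-1}}
\ \text{per tenfold change in } K_{\mathrm{d}} \quad (T = 298\ \mathrm{K})
```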

I have always thought that this glass ceiling of 1 kcal/mol really represents all the riches we can get from understanding the diverse factors that dictate protein-ligand binding. The magic number is like the mythical island of Ithaca. You may arrive there weary and old, and may even discover that the place does not exist, but the wisdom you would have gained on the way would be of permanent value. That's what counts.

It's truly the entropy that binds us together

ResearchBlogging.org
Fragment-based drug design (FBDD) has emerged as one of the key strategies in drug design during the past two decades. FBDD hinges on the fact that fragments, as opposed to complete ligands, are easier to optimize and study since they possess lower molecular complexity and make fewer binding interactions.

When fragments are optimized to bind to different parts of a protein's active site, they can gain powerful binding affinity by being linked together. Fragments are usually relatively weak binders, and connecting them with linkers can provide orders-of-magnitude improvements in binding affinity. The reason for this affinity increase is usually stated to be entropic. The rationale is that when two fragments are linked together, the entropic cost of losing translational and rotational freedom, which each fragment would have to pay separately were it to bind on its own, is paid for only once. Thus the combined free energy is much more favorable than the sum of the individual free energies. The gain is quantified by a number called the "linking coefficient", where a value of less than zero indicates enhanced binding relative to the separate fragments.
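
One common way to write down this bookkeeping (a generic free-energy decomposition; the paper discussed below may define its linking coefficient somewhat differently) is:

```latex
% Free-energy bookkeeping for linking fragments A and B into molecule AB
\Delta G_{\mathrm{link}} = \Delta G_{AB} - \left(\Delta G_{A} + \Delta G_{B}\right)
% A negative value means the linked molecule binds more tightly than the sum of
% its parts, i.e. the linker has recouped part of the rigid-body entropy cost.
```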

However, such an analysis assumes that the contribution to the binding process from other factors is minimal, and that the entropic advantage is the only major player in the affinity-increase game. But as with protein-ligand interactions in general, the picture is more complex. There can be unfavorable enthalpic contributions from the fragments losing favorable binding interactions upon being constrained by linkers. There can sometimes be favorable enthalpic interactions from new contacts between the ligand and protein. And there can be enthalpic and entropic contributions from the linker itself. Thus, dissecting the factors that go into the free energy upon fragment linking is like the classic conundrums of physical organic chemistry which I encountered in college, where controlled experiments are deviously hard, and changing one factor inevitably changes another (when does it not?).

An ideal system for studying the contribution of entropy to FBDD would be a system of two fragments binding to a protein which can be linked together by a single bond and which retain all their existing contacts with the protein upon being linked. Such systems would admittedly be hard to design, but a group of Italian researchers has come up with a neat system linking two very simple fragments, a hydroxamic acid and a benzenesulfonamide, with a single bond.

The fragments and the resulting molecule inhibit a matrix metalloproteinase (MMP), a class of enzymes implicated in cancer and inflammation. The authors perform x-ray crystallography and isothermal titration calorimetry (ITC) to investigate the binding and thermodynamics of the individual fragments and their linked counterpart with the MMP. They demonstrate that the two fragments preserve their binding modes even when linked together, and they observe a rather large free energy enhancement of almost 4 kcal/mol upon fragment linking, corresponding to roughly a thousandfold gain in affinity at room temperature.

This is a nice case where a careful analysis and dismissal of other factors points accurately to entropy as the main contributor to enhanced binding of linked fragments.

Borsi, V., Calderone, V., Fragai, M., Luchinat, C., & Sarti, N. (2010). Entropic Contribution to the Linking Coefficient in Fragment Based Drug Design: A Case Study Journal of Medicinal Chemistry DOI: 10.1021/jm901723z

Steering library bias toward A2A adenosine receptor ligand discovery

ResearchBlogging.org
The A2A adenosine receptor is an important GPCR, well known for binding caffeine. Adenosine receptors are emerging as relevant drug targets for a variety of disorders including Parkinson's disease, and there is interest in discovering new ligands that bind to them. The A2A receptor is also one of the few GPCRs whose crystal structure is available. Thus the A2A is amenable to structure-based design efforts, and virtual screening is an especially attractive endeavor in this regard.

In the present report, a team of researchers from NIH and UCSF led by Brian Shoichet, John Irwin and Kenneth Jacobson use virtual screening to discover new ligands for the A2A. There are several points to note here. The authors use the ZINC library of drug-like molecules to dock about a million and a half compounds into the binding pocket of the A2A crystal structure. They pick the 500 best-scoring ligands (about 0.035% of the total) and investigate their fit in the binding site. Using criteria like electrostatic and van der Waals complementarity and novelty of chemotype, they finally select 20 of these 500 hits and test them in assays. Out of these 20, 7 inhibited binding by more than 40% at 20 μM concentration, constituting a hit rate of 35%. While the compounds formed the same kinds of interactions as some other A2A ligands, they were also relatively diverse in structure. The ligands were also tested in aggregation-based screens to confirm that their activity was not a spurious artifact of aggregation-based inhibition.

This is a pretty good hit rate. Generally virtual screening campaigns are lucky to have a hit rate of a few percent. Curiously, the authors also found a similarly high hit rate during a past VS campaign against the well-known β2 adrenergic receptor. What could be responsible for this high hit rate against GPCRs? The reasons are interesting. One reason could be that GPCRs are very well adapted to bind small molecules in compact pockets, enclosing them and forming many kinds of productive interactions. But more intriguingly, as the authors have noted earlier, there is "biogenic bias" in favor of certain target-specific chemotypes in commercial libraries that are screened, both during VS as well as HTS. This in turn reflects the biases of medicinal chemists in picking and synthesizing certain kinds of chemotypes based on the importance of drug targets and past successes in hitting these targets. GPCRs clearly are enormously important, and GPCR-friendly ligand chemotypes thus constitute a large part of screening libraries. These chemotypes are much more prevalent than those for kinases or ion channels for instance.

This observation has both positive and negative implications. The positive implication is that one is likely to keep finding high hit rates for GPCRs using VS. The negative implication is that one is also going to be constrained by biogenic bias, which might preclude finding more diverse and entirely novel chemotypes. Thus, while VS campaigns for GPCRs might find a good number of hits, the novelty of these hits might not always be satisfying. One other quite intriguing point emerging from this study is that the kind of hits found (agonist, inverse agonist, antagonist etc.) reflects the ligand with which the target structure used for VS was co-crystallized. The A2A structure houses an antagonist in its binding site, leading to a preponderance of antagonists among the top docking hits. Indeed, agonists ranked abysmally low in the list.

GPCR ligand discovery is one of the most important goals in drug discovery. This and other similar studies demonstrate that, with all its caveats, VS can be productively used to mine for new GPCR drugs.

Carlsson, J., Yoo, L., Gao, Z., Irwin, J., Shoichet, B., & Jacobson, K. (2010). Structure-Based Discovery of A2A Adenosine Receptor Ligands Journal of Medicinal Chemistry DOI: 10.1021/jm100240h

It's (not) the mutation, stupid

ResearchBlogging.org
Cancer has emerged as a fundamentally genetic disease, in which mutations in genes cause cells to go haywire. Yet finding out exactly which mutations are responsible for a certain type of cancer is a daunting task. A recent report in Nature, which details the cataloging of tens of thousands of mutations in tens of thousands of tumors, illustrates the merits and the dangerous pitfalls of such an approach.

The article describes the International Cancer Genome Consortium (ICGC), formed in 2008, whose task is to coordinate an international effort spread across different countries, in which every country takes responsibility for documenting significant mutations in certain types of cancer. For instance, the US is handling six types including brain and lung cancers, China is doing gastric, India is doing oral and Australia is doing pancreatic. The process will involve sequencing tens of thousands of genes from tumors. The goal is to find all the mutations that separate cancerous cells from normal ones.

Yet this goal immediately runs into David Hume's well-known problem of induction. If a mutation in a gene is observed in, say, 5% of tumors, would it be observed in all of them? More importantly, would it be significant as a causative agent? Consider the IDH1 gene, which encodes isocitrate dehydrogenase, a key enzyme in the all-important Krebs cycle. As the article notes, IDH1 was not regarded as significant when it initially showed up in a very small subset of certain kinds of tumors. But it then turned up consistently in 12% of brain tumors of a certain kind and 8% of tumors from leukemic patients. Thus, rather than being a chance occurrence, IDH1 now seems like a significant correlative factor for cancer. It is now hot cancer genomic property.
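
As a rough illustration of what separates "a chance occurrence" from a recurrently mutated gene, a toy recurrence test might look like the sketch below. All the numbers are made up, and real driver-gene analyses are far more sophisticated, accounting for gene length, sequence context and tumor-specific background mutation rates:

```python
from scipy.stats import binomtest

# Toy example: a gene is mutated in 24 of 200 tumors of a given type (12%).
# If the chance of seeing a mutation in this particular gene by accident were,
# say, 1% per tumor, how surprising would that recurrence be?
observed_mutated = 24
tumors_sequenced = 200
background_rate  = 0.01   # assumed background probability, purely illustrative

result = binomtest(observed_mutated, tumors_sequenced, background_rate, alternative="greater")
print(f"one-sided p-value for recurrence above background: {result.pvalue:.2e}")
```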

But this is just the beginning, the very beginning. Evolutionary biologists are well familiar with the problem of determining which mutations, called "drivers", are causative for a given phenotype, and which ones are just "passengers". One of the biggest mistakes that "adaptationists" can make is to assume that every genotypic change somehow provides a selective advantage, when the change could just be riding on the back of another, significant one. Identifying and cataloging thousands of mutant genes says nothing about which of those are truly responsible for the cancer and which ones have just come along for the ride. As a researcher quoted in the article says, "it's going to take good old-fashioned biology to really determine what these mutations are doing".

And I think we can all agree that it's going to take a lot of good old-fashioned biology to accomplish this. First, one has to determine the function of the mutated gene by doing knockout and other experiments, an endeavor fraught with complications. Maybe it codes for a protein, maybe it does not. Even if it does, one then has to identify the function of that protein by finding a suitable system in which it can be expressed and purified. Structure determination may be another hard obstacle on the path to success. Finally, if any kind of therapeutic intervention is going to be attempted, one would first have to find out whether the target is "druggable". And then, of course, the long and wildly uncertain road toward finding a small-molecule inhibitor only begins.

Even assuming that all this happens, there is no guarantee that hitting the enzyme will produce a therapeutic response. Maybe the mutated enzyme is part of a complex signaling pathway, and maybe one really has to hit something upstream or downstream to make a difference. And of course, hitting the target may make a difference, but not one that is therapeutically significant enough. Thus, it's pretty clear that this project is far from curing any kind of cancer at this stage; everything just described lies far beyond what is currently being done. Plus, it's worth noting that the data are extremely heterogeneous, collated from a variety of populations and potentially subject to the capricious standards of individual agencies and workers. It's nothing if not a statistician's nightmare.

So is the effort worth it? Undoubtedly. Sitting among those haystacks of mutations is the valuable one that may actually be causative. We are never going to identify the culprit if we never line up the suspects. But here, much more than in a police lineup, it is easy to be seduced by statistical significance. The pursuit of the wrong gene could easily mean millions of dollars lost and countless hours wasted. The researchers who have descended into this quagmire need to be more careful than Ulysses on their tortuous journey toward the discovery of important cancer-causing mutations. It is all too easy to slip on a stone and chase the wrong rabbit into the wrong rabbit hole. And there are countless such rabbit holes at every step.

As Yeats might have rephrased a line from one of his enduring poems, "tread softly, because you tread on my genes".

Ledford, H. (2010). Big science: The cancer genome challenge Nature, 464 (7291), 972-974 DOI: 10.1038/464972a