Steering library bias toward A2A adenosine receptor ligand discovery

The A2A adenosine receptor is an important GPCR, well-known for binding caffeine. Adenosine receptors are emerging as relevant drug targets for a variety of disorders including Parkinson's disease, and there is interest in discovering new ligands that bind to them. Among adenosine receptor subtypes, the A2A receptor is one of the few GPCRs whose crystal structure is available. Thus the A2A is amenable to structure-based design efforts, and virtual screening is an especially attractive endeavor in this regard.

In the present report, a team of researchers from the NIH and UCSF led by Brian Shoichet, John Irwin and Kenneth Jacobson use virtual screening to discover new ligands for the A2A. There are several points to note here. The authors use the ZINC library of drug-like molecules to dock about a million and a half compounds into the binding pocket of the A2A crystal structure. They pick the 500 best-scored ligands (0.035% of the total) and investigate their fit in the binding site. Using criteria like electrostatic and VdW complementarity and novelty of chemotype, they finally select 20 of these 500 hits and test them in assays. Of these 20, 7 inhibited binding by more than 40% at 20 μM concentration, constituting a hit rate of 35%. While the compounds formed the same kinds of interactions as some other A2A ligands, they were also relatively diverse in structure. The ligands were also tested in aggregation-based screens to confirm that their activity was not a spurious artifact of aggregation-based inhibition.
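
To make the arithmetic of this triage concrete, here is a minimal Python sketch of the post-docking selection step, assuming a hypothetical file of docking scores (the file name, format and function names are illustrative, not from the paper; the visual inspection of binding-site fit obviously cannot be scripted):

```python
# Minimal sketch of post-docking triage, with hypothetical inputs.
# Convention: more negative docking score = better predicted binding.
import csv

def top_scored(scores, n_top=500):
    """Return the n_top best-scored ligands (lowest score first)."""
    return sorted(scores, key=lambda rec: rec[1])[:n_top]

# Hypothetical CSV of (ligand_id, docking_score) rows.
with open("docking_scores.csv") as f:
    scores = [(lig, float(s)) for lig, s in csv.reader(f)]

best_500 = top_scored(scores)          # ~0.035% of ~1.4 million ligands
# ...visual inspection and chemotype filters narrow these to 20...
hit_rate = 100.0 * 7 / 20              # 7 of 20 assayed compounds active
print(f"hit rate = {hit_rate:.0f}%")   # -> 35%
```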

This is a pretty good hit rate. Virtual screening campaigns are generally lucky to achieve a hit rate of a few percent. Curiously, the authors also found a similarly high hit rate during a past VS campaign against the well-known β2 adrenergic receptor. What could be responsible for this high hit rate against GPCRs? The reasons are interesting. One reason could be that GPCRs are very well adapted to bind small molecules in compact pockets, enclosing them and forming many kinds of productive interactions. But more intriguingly, as the authors have noted earlier, there is "biogenic bias" in favor of certain target-specific chemotypes in the commercial libraries that are screened, both in VS and in HTS. This in turn reflects the biases of medicinal chemists in picking and synthesizing certain kinds of chemotypes based on the importance of drug targets and past successes in hitting these targets. GPCRs are clearly enormously important, and GPCR-friendly ligand chemotypes thus constitute a large part of screening libraries. These chemotypes are much more prevalent than those for kinases or ion channels, for instance.

This observation has both positive and negative implications. The positive implication is that one is likely to keep finding high hit rates for GPCRs using VS. The negative implication is that one is also going to be constrained by biogenic bias, which might preclude finding more diverse and entirely novel chemotypes. Thus, while VS campaigns for GPCRs might find a good number of hits, the novelty of these hits might not always be satisfying. One other quite intriguing point emerging from this study is that the kind of hits found (agonist, inverse agonist, antagonist etc.) reflects the ligand with which the target structure used for VS is co-crystallized. Thus the A2A structure houses an antagonist in the binding site, leading to a preponderance of antagonists among the top docking hits. Indeed, agonists ranked abysmally low in the list.

GPCR ligand discovery is one of the most important goals in drug discovery. This and other similar studies demonstrate that, with all its caveats, VS can be productively used to mine for new GPCR drugs.

Carlsson, J., Yoo, L., Gao, Z., Irwin, J., Shoichet, B., & Jacobson, K. (2010). Structure-Based Discovery of A2A Adenosine Receptor Ligands. Journal of Medicinal Chemistry. DOI: 10.1021/jm100240h

It's (not) the mutation, stupid

Cancer has emerged as a fundamentally genetic disease, in which mutations in genes cause cells to go haywire. Yet finding out exactly which mutations are responsible for a certain type of cancer is a daunting task. A recent report in Nature, which details the cataloging of tens of thousands of mutations in tens of thousands of tumors, illustrates the merits and dangerous pitfalls of such an approach.

The article discusses the International Cancer Genome Consortium (ICGC), formed in 2008, whose task is to coordinate an international effort spread across different countries, with every country taking responsibility for documenting significant mutations in certain types of cancer. For instance, the US is covering six types including brain and lung, China is covering gastric, India oral and Australia pancreatic. The process would involve sequencing tens of thousands of genes from tumors. The goal is to find all the mutations that separate cancerous cells from normal ones.

Yet this goal immediately runs into David Hume's well-known problem of induction. If a mutation in a gene is observed in, say, 5% of tumors, would it be observed in all of them? More importantly, would it be significant as a causative agent? Consider the IDH1 gene, which encodes isocitrate dehydrogenase, a key enzyme in the all-important Krebs cycle. As the article notes, IDH1 was not regarded as significant when it initially showed up in a very small subset of certain kinds of tumors. But then it consistently showed up in 12% of brain tumors of a certain kind and 8% of tumors from leukemic patients. Thus, rather than being a chance occurrence, IDH1 now seems like a significant correlative factor for cancer. It is now hot cancer genomic property.
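
For a sense of why recurrence at 12% is convincing, here is a rough sketch of the kind of back-of-the-envelope test one could run, assuming an illustrative background mutation rate and tumor counts (none of these numbers are from the article):

```python
# Could a mutation's recurrence be chance? Compare the observed count
# against a binomial null model with an assumed background rate.
from scipy.stats import binom

n_tumors = 400          # hypothetical number of brain tumors sequenced
observed = 48           # tumors carrying the IDH1 mutation (~12%)
background_rate = 0.01  # assumed chance rate of mutating any one gene

# P(seeing >= 48 mutated tumors under the null model)
p_value = binom.sf(observed - 1, n_tumors, background_rate)
print(f"p = {p_value:.2e}")  # vanishingly small: recurrence is not chance
```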

But this is just the beginning, the very beginning. Evolutionary biologists are well familiar with the problem of determining which mutations (called "drivers") are causative for a given phenotype, and which ones are just "passengers". One of the biggest mistakes that "adaptationists" can make is to assume that every genotypic change somehow provides a selective advantage, when the change could just be riding on the back of another significant one. Identifying and cataloging thousands of mutant genes says nothing about which of those are truly responsible for the cancer and which ones have just come along for the ride. As a researcher quoted in the article says, "it's going to take good old-fashioned biology to really determine what these mutations are doing".

And I think we can all agree that it's going to take a lot of good old-fashioned biology to accomplish this. First, one has to determine the function of the mutated gene by doing knockout and other experiments, an endeavor fraught with complications. Maybe it codes for a protein, maybe it does not. Even if it does, one then has to identify the function of that protein by finding a suitable system in which it can be expressed and purified. Structure determination may be another hard obstacle on the path to success. Finally, if any kind of therapeutic intervention is going to be attempted, one would first have to find out whether the target is "druggable". And then of course, the long and wildly uncertain road toward finding a small-molecule inhibitor only begins.

Even assuming that all this happens, there is no guarantee that hitting the enzyme will produce a therapeutic response. Maybe the mutated enzyme is part of a complex signaling pathway, and maybe one really has to hit something upstream or downstream to actually make a difference. And of course, hitting the target may make a difference, but it may not be therapeutically significant enough. Thus, it's pretty clear that this project is far from curing any kind of cancer at this stage; the steps just described are light years ahead of what is currently being done. Plus, it's worth noting that the data are extremely heterogeneous, collated from a variety of populations and potentially subject to the capricious standards of individual agencies and workers. It's nothing if not a statistician's nightmare.

So is the effort worth it? Undoubtedly. Sitting among those haystacks of mutations is the valuable one that may actually be causative. We are never going to identify the culprit if we never line up the suspects. But here, much more than in a police lineup, it is easy to be seduced by statistical significance. The pursuit of the wrong gene could easily mean millions of dollars lost and countless hours wasted. The researchers who have descended into this quagmire need to be more careful than Ulysses on their tortuous journey toward the discovery of important cancer-causing mutations. It is all too easy to slip on a stone and chase the wrong rabbit into the wrong rabbit hole. And there are countless opportunities to do so at every step.
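
One standard guard against being seduced this way, when thousands of genes are tested at once, is a multiple-testing correction; below is a minimal sketch of the Benjamini-Hochberg procedure (my illustration, not anything the article prescribes):

```python
# Benjamini-Hochberg: control the false discovery rate when testing
# thousands of candidate driver genes at once.
def benjamini_hochberg(p_values, fdr=0.05):
    """Return indices of hypotheses that survive the FDR cutoff."""
    m = len(p_values)
    ranked = sorted(range(m), key=lambda i: p_values[i])
    last_ok = 0
    for rank, idx in enumerate(ranked, start=1):
        if p_values[idx] <= rank * fdr / m:
            last_ok = rank   # largest rank passing the stepped threshold
    return ranked[:last_ok]  # these survive as candidate drivers

# Of 10,000 raw p-values, only the survivors merit "old-fashioned
# biology"; uncorrected, ~500 genes would pass p < 0.05 by chance alone.
```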

As Yeats might have rephrased a line from one of his enduring poems, "tread softly, because you tread on my genes".

Ledford, H. (2010). Big science: The cancer genome challenge. Nature, 464 (7291), 972-974. DOI: 10.1038/464972a

You have a Ph.D.?? Who doesn't!

I just finished reading Peter Feibelman's fantastic book, "A Ph.D. Is Not Enough", offering career advice for fresh (and also slightly staler) Ph.D.s. I very highly recommend this slim, 100-page volume. There's a blurb from the great Carl "Papa" Djerassi on the cover saying that you will get from this book in one hour what it took Djerassi 40 years to learn. Djerassi may be exaggerating, but he is close. Feibelman himself was a physics professor at SUNY Buffalo and then a member of the technical staff at Sandia National Laboratories, so he has seen the world and tasted its ugly side.

The book was written in 1993 but its contents are as relevant as ever, probably even more so in this age of tight funding and layoffs. It's got the whole works: applying for postdoc positions (be realistic, pick a project which you think you can actually finish), picking a postdoc advisor (don't pick a flashy young professor who would be loath to share credit), job interviews (don't be a dilettante), writing grants (be modest in your goals even as you emphasize the big picture), giving a talk (OMG he talks about slides and projectors!), and choosing between academia and industry.

The last part is particularly intriguing, and Feibelman has some novel advice for wannabe professors. Unless you are hell-bent on an academic position and wouldn't even think of anything else, Feibelman quite emphatically discourages plunging into academia as an assistant professor. The pay is low, respect is lacking, it's one hell of a rope trick to secure funding without any significant track record, there's basically no vacation, tenure is always uncertain, and you keep wondering when you will get publishable results even as you spend most of your time explaining to pre-meds why they deserved a D on their last exam. In short, you have no life, and there are lots of necessary conditions you have to satisfy to stay afloat, none of which is sufficient.

Better than this, says Feibelman, is to start working in a government or industrial lab where you (hopefully) have plenty of time for research, establish a solid reputation along with financial security, and then apply to a university at the tenured-professor level. Of course this is easier said than done; these days it's hard to do basic research in industry, and you are afraid of losing your job every day. But I think Feibelman's point is well taken: unless you have absolutely no interest in anything other than an academic position, it's definitely worthwhile considering a more indirect path to academia, one where you actually have a life. My old PhD advisor actually did that, and it worked out well for him.

In any case, read this little book if you are a fresh, red-faced, scared little new PhD. Which we all are.

The beta-amyloid hyp(e)othesis: the saga continues

As Churchill would have said, beta-amyloid is a riddle, wrapped in a mystery, inside an enigma. For years now, the "amyloid hypothesis" has been widely accepted as somehow being importantly responsible for Alzheimer's disease (AD). A few years back, researchers were cheerfully confident that amyloid was to AD what atherosclerotic plaques were to heart disease. Rudolph Tanzi, a Harvard neurologist who is a leading authority on the disease and identified the first AD gene, had this to say in an interview in 2000:
Q. How close are we to an effective treatment for Alzheimer's disease?

A. I wouldn't be surprised if five years from now we have a pretty effective drug that can slow the disease down enough so that it will be preventable in those at risk, and significantly slow down the deterioration of people who already have it.

Q. Why do you have such optimism?

A. Because, in 15 years, we've gone from knowing little about what causes this disease to having a pretty concrete idea of which biological pathways and body proteins are involved.

If you compare Alzheimer's to heart disease where cholesterol levels must be lowered, we now have our own cholesterol equivalent, which we call the beta-amyloid. The name of the game in Alzheimer's therapy is lowering the accumulation of beta-amyloid in the brain.

It's ten years later and we are no closer to finding an AD drug. Tanzi's hope was not unwarranted given what we knew about amyloid then. But as the amyloid hypothesis matured, so did our understanding of it. First we discovered that it's probably not the amyloid aggregates themselves but soluble oligomers that are responsible for neuronal toxicity. Now it has been proposed that amyloid could have a protective antimicrobial role (I myself had an evolutionary speculation on this), in which case targeting it could even be dangerous. The fact remains that there is no proof that amyloid causes AD. It certainly seems to be related in an important way, and many revealing details about it have been uncovered in the last decade, but the proof of principle has been on an increasingly slippery slope, and if anything the picture gets murkier and more fascinating.

An article in the latest issue of C&EN basically says that we are targeting beta-amyloid because, at least for now, we cannot think of anything better to do. It's true that currently our best bet at treating AD lies in interfering with amyloid formation. But since amyloid formation has never been shown to be causative for AD, treatments targeted at it are always going to be something of a shot in the dark. The advantage of targeting amyloid formation, though, as the article says, is that there are lots of points in the mechanism where one can potentially interfere. Two key enzymes responsible for the formation of amyloid are beta- and gamma-secretase, which clip the amyloid precursor protein into apparently toxic fragments. Scores of articles are published every year about new chemical agents targeting these two enzymes, and yet the jungle is thicker than we think.

Gamma-secretase is actually a multiprotein complex whose structure is not known, so finding molecules that inhibit it is like finding a black cat in a dark room. More importantly, it's also involved in a second pathway, the Notch pathway, which is critical in cell signaling. Thus blocking it may lead to one of the classic problems in drug discovery, whereby eliminating a harmful function also eliminates a useful one, often fatally. Beta-secretase is much better studied and its crystal structure has been solved, but it poses a classic structure-based design conundrum: the enzyme's active site is flexible and expansive and can bind many ligands in different subpockets. Developing drugs that block this moving target is admittedly challenging. Throw in the requirements for safety and an ability to cross the blood-brain barrier (BBB), and we have a pickle on our hands that's almost as dense as amyloid plaques.

But the much more serious issue is whether any of these strategies will work at all. If amyloid formation turns out to be a sideshow in AD progression, then all these strategies might ultimately come to naught. Unfortunately the data so far are not promising. The last few years have seen a disappointing string of late-stage failures of amyloid-blocking molecules and antibodies in clinical trials. In some cases the agents have failed to clear the plaques; but tellingly, in others, clearing the plaques did not put the disease in remission. Thus the scores of pharmaceutical companies whose pipelines are focused on amyloid may be chasing an imaginary rabbit. There are serious concerns that scientists may have to go back to the drawing board and start all over again. This would be a huge setback.

However, hope need not be completely lost. One of the most reasonable explanations for the failure of these agents is simply that they arrived too late on the scene, when the disease had progressed too far to be defeated. Perhaps these drugs would have helped had they been administered earlier. Even cancers that can be treated if detected early often fail to be cured in late stages, and AD should be no different. One of the big problems in the field is that detecting AD early is still a challenge, one that is being addressed by many promising neuroimaging initiatives. Perhaps early and focused administration of these drugs could be successful.

Yet it all hinges on putting all your eggs in the amyloid basket. Another protein implicated in AD is tau, which forms tangles in the brain. But as the article says, targeting tau may be even harder than targeting amyloid since it is ubiquitous. Other processes hypothesized to be important for AD include oxidation and other neurotoxic processes, of which amyloid may simply be a side product. And as I was thinking this morning, perhaps no drug will ever be as effective in treating AD as a balanced lifestyle that includes preventive measures; this may especially be true if amyloid is a natural part of our body's physiology. But as of now we have to keep on trying, and the amyloid hypothesis, shaky as it is, seems to be our best bet for making a dent in this devastating disease. At the very least it will lead to novel basic insights. Perhaps it's an indication of how primitive our understanding of the disease is that we continue to cling to amyloid. Under the present circumstances it seems to be the best we can do, but as another Churchillian admonition goes, "it is not enough that we do our best; we must do what's necessary".

New hominid fossil has 9-year-old discoverer

A boy in South Africa has stumbled upon Australopithecus sediba, a possible Homo erectus ancestor that demonstrated both upright-walking traits and the ability to swing among trees with apelike arms. In the hunt for transitional forms in human evolution, this is clearly an important milestone. The report will be published in Science this Friday, and it is already causing a controversy because of its suggestion that A. sediba might be a direct Homo ancestor. This kind of classification controversy over hominid fossils has long animated anthropology and the study of human evolution.

The 9-year-old boy, Matthew Berger, was chasing his dog when he accidentally came across the remarkably well-preserved fossil, close to the site where his father Lee Berger has been excavating for years. The site also contains fossils of carnivores and lies at the bottom of a cliff, suggesting that both hominids and animals might have lost their footing and fallen to their deaths, either when the hominids were chasing the animals or vice versa.

Just one question. Since the boy discovered the fossil, shouldn't the paper bear his name as co-author? Or are 9-year-olds safely ignored for authorship on Science papers? Although he is mentioned in the paper as the discoverer, it seems a little unfair, especially for a field like paleontology where the initial discovery is the most important event.

Will virtual screening ever work?

Virtual screening (VS), wherein a large number of compounds are screened either by docking against a protein target of interest or by similarity searching against a known active, is one of the most popular computational techniques in drug discovery. The goal of VS is to complement high-throughput screening (HTS), and ideally to at least partly substitute for HTS in finding new hits.
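
As an aside, the similarity-searching flavor of VS is simple enough to sketch; here is a minimal example using the RDKit toolkit (assumed available; the SMILES strings and the Tanimoto cutoff are illustrative placeholders):

```python
# Similarity search: rank library compounds by fingerprint similarity
# to a known active. All molecules here are illustrative placeholders.
from rdkit import Chem
from rdkit.Chem import AllChem, DataStructs

def morgan_fp(smiles):
    """Morgan (circular) fingerprint of a molecule given as SMILES."""
    mol = Chem.MolFromSmiles(smiles)
    return AllChem.GetMorganFingerprintAsBitVect(mol, 2, nBits=2048)

known_active = morgan_fp("CCOc1ccc2nc(N)sc2c1")  # placeholder active

library = ["CCOc1ccc2nc(NC(C)=O)sc2c1", "c1ccccc1", "CCN(CC)CCO"]
for smi in library:
    sim = DataStructs.TanimotoSimilarity(known_active, morgan_fp(smi))
    if sim > 0.4:  # arbitrary similarity cutoff
        print(f"{smi}  Tanimoto = {sim:.2f}")
```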

But this goal is still far from being achieved. VS has yet to make a significant contribution to the discovery of a major drug, and typical hit rates range from a few tenths of a percent to perhaps a percent or two. VS has been intensively investigated for more than a decade. What do we know about its limitations, and where do we go from here?

Gisbert Schneider of ETH Zurich offers some thoughts on VS in a recent review. Success in VS ultimately boils down to understanding the detailed structure and dynamics of protein-ligand complexes, a goal from which we are still miles away. We still struggle to realistically include entropy in any calculation, and we are still not completely clear about the role that buried water molecules play in dictating ligand binding. Nor can we yet take allosteric binding properly into account, let alone more complex phenomena like protein-protein interactions. Thus, maybe, as pointed out in a past post and article, the correct question to ask would be the "anti-question": why does VS work at all in spite of this supposedly woeful lack of understanding?

First of all, it is important to know what VS can do well and what it can't. As the article notes, VS is still best for negative selection, that is, for weeding out inactive molecules that are poor binders. One of the goals of VS is also to reproduce the correct protein-bound x-ray conformation of the ligand, and in this endeavor (termed pose prediction) VS seems to be succeeding much better than in the ultimate goal, which is to rank ligands in order of their free energy of binding to a protein. As the article notes, the true binding-energy landscape for a protein might be more of a plateau; there may be a variety of protein-ligand contacts corresponding to a 'good' solution rather than a single global optimum. Plus, one may end up modeling details that are not very relevant to the gist of the ligand-binding event; in such a case productive contacts can be preserved with no great sacrifice of qualitative prediction.
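
Pose prediction success, for what it's worth, is typically judged by heavy-atom RMSD between the docked and crystallographic poses (a common convention is to call a pose "correct" below ~2 Å); here is a minimal sketch, assuming matched atom ordering:

```python
# RMSD between a docked pose and the crystal pose, given two (N, 3)
# arrays of heavy-atom coordinates with identical atom ordering.
import numpy as np

def pose_rmsd(docked, crystal):
    """Root-mean-square deviation of the coordinates, in Angstroms."""
    diff = np.asarray(docked) - np.asarray(crystal)
    return float(np.sqrt((diff ** 2).sum(axis=1).mean()))

# e.g. pose_rmsd(docked_xyz, xtal_xyz) < 2.0  ->  pose reproduced
```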

Nonetheless, tiny details can sometimes radically shift the balance. No wonder that success in VS has depended heavily on the target rather than on the computational algorithm. Nature continues to throw up surprises, as protein entropy, hydrophobic interactions and the subtle behavior of water molecules keep being uncovered as powerful forces operating in particular protein-ligand complexes.

In the end, modeling the dynamic behavior of macromolecules is an absolute must for lending general utility to VS campaigns. In the absence of adequate modeling of entropy, it may be wise from a practical viewpoint to aim for ligand chemotypes whose binding is dominated more by enthalpic effects. It's interesting to note a past set of studies, which I had highlighted, suggesting that it's really enthalpy rather than entropy that is rendered favorable in a drug discovery project as one proceeds from hit to lead.

Finally, the author appeals to fields far and wide to come up with ideas that could be applied to VS and related approaches. It is likely that while incremental improvements will continue to be made through a better understanding of protein-ligand interactions, only a novel idea will revolutionize the field. Such insights could come from unlikely quarters, including complexity theory, nonlinear dynamics, other areas of physics, and even engineering and architecture.

How this might happen is not at all clear, but it definitely calls for more multidisciplinary work and for more scientists from diverse fields to become interested in the problem. After all, VS is fundamentally an optimization problem: locating the optimal ligand energetic minimum in a multidimensional landscape of protein, ligand, ions and solvent. I can't see why any mathematician, physicist or engineer worth his or her salt wouldn't find it exciting.
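
As a toy illustration of why that optimization is hard, consider a rugged test function standing in for a scoring landscape riddled with local minima (the Rastrigin function here is my stand-in, not a real scoring function):

```python
# A rugged "energy landscape" defeats local search; a global optimizer
# such as differential evolution has a fighting chance.
import numpy as np
from scipy.optimize import differential_evolution

def rugged_landscape(x):
    """Rastrigin function: many local minima, global minimum of 0 at origin."""
    x = np.asarray(x)
    return 10 * x.size + np.sum(x**2 - 10 * np.cos(2 * np.pi * x))

# Six dimensions, loosely standing in for ligand translation,
# rotation and torsional degrees of freedom.
result = differential_evolution(rugged_landscape, bounds=[(-5.12, 5.12)] * 6)
print(result.x, result.fun)  # near the origin if the search succeeds
```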

Schneider, G. (2010). Virtual screening: an endless staircase? Nature Reviews Drug Discovery, 9 (4), 273-276. DOI: 10.1038/nrd3139