Field of Science

"A Fred Sanger would not survive today's world of science."

Somehow I missed last year's obituary for double Nobel laureate and bench scientist extraordinaire Fred Sanger by Sydney Brenner in Science. The characteristically provocative Brenner has this to say about a (thankfully) fictional twenty-first century Sanger:
A Fred Sanger would not survive today's world of science. With continuous reporting and appraisals, some committee would note that he published little of import between insulin in 1952 and his first paper on RNA sequencing in 1967 with another long gap until DNA sequencing in 1977. He would be labeled as unproductive, and his modest personal support would be denied. We no longer have a culture that allows individuals to embark on long-term—and what would be considered today extremely risky—projects.

"Designing drugs without chemicals"

As usual Derek beat me to highlighting this rather alarming picture from the October 5, 1981 issue of Fortune magazine that I posted on Twitter yesterday. The image is from an article about computer-aided design and it looks like both a major communications failure (chemical-free drugs, anyone?) and a massive dollop of hype about computer-aided drug design. In fact the article has been cited by drug designers themselves as an example of the famous hype curve, with 1981 representing the peak of inflated expectations.

It's intriguing to consider both how we are still asking pretty much the same questions about computational drug design that we were in 1981 and how much progress we have made on various fronts since then. I posted an extended comment about both these aspects of the issue on Derek's blog so I am just going to copy it below. Bottom line: Many of the fundamental problems are still the same and are unsolved on the general level. However, there has been enough understanding and progress to expect solutions to a wide variety of specific problems in the near future. My own attitude is one of cautious optimism, which in drug discovery is usually the best you can have...

For anyone who wants the exact reference it's the Fortune magazine issue from Oct 5, 1981. The reference is widely considered to be both the time when CADD came to the attention of the masses, as well as a classic lesson in hype. The article itself is really odd since most of it is about computer-aided design in the industrial, construction and aeronautical fields; these are fields where the tools have actually worked exceedingly well. The part about drug design was almost a throwaway with almost no explanation in the text.

Another way to look at the issue is to consider a presentation by Peter Goodford in 1989 (cited in a highly readable perspective by John van Drie, J Comput Aided Mol Des (2007) 21:591–601) in which he laid out the major problems in molecular modeling - things like including water, building homology models, calculating conformational changes, predicting solubility, predicting x-ray conformations etc. What's interesting is that - aside from homology modeling and x-ray conformations - we are struggling with the exact same problems today as we were in the 80s.

That doesn't mean we haven't made any progress though. Far from it in fact. Even though many of these problems are still unsolved on a general level, the number of successful specific examples is on the rise so at some point we should be able to derive a few general principles. In addition we have made a huge amount of progress in understanding the issues, dissecting the various operational factors and in building up a solid database of results. Fields like homology modeling have actually seen very significant advances, although that's as much because of the rise of the PDB, which was enabled through crystallography, as because of accurate sequence comparison and threading algorithms. We are also now aware of the level of validation that our results need to have for everyone to take them seriously. Journals are implementing new standards for reproducibility and knowledge of the right statistical validation techniques is becoming more widespread; as Feynman warned us, hopefully this will stop us from fooling ourselves.

As you mention however, the disproportionate growth of hardware and processing power relative to our understanding of the basic physics of drug-protein interaction has led to an illusion of understanding and control. For instance it's quite true that no amount of simulation time and smart algorithms will help us if the underlying force fields are inaccurate and ill-tested. Thus you can beat every motion out of a protein until the cows come home and you still might not get accurate binding energies. That being said, we also have to realize that every method's success needs to be judged in terms of a particular context and scale. For instance an MD simulation on a GPCR might get some of the conformational details of specific residues wrong but may still help us rationalize large-scale motions that can be compared with experimental parameters. Some of the more unproductive criticism in the field has come from people who have the wrong expectations from a particular method to begin with.

Personally I am quite optimistic about the progress we have made. Computational drug design has actually followed the classic Gartner hype curve, and it's only in the 2000s that we have reached that cherished plateau of realistic expectations. The hope is that at the very least this plateau will have a small but consistent positive slope.

Reproducibility in molecular modeling research

When I was a graduate student, a pet peeve that my advisor and I shared was the extreme unlikelihood of finding structural 3D coordinates of molecules in papers. A typical paper on conformational analysis would claim to derive a solution conformation for a druglike molecule and present a picture of the conformation, but the lack of 3D coordinates or an SDF or PDB file would make detailed inspection of the structure impossible. Sadly this was in line with a more general lack of structural information about molecules in molecular modeling or structural chemistry articles.

Last year Pat Walters (Vertex) wrote an editorial in the Journal of Chemical Information and Modeling which I have wanted to highlight for a while. Walters laments the general lack of reproducibility in the molecular modeling literature and notes that while reproducibility is taken for granted in experimental science, there has been an unusual dearth of it in computational science. He also makes it clear that authors are running out of excuses for not making things like original source code and 3D coordinates available under the pretext of a lack of standard hardware and software platforms.
It is possible that the wide array of computer hardware platforms available 20 years ago would have made supporting a particular code base more difficult. However, over the last 10 years, the world seems to have settled on a small number of hardware platforms, dramatically reducing any sort of support burden. In fact, modern virtual machine technologies have made it almost trivial to install and run software developed in a different computing environment...
When chemical structures are provided, they are typically presented as structure drawings that cannot be readily translated into a machine-readable format. When groups do go to the trouble of redrawing dozens of structures, it is almost inevitable that structure errors will be introduced. Years ago, one could have argued that too many file formats existed and that it was difficult to agree on a common format for distribution. However, over the last 10 years, the community seems to have agreed on SMILES and MDL Mol or SD files as a standard means of distribution. Both of these formats can be easily processed and translated using freely available software such as OpenBabel, CDK, and RDKit. At the current time, there do not seem to be any technical impediments to the distribution of structures and data in electronic form.
Indeed, even experimental scientists are now quite familiar with the SD and SMILES file formats and standard chemical illustration software like ChemDraw is now able to readily handle such formats, so nobody should really claim to be hobbled by the lack of a standard communication language. 

Walters proposes some commonsense guidelines for aiding reproducibility in computational research, foremost of which is to include source code. These guidelines mirror some of the suggestions I had made in a previous post. Unfortunately explicit source code cannot always be made available for proprietary reasons, but as Walters notes, executable versions of this code can still be offered. The other guidelines are also simple and should be required for any computational submissions: mandatory provision of SD files; inclusion of scripts and parameter files for verification; a clear description of the method, preferably with an example; and an emphasis by reviewers on reproducibility aspects of a study.

As has been pointed out by many molecular modelers in the recent past - Ant Nicholls from OpenEye for instance - it's hard to judge even supposedly successful molecular modeling results because the relevant statistical validation has not been done and because there is scant method comparison. At the heart of validation however is simply reproduction, because your opinion is only as good as my processing of your data. The simple guidelines from this editorial should go some way in establishing the right benchmarks.

All active kinase states are similar but...

CDK2 bound to inhibitor staurosporine (PDB code: 1aq1)
…inactive kinase states are different in their own ways. This admittedly awkward rephrasing of Tolstoy's quote came to my mind as I read this new report on differences between active and inactive states of kinases as revealed by differential interactions with inhibitors. 

The paper provides a good indication of why kinases continue to be such enduring objects of interest for drug designers. Ever since imatinib (Gleevec) opened the floodgates to kinase drugs and dispelled the widespread belief that it might well be impossible to hit particular kinases selectively, researchers have realized that kinase inhibitors may work by targeting either the active or the inactive states of their proteins. One intriguing observation emerging from the last few years is that inhibitors targeting inactive, monomeric states of kinases seem to provide better selectivity than those targeting phosphorylated, active states. In this study the authors (from Oxford, Padua and Newcastle) interrogate this phenomenon specifically for cyclin-dependent kinases (CDKs).

CDKs are prime regulators of the cell cycle and their discovery led to a Nobel Prize a few years ago. Over the last few years there have been several attempts to target these key proteins, not surprisingly especially in the oncology field; I worked on a rather successful kinase inhibitor project myself at one point. CDKs are rather typical kinases, having low activity when not bound to a cyclin but becoming active and ready to phosphorylate their substrates when cyclin-bound. What the authors did was to study the binding of a diverse set of inhibitors to a set of CDKs ranging from CDK2 to CDK9 by differential scanning fluorimetry (DSF). DSF can provide information on binding affinity of inhibitors by way of the melting temperature Tm. Measuring the Tm values for binding to different kinases can thus give you an idea of inhibitor selectivity.

The main observation from the study is that there is much more discrimination in the Tm values and therefore the binding affinities of the inhibitors when they are bound to the monomeric state than when they are bound to active, cyclin-bound states. Interestingly the study also finds that the same CDK in different states provides more discrimination in inhibitor binding compared to different CDKs in the same state. There is thus plenty of scope in targeting the same CDK based on its different states. In addition, binding to the monomeric state of specific CDKs will be a better bet for engineering selectivity than binding to the active, cyclin-bound states.
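
To make the Tm comparison concrete, here is a minimal Python sketch of how one might score discrimination from DSF data. Everything in it is an assumption for illustration: the Tm shift values are invented rather than taken from the paper, and using the max-minus-min spread of shifts as a "discrimination" score is my own simple stand-in, not necessarily the authors' analysis.

```python
# Illustrative only: all numbers below are invented, not data from the paper.


def tm_shift(tm_with_inhibitor, tm_apo):
    """Shift in melting temperature (deg C) caused by inhibitor binding."""
    return tm_with_inhibitor - tm_apo


def discrimination(shifts_by_kinase):
    """Spread (max - min) of Tm shifts across kinases for one inhibitor.

    A larger spread means the inhibitor tells the kinases apart more clearly.
    This score is a hypothetical stand-in for a proper selectivity analysis.
    """
    values = list(shifts_by_kinase.values())
    return max(values) - min(values)


# Hypothetical Tm shifts (deg C) for a single inhibitor against three CDKs.
monomeric_shifts = {"CDK2": 8.0, "CDK4": 2.0, "CDK9": 5.0}
cyclin_bound_shifts = {"CDK2": 6.0, "CDK4": 5.0, "CDK9": 6.5}

if __name__ == "__main__":
    print("monomeric spread:", discrimination(monomeric_shifts))
    print("cyclin-bound spread:", discrimination(cyclin_bound_shifts))
```

With these made-up numbers the monomeric spread comes out larger than the cyclin-bound one, which is the qualitative pattern the study reports.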

There are clearly significant structural differences between the inactive and active states of CDKs, mainly related to the movement of the so-called alphaC-helix. This study promisingly indicates that structure-based drug design experts can potentially target such structural features of the binding sites in the inactive states and design more selective drugs.

High-throughput method for detecting intramolecular hydrogen bonds (IHBs)

We have talked about the importance of intramolecular hydrogen bonds (IHBs) as motifs to hide polar surface area and improve the permeability of molecules a couple of times here. The problem is that there's no high-throughput method for detecting these in, say, a library of a thousand molecules. The best method I know is temperature and solvent dependent NMR spectroscopy but that's pretty time consuming. Computational methods can certainly be useful but they can sometimes lead to false positives by overemphasizing hydrogen bonding between groups in proximity.

Now here's a promising new method from Pfizer which could lend itself to high-throughput screening of IHBs. The authors essentially use supercritical fluid chromatography (SFC) to study retention times of molecules with and without IHBs. The solvent system consists of supercritical CO2 spiked with some methanol. They pick a judicious set of matched molecular pairs, each one of which contains a hydrogen bonded and non-hydrogen bonded version. They then look at retention times and find - not surprisingly - that the hydrogen-bonded versions which can hide their polar surface area have lower retention times on the column. 

They corroborate these findings with some detailed NMR studies looking at solvent and temperature dependent chemical shifts. At the beginning, when they plot the retention times vs the topological polar surface area (TPSA) they get a nice correlation for non-IHB compounds. For compounds with IHBs however, the TPSA is an imperfect predictor; what they find in that case is that a new parameter called EPSA, based on the retention time, is a better predictor of IHBs than TPSA.
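
The idea of turning a retention time into a polarity measure can be sketched in a few lines of Python. This is an assumption-laden toy, not the authors' protocol: I assume a straight-line calibration of retention time against TPSA for non-IHB compounds, read an "EPSA-like" value off that line, and flag a compound when this value falls below its TPSA by more than an arbitrary margin. The calibration points, the function names and the 15 Å² margin are all hypothetical.

```python
# Toy sketch: calibration data, names and the margin are all hypothetical.


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def epsa_from_retention(retention_time, slope, intercept):
    """Invert the calibration line to get a polarity value from retention."""
    return (retention_time - intercept) / slope


def looks_like_ihb(tpsa, epsa, margin=15.0):
    """Flag a compound whose EPSA-like value sits well below its TPSA."""
    return tpsa - epsa > margin


# Invented calibration set of non-IHB compounds:
# TPSA in A^2, SFC retention time in minutes.
calib_tpsa = [40.0, 60.0, 80.0, 100.0]
calib_rt = [2.0, 3.0, 4.0, 5.0]
slope, intercept = fit_line(calib_tpsa, calib_rt)

if __name__ == "__main__":
    # An invented compound with TPSA 100 A^2 that elutes early.
    epsa = epsa_from_retention(3.5, slope, intercept)
    print("EPSA-like value:", round(epsa, 1))
    print("possible IHB:", looks_like_ihb(100.0, epsa))
```

The only point of the sketch is the logic: a compound that elutes earlier than its TPSA predicts behaves as if it were less polar, which is what you would expect if an IHB were hiding some of its polar surface.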

This seems to me to be a potentially valuable method for quickly looking at IHBs in relatively large numbers of compounds, especially since we have now started worrying about how to get "large" molecules like macrocycles and peptides across membranes. And I assume every medium or large company will probably have access to SFC, so the technology itself should not pose a measurable barrier. One more handy tool to look at "beyond rule of five" compounds.


A High Throughput Method for the Indirect Detection of Intramolecular Hydrogen Bonding

J. Med. Chem., Just Accepted
Publication Date (Web): March 18, 2014 (Article)
DOI: 10.1021/jm401859b

Why the new NIH guidelines for psychiatric drug testing worry me

Psychiatric drugs have always been a black box. The complexity of the brain has meant that most successful drugs for treating disorders like depression, psychosis and bipolar disorder were discovered by accident and trial and error rather than rational design. There have also been few truly novel classes of these drugs discovered since the 70s (and surely nothing like chlorpromazine which caused a bona fide revolution in the treatment of brain disorders). Psychiatric drugs also challenge the classic paradigm linking a drug to a single defective protein whose activity it blocks or improves. When the molecular mechanism of many psychiatric medicines was studied, it was found that they worked by binding to multiple receptors for neurotransmitters like serotonin and dopamine; in other words psychiatric drugs are "dirty".

There is a running debate over whether a drug needs to be clean or dirty in order to be effective. The debate has been brought into sharp focus by three decades of targeted drug discovery in which selective, (relatively) clean drugs hitting single proteins have led to billion dollar markets and relief for millions of patients. For instance consider captopril which blocks the action of angiotensin-converting enzyme (ACE). For a long time this was the world's best-selling blood pressure-lowering drug. Similar single drug-single protein strategies have been effective for other major diseases like AIDS (HIV protease inhibitors) and heart disease (HMG-CoA reductase inhibitors like Lipitor).

However recent thinking has veered in the direction of drugs that are "selectively non-selective". The logic is actually rather simple. Most biological disorders are modulated by networks of proteins spanning several physiological systems. While some of these are more important than others as drug targets, there are sound reasons to think that targeting a judiciously picked set of proteins rather than just a single one would be more effective in treating a disease. The challenge has been to purposefully engineer this hand-picked target promiscuity into drugs; mostly it is found accidentally and in retrospect, as in the case of the anticancer drug Gleevec. Since we couldn't do this rationally (it's hard even to target a single protein rationally), the approach was simply to test different drugs without worrying about their mechanism and let biology decide which drugs work best. In fact, in the absence of detailed knowledge about the molecular targets of drugs this became a common approach in many disorders, and even today the FDA does not necessarily require proof of mechanism of action for a drug as long as it's shown to be effective and safe. In psychiatry this has been the status quo for a long time.

But now it looks like this approach has run into a wall. Lack of knowledge of the mode of action of psychiatric drugs may have led to accidental discovery, but the NIH thinks that it has also stalled the discovery of original drugs for decades. The agency has now taken note of this and as a recent editorial in Nature indicates, they are now going to require proof of mechanism of action for new psychiatric medicines. The new rules came from an appreciation of ignorance:
"Part of the problem is that, for many people, the available therapies simply do not work, and that situation is unlikely to improve any time soon. By the early 1990s, the pharmaceutical industry had discovered — mostly through luck — a handful of drug classes that today account for most mental-health prescriptions. Then the pipeline ran dry. On close inspection, it was far from clear how the available drugs worked. Our understanding of mental-health disorders, the firms realized, is insufficient to inform drug development."
"For several years, the NIMH has been trying to forge a different approach, and late last month institute director Thomas Insel announced that the agency will no longer fund clinical trials that do not attempt to determine a drug or psychotherapy’s mechanism of action. Without understanding how the brain works, he has long maintained, we cannot hope to know how a therapy works."
This is a pretty significant move on the part of the NIMH since as the article notes, it could mean a funding cut for about half of the clinical trials that the agency is currently supporting. The new rules would require researchers to have much better hypotheses regarding targets or pathways in the brain which they think their therapies are targeting, whether the therapies are aimed at depression or ADD. So basically now you cannot just take a small molecule that seems to make mice happier and pitch it in clinical trials for depression.

Personally I have mixed feelings about this development. It would indeed be quite enlightening to know the mechanism of action for neurological drugs, and I will be the first one to applaud if we could neatly direct therapies at specific patient populations based on known differences in signaling pathways for instance. But the fact remains that our knowledge of the brain is still too primitive and clunky for us to easily formulate target-based hypotheses for new psychiatric drugs. For complex, multifactorial diseases like schizophrenia there are still dozens of hypotheses for mechanisms of action. In addition there is a reasonable argument that it's precisely this obsession with targets and mechanisms of action that has slowed down pharmaceutical development; the thinking is that hitting well-defined targets has been too reductionist, and many times it doesn't work because it disregards the complexities of biology. If you really wanted to discover a new antidepressant, then it really might be better to look at what drug makes mice happier rather than try to design drugs to hit specific protein targets that may or may not be involved in depression.

So yes, I am skeptical about the NIMH's new proposal, not because an understanding of mechanism of action is futile - it's the opposite, in fact - but because our knowledge of drug discovery and design is still not advanced enough for us to formulate and successfully test target-based hypotheses for complex psychiatric disorders. The NIH thinks that its approach is necessary because we haven't found new psychiatric drugs for a while, but in the face of biological ignorance it may instead make the situation worse. I worry that requiring this kind of proposal would simply slow down new psychiatric drug discovery for want of knowledge. Perhaps there is a middle ground in which you require a few trials to demonstrate mechanism of action while allowing the majority to proceed on their own merry way, simply banking on the messy world of biology to give them the answer. Biology is too complex to be held hostage to rational thinking alone.

Cosmological inflation, water in proteins and JFK: The enigma of ignorance

I was immersed in the American Chemical Society's national meeting in Dallas this week, which meant that I could not catch more than wisps of the thrilling announcement from cosmology on Monday that could potentially confirm the prediction of inflation. If this turns out to be right it would indeed be a landmark discovery. My fellow Scientific American blogger John Horgan - who performs the valuable function of being the wet blanket of the network - prudently cautions us to wait for confirmation from the Planck satellite and from other groups before we definitively proclaim a new era in our understanding of the universe. As of now this does look like the real deal though, and physicists must be feeling on top of the world. Especially Andrei Linde whose endearing reaction to a surprise announcement at his front door by a fellow physicist has been captured on video.

But as social media and the airwaves were abuzz with news of this potentially historic discovery, I was sitting in a session devoted to the behavior of water in biological systems, especially in and around proteins. Even now we have little understanding of the ghostly networks of water molecules surrounding molecules that allow them to interact with each other. We have some understanding of the thermodynamic variables that influence this interaction, but as of now we have to dissect these parameters individually on a case-by-case basis; there is still no general algorithm. Our knowledge is hampered by both the lack of an overarching theoretical framework and computational obstacles. The water session was part of a larger one on drug design and discovery. The role of water in influencing the binding of drugs to proteins is only one of the unknowns that we struggle with; there are dozens of other factors – both known unknowns and unknown unknowns – which contribute to the behavior of drugs on a molecular level. We have made some promising advances, but there is clearly a long way to go.

Sitting in these talks, surrounded by physicists and chemists who were struggling to apply their primitive computational tools to drug design, my thoughts about water briefly juxtaposed with the experimental observation of cosmological inflation. And I could not help but think about the still gaping chasms that exist in our understanding of so many different fields.
Let’s put this in perspective: We have now obtained what is likely the first strong experimental evidence for cosmological inflation. This is a spectacular achievement of both experiment and theory. If true no doubt there will be much well deserved celebration, not to mention at least one well deserved Nobel Prize.

But meanwhile, we still cannot design a simple small organic molecule that will bind to a protein involved in a disease, be stable inside the body, show minimum side effects and cure or mitigate the effects of that disease. We are about as far away from this goal as physics was from discovering the Big Bang two hundred years ago, perhaps more. Our cancer drugs are still dirty and most of them cause terrible side effects; we just don’t have a good enough scientific understanding of drug behavior and the human body to minimize these effects. Our knowledge of neurological disorders like Alzheimer’s disease is even more backward. There we don’t even know what the exact causes are, let alone how we can mitigate them. We still waste billions of dollars in designing and testing new drugs in a staggering process of attrition that we would be ashamed of had we known something better. And as I mentioned in my series of posts on challenges in drug discovery, even something as simple as getting drugs past the cell membranes is still an unsolved problem on a general level. So is the general problem of figuring out the energy of binding between two arbitrary molecules. The process of designing medicines, both on a theoretical and an experimental level, is still a process of fits and starts, of Hail Mary passes and failed predictions, of groping and simply lucking out rather than proceeding toward a successful solution with a trajectory even resembling a smooth one. We are swimming in a vast sea of ignorance, floundering because we often simply don’t have enough information.

The fact remains that we may have now mapped the universe from its birth to the present but there are clearly areas of science where our knowledge is primitive, where we constantly fight against sheer ignorance, where we are no more than children playing with wooden toys. In part this is simply about what we call domains of expertise. There are parts of nature which can bend to our equations after intense effort, and yet there are other parts where those equations almost become pointless because we cannot solve them without significant approximations. The main culprit for this failure concerns the limitations of reductionism which we have discussed many times on this blog. Physics can solve the puzzle of inflation but not the conundrum of side effects because the latter is a product of a complicated emergent system, every level of which demands an understanding of fundamental rules at its own level. Physics is as powerless in designing drugs today  - or in understanding the brain for that matter - as it is successful in calculating the magnetic moment of the electron to ten decimal places. Such is the paradox of science; the same tools which allow us to understand the beginnings of the cosmos fail when applied to the well-being of one of its tiniest, most insignificant specks of matter.

Scientists around the world are calling the latest discovery “humbling”. But for me the finding is far more humbling because it illuminates the gap between what we know and how much more we still have to find out. This may well be a historic day for physics and astronomy, but there are other fields like chemistry, neuroscience and medicine where we are struggling even with seemingly elementary problems. As a whole science continues to march on into the great unknown and there remains an infinite regression of work to do. That’s what makes it our impossibly difficult companion, one whose company we will be confronted with for eternity. While we have reached new peaks in some scientific endeavors, we have barely started clearing the underbrush and squinting into the dark forest in others. It is this ignorance that keeps me from feeling too self-congratulatory as a member of the human species whenever a major discovery like this is announced. And it is this ignorance that makes our world an open world, a world without end.

A comparison, however, provided a silver lining to this feeling of lack of control. Catching a break in the day’s events I strolled down Houston Street after lunch. Almost fifty-one years ago a car drove down this street and then slowed down for the sharp left turn onto Elm Street. At the intersection stood the Texas School Book Depository. Three shots rang out, a young President’s life was snuffed out and the river of American history changed course forever. All because of the rash actions of a confused and deranged 24-year-old former marine. Looking out of the sixth floor window I could see how a good marksman could easily take the shot. What really strikes you however is the perfect ordinariness of the location, a location made extraordinary in space and time because of a freak accident of history. It compels us to confront our utter helplessness in the face of history’s random acts. Oswald got lucky and left us floundering in the maelstrom of misfortune.
But a cosmic perspective may help to assuage our incomprehension and provide salve for our wounds. Carl Sagan once said that if you want to make an apple pie from scratch, you have to first invent the universe. That fateful bullet on November 22, 1963 was the result of an infinitude of events whose reality was energized into potential existence by the same inflation that we are now exploring through ground- and space-based telescopes and the ingenuity of our scribblings. There is something reassuring in the fact that while we still do not understand the enigma of human thought and feeling that dictated the trajectory of that bullet, we can now at least understand where it all started. That has to count for something.