The golden age of computational materials science gives me a disturbing feeling of déjà vu


Graphene, a wonder material which was made by scientists using a version of Scotch tape (Image: Wikipedia)
I was a mere toddler in the early 1980s when they announced the “golden age of computational drug design”. Now I may have been a toddler, but I often hear stories about the impending golden age from misty-eyed veterans in the field. A cover story in Fortune magazine (which I can never seem to find online) announced that pharmaceutical scientists were now designing drugs on computers. The idea was that once you fed in the parameters for how a drug behaves in the human body, the computer would simply spit out the answer. The only roadblock was computing power, a limitation that advances in hardware and software would eventually remove. Give it enough time, the article seemed to indicate, and the white-coat-clad laboratory scientist might become a historical curiosity. The future looked rosy and full of promise.

Fast forward to the twilight days of 2013. We are still awaiting the golden age of computational drug design. The preponderance of drug discovery and design is still enabled by white-coat-clad laboratory scientists. Now let’s be clear about one thing: the computational side of the field has seen enormous advances since the 1980s, and it continues to thrive. There will almost certainly be a time when its contributions to drug design are seen as substantial. Most drugs perform their magic in living systems by binding to specific proteins, and computational drug design is now competent enough to predict, with a fair degree of accuracy, the structure and orientation of a drug molecule bound in a protein’s deep binding pocket. Computational scientists can now suggest useful modifications to a drug’s structure which laboratory chemists can make to improve multiple properties, including solubility, diffusivity across cell membranes, activity inside cells and the ability to avoid getting chewed up by enzymes in the body. You would be hard pressed to find a drug design project where computational modeling does not play at least a modest role. The awarding of this year’s Nobel Prize in chemistry to computational chemists is only one indication of how far the field has advanced.

And yet it seems that computational drug designers are facing exactly the same basic challenges they faced in the 80s. They have certainly made progress in understanding these challenges, but robust prediction is still a thing of the future. The most significant questions they are dealing with are the same ones they dealt with in the 80s: How do you account for water in a protein-drug system? How do you calculate entropies? How do you predict the folded structure of a protein? How do you calculate the different structures a drug molecule adopts in the aqueous milieu of the body? How do you modify a drug compound so that cells – which have evolved to resist the intrusion of foreign molecules – don’t toss it right out? How do you predict the absolute value of the binding energy between drug and protein? And scientists are grappling with these questions in spite of tremendous, orders-of-magnitude improvements in software and hardware.

I say all this because a very similar cover story about computational materials design in this month's Scientific American evokes disturbing feelings of déjà vu in me. The article is written by a pair of scientists who enthusiastically talk about a project whose goal is to tabulate calculated properties of materials for every conceivable application: from lightweight alloys in cars to new materials for solar cells to thermoelectric materials that would convert dissipated heat into electricity. The authors are confident that we are now approaching a golden age of computational materials design in which high-throughput prediction of materials properties will allow us to at least speed up the making of novel materials. In their words:

"We can now use a century of progress in physics and computing to move beyond the Edisonian process (of trial and error). The exponential growth of computer-processing power, combined with work done in the 1960s and 1970s by Walter Kohn and the late John Pople, who developed simplified but accurate solutions to the equations of quantum mechanics, has made it possible to design new materials from scratch using supercomputers and first-principle physics. The technique is called high-throughput computational materials design, and the idea is simple: use supercomputers to virtually study hundreds or thousands of chemical compounds at a time, quickly and efficiently looking for the best building blocks for a new material, be it a battery electrode, a metal alloy or a new type of semiconductor."

It certainly sounds optimistic. However, the article seems long on possibilities and short on substance and shortcomings. This is probably because it occupies only three pages in the magazine; I think it deserved far more space, especially for a cover article. As it stands, the piece reads more Pollyannaish than cautiously optimistic.

I applaud the efforts to build a database of computed materials properties, but I am far more pessimistic about how well this knowledge can be used to design new materials in the near future. I am not a materials scientist, but I think some of the problems the computational end of the discipline faces are similar to those faced by any computational chemist. As the article notes, the principal tools used for materials design are quantum mechanics-based methods developed mainly by John Pople and Walter Kohn in the 1960s and 1970s, work that earned the duo the 1998 chemistry Nobel Prize. Since then these methods have been coded into dozens of efficient, user-friendly computer programs. Yet these methods – based as they are on first principles – are notoriously slow. Even with heavy computing power it can take several days to do a detailed quantum mechanical calculation on an atomic lattice. With materials involving hundreds of atoms and extended frameworks it would take much longer.

I am especially unconvinced that these methods will allow the kind of fast, high-throughput calculations that could substitute for experimental trial and error. One reason I feel pessimistic is the great difficulty of predicting crystal structures. Here’s the problem: the properties of a material depend on the geometric arrangement of its atoms in a well-defined crystal lattice. Depending on the conditions (temperature, pressure, solvent), a material can crystallize in dozens of possible structures, which makes the exercise of assuming “a” crystal structure futile. What’s worse for computer calculations is that the energy differences between these structures may be tiny – within the error limits of many theoretical techniques.

At the same time, the wrong crystal structure could give us the wrong properties. The challenge for any computational prediction method is therefore twofold: first, it has to enumerate the various crystal forms that a given material can adopt (and sometimes this number runs into the hundreds). Second, even if it manages this listing, it has to rank these crystal forms in order of energy and predict which one will be the most stable. Since the energy differences between the various forms are tiny, this would be a steep challenge even for a detailed calculation on a single material. Factoring temperature, pressure and solvent into the calculation would make it even more computationally expensive. To me, doing all this in a high-throughput manner for dozens or hundreds of materials looks like an endeavor fraught with delays and errors. It would certainly make for an extremely valuable intellectual contribution that advances the field, but I cannot see how we can be on the verge of practically and cheaply using such calculations to design complex new materials at a pace that at least equals experiment.
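
To make the ranking step concrete, here is a minimal sketch in Python of just the final bookkeeping: Boltzmann-weighting a set of already-computed polymorph energies at a given temperature. The structure names and energies below are invented for illustration; the genuinely hard part – generating the candidate structures and computing their energies to sub-kcal/mol accuracy – is exactly what the expensive first-principles calculations are needed for.

```python
import math

# Hypothetical relative lattice energies (kcal/mol) for candidate polymorphs of a
# single material, as they might come out of a quantum chemistry code. The values
# are invented; note how small the gaps are.
candidates = {
    "alpha": 0.00,
    "beta":  0.35,
    "gamma": 0.60,
    "delta": 1.10,
}

R = 1.987e-3   # gas constant, kcal/(mol*K)
T = 298.15     # temperature, K

# Boltzmann weights relative to the most stable form
weights = {name: math.exp(-energy / (R * T)) for name, energy in candidates.items()}
Z = sum(weights.values())

for name, energy in sorted(candidates.items(), key=lambda kv: kv[1]):
    print(f"{name}: {energy:+.2f} kcal/mol, population ~ {weights[name] / Z:.1%}")
```

With gaps of only a few tenths of a kcal/mol, several forms remain appreciably populated at room temperature, so an error of half a kcal/mol in the computed energies can scramble the predicted ranking entirely.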

The second problem I foresee is a common one that almost any scientist or engineer will recognize: the multi-parameter optimization problem. We in the drug design field face it all the time; not only do we need to optimize the activity of a drug inside cells, but we also need to simultaneously optimize other key properties like stability, toxicity and solubility and – at a higher level – even non-scientific properties like price and availability of the starting materials for making the drug. Optimizing each one of these properties would be an uphill battle on its own, but optimizing them all at once (or at least two or more at a time) strains the intellect and resources of the best scientists and computers. I assume that new materials also have to satisfy multiple criteria of this kind; for instance, a new alloy for cars would have to be lightweight, environmentally benign, stable to heat and light, and inexpensive. One of the principal reasons drug discovery is still so hard is this multi-parameter optimization problem, and I cannot see how the situation would be different for materials science at the computational level, especially if the majority of techniques involve expensive quantum mechanical calculations.
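
For what it's worth, here is a sketch of one common (and admittedly crude) way to wrestle with the problem: collapsing several properties into a single weighted score. All of the property names, targets, tolerances and weights below are made up for illustration; in a real project, arguing about exactly these choices is a large part of the difficulty.

```python
# A toy multi-parameter score for a hypothetical alloy candidate. Each property is
# mapped onto a 0-1 "desirability" relative to an invented target and tolerance,
# and the desirabilities are then combined with weights.

def desirability(value, target, tolerance):
    """1.0 at the target value, decaying linearly to 0 as |value - target| grows."""
    return max(0.0, 1.0 - abs(value - target) / tolerance)

def composite_score(properties, targets, weights):
    total = sum(weights.values())
    return sum(
        weights[name] * desirability(properties[name], target, tol)
        for name, (target, tol) in targets.items()
    ) / total

# Invented (target, tolerance) pairs and weights for three properties.
targets = {"density_g_cm3": (2.7, 2.0), "cost_usd_per_kg": (2.0, 10.0), "max_temp_C": (600, 400)}
weights = {"density_g_cm3": 0.4, "cost_usd_per_kg": 0.3, "max_temp_C": 0.3}

candidate = {"density_g_cm3": 3.1, "cost_usd_per_kg": 5.5, "max_temp_C": 520}
print(f"composite score: {composite_score(candidate, targets, weights):.2f}")
```

The sketch also exposes the deeper issue: the composite score is only as good as the property predictions that feed it, and for materials those predictions are precisely the expensive quantum calculations discussed above.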

One way in which the calculations can be sped up – and this is something I would have loved to read about in the article – is by using cheap, classical mechanics-based parameterized methods. In these methods you simplify the problem by using parameters from experiment in a classical model that implicitly includes quantum effects by way of the experimentally determined variables. While these calculations are cheap they can also carry larger errors, although for simpler systems they work almost as well as detailed quantum calculations. It seems to me that the database of properties being built could be shored up with experimental values and used to construct parameterized, cheaper models that could actually be employed in a high-throughput capacity.
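
As a concrete, deliberately simple example of what a "parameterized classical model" looks like, here is a sketch of a Lennard-Jones pair potential, the kind of term that sits inside classical force fields. The parameter values are roughly those commonly quoted for argon; in a real force field they would be fitted to experimental or high-level quantum data, which is exactly the sort of shoring up with experimental values I have in mind.

```python
def lennard_jones(r_nm, epsilon_kj_mol, sigma_nm):
    """Classical pair interaction energy: dispersion and short-range repulsion are
    folded into two fitted parameters instead of being computed from first principles."""
    sr6 = (sigma_nm / r_nm) ** 6
    return 4.0 * epsilon_kj_mol * (sr6 ** 2 - sr6)

# Roughly the textbook parameters for argon (epsilon in kJ/mol, sigma in nm).
epsilon, sigma = 0.996, 0.3405

# Evaluating such a potential costs essentially nothing compared with a quantum
# calculation, which is what makes high-throughput screening with fitted models thinkable.
for r in (0.34, 0.38, 0.42, 0.50, 0.70):
    print(f"r = {r:.2f} nm   E = {lennard_jones(r, epsilon, sigma):+.3f} kJ/mol")
```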

Does all this make me pessimistic about the future of computational materials design? Not at all; we are just getting started, and we need an influx of talented scientists in this area. Computational drug design followed the classic technology curve, with inflated expectations followed by a valley of disappointment culminating in a plateau of realistic assessment. Perhaps something similar will happen for computational materials design. But I think it’s a mistake to say that we are entering the golden age. We are probably testing the waters right now and getting ready for a satisfying dip. And that is what it should be called, a dip, not a successful swim across the materials channel. I wish those who take the plunge the best of luck.

First published on the Scientific American Blog Network.

Note: Profs. Chris Cramer and Alan Aspuru-Guzik have pointed me to some successful examples of the paradigm. It's clear that certain kinds of problems (especially involving MOFs) are more accessible to the approach than others.

Enthalpy-entropy compensation and water networks

Enthalpy-entropy compensation (EEC) is an endlessly interesting phenomenon; it's the kind of topic that makes scientists either roll up their sleeves for a good fight or slowly walk away from the table. The basic idea is simple: when you build new chemical functionality into a drug molecule to interact better with a protein (improving ∆H), you also tie the molecule down and constrain its movement (worsening ∆S). Because the two terms oppose each other in ∆G = ∆H − T∆S, the improvement is often not reflected in the overall ∆G of binding, which stays roughly the same.
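
To see the arithmetic of compensation, here is a toy calculation with invented numbers showing how a sizable gain in ∆H can be almost exactly cancelled by a less favorable entropy term, leaving the ∆G of binding essentially unchanged.

```python
T = 298.15  # temperature, K

# Invented binding thermodynamics for a parent ligand and an analog carrying an
# extra protein-contacting group: dH in kcal/mol, dS in kcal/(mol*K).
parent = {"dH": -5.0, "dS": 0.010}   # modest enthalpy, favorable entropy
analog = {"dH": -8.0, "dS": 0.000}   # better enthalpy, entropy lost to rigidification

for name, t in (("parent", parent), ("analog", analog)):
    dG = t["dH"] - T * t["dS"]
    print(f"{name}: dH = {t['dH']:+.1f} kcal/mol, T*dS = {T * t['dS']:+.1f} kcal/mol, dG = {dG:+.1f} kcal/mol")
```

With these made-up numbers the analog gains 3 kcal/mol of enthalpy but loses almost exactly that much in the entropy term, so the two ∆Gs (and hence the measured affinities) end up nearly identical.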

Scientists have been going back and forth over the causes of EEC and now there's a new paper from George Whitesides's group at Harvard, Schrodinger and Brookhaven which sheds some light on one possible, usually neglected factor: the subtle changes in the thermodynamics of the network of water molecules surrounding a ligand. These are not the water molecules displaced by the ligand from the protein pocket (which have received considerable attention over the last decade or so) but the ones on the surface that contact the ligand on the outside.

The paper is based on a workhorse protein system that Whitesides's group has been working on for a while now – carbonic anhydrase. The protein is stable, relatively rigid, biochemically well-studied, amply expressed and easily crystallized both by itself and with several ligands; all features that make it a good model system for looking at the thermodynamics of binding. Whitesides's group has found that ligands with different fluorination patterns bind to the protein with very similar ∆Gs of binding. This is unexpected, since you would expect additional fluorines to give you better entropy from the hydrophobic effect.

To explore the phenomenon the authors use two techniques: x-ray crystallography and molecular dynamics simulations. The former provides information on intermolecular interactions, while the latter provides information on the thermodynamics of the surrounding water molecules – more specifically, on their enthalpy and entropy. The MD and thermodynamic calculations are done using the WaterMap tool from Schrodinger.

From the crystal structures the authors find that the enthalpy of binding can actually become unfavorable with the added fluorines, as a result of repulsive interactions with a few oxygens in the protein. Since ∆G stays the same, this means that the unfavorable ∆H in the active site might be compensated for by ∆H changes in the water network surrounding the ligand, along with corresponding ∆S adjustments. In the picture above, water molecules with more favorable ∆H values are colored green while the unfavorable ones are colored red.

Notice the difference between the three difluorinated analogs: the 4,6 analog has the most green waters, the 5,6 analog has the most red waters (and an extra red water compared to the others), and the 6,7 analog is somewhere in between. The gradation of unfavorable water molecules around the three compounds tracks well with the enthalpies extracted from isothermal titration calorimetry (ITC), and the entropies duly compensate. The thermodynamics of surface water molecules therefore certainly seem to be one possible contributor to EEC. It's also worth noting that the behavior of the water molecules corresponds to what you would call an "enthalpy-driven hydrophobic effect".
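
For readers curious about what "tracking" the enthalpy of a water network looks like in practice, here is a rough sketch of the kind of bookkeeping involved: summing per-hydration-site enthalpies around each analog and ranking the compounds by the total. The site values below are invented and this is neither WaterMap's output format nor the paper's data – just the general shape of the comparison.

```python
# Invented per-hydration-site enthalpies (kcal/mol) around three hypothetical
# difluorinated analogs: negative sites correspond to the "green" (favorable)
# waters, positive sites to the "red" (unfavorable) ones.
water_sites = {
    "4,6-analog": [-1.2, -0.8, -0.5, 0.3],
    "6,7-analog": [-0.9, -0.4, 0.4, 0.6],
    "5,6-analog": [-0.6, 0.5, 0.8, 1.1, 0.9],   # note the extra unfavorable site
}

# Rank the analogs by the net enthalpy of their surrounding water network.
for name, sites in sorted(water_sites.items(), key=lambda kv: sum(kv[1])):
    n_red = sum(1 for dh in sites if dh > 0)
    print(f"{name}: net water dH = {sum(sites):+.1f} kcal/mol, unfavorable sites = {n_red}")
```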

While we are neglecting second-order effects, and while it's still hard to get quantitative agreement down to a kcal/mol or so, I like the fact that we can eyeball such figures and at least qualitatively rank cases by favorable and unfavorable enthalpies. I also find it promising that we can actually do this kind of thing for surface water molecules that are part of a network; ten years ago most people might have thrown up their hands when asked to do this. Of course not every drug-protein binding case is going to be dictated by surface water behavior, but the fact that we can at least get a semi-quantitative look at this important factor is, in my opinion, a valuable stepping stone toward the future.

Barry Werth on the cost of new drugs

Barry Werth, who wrote the swashbuckling book about the creation of Vertex (sequel out in February), has an excellent piece (also highlighted by @Chemjobber) in the MIT Technology Review about the cost of new drugs. He asks the question that any pharmaceutical scientist who tells a layperson what he or she does for a living usually hears first: Why do drugs cost so much? (The next question is usually "Why do drugs have so many side effects?")

Werth compares two drugs to illustrate the strange world of drug pricing and the moral dilemma that riddles that world: Vertex's cystic fibrosis drug Kalydeco and Regeneron/Sanofi's cancer drug Zaltrap. Here's the problem: Kalydeco is a breakthrough medicine that has breathed completely new life into the treatment of a disease for which no effective therapies existed before. It costs about $300K a year. Zaltrap increases the median lifespan of patients with advanced colorectal cancer by 1.5 months. It costs $11K a month. Is it any surprise that people are so critical of the pharmaceutical industry? I would be too, if I were constantly bombarded by news of "breakthroughs" like Zaltrap.

The reason the whole thing seems so absurd is that the actual price of a drug often sounds almost completely arbitrary. As Werth notes, Zaltrap caused an outrage among patients and physicians, prompting a group of doctors from Memorial Sloan Kettering Hospital to protest the price of the drug in an unprecedented NYT op-ed. In response, Sanofi cut the price of the drug in half through rebates and other schemes. If a drug company can reduce the price of a medication by 50% just like that, without major catastrophe, it really makes you ask what the "true" price of the drug is.

In any case, the whole piece is definitely worth a read, especially in an age when drugs are paradoxically going to become more effective – even as they are targeted toward select, small patient subpopulations – and simultaneously more expensive.