The Curious Wavefunction: Molecular dynamics: I have a bad feeling about this.

Molecular dynamics: I have a bad feeling about this.

By Wavefunction on Thursday, October 31, 2013

Computer models of chemical and biological systems are not reality; rather they are what I call “invitations to reality”. They provide guidance to experimentalists to try out certain experiments, test certain techniques. They are suggestive, not factual. However as any good modeler and chagrined experimentalist knows, it’s not hard to mistake models for reality, especially when they look seductive and are replete with bells and whistles.

This was one of the many excellent points that Anthony Nicholls made in his lunch critique of molecular dynamics yesterday at the offices of OpenEye scientific software in Cambridge, MA. In his talks and papers Anthony has offered not just sound technical criticism but also a rare philosophical and historical perspective. He has also emerged as one of the sharpest critics of molecular dynamics in the last few years, so we were all eager to hear what it exactly is about the method that rubs him the wrong way. Many of his friends and colleagues call him ‘Ant’, so that’s what I will do here.

Here’s some background for a general audience: Molecular dynamics (MD) is a computational technique that is used to simulate the motion of atoms and molecules. It is used extensively in all kinds of fields, from biochemistry to materials science. Most MD employed in research is classical MD, based on Newton’s laws of motion. We know that the atomic world is inherently quantum mechanical in nature, but it turns out we can get away to a remarkable extent using classical mechanics as an approximation. Over the last few years user-friendly software and advances in computing hardware have brought MD to the masses, so that even non-specialists can now run MD calculations using brightly colored and accessible graphical user interfaces and desktop computers. A leader in this development is David E. Shaw, creator of the famed D E Shaw hedge fund who has made the admirable decision to spend all his time (and a good deal of his money) developing MD software and hardware for biochemistry and drug discovery.

Ant’s 2-hour talk was very comprehensive and enjoyable, covering several diverse topics including a few crucial ones from the philosophy of science.

It would be too much to describe everything that Ant said and I do hope OpenEye puts the video up on their website. I think it would be most convenient to summarize his main points here.

MD is not a useless technique but it’s not held up to the same standards as other techniques, and therefore its true utility is at best unknown: Over the last few years the modeling community has done a lot of brainstorming about the use of appropriate statistical and benchmarking methods to evaluate computational techniques. Statistical tests have thus emerged for many methods, including docking, shape-based screening, protein-based virtual screening and quantum chemical calculations. Such tests are however manifestly lacking for molecular dynamics. As Ant pointed out, almost all statements in support of MD are anecdotal and uncontrolled. There are almost no follow-up studies.

MD can accomplish in days what other techniques can achieve in seconds or hours: No matter how many computational resources you throw at it, the fact remains (and will likely always remain) that MD is a relatively slow technique. Ant pointed out cases where simpler techniques gave the same results as MD but in much lesser time. I think this reveals a more general caveat; that before looking for complicated explanations for any phenomenon in drug discovery or biology (potency, selectivity, differences in assay behavior etc.), one must look for simple ones. For instance is there a simple physicochemical property like molecular weight, logP, number of rotatable bonds or charge that correlates with the observed effect? If there is one, why run a simulation lasting hours or days to get the same result?

A case in point is the recent Nature paper by D. E. Shaw’s group described by Derek on his blog. Ant brought our attention to the Supporting Information which says that they got the same result for the ligand pose using docking which they got using MD, a difference translating to a simulation time of days vs seconds. In addition they saw a protein pocket expansion in the dynamics simulation whose validity was tested by synthesizing one compound. That they prospectively tested the simulation is a good thing, but one compound? Does that prove that MD is predictive for their system?

MD can look and feel “real” and seductive: This objection really applies to all models which by definition are not real. Sure, they incorporate some elements of reality but they also leave many others out. They simplify, use fudge factors and parameters and often neglect outliers. This is a not a strike against models since they are trying to model some complex reality and they cannot do this without simplification, but it does indicate reasons for being careful when interpreting their results. However I agree that MD is in a special category since it can generate very impressive movies that emerge from simulations run on special purpose machines, supercomputers or GPUs for days or months at a time. Here’s one that looks particularly impressive and denotes a drug molecule successfully “finding” its binding site on a protein.

This apparently awesome power of computing power and graphical software brought to bear on an important problem often makes MD sound way more important than what it is. The really damning thing though may be that shimmering protein on your screen. It’s very easy for non-computational chemists to believe that that is how the proteins in our body actually move. It’s easy to believe that you are actually seeing the physics of protein motion being simulated, down to the level of individual atoms.

But none of this is really true. Like many other molecular models what you are seeing in front of you is a model, replete with approximations and error bars. As Ant pointed out, it’s almost impossible to get real variables like statistical mechanical partition functions, let alone numbers from experiment, out of such simulations. Another thing that’s perpetually forgotten is that in the real world, proteins are not isolated but are tightly clustered together with other proteins, ions, small molecules and a dense blanket of water. Except perhaps for the water (and poorly understood water at that), we are ignoring all of this when we are running the simulation. There are other problems in real systems, like thermal averaging and non-ergodicity which physicists would appreciate. And of course, let’s not even get started on the force fields, the engines at the heart of almost every simulation technique that are consistently shown to be imperfect. No, the picture that you see in a molecular dynamics movie is a shadow of its “real” counterpart, even if there is some agreement with experiment. At the very least this means you should keep your jaw from dropping every time you see such a movie.

Using jargon, movies and the illusion of reality, MD oversells itself to the public and to journals: Ultimately it’s not possible to discuss the science behind MD without alluding to the sociological factors responsible for its perception. The fact is that top journals like Nature or Science are very impressed when they see a simulation shepherded by a team led by Big Name Scientist being run for days using enough computing power to fly a jetfighter. They are even more impressed when they see movies that apparently mirror the actual motion of proteins. Journals are only human, and they cannot be entirely faulted for buying into seductive images. But the unfortunate consequence of this is that MD gets oversold. Because it seems so real, because simulations that are run for days must undoubtedly be serious stuff because they have been run for days, because their results are published in prestigious journals like Nature, therefore it all must be important stuff. This belief is however misplaced.

What’s the take home message here? What was strange in one sense was that although I agreed with almost everything that Ant said, it would not really affect the way I personally use MD in my day-to-day to work, and I suspect this is going to be the case for most sane modelers. For me MD is a tool, just like any other. When it works I use its results, when it doesn’t I move on and use another tool. In addition there are really no other ways to capture protein and ligand motion. I think Ant’s talk is best directed at the high priests of MD and their followers, people who either hype MD or think that it is somehow orders of magnitude better than other modeling techniques. I agree that we should all band together against the exhortations of MD zealots.

I am however in the camp of modelers who have always used MD as an idea generator, a qualitative tool that goads me into constructing hypothesis and making suggestions to experimentalists. After all the goal of the trade I am involved in is not just ideas but products. I do care about scientific rigor and completeness as much as the other person, but the truth is that you won’t get too far in the business I am involved in if you constantly keep worrying about scientific rigor rather than the utility – even if it’s occasional – of the tools we are using. And this applies to theoretical as well as experimental tools; when was the last time my synthetic chemistry friends used a time-tested reaction on a complex natural product and got the answer they expected? If we think MD is anecdotal, we should also admit that most other drug design strategies are anecdotal too. In fact we shouldn’t expect it to be otherwise. In a field where the validity of ideas is always being tested against a notoriously complex biological system whose workings we don’t understand and where the real goal is to get a useful product, even occasional successes are treasured and imperfect methods are constantly embraced.

Nonetheless, in good conscience my heart is in Ant’s camp even if my head protests a bit. The sound practice of science demands that every method be duplicated, extensively validated, compared with other methods, benchmarked and quantified to the best of our abilities if we want to make it part of our standard tool kit. This has manifestly not happened with MD. It’s the only way that we can make such methods predictive. In fact it’s part of a paradigm which as Ant pointed out goes back to the time of Galileo. If a method is not consistently predictive it does not mean it is useless, but it does mean that there is much in it that needs to be refined. Just because it can work even when it’s not quantitative does not mean trying to make it quantitative won’t help. As Ant concluded, this can happen when the community comes together to compare and duplicate results from their simulations, when it devotes resources to performing the kind of simple benchmarking experiments that would help make sense of complicated results, when theorists and experimentalists both work together to achieve the kinds of basic goals that have made science such a successful enterprise for five hundred years.

19 comments:

Chris9:09 PM, October 31, 2013
Several of these arguments are actually directed at the "field / practitioners" and not the technique. I agree that molecular dynamics studies should be held to higher statistical standards, but with growing computational resources and limited grad student manpower, molecular dynamics studies will invariably improve in statistics.

With respect to the technique, it's misguided to base a critique on molecular dynamics on the application to "binding studies" alone. There are many examples of molecular dynamics applications for which no simpler computational technique can be formulated and for which wet-lab experiments can provide little information.
ReplyDelete
Replies
Anonymous1:41 AM, November 01, 2013
I feel like many points of critics for MD are more related to limitations of modern force fields than to the actual concept of MD.
ReplyDelete
Replies
Anonymous8:45 AM, November 01, 2013
Since the author has taken the ethically dubious route of omitting any statement about potential conflict of interest, I'll add it here: OpenEye is a for-profit company that markets a line of computer products as alternatives to extensive molecular simulation. The financial success of OpenEye relies on researchers buying into their vision and then buying their software.

Many of the criticisms in the article above are warranted, but the data is cherry picked and there is no attempt at statistical rigour while criticizing a field for lack of statistical rigour. To use the original authors words, this article is anecdotal and uncontrolled.
ReplyDelete
Replies
Anonymous7:11 PM, November 02, 2013
Yet another shallow marketing exercise from Nicholls. As someone said above, there is a clear conflict of interest and this "lecture" really sounds like a buy-my-products drill.

Whoever is talking about the limitations of MD without mentioning the key problem of sampling is not an expert, just a commentator. It is hard to understand how so many "scientists" are buying this. No doubt there are a lots of problems with MD, but one needs a deep understanding to solve them. You won't find this here.
ReplyDelete
Replies
Peter Kenny10:14 PM, November 02, 2013
My gripe with MD is when it is used in a manner that could be described as either 'qualitative' or 'graphical'. For example, somebody runs a simulation of something really huge and then shows a movie of an animation. It could be really great or it could be complete horse shit but how can I tell? Similarly, when an MD simulation is used to 'confirm' a docked pose.

Sampling was mentioned by the previous commentator and this is an important consideration if attempting to do free energy calculations (which generate quantitative and testable predictions). A key question here is whether MD samples more or less effectively than Monte Carlo (may be system and even force field dependent). One can debate as to whether free energy calculations with protein flexibility are worth their computational expense (I keep an open mind here) but you are going to need effective sampling in order to apply the statistical mechanics. Force fields will always be an issue and another of my gripes is the use of atom centered charges (but don't let that distract us from this discussion).

I also have an issue with people talking about dynamic effects on affinity. Free energies of binding are ensemble averages and you should get the same answer whether you sample with MD or MC.
ReplyDelete
Replies
Anonymous12:09 PM, November 07, 2013
Well, I am an academic molecular modeler and I am using MD quite often, almost not exclusively, and I try my best not to use MD when simple and more effective techniques are available. When I am asked to perform some MD I always ask for solid experimental knowledge first. I prefer to be guided by experimental facts when I start the modeling part, and I also want data to validate my results, am I annoying. I constantly remind colleagues full of good intents of the limitations and approximations of computational techniques. I am not simply aware of those, I want the people I work with to understand them too, at least a little bit. As a result of living by the golden rule of being a good modeler according to Derek Rowe, I am just causing myself a lot of trouble.

Everybody around me just seem to assume that
(1) MD is the golden standard in molecular modeling
(2) "Blind" MD provides interesting structural data provided it is run long enough. Nothing yet? Just run the simulation longer, "something will happen" - the sad thing about this is that it does sometimes... persuading everybody that it should, all the time, without fail
(3) Validated experimental data, even if perfectly good as it is, provided it is vaguely structurally-related, will just get "better" with the help of some MD. At best, I am asked to make complicated modeling jobs just to find out what has been proven already. At worst, I am suggested to perform a biased, useless simulation by injecting the results into the input... (who says MD is not predictive enough?)
(4) The above should be enough for me to publish "pure" molecular modeling papers in decent journals...

The biggest progress in the molecular modeling field in the recent years as percieved by most people I work with is -from what I have experienced- the implementation of ambient occlusion rendering in VMD... The pictures I can provide look so much nicer since then...

By being so stubbornly critical of my own area of expertise I am just eventually persuading others that I am being lazy. Should promise wonders with molecular modeling, and expecially molecular dynamics, this would make my life easier. All the nonsense like "this movie made from the MD trajectory shows the actual motion of the system", "ok, I will compute the free energy of binding of your very original molecule to this target, this is really easy", "sure, you can make a reliable protein model yourself, just paste the sequence to a Web server, it works without fail so this is good" is just what most people want to hear. I really do not like that.

I am not sure Antony Nichols is as good for voicing a healty criticism for MD as Gerard Kleywegt had been for crystallography, even if Ant seems to coin perfectly the biggest issue we are currently facing (forcefields: their lack of sound validation, their obscure parameterization protocols...). As with many in silico techniques MD is consistently useful (and sometimes, spectacularly so) provided simulations are prepared and analyzed with the help validated experimental facts. "MD zealots" are overselling it? Sure. Please just notice that those zealots are often not the MD specialists themselves, most of whom I know would agree with most of what Ant says.

Scientists concerned with the quality of drug design research in academia should better take care of the "omics zealots" urgently. Now I can live with all the misconceptions related to molecular modeling and MD around me, but I do not want to be told one day "just collect all biochemical data from every possible source, throw all the numbers in a big database, then code a computer algorithm will find out the needle in the haystack".
ReplyDelete
Replies
Anonymous11:35 AM, December 11, 2013
Your criticisms (1) - (4) made me cringe... they are too true!

Here are some neat sampling techniques called replica exchange and transition path sampling.
http://www.ncbi.nlm.nih.gov/pubmed/16957325
http://en.wikipedia.org/wiki/Transition_path_sampling

How ARE the forcefield parameters adjusted and fitted for forcefields... let's say the Charmm forcefield just to be concrete? I suspect that machine learning has a lot to offer here.

--Geoff
ReplyDelete
Replies
MPS6:55 AM, January 17, 2014
Great article, thanks for the debate.

I don't like this kind of criticism. It is as fundamentalist as its opposite. I am a experimentalist and a modeler on protein conformation, both with the same rigor (I hope).

I found many weak points in the Ant's critique, but I'd like to focus on the following point. Scientist in general make dubious or weak assumptions on their results, even experimentalist. After every CD or fluorescence spectrum we read that "the protein" behaves this or that: "the protein": just one, no ensemble, no averaging, all the molecules the same... The native structure of proteins is usually assumed to be a rock the shape you see from the crystal structure, disregarding packing artifacts or the fact that the structure is an averaging (some people have found alternative conformation from the data discarded by crystallographers), even disregarding dynamical evidence from NMR. How many dimers emerge irresponsible from crystal structures, without further tests? How many more emerge from throughput screening "in vitro" tests in uncontrolled conditions, disregarding the effects of fusion-proteins, tags, cysteine oxidation, aggregation an so on? Almost everybody fit their unfolding curves to two- or three-state models assuming that this is the truth without further evidence, even without considering that other models could also fit (e.g: downhill). We read a lot of concluding information based on Western blots and anybody that did a western in his life knows how much the result depends on the staining (every year more and more sensitive), how uncertain remains the concentration of the protein detected, how unreliable are commercial antibodies... Most of the best in vitro experiments are also carried out in (almost pure) water; crowding effect is modeled with polymers as dextran or PEG (albumin if you want to be more realistic).

Today, science-journals (we) are eager of concluding results and demand every paper to be a final proof. In this scenario, authors adopt a car-seller over-confident attitude, and papers are full of dubious and weak conclusions based either on experimental or simulation techniques. Maybe, to solve the doubts about MD universal application, we should allow the modelers to publish their results without any expectation of truth, and wait to experimentalist to prove or discard their assumptions.

In brief. We are making assumptions and models every day, even with experimental results. Nobody saw a protein folding or finding its ligand. Everything is assumed from indirect facts. In this context, maybe MD could be a more indirect tool, but I can tell that it helped me to understand and predict more than once.
ReplyDelete
Replies

Add comment

Markup Key:
- <b>bold</b> = bold
- <i>italic</i> = italic
- <a href="http://www.fieldofscience.com/">FoS</a> = FoS

About Me

“Ashutosh (Ash) Jogalekar is a scientist and science writer based in the San Francisco Bay Area. He has been blogging at the “Curious Wavefunction” blog for more than fifteen years, and in this capacity has written for several organizations including Nature, Scientific American and the Lindau Meeting of Nobel Laureates. His main interests include the history and philosophy of science and technology, especially physics and mathematics. Professionally he is trained as a chemist and works at the intersection of chemistry, technology and drug discovery. Follow @curiouswavefn

Field of Science

The Curious Wavefunction

Molecular dynamics: I have a bad feeling about this.

19 comments:

Previous Posts

Popular Posts

Follow

Blogroll

Journals and Magazines

Archives