Automation in biology and drug development can design, execute, explore and analyze, but can it ask the right questions?
The piece discusses the "virtualization" of several key aspects of biology and drug discovery, which would involve using software and smart automation to perform experiments. The areas Horowitz and Pande focus on in particular are cloud-based experiments, machine learning and lab automation. This is not the first time Bay Area software entrepreneurs and scientists have taken notice of drug discovery. Peter Thiel, who has had some interesting thoughts on drug development, has already funded Emerald Cloud Labs (ECL), a venture that uses 15,000 sq ft of lab space packed with robots and automation to perform experiments that you can design and initiate with the click of a button on your laptop.
I am all for new approaches to drug discovery and especially ones that promise to make it more efficient, but as the piece notes, the optimism about applying software to drug development is rightly tempered by a recognition of the inherent messiness of biology and the vast gaps of ignorance that riddle our knowledge of the interaction between small molecules and living organisms. The article quotes several industry experts on the challenges that any kind of software-based drug development platform would face. Here's Mark Murcko for instance:
“Every day, every company I work with is struggling with target validation, biomarkers and patient selection. Questions come up such as ‘I have a hit from a screen and I do not know what it does’ and ‘Which of these two targets (out of ten in total) do I pick for my next drug discovery project?’” Murcko said. All of this, Murcko said, gets into biology that is “half-right and half-wrong. For example, ‘I have to extrapolate from mouse data.’ Or ‘It is human genetic data but it’s from the germ line [e.g. from sperm or egg cells or their immediate progeny].’” So the data do not necessarily teach you what will happen if you shut down 80% of the activity of the same target in a 50-year-old patient.
And he's right. The kinds of questions most people in drug discovery tackle are very messy and often quantitatively ill-defined; they deal with emergent biological organization and non-linear dose-response. It's one thing to speed up the acquisition of data in such experiments; it's quite another to interpret that data, or even to ask the right questions in the first place. Although I am all for automation and cloud-based analysis, these tools by themselves are not going to overcome the fundamental challenges involved in getting to a new drug, nor are they going to account for unexpected events. Another expert quoted in the article, Nagesh Mahanthappa, puts this issue into perspective:
“You can automate an assay and be in love with the output. But if you bother to look you can find out that you have been grossly misled…These days, so much equipment is automated or semi-automated. They give you results in thirty minutes. But the results often sound like this: ‘The molecule inhibited signaling.’ You have to remember to ask in that case, are you sure that the molecule did not just kill all the cells? Or that the cells were not washed off the plates during a washing step?”
Consider Emerald Cloud Labs, for instance; their website lists dozens of experiments, ranging from flow cytometry to fluorescence microscopy, that you can remotely ask a robot to perform. Key biostructural techniques like crystallography and NMR spectroscopy are coming online by the end of the year. It's great that you can cheaply outsource such techniques from the comfort of your living room. But the problem, as anyone who has worked in the field knows, is that both the course and the output of these experiments are far from standard. No assay development project is the same as another, even for well-known targets, and the assays and biophysical characterization of every target and class of small molecule demand their own tweaking, idiosyncrasies and unexpected glitches. There is no doubt that a facility like Emerald Cloud Labs will speed up plain-vanilla experiments, but there is also little doubt in my mind that the automation such a facility promises will be severely hampered by the project-specific human intervention constantly demanded by the vagaries of drug discovery.
Some of the thinking in that article exemplifies what Derek Lowe has called the "Andy Grove fallacy": the belief that bringing computational thinking to biology will help us rapidly sort the wheat from the chaff and get to the right answer fast. It's the kind of thinking that many Silicon Valley entrepreneurs, steeped in the high success rate of software ventures, are bringing to bear on the intricacies of biology. Unfortunately, as I mentioned before, the problem here is not speed or efficiency per se; it's asking the right questions in the first place. Very little of our ability to develop new drugs is constrained by speed; much of it is constrained by plain ignorance. You can hack together a car app given enough manpower, money and time, because the goal is usually quite clear and the process highly deterministic. That's not the case with the emergent world of biology. There's not much point in doing something fast if you don't know whether it's the right thing to do. Blazing automation will help only if you are asking the right question.
All this being said, I am glad that people like Thiel, Andreessen, Horowitz and Pande are putting their own money into such ventures and walking the talk. The real benefit would be to push the boundaries of our thinking about the application of data science to biology, and even the ignorance they uncover would be enlightening. At the very least, increased speed and automation would allow us to make mistakes faster and learn what doesn't work. And as anyone who has worked in the time- and money-constrained world of pharma knows, that's as good an asset as any.