Domains of Applicability (DOA) in top-down and bottom-up drug discovery

You don’t use a hammer to paint an impressionist landscape. And although you technically could, you wouldn’t use a spoon to drink beer. The domains of applicability of these tools are different, in terms of both quality and quantity.

The idea of domains of applicability (DOA) is somehow both blatantly simple and easily forgotten. As the examples above indicate, the definition is apparent: every tool, every idea, every protocol has a certain reach. There are certain kinds of data for which it works well and certain others for which it fails miserably. Then there are the most interesting cases: pieces of data on the boundary between applicable and non-applicable. These often serve as real testing grounds for your tool or idea.

Often the DOA of a tool becomes clear only when it’s been used for a long time on a large enough number of test cases. Sometimes the DOA reveals itself accidentally, when you are trying to use the tool on data for which it’s not really designed. That way can lie much heartbreak. It’s better instead to be constantly aware of the DOA of your techniques and to deliberately stress-test their range. The DOA can also inform you about the sensitivity of your model; for instance, for one model a small change from a methyl to a hydroxyl might fall within its DOA, while for another it might exceed it.
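To make that kind of sensitivity probe concrete, here is a toy sketch using RDKit that enumerates methyl-to-hydroxyl analogs of a molecule, so that a model’s predictions for the parent and the analog can be compared. The `predict` function in the usage comment is a hypothetical stand-in for whatever model is being probed:

```python
# Toy sensitivity probe: enumerate minimally perturbed analogs
# (one methyl swapped for a hydroxyl) for comparison of model predictions.
from rdkit import Chem

def methyl_to_hydroxyl(smiles: str) -> set[str]:
    """Return analogs of `smiles` with one methyl group replaced by a hydroxyl."""
    mol = Chem.MolFromSmiles(smiles)
    products = Chem.ReplaceSubstructs(
        mol, Chem.MolFromSmarts("[CH3]"), Chem.MolFromSmiles("O"))
    analogs = set()
    for p in products:
        Chem.SanitizeMol(p)  # clean up valences after the edit
        analogs.add(Chem.MolToSmiles(p))
    return analogs

# Usage with a hypothetical model `predict`: a large swing in prediction for
# such a small structural change suggests the molecule sits near a DOA boundary.
# for analog in methyl_to_hydroxyl("Cc1ccccc1"):  # toluene -> phenol
#     print(analog, predict(analog) - predict("Cc1ccccc1"))
```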

The development and use of molecular docking, an important part of bottom-up drug discovery, makes the idea of DOA clear. By now there’s an extensive body of knowledge about docking, developed over at least twenty years, that makes it clear when docking works well and when you can trust it less. For example, docking works quite well in reproducing known crystal poses and generating new poses when the protein is well resolved and relatively rigid, when there are no large-scale conformational changes, when there are no unusual interactions in the binding site, and when water molecules aren’t playing any weird or special role in the binding. On the other hand, if you are docking against a homology model built on sparse homology that features a highly flexible loop and several bridging water molecules as key binding elements, all bets are off. You have probably stepped way outside the DOA of docking. Then there are the intermediate and in many ways most interesting cases: somewhat rigid proteins, just one or two water molecules, a good knowledge base around the protein that tells you what works. In these cases one can be cautiously optimistic and make some testable hypotheses.
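Purely as an illustration, those heuristics can be written down as a rough triage function. The fields, categories and cutoffs below are assumptions made for the sake of the sketch, not established standards:

```python
# Illustrative triage of docking systems against the DOA heuristics above.
# All thresholds here are assumed for the sketch, not validated values.
from dataclasses import dataclass

@dataclass
class DockingSystem:
    resolution: float           # crystal resolution in angstroms (lower is better)
    homology_model: bool        # built from a template, not solved experimentally
    flexible_loops: bool        # large-scale conformational changes expected
    bridging_waters: int        # waters mediating key binding-site interactions
    unusual_interactions: bool  # e.g. metal coordination or halogen bonds

def docking_doa(system: DockingSystem) -> str:
    """Rough guess at whether a docking run sits inside its DOA."""
    if system.homology_model and system.flexible_loops:
        return "outside DOA: all bets are off"
    if (not system.flexible_loops and system.bridging_waters == 0
            and not system.unusual_interactions and system.resolution <= 2.0):
        return "inside DOA: poses can be trusted more"
    if system.bridging_waters <= 2 and not system.flexible_loops:
        return "boundary case: be cautiously optimistic, make testable hypotheses"
    return "boundary case: proceed carefully and validate prospectively"
```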

Fortunately there are ways to pressure-test the DOA of a favorite technique. If you suspect that the system under consideration does not fall within the DOA, there are simple tests you can run and questions you can ask. The first set of questions concerns the quality and quantity of the available data. This data falls into two categories: the data that was used to train the method and the data you actually have in your test case. If the test data closely matches the training data, there’s a fair chance that your DOA is covered. If not, you ask the second important question: what’s the quickest way I can actually test the DOA? Usually the quickest way to test any hypothesis in early-stage drug discovery is to propose a set of molecules that your model suggests as top candidates. As always, the easier these are to make, the faster you can test them and the better you can convince chemists to make them in the first place. It might also be a good idea to sneak in a molecule that your model says has no chance in hell of working. If neither of these predictions comes true within a reasonable margin, you clearly have a problem, either with the data itself or with your DOA.
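As a rough illustration of that first check, one might measure how closely each test molecule resembles its nearest neighbor in the training set, here using RDKit Morgan fingerprints and Tanimoto similarity. The 0.4 cutoff is an arbitrary placeholder, not a validated threshold:

```python
# Sketch of a simple applicability-domain check: flag test molecules whose
# nearest training-set neighbor falls below a similarity cutoff.
from rdkit import Chem, DataStructs
from rdkit.Chem import AllChem

def fingerprint(smiles: str):
    """Morgan (ECFP4-like) bit fingerprint for one molecule."""
    return AllChem.GetMorganFingerprintAsBitVect(
        Chem.MolFromSmiles(smiles), radius=2, nBits=2048)

def inside_domain(test_smiles, training_smiles, cutoff=0.4):
    """Map each test molecule to True if its nearest training neighbor
    meets the cutoff, i.e. if it plausibly lies within the model's DOA."""
    train_fps = [fingerprint(s) for s in training_smiles]
    report = {}
    for s in test_smiles:
        fp = fingerprint(s)
        nearest = max(DataStructs.TanimotoSimilarity(fp, t) for t in train_fps)
        report[s] = nearest >= cutoff
    return report
```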

There are also ways to fix the DOA of a technique, but because that task involves generating more training data and tweaking the code accordingly, it’s not something most end users can do. In the case of docking, for instance, a DOA failure might result from inadequate sampling or inadequate scoring. Both of these issues can in principle be fixed through better data and better force fields, but that’s really something only a methods developer can do.

When a technique is new it always struggles to establish its DOA. Unfortunately, technical users and management alike often fail to understand this and can immediately start proclaiming the method a cure for all your problems; they think that just because it has worked well on certain cases it will do so on most others. The lure of publicity, funding and career advancement further encourages this behavior. That certainly happened with docking and other bottom-up drug design tools in the Wild West of the late 80s and early 90s. I believe something similar is happening with machine learning and deep learning now.

For instance, it’s well known that machine learning can do extremely well on problems like image recognition and natural language processing (NLP). There one is clearly operating well within the DOA. But what about modeling traffic patterns, or brain activity, or social networks, or SAR data for that matter? What is the DOA of machine learning in these areas? The honest answer is that we don’t know. Some users and developers of machine learning acknowledge this and are actually trying to circumscribe the right DOA by pressure-testing the algorithms. Others unfortunately simply take it for granted that more data must translate to better accuracy; in other words, they assume that the DOA is dictated purely by data quantity. This is true only in a narrow sense. Yes, too little data can certainly hamper your efforts, but more data is neither always necessary nor sufficient. You can have as much data as you like and your technique can still be operating in the wrong DOA. For example, the discontinuous landscape of molecular activity, in which structurally similar molecules can show wildly different activities (so-called activity cliffs), places limitations on using machine learning in medicinal chemistry. Would more data ameliorate this problem? We don’t know yet, but assuming that it would is entirely consistent with the new religion of “dataism”, which says that data is everything.
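One concrete way to spot such discontinuities is the Structure-Activity Landscape Index (SALI) of Guha and Van Drie, which scores pairs of molecules by activity difference divided by structural dissimilarity; the highest-scoring pairs are activity cliffs. A rough sketch, again with RDKit fingerprints and with activities supplied by the user:

```python
# Sketch: rank molecule pairs by SALI = |A_i - A_j| / (1 - Tanimoto(i, j)).
# The highest-scoring pairs are activity cliffs, i.e. the discontinuities
# in the SAR landscape that extra data alone may not smooth over.
from itertools import combinations
from rdkit import Chem, DataStructs
from rdkit.Chem import AllChem

def _fp(smiles: str):
    return AllChem.GetMorganFingerprintAsBitVect(
        Chem.MolFromSmiles(smiles), radius=2, nBits=2048)

def sali_pairs(activities: dict, top_n: int = 5):
    """Return the top_n (score, smiles1, smiles2) pairs ranked by SALI."""
    fps = {s: _fp(s) for s in activities}
    scored = []
    for (s1, a1), (s2, a2) in combinations(activities.items(), 2):
        sim = DataStructs.TanimotoSimilarity(fps[s1], fps[s2])
        if sim < 1.0:  # identical fingerprints would divide by zero
            scored.append((abs(a1 - a2) / (1.0 - sim), s1, s2))
    return sorted(scored, reverse=True)[:top_n]
```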

There are many opportunities to test the DOA of top-down approaches like deep learning in drug discovery and beyond. But to do this, both scientists and management must have realistic goals about the efficacy of the techniques and, more importantly, must honestly acknowledge that they don’t know the DOA in the first place; in other words, that they don’t yet know whether the technique will work for their specific problem. Unfortunately these kinds of decisions and proclamations are severely subject to hype and the enticement of dollars and drama. Machine learning is seen as a technique with such outsize potential impact on diverse areas of our lives that many err on the side of wishful thinking. Companies have sunk billions of dollars into the technology; how many of them would be willing to admit that the investment was based on hope rather than reality?

In this context, machine learning can draw some useful lessons from the cautionary tale of drug design in the 80s, when companies were throwing money at molecular modeling from all directions. Did that money result in important lessons learnt and egos burnt? Indeed it did, but one might argue that computational chemists are still suffering from the negative effects of that hype, both in accurately using their techniques and in communicating the true value of those techniques to what seem like perpetually skeptical Nervous Nellies and Debbie Downers. Machine learning could go down the same route, and that would be a real tragedy, not only because the technique is promising but because it could potentially impact many other aspects of science, technology, engineering and business, not just pharmaceutical development. And it might all happen because we were unable or unwilling to acknowledge the DOA of our methods.

Whether it’s top-down or bottom-up approaches, we can all ultimately benefit from Feynman’s words: “For a successful technology, reality has to take precedence over public relations, for Nature cannot be fooled.” For starters, let’s try not to fool each other.
