Classically, when very little was known about molecular biology and protein structure, one of the best ways to discover drugs was to guess physiological effects by looking at drug similarity. The rationale was simple: drug A has this chemical structure and lowers blood pressure. Drug B, which is being used for a completely different indication, has a similar chemical structure. Maybe we can use drug B for lowering blood pressure too?
In the absence of rational approaches, this technique (which these days is called "repurposing") could be surprisingly fruitful. However, as molecular biology, crystallography and structure-based drug design took off in the 1980s, rational drug discovery became much more focused on protein structure, and drug developers began trying to guess drug function by looking at target similarity in terms of sequences and binding pockets.
But the relative lack of dividends from structure-based drug design (which nonetheless continues to be important) has led to a re-examination of the old paradigm. One of the most interesting advances to come out of this thinking was pioneered by Brian Shoichet and Bryan Roth's groups (at UCSF and UNC-Chapel Hill) a few years back. Their rationale was simple too: look at drug similarity, albeit using modern computational methods and metrics, and use these similarity metrics to cross-correlate drug targets. If two drugs are similar, each is predicted to hit the other's targets. The method seems somewhat crude but has worked surprisingly well. It was even listed as one of Wired magazine's top ten scientific breakthroughs of 2009.
In a recent publication the authors take the method a step further and apply it to phenotypic screening, which has emerged as an increasingly popular method in the pharmaceutical industry. Phenotypic screening is attractive because it bypasses the need to know the exact molecular target; basically you just inject different potential drugs into a test system and look at the results, which are usually quantified by a phenotypic response such as a change in membrane potential, cell differentiation or even something as general as locomotion. Once a particular compound has elicited a particular response, we can start the arduous task of finding out what it's doing at a molecular level. Not surprisingly, several proteins can be responsible for a general response like locomotion, and detecting all of them is non-trivial to say the least. It is for this rather involved exercise that the authors provide a remarkably simple potential (partial) solution.
The current study looks at phenotypic screening on the zebrafish, a favorite model organism of biologists. Some 14,000 molecules were screened for their ability to elicit a photomotor response in zebrafish embryos. Out of these, about 700 were deemed active. To find the targets for these molecules, the authors interrogated their chemical "similarity" against a large group of compounds for which targets are known. Importantly, the authors use a statistical technique to calculate an expectation value (E value), which indicates how likely it is that the observed similarity arises by chance alone. A lower E value means the similarity is less likely to be a chance occurrence and is therefore more statistically significant. One of the most remarkable things in these studies is that the metric for similarity is a simple, computationally cheap 2D metric called a fingerprint, which looks at the presence or absence of certain functional groups in the molecules. That such a metric can work at all is remarkable because we know that an accurate estimation of similarity should ideally include the 3D conformation that the drug presents to the protein target.
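To make the idea of a 2D fingerprint comparison concrete, here is a minimal sketch in Python. It is not the authors' code: it assumes each fingerprint is simply a set of structural features that are "present" in a molecule, and it scores similarity with the Tanimoto coefficient, a common choice for fingerprints. The feature labels and compound names are hypothetical, and the sketch stops at raw similarity; it does not attempt the statistical step that turns similarities into E values.

```python
def tanimoto(fp_a, fp_b):
    """Tanimoto coefficient between two fingerprints given as sets of 'on' features."""
    a, b = set(fp_a), set(fp_b)
    if not a and not b:
        return 0.0  # convention chosen here for two empty fingerprints
    return len(a & b) / len(a | b)


# Hypothetical toy fingerprints: real ones are bit vectors computed by a
# cheminformatics toolkit, not hand-written feature labels.
query = {"aromatic_ring", "tertiary_amine", "ether", "hydroxyl"}
references = {
    "ref_beta_blocker": {"aromatic_ring", "secondary_amine", "ether", "hydroxyl"},
    "ref_unrelated": {"carboxylic_acid", "amide"},
}

# Rank reference compounds (whose targets are assumed known) by similarity
# to the query; the most similar compound comes first.
ranked = sorted(
    ((tanimoto(query, fp), name) for name, fp in references.items()),
    reverse=True,
)

for score, name in ranked:
    print(f"{name}: {score:.2f}")
```

The point of the sketch is how little information goes into the comparison: only shared and unshared features, with no 3D conformation at all.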
Nonetheless, 284 molecules predicted to be active on 404 targets were picked based on their low E values. Out of these, 20 molecules were especially interesting because they were predicted to have novel drug-target associations. When the authors tested these molecules against the relevant targets, 11 of them had activities ranging from low nanomolar to 10 µM. For a computational method this hit rate is quite impressive, although a more realistic measure of the hit rate would have come from testing all 284 compounds. The activity of the molecules was validated by running comparisons with molecules that were known to elicit the same response from the predicted targets. Confirmation also came from competition experiments with antagonists. Some of the unexpected targets predicted include the beta-adrenergic receptor, dopamine receptors and potassium ion channels.
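For a rough sense of these numbers, the arithmetic can be sketched in a few lines of Python. The counts are the ones quoted above (the 700 is approximate). The Wilson interval is my own addition, not something from the paper, and it is only illustrative: the 20 tested compounds were hand-picked as interesting rather than sampled at random from the 284.

```python
import math

screened = 14_000          # molecules in the phenotypic screen
phenotypic_actives = 700   # approximate number deemed active
tested = 20                # predictions tested against their targets
confirmed = 11             # predictions confirmed in those tests

phenotypic_hit_rate = phenotypic_actives / screened
confirmation_rate = confirmed / tested


def wilson_interval(successes, n, z=1.96):
    """Approximate 95% Wilson score interval for a binomial proportion."""
    p = successes / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return center - half, center + half


low, high = wilson_interval(confirmed, tested)

print(f"phenotypic hit rate: {phenotypic_hit_rate:.1%}")
print(f"confirmation rate:   {confirmation_rate:.1%}")
print(f"illustrative 95% interval: {low:.2f} to {high:.2f}")
```

With only 20 compounds tested, the interval around the 55% confirmation rate is wide, which is one way of seeing why testing all 284 would have given a more realistic figure.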
I find these studies very encouraging. For one thing, the computational method can potentially save a huge amount of the time needed to experimentally uncover the target for every active compound. As mentioned above, it's also remarkable that a very simple metric of 2D similarity yields useful results. The success rate is impressive; however, an even lower rate would still have been worth the modest effort if it resulted in new drug-target relationships (which are far more useful than new chemically similar ligands for the same target). That said, I do think it would have been very interesting to look at the failures. An obvious source of failure is using the wrong measure of similarity; at least some compounds are presumably failing because their 2D similarity does not translate into the 3D conformational similarity required for binding to the protein target. In addition, protein flexibility could result in very different binding modes for supposedly similar compounds. Medicinal chemists are well aware of "activity cliffs", where small differences in chemical structure lead to large differences in activity. These cliffs could also lead to a lack of binding to a predicted target.
Nevertheless, in an age when drug discovery is only getting harder and phenotypic screening seems to be an increasingly important technique, these computational approaches promise to be a useful complement. Gratifyingly, the authors have developed a website where the algorithm is available for free. The technique has also spawned a startup.