If successful, virtual screening (VS) promises to become an efficient way to find new pharmaceutical hits, competitive with high-throughput screening (HTS). Briefly, VS screens libraries of millions of compounds to find new and diverse hits, either by similarity to a known active or by complementarity to a protein binding site. The former protocol is called ligand-based VS (LBVS) and the latter structure-based VS (SBVS). In a typical VS campaign, either LBVS or SBVS is used to screen compounds, which are then ranked by how likely they are to be active. The top few percent of compounds are then actually tested in assays, thus validating the success or failure of the VS procedure. VS has the potential to cut down on the time and expense inherent in the HTS process.
Unfortunately, the success rate of VS has been relatively poor, ranging from a few tenths of a percent to no more than a few percent. If VS is to become a standard part of drug discovery, the factors that influence its failures and successes warrant a thorough review. A recent review in J. Med. Chem. addresses some of these factors and raises some intriguing questions.
From a single publication containing the phrase 'virtual screening' in 1997, the literature has grown to about a hundred such papers every year. The authors pick about 400 successful VS results from three dominant journals in the field (J. Med. Chem., Bioorg. Med. Chem. Lett. and ChemMedChem), along with some from J. Chem. Inf. Model., and classify these into ligand-based and structure-based techniques. As mentioned before, VS comes in two flavors. LBVS starts from a known potent compound and then looks for "similar" compounds (with dozens of ways of defining 'similarity'), in the hope that chemical similarity will translate into biological similarity. Structure-based techniques start with a protein structure, either an x-ray crystal structure, an NMR structure or a homology model.
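To make the LBVS idea concrete, here is a minimal sketch of a 2D similarity search, assuming RDKit Morgan fingerprints and Tanimoto similarity as the similarity definition (just one of those dozens of possibilities); the query SMILES, the library file name and the top-ten listing are placeholders of my own, not anything taken from the paper.

```python
# Minimal LBVS sketch: rank a library by 2D similarity to a known active,
# using RDKit Morgan fingerprints and Tanimoto similarity (one of many
# possible 'similarity' definitions). Query and file name are placeholders.
from rdkit import Chem, DataStructs
from rdkit.Chem import AllChem

query = Chem.MolFromSmiles("CC(=O)Oc1ccccc1C(=O)O")        # stand-in known active
query_fp = AllChem.GetMorganFingerprintAsBitVect(query, 2, nBits=2048)

ranked = []
for mol in Chem.SmilesMolSupplier("library.smi", titleLine=False):  # hypothetical library
    if mol is None:                                          # skip unparsable entries
        continue
    fp = AllChem.GetMorganFingerprintAsBitVect(mol, 2, nBits=2048)
    ranked.append((DataStructs.TanimotoSimilarity(query_fp, fp), Chem.MolToSmiles(mol)))

# The top of the similarity-ranked list is what would go forward to assays.
ranked.sort(reverse=True)
for sim, smi in ranked[:10]:
    print(f"{sim:.2f}\t{smi}")
```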
The authors look for correlations between the types of methods and various parameters of success, and they make some interesting observations, some of them rather counterintuitive. Here are a couple that especially stand out.
While SBVS methods dominate, LBVS methods seem to find more potent hits, usually defined as better than 1 μM in activity. Why does this happen? The authors don't really dwell on this point, but I have some thoughts on it. Typically, when you start a ligand-based campaign, your starting point is a bona fide highly potent ligand. If you have a choice of ligands with a range of activities, you will naturally pick the most potent among them as your query. Now, if your method works, is it surprising that the hits you find based on this query will also be potent? You get what you put in.
Contrast this with a structure-based approach. You usually start with a crystal structure containing a co-crystallized ligand. Co-crystallized ligands are usually, but not always, highly potent. The next step would be to use a method like docking to find hits that are complementary to your protein binding site. But the binding site is conformationally pre-organized and optimized to bind its co-crystallized ligand. Thus, the ligands you screen will not be ranked highly by your docking protocol if they are poorly suited to the binding site; for instance, there could be significant induced-fit changes upon their binding. Even in the absence of explicit induced fit, fine parameters like precise hydrogen-bonding geometries will greatly affect your score; after all, the protein binding site has hydrogen-bonding geometries tailored to optimally bind its cognate ligand. If the hydrogen-bonding geometries for your ligands are off by even a bit, the score will suffer. No wonder the hits you find span a range of activities; you are using a binding-site template that is not optimized to bind most of your ligands.

The other reason that could thwart SBVS campaigns is simply that more work is needed to 'prepare' a crystal structure for docking. You have to add hydrogens to the structure, make sure all the ionization states are right and optimize the hydrogen-bonding network in the protein. If any one of these steps goes wrong, you start with a fundamentally crappy protein structure for screening. Thus this protocol usually requires expert inspection, unlike LBVS, where you just have to 'prepare' a single template ligand by making sure that its ionization state and bond orders are correct. These differences mean that your starting point for SBVS is more tortuous and much more likely to be messy than it is for LBVS. Again, you get out what you put in.
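As a rough illustration of the mechanical part of that preparation, here is a sketch assuming PDBFixer from the OpenMM ecosystem for rebuilding missing atoms and adding hydrogens; assigning ionization states and optimizing the hydrogen-bonding network generally require dedicated tools and the kind of expert inspection mentioned above. The file names and pH are placeholders of my own.

```python
# Sketch of the mechanical steps in preparing a crystal structure for docking,
# assuming PDBFixer (OpenMM ecosystem). Protonation-state assignment and
# hydrogen-bond network optimization need additional tools and expert review.
from pdbfixer import PDBFixer
from openmm.app import PDBFile

fixer = PDBFixer(filename="receptor.pdb")   # hypothetical crystal structure
fixer.findMissingResidues()                 # flag gaps in the model
fixer.removeHeterogens(keepWater=False)     # strip ligands, buffer components, waters
fixer.findMissingAtoms()
fixer.addMissingAtoms()                     # rebuild missing heavy atoms
fixer.addMissingHydrogens(pH=7.4)           # crude protonation at an assumed pH

with open("receptor_prepped.pdb", "w") as out:
    PDBFile.writeFile(fixer.topology, fixer.positions, out)
```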
The second observation that the authors make is also interesting, and it bears on the protein-preparation step we just discussed. They find that VS campaigns in which putative hits are docked into homology models seem to find more potent hits than those using an x-ray structure. This is surprising, since x-ray structures are supposed to be the most rock-solid templates for docking. The authors speculate that the difference could be due to the fact that building a good homology model requires a fair level of expertise; thus, successful VS campaigns using homology models are likely to be carried out by experts who know what they are doing, whereas x-ray structures are more likely to be used by novices who simply run the docking with default parameters.
Thirdly, the authors note an interesting correlation between the potency and frequency of the hits found and the families of proteins targeted. GPCRs seem to be the most successfully targeted family, followed by enzymes and then kinases. This is a pretty interesting observation, and to me it points to a crucial factor that the authors don't really discuss: the nature of the libraries used for screening. These libraries are usually biased by the preferences and efforts of medicinal chemists in making certain kinds of compounds. I already blogged about a paper that looked at the surprising success of VS in finding GPCR ligands; that paper ascribed this success to 'library bias', the fact that libraries are sometimes 'enriched' in GPCR-active ligands such as aminergic compounds. Ditto for kinases; kinase inhibitor-like molecules now abound in many libraries. This is partly due to the importance of these targets and partly because of the prevalence of synthetic reactions (like cross-couplings) that make it easy for medicinal chemists to synthesize such ligands and populate libraries with them. I think it would have been very interesting for the authors to analyze the nature of the screened libraries; unfortunately, such information is proprietary in industrial publications. In the absence of such data, one has to assume that we are dealing with a fundamentally biased set of libraries, which would explain the selective target enrichment.
Finally, the authors find that most successful VS efforts have come from academia, while most of the potent hits have come from industry. This seems consistent with the role of the former in validating methodologies and that of the latter in discovering new drugs.
There are some caveats as usual. Most of the studies don't include a detailed analysis of false positives and false negatives, since such analysis is time-consuming. But this analysis can be extremely valuable in truly validating a method. Standards for assessing the success of VS are also neither consistent nor universal, and these will have to be agreed upon for honest comparisons. But overall, virtual screening seems to hold promise. At the very least there are holes and gaps to fill. And researchers are always fond of these.
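On the question of assessment standards, one commonly used retrospective yardstick (not something prescribed by the paper) is the enrichment factor in the top-scoring fraction of a ranked library; here is a generic sketch over hypothetical (score, active) data of my own.

```python
# Enrichment factor (EF) in the top x% of a score-ranked library: a common
# retrospective measure of VS success. Data below are made up for illustration.
def enrichment_factor(scored, top_fraction=0.01):
    """EF = actives recovered in the top x% / actives expected there at random."""
    ranked = sorted(scored, key=lambda pair: pair[0], reverse=True)
    n_top = max(1, int(round(top_fraction * len(ranked))))
    total_actives = sum(active for _, active in ranked)
    top_actives = sum(active for _, active in ranked[:n_top])
    expected = total_actives * n_top / len(ranked)
    return top_actives / expected if expected else float("nan")

# Toy example: 3 actives hidden in a 10-compound "library".
scored = [(9.1, 1), (8.7, 1), (8.2, 0), (7.9, 0), (7.5, 0),
          (6.8, 1), (6.1, 0), (5.4, 0), (4.9, 0), (4.2, 0)]
print(enrichment_factor(scored, top_fraction=0.2))  # 2 of 3 actives in the top 20% -> EF ~ 3.3
```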
Ripphausen, P., Nisius, B., Peltason, L., & Bajorath, J. (2010). Quo Vadis, Virtual Screening? A Comprehensive Survey of Prospective Applications. Journal of Medicinal Chemistry. DOI: 10.1021/jm101020z
Thanks for the post. I liked the article, and I agree that open standards for comparing and evaluating different VS methods are missing. I started a project called PyRx that has some elements of that. Getting different vendors of VS software to contribute to a common standard would be a challenging problem.
Thanks for the link, I will check it out. The OpenEye guys had published some guidelines for standardized VS evaluation about two years ago.