Here’s a good review of the pitfalls and promises of molecular docking by John Irwin and Brian Shoichet (UCSF) which is worth your time, especially if you are a non-specialist who wants a summary of what’s happening in the field. As the review notes, there are many first principles-based reasons why docking should not work – poor calculation of ligand conformations, poor treatment of protein and ligand electrostatics and desolvation, non-existent consideration of protein movement, wishful treatment of water molecules, sloppy representation of X-ray structures...the list just goes on. When docking 10^7 or so ligands involving 10^13 total configurations, any one of these “maddening details” can doom your study.
And yet, as the authors note both through general considerations and a few case studies, incremental but steady improvements in docking methodology have now made the technique respectable in most structure-based drug design campaigns (which interestingly have provided more drug candidates than HTS). Several factors have contributed to this respectability. The first is the sheer throughput; no experimental technique can possibly screen 10 million ligands in a few days or weeks, so even with its flaws docking rises up at least as a potential complement to experiment. As the review notes, even a 10% success rate in finding new binders to a good target would be an improvement, and with well-defined binding sites the success rate can surpass that of high-throughput screening (thus, the correct question to ask is not whether your method gives you false positives and negatives but whether this error rate is enough to overwhelm the experimental discovery rate for true positives). The second factor is the existence of massive databases like ZINC and ChEMBL containing millions of annotated ligands which could serve as starting points for ligand discovery.
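The parenthetical point about error rates can be made concrete with a little arithmetic. The sketch below is purely illustrative: the fraction of true binders in the library, the sensitivity and the false positive rate are made-up numbers, not figures from the review, and the function name is my own. It estimates what fraction of the compounds flagged by docking would be true binders, and compares that with what you would get by testing compounds at random.

```python
def flagged_hit_rate(prevalence, sensitivity, false_positive_rate):
    """Fraction of docking-flagged compounds that are true binders."""
    true_pos = prevalence * sensitivity                    # binders that get flagged
    false_pos = (1.0 - prevalence) * false_positive_rate   # non-binders that get flagged
    return true_pos / (true_pos + false_pos)


# Hypothetical inputs, chosen only for illustration.
prevalence = 0.001          # 1 in 1,000 library compounds truly binds
sensitivity = 0.5           # docking flags half of the true binders
false_positive_rate = 0.01  # docking wrongly flags 1% of the non-binders

rate = flagged_hit_rate(prevalence, sensitivity, false_positive_rate)
enrichment = rate / prevalence  # improvement over picking compounds at random

print(f"hit rate among flagged compounds: {rate:.3f}")
print(f"enrichment over random picks: {enrichment:.1f}x")
```

With these invented inputs, fewer than 5% of the flagged compounds are real binders, so false positives vastly outnumber true ones, and yet the flagged set is still far richer in binders than the 0.1% you would expect from random picks. That is the sense in which an imperfect method can still beat the experimental baseline.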
Thirdly, while the holy grail of docking would be to correctly predict the absolute affinities of your ligands or at least to rank them, the more modest goal is to try to separate binders from non-binders and to discover novel chemotypes. Docking has been reasonably successful in meeting this goal, and the review presents several case studies that discovered interesting ligands which were dissimilar to known ones. In turn, however, what really makes the discovery of novel chemotypes interesting is that it could lead to novel biology. It is this ability to potentially “break out of medicinal chemistry boxes” that makes docking attractive. For instance, you could potentially find agonists by docking when you only found antagonists before, or – in what is one of the more interesting examples illustrated in the article – you could ‘deorphanize’ an enzyme by docking potential substrates to it and predicting its reaction profile. I still find this evidence anecdotal, but it's at least a good starting point for trying out things.
The rest of the review is also useful, not least because - given that it's from the Shoichet lab - there's also an instructive checklist of caveats (PAINS, aggregators, etc.) to keep in mind while experimentally screening ligands. There is also a discussion of using homology models for docking. Using homology models is tricky even for lead optimization, so I would be wary of applying them too widely for high-throughput docking. While the review does illustrate an interesting case involving a GPCR, I want to note that I did blog about this case when it was published and described how – new ligands notwithstanding – the calculation seemed to use enough computing power and models to light up a startup, along with copious expert input.
It’s this last point in the review which is really the crux of the matter. When the servers have cooled down and the electrons have stopped flowing, the most important equipment that one can bring to bear on a docking study is a good pair of eyes connected to an experienced brain. Even with small error rates "the scum can rise to the top" ("The scum is out there" could be a good tagline for an X-Files episode about HTS). There is little substitute for careful inspection of top docked hits, looking for things like strain, abnormal charged interactions and wrong tautomeric states; otherwise staring down a computer screen would present the same risks as staring down a gun barrel.
The Shoichet lab has occasional "hit-picking parties" where teams of medicinal and computational chemists examine docked structures, and I suspect these parties are more common in other places than you think (although probably not as common as they should be). It’s only when the high throughput-low accuracy domain of docking meets the low throughput-high accuracy domain of the human mind that docking will continue to be successful. Given what we have seen so far I think there are grounds for hope.