Ken Dill and David Mobley from UCSF have a really nice review in Structure on computational modeling of protein-drug interactions and the problems inherent in the process. I would strongly recommend that anyone interested in the challenges of calculating protein-drug binding read the review, if for nothing else than the copious references provided. The holy grail of most such modeling is to accurately calculate the free energy of binding. To do this we frequently start with a known structure of a protein-ligand complex. The main point the authors emphasize is that when we look at a single protein-ligand complex, deduced either through crystallography or NMR, we are missing a lot of important things.
Perhaps the most important factor is entropy, which is not at all obvious in a single structure. Typically both the protein and the ligand will populate several different conformations in solution, and both will have to pay complex entropic penalties to bind one another. The ligand strain energy (usually estimated at 2-3 kcal/mol for most ligands) also plays an important role, and the desolvation cost for the ligand can figure prominently too. In addition, both protein and ligand will retain some residual entropy even in the bound state. As if this were not enough of a problem, much of the binding energy can come from the entropic gain engendered by the release of water molecules from active sites. Calculating all these entropies for protein, ligand and solvent is important for accurately calculating the free energy of protein-ligand binding. But there are few methods that can accomplish this complex task.
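To get a feel for the size of the conformational entropy penalty, here is a minimal sketch, not anything taken from the review. It assumes the crudest possible picture: the free ligand is described by a handful of discrete conformers with known populations, binding freezes it into exactly one of them, and the entropy is the Gibbs form S = -R Σ p ln p. The function name and the ten-conformer ligand are made up for illustration.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol K)


def conformational_entropy_penalty(populations, temperature=298.15):
    """Return T*S in kcal/mol for a set of conformer populations.

    S = -R * sum(p ln p). Under the (strong) assumption that binding
    freezes the molecule into a single conformer, T*S is the entropic
    cost paid on binding.
    """
    total = sum(populations)
    # Normalize, skipping empty states (p ln p -> 0 as p -> 0).
    probs = [p / total for p in populations if p > 0]
    entropy = -R_KCAL * sum(p * math.log(p) for p in probs)
    return temperature * entropy


# Hypothetical ligand: ten equally populated conformers in solution.
print(round(conformational_entropy_penalty([1.0] * 10), 2))
```

Even this toy ligand pays about 1.4 kcal/mol at room temperature (RT ln 10), and that is before counting the protein, the solvent, or the residual entropy in the bound state, all of which this sketch ignores.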
The article reviews most of the important methods currently in use. Usually the tradeoff for each method is between cost and accuracy. Methods like docking are fast but inaccurate, although they can work well on relatively rigid and well-parameterized systems. Docking also typically does not take protein motion and induced-fit effects into account. Slightly better methods are MM-PBSA or MM-GBSA which, as the names indicate, combine molecular mechanics (MM) energies, often evaluated on docking poses, with an implicit solvent model (PBSA or GBSA). Entropy, and especially protein entropy, is largely ignored, but since we are usually comparing similar ligands, such errors are expected to cancel. Going to more advanced techniques, relative free-energy calculations use molecular dynamics (MD) to try to map the detailed potential energy surfaces for both protein and ligand. Absolute free-energy perturbation calculations are perhaps the gold standard in calculating free energies but are hideously expensive. They work best for ligands that are simple.
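For readers who have not met free-energy perturbation before, the core estimator is the standard Zwanzig exponential average, ∆G = -kT ln ⟨exp(-∆U/kT)⟩, where ∆U is the energy difference between the two states evaluated on snapshots sampled from the first one. The sketch below shows only that last averaging step, with made-up ∆U values; the function name is my own, and a real calculation would get its samples from MD and split the change into many intermediate windows.

```python
import math

KT = 1.987204e-3 * 298.15  # kT in kcal/mol at 298.15 K


def zwanzig_delta_g(delta_u, kt=KT):
    """Exponential-averaging (Zwanzig) estimate of a free-energy difference.

    delta_u: energy differences U_B - U_A in kcal/mol, evaluated on
    configurations sampled from state A.
    Returns -kT * ln( mean( exp(-dU/kT) ) ).
    """
    exponents = [-du / kt for du in delta_u]
    # Shift by the largest exponent before exponentiating, for stability.
    shift = max(exponents)
    mean_exp = sum(math.exp(x - shift) for x in exponents) / len(exponents)
    return -kt * (shift + math.log(mean_exp))


# Hypothetical energy differences, not real simulation data.
samples = [0.0, 1.0, 0.4, 0.7]
print(zwanzig_delta_g(samples))
```

Because the average is exponentially weighted, the result is dominated by the few snapshots with the lowest ∆U. That is one reason these calculations need so much sampling and get so expensive.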
There is clearly a long way to go before calculation of ∆Gs becomes a practical endeavor in the pharmaceutical industry. There are essentially two factors that contribute to the recalcitrance of the problem. The first, as indicated, is the sheer complexity of the problem: assessing the thermodynamic features of protein, ligand and solvent in multiple configurational and conformational states. The second is inherent in nature: the sensitivity of the binding constant to the free energy. As noted before, the all-holy relation ∆G = -RT ln K ensures that an error of even 1 kcal/mol in the calculation will translate to a large error in the binding constant. The myriad complex factors noted above ensure that errors of 2-3 kcal/mol already constitute the limit of what the best methods can give us. Recall that an error of 3 kcal/mol means that you are dead and buried.
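To put numbers on that sensitivity, here is a small sketch that simply inverts ∆G = -RT ln K: an error of δ in ∆G multiplies K by exp(δ/RT). The function name is my own, and I am assuming room temperature (298.15 K) and energies in kcal/mol.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol K)


def fold_error_in_k(delta_g_error, temperature=298.15):
    """Multiplicative error in K caused by an error in dG (kcal/mol).

    From dG = -RT ln K, shifting dG by delta rescales K by exp(delta/RT).
    """
    return math.exp(abs(delta_g_error) / (R_KCAL * temperature))


for err in (1.0, 2.0, 3.0):
    print(f"{err:.0f} kcal/mol error -> {fold_error_in_k(err):.0f}-fold error in K")
```

At room temperature every 1.36 kcal/mol is a factor of ten in K, so a 1 kcal/mol error is already about fivefold and a 3 kcal/mol error is well over a hundredfold.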
But we push on. One equal temper of heroic hearts. Made weak by time and fate, but strong in will. To strive, to seek, to find, and not to yield. At some point we will reach 1 kcal/mol. And then we will sail.
Reference: Mobley, D., & Dill, K. (2009). Binding of Small-Molecule Ligands to Proteins: “What You See” Is Not Always “What You Get”. Structure, 17(4), 489-498. DOI: 10.1016/j.str.2009.02.010