Family matters kept me away for some time, but this topic seems apt for jumping into the fray again. In the Pipeline has an interesting slew of comments about the role of computational chemistry in drug design and discovery. The comments were in response to a question by Derek about how useful Free Energy Perturbation (FEP) could be in drug design. FEP is a kind of holy grail for drug hunters. If you could really predict the absolute free energy of binding of a series of diverse drug-like molecules to a protein, it would constitute an unprecedented breakthrough. It may not instantly make it possible to put two new cancer drugs a week on the market, but predicting the affinity of compounds without making them would certainly lead to enormous savings in time and money for the pharmaceutical industry. Not surprisingly, many would-be knights are pursuing this dream with vigor. It seems worthwhile to summarize my reading of some of the major challenges in the field. This is a personal evaluation; feel free to enlighten me in the comments section.
1. Our understanding of protein structure and conformation is still significantly inadequate: This may be the single most daunting challenge in doing FEP. We don't know how to calculate the entropic and enthalpic contributions to drug binding that arise from the motions of proteins. Induced fit effects have long been recognized as being very important in dictating protein-ligand binding. Yet most docking programs that try to fit ligands into protein pockets do so while treating the protein as rigid. Movements of side chains, loops and sometimes even large-scale movements of helices can be significant yet subtle, and it's an uphill task to include these in a docking calculation. Some docking programs have made impressive advances in predicting induced fit, but a lot remains to be done. However, the core problem with doing any of this really leads us to the biggie in the field: protein structure prediction. Convoys of experimentalists and theorists have been trying to do this for decades. Success has been impressive, but still not general enough.
The general problem has huge implications for understanding protein folding, misfolding and of course, protein-drug binding. It's significant and appreciated enough that at least one man, who happens to be the richest man in the world, has decided to put his money on it. Bill Gates recently announced that he is investing 10 million dollars in the computational drug design company Schrodinger, specifically with a view to supporting developments in protein structure prediction and related issues. That must mean something. In any case, unless we can capture the dance of proteins even as they bind to a drug, our dream of FEP will remain a distant spot on the horizon. If an X-ray structure is available, such efforts become more feasible. And yet for some of the most important proteins like GPCRs, only a handful of structures exist. Homology modeling can supply, and is supplying, some of the missing structures, but the process involves tremendous guesswork and the devil in the details often thwarts your best efforts. In the end, computational prediction of protein structure can only come from a deeper understanding of the basic properties of proteins, and theory and experiment will need to intertwine closely in this quest.
2. Our understanding of ligand conformations is much better, but still not perfect: Compared to protein conformation prediction, we are orders of magnitude better at predicting ligand conformations, primarily because of the small size of the ligand. But even here challenges lurk. Ligands usually exist as multiple conformations in solution. One of these conformations is the bioactive one that binds to the protein. Often it makes up only 2-3% of the population, which means it's virtually impossible to detect by NMR. While several methods exist for generating relevant ligand conformations, it is prima facie very difficult to say which one is the bioactive one. Plus, ligand and protein have to expend strain energy for the ligand to adopt the right conformation. One never knows how much strain energy the ligand can pay, although recent estimates have suggested a maximum cap of a few kcal/mol. Beyond all this, it's worth noting that drastic changes in activity can sometimes result from small changes in ligand conformation. Docking cannot always capture these small changes, although in some cases, as I demonstrated before, docking can capture non-intuitive ligand conformations that only crystal structures can reveal. The bottom line is that even though we have a much better handle on ligand conformations than on protein conformations, locating the bioactive conformation is still like trying to find a needle in a haystack of needles.
3. Water is still the elephant in the room: The most well-known solvent is still the least well-understood, especially in the context of its interaction with biomolecules. By some estimates, the displacement of water molecules by hydrophobic parts of a ligand is the single most important driver of binding affinity. Water molecules play some fairly obvious roles: they can bridge ligand-protein interactions, and well-placed ones can be kicked out by ligand extensions with huge resulting changes in free energy. But water also plays more subtle roles that we are just beginning to comprehend. Water can act as a kind of lubricant, 'massaging' proteins as they unfold and fold, gliding across hydrophobic and hydrophilic surfaces and helping them to form interactions. Proteins are also usually surrounded by a ghostly layer of bound water molecules that almost act as a virtual extension of their structure. These water molecules can exert important influences on protein conformational changes. On top of that, the hydrophobic effect only gets more interesting every day, with recent findings suggesting that there is a 'dewetting transition' when two hydrophobic surfaces approach each other closer than a critical distance. To find out more, you can check out an excellent review of water's role in biology on the molecular level. Current methods for modeling water include implicit and explicit solvation models. The drawbacks of both are well-recognized. It seems astonishing that we are trying to predict the solvation of protein-ligand assemblies when we are still struggling to get the solvation of simple organic molecules right. In the end, correct accounting of water for specific systems is going to be key for accurate FEP calculations.
The real challenge in FEP comes from the exquisite, exponential dependence of the dissociation constant of a protein-ligand complex on the free energy of binding. Since a change of only about 1.4 kcal/mol in ∆G at room temperature leads to a tenfold change in the dissociation constant, we need to predict free energies to within roughly 1 kcal/mol. Hydrogen bonds are worth a few kcal/mol, hydrophobic and electrostatic interactions can contribute another few kcal/mol, and the errors introduced by inadequate solvation, incomplete sampling of conformations and incomplete representation of things like entropy all add up. It's pretty clear, then, that getting things correct to 1 kcal/mol is a decidedly uphill task. The methods just cannot include all the parameters from real life necessary to achieve this. Real-life measurements of binding affinity are frequently conducted under messy conditions with mixed solvents, ions, buffers and inhomogeneous environments. Rest assured that your grandson will be trying as hard as you are to include these factors in an FEP calculation.
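To put rough numbers on that sensitivity, here is a minimal sketch based on the standard relation ∆G = RT ln Kd. The function names and the choice of 298.15 K are my own assumptions for illustration; they don't come from any particular FEP package. At that temperature a tenfold change in Kd works out to about 1.4 kcal/mol, and an error of 1 kcal/mol to roughly a fivefold change, so the order of magnitude of the rule of thumb holds.

```python
import math

# Gas constant in kcal/(mol*K); temperatures are in kelvin.
R_KCAL = 1.987204e-3


def fold_change_in_kd(delta_delta_g, temperature=298.15):
    """Factor by which Kd changes when the binding free energy
    shifts by delta_delta_g (kcal/mol), from dG = RT ln Kd."""
    return math.exp(delta_delta_g / (R_KCAL * temperature))


def free_energy_for_fold_change(fold, temperature=298.15):
    """Free energy shift (kcal/mol) that corresponds to a given
    fold change in Kd; the inverse of fold_change_in_kd."""
    return R_KCAL * temperature * math.log(fold)


if __name__ == "__main__":
    # Illustrative error sizes only, not measured data.
    for error in (0.5, 1.0, 1.4, 2.0):
        factor = fold_change_in_kd(error)
        print(f"{error:.1f} kcal/mol error -> {factor:.1f}-fold change in Kd")
```

Because the dependence is exponential, errors compound quickly: doubling the error in ∆G squares the factor by which the predicted Kd is off.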
I have always thought that this glass ceiling of 1 kcal/mol really represents all the riches we can gain from understanding the diverse factors that dictate protein-ligand binding. The magic number is like the mythical island of Ithaca. You may arrive there weary and old, and may even discover that the place does not exist, but the wisdom you will have gained on the way will be of permanent value. That's what counts.