The first step in much of structure-based drug design (SBDD), including docking, is the selection of a good crystal structure, if one exists. The crystal structure is used as the starting point for seeking new leads and optimizing them. Pick up any docking method evaluation paper in J. Med. Chem. and you will come across a benchmarking set of protein structures that are used as starting models for testing the docking protocols.
Now crystal structures are frequently as close as you can get to "reality", but even they are models and should be treated with some skepticism. But the more obvious question for such a study when multiple crystal structures of a protein are available is, which crystal structure among those should you use?
The short answer to this question is: choose one with good resolution (preferably 2.0 Å or better), which does not have missing portions, and which is preferably also unencumbered by a whole lot of counterions, stabilizing molecules, and other ligands.
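Those three criteria are easy to turn into a quick triage script. What follows is only a minimal sketch, not a real curation protocol: the `CandidateStructure` record, the `rank_candidates` helper, the structure labels and all the numbers are made up for illustration, and in practice you would fill these fields in from the PDB entries yourself.

```python
from dataclasses import dataclass


@dataclass
class CandidateStructure:
    """Hypothetical summary record for one crystal structure."""
    label: str
    resolution: float       # in angstroms; lower is better
    missing_residues: int   # residues absent from the model
    extra_het_groups: int   # counterions, additives, extra ligands


def rank_candidates(candidates, max_resolution=2.0):
    """Keep complete structures at or below the resolution cutoff,
    then prefer fewer extra groups, breaking ties by resolution."""
    usable = [
        c for c in candidates
        if c.resolution <= max_resolution and c.missing_residues == 0
    ]
    return sorted(usable, key=lambda c: (c.extra_het_groups, c.resolution))


# Invented example entries, not real PDB data.
candidates = [
    CandidateStructure("struct_A", 1.8, 0, 1),
    CandidateStructure("struct_B", 2.6, 0, 0),   # fails the resolution cutoff
    CandidateStructure("struct_C", 1.9, 12, 0),  # has missing residues
    CandidateStructure("struct_D", 1.5, 0, 3),
]

ranked = rank_candidates(candidates)
print([c.label for c in ranked])
```

The sort order (clutter first, resolution second) is an arbitrary choice on my part; you may well want to weight resolution more heavily.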
But is that really all? Maybe not. Recently, I was playing around with docking some molecules into kinase crystal structures. I was trying to see if docking scores can correlate with selectivity for one related protein over another. Usually they don't, but I was going to look at similar proteins and similar structures, so I thought it might be worth a shot. I was particularly looking at cyclin-dependent kinases (CDKs), which share a lot of homology, especially in their ATP binding pocket. CDK2 is probably the best-characterised of the CDKs, and there are at least four to five different high-resolution CDK2 structures in the PDB. I was also keener on using CDK2 because it was one of the proteins used for benchmarking the docking program.
So I decided upon two structures, both of high resolution. One had ATP bound to it, the other had staurosporine. I took an inhibitor which was known to be selective for another CDK over CDK2. First I docked it into that other CDK, and into the CDK2 structure that had ATP bound to it (without the ATP, of course). I noted that the score for the other CDK was higher (which actually means more negative, since it is supposed to reflect the free energy of binding). That was consistent with the experimental data, which showed that the inhibitor was in fact selective for the other kinase. But then I docked it into the other CDK2 structure, and now the CDK2 score was much better than the score for the other kinase. So the two docking runs gave two opposite results for the same protein. One predicted that the inhibitor would prefer the other CDK over CDK2, and the other predicted that it would prefer CDK2.
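To make the sign convention and the flip concrete, here is a toy sketch. The scores are invented for illustration (they are not the numbers from my runs), and `preferred_protein` is just a hypothetical helper that picks the most negative score.

```python
def preferred_protein(scores):
    """Return the protein with the most negative (i.e. best) docking score."""
    return min(scores, key=scores.get)


# Invented scores in arbitrary units; more negative means better.
other_cdk_score = -9.0
cdk2_scores = {
    "cdk2_structure_a": -7.5,
    "cdk2_structure_b": -10.5,
}

# Compare the other CDK against each CDK2 structure in turn.
calls = {}
for name, cdk2_score in cdk2_scores.items():
    calls[name] = preferred_protein(
        {"other_cdk": other_cdk_score, "cdk2": cdk2_score}
    )

print(calls)
```

Same ligand, same protein, and the predicted preference depends entirely on which CDK2 structure went in.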
Now one of the things this says is that you cannot trust docking scores much. But this was still weird, because the question persists: which CDK2 structure should I use if I am going to do some SBDD and selectivity studies? I don't know the answer to this question, but I took a look at the two structures to try to figure it out. In the one with ATP, the adenine ring of ATP nicely made two hydrogen bonds with the hinge region of the kinase, and so did my inhibitor, which was supposed to be an ATP mimic. In the other one, however, the backbone carbonyl that was supposed to form the hydrogen bond to the inhibitor was rotated upwards by almost 90 degrees. It did not form a hydrogen bond with staurosporine, and it did not have to, because staurosporine does not "look" like ATP. And needless to say, it could not hydrogen bond with my inhibitor either. That explains why the docking scores from the two structures differed so much.
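If you want to put a number on a carbonyl flip like this instead of eyeballing it, you can measure the angle between the two C=O bond vectors. This is a rough sketch that assumes the two structures have already been superimposed into the same coordinate frame; the coordinates below are invented, and `bond_vector` and `angle_degrees` are hypothetical helper names.

```python
import math


def bond_vector(c_xyz, o_xyz):
    """Vector pointing from the carbonyl carbon to the carbonyl oxygen."""
    return tuple(o - c for c, o in zip(c_xyz, o_xyz))


def angle_degrees(u, v):
    """Angle between two 3D vectors, in degrees."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    cos_theta = max(-1.0, min(1.0, dot / norm))  # guard against rounding
    return math.degrees(math.acos(cos_theta))


# Invented coordinates (angstroms) for the same carbonyl in two
# superimposed structures; the carbon sits at the origin in both.
co_first = bond_vector((0.0, 0.0, 0.0), (1.23, 0.0, 0.0))
co_second = bond_vector((0.0, 0.0, 0.0), (0.0, 1.23, 0.0))

flip = angle_degrees(co_first, co_second)
print(f"carbonyl rotation: {flip:.1f} degrees")
```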
What's the solution for circumventing such a problem? One quick answer that comes to mind is: if you are docking a ligand that is "similar" to ATP, use the protein structure that has ATP bound to it. However, "similarity" can be a tricky concept, and should be considered carefully. It may also be somewhat easier for kinase inhibitors, because there are literally hundreds of very typical planar, heterocyclic aminopyrimidine-based kinase inhibitors that share some very obvious similarity to ATP (or not...).
But probably the best take-home message from a computational standpoint is that rigid protein docking, not surprisingly, can get you into some bad trouble. Not allowing the protein to move means that you are locking it into whatever conformation it happened to adopt in the crystal, which was itself shaped by the ligand bound there. To test this thought, I did an induced-fit docking run on both structures with the inhibitor. Gratifyingly, both runs converged on the same protein-ligand structure.
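One simple way to check that two runs "converged" is the RMSD between the ligand poses. The sketch below assumes both poses list the same atoms in the same order and sit in the same coordinate frame (i.e. the proteins were superimposed first). The coordinates are invented, and the 2.0 Å cutoff is just a commonly used rule of thumb that I am assuming here, not something I tuned.

```python
import math


def pose_rmsd(pose_a, pose_b):
    """RMSD between two poses given as equal-length lists of (x, y, z),
    with atoms in matching order and no further alignment applied."""
    if len(pose_a) != len(pose_b) or not pose_a:
        raise ValueError("poses must be non-empty and the same length")
    squared = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(pose_a, pose_b)
    )
    return math.sqrt(squared / len(pose_a))


# Invented ligand coordinates (angstroms) from two hypothetical runs.
pose_run_1 = [(1.0, 0.0, 0.0), (2.0, 1.0, 0.0), (3.0, 1.0, 1.0)]
pose_run_2 = [(1.1, 0.0, 0.0), (2.1, 1.0, 0.0), (3.1, 1.0, 1.0)]

rmsd = pose_rmsd(pose_run_1, pose_run_2)
converged = rmsd < 2.0  # assumed cutoff, in angstroms
print(f"RMSD = {rmsd:.2f}, converged = {converged}")
```

This only compares the ligand; to claim the same protein-ligand structure you would also want to look at the binding-site residues.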
Choosing a PDB X-ray structure may not be as easy as we think, and has to be done critically. And more importantly, as usual, what we put in is what we get out. Rigid docking is OK if there's only one crystal structure, and then only because there's no other choice. But in other circumstances, always allow the protein to move. That's closer to nature.