Field of Science

Showing posts with label GPCR modeling. Show all posts

The GPCR Network: A model for open scientific collaboration

This post was first published on the Scientific American Blog Network


The complexity of GPCRs is illustrated by this mechanical view of their workings (Image: Scripps Research Institute)
G Protein-Coupled Receptors (GPCRs) are the messengers of the human body, key proteins whose ubiquitous importance was validated by the 2012 Nobel Prize in chemistry. As I mentioned in a post written after the announcement of the prize, GPCRs are involved in virtually every physiological process you can think of, from sensing colors, flavors and smells to the action of neurotransmitters and hormones. In addition they are of enormous commercial importance, with something like 30% of marketed drugs binding to these proteins and regulating their function. These drugs include everything from antidepressants to blood-pressure lowering medications.

But GPCRs are also notoriously hard to study. They are hard to isolate from their protective lipid cell membrane, hard to crystallize and hard to coax into giving up their molecular secrets. One reason the Nobel Prize was awarded was that the two researchers – Robert Lefkowitz and Brian Kobilka – perfected techniques to isolate, stabilize, crystallize and study these complex proteins. But there’s still a long way to go. There are almost 800 GPCRs, out of which ‘only’ 16 have been crystallized during the past decade or so. In addition, all the studied GPCRs are from the so-called Class A family. There are still five classes left to decipher, and these contain many important receptors including the ones involved in smell. Clearly it’s going to be a long time before we can get a handle on the majority of these important proteins.

Fortunately, GPCR researchers have realized something important: many of these GPCRs have similar amino acid sequences. If you know what experimental conditions work for one protein, perhaps you can use the same conditions for another similar GPCR. Even for dissimilar proteins one can bootstrap based on existing knowledge. Based on the similarity you could also build computer models for related proteins. Finally, you can use a small organic molecule like a drug to essentially serve as a clamp that helps stabilize and crystallize the GPCR.

But all this knowledge represents a distributed body of work, spread over the labs of researchers worldwide and expected to be sequestered by them for their own benefits. These individual researchers working in isolation would not only face an uphill battle in figuring out the right conditions for studying their proteins but would also run the risk of reinventing the wheel and duplicating conditions from other laboratories. The central question asked by all these researchers is, how does the binding of a small molecule like a drug on the outside of a GPCR lead to the transmission of a signal to the inside?

Enter the GPCR Network, a model of collaborative science which promises to serve as a fine blueprint for other similar efforts. The network was created through a funding opportunity from the National Institute of General Medical Sciences in 2010 and has set itself the goal of structurally characterizing 15-25 GPCRs in the next five years. The effort is based at the Scripps Research Institute in La Jolla and involves at least a dozen academic and industrial labs.

So how does this network work? The idea for the network came from the recognition that there are hundreds of GPCR researchers spread across the world. Each one is an expert on a particular GPCR but each one has largely worked separately. What the network does is to leverage the expertise from one researcher’s lab and apply it to a similar GPCR in another lab (there are technical criteria for defining ‘similarity’ in this case). There are a variety of very useful protocols, ideas and equipment that can be shared between labs. This sharing cuts down on redundant protocols, saves money and resolves new GPCR puzzles much faster than could be achieved individually.

For instance, a favorite strategy for stabilizing a GPCR involves tagging it with an antibody that essentially holds the protein together. An antibody that worked for one GPCR can be lent to a researcher who is investigating another GPCR with a similar amino acid sequence. Or perhaps there is a chemist who has discovered a new molecule that binds very tightly to a particular receptor. The network would put him in touch with a crystallographer who could use that molecule to fish out that GPCR from a soup of other proteins and crystallize it. Once the crystallographer solves the structure of the protein using this molecule, he or she could then send the structure to a computer modeler who can use it to build a structure for another particularly stubborn GPCR which could not be crystallized. The computer model might explain some unexpected observations from a fellow network researcher who was using a novel instrumental technique. This novel technique would then be shared with everyone else for further studies.

Thus, what has happened here is that the individual pockets of knowledge from a biochemist, organic chemist, crystallographer and computer modeler – none of whom would have proceeded very far by themselves – are merged together to provide an integrated picture of a few important GPCRs. The entire pipeline of protocols including protein isolation, purification, structure determination and modeling also serves as a feedback loop, with insights from one step constantly informing and enriching others. This represents a fine example of how collaborative and open research can accelerate important research and save time and money. It's to the credit of these scientists that they haven't held their valuable reagents and techniques close to their chest but are sharing them for everyone's benefit.

In the three years since it has been up and running, the GPCR Network has leveraged the expertise of many experts in generating insights into the structure and function of important receptors. Its collaborative efforts have resulted in eight protein structures in just two years. These include the adenosine receptor which mediates the effect of caffeine, the opioid receptor which is the target for morphine and the dopamine receptor which binds to dopamine. Every one of these collaborations involved a dozen or so researchers across at least three or four labs, with each lab employing its particular area of expertise. Gratifyingly, there are also a few industrial labs involved in the efforts and we can hope that this number will increase even as the pharmaceutical industry becomes more collaborative.

It’s also worth noting that the network was funded by the NIGMS, an institution which has been subject to the whims of budget and personnel cuts. This institution is now responsible for an effort that’s not only accelerating research in a fundamental biological area but is also contributing to a better understanding of existing and future drugs. Scientists, politicians and members of the public who are seeking a validation of basic, curiosity-driven scientific research and reasons to fund it shouldn’t have to look far.

GPCR modeling: The devil hasn't left the details

The last decade has been a bonanza decade for the elucidation of structures of G Protein-Coupled Receptors (GPCRs), culminating with the landmark structure of the first GPCR-G protein complex published a few weeks ago. With 30% of all drugs targeting these proteins and their involvement in virtually every key aspect of health and disease, GPCRs remain glowingly important targets for pure and applied science.

Yet there are miles to go before we sleep. Although we now have more than a dozen structures of half a dozen GPCRs in various states (inactive, active, G-protein coupled), there are still hundreds of GPCRs whose structures are not known. The existing GPCRs all fall into the 'Class A' GPCRs. We still have to mine the vast body of Class B and C GPCRs which comprise a huge number of functionally relevant proteins. The crystal structures which we do have comprise an invaluable resource but from the point of view of drug discovery, we still don't have enough.

In the absence of crystal structures, homology modeling wherein a protein of high sequence homology is used to build a computational model for an unknown structure has been the favorite tool of modelers and structural biologists. Homology modelers were recently provided an opportunity to pit their skills against nature when a contest asked them to predict the structures of the D3 and CXCR4 receptors just before the real x-ray structures came out. Both proteins are important targets involved in multiple processes like neurotransmission, depression, psychoses, cancer and HIV infection. The D3 structure prediction involved predicting the ligand-bound structure of the protein complexed with eticlopride, a D3 antagonist.

The results of the contest have been published before, but in a recent Nature Chemical Biology paper, a team led by Brian Shoichet (UCSF) and Bryan Roth (UNC-Chapel Hill) perform another test of homology modeling, this time connected to the ability to virtually screen potential D3 receptor ligands and discover novel active molecules with interesting chemotypes.

Two experiments provided the comparison. One protocol used the D3 homology model to screen about 3 million compounds by docking, out of which about 20 were picked on the basis of docking scores and inspection and then tested in assays. The homology model was built on the basis of the published structure of the β2 adrenergic receptor, which has been heavily studied structurally. Then, after the x-ray structure of the D3 was released, they repeated the virtual screening protocol with the crystal structure; again, 3 million compounds were screened, out of which roughly 20 were picked and tested.

First the somewhat surprising and heartening result: both homology model and crystal structure demonstrated similar hit rates, about 20%. In both cases the actual affinity of the ligands ranged from about 200 nM to 3 µM. In addition, the screen revealed some novel chemotypes that did not resemble known D3 antagonists (although not surprisingly, some hits were similar to eticlopride). As an added bonus, the top ranked ligands using the homology model did not measurably inhibit the template β2 adrenergic receptor, which means that the homology model probably did not retain the "memory" of the original template.
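To make the "hit rate" figure concrete, here is a minimal sketch of the arithmetic. The counts are hypothetical and chosen only to match the rounded "about 20 tested, about 20% active" numbers quoted above; they are not the paper's exact values.

```python
def hit_rate(n_hits: int, n_tested: int) -> float:
    """Fraction of experimentally tested compounds that turned out to be active."""
    if n_tested <= 0:
        raise ValueError("n_tested must be positive")
    return n_hits / n_tested


# Hypothetical counts (assumption): 4 actives out of 20 tested for each screen.
model_rate = hit_rate(n_hits=4, n_tested=20)
xray_rate = hit_rate(n_hits=4, n_tested=20)

print(f"homology model: {model_rate:.0%}, crystal structure: {xray_rate:.0%}")
```

With only about 20 compounds tested per screen, a single extra hit moves the rate by five percentage points, so "similar" hit rates should be read loosely.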

Now for the bee in the bonnet. The very fact that the homology model and the crystal structure produced different hits means that the two models were not identical (only one hit overlapped between the two). Of course, it's too much to expect a model of a protein with thousands of moving parts to be identical to the experimental structure, but it goes to show how carefully homology modeling has to be performed and how it can still be imperfect. What is more disturbing is that the differences between the model and the crystal structure responsible for the different hits were small; in one case the difference between two carbons was only 1 Å between the two models. Other amino acids differed by less than that.
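One simple way to quantify how little two hit lists agree is to count the shared compounds and compute a Jaccard index. The sketch below uses made-up compound identifiers (an assumption, purely for illustration) arranged so that exactly one hit is shared, as in the comparison described above.

```python
def hit_overlap(hits_a: set, hits_b: set):
    """Return the shared hits and the Jaccard index of two hit sets."""
    shared = hits_a & hits_b
    union = hits_a | hits_b
    jaccard = len(shared) / len(union) if union else 0.0
    return shared, jaccard


# Hypothetical compound IDs (assumption); only "cmpd_04" appears in both lists.
model_hits = {"cmpd_01", "cmpd_02", "cmpd_03", "cmpd_04"}
xray_hits = {"cmpd_04", "cmpd_05", "cmpd_06", "cmpd_07"}

shared, jaccard = hit_overlap(model_hits, xray_hits)
print(f"shared hits: {sorted(shared)}, Jaccard index: {jaccard:.2f}")
```

A Jaccard index near zero with comparable hit rates is exactly the uncomfortable situation here: both structures "work", but they find different molecules.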

And all this even after generating a stupendous number of models of unbound and ligand-bound protein. As the paper says, the team generated about 98 million initial ligand-bound homology models. Screening the top models among these involved generating multiple conformations and binding modes of the 3 million compounds; the total number of discrete protein-ligand complexes resulting from this exercise numbered about 2 trillion. That this kind of evaluation is possible is a tribute to the enormous computing power we have at our fingertips. But it's also a commentary on how relatively primitive our models are so that we are still at a loss to predict minute structural differences with significant consequences in finding new active molecules.
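For a sense of scale, a back-of-the-envelope calculation using the rounded figures quoted above (3 million compounds, about 2 trillion complexes) gives the average number of complexes evaluated per compound. The average is my own derived number, not one reported in the paper, and it lumps together ligand conformations, binding modes and the different protein models.

```python
# Rounded figures quoted in the text above.
n_compounds = 3_000_000
n_complexes = 2_000_000_000_000

# Derived quantity (not from the paper): average complexes scored per compound.
complexes_per_compound = n_complexes / n_compounds

print(f"{complexes_per_compound:,.0f} complexes per compound on average")
```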

So where does this lead us? I think it's really useful to be able to perform such comparisons between homology models and crystal structures and we can only hope more such comparisons will be possible by virtue of an increasing pipeline of GPCR structures. Yet these exercises demonstrate how challenging it is to generate a truly accurate homology model. A few years ago a similar study demonstrated that a difference in a single torsional angle of a phenylalanine residue (and that too resulting in a counter-intuitive gauche conformation) affected the binding of a ligand to a homology model of the β2 adrenergic receptor. Our ability to pinpoint such tiny differences in homology models is still in its infancy. And this is just for Class A GPCRs for which relatively accurate templates are available. Get into Class B and Class C territory and you start looking for the proverbial black cat in the dark.
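To show what "a single torsional angle" means in practice, here is a minimal sketch that computes the dihedral angle defined by four atoms and labels it as roughly gauche or trans. The coordinates are idealized toy points, not atoms from any receptor structure, and the 120° cutoff between the two labels is a rough convention I am assuming for illustration.

```python
import math


def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dihedral(p0, p1, p2, p3):
    """Dihedral angle in degrees about the p1-p2 bond, in the range [-180, 180]."""
    b0 = _sub(p0, p1)
    b1 = _sub(p2, p1)
    b2 = _sub(p3, p2)
    norm = math.sqrt(_dot(b1, b1))
    b1 = tuple(x / norm for x in b1)
    # Project b0 and b2 onto the plane perpendicular to the central bond.
    v = _sub(b0, tuple(_dot(b0, b1) * x for x in b1))
    w = _sub(b2, tuple(_dot(b2, b1) * x for x in b1))
    return math.degrees(math.atan2(_dot(_cross(b1, v), w), _dot(v, w)))


def rotamer_label(angle_deg, cutoff=120.0):
    """Rough label (assumed convention): |angle| above the cutoff is trans, else gauche."""
    return "trans" if abs(angle_deg) > cutoff else "gauche"


# Idealized toy coordinates (assumption), differing only in the last atom.
p0, p1, p2 = (1.0, 0.0, 0.0), (0.0, 0.0, 0.0), (0.0, 0.0, 1.0)
trans_p3 = (-1.0, 0.0, 1.0)
gauche_p3 = (0.5, math.sqrt(3) / 2, 1.0)

for name, p3 in (("trans-like", trans_p3), ("gauche-like", gauche_p3)):
    angle = dihedral(p0, p1, p2, p3)
    print(f"{name}: |angle| = {abs(angle):.1f} degrees -> {rotamer_label(angle)}")
```

The two toy geometries differ only in where the fourth atom sits, which is the kind of small change that a side-chain rotamer flip amounts to.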

Now throw in the fascinating phenomenon of functional selectivity and you have a real wrench in the works. Functional selectivity, whereby different conformations of a GPCR binding to the same ligand modulate different signal transduction pathways and cause the ligand to change its mode of action (agonist, inverse agonist etc.) takes modeling of GPCRs to unknown levels of difficulty. Most modeling currently being done does not even attempt to consider protein flexibility which is at the heart of functional selectivity. Routinely including protein flexibility in GPCR modeling has some way to go.

That is why I think that, as much as we will continue to learn from GPCR homology modeling, it's not going to contribute massively to GPCR drug discovery anytime soon. Constructing accurate homology models of even a fraction of the GPCR universe will take a long time. Using such models would be like throwing darts at a board for which the center is unknown. Until we can locate the center, and for as long as we are plagued by the complexities of functional selectivity, we may be better off pursuing experimental approaches that can map the effect of ligands on a particular GPCR using multifunctional assays. Fortunately, such approaches are definitely seeing the light of day.

Carlsson, J., Coleman, R., Setola, V., Irwin, J., Fan, H., Schlessinger, A., Sali, A., Roth, B., & Shoichet, B. (2011). Ligand discovery from a dopamine D3 receptor homology model and crystal structure. Nature Chemical Biology. DOI: 10.1038/nchembio.662

Why modeling GPCRs is (still) hard

Well, it's hard for several reasons which I have discussed in previous posts, but here's one reason demonstrated by a recent paper. In this paper they crystallized the β2 adrenergic receptor with an antagonist. Previously, in the landmark publication of the β2 structure in 2007, the protein had been crystallized with an inverse agonist. Recall that an inverse agonist inhibits the basal activity of the GPCR whereas an antagonist stabilizes both active and inactive states but does not affect the basal activity.

In this case they crystallized the β2 with an antagonist and compared the resulting structure to that of the inverse agonist-GPCR complex. And they saw...nothing in particular. The protein backbone and side-chain locations are very similar for the antagonist (compound 3) and inverse agonist (compound 2) shown in the figure below.



As we can see in the figure, about the only side-chain that shows any movement is the tyrosine on the left (Y316). No wonder that cross-docking the two ligands (that is, docking one ligand into the other ligand's protein conformation) gave very accurate ligand orientations; this was essentially a softball problem for a docking program since the antagonist was being docked into a protein conformation that was virtually identical to its own.
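Docking accuracy of this kind is usually reported as the root-mean-square deviation (RMSD) between the docked pose and the crystallographic pose of the ligand. Here is a minimal sketch of that calculation; the coordinates are invented, both poses are assumed to already be in the same receptor frame (so no superposition is done), and the 2 Å success threshold is a common rule of thumb rather than anything taken from the paper.

```python
import math


def rmsd(coords_a, coords_b):
    """RMSD in the input units between two equally ordered lists of (x, y, z) atoms."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared = sum((xa - xb) ** 2
                  for atom_a, atom_b in zip(coords_a, coords_b)
                  for xa, xb in zip(atom_a, atom_b))
    return math.sqrt(squared / len(coords_a))


# Hypothetical three-atom ligand (assumption): docked pose shifted 0.3 Å along x.
crystal_pose = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (1.5, 1.5, 0.0)]
docked_pose = [(0.3, 0.0, 0.0), (1.8, 0.0, 0.0), (1.8, 1.5, 0.0)]

SUCCESS_THRESHOLD = 2.0  # Å, rule-of-thumb cutoff (assumption)
value = rmsd(crystal_pose, docked_pose)
print(f"RMSD = {value:.2f} Å, success: {value <= SUCCESS_THRESHOLD}")
```

When the receptor conformations are nearly identical, as they are here, a low cross-docking RMSD says more about the similarity of the two structures than about the docking program.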

But of course, we know that antagonists and agonists affect GPCR function quite differently. As this study shows, clearly the action is not taking place in the ligand-binding pocket where things aren't really moving. So where is the real action? It's naturally taking place on the intracellular side, where the GPCR interacts with a medley of other proteins. And as the paper accurately notes, the difference between antagonist and inverse agonist binding is probably also reflected in the protein dynamics corresponding to the two ligands.

Good luck modeling that. That's the whole deal with modeling GPCRs. Simply modeling the ligand-binding pocket is not going to help us understand the differences between the binding of various ligands; one has to model multiprotein interactions and subtle effects on dynamics that are relayed through the helices. The program Desmond which I described in an earlier post is a powerful MD program, but MD is only going to really turn heads when it can take account of multiprotein interactions, and such interactions happen on a time-scale much longer than what even Desmond can access. We have a long way to go before we can do all this. But please, don't stop.

Wacker, D., Fenalti, G., Brown, M., Katritch, V., Abagyan, R., Cherezov, V., & Stevens, R. (2010). Conserved Binding Mode of Human β-2 Adrenergic Receptor Inverse Agonists and Antagonist Revealed by X-ray Crystallography. Journal of the American Chemical Society, 132 (33), 11443-11445. DOI: 10.1021/ja105108q

Computational modeling of GPCRs: What are the challenges?

GPCRs are extremely important proteins both for pure and applied science research, but they are also very difficult to crystallize and hence structural information on them has been sparse. Naturally in such a case, computational modeling can be expected to be of great value in providing insight into GPCR structure and function. However, even though progress has been impressive, such modeling still has to overcome many challenges. A recent review lists some of them.

Firstly, in the absence of a crystal structure, homology modeling wherein a sequence for an unknown structure is 'threaded' through that of a known one is well-established as a valuable technique. However the technique is tricky. First and foremost one has to get the right sequence alignment between the target and the template. As the article notes, recent studies have suggested that using multiple structures for alignment instead of a single one provides better results. Particularly noteworthy is this detailed study. Once a homology model has been obtained, it must be meticulously examined, both for internal consistency (bad contacts, incorrect hydrogen bonding interactions etc.) and for its agreement with experiment. Data from cross-linking studies and mutagenesis can be used to achieve this. A recent promising development has been termed 'ligand-supported homology modeling'. In this process, topographical protein-ligand interaction data from mutagenesis and other studies are used to limit the number of homology models. Such data-driven homology modeling is becoming increasingly popular.
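Since everything downstream hinges on the target-template alignment, a quick sanity check is the percent sequence identity over the aligned positions. The sketch below assumes the two sequences have already been aligned by some other tool; the fragments are toy strings invented for illustration, and the choice to ignore gapped columns in the denominator is one of several conventions in use.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over columns where neither aligned sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if a != gap and b != gap]
    if not columns:
        return 0.0
    matches = sum(1 for a, b in columns if a == b)
    return 100.0 * matches / len(columns)


# Toy pre-aligned fragments (assumption), not real receptor sequences.
template = "DRYLAIV-HPL"
target = "DRYVAIVTHP-"

print(f"identity: {percent_identity(template, target):.1f}%")
```

A single number like this hides where the mismatches fall, which is why the alignment still has to be inspected by eye, especially around the loops and the binding pocket.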

Once a good homology model has been obtained, many things can be done with it. Molecular dynamics (MD) simulations provide a very valuable avenue for exploring protein motion and can be used to detect structural features not obvious in static models. A recent MD simulation of the beta-adrenergic receptor helped to resolve discrepancies between biochemical and structural observations. MD simulations can be used to investigate protein dynamics and to refine the models. Several challenges present themselves during this procedure. Firstly, while helices in GPCRs can be well-modeled, loops (of which there are six: three intracellular and three extracellular) are much harder to model because of their higher flexibility and because they are often ill-resolved in crystal structures. Unfortunately, it's these loops which are important ligand-interacting elements, so getting them right is key. Recently developed algorithms for loop refinement, based either on first-principles energy minimization or on statistical modeling from a database of known loop conformations, have been used in getting loops right. Also, state-of-the-art long MD simulations spanning several microseconds can be used to model large-scale structural changes in GPCRs.

There are still immense challenges to be overcome in understanding GPCRs. One of the biggest concerns the cycling between several inactive and active states (and not just one active and one inactive state) that often present conflicting features which can be subject to varying interpretation. For instance, for class A GPCRs (which is the largest class), it has been well-established that activated states involve the breakage of the "ionic lock", a salt bridge between an arginine on transmembrane helix 3 and a glutamate on transmembrane helix 6. Breaking this lock allows TM6 to shift away from TM3 and towards TM5, a hallmark of GPCR activation. Yet the MD study on the beta2 cited above indicated that even an inactive state may feature breakage of this lock.
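In a model or an MD snapshot, whether the ionic lock is "broken" is typically judged from the distance between the charged groups of the two residues. The sketch below shows that check with invented coordinates; the 4.0 Å salt-bridge cutoff and the choice of one nitrogen and one oxygen atom per residue are simplifying assumptions of mine, not criteria taken from the review or the MD study.

```python
import math

SALT_BRIDGE_CUTOFF = 4.0  # Å, assumed cutoff for calling the lock intact


def ionic_lock_intact(arg_nitrogen, glu_oxygen, cutoff=SALT_BRIDGE_CUTOFF):
    """True if the Arg N to Glu O distance is within the cutoff."""
    return math.dist(arg_nitrogen, glu_oxygen) <= cutoff


# Hypothetical snapshots (assumption): atom positions in Å for two frames.
frames = {
    "lock formed": ((0.0, 0.0, 0.0), (3.0, 0.0, 0.0)),
    "lock broken": ((0.0, 0.0, 0.0), (9.0, 0.0, 0.0)),
}

for name, (arg_n, glu_o) in frames.items():
    distance = math.dist(arg_n, glu_o)
    print(f"{name}: {distance:.1f} Å, intact: {ionic_lock_intact(arg_n, glu_o)}")
```

Tracked over a trajectory, a check like this is how one would see an inactive-state simulation drifting in and out of the locked geometry.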

In the GPCR jungle, strange shape-shifting creatures appear and clutch gems of insight in their palms. It is only fitting that we throw the kitchen sink at them to unravel their secrets, and computational techniques can only be a valuable arrow in this quiver.

Yarnitzky T, Levit A, & Niv MY (2010). Homology modeling of G-protein-coupled receptors with X-ray structures on the rise. Current opinion in drug discovery & development, 13 (3), 317-25 PMID: 20443165