
Those who cannot predict...discover?: On 'deliberate serendipity' in drug design

A commenter on my last post about the computer-aided discovery of a promising new morphine-like analog brought up a very interesting point that deserves some discussion. The commenter, who was the synthetic chemist on the study, says:

"I particularly liked the part where the authors admit that no matter how smart their approach was, many unique properties of the lead compound were pure luck and wouldn't be possible to predict with current tools. And if Brian Shoichet writes this, I'm pretty sure that nobody in the world can claim otherwise. As a synthetic chemist behind PZM21, I confess that its synthesis in diastereomerically pure form and - especially - figuring out its absolute configuration were quite a luck, too."

This sentiment is confirmed in the paper itself, where the authors state:

"Several caveats deserve to be mentioned. Although structure-based discovery succeeded in finding novel scaffolds and supported facile optimization, some of the properties of PZM21 were likely fortuitous. Biased signalling through G protein and arrestin pathways reflects the stabilization of conformations over 30 Å from the orthosteric site where PZM21 binds. We did not select molecules that preferentially stabilize these conformations, but instead relied on chemical novelty to confer new biological properties.

Receptor subtype selectivity was attained by simply selecting molecules that extended into variable regions of the receptor, a strategy that may not always work. Several aspects of the pharmacology presented here remain preliminary, including the metabolic stability studies and the pharmacokinetics, and it is not clear at this time whether the unprecedented in vivo activity of PZM21 reflects its biased and specific agonism, or some other feature conferred by its novel chemotype. Finally, identification of agonists from docking to an inactive state receptor structure cannot always be relied upon, though there is precedence for doing so against opioid receptors."

Where does all this interesting biology come from? Several factors may contribute, but recall that the top ligands form an extra hydrogen bond with an aspartate which hadn't been seen in opioid receptor ligands before. Knowing the cooperative effects which ligand binding can transmit through protein structures, it's not far-fetched to believe that this tiny difference may have contributed to the biological effects. The general point here is that several biological features of the small molecule, especially biased agonism (differential interactions between GPCRs and their protein partners) and the stabilization of conformations in remote parts of the protein, were not predicted. The chemical features of the ligand, however, were discovered rationally by docking. Thus, as the authors say, they used chemistry to discover novel biology.

This is an important point that should not be lost on critics or advocates of molecular modeling and medicinal chemistry. Sometimes novel biology can be discovered precisely because it's unpredictable. This is not some facetious cop-out that puts a positive spin on the unpredictability of modeling downstream changes in protein conformation and inter-protein interactions. Rather, because chemistry is correlated to biology in highly non-linear and emergent ways, it's hard to tell beforehand what will happen when you discover a new ligand with novel interactions with a protein. By the same token, this unpredictability provides a chance to make forays into unexplored biological territory.

It turns out this is actually very good news for chemists and drug designers. Operationally, it's far easier (relatively speaking) to discover new ligands first and then find out what they do: virtually docking a large library of molecules against a complex protein like a GPCR is far more tractable than trying to predict what happens downstream after those molecules bind. Similarly, making analogs of these molecules is also quite easy. On the other hand, modeling any kind of remote conformational change or biased agonism in this case would not just have been futile but would likely have led the researchers in the wrong direction, and the same is true of other targets. It was far better to use the chemistry as the starting point, and then let the biology sort itself out.

This study therefore demonstrates a very fortuitous aspect of medicinal chemistry and modeling that is true of drug discovery in general: you can actually use the unpredictability of these techniques to discover new biology, because the more uncertain your starting point is, the more probable it is that it will lead to novelty. In uncertainty lies opportunity. All you really need to do is make sure you are endowing the system with novel chemistry; the complex but deterministic relationship between well-understood chemistry and ill-understood biology will then take care of the rest. The key is to generate as many starting points for unexpected biology as possible. You may not always discover useful biology this way, but you certainly increase the chances of doing so.

Sometimes you need to give serendipity a deliberate platform to launch itself.
 

The rise of translational research and the death of organic synthesis (again)?

The journal ACS Chemical Neuroscience has an editorial lamenting the shortage of qualified synthetic organic chemists and pharmacologists in the pharmaceutical and biotech industries. The editorial lays much of the blame on flagging support for these disciplines in favor of the fad of 'translational research'. It makes the cogent point that historically, accomplished synthetic organic chemists and pharmacologists have been the backbone of the industry; whatever medicinal chemistry and drug design they needed was picked up on the job. The background and rigor that these scientists brought to their work was invaluable in discovering some of the most important drugs of our time, including ones against cancer, AIDS and heart disease. As the editorial puts it:
"The current fascination of applied basic science, i.e., translational science, to funding agencies, due in large part to the perception of a more immediate impact on human health, is a harbinger of its own doom. Strong words? It is clear in the last 10 years that research funding for basic organic chemistry and/or molecular pharmacology is in rapid decline. However, the quality of translational science is only as strong as the basic science training and acumen of its practitioners—this truth is lost in the translational and applied science furor. A training program that instills and trains the “basics” while offering additional research in applied science can be a powerful combination; yet, funding mechanisms for the critical first steps are lacking.
Historically, the pharmaceutical industry hired the best classically trained synthetic chemists and pharmacologists, and then medicinal chemistry/drug discovery was taught “on the job”. These highly trained and knowledgeable experts could tackle any problem, and it is this basic training that enabled success against HIV in the 1990s. When the next pandemic arises in the future, we will have lost the in-depth know-how to be effective. Moreover, innovation will diminish."
I myself have a problem with pushing translational research at the expense of basic research. As I wrote in a piece for the Lindau Nobel Laureate meeting a few years ago, at least two problems riddle this approach:
"The first problem is that history is not really on the side of translational research. Most inventions and practical applications of science and technology which we take for granted have come not from people sitting in a room trying to invent new things but as fortuitous offshoots of curiosity-driven research...For instance, as Nobel Laureates Joseph Goldstein and Michael Brown describe in a Science opinion piece, NIH scientists in the 60s focused on basic questions involving receptors and cancer cells, but this work had an immense impact on drug discovery; as just one glowing example, heart disease-preventing statins, which are among the world’s best-selling drugs, derive directly from Goldstein and Brown’s pioneering work on cholesterol metabolism. Examples also proliferate in other disciplines; the charge-coupled device (CCD), lasers, microwaves, computers and the World Wide Web are all fruits of basic and not targeted research. If the history of science teaches us anything, it is that curiosity-driven basic research has paid the highest dividends in terms of practical inventions and advances.
The second more practical but equally important problem with translational research is that it puts the cart before the horse. First come the ideas; then come the applications. There is nothing fundamentally wrong with trying to build a focused institute to discover a drug, say, for schizophrenia. But doing this when most of the basic neuropharmacology, biochemistry and genetics of schizophrenia are unknown is a great diversion of focus and funds. Before we can apply basic knowledge, let's first make sure that the knowledge exists. Efforts based on incomplete knowledge would only result in a great squandering of manpower, intellectual and financial resources. Such misapplication of resources seems to be the major problem for instance with a new center for drug discovery that the NIH plans to establish. The NIH seeks to channel the newfound data on the human genome to discover new drugs for personalized medicine. This is a laudable goal, but the problem is that we still have miles to go before we truly understand the basic implications of genomic data.

It is only recently that we have started to become aware of the "post-genomic" universe of epigenetics and signal transduction. We have barely started to scratch the surface of the myriad ways in which genomic sequences are massaged and manipulated to produce the complex set of physiological events involved in disease and health. And all this does not even consider the actual workings of proteins and small molecules in mediating key biological events, something which is underpinned by genetics but which constitutes a whole new level of emergent complexity. In the absence of all this basic knowledge, which is only just emerging, how prudent is it to launch a concerted effort to discover new drugs based on this vastly incomplete knowledge? It would be like trying to construct a skyscraper without fully understanding the properties of bricks and cement."
As an aside, that piece also mentions NIH's NCATS translational research center, which has been the brainchild of Francis Collins. It's been five years since that center was set up, and while I know that there are some outstanding scientists working there, I wonder if anyone has done a quantitative analysis of how much the work done there has, well, translated into therapeutic developments.

The editorial also has testimonials from leading organic chemists like Phil Baran, E. J. Corey and Stephen Buchwald, who attest to the power of the basic science they discovered in their academic labs, power that they see all but disappearing from today's labs and funding agencies. This basic science, which they pioneered, unexpectedly found use in industry. Buchwald's emphasis on C-N cross-coupling reactions is especially noteworthy, since it was these kinds of cross-coupling reactions that really transformed drug synthesis and led to Nobel Prizes for their inventors.

Baran's words are worth noting:
“It is ironic that a field with such an incredible track record for tangible contributions to the betterment of society is under continual attack. Fundamental organic synthesis has been defending its existence since I was a graduate student. If the NIH continues to disproportionally cut funding to this area, progress in the development of medicines will slow down and a vital domestic talent pool will evaporate leaving our population reliant on other countries for the invention of life saving medicines, agrochemicals, and materials.”
Baran is right that fundamental organic synthesis has been defending its existence for the last twenty years or so, but as has been discussed on this blog and elsewhere, that is probably because it worked so well that it became a victim of its own success. The NIH is simply not interested in funding more total synthesis for its own sake. To some extent this is a mistake, since the training that even a non-novel total synthesis imparts is valuable, but it's also hard to completely blame the agency. The number of truly novel reactions invented in the last thirty years or so can be counted on one hand, and while chemists like Baran continue to perform incredibly creative feats in the synthesis of complex organic molecules, what they are doing is mostly applying known chemistry in highly imaginative new ways. I have no doubt that they will also invent some new chemistry in the next few years, but how much of it will compare to the fundamental explosion of new reactions and syntheses in the 1960s and 70s? Neither this blog nor others have denied the value of the training that synthetic organic chemistry provides, but I have certainly questioned the aura that sometimes still surrounds it (although it has faded in the last few decades), as well as the degree to which the pharmaceutical industry truly needs it.

To some extent the argument is simply about degree. The biggest challenge for most of the pharmaceutical industry's postwar history was figuring out the synthesis of important drugs like penicillin, niacin and avermectin. In the era of massive screening of natural products, design wasn't really a major consideration. Contrast that period with today. The general problem of synthesis is now solved, and the major challenge facing today's drug discovery scientists is design. The big question today is not "How do I make this molecule?" but rather "How do I design this molecule within multiple constraints (potency, stability, toxicity etc.) all at the same time?" Multiparameter optimization has replaced synthesis as the holy grail of drug discovery. There are still undoubtedly tough synthetic puzzles that would benefit from creative problem-solving, but nobody thinks these puzzles won't yield to enough manpower and resources, or that they will necessitate the discovery of fundamentally new chemical principles. We of course still need top-notch synthetic organic chemists trained by top-notch academic chemists like Corey and Baran, but we equally (or even more urgently) need chemists trained to solve such multiparameter design problems. Importantly, the solution to these problems is not going to come from synthesis alone but also from other fields like pharmacokinetics, statistics and computer-aided design.
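To make the multiparameter idea concrete, here is a toy sketch in Python of the kind of combined scoring that designers juggle. Everything in it - the property names, target ranges and weights - is invented for illustration, not taken from any real project.

```python
# Toy multiparameter optimization: rank candidate molecules by a
# combined desirability over potency, metabolic stability and safety.
# All property names, ranges and weights are invented for illustration.

def desirability(value, lo, hi):
    """Map a property linearly onto [0, 1] within a desired range."""
    return max(0.0, min(1.0, (value - lo) / (hi - lo)))

WEIGHTS = {"potency": 0.5, "stability": 0.3, "safety": 0.2}

def score(mol):
    d = {
        "potency":   desirability(mol["pIC50"], 6.0, 9.0),
        "stability": desirability(mol["half_life_min"], 10.0, 120.0),
        "safety":    1.0 - desirability(mol["tox_risk"], 0.0, 1.0),
    }
    return sum(WEIGHTS[k] * d[k] for k in WEIGHTS)

mols = [
    {"name": "A", "pIC50": 8.5, "half_life_min": 15, "tox_risk": 0.7},
    {"name": "B", "pIC50": 7.2, "half_life_min": 90, "tox_risk": 0.1},
]
for m in sorted(mols, key=score, reverse=True):
    print(m["name"], round(score(m), 2))
```

Note how the merely decent but well-balanced compound B outranks the potent but flawed compound A; that trade-off, multiplied across a dozen properties, is the modern design problem.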

Another major point which I think the editorial does not touch on is the massive layoffs and outsourcing in the industry, which have bled it dry of deep and hard-won institutional knowledge. Drug discovery is not theoretical physics, and you cannot replenish lost talent and discover new drugs simply by staffing your organization with smart twenty-five-year-old wunderkinds from Berkeley or Harvard. Twenty or thirty years' experience counts for a hell of a lot in this industry; far from being a liability, age is a unique asset in this world. To me, this loss of institutional knowledge is a tragedy gargantuan compared to the lack of support for training synthetic organic chemists, and one that has likely hobbled pharmaceutical chemistry for decades to come, if not longer.

Other than that, the editorial gets it right. Too much emphasis on translational research can detract from the kind of rigorous, character-building experience that organic synthesis and classical pharmacology provide. As with many other things we need a bit of both, and some moderation seems to be in order here.

Conquering the curse of morphine, one docked structure at a time

Morphine, one of the most celebrated molecules in history, also happens to be one of the deadliest. It belongs to the same family of opioid compounds that includes oxycodone and heroin. So does fentanyl, a weaponized form of which is rumored to have caused havoc and killed dozens of hostages during the 2002 Moscow theater siege. Potent painkillers that can quell pain beyond the reach of traditional analgesics like ibuprofen, morphine and its cousins are also devils in disguise.

Constipation is the least concerning side effect of opioid use. Even in moderate doses these drugs carry a high risk of death from respiratory failure; in the jargon of drug science, their therapeutic index - the ratio between a toxic dose and a beneficial one - is dangerously narrow. To get an idea of how deadly these compounds are, it's sobering to know that in 2014, more than 28,000 deaths in the US were caused by opioid overdoses, and half of these were because of prescription opioids; the addiction potential of these substances is frighteningly high. The continuing opioid epidemic in places like New Hampshire bears testimony to the hideous trail of death and misery that these drugs can leave in their wake.
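For reference, the therapeutic index is conventionally computed as a ratio of doses rather than a difference; a toy calculation with invented numbers:

```python
# Therapeutic index: TI = TD50 / ED50, the median toxic dose divided
# by the median effective dose. All numbers here are invented.
ed50_mg = 10.0   # hypothetical dose effective in 50% of subjects
td50_mg = 40.0   # hypothetical dose toxic in 50% of subjects
print("therapeutic index:", td50_mg / ed50_mg)  # small value = narrow margin
```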

Quite unsurprisingly, finding a chemical agent that has the same unique painkilling profile as morphine without the insidious side effects has been a longstanding goal in drug discovery, almost a holy grail. It would be hard to overestimate the resources that pharmaceutical companies and academic labs alike have spent, largely in vain, on this endeavor. It was long thought that it might be impossible to decouple morphine's good effects from its bad ones, but studies in the last few decades have provided important insights into the molecular mechanism of morphine's action that may potentially help us accomplish this goal. Morphine binds to the so-called µ-opioid receptor (MOR), one of a family of several opioid receptors involved in pain, addiction and reward pathways. The MOR is a GPCR, a member of a family of complex seven-helix protein bundles that are ubiquitously involved in pretty much every important physiological process, from vision to neuronal regulation.

Almost 30% of all drugs work by modulating the activity of GPCRs, but the problem until recently was the difficulty of obtaining atomic-level crystal structures of these proteins that would allow researchers to try a rational drug design approach against them. All that changed with a series of breakthroughs that has allowed us to crystallize dozens of GPCRs (that particular tour de force bagged its pioneers - Brian Kobilka and Robert Lefkowitz - the 2012 Nobel Prize, and I had the pleasure of interviewing Kobilka a few years ago). More recently, researchers have crystallized the various opioid receptors, and this represents a promising opportunity to find drugs against these proteins using rational design. When activated or inhibited by small molecules, GPCRs bind to several different proteins and engage several different biochemical pathways. Among other insights, detailed studies of GPCRs have allowed us to determine that the side effects of opioid drugs emerge mainly from engaging one particular protein called β-arrestin.

A paper published this week in Nature documents a promising effort to this end. Using computational docking, Brian Shoichet (UCSF), Bryan Roth (UNC) and their colleagues have found a molecule called PZM21 that seems to activate the MOR without also activating the molecular mechanisms which cause dangerous side effects like respiratory depression. The team started with the crystal structure of the MOR and docked about 3 million commercially available lead-like molecules against it. The top 2500 entries were visually inspected, and using known data on GPCR-ligand interactions (especially knowledge of the interaction between the ligands and an aspartate, Asp 147) and sound physicochemical principles (discarding strained ligands), they narrowed the set down to about 25 ligands with binding affinities ranging from 2 to 14 µM; interestingly, along with the canonical Asp 147 interaction, the model located a second hydrogen bond with Asp 147 that had never been seen before in GPCR-ligand binding (compound 7). Using traditional structure-based design that allowed the group to exploit these novel interactions, the affinity was then improved all the way to single-digit nanomolar, an excellent starting point for further pharmacokinetic studies. The resulting compound hit all the right endpoints in the right assays: it showed greatly reduced recruitment of β-arrestin and consequently reduced respiratory depression and constipation, it mitigated pain efficiently in mice, and it showed much less addiction potential. It also showed selectivity against other opioid receptor subtypes, which may make it a more targeted drug.
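To make the triage workflow concrete, here is a minimal sketch in Python of the kind of funnel described above. It is purely illustrative: the field names, the strain cutoff and the `hbond_asp147` flag are hypothetical stand-ins for what in the real study was careful visual inspection by experienced chemists.

```python
# Hypothetical docking-hit triage: rank a large docked library, look at
# the top scorers, and keep those with the key Asp 147 hydrogen bond
# and acceptable ligand strain.
from dataclasses import dataclass

@dataclass
class DockedPose:
    name: str
    dock_score: float     # docking score; lower (more negative) = better
    strain: float         # ligand strain energy, kcal/mol (illustrative)
    hbond_asp147: bool    # does the pose hydrogen-bond to Asp 147?

def triage(poses, n_inspect=2500, max_strain=5.0):
    """Keep top-ranked poses that pass the chemistry-driven filters."""
    ranked = sorted(poses, key=lambda p: p.dock_score)[:n_inspect]
    # In reality this step is a chemist's eye; here it is caricatured
    # as two boolean filters.
    return [p for p in ranked if p.hbond_asp147 and p.strain <= max_strain]

library = [
    DockedPose("mol-001", -11.2, 2.1, True),
    DockedPose("mol-002", -10.8, 9.4, True),    # too strained
    DockedPose("mol-003", -10.5, 1.7, False),   # misses Asp 147
]
for hit in triage(library, n_inspect=3):
    print(hit.name, hit.dock_score)
```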

This paper is noteworthy for several reasons, and not just for the promising result it achieved. Foremost among them is the fact that a relatively straightforward docking approach was used to find a novel ligand with novel interactions for one of the most coveted and intractable protein targets in pharmacology. But to me, equally noteworthy is the fact that the group examined 2500 docked ligands to pick the right ones and discard the bad ones based on chemical intuition and existing knowledge; the final 25 ligands were not simply culled from the top 50 or 100. That exercise shows that no modeling can compensate for what chemist Garland Marshall called "a functioning pair of eyes connected to a good brain", and as I have found out myself, there's really no substitute for patiently slogging through hundreds of ligands and using your chemical intuition to pick the right ones.

There is little doubt that approaches such as this one will continue to find value in discovering new ligand chemistry and biology in important areas of pharmacology and medicine. As is becoming clear in the age of computing, the best results emerge not from computers or human beings alone, but when the two collaborate. This is as good an example of that collaboration as any I have seen recently.

Physicist Ed Witten on consciousness: "I tend to believe that it will remain a mystery"

Here's a very interesting video of mathematical physicist Edward Witten - widely regarded as perhaps the most brilliant mind in the field of the last fifty years - holding forth on consciousness (the relevant part begins at 1:10:25). 

Many people regard consciousness as the last nut to crack at the frontier of science. Cracking it would open the way to an unprecedented understanding of humanity, one that may in part explain why mankind produces thinkers like Witten who allow us to understand the deep secrets of the universe.

But Witten is not too optimistic about it. And he seems to have fairly clear reasons for believing that consciousness will always remain a mystery. Here's what he has to say:
"I think consciousness will remain a mystery. Yes, that's what I tend to believe. I tend to think that the workings of the conscious brain will be elucidated to a large extent. Biologists and perhaps physicists will understand much better how the brain works. But why something that we call consciousness goes with those workings, I think that will remain mysterious. I have a much easier time imagining how we understand the Big Bang than I have imagining how we can understand consciousness... 
Understanding the function of the brain is a very exciting problem in which probably there will be a lot of progress during the next few decades. That's not out of reach. But I think there probably will remain a level of mystery regarding why the brain is functioning in the ways that we can see it, why it creates consciousness or whatever you want to call it. How it functions in the way a conscious human being functions will become clear. But what it is we are experiencing when we are experiencing consciousness, I see as remaining a mystery... 
Perhaps it won't remain a mystery if there is a modification in the laws of physics as they apply to the brain. I think that's very unlikely. I am skeptical that it's going to be a part of physics."
Later on he talks a bit about Roger Penrose's thesis on why we could never build an AI simulating the human mind, and why we may need a modification of the laws of physics to account for consciousness; Witten personally disagrees with the latter stance. I myself am partial to the belief that we may never understand consciousness simply because one cannot truly understand a system of which one is already a part.

But what Witten is saying here is in some sense quite simple: even if we understand the how of consciousness, we still won't understand the why. This kind of ignorance of whys is not limited to consciousness, however. For instance, we don't know why our universe happens to be the one in which the tuning of the fundamental constants of nature is precisely such that it allows the evolution of sentient human beings who can ask that question. We don't know why the elementary particles have the masses that they do. We don't know why the eukaryotic cell evolved only once.

It's interesting to contrast Witten's thoughts with John Horgan's "End of Science" thesis. Horgan is really saying that the fundamental laws of physics have been largely discovered. They cannot be discovered twice, and they almost certainly won't ever be fundamentally modified. But Horgan's thesis applies in a larger sense to the whys that Witten is treading on. The end of science really is the end of the search for final causation. In that sense, not just consciousness but many aspects of the world may always remain a mystery. Whether that is emotionally pleasing or disconcerting is a choice that each one of us has to make individually.

Physics, biology and models: A view from 1993 and now

Twenty-three years ago the computer scientist Danny Hillis wrote an essay titled "Why physicists like models, and why biologists should." The essay gently took biologists to task for not borrowing model-building tricks from the physicists' trade. Hillis's contention was that simplified models have been hugely successful in physics, from Newton to Einstein, and that biologists have largely dismissed model building with the knee-jerk objection that biological systems are far too complex.

There is a grain of truth in what Hillis says about biologists not adopting modeling and simulation to understand reality. While some of it probably is still a matter of training - most experimental biologists do not receive formal training in statistics and computer science - the real reasons probably have more to do with culture and philosophy. Historically, too, biology has been a much more experimental science than physics; Carl Linnaeus was still classifying animals and plants decades after Isaac Newton had mathematized all of classical mechanics.

Hillis documents three reasons why biologists aren't quick to use models:
"For various reasons, biologists who are willing to accept a living model as a source of insight are unwilling to apply the same criterion to a computational model. Some of the reasons for this are legitimate, others are not. For example, some fields of biology are so swamped with information that new sources of ideas are unwelcome. A Nobel Prize winning molecular biologist said to me recently, "There may be some good ideas there, but we don't really need any more good ideas right now." He might be right. 
A second reason for biologists' general lack of interest in computational models is that they are often expressed in mathematical terms. Because most mathematics is not very useful in biology, biologists have little reason to learn much beyond statistics and calculus. The result is that the time investment required for many biologists to understand what is going on in computational models is not worth the payoff. 
A third reason why biologists prefer living models is that all known life is related by common ancestry. Two living organisms may have many things in common that are beyond our ability to observe. Computational models are only similar by construction; life is similar by descent."
Many of these reasons still apply, but the situation has evolved for the better since 1993. The information glut has, if anything, increased in important fields of biology like genomics and neuroscience; Hillis was not writing in the age of 'Big Data', but his observation anticipates it. However, data by itself should not preclude the infusion of modeling; if anything it should encourage it even more. Also, the idea that "most mathematics is not very useful in biology" pertains to the difficulty (or even impossibility) of writing down, say, a mathematical description of a cat. But you don't always have to go that far to use mathematics effectively. For instance, ecologists have used simple differential equations to model the rise and fall of predator and prey populations, and systems biologists are now using similar equations to model the flux of nutrients, metabolites and chemical reactions in a cell. Mathematics is certainly more useful in biology now than it was in 1993, and much of this resurgence has been enabled by the rise of high-speed computing and better algorithms.
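The predator-prey example is worth making concrete. Here is a minimal sketch of the classic Lotka-Volterra equations integrated in Python; the rate constants are arbitrary, chosen only to show the characteristic boom-and-bust oscillations.

```python
# Lotka-Volterra predator-prey model: the kind of simple coupled ODEs
# ecologists use to model rising and falling populations.
import numpy as np
from scipy.integrate import solve_ivp

def lotka_volterra(t, y, a=1.0, b=0.1, c=1.5, d=0.075):
    prey, predators = y
    dprey = a * prey - b * prey * predators        # prey grow, get eaten
    dpred = d * prey * predators - c * predators   # predators feed, die off
    return [dprey, dpred]

sol = solve_ivp(lotka_volterra, (0, 50), [10.0, 5.0],
                t_eval=np.linspace(0, 50, 500))
print("peak prey:", sol.y[0].max(), " peak predators:", sol.y[1].max())
```

Nothing about a real ecosystem is captured by four parameters, of course; the point, as Hillis argues, is that even so simple a model reproduces, and helps explain, the oscillations that field ecologists actually observe.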

The emergence of better computing also speaks to the difficulty in understanding computational models that Hillis talks about; to a large extent that difficulty has now been mitigated. When Hillis was writing, it took a supercomputer to perform the kind of calculation that you can now do on your laptop in a few hours. The advent of Moore's Law-era software and hardware has enormously expanded number-crunching in biology. The third objection - that living things are similar by descent rather than by construction - is a trivial one in my opinion. If anything it makes the use of computational models even more important, since by comparing similarities across various species one can actually get insights into potential causal relationships between them. Another reason Hillis gives for biologists not embracing computation is that they are emotionally invested in living things rather than non-living material things. This is probably much less of a problem now, especially since computers are routinely used even by biologists for everyday tasks like graphing.

While acknowledging that the reasons for biologists' lukewarm response to computational models are legitimate, Hillis goes on to argue why biologists should still borrow from the physicists' modeling toolkit. The basic reason is that by constructing a simple system with analogous behavior, models can capture the essential features of a more complex system: this is in fact the sine qua non of model building. The tricky part, of course, is figuring out whether the simple features truly reproduce the behavior of the real-world system. The funny thing about models, however, is that they simply need to be useful, so they need not correspond to all of our beliefs about real-world systems. In fact, trying to incorporate too much reality into models can make them worse and less accurate.

A good example I know of is a paper from Merck that correlated simple minimized energies of a set of HIV protease inhibitor drugs with their inhibition values (IC50s) against the enzyme. Now, nobody believes that the very simple force field underlying this calculation actually reproduces the complex interplay of protein, small molecule and solvent that takes place in the enzyme. But the point is that nobody cares: as long as the model is predictive, it's all good. Hillis, though, is making the point that models in physics have been more than just predictive; they have been explanatory. He exhorts biologists to move beyond simple prediction, as useful as it might be, and toward explanation.
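This kind of correlation exercise is easy to caricature in a few lines: fit measured pIC50 values against a single computed energy term and judge the model purely by how well it predicts. The numbers below are invented; only the shape of the exercise matters.

```python
# Fit inhibition data (pIC50) against a crude minimized-energy term.
# The model is knowingly "wrong" physically; its only job is to predict.
from scipy.stats import linregress

energies = [-52.3, -48.1, -45.7, -41.2, -38.9]  # invented energies, kcal/mol
pic50s   = [  9.1,   8.4,   8.0,   7.1,   6.6]  # invented measured pIC50s

fit = linregress(energies, pic50s)
print(f"slope={fit.slope:.3f}, r^2={fit.rvalue**2:.2f}")

# "Predict" an untested compound from its computed energy alone:
print("predicted pIC50:", fit.slope * (-50.0) + fit.intercept)
```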

I agree with this sentiment, especially since prediction alone can lead you down a seductive path in which you get more and more divorced from reality. Something similar can happen in fields like machine learning, where combinations of abstract descriptors that defy real-world interpretation are used merely because they are predictive. These kinds of models are very risky in the sense that you can't really figure out what went wrong when they break down; at that point you have a giant morass of descriptors and relationships to sort through, very few of which make any physical sense. This dilemma is true of biological models as a whole, so one needs to tread a fine line between keeping a model simple and interpretable and incorporating enough real-world variables to make it scientifically sensible.

Later, the essay talks about models separating the important variables from the trivial ones, and that's certainly true. Hillis also talks about models being able to synthesize experimental variables into a seamless whole, and I think machine learning and multiparameter optimization in drug discovery, for instance, can achieve this kind of data fusion. There is a twist here, though, since the kinds of emergent variables that are rampant in biology are not usually seen in physics; modeling in biology thus needs to account for the mechanisms underlying the generation of emergent phenomena. We are still not at the point where we can do this successfully, but we seem to be getting there. Finally, I would also like to emphasize one very important use of models: as negative filters. They can tell you what experiment not to do, what variable to ignore, or what molecule not to make. That at the very least saves time and resources.
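The negative-filter role is perhaps the easiest to put into practice. A sketch, where the hypothetical `predict_logS` below is a toy stand-in for any cheap property model, not a real solubility predictor:

```python
# Use a cheap predictive model as a negative filter: decide what NOT
# to synthesize, rather than what to make.
def predict_logS(smiles: str) -> float:
    """Toy stand-in for any cheap solubility model."""
    return -0.15 * len(smiles)  # longer SMILES ~ greasier; illustration only

candidates = ["CCO", "c1ccccc1CCCCCCCCCCCC", "CC(=O)Nc1ccc(O)cc1"]
keep = [s for s in candidates if predict_logS(s) > -2.8]
dropped = [s for s in candidates if s not in keep]
print("deprioritized before synthesis:", dropped)  # the greasy one goes
```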

The bottom line is that there is certainly more cause for biologists to embrace computational models than there was in 1993. And it should have nothing to do with physics envy.

Portrait of the human as a tangled bank: A review of "I Contain Multitudes: The Microbes Within Us and a Grander View of Life"

It’s time we became friends with microbes. And not just with them but with their very idea, because that idea is likely going to be crucial to our lives on this planet and beyond. For a long time most humans have regarded bacteria as a nuisance, because we become aware of them only when something goes wrong - when they cause diseases like tuberculosis and diarrhea. But as Ed Yong reveals in this sweeping, exciting tour of biology, ecology and medicine, one pregnant with possibility, the vast majority of microbes help us in ways which we cannot possibly fathom, ways which permeate not just our existence but that of every single other life form on our planet. The knowledge this microbial universe is yielding holds tantalizing clues to treating diseases, changing how we eat and live, and potentially effecting a philosophical upheaval in our view of our relationship with each other and with the rest of life.

Yong’s book shines in three ways. Firstly, it’s not just a book about the much-heralded ‘microbiome’ - the densely populated and ubiquitous universe of bacteria which lives on and within us and which rivals our own cells in numbers - but about the much larger universe of microbes in all its guises. Yong dispels many misconceptions, such as the blanket statements that bacteria are good or bad for us, or that antibiotics are always good or bad for us. His narrative sweeps over a vast landscape, from the role of bacteria in the origins of life to their key functions in helping animals bond on the savannah, to new therapies that could emerge from understanding their roles in diseases like allergies and IBD. One fascinating subject which I think Yong could have touched on is the potential role of microbes in seeding extraterrestrial life.

The universal theme threading through the book is symbiosis: how bacteria and all other life forms function together, mostly peacefully but sometimes antagonistically. The first complex cell likely evolved when a primitive life form swallowed an ancient bacterium, and since this seminal event life on earth has never been the same. Microbes are involved in literally every imaginable life process: gut bacteria break down food in mammals’ stomachs, nitrogen-fixing bacteria construct the basic building blocks of life, and others play critical roles in the water, carbon and oxygen cycles. Some enable insects, aphids and a variety of other animals to wage chemical warfare; yet others keep coral reefs fresh and stable. There’s even a species that can cause a sex change in wasps. Perhaps the most important ones are those which break down environmental chemicals as well as food into myriad interesting and far-ranging molecules affecting everything from mate-finding to distinguishing friends from foes to nurturing babies’ immune systems through their ability to break down sugars in mother’s milk. This critical role that bacterial symbiosis plays in human disease, health and even behavior is probably the most fascinating aspect of human-bacteria co-existence, and one which is only now being gradually teased out. Yong’s central message is that the reason bacteria are so fully integrated into living beings is simple: we evolved in a sweltering, ubiquitous pool of them that was present and evolving billions of years before we arrived on the scene. Our relationship with them is thus complex and multifaceted, and as Yong demonstrates, it has been forged through billions of years of messy and haphazard evolution. This makes any kind of simple generalization about them almost certainly false. And it makes us realize how rapidly humanity would become extinct in a world suddenly devoid of microbes.

Secondly, Yong is adept at painting vivid portraits of the men and women who are unraveling the secrets of the microbial universe. Old pioneers like Pasteur, Leeuwenhoek and Koch come alive in crisp portraits (for longer ones, I would recommend Paul de Kruif's captivating classic, "Microbe Hunters"). At the same time, new pioneers herald new visions. Yong crisscrosses the globe, from the San Diego Zoo to the coral reefs of Australia to the savannah, talking to adventurous researchers about wasps, aphids, hyenas, squid, pangolins, spiders, human infants and all the microbes that are intimately sharing their genes with these life forms. He is also a sure guide to the latest technology, including the gene sequencing that has revolutionized our understanding of these fascinating creatures (although I would have appreciated a longer discussion of the CRISPR gene-editing technology that has recently taken the world by storm). Yong’s narrative makes it clear that innovative ideas come from the best researchers combining their acumen with the best technology. At the same time his sometimes-wondrous narrative is tempered with caution, and he makes it clear that the true implications of the findings emerging from the microbiome will take years and perhaps decades to unravel. The good news is that we're just getting started.

Thirdly, Yong delves deeply into the fascinating functions of bacteria in health and disease, and this involves diseases which go way beyond the familiar pandemics that have bedeviled humanity throughout its history. Antibiotics, antibiotic resistance and the marvelous process of horizontal gene transfer that allows bacteria to rapidly share genes and evolve all get a nod. Yong also leads us through the reasonable but still debated 'hygiene hypothesis' which lays blame for an increased prevalence of allergies and autoimmune disorders at the feet of overly and deliberately clean environments and suburban living. He discusses the novel practice of fecal transplants that promises to cure serious intestinal inflammation and ailments like IBD and Crohn’s disease, but is also wary about its unpredictable and unknown consequences. He also talks about the fascinating role that bacteria in newborn infants’ bodies play when they digest crucial sugars in mother’s milk and affect multiple functions of the developing baby’s body and brain. Unlike proteins and nucleic acids, sugars have been the poor cousins of biochemistry for a long time, and reading about their key role in microbial symbiosis warmed this chemist's heart. Finally and most tantalizingly, the book describes potential impacts that the body’s microbiome and its outside guests might have on animal and human behavior itself, leading to potential breakthrough treatments in psychiatry. The real implications of these roles will have to be unraveled through the patient, thoroughgoing process that is the mainstay of science, but there is little doubt that the arrows seem to be pointing in very promising directions.

“There is grandeur in this view of life,” Darwin wrote in his magnum opus “On the Origin of Species”. And just how much grandeur there is becomes apparent with the realization that Darwin was at best dimly aware of microbes and their seminal role in the origin and propagation of life. Darwin saw life as an ‘entangled bank’ full of wondrous species; I can only imagine that he would have been enthralled and stupefied by the vision of that entangled bank presented in Ed Yong's book.

Protein-protein interactions: Can't live without 'em, can't easily drug 'em

[Figure: The many varieties of protein-protein interactions]
Here's a pretty good survey of efforts to classify and drug protein-protein interaction (PPI) targets by a group from Cambridge University in Nature Reviews Drug Discovery. Most drug developers have been aware of the looming mountain of these ubiquitous interactions - there are at least 300,000 of them at last count, and most likely many more - and have been trying to attack them in one way or another for more than a decade. There's also no doubt that many PPIs are crucially involved in diseases like cancer and inflammation. By any measure, PPIs are important.

As the review indicates though, attacking these interactions has been daunting to say the least. They present several challenges that call for new scientific as well as institutional strategies. From the scientific standpoint, PPIs present a nightmarish panoply of difficulties: proteins that change conformation when they come together, water molecules that may or may not be involved in key interactions, the universal challenge of designing 'beyond rule of 5' drugs for such targets, and the challenge of developing highly sensitive new biophysical techniques to detect ligand binding in the first place.

Consider protein flexibility, a factor which is often the nemesis of even 'regular', non-PPI projects. Protein flexibility not only makes crystallization hard and its results dicey, but it decidedly thwarts computational predictions, especially if the conformational changes are large. PPIs regularly present cases in which the conformations of the unbound proteins are different from the bound ones, so at the very minimum you need crystal or NMR structures of both bound and unbound forms. This is even harder if one of the partners is a peptide, in which case it's likely to undergo even more conformational changes when it binds to a protein target. The hardest case is two disordered partners such as Myc and Max, which are unstable by themselves and become ordered only when they bind to each other. That's quite certainly an interaction forged in the fires of Mount Doom; good luck getting any kind of concrete structural data on that gem.

The screening challenges involved in studying PPIs are as hard as, if not harder than, the structural ones. NMR is probably the only technique that can reliably detect weak binding between proteins and ligands in as 'natural' a state as possible, although it presents its own difficulties, like limits on protein size and other technical hurdles. SPR and FRET can help, but you are really testing the limits of these binding assays in such cases. Finding reliable, more or less universal methods that combine high throughput with good sensitivity has been an elusive goal, and most of the data we have on this front is anecdotal.

Finally, the medicinal chemistry challenges cannot be overstated, and that's where the institutional challenges also come in. In many PPI projects you are basically trying to approximate a giant block of protein or peptide with a small molecule that has good pharmacological properties. In most cases this small molecule is likely to fall outside the not-so-hallowed-anymore Lipinski Rule of 5 space. I should know something about this since I have worked in the area myself and can attest to the challenges of modeling large, greasy, floppy compounds. These molecules can wreak havoc on multiple levels: by aggregating among themselves, by sticking non-specifically to other proteins, and by causing weird conformational changes that can only be 'predicted' after they are observed (remember that old adage about the moon...). Not only do you need to discover new ways of finding large PPI inhibitors that can make it across cell membranes and are not chewed up the moment they get into the body, but you also need new management structures that encourage the exploration of such target space (especially in applications like CNS drug discovery; in oncology, as happens depressingly often, you can get away with almost anything). If there's one thing harder than science, it's psychology.
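As an aside, the Rule of 5 space mentioned above is easy to compute. A minimal sketch using RDKit (assuming it is installed); note that 'beyond Ro5' has no single definition, and the two-violation threshold below is just one common working convention:

```python
# Flag molecules falling outside Lipinski Rule of 5 space, where many
# PPI inhibitors end up.
from rdkit import Chem
from rdkit.Chem import Descriptors, Lipinski

def beyond_rule_of_five(smiles: str) -> bool:
    mol = Chem.MolFromSmiles(smiles)
    violations = sum([
        Descriptors.MolWt(mol) > 500,        # molecular weight
        Descriptors.MolLogP(mol) > 5,        # lipophilicity
        Lipinski.NumHDonors(mol) > 5,        # H-bond donors
        Lipinski.NumHAcceptors(mol) > 10,    # H-bond acceptors
    ])
    return violations >= 2   # one common working definition of "beyond Ro5"

print(beyond_rule_of_five("CC(=O)Oc1ccccc1C(=O)O"))  # aspirin: False
```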

In spite of these hurdles, which are reflected in the sparse number of bona fide drugs that have emerged from PPI campaigns, the review discusses a number of successful pre-clinical PPI projects involving new modalities like stapled peptides, macrocycles and fragment-based screening, projects that at the very least shed light on the unique properties of two or more proteins coming together. One of the more promising strategies is to find an amino acid residue in one partner of a PPI that acts as an 'anchor' and provides much of the binding energy. There have been some successful efforts to approximate this anchor residue with a small molecule, although it's worth remembering that nature designed the rest of the native protein or peptide for a reason.

Another point from the review which seems to me worth highlighting is the importance of academia in discovering more of the novel features of PPIs and their inhibitors. In a new field where even the basics are not well known, it seems logical to devote as much effort to discovering the science as to applying it. At the very least we can bang our collective heads against the giant PPI wall and hope for some cracks to emerge.