Field of Science

What areas of chemistry could AI impact the most? An opinion poll

The other day I asked the following as a survey question regarding potential areas of chemistry where AI could have the biggest impact.


There were 163 responses, which isn't a bad sample. The responses are in fact in line with my own thinking: synthesis planning and automation emerge as the leading candidates.

I think synthesis planning AI will have the biggest impact on everyday lab operations during the next decade. Synthesis planning, while challenging, is still a relatively deterministic protocol based on a few good reactions and a large but digestible number of data points. Reliable reactions like olefin metathesis and metal-mediated coupling are now robust and used heavily enough to have generated thousands of machine-readable data points that demonstrate their reliability and relative predictability; there are now fewer surprises, and whatever surprises exist are well-documented. As recent papers make clear, synthesis planning had been waiting in the wings for several years for two developments: the curation of millions of examples of successful and unsuccessful reactions and chemotypes, and better neural networks and computing power. Without the first development the second wouldn't have made a big difference, and it seems like we are finally getting there with good curation.

I was a bit more surprised that materials science did not rank higher. Quantum chemical calculations for estimating optical, magnetic, electronic and other properties have been successful in materials science and have started enabling high-throughput studies in areas like MOFs and battery technology, so I expect this field to expand quite a bit during the next few years. Similar to how computation has worked in drug discovery, AI approaches don't need to accurately predict every material property to three decimal places; they will have a measurable impact even if they can qualitatively rank different options and narrow down the pool so that chemists have to spend fewer resources making them.
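To make the ranking point concrete, here is a minimal Python sketch. The property values are made-up illustrative numbers, not real data, and `ranks` and `spearman_rho` are hypothetical helpers written for this illustration; the rank-correlation formula used assumes there are no tied values. The "predicted" values are off by several tenths of a unit, yet they order the candidates exactly as the "measured" values do, so a chemist picking the top two to synthesize would make the same choice.

```python
def ranks(values):
    """Return the rank (1 = smallest) of each value; assumes no ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, idx in enumerate(order, start=1):
        result[idx] = rank
    return result


def spearman_rho(xs, ys):
    """Spearman rank correlation for two equal-length lists without ties."""
    n = len(xs)
    d2 = sum((a - b) ** 2 for a, b in zip(ranks(xs), ranks(ys)))
    return 1 - 6 * d2 / (n * (n ** 2 - 1))


# Hypothetical property values for five candidate materials (illustrative only).
measured = [1.10, 2.35, 0.80, 3.05, 1.90]
predicted = [1.60, 2.00, 1.20, 2.40, 1.85]  # numerically off, same ordering

rho = spearman_rho(predicted, measured)

# Indices of the two best candidates according to each list.
top2_pred = sorted(range(5), key=lambda i: predicted[i], reverse=True)[:2]
top2_meas = sorted(range(5), key=lambda i: measured[i], reverse=True)[:2]

print(f"rank correlation: {rho:.2f}")
print(f"top two by prediction: {top2_pred}, by measurement: {top2_meas}")
```

A rank-based score like this is one simple way to judge a model on what matters for triage, the ordering of candidates, rather than on absolute error.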

Drug design, while already a beneficiary of compute, will see mixed results in my opinion over the next decade. For one thing, "drug design" is a catchall phrase that can include everything from basic protein-ligand structure prediction to toxicity prediction, with the latter being at the challenging end of the spectrum. Structure-based design will likely benefit from deep learning that learns basic intermolecular interactions which are transferable across target classes, so that the models are less limited by the paucity of training data.

Areas like synthesis planning do contribute to drug design, but the real crux of successful drug design will be multiparameter optimization and SAR prediction, where an algorithm is able to successfully calculate multiple properties of interest like affinity, PK/PD and toxicity. PK/PD and toxicity are systemic effects that are complex and emergent, and I think the field will still not be able to make a significant dent in predicting idiosyncratic toxicity except for obvious cases. One area in which I see AI having a bigger impact is any field of drug discovery involving image recognition; for instance phenotypic screening, and perhaps the processing of images in cryo-EM and standard x-ray crystallography.

Finally, automation is one area where I do think AI will make substantial progress. This is partly due to better seamless integration of hardware and software and partly because of better data generation and recording that will enable machine learning and related models to improve. This development, combined with reaction planning that allows scientists to test multiple hypotheses, will in my opinion lead to automation making heavy inroads into the day-to-day work of chemists fairly soon.

Another area which I did not mention in the survey but which will impact all of the above areas is text mining. There the problem is one of discovering relationships between different entities (drugs and potential targets, for instance) that are not novel per se but that are just hidden in a thicket of patents, papers and other text sources which are too complicated for humans to parse. Ideally, one would be able to combine text mining with intelligent natural language processing algorithms to enable new discovery through voice commands.

Can we turn biology into engineering?

Vijay Pande of Andreessen Horowitz/Stanford has a thought-provoking piece in Scientific American in which he lays out a game plan for how we could potentially make biology more like engineering. He takes issue with what Derek (Lowe) once called the “Andy Grove fallacy”, which in a nutshell says that you can make fields like biotechnology and drug discovery as efficient as semiconductor or automobile engineering if you borrow principles from engineering.

There are parts of the piece that I resoundingly agree with; for instance, there’s little doubt that fields like automation and AI are going to have a significant impact on making biological experiments more reproducible, many of which are still more art than science and subject to the whims and sloppiness of their creators and their lab notebooks. Vijay is also optimistic about making biology more modular, so that one can string along parts of molecules, cells and organelles to enable better biological engineering of body parts, drugs and genetic systems. He also believes that bringing more quantitative measurements encoded into key performance indicators (KPI) will make the discipline more efficient and more mindful of its successes and failures. One point which I think is very important is that these kinds of approaches would allow us to gather more negative data, a data collection problem that still hobbles AI and machine learning approaches.

So far I am with him; I don’t believe that biology can’t ever benefit from such approaches, and it’s certainly true that the applications of AI, automation and other engineering-based approaches are only going to increase with time. But the article doesn’t mention some very fundamental differences between biology and engineering which I think demarcate the two substantially from each other and which make knowledge transfer between them highly problematic.

Foremost among these are non-linearity, redundancy and emergence.

Let’s take two examples which the piece talks about that illustrate all three concepts – building bridges and the Apollo project. Comparisons with the latter always make me wince a bit. Vijay is quite right that the right approach to the Apollo program was to break the problems into parts, then further break those parts up into individual steps such as building small models. The scientists and engineers working on the program gradually built up layers of complexity until the model that they tested was the moon landing itself.

Now, the fact is that we already do this in biology. For instance, when we want to understand or treat a disease, we try to break it down to simpler levels – organelles, cells, proteins, genes – and then try to understand and modulate each of these entities. We use animal models like genetically engineered mice and dogs as simpler representatives of complex human biology. But firstly - and as we keep on finding out - these models are pale shadows of true human biology; we use them because we can't do better. And secondly, even these ‘simple’ models are much more complex than we think. The reasons are non-linearity and emergence, both of which can thwart modular approaches. The sum of proteins in a cell is not the same as the cell phenotype itself, just like the sum of neurons in a human brain is not the brain itself. So modulating a protein for instance can cause complex downstream effects that depend on both the strength and nature of the modulating signal. In addition, biological pathways are redundant, so modulating one can cause another one to take over, or for the pathway to switch between complex networks. Many parts downstream, even ones that don’t seem to be directly connected, can interact with each other through complex, non-linear feedback through far-flung networks.

This is very unlike engineering. The equivalent of these unpredictable consequences in building a bridge, for example, would be for a second bridge to sprout out of nowhere when the first one is built, or the rock on the other side of the river suddenly turning from metamorphic to sedimentary, or the sum of weights of two parallel beams on the bridge being more than what simple addition would suggest. Or imagine the Apollo rocket suddenly accelerating to ten times its speed when the booster rockets fall off. Or the shape of the reentry vehicle suddenly changing through some weird feedback mechanisms as it reaches a certain temperature when it’s hurtling through the atmosphere. 

Whatever the complexities of challenging engineering projects like building rockets or bridges, they are still highly predictable compared to the effects of engineering biology. The fact of the matter is that the laws of aerodynamics and gravity were extremely well understood before the Apollo program (literally) took off the ground, so as amazing as the achievement was, it didn't involve discovering new basic scientific laws on the fly, something that we do a lot in biology. Aircraft design is decidedly not drug design. And all this is simply a product of ignorance, ignorance of the laws of biology and evolution – a clunky, suboptimal, haphazard, opportunistic process if there ever was one – relative to the laws of (largely predictable) Newtonian physics that underlie engineering problems.

The concept of modularity in biology therefore becomes very tricky compared to engineering. There is some modularity in biology for sure, but it’s not going to take you all the way. One of the reasons is that unlike modularity in engineering, biological modularity is flexible, both spatially and temporally. This is again a consequence of different levels of emergence. For instance, a long time ago we thought that the brain was modular and that the fundamental modules were neurons. This view has now changed and we think that it’s networks of neurons that are the basic modules. But we don’t even think that these modular networks are fixed through space and time; they likely form, dissolve and change members and locations in the brain according to need, much like political groups fleetingly forming and breaking apart for convenience. The problem is that we don’t know what level of modularity is relevant to addressing a particular problem. For instance, is the right ‘module’ for thinking about Alzheimer’s disease the beta-amyloid protein, or is it the mitochondria and its redox state, or is it the gut-brain axis and the microbiome? In addition, modules in biology are again non-linear, so the effects from combining two modules are not going to simply be twice the effects of one module – they can be twice or half or even zero.

Now, having noted all these problems, I certainly don’t think that biology cannot benefit at all from the principles of engineering. For one thing, I have always thought that biologists should really take the “move fast and break things” philosophy of software engineering to heart; we simply don’t spend enough time trying to break and falsify hypotheses, and this leads to a lot of attrition and time chasing ghosts down rabbit holes. More importantly though, as a big fan of tool-driven scientific revolutions, I do believe that inventing tools like CRISPR and sequencing will allow us to study biological systems at an increasingly fine-grained level. They will allow us to gather more measurements that would enable better AI/machine learning models, and I am all for this.

But all this will work only as long as we realize that the real problem is not improving the measurements, it’s knowing what measurements to make in the first place. Otherwise we find ourselves in the classic position of the drunkard trying to find his keys below the lamp, because that’s where the light is. Inventing better lamps, or a metal detector for that matter, is not going to help if we are looking in the wrong place for the keys. Or looking for keys when we should really be looking for a muffin.

Victor Weisskopf and the many joys of scientific insight

Victor Weisskopf (Viki to his friends) emigrated to the United States in the 1930s as part of the windfall of Jewish European emigre physicists which the country inherited thanks to Adolf Hitler. In many ways Weisskopf's story was typical of his generation's: born to well-to-do parents in Vienna at the turn of the century, educated in the best centers of theoretical physics - Göttingen, Zurich and Copenhagen - where he learnt quantum mechanics from masters like Wolfgang Pauli, Werner Heisenberg and Niels Bohr, and finally escaping the growing tentacles of fascism to make a home for himself in the United States where he flourished, first at Rochester and then at MIT. He worked at Los Alamos on the bomb, then campaigned against it as well as against the growing tide of red-baiting in the United States. A beloved teacher and researcher, he was also the first director-general of CERN, a laboratory which continues to work at the forefront of particle physics and rack up honors.

But Weisskopf also had qualities that set him apart from many of his fellow physicists; among them were an acute sense of human tragedy and triumph and a keen and serious interest in music and the humanities that allowed him to appreciate human problems and connect ideas from various disciplines. He was also renowned for being a wonderfully warm teacher. Many of these qualities are on full display in his wonderful, underappreciated memoir titled "The Joy of Insight: Passions of a Physicist".

The memoir starts by describing Weisskopf's upbringing in early twentieth century Vienna, which was then a hotbed of revolutions in science, art, psychology and music. The scientifically inclined Weisskopf came of age at the right time, when quantum mechanics was being developed in Europe. He was fortunate to study first at Göttingen, which was the epicenter of the new developments, and then in Zurich under the tutelage of the famously brilliant and acerbic Wolfgang Pauli. It was in Göttingen that Max Born and Heisenberg had invented quantum mechanics; by the time Weisskopf came along, in the early 1930s, physicists were in a frenzy to apply quantum mechanics to a range of well-known, outstanding problems in nuclear physics, solid state physics and other frontier branches of physics.

Pauli, known as the "conscience of physics", was famous for a sharp tongue that spared no one, but also for his honesty and friendship. Weisskopf's first encounter with Pauli was typical:
"When I arrived at the Institute, I knocked at the door of Pauli's office until I heard a faint voice saying, "Come in". There at the far end of the room I saw Pauli sitting at his desk. "Wait, wait", he said, "I have to finish this calculation." So I waited for a few minutes. Finally, he lifted his head and said, "Who are you?" I answered, "I am Weisskopf. You asked me to be your assistant." He replied, "Oh, yes. I really wanted (Hans) Bethe, but he works on solid state theory, which I don't like, although I started it."... 
Pauli gave me some problem to study - I no longer remember what it was - and after a week he asked me what I had done about it. I showed him my solution, and he said, "I should have taken Bethe after all."...
In spite of this rather inauspicious start, Weisskopf became both a very good physicist and a close friend of both Pauli and Bethe; he credits Pauli for lovingly 'spanking him into shape'.

Weisskopf also spent a productive year at Niels Bohr's institute in Copenhagen, where he was the 'victim' of Bohr's extended walks and tortuous reformulations of scientific statements to render them as accurate as possible. He benefited immensely from Bohr's style, as did many other leading theoretical physicists of the time. Bohr was known for his Delphic utterances and his mesmerizing personality that left listeners both frustrated and filled with a sense of wonder; only Einstein was more famous in the world of science then. In Copenhagen Bohr had created his own magic kingdom, one to which almost every budding physicist was required to make a pilgrimage. Many memories of Weisskopf's time with Bohr are recounted, but one in particular attests to the man's fame, essential qualities and influence:
"One evening at six o'clock, my usual quitting time, Bohr and I were still deep in discussion. I had an appointment that night and had to leave promptly, so Bohr walked me to the streetcar stop, about five minutes from his house. We walked and he talked. When we got there, the streetcar was approaching. It stopped and I climbed on to the steps. But Bohr was not finished. Oblivious to the people sitting in the car, he went right on with what he had been saying while I stood on the steps. Everyone knew who Bohr was, even the motorman, who made no attempt to move to start the car. He was listening with what seemed like rapt attention while Bohr talked for several minutes about certain subtle properties of the electron. Finally Bohr was through and the streetcar started. I walked to my seat under the eyes of the passengers, who looked at me as if I were a messenger from a special world, a person chosen to work with the great Niels Bohr."
Weisskopf made important contributions to quantum electrodynamics, but he suffered from a self-admitted lack of confidence that sometimes kept him from pushing calculations through. In one episode, he made an important mistake in a paper that one of Robert Oppenheimer's students pointed out in a private letter; Weisskopf was grateful and rightly comments that in today's times, that student might have directly sent a strident correction to the journal that published the paper, causing public embarrassment.

In another, more consequential embarrassment that must have been jarring, he wrote a paper after the war on the famous Lamb Shift that revealed fundamental discrepancies in the structure of quantum field theory. He then compared the results with ones acquired by Richard Feynman and Julian Schwinger, the new young mandarins who were revolutionizing the field. When his results did not agree with theirs, he withheld publication; surely physicists as brilliant as Feynman and Schwinger couldn't be wrong? After a few weeks, he heard from Feynman, who had realized that both he and Schwinger had made the same mistake in their calculation. Weisskopf was right, and if he had published his paper, he himself admits that he might have won a Nobel Prize. None of this engendered a sense of bitterness in him, however, and he used the incident to illustrate the importance of self-confidence in science.

After the war, the House Un-American Activities Committee (HUAC) started questioning and indicting hundreds of people for their pre-war and post-war communist party membership, and into the trap fell several physicists who were connected with national security work. One of these was Bernard Peters, a past student of Robert Oppenheimer's who had spent time in Dachau for communist agitation. Since the war, Peters had become a colleague of Weisskopf's at the University of Rochester and was doing significant research in cosmic ray physics. In June 1949, Oppenheimer gave a damning testimony against Peters and conceded that he was likely a dangerous communist. Much of this testimony was simply based on opinion and on Peters's activities before the war, with no concrete evidence. The testimony created an uproar in the physics community, much of which regarded Oppenheimer as its foremost spokesman. In response, Weisskopf wrote an impassioned letter to Oppenheimer, essentially taking him to task for his betrayal of Peters and begging him to set the record straight. He lobbied the University of Rochester and convinced its president to refrain from firing Peters. Weisskopf's conscientious actions during a period of great turmoil demonstrated resolve and empathy.

The Soviets had exploded their own atomic bomb in August 1949, and in February 1950, Senator Joseph McCarthy gave a speech in which he claimed to have a list of more than two hundred communists and likely spies in the State Department. News of physicist Klaus Fuchs' passing of atomic secrets to the Soviet Union started a furious arms race between the two nations to build a hydrogen bomb. When the debates were raging, a crucial conversation between Weisskopf and Hans Bethe in Princeton persuaded Bethe to reconsider working with Edward Teller on the hydrogen bomb. Egged on by Teller, Bethe was having a hard time deciding whether to work on the bomb. After a meeting with Oppenheimer, Weisskopf painted a vivid picture of thermonuclear destruction and convinced Bethe that even a world in which the US won a war with hydrogen bombs would not be worth living in. Throughout the Cold War years, Hans Bethe was rightly considered the conscience of the physics community, but Weisskopf could rightly be considered his wingman.

In keeping with the spirit of his earlier work, Weisskopf remained known for the occasional mathematical mistakes that sometimes slowed down his calculations. On the other hand, this kind of inspired sloppiness made him a truly wonderful teacher, one who could provide a completely immersive experience for his students in a way that made them feel they were participating in, rather than being taught, the process of scientific discovery. The physicist and science writer Jeremy Bernstein captured this memorable aspect of Weisskopf's trade in a 1991 review of the book:
"My visits to Viki's class in quantum mechanics at MIT were, in every way, a culture shock. The class and the classroom were both huge—at least a hundred students. Weisskopf was also huge, at least he was tall compared to the diminutive Schwinger. I do not think he wore a jacket, or if he did, it must have been rumpled. Schwinger was what we used to call a spiffy dresser.  
Weisskopf's first remark on entering the classroom, was "Boys [there were no women in the class], I just had a wonderful night!" There were raucous catcalls of "Yeah Viki!" along with assorted outbursts of applause. When things had quieted down Weisskopf said, "No, no it's not what you think. Last night, for the first time, I really understood the Born approximation." This was a reference to an important approximation method in quantum mechanics that had been invented in the late 1920s by the German physicist Max Born, with whom Weisskopf studied in Göttingen. Weisskopf then proceeded to derive the principal formulas of the Born approximation, using notes that looked as if they had been written on the back of an envelope. 
Along the way, he got nearly every factor of two and pi wrong. At each of these mistakes there would be a general outcry from the class; at the end of the process, a correct formula emerged, along with the sense, perhaps illusory, that we were participating in a scientific discovery rather than an intellectual entertainment. Weisskopf also had wonderful insights into what each term in the formula meant for understanding physics. We were, in short, in the hands of a master teacher."
Throughout his memoir, Weisskopf's consistent emphasis is on what he calls the joy of insight, whether in science, in music (he was an accomplished classical pianist) or into human beings. His focus is on complementarity and totality, themes that were hallmarks of Niels Bohr's thinking. Complementarity means seeing the world from different viewpoints, each of which may not be strictly compatible with the others, but all of which are collectively important to make sense of reality. He realized that as powerful and satisfying as it is, science is one way of comprehending the world. It gives us the facts, but it doesn't always point us to the best way to use the facts. Religion, the humanities and the arts are all important, and it is important to use as many ways as possible to look at problems and try to solve them; this applies especially to human problems where science can only take us so far.

Nevertheless, in what would be a ringing endorsement of the joy of insight into the secrets of nature during these politically troubled times, here's Weisskopf speaking about the value of science as a candle in the dark:
"I can best describe the joy of insight as a feeling of aesthetic pleasure. It kept alive my belief in humankind at a time when the world was headed for catastrophe. The great creations of the human mind in both art and science helped soften the despair I was beginning to feel when I experienced the political changes that were taking place in Europe and recognized the growing threat of war." 
"During the 1960s I tried to recall my emotions of those days for the students who came to me during the protests against the Vietnam War. This, and other political issues, preoccupied them, and they told me that they found it impossible to concentrate on problems of theoretical physics when so much was at stake for the country and for humanity. I tried to convince them - not too successfully - that especially in difficult times it was important to remain aware of the great enduring achievements in science and in other fields in order to remain sane and preserve a belief in the future. Apart from these great contributions to civilization, humankind offers rather little to support that faith."
In today's times, when so much of the world seems to be chaotic, dangerous and unpredictable, Weisskopf's ode to the 'most precious thing that we have' is worth keeping in mind. Victor Weisskopf's spirit should live on.

30 favorite books

Computer science professor Scott Aaronson listed his 30 favorite books on his blog, so I thought I would (incompletely) list my own. These are volumes which inspired me even as a teenager and continue to stimulate and enrich my worldview. Reflecting my interests, they are mostly non-fiction (or as Richard Rhodes calls it, "verity"). List yours.
1. The Making of the Atomic Bomb – Richard Rhodes (probably the best work of non-fiction I have read, and in my opinion one of the best books ever written: easily parallels Shakespeare or the Greek tragedies as an epic work of horror and glory)
2. Blood Meridian – Cormac McCarthy (probably the best work of fiction I have read; the imagery, violence and profound depth are simply stunning and without parallel)
3. Paradigms Lost – John Casti (probably the best work of general science I have read. Six great problems of modern science – the origin of life, nature vs nurture, language acquisition, artificial intelligence, quantum reality and extraterrestrial intelligence - are tackled in the form of a courtroom case with wit and brilliance)
4. Disturbing the Universe – Freeman Dyson (probably the best autobiography I have read)
5. Chaos – James Gleick (I know of no other book which speaks so vividly of a science on the cusp of explosive progress)
6. The Beginning of Infinity – David Deutsch (mind-expanding)
7. Gödel, Escher, Bach – Douglas Hofstadter (mind-blowing)
8. The Emperor’s New Mind – Roger Penrose (mind-bending)
9. Surely You’re Joking, Mr. Feynman! – Richard Feynman
10. Naturalist – E. O. Wilson (ranks with Dyson’s book as the most sincere set of self-reflections I have ever seen penned by a scientist)
11. Waking Up – Sam Harris (probably the best and clearest argument in favor of secular meditation I have read)
12. Complete works of T.S. Eliot – T.S. Eliot
13. The Dragons of Eden – Carl Sagan (a lot of people rightly recommend Sagan’s other books, but I found this one to be his boldest and most imaginative volume – and it won a Pulitzer)
14. The Time Machine – H. G. Wells
15. The Story of Civilization – Will and Ariel Durant (An eleven-volume magnum opus; probably all you need to read for a grand dive into Western Civilization)
16. Why I am Not A Christian – Bertrand Russell (brimming with trenchant wit and provocative juices)
17. The Man Who Loved Only Numbers – Paul Hoffman (a compulsively readable biography of one of the strangest and most brilliant minds of the 20th century)
18. A Beautiful Mind – Sylvia Nasar (an amazing exploration of both mathematical brilliance and mental illness)
19. My Family and other Animals – Gerald Durrell
20. King Solomon’s Ring – Konrad Lorenz (both Lorenz and Durrell provide delightful accounts of communing with nature and animals that would make any ten year old fall in love with the natural world)
21. Stories – Anton Chekhov (no one can turn words so simply as Chekhov)
22. The Rise and Fall of the Third Reich – William Shirer (an unsurpassed epic full of horror and triumph)
23. The Longest Day – Cornelius Ryan (the book that got me hooked on to WW2 history)
24. Begone Godmen! – Abraham Kovoor (a rare volume: Kovoor was an Indian rationalist who bravely took on spiritual and religious frauds and exposed their ‘miracles’ long before it was fashionable to do so)
25. In Search of Schrödinger’s Cat – John Gribbin (a book which would inspire anyone to study physics)
26. Manhunt – James Swanson (edge-of-your-seat account of the 12-day hunt for John Wilkes Booth)
27. The Second Creation – Robert Crease and Charles Mann (possibly the best history of particle physics)
28. The Eighth Day of Creation – Horace Freeland Judson (possibly the best history of molecular biology)
29. The Double Helix – James Watson (science with all its warts)
30. My World Line – George Gamow (Gamow was brilliant, wide-ranging, a prankster, and all these qualities shine through in this memoir)
And a few more:
31. Natural Obsessions - Natalie Angier (one of the best fly-on-the-wall accounts of academic science)
32. The Billion-Dollar Molecule - Barry Werth (a similar account of industrial science)
33. The JASONS - Ann Finkbeiner (vivid and entertaining profile of some of the most brilliant minds of American science)
34. Consciousness Explained - Daniel Dennett (I find Dennett to be one of the deepest and most original thinkers of our time)
35. Wittgenstein's Poker - Edmonds and Eidinow (Wittgenstein! Popper! The Vienna Circle! Russell!)
36. Paradox - Rebecca Goldstein (a brilliant account of a singular and tortured mind)