Field of Science

This year's 100 book odyssey

I finally achieved my goal of reading a hundred books this year (105 to be exact, although I will probably take a break for the rest of the year). It's an important item off my bucket list. The experience has provided an immense sense of satisfaction since it may be very hard to do this for a while because of time constraints. I am particularly happy that some of these volumes were real tomes that I was lugging around everywhere.

As usual, the list was heavily biased toward non-fiction, with most of the books covering history, science and philosophy. I do want to increase my share of fiction next year, and on the non-fiction (or "verity", as Richard Rhodes calls it) front want to read more about AI, technology, biology and economics.

Is it hard to read a hundred books in one year, essentially two books a week? Not particularly if you try to grab every spare minute (after work and family, that is) for doing it; and I am not even a fast reader. Apart from the usual times (early morning, after work, bedtime, in the bathroom), I used to read in Ubers, in trains, in lines in coffee shops, in restaurants when dining alone, while waiting for friends to show up, in stores while the wife was shopping, at the DMV when I had to wait two hours for my driver's license, during the occasional walk or hike, and during lunch break at work (but don’t tell anyone!). I used to read paper books when possible, but read on my Kindle and on my phone when nothing else was available. I could have read even more had I listened to audiobooks, but I find it hard to concentrate on the spoken as opposed to the written word. Basically you try to cram in a few words every time you can. Sometimes this does lead to fragmented reading, but you gradually get used to it. And at the end of it you feel uniquely well-read, so it’s absolutely worth it, if for no other reason than as a personal challenge.

 Here’s the list in case someone wants holiday book suggestions, starting with the most recent volumes (starred volumes indicate favorites that were reread). Happy Reading! 
1. Jill Lepore – These Truths: A History of the United States. Perhaps the best, most even-handed single volume on American history that I have read.
2. Don Norman – The Design of Everyday Things
3. Edward Gibbon – Decline and Fall of the Roman Empire, Vol. 1. A monumental work that really needs to be savored like fine wine. Because of its archaic style and long sentences it’s not easy reading, but enlightenment comes to those who are patient. I don’t know if I will ever get through all six volumes, but one can try.
4. Isaac Asimov – Asimov’s Mysteries.*
5. Charles Darwin – The Origin of Species.*
6. Candice Millard – Destiny of the Republic. A great thriller about the sadly short-lived presidency of the brilliant James Garfield and his assassination. The book is as much about medical ignorance as about anything else; even though Joseph Lister had demonstrated the value of antisepsis, the doctors of the day resorted to crude measures like sticking their fingers into a wound to find the bullet. It was infection that did Garfield in and not the bullet.
7. Leslie Berlin – Troublemakers: Silicon Valley’s Coming of Age. A fantastic book that brings to life some of the underappreciated characters who, in just eight years, pioneered six transformational industries: video games, personal computing, biotechnology, venture capital, semiconductors and communications.
8. Valley of Genius – Another great and very unique book about Silicon Valley that patches together sound bites from interviews with scores of Silicon Valley pioneers, conducted at different times over thirty years. Makes for a very unique experience where one person begins where another trails off, and you get multiple perspectives on the same people and events.
9. Michael Hiltzik – Dealers of Lightning: Xerox-PARC and the Dawn of the Computer Age. An excellent account of what was, for a brief period, the most innovative computer science lab in the world, giving us everyday inventions like the mouse, the GUI and the windows desktop.
10. John Carreyrou – Bad Blood. Reads like a soap opera. No wonder it’s being turned into a Hollywood movie with Jennifer Lawrence.
11. Abraham Pais – Niels Bohr’s Times
12. Abraham Pais – Subtle is the Lord*
13. Niels Bohr – Atomic Physics and Human Knowledge. There are few wiser men in human history than Niels Bohr.
14. Niels Bohr – Atomic Theory and the Description of Nature.
15. Doris Kearns Goodwin – Leadership in Turbulent Times. You think these times are politically fraught? Just ask Lincoln or FDR.
16. William Aspray – John von Neumann and the Origins of Modern Computing.
17. Oxtoby and Pettis – John von Neumann.
18. Ray Monk – How to Read Wittgenstein.
19. Michael Lewis – The Fifth Risk. A great book on how crucial government functions are in keeping Americans alive and thriving every single day.
20. John Wesley Powell – The Exploration of the Colorado River and its Canyons. I read this first-hand account during a trip to the Grand Canyon. Powell was really the first American to explore what was then a land inhabited only by Natives, and the courage and resilience of his team were amazing (he lost several men).
21. Michael Beschloss – Presidents of War
22. Charles Krauthammer – Things that Matter
23. Charles Krauthammer – The Point of It All. One of the last conservatives who was a well-read and eloquent intellectual.
24. Venki Ramakrishnan – Gene Machine: The Race to Decipher the Secrets of the Ribosome.
25. Charles Darwin – The Autobiography of Charles Darwin. Darwin may have seemed conservative in his demeanor, but this book really brings out both his radical thinking and his sense of humor, especially about religion.
26. Steven Weinberg – Third Thoughts.
27. Richard Powers – The Overstory. In one word – spellbinding. Powers is the most creative writer alive in my opinion. It made me fall in love with redwood trees.
28. Craig Childs – House of Rain: Tracking a Vanished Civilization Across the American Southwest.
29. Simon Winchester – The Perfectionists. A superb history of precision engineering.
30. Alan Lightman – Searching for Stars on an Island in Maine. A meditation on science, time, existence and other topics from one of the most literary and poetic science writers of his generation.
31. Sabine Hossenfelder – Lost in Math: How the Search for Beauty Leads Physics Astray.
32. James Gleick – Chaos.* (This may be the fourth or fifth time I have read this landmark book).
33. Brian VanDeMark – Road to Disaster. A new approach to the Vietnam War that sees it through the lens of theories of organizational behavior and cognitive biases of the kind explored by Amos Tversky and Daniel Kahneman.
34. Brian Keating – Losing the Nobel Prize. A unique first-hand account of a false alarm (cosmic inflation) that led the author very close to a Nobel Prize.
35. William Perry – My Journey to the Nuclear Brink.
36. Jeremy Bernstein – A Bouquet of Dyson.
37. Loren Eiseley – The Unexpected Universe.
38. Loren Eiseley – The Immense Journey.
39. Loren Eiseley – The Firmament of Time. If you think Carl Sagan is eloquent and poetic about how vast the cosmos is and how insignificant and yet profound life is, read Loren Eiseley.
40. Ann Finkbeiner – The Jasons: The Secret History of America’s Postwar Elite.*
41. Tom Holland – Persian Fire. A gripping account of the Greco-Persian Wars. Marathon, Salamis, Thermopylae, all come alive on these pages.
42. Kai-Fu Lee – AI Superpowers.
43. Adam Becker – What is Real. A case for David Bohm’s interpretation of quantum theory.
44. Carlo Rovelli – The Order of Time.
45. Oliver Sacks – The River of Consciousness. A moving posthumous collection of Sacks’s eclectic writing; on Darwin, on consciousness, on plant biology and neurology and on time.
46. Oliver Sacks – Uncle Tungsten: Memories of a Chemical Boyhood.
47. Sy Montgomery – The Soul of an Octopus. The description of octopus intelligence in this book impressed me so much that I actually stopped eating it.
48. Kevin Kelly – What Technology Wants.
49. Jon Gertner – The Idea Factory.*
50. Arieh Ben-Naim – Myths and Verities in Protein Folding.
51. A. P. French – Einstein: A Centenary Volume. Fond recollections of a great physicist and human being by friends, colleagues and students.
52. Intercom on Product Management.
53. John McPhee – Annals of the Former World. A towering history of the geology of the United States, written by one of the best non-fiction writers in America. No one else has the eye for observational detail for both people and places that McPhee does.
54. Oliver Sacks – On the Move.*
55. William Prescott – History of the Conquest of Mexico: Vols 1 and 2. Vivid, engaging, monumental; hard to believe Prescott wasn’t there.
56. Joel Shurkin – True Genius. A biography of physicist and engineer Richard Garwin, one of the very few people to whom the label genius can be applied.
57. Lillian Hoddeson and Vicki Daitch – True Genius*. Another true genius: John Bardeen, the only person to win two Nobel Prizes for physics. The definitive treatment of his life and contributions to world-changing inventions like the transistor and superconductivity.
58. Cormac McCarthy – Blood Meridian. The greatest novel I have read. Breathtaking in its raw beauty and Biblical violence.
59. John Archibald Wheeler – At Home in the Universe. A collection of essays from a physicist who was also a great philosopher and poet.
60. John Wesley Powell – The Exploration of the Colorado River and Its Canyons. The first exploration of the area around the Grand Canyon by a white man, Powell’s journey became known for its harrowing loss of life and adventures.
61. Frank McCourt – Angela’s Ashes. Achingly beautiful, heartbreaking account of Irish poverty.
62. Adam Becker – What is Real? Fascinating account of quantum physics and reality, and one that challenges the traditional Copenhagen Interpretation and gives voice to David Bohm, John Wheeler and others.
63. Steven Weinberg – Facing Up: Science and its Cultural Adversaries
64. Benjamin Hett – The Death of Democracy. The best account I have read of the details of how Hitler came to power. Read and learn.
65. George Trigg – Landmark Experiments in Twentieth Century Physics. The nuts and bolts of some of the most important physics experiments of the last hundred years, from the oil drop experiment to the Lamb Shift.
66. Albert Camus – The Stranger
67. Norman Macrae – John von Neumann
68. Werner Heisenberg – Physics and Philosophy*
69. David Schwartz – The Last Man Who Knew Everything. A fine biography of Enrico Fermi, the consummate scientist.
70. A. Douglas Stone – Einstein and the Quantum. Einstein’s opposition to quantum theory is well known; his monumental contributions to the theory are not as well appreciated. Stone fixes this gap.
71. John Cheever – Cheever: The Collected Short Stories. Melancholy, beautiful prose describing the quiet despair of New England upper class suburbia.
72. Arnold Toynbee – A Study of History
73. Aldous Huxley – Brave New World
74. Charles Darwin – Insectivorous Plants
75. William Faulkner – As I Lay Dying
76. Lillian Hoddeson – Critical Assembly*
77. Herbert York – The Advisors: Oppenheimer, Teller and the Superbomb
78. Joseph Ellis – Founding Brothers*. Worth reading for the wisdom, insights and follies that the founding fathers displayed in erecting a great nation.
79. Noam Chomsky – Requiem for the American Dream. Everyone’s perpetual wet blanket, with his incomparable combination of resoundingly true diagnoses and befuddling philosophy.
80. Colin Wilson – Beyond the Occult. This book had mesmerized me as a child. Now I am more critical, but some of the case studies are fascinating.
81. John Toland – Adolf Hitler. Still stands as the most readable Hitler biography in my opinion.
82. Ron Chernow – Grant. Fantastic. Brings to life the towering, plainspoken, determined man who won the Civil War and became president. Chernow is evenhanded in his treatment of Grant’s drinking problems and corruption-riddled presidency, but he clearly loves his subject. And who wouldn’t? That kind of simplicity and grassroots activism seems to be from another planet these days.
83. Priscilla McMillan – The Ruin of J. Robert Oppenheimer*
84. Paul Horgan – Great River: The Rio Grande in North American History. An epic history of Indians, Spaniards and Anglo-Americans who settled the great states around the Rio Grande.
85. Toby Huff – The Rise of Early Modern Science: Islam, China and the West*. A superb examination of why modern science developed in Europe and not other parts of the world. Huff’s main explanation centers around the European legal and scientific system derived from Roman law and Greek philosophy, both of which encouraged scientific inquiry. Both these elements were crucially missing from Islamic countries, China and India.
86. A. P. French – Niels Bohr: A Centenary Volume. A glowing set of tributes to a great physicist and human being.
87. Stephen Kotkin – Stalin, Volume 1: Paradoxes of Power. Monumental biography of the tyrant, although not easy going because of the dense detail and slightly academic writing. Kotkin’s is likely to be the last word, though.
88. Joseph Heilbronner and Jack Dunitz – Reflections on Symmetry. A beautiful exploration of symmetry in chemistry, physics, biology, architecture and other scientific and human endeavors.
89. Chuck Hansen – The Swords of Armageddon, Vols 2 and 3. The definitive history of US nuclear weapons. Everything you can possibly read about their details without having the feds show up at your doorstep (as they did show up at Hansen’s door many times without being able to ever prove that he had access to non-public information).
90. David Kaiser – Drawing Theories Apart: The Dispersion of Feynman Diagrams in Postwar Physics*. A fascinating socio-scientific exploration of how a key scientific idea makes its way by fits, starts and eventual acceptance into the scientific community.
91. Cormac McCarthy – Child of God. Highly disturbing story of a man on the fringes, filled with dark humor. Not McCarthy’s best in my opinion, and I won’t recommend it for weak stomachs.
92. Franz Kafka – The Metamorphosis
93. Lawrence Badash – Reminiscences of Los Alamos
94. Herodotus – The Histories. The account of the Greco-Persian Wars is especially rousing, and Herodotus of course made a seminal contribution to history by treating it as a contemporary account rather than a divine, untouchable past.
95. Freeman Dyson – Maker of Patterns: An Autobiography Through Letters
96. Iris Chang – The Chinese in America. An amazing account of how Chinese immigrants came to the United States and laid down roots here in the face of poverty, cultural challenges, political upheavals and discrimination.
97. Robert Divine – Blowing on the Wind
98. Richard Rhodes – Energy: A Human History. A history of energy transitions, focusing on the human beings, some well known and others obscure, who engineered them. As usual, Richard is highly adept at both digging up fascinating individual stories and deciphering the big picture.
99. Norman Cohn – Warrant for Genocide: The Myth of the Jewish World Conspiracy and the Protocols of the Elders of Zion.
100. Jim Holt – When Einstein Walked with Gödel. A series of essays on math, philosophy, genius and the nature of reality.
101. Anil Ananthaswamy – Through Two Doors at the Same Time. Captivating account of the essential nature and many ramifications of one of the most simply stated and yet perplexing, deep and mind-bending experiments of all time.
102. Marcus Chown – The Magic Furnace: The Search for the Origin of Atoms.
103. E. O. Wilson – The Meaning of Human Existence.
104. David Quammen – The Tangled Tree.
105. V. S. Naipaul – A House for Mr. Biswas.

Will CADD ever become relevant to drug discovery?



Let’s face some hard facts first. The good news is that in the last twenty years or so, computer-aided drug design (CADD or computational chemistry) has become a relatively standard part of drug discovery, and even in organizations not formally employing CADD scientists, some form of computation – sometimes as simple as property calculation or visualization – is used.

The bad news is that, as it is largely practiced now, CADD is not considered a core part of the drug discovery process. Period. Rather, it’s considered as a supporting part. This situation has only marginally improved in twenty years. Biology and synthetic chemistry are still the core driving disciplines of drug discovery and will remain so in the foreseeable future. If there is a marketed drug that benefits extravagantly from CADD, it comes along once every decade or so at best (good luck finding another HIV protease, for instance). As integral as CADD scientists consider themselves and as impressive as they think the rotating pictures on their screen are, the fact that their peers consider what they do as being marginally relevant to the big picture is a bitter pill that needs to be swallowed.

Why is CADD not considered a core part of drug discovery, even though it is now part of every organization’s drug discovery portfolio in one form or another? There are two reasons, one related to fact and the other related to perception. The fact is that drug discovery and biology are still very much experimental disciplines, and you simply can’t compute your way to a drug; in this sense, CADD simply is not good enough yet. The perception of CADD is that unlike synthetic chemistry and biology which generate products (compounds in bottles, assays, animal models etc.), CADD generates ideas. In that sense, experimental scientists often have the same view of CADD scientists that Ernest Rutherford had of theorists: "They play games with their symbols while we discover the mysteries of the universe."

Now one of the advantages of CADD is that the ideas can be useful because of their generality (e.g. “Put a hydrophobic group here.”), but it’s this very generality that often counts against CADD. Synthetic chemists in particular want to see specific suggestions, and while plenty of them appreciate general guidance, it doesn’t quite have the same impact. The fact that CADD scientists also often make suggestions that are considered obvious – filling hydrophobic pockets is a standard suggestion – does even less to endear the discipline to other scientists.

This rather undervalued reputation of CADD impacts both the discipline and the scientists. The market for CADD scientists is usually a niche market, with low demand and low supply, and typically a small organization will have only one or two CADD scientists. But the supportive role of CADD means that when there are layoffs, a CADD scientist is far more likely to go than a cell biologist or assay developer. Because CADD scientists are not abundant, they also may not have direct reports. All this means that CADD scientists may not have the ear of senior management. They thus rarely occupy senior managerial positions like CSO, VP of drug discovery or CTO; if you take a random sample of top management at both small and large drug discovery organizations, you will very rarely find scientists with a background in CADD in these positions.

The supporting role of CADD, combined with the lack of core influence and understanding that CADD scientists have, means that they often have to fight uphill battles to make their voices heard. While this is a good character-building experience, it doesn’t necessarily contribute to career progression or a growing influence on the part of the field. So how can CADD make a bigger impact both on the facts of drug discovery and the perception?

At least part of the answer – as unseemly as it seems to a lot of drug hunters today – does involve large datasets and machine learning. I am not saying that ML or some form of AI will have the kind of immediate, hype-heavy, transformational impact that often seems all-too-apparent through Silicon Valley sunglasses. But I am saying that the impact of ML and AI on drug discovery will inexorably increase as time passes, and anyone who simply chooses the option of dismissing these technologies will be left behind.

There are some important reasons why I believe this. The most important reason perhaps is that traditional CADD, as it pertains to protein structure and physics-based algorithms, has fallen far short of its promise. Part of the problem was the hype in the late 80s and early 90s, but a more realistic problem is that the data is often not good enough and even when it’s good it can impact a very limited part of early drug discovery. The biggest problem is that, even thirty years after a leading CADD scientist stated the challenges inherent in using CADD to predict binding affinity (protein flexibility, water behavior, crowding effects) we are still struggling with the same issues. Ignorance of the basic physics of protein-ligand interactions, especially as it pertains to making reliable, general predictions, is still rife.

I am not saying that physics-based approaches haven’t worked or improved at all over the last two decades; some applications, like shape-based similarity searches and pose prediction, have certainly shown impressive progress. But the fact remains that we are still swimming in a sea of ignorance, either ignorance born of gaps in basic scientific understanding or ignorance born of the difficulty of engineering our current understanding into viable approaches. Equally important, even when we understand these factors, they will be applicable to a very narrow part of drug development, namely improving binding affinity between a single small molecule and (usually) a single protein. At best CADD as we know it will design good ligands, not drugs.

Approaches based on large datasets in contrast are agnostic about the physics of protein-ligand binding. In principle, they can take a bunch of well-curated data points about interaction energies and predict what a new interaction will be like without ever explicitly modeling a single hydrogen bond. They don’t understand causation but can provide actionable correlations. But the second and truly novel advantage of machine learning approaches in my opinion is that, unlike traditional CADD which applies to a very narrow part of drug discovery even on its best day, ML can in theory encompass multiple aspects of drug discovery by simply adding more columns to the Excel file. Thus, if one has multiple kinds of data – in vitro and in vivo binding affinity, mutations, metabolic data, solubility, clearance etc. – traditional CADD can pretty much take advantage of only one or two kinds of data, but ML can consider all the datasets at once. Thus, machine learning provides pretty much the only way to solve the multiparameter optimization problem that is at the heart of drug discovery.
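To make the "more columns" point concrete, here is a minimal sketch of the multiparameter scoring problem itself. It is not a machine learning model; it only shows how several kinds of data for each compound might be folded into one ranking, using a simple desirability-style score (geometric mean of per-property scores). The compound names, property values and acceptable ranges are all invented for illustration, and in practice an ML model's predictions would supply the columns.

```python
from math import prod

# Hypothetical per-compound data; every value is invented for illustration.
compounds = {
    "cpd_A": {"pIC50": 8.5, "solubility_uM": 5.0, "clearance": 60.0},
    "cpd_B": {"pIC50": 7.2, "solubility_uM": 120.0, "clearance": 15.0},
    "cpd_C": {"pIC50": 6.0, "solubility_uM": 300.0, "clearance": 10.0},
}

# Assumed acceptable ranges: (low, high, higher_is_better).
# Adding another kind of data means adding another entry here.
ranges = {
    "pIC50": (5.0, 9.0, True),
    "solubility_uM": (1.0, 200.0, True),
    "clearance": (5.0, 80.0, False),
}

def desirability(value, low, high, higher_is_better):
    """Map a raw value linearly onto [0, 1], clipping outside [low, high]."""
    frac = (value - low) / (high - low)
    frac = min(1.0, max(0.0, frac))
    return frac if higher_is_better else 1.0 - frac

def mpo_score(props):
    """Geometric mean of per-property desirabilities (one poor property drags the score down)."""
    ds = [desirability(props[name], *spec) for name, spec in ranges.items()]
    return prod(ds) ** (1.0 / len(ds))

# Rank compounds from most to least balanced.
ranked = sorted(compounds, key=lambda c: mpo_score(compounds[c]), reverse=True)
print(ranked)
```

With these made-up numbers the most potent compound (cpd_A) ends up last, because its poor solubility and high clearance outweigh its binding affinity; that is the sense in which optimizing affinity alone designs ligands rather than drugs.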

Now, as is well known, the real problem with data is not the physics but the quantity, quality and curation of the datasets (including negative data). But these are without a doubt getting better and more integrated across various phases of drug discovery every day. Whether it’s structures in the PDB, synthetic reactions for making drugs or patient data in the TCGA (The Cancer Genome Atlas), the improvement of data quantity and quality is inexorable. And as this quantity and quality improve, so will the utility of this data for machine learning. This is a certainty. The naysayers therefore are fighting against a current which is only going to get stronger, even if it doesn’t quite turn into the tsunami that its practitioners say will engulf us very soon.

Are we there yet? The simple answer is no. And we probably won’t be in terms of measurable impact for some time, especially since we have to hack our way through so much hype to get to the nugget of causal or even correlative truth. The domain of applicability of even impressive techniques like deep learning is still limited (for instance it works best on images), and the data problem is still a big one. But what I am saying is that unless CADD embraces machine learning and AI – irrespective of the hype – CADD scientists will always occupy a marginal perch in the grand stadium of drug development and will increasingly be cast to the sidelines, both in terms of career and discipline progress. This will especially be true as both protein and drug classes expand out into space (unstructured proteins, macrocycles etc.) that is even more challenging for traditional CADD than usual.

Sadly, as with many things these days, the mention of AI or ML in CADD continues to evoke two extreme reactions: the first one which is more common is to say that it’s going to solve everything and is the Next Big Thing, but the second one is sadly also common, the one which says that it’s all hype and good old thinking will always be better. But good old thinking will always be around. The data is getting better and the machine learning is getting better; these are pretty much certainties which we ignore at our own peril. The bottom line is that unless CADD starts including machine learning and the related paraphernalia of techniques that either accurately or misleadingly fall under the rubric of “AI” in its very definition, its role in drug discovery and development threatens to dwindle to a point of vanishing relevance. This will do an enormous disservice both to CADD and drug discovery in general. 

Every year when the Nobel Prize in chemistry is announced, there's some bellyaching when the work leading to the prize turns out to be as much related to biology or physics as to chemistry. "But is it chemistry?", is the collective outcry. However, a more refined understanding of the matter over the years leads us to realize that chemistry has as much absorbed biology or physics as they have absorbed chemistry. Chemistry's definition has been changing ever since it emerged from the shroud of alchemy. Chemical biology, chemical physics and chemical engineering are as much chemistry as they are biology, physics and engineering. The same thing needs to happen to CADD in my opinion. By embracing computer science or automation or AI, by hiring ML engineers or data scientists in their midst, CADD is not losing its identity. It's showing that its identity is much bigger than anyone thought. That it is vast and contains multitudes.

Late hour

The fall turned colors faster than ever before. The streets never saw any activity. The whole gambit of Prometheus hinged on a mere coin flip. Richard Albrook gingerly closed his book and took a look around.
The café was almost deserted, college students and startup founders struggling to meet last minute deadlines, their faces a picture of desperate concentration. The baristas and their blues, the coffee with its vitriolic flavors. It seemed like the uneasy middle of time. Had not the soothsayer spoken with gusto and evident admiration for the march of destiny, he might have almost been forgiven for having a sense of whimsy.
Albrook had been languishing in this carved out area of spacetime until his visceral emotions had gotten the better of him. His friends had warned him that too much time with a speakeasy kind of permissive feeling would mark his doom. Not that feelings of doom had never crossed his mind, but this time it seemed all too real. Lost love, the convolutions of Clifford algebras and dandy details of daffodil pollination had always been seemingly on the verge of materializing in a cloud of abject reality, but the effect had been subtle at best.
It was this rather susceptible mix of preternaturally wholesome unification that Albrook was mulling over when the wizard walked in.
Two precisely round chocolate chip cookies and a latte, bitte, he croaked, even as his voice sounded like it would warp space with a brand new curvature tensor. Wouldn’t his friend want to know what exactly the latest abomination was? What would be the whole point of stealing a few moments from eviscerated time? What a world it was. What a world. Jesus, Mohammed, the Buddha, Bohr and Heisenberg and their tortured children marveling at mutilated meaning; all had had their say about the void, the complete mess of humanity and sweet free will that had been ours for the taking. But only the wizard seemed like he would have the answers.
He had been responsible, after all, for creating the fuzziest metamorphosis of all things pure since the metamorphosis. Kurt and Albert could have disturbed the whole universe newly and duly on their meanderings through the snow and the wizard wouldn’t have sneezed. The gall of the cosmic dance had sowed the seeds for a lion’s formidable roar of what turned out to be a mouse tweet. The ferns and bison and Ediacarans were small meat on the giant pancake, although who knew the biped would have shown such a cheerful, smooth indifference to the incredible gamble.
The drink fomented, the cookie corners cut, he settled down into a chair that seemed borrowed from Henry VIII’s custom-built collection. A sigh as heavy as whipped baryonic soup. A flash of the eyes declaring a concern best left to the accountants. Albrook guffawed silently at the sheer temerity of the multidimensional swindle. A child’s question would have cut to the chase, and still his laughter did not obliterate the deep anguish. What was here should have been lost through the ages, and yet it had somehow wound itself through survival of the fittest, through the fine structure constant and Rome and Versailles and plopped itself down on plush synthetic.
Somehow the flicker of courage appeared before time had a moment to reassert itself, and Albrook walked over, the scuff of his shoes drawing strange patterns from a misbegotten textile’s short half-life.
Good evening, wizard. And what gave you a change of heart?
The wizard quietly looked up, the embers in his pitiful flask, one among an infinite solution space of consumables, quietly giving off entropy to his own creations so that they could live and lie one more day. The wrinkles of spacetime dug deep into his forehead. It was an experiment, my friend. The parameters set, the goodness all baked in, the whole outcome an exciting – albeit combustible, I must admit – mix of speculation, fine-tuning and demonically predictive power. And one that spit out Santa Claus, asymptotic freedom and Amanita phalloides, ho ho ho, thank you very much, what wretched delight. To have carried the delicious burden of biophilia and single malt was a crowning glory, although Vibrio cholerae is a smirk too wide. The bridge too far that gets on the vicious underbelly of the other side deserves its own name, and even I find myself lexically challenged to uncover that particular mystery.
Albrook sighed with dim recognition. Yes, he said, he had had that sense of gnawing guilt that told him that he should have maybe said something that time, that time when the lever was being tried out for the first time, when the drooling promise of superficial glory was exploding with a million frenzied colors, when the ultimate party was cooking in the innards of some rotten genius mass of neural habitat for humanity. But the coin can fall either way, and he had had neither the heart nor the horror of instant recognition to have everyone step back. Late night scribblings in the café were the best he could accomplish. And he hadn’t hesitated before melting all his worries away in the wary delight of Zarathustra and Żubrówka. I mean what the hell; the gel of existence has to glisten on someone’s hair.
It’s all right, nobody made you the guardian of all that’s good and possible, said the wizard with a pensive expression of almost dead resignation. Don’t worry about it; you’ll be all right. The gurgling maw can be filled in again, but déjà vu never found itself a home in gladdened hearts. In any case, what has to be done has to be done; the cautious footsteps of improbability were never too bold for inevitability. He looked into his cup and saw the bottomed out hole filled with the veneer of hopeful damnation.
Straightening out his tunic, an upper bound of desire and planning writ into his expression, he extended his hand to Albrook. I’ll be seeing you. The quick, blurry movement of the door and the morbidly cold gust of air sealed the hands of time immemorial. Albrook stood there for a moment and walked back to fetch his hat and coat. Far away, on the Serengeti, a lion ripped into the pulsating veins of a forlorn gazelle.
First posted on 3 Quarks Daily.

What areas of chemistry could AI impact the most? An opinion poll

The other day I asked the following as a survey question regarding potential areas of chemistry where AI could have the biggest impact.


There were 163 responses, which isn't a bad sample size. The responses are in fact in line with my own thinking: synthesis planning and automation emerge as the leading candidates.

I think synthesis planning AI will have the biggest impact on everyday lab operations during the next decade. Synthesis planning, while challenging, is still a relatively deterministic protocol based on a few good reactions and a large but digestible number of data points. Reliable reactions like olefin metathesis and metal-mediated coupling have now become fairly robust and are heavily used, generating thousands of machine-readable data points and demonstrating reliability and relative predictability; there are now fewer surprises, and whatever surprises exist are well-documented. As recent papers make clear, synthesis planning had been waiting in the wings for several years for the curation of millions of examples pertaining to successful and unsuccessful reactions and chemotypes as well as for better neural networks and computing power. Without the first development the second wouldn't have made a big difference, and it seems like we are finally getting there with good curation.

I was a bit more surprised that materials science did not rank higher. Quantum chemical calculations for estimating optical, magnetic, electronic and other properties have been successful in materials science and have started enabling high-throughput studies in areas like MOF and battery technology, so I expect this field to expand quite a bit during the next few years. Similar to how computation has worked in drug discovery, AI approaches don't need to accurately predict every material property to three decimal places; they will have a measurable impact even if they can qualitatively rank different options and narrow down the pool so that chemists have to spend fewer resources making them.

Drug design, while already a beneficiary of compute, will see mixed results in my opinion over the next decade. For one thing, "drug design" is a catchall phrase that can include everything from basic protein-ligand structure prediction to toxicity prediction, with the latter being at the challenging end of the spectrum. Structure-based design will likely benefit from deep learning that learns basic intermolecular interactions which are transferable across target classes, so that they are less limited by the paucity of training data.

Areas like synthesis planning do contribute to drug design, but the real crux of successful drug design will be multiparameter optimization and SAR prediction, where an algorithm is able to successfully calculate multiple properties of interest like affinity, PK/PD and toxicity. PK/PD and toxicity are systemic effects that are complex and emergent, and I think the field will still not be able to make a significant dent in predicting idiosyncratic toxicity except for obvious cases. One area in which I see AI having a bigger impact is any field of drug discovery involving image recognition; for instance phenotypic screening, and perhaps the processing of images in cryo-EM and standard x-ray crystallography.

Finally, automation is one area where I do think AI will make substantial progress. This is partly due to better seamless integration of hardware and software and partly because of better data generation and recording that will enable machine learning and related models to improve. This development, combined with reaction planning that allows scientists to test multiple hypotheses, will in my opinion lead to automation making heavy inroads into the day-to-day work of chemists fairly soon.

Another area which I did not mention in the survey but which will impact all of the above areas is text mining. There the problem is one of discovering relationships between different entities (drugs and potential targets, for instance) that are not novel per se but that are just hidden in a thicket of patents, papers and other text sources which are too complicated for humans to parse. Ideally, one would be able to combine text mining with intelligent natural language processing algorithms to enable new discovery through voice commands.

Can we turn biology into engineering?

Vijay Pande of Andreessen-Horowitz/Stanford has a thought-provoking piece in Scientific American in which he lays out a game plan for how we could potentially make biology more like engineering. He takes issue with what Derek (Lowe) once called the “Andy Grove fallacy”, which in a nutshell says that you can make fields like biotechnology and drug discovery as efficient as semiconductor or automobile engineering if you borrow principles from engineering.

There are parts of the piece that I resoundingly agree with; for instance, there’s little doubt that fields like automation and AI are going to have a significant impact on making biological experiments more reproducible, many of which are still more art than science and subject to the whims and sloppiness of their creators and their lab notebooks. Vijay is also optimistic about making biology more modular, so that one can string along parts of molecules, cells and organelles to enable better biological engineering of body parts, drugs and genetic systems. He also believes that bringing more quantitative measurements encoded into key performance indicators (KPIs) will make the discipline more efficient and more mindful of its successes and failures. One point which I think is very important is that these kinds of approaches would allow us to gather more negative data, a data collection problem that still hobbles AI and machine learning approaches.

So far I am with him; I don’t believe that biology can’t ever benefit from such approaches, and it’s certainly true that the applications of AI, automation and other engineering-based approaches are only going to increase with time. But the article doesn’t mention some very fundamental differences between biology and engineering which I think demarcate the two substantially from each other and which make knowledge transfer between them highly problematic.

Foremost among these are non-linearity, redundancy and emergence.

Let’s take two examples which the piece talks about that illustrate all three concepts – building bridges and the Apollo project. Comparisons with the latter always make me wince a bit. Vijay is quite right that the right approach to the Apollo program was to break the problems into parts, then further break those parts up into individual steps such as building small models. The scientists and engineers working on the program gradually built up layers of complexity until the model that they tested was the moon landing itself.

Now, the fact is that we already do this in biology. For instance, when we want to understand or treat a disease, we try to break it down to simpler levels – organelles, cells, proteins, genes – and then try to understand and modulate each of these entities. We use animal models like genetically engineered mice and dogs as simpler representatives of complex human biology. But firstly - and as we keep on finding out - these models are pale shadows of true human biology; we use them because we can't do better. And secondly, even these ‘simple’ models are much more complex than we think. The reasons are non-linearity and emergence, both of which can thwart modular approaches. The sum of proteins in a cell is not the same as the cell phenotype itself, just like the sum of neurons in a human brain is not the brain itself. So modulating a protein for instance can cause complex downstream effects that depend on both the strength and nature of the modulating signal. In addition, biological pathways are redundant, so modulating one can cause another one to take over, or cause the pathway to switch between complex networks. Many parts downstream, even ones that don’t seem to be directly connected, can interact with each other through complex, non-linear feedback through far-flung networks.

This is very unlike engineering. The equivalent of these unpredictable consequences in building a bridge, for example, would be for a second bridge to sprout out of nowhere when the first one is built, or the rock on the other side of the river suddenly turning from metamorphic to sedimentary, or the sum of weights of two parallel beams on the bridge being more than what simple addition would suggest. Or imagine the Apollo rocket suddenly accelerating to ten times its speed when the booster rockets fall off. Or the shape of the reentry vehicle suddenly changing through some weird feedback mechanisms as it reaches a certain temperature when it’s hurtling through the atmosphere. 

Whatever the complexities of challenging engineering projects like building rockets or bridges, they are still highly predictable compared to the effects of engineering biology. The fact of the matter is that the laws of aerodynamics and gravity were extremely well understood before the Apollo program (literally) got off the ground, so as amazing as the achievement was, it didn't involve discovering new basic scientific laws on the fly, something that we do a lot in biology. Aircraft design is decidedly not drug design. And all this is simply a product of ignorance, ignorance of the laws of biology and evolution – a clunky, suboptimal, haphazard, opportunistic process if there ever was one – relative to the laws of (largely predictable) Newtonian physics that underlie engineering problems.

The concept of modularity in biology therefore becomes very tricky compared to engineering. There is some modularity in biology for sure, but it’s not going to take you all the way. One of the reasons is that unlike modularity in engineering, biological modularity is flexible, both spatially and temporally. This is again a consequence of different levels of emergence. For instance, a long time ago we thought that the brain was modular and that the fundamental modules were neurons. This view has now changed and we think that it’s networks of neurons that are the basic modules. But we don’t even think that these modular networks are fixed through space and time; they likely form, dissolve and change members and locations in the brain according to need, much like political groups fleetingly forming and breaking apart for convenience. The problem is that we don’t know what level of modularity is relevant to addressing a particular problem. For instance, is the right ‘module’ for thinking about Alzheimer’s disease the beta-amyloid protein, or is it the mitochondria and their redox state, or is it the gut-brain axis and the microbiome? In addition, modules in biology are again non-linear, so the effects from combining two modules are not going to simply be twice the effects of one module – they can be twice or half or even zero.

Now, having noted all these problems, I certainly don’t think that biology cannot benefit at all from the principles of engineering. For one thing, I have always thought that biologists should really take the “move fast and break things” philosophy of software engineering to heart; we simply don’t spend enough time trying to break and falsify hypotheses, and this leads to a lot of attrition and time chasing ghosts down rabbit holes. More importantly though, as a big fan of tool-driven scientific revolutions, I do believe that inventing tools like CRISPR and sequencing will allow us to study biological systems at an increasingly fine-grained level. They will allow us to gather more measurements that would allow better AI/machine learning models, and I am all for this.

But all this will work only as long as we realize that the real problem is not improving the measurements, it’s knowing what measurements to make in the first place. Otherwise we find ourselves in the classic position of the drunkard trying to find his keys below the lamp, because that’s where the light is. Inventing better lamps, or a metal detector for that matter, is not going to help if we are looking in the wrong place for the keys. Or looking for keys when we should really be looking for a muffin.

Victor Weisskopf and the many joys of scientific insight

Victor Weisskopf (Viki to his friends) emigrated to the United States in the 1930s as part of the windfall of Jewish European emigre physicists which the country inherited thanks to Adolf Hitler. In many ways Weisskopf's story was typical of his generation's: born to well-to-do parents in Vienna at the turn of the century, educated in the best centers of theoretical physics - Göttingen, Zurich and Copenhagen - where he learnt quantum mechanics from masters like Wolfgang Pauli, Werner Heisenberg and Niels Bohr, and finally escaping the growing tentacles of fascism to make a home for himself in the United States where he flourished, first at Rochester and then at MIT. He worked at Los Alamos on the bomb, then campaigned against it as well as against the growing tide of red-baiting in the United States. A beloved teacher and researcher, he also served as director-general of CERN, a laboratory which continues to work at the forefront of particle physics and rack up honors.

But Weisskopf also had qualities that set him apart from many of his fellow physicists; among them were an acute sense of human tragedy and triumph and a keen and serious interest in music and the humanities that allowed him to appreciate human problems and connect ideas from various disciplines. He was also renowned for being a wonderfully warm teacher. Many of these qualities are on full display in his wonderful, underappreciated memoir titled "The Joy of Insight: Passions of a Physicist".

The memoir starts by describing Weisskopf's upbringing in early twentieth century Vienna, which was then a hotbed of revolutions in science, art, psychology and music. The scientifically inclined Weisskopf came of age at the right time, when quantum mechanics was being developed in Europe. He was fortunate to study first at Göttingen, which was the epicenter of the new developments, and then in Zurich under the tutelage of the famously brilliant and acerbic Wolfgang Pauli. It was at Göttingen that Max Born and Heisenberg had invented quantum mechanics; by the time Weisskopf came along, in the early 1930s, physicists were in a frenzy to apply quantum mechanics to a range of well-known, outstanding problems in nuclear physics, solid state physics and other frontier branches of physics.

Pauli, known as the "conscience of physics", was famous for a sharp tongue that spared no one, but also for his honesty and friendship. Weisskopf's first encounter with Pauli was typical:
"When I arrived at the Institute, I knocked at the door of Pauli's office until I heard a faint voice saying, "Come in". There at the far end of the room I saw Pauli sitting at his desk. "Wait, wait", he said, "I have to finish this calculation." So I waited for a few minutes. Finally, he lifted his head and said, "Who are you?" I answered, "I am Weisskopf. You asked me to be your assistant." He replied, "Oh, yes. I really wanted (Hans) Bethe, but he works on solid state theory, which I don't like, although I started it."... 
Pauli gave me some problem to study - I no longer remember what it was - and after a week he asked me what I had done about it. I showed him my solution, and he said, "I should have taken Bethe after all."...
In spite of this rather inauspicious start, Weisskopf became both a very good physicist and a close friend of both Pauli and Bethe; he credits Pauli for lovingly 'spanking him into shape'.

Weisskopf also spent a productive year at Niels Bohr's institute in Copenhagen, where he was the 'victim' of Bohr's extended walks and tortuous reformulations of scientific statements to render them as accurate as possible. He benefited immensely from Bohr's style, as did many other leading theoretical physicists of the time. Bohr was known for his Delphic utterances and his mesmerizing personality that left listeners both frustrated and filled with a sense of wonder; only Einstein was more famous in the world of science then. In Copenhagen Bohr had created his own magic kingdom, one to which almost every budding physicist was required to make a pilgrimage. Many memories of Weisskopf's time with Bohr are recounted, but one in particular attests to the man's fame, essential qualities and influence:
"One evening at six o'clock, my usual quitting time, Bohr and I were still deep in discussion. I had an appointment that night and had to leave promptly, so Bohr walked me to the streetcar stop, about five minutes from his house. We walked and he talked. When we got there, the streetcar was approaching. It stopped and I climbed on to the steps. But Bohr was not finished. Oblivious to the people sitting in the car, he went right on with what he had been saying while I stood on the steps. Everyone knew who Bohr was, even the motorman, who made no attempt to move to start the car. He was listening with what seemed like rapt attention while Bohr talked for several minutes about certain subtle properties of the electron. Finally Bohr was through and the streetcar started. I walked to my seat under the eyes of the passengers, who looked at me as if I were a messenger from a special world, a person chosen to work with the great Niels Bohr."
Weisskopf made important contributions to quantum electrodynamics, but he suffered from a self-admitted lack of confidence that sometimes kept him from pushing calculations through. In one episode, he made an important mistake in a paper that one of Robert Oppenheimer's students pointed out in a private letter; Weisskopf was grateful and rightly comments that in today's times, that student might have directly sent a strident correction to the journal that published the paper, causing public embarrassment.

In another, more consequential embarrassment that must have been jarring, he wrote a paper after the war on the famous Lamb Shift that revealed fundamental discrepancies in the structure of quantum field theory. He then compared the results with ones acquired by Richard Feynman and Julian Schwinger, the new young mandarins who were revolutionizing the field. When his results did not agree with theirs, he withheld publication; surely physicists as brilliant as Feynman and Schwinger couldn't be wrong? After a few weeks, he heard from Feynman, who had realized that both he and Schwinger had made the same mistake in their calculation. Weisskopf was right, and he himself admits that if he had published his paper, he might have won a Nobel Prize. None of this engendered a sense of bitterness in him, however, and he used the incident to illustrate the importance of self-confidence in science.

After the war, the House Un-American Activities Committee (HUAC) started questioning and indicting hundreds of people for their pre-war and post-war communist party membership, and into the trap fell several physicists who were connected with national security work. One of these was Bernard Peters, a past student of Robert Oppenheimer's who had spent time in Dachau for communist agitation. Since the war, Peters had become a colleague of Weisskopf's at the University of Rochester and was doing significant research in cosmic ray physics. In June 1949, Oppenheimer gave a damning testimony against Peters and conceded that he was likely a dangerous communist. Much of this testimony was simply based on opinion and on Peters's activities before the war, with no concrete evidence. The testimony created an uproar in the physics community, much of which regarded Oppenheimer as its foremost spokesman. In response, Weisskopf wrote an impassioned letter to Oppenheimer, essentially taking him to task for his betrayal of Peters and begging him to set the record straight. He lobbied the University of Rochester and convinced its president to refrain from firing Peters. Weisskopf's conscientious actions during a period of great turmoil demonstrated resolve and empathy.

The Soviets had exploded their own atomic bomb in August 1949, and in February 1950, Senator Joseph McCarthy gave a speech in which he claimed to have a list of more than two hundred communists and likely spies in the State Department. News of physicist Klaus Fuchs' passing of atomic secrets to the Soviet Union started a furious arms race between the two nations to build a hydrogen bomb. When the debates were raging, a crucial conversation between Weisskopf and Hans Bethe in Princeton persuaded Bethe to reconsider working with Edward Teller on the hydrogen bomb. Egged on by Teller, Bethe was having a hard time deciding whether to work on the bomb. After a meeting with Oppenheimer, Weisskopf painted a vivid picture of thermonuclear destruction and convinced Bethe that even a world in which the US won a war with hydrogen bombs would not be worth living in. Throughout the Cold War years, Hans Bethe was rightly considered the conscience of the physics community, but Weisskopf could rightly be considered his wingman.

In keeping with the spirit of his earlier work, Weisskopf remained known for the occasional mathematical mistakes that sometimes slowed down his calculations. On the other hand, this kind of inspired sloppiness made him a truly wonderful teacher, one who could provide a completely immersive experience for his students in a way that made them feel they were participating in, rather than being taught, the process of scientific discovery. The physicist and science writer Jeremy Bernstein captured this memorable aspect of Weisskopf's trade in a 1991 review of the book:
"My visits to Viki's class in quantum mechanics at MIT were, in every way, a culture shock. The class and the classroom were both huge—at least a hundred students. Weisskopf was also huge, at least he was tall compared to the diminutive Schwinger. I do not think he wore a jacket, or if he did, it must have been rumpled. Schwinger was what we used to call a spiffy dresser.  
Weisskopf's first remark on entering the classroom, was "Boys [there were no women in the class], I just had a wonderful night!" There were raucous catcalls of "Yeah Viki!" along with assorted outbursts of applause. When things had quieted down Weisskopf said, "No, no it's not what you think. Last night, for the first time, I really understood the Born approximation." This was a reference to an important approximation method in quantum mechanics that had been invented in the late 1920s by the German physicist Max Born, with whom Weisskopf studied in Göttingen. Weisskopf then proceeded to derive the principal formulas of the Born approximation, using notes that looked as if they had been written on the back of an envelope. 
Along the way, he got nearly every factor of two and pi wrong. At each of these mistakes there would be a general outcry from the class; at the end of the process, a correct formula emerged, along with the sense, perhaps illusory, that we were participating in a scientific discovery rather than an intellectual entertainment. Weisskopf also had wonderful insights into what each term in the formula meant for understanding physics. We were, in short, in the hands of a master teacher."
Throughout his memoir, Weisskopf's consistent emphasis is on what he calls the joy of insight, whether in science, in music (he was an accomplished classical pianist) or into human beings. His focus is on complementarity and totality, themes that were hallmarks of Niels Bohr's thinking. Complementarity means seeing the world from different viewpoints, each of which may not be strictly compatible with the others, but all of which are collectively important to make sense of reality. He realized that as powerful and satisfying as it is, science is only one way of comprehending the world. It gives us the facts, but it doesn't always point us to the best way to use the facts. Religion, the humanities and the arts are all important, and it is important to use as many ways as possible to look at problems and try to solve them; this applies especially to human problems where science can only take us so far.

Nevertheless, in what would be a ringing endorsement of the joy of insight into the secrets of nature during these politically troubled times, here's Weisskopf speaking about the value of science as a candle in the dark:
"I can best describe the joy of insight as a feeling of aesthetic pleasure. It kept alive my belief in humankind at a time when the world was headed for catastrophe. The great creations of the human mind in both art and science helped soften the despair I was beginning to feel when I experienced the political changes that were taking place in Europe and recognized the growing threat of war." 
"During the 1960s I tried to recall my emotions of those days for the students who came to me during the protests against the Vietnam War. This, and other political issues, preoccupied them, and they told me that they found it impossible to concentrate on problems of theoretical physics when so much was at stake for the country and for humanity. I tried to convince them - not too successfully - that especially in difficult times it was important to remain aware of the great enduring achievements in science and in other fields in order to remain sane and preserve a belief in the future. Apart from these great contributions to civilization, humankind offers rather little to support that faith."
In today's times, when so much of the world seems to be chaotic, dangerous and unpredictable, Weisskopf's ode to the 'most precious thing that we have' is worth keeping in mind. Victor Weisskopf's spirit should live on.