Field of Science

What areas of chemistry could AI impact the most? An opinion poll

The other day I ran a survey asking which areas of chemistry AI could impact the most.


There were 163 responses, which is not a bad sample. The responses are in fact in line with my own thinking: synthesis planning and automation emerge as the leading candidates.

I think synthesis planning AI will have the biggest impact on everyday lab operations during the next decade. Synthesis planning, while challenging, is still a relatively deterministic protocol based on a few good reactions and a large but digestible number of data points. Reliable reactions like olefin metathesis and metal-mediated coupling have become fairly robust and are used heavily enough to have generated thousands of machine-readable data points that demonstrate their reliability and relative predictability; there are now fewer surprises, and whatever surprises exist are well-documented. As recent papers make clear, synthesis planning had been waiting in the wings for several years, both for the curation of millions of examples of successful and unsuccessful reactions and chemotypes and for better neural networks and computing power. Without the first development the second wouldn't have made a big difference, and it seems like we are finally getting there with good curation.

I was a bit more surprised that materials science did not rank higher. Quantum chemical calculations for estimating optical, magnetic, electronic and other properties have been successful in materials science and have started enabling high-throughput studies in areas like MOFs and battery technology, so I expect this field to expand quite a bit during the next few years. Similar to how computation has worked in drug discovery, AI approaches don't need to accurately predict every material property to three decimal places; they will have a measurable impact even if they can only qualitatively rank different options and narrow down the pool, so that chemists have to spend fewer resources making the candidates.
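To make the ranking point concrete, here is a minimal sketch in Python. The candidate names and all of the numbers are made up purely for illustration; the idea is only that predictions which are numerically off can still pick the same shortlist as the measurements would.

```python
# Hypothetical candidates: (name, predicted_score, measured_score).
# Every number here is invented for illustration, not real data.
candidates = [
    ("A", 0.91, 0.80),
    ("B", 0.15, 0.22),
    ("C", 0.78, 0.95),
    ("D", 0.40, 0.35),
    ("E", 0.66, 0.60),
    ("F", 0.05, 0.10),
]

def top_k(cands, k, column):
    """Return the names of the k candidates with the highest value in `column`.

    column=1 ranks by predicted score, column=2 by measured score.
    """
    ranked = sorted(cands, key=lambda c: c[column], reverse=True)
    return [c[0] for c in ranked[:k]]

picked = top_k(candidates, 3, column=1)  # what the model would send to the lab
best = top_k(candidates, 3, column=2)    # what the lab would have wanted
overlap = len(set(picked) & set(best))

print(picked, best, overlap)
```

In this toy case no prediction matches its measured value, yet the predicted top three contains the same three candidates as the measured top three, so half the pool never needs to be made.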

Drug design, while already a beneficiary of compute, will see mixed results in my opinion over the next decade. For one thing, "drug design" is a catchall phrase that can include everything from basic protein-ligand structure prediction to toxicity prediction, with the latter being at the challenging end of the spectrum. Structure-based design will likely benefit from deep learning that learns basic intermolecular interactions which are transferable across target classes, so that models are less limited by the paucity of training data.

Areas like synthesis planning do contribute to drug design, but the real crux of successful drug design will be multiparameter optimization and SAR prediction, where an algorithm is able to successfully calculate multiple properties of interest like affinity, PK/PD and toxicity. PK/PD and toxicity are systemic effects that are complex and emergent, and I think the field will still not be able to make a significant dent in predicting idiosyncratic toxicity except for obvious cases. One area in which I see AI having a bigger impact is any field of drug discovery involving image recognition: for instance phenotypic screening, and perhaps the processing of images in cryo-EM and standard X-ray crystallography.
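As a rough illustration of what multiparameter optimization means in practice, here is a minimal Python sketch of one common style of scoring: map each property onto a 0 to 1 "desirability" and combine them with a geometric mean. The property names, the ranges and the two example compounds are all hypothetical, and this is just one simple way of combining properties, not a description of any particular tool.

```python
def desirability(value, low, high):
    """Map a value linearly onto [0, 1]: 0 at or below `low`, 1 at or above `high`."""
    frac = (value - low) / (high - low)
    return max(0.0, min(1.0, frac))

def mpo_score(compound):
    """Geometric mean of three desirabilities (assumed ranges, for illustration).

    A geometric mean means one very bad property drags the whole score down,
    no matter how good the others are.
    """
    d_affinity = desirability(compound["pIC50"], 5.0, 9.0)
    d_pk = desirability(compound["half_life_h"], 1.0, 12.0)
    # Toxicity risk is "lower is better", so invert it.
    d_tox = 1.0 - desirability(compound["tox_risk"], 0.0, 1.0)
    return (d_affinity * d_pk * d_tox) ** (1.0 / 3.0)

# Hypothetical compounds with made-up predicted properties.
potent_but_toxic = {"pIC50": 9.0, "half_life_h": 12.0, "tox_risk": 1.0}
balanced = {"pIC50": 7.0, "half_life_h": 6.5, "tox_risk": 0.5}

print(mpo_score(potent_but_toxic), mpo_score(balanced))
```

The potent but toxic compound scores zero while the unremarkable but balanced one scores about 0.5. The arithmetic is the easy part; the hard part, as argued above, is getting trustworthy PK/PD and toxicity numbers to feed into it.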

Finally, automation is one area where I do think AI will make substantial progress. This is partly due to more seamless integration of hardware and software and partly because of better data generation and recording that will enable machine learning and related models to improve. This development, combined with reaction planning that allows scientists to test multiple hypotheses, will in my opinion lead to automation making heavy inroads into the day-to-day work of chemists fairly soon.

Another area which I did not mention in the survey but which will impact all of the above areas is text mining. There, the problem is one of discovering relationships between different entities (drugs and potential targets, for instance) that are not novel per se but are simply hidden in a thicket of patents, papers and other text sources too large and tangled for humans to parse. Ideally, one would be able to combine text mining with intelligent natural language processing algorithms to enable new discovery through voice commands.
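The simplest possible version of this kind of relationship mining is counting how often a drug and a target are mentioned in the same sentence. Here is a minimal Python sketch; the entity lists and the four sentences are a tiny hand-written stand-in for a real corpus, and real systems would need proper entity recognition rather than exact word matching.

```python
from collections import Counter
from itertools import product

# Tiny hand-picked vocabularies, assumed for illustration only.
DRUGS = {"imatinib", "aspirin"}
TARGETS = {"abl", "cox-1", "kit"}

# Stand-in "corpus": sentences written for this example, not quoted from papers.
sentences = [
    "Imatinib inhibits ABL kinase activity in vitro.",
    "Aspirin irreversibly acetylates COX-1.",
    "Off-target profiling showed imatinib also binds KIT.",
    "Imatinib resistance arises from ABL mutations.",
]

def cooccurrences(texts):
    """Count (drug, target) pairs that appear together in the same sentence."""
    counts = Counter()
    for text in texts:
        # Lowercase each word and strip surrounding punctuation.
        tokens = {word.strip(".,;()").lower() for word in text.split()}
        for drug, target in product(DRUGS & tokens, TARGETS & tokens):
            counts[(drug, target)] += 1
    return counts

pairs = cooccurrences(sentences)
print(pairs.most_common())
```

Pairs that keep showing up together across thousands of documents become candidate relationships for a human to check, which is exactly the kind of digging that is tedious by hand.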
