Understanding biology in the age of Artificial Intelligence

Srijit Seal, PhD Student, Yusuf Hamied Department of Chemistry

13 November 2023

While it is impossible to pinpoint the exact time when our modern understanding of biology emerged, our interest in the natural world, from diseases to edible plant species, can be traced back to ancient times.

Machine learning and artificial intelligence methodologies have seen widespread adoption in modern biology, but the scientific rationale for applying these models remains underdeveloped. 

Many researchers apply AI as standard practice or to derive additional insights, with AI techniques often augmenting (and sometimes replacing) experiments. However, others feel that the effective use of AI in biology requires careful consideration of the theoretical and philosophical aspects of these models and techniques.

To understand the role of AI, and whether breakthroughs at the intersection of AI and biology are pushing us to revise our notion of scientific understanding, I helped organise a conference at the Department of Chemistry, University of Cambridge. It was funded by a grant from the Accelerate Program for Scientific Discovery and the Cambridge Centre for Data-Driven Discovery, with support from an organising committee that spanned four departments in the University: Philosophy, Pharmacology, Chemistry, and Computer Science and Technology.

The aim of the conference was to build a community of scholars and thought-leaders from artificial intelligence and machine learning, biology and chemistry, as well as philosophy, in order to address biological discovery from three angles: theoretical, scientific, and philosophical. 

Different perspectives 

Between 200 and 250 people attended to hear keynote speakers from different universities as well as from industry across the UK. 

Sarah Teichmann, Head of Cellular Genetics at the Wellcome Sanger Institute, University of Cambridge, explained how her lab uses single-cell genomics to try to understand cellular diversity in the human body – where 37 trillion cells with a remarkable array of specialised functions cooperate and collaborate to allow us to function – as well as how it goes wrong in disease. She detailed how her team uses machine learning and AI in their work, as well as how the synergy between ‘wet’ and ‘dry’ science is driving new discoveries in the cellular composition and tissue microenvironments of the human body.

Charlotte Deane, Professor of Structural Bioinformatics in the Department of Statistics at the University of Oxford, focused on how machine learning has shown great promise for increasing the speed and reducing the cost of developing new vaccines or biotherapeutics, a process that typically takes many years and requires over $1bn in investment. She described how her team is developing novel computational tools to allow accurate, rapid structure prediction, as well as to understand the diverse binding preferences between different types of immune receptor proteins.

Giving an industrial perspective, Pushmeet Kohli, Vice President of Research at DeepMind, talked about the potential of AI and ML to analyse and understand biological data and improve the ability of researchers to make predictions about the behaviour of complex biological systems. He gave attendees an insight into AlphaFold – the company’s AI system that predicts a protein’s 3D structure from its amino acid sequence – and the impact it is having on biological research.

Nicola Richmond, Vice President of Artificial Intelligence at BenevolentAI, explained how the company is integrating large and small language models into its processes, according to their unique strengths and weaknesses. For example, she explained that their experience suggests smaller, specialised language models are good for many natural language-based use cases to overcome issues with accuracy and latency, while large language models are well suited for binding specialised tools and data into a single cohesive workflow.

Emily Sullivan, Assistant Professor of Philosophy and Irène Curie Fellow at Eindhoven University of Technology, discussed how philosophy of science and epistemology can help researchers understand the potential and limits of machine learning used for science. She argued that ML models in science function in much the same way as highly idealised toy models, so thinking of ML models as toy models can help to shed light on the scope of ML’s potential for scientific understanding.

The conference concluded with a presentation by Andreas Bender, a Professor of Molecular Informatics in the Chemistry Department. He provided an overview of the latest progress in using AI for drug discovery and its future prospects, emphasising the data being produced and how AI can make the best use of it.

Collaborative outcomes 

While the academic keynote speakers undoubtedly provided a lot of food for thought, collaborating with industry also gave attendees an understanding of what is of practical relevance to scientific discoveries. 

The event was a first step towards building that community. Several organisers are preparing a manuscript, currently titled ‘Machine learning approaches for understanding biology’, to share lessons from the conference and delve deeper into the philosophy of AI and how it can enhance understanding in the field.

AI and ML are already facilitating biological discoveries. With further collaboration and understanding of the technology’s potential, this is just the beginning.