Accelerate Science Machine Learning Engineering Clinic

14 December 2023

The Accelerate Programme for Scientific Discovery aims to drive a step change in the use of AI for science across the University of Cambridge. To get there, the programme has been exploring ways of supporting researchers using AI, no matter their background.

In 2022, the programme started a Machine Learning Engineering Clinic as part of its support package, with clinic sessions convened both online and in departments around the university. The clinics are run by scientists and engineers from the Accelerate Science programme, with the goal of connecting researchers and AI experts. Through these clinics, we’ve answered over 80 support tickets across 35 university departments.

The clinic can help with a range of issues that researchers encounter in AI for science projects: from brainstorming which machine learning methods to use, through best practices around publishing software packages, to fine-tuning large language models. No question is too small - sometimes all you need is a short chat with one of our engineers to sanity-check your work. We’re even available to help with grant applications for research using machine learning.

A great example is an ongoing consultation with Chris Bannon, a researcher from the Wellcome-MRC Institute of Metabolic Science and a former ML Academy student. Chris has done an enormous amount of work applying machine learning to the study of gastrointestinal conditions and hormone levels using blood data. Our consultations have ranged from in-depth discussions about particular machine learning and evaluation methods to quick checks of algorithm implementations.

Chris says: “I have used the machine learning clinic 3 times over the autumn, and have found their input invaluable. As a clinical research associate, I am the only member of my team performing data science, and the machine learning clinic has not only helped me further my data analysis, but also develop my skills as a data scientist. I fully recommend their services, and have been very grateful for their input and assistance.”

On the other hand, if you have a bigger task, we’re able to dive in and help. Three of the bigger projects we’ve advised on are:

  • Building open source software: Formal thought disorder shows up in the way that people speak. They tend to jump between topics rather than create a coherent narrative. In this project, Caroline Nettekoven from the Department of Psychiatry developed a pipeline to show semantic connections in transcripts of people’s speech, which could then be analysed to compare patients with formal thought disorders against a control group from the general population. Caroline wrote and packaged this software with help from the Turing Institute and was aiming to release the software for general use alongside her paper. Over a period of 8 weeks, we helped shape the code so that it could be released on time.
  • Exploring machine learning models: Aditya Ravuri from the Department of Computer Science and Technology and Jen Muir from the ARU Behavioural Ecology Research Group developed a model to detect calls in coppery titi monkeys, a South American primate. Traditional human sound event detection models perform poorly on monkey calls, prompting Aditya and Jen to explore different models. Their aim is to predict both when a call occurs and what type of call it is. This should help researchers and zoo staff determine whether the monkeys are in distress. Jen and Aditya put in a tremendous amount of work to build a high-quality dataset and, with the help of the ML Clinic, were able to explore alternative models for detecting and classifying these calls.
  • Understanding new AI methods, such as Large Language Models: Words often carry multiple meanings, which can vary significantly based on the context in which they are used. For instance, the word “bat” can refer to a nocturnal flying mammal or a piece of sports equipment. This linguistic phenomenon presents a challenge: how can we effectively categorise words into hierarchies based on the ease with which their meanings can be discerned from context? Nina Haket, from the Department of Theoretical and Applied Linguistics, is using large language models to systematically organise words and better understand how they’re used in different contexts, with the help of the ML Clinic.

Nina says: “Ryan and the ML Clinic have been invaluable in the project. Since I had little experience with LLMs and ML, the task I wanted to do seemed daunting and I spent months trying to solve it on my own. With their help, we now have it up and running, and should get some interesting results soon.”

The variety of projects highlighted here reflects the broad range of assistance provided by the Machine Learning Engineering Clinic across the University of Cambridge. From health research to complex linguistic analysis, our team can offer practical AI and ML guidance.

We’ve worked with departments across the university to host in-person clinic sessions. To host one in your department, all we need from you is a room to hold the clinic in and some help advertising it to your colleagues. Get in touch with us to fix a date.

And if you have an ML problem that you’d like support with and can’t make it to a clinic, don’t worry. You can still file a support ticket with us and we’ll be in touch!

Whether it’s a short, one-time consultation or a longer-term collaborative effort, our goal is to make advanced AI tools more accessible and effective for all researchers at the university.