The views expressed in this blog are entirely my own and do not necessarily represent the views of the Department of Energy or the United States Government.
The labor structure of how we do science has not significantly changed for hundreds of years and has especially ossified in the post-WWII science funding structure. Mid-career PIs guide teams of young, generally poorly paid graduate students who conduct experiments. While the capital equipment we use, such as greatly enhanced microscopy techniques, has significantly improved, Science is still a labor-intensive process.
This is particularly true in the materials space, where strong design theories are lacking and experimental data collection is haphazard. While other fields have usefully accurate numerical models (e.g. finite-element models) or large experimental datasets and some degree of automation (e.g. drug discovery and biotech), the material sciences to a disproportionate degree consist of grad students doing laborious experiments semi-blindly.
Self-Driving Labs (SDLs) ride the trend of cheap robotics, increasingly effective AI models, and new material systems with combinatorially large design spaces. I’ve been involved with these systems for quite some time: when I was at Lawrence Berkeley National Lab, I helped build the AI models that would ultimately be used to successfully apply for ARPA-E funding to design a self-driving lab for optical metamaterial optimization (of course, we didn’t call them self-driving labs back then). They have also popped up in the ML4Sci newsletter on multiple occasions (for 3D printing and photocatalyst discovery).
Despite their potential importance as an emerging science discovery platform, SDLs have received little attention in science or technology innovation policymaking. This piece is my attempt to write an accessible introduction to SDLs: what they are, why they matter, and what policymakers should do about them.
What are Self-Driving Labs?
I define Self-Driving Labs as having the following attributes:
the automation of both material synthesis and characterization
some degree of intelligent, automated decision making in-the-loop (what people now call AI), allowing for autonomous experimental iterations between synthesis and characterization
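Taken together, these two attributes describe a closed loop: propose a candidate, synthesize it, characterize it, and feed the measurement back into the next proposal. Here is a minimal sketch of that loop in Python, under some loud assumptions. The functions `synthesize` and `characterize` are hypothetical stand-ins for robot and instrument drivers (here, a simulated one-dimensional recipe with a noisy measurement), and `propose` is a simple explore/exploit rule standing in for whatever decision-making model a real SDL would use. It is an illustration of the loop, not a description of any particular lab's software.

```python
import random


def synthesize(recipe: float) -> float:
    """Hypothetical stand-in for automated synthesis.

    A real SDL would drive robots here; in this sketch the 'sample' is just the recipe.
    """
    return recipe


def characterize(sample: float, rng: random.Random) -> float:
    """Hypothetical stand-in for automated characterization.

    Simulated property that peaks at recipe = 0.7, plus small measurement noise.
    """
    return -(sample - 0.7) ** 2 + rng.gauss(0.0, 0.001)


def propose(history, rng, explore_prob=0.2, step=0.1):
    """Choose the next recipe in [0, 1] from past (recipe, score) pairs."""
    if not history or rng.random() < explore_prob:
        return rng.random()  # explore: uniform random recipe
    best_recipe, _ = max(history, key=lambda pair: pair[1])
    candidate = best_recipe + rng.uniform(-step, step)  # exploit: perturb the best so far
    return min(1.0, max(0.0, candidate))  # clamp to the allowed recipe range


def run_campaign(n_experiments=200, seed=0):
    """Closed loop: propose -> synthesize -> characterize -> record, repeated."""
    rng = random.Random(seed)
    history = []
    for _ in range(n_experiments):
        recipe = propose(history, rng)
        sample = synthesize(recipe)
        score = characterize(sample, rng)
        history.append((recipe, score))
    return history


if __name__ == "__main__":
    history = run_campaign()
    best_recipe, best_score = max(history, key=lambda pair: pair[1])
    print(f"best recipe after {len(history)} experiments: {best_recipe:.3f}")
```

The point of the sketch is the structure: no human sits between `characterize` and the next `propose`. Swapping the perturbation rule for a Bayesian optimizer changes how candidates are chosen, not the shape of the loop.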
What are not Self-Driving Labs?
As always when it comes to exciting new terms and technologies, there are strong incentives to mislabel, and terms are often fuzzy. Based on the definition I provide above, here are common classes of experiments which are not SDLs. This is not to say they are inferior, just that they are not what we are talking about here.
high-throughput synthesis (lacks characterization and AI-in-the-loop)
using AI as a surrogate model, i.e. software-only results. Using AI-driven material predictions and then synthesizing an optimal material is also not an SDL, though certainly still quite the accomplishment for AI in Science! (lacks discovery of synthesis procedures and requires numerical models or prior existing data, neither of which is always readily available in the material sciences)
What are some examples of SDLs today?
Using a robot arm to test 3D-printed designs for uniaxial compression energy absorption. A Bayesian optimizer was used to iterate over 25,000 designs in a search space with trillions of possible candidates.
What are SDLs good for?
As with any new laboratory technique, SDLs are not an appropriate tool for everything! Given that their main benefit lies in automation and the ability to rapidly iterate through designs experimentally, they are best suited for:
material families with combinatorially large design spaces that lack clear design theories or numerical models (e.g. metal-organic frameworks, perovskites)
material families where synthesis and characterization are both relatively quick, cheap, and amenable to automated handling.
fields where numerical models (like DFT) are not accurate enough to use for training surrogate models, or where there is a lack of experimental data repositories.
What are the benefits of SDLs?
Similar to cloud labs, SDLs allow for a higher degree of reproducibility in material synthesis and characterization.
SDLs can also significantly improve labor productivity in science, allowing graduate students to focus on designing clever characterization experiments and understanding mechanistic theories from data, rather than spending large amounts of time on data collection.
Why do they matter and what should be done?
SDLs are an emerging technology platform that could revolutionize how we discover materials.
SDLs are an important evolution in the labor structure of Science and can significantly improve the labor productivity of academia, allowing graduate students to move away from menial synthesis towards advanced characterization techniques, elucidating mechanistic design, and discovering new principles based on large-scale, high-quality experimental data.
They are not a pipe dream; the biotech industry has spent decades developing advanced high-throughput synthesis and automation. In contrast, material science has historically seen fewer capital investments in automation, primarily because it sits further upstream from where private investments anticipate predictable returns.
In an era of strategic competition, maintaining US competitiveness in materials innovation is more important than ever, for applications like energy, computing, and aerospace.
Other nations are beginning to recognize the importance of these platforms: University of Toronto’s Alan Aspuru-Guzik, a former Harvard professor who left the US in 2018, has created an Acceleration Consortium to deploy these SDLs and recently received $200M in research funding, Canada’s largest ever research grant.
What could a US research agenda look like?
While there are several labs in the US working on SDLs, they have all received small, ad-hoc grants that are not coordinated in any way. As a result, the SDLs designed are constrained to low-hanging material systems (e.g. microfluidics), with the lack of capital hindering the ability of labs to scale these systems and realize their true potential. Here’s what a US research program for Self-Driving Labs could look like:
An ARPA-E Grand Challenge, similar to DARPA’s Grand Challenge that developed the US lead in self-driving cars, to demonstrate a clear state-of-the-art material discovery via SDLs. This grand challenge could invite teams to submit pre-registered material systems for exploration, identify the current state-of-the-art designs and the performance metric, and then fund meritorious applicants to develop SDLs that demonstrate new breakthrough designs and validate the potential of SDLs to supercharge materials innovation.
A Focused Research Organization (FRO) that develops modular, open-source hardware and firmware for Self-Driving Labs. Many of the SDLs today rely on customized equipment for their experiments, as current scientific characterization equipment is not designed to interface with a robot. Creating protocols for commoditized robots to interface with common synthesis and characterization lab equipment (or re-designing such equipment) can help accelerate adoption of SDLs. An FRO can also explore how automation changes the labor structure of scientific teams, shifting skills to focus on greater computational experience or theoretical understanding of scientific mechanisms.
If you work on Self-Driving Labs and have comments or feedback, or are interested in collaborating, please reach out! If you want to see more videos of robots doing science, check out this thread!