Introduction
In this episode, I sit down with Sergei V. Kalinin — chief scientist for AI in the physical sciences at Pacific Northwest National Lab and professor of materials science and engineering at the University of Tennessee, Knoxville — to explore how AI is transforming the microscope, one of the most powerful tools in science.
Kalinin shares his personal journey into the world of scanning probe and electron microscopy, and how his team is pioneering new approaches to high-dimensional scientific data. Below are three major takeaways based on our discussion:
AI is turning microscopes into atomic-scale fabricators
Kalinin highlights a foundational shift in microscopy, from capturing static images to enabling autonomous experimentation. His team uses machine learning to extract physical insights from complex, high-dimensional datasets generated by scanning probe and electron microscopes. Notably, these models can sometimes identify patterns and phenomena without predefined physical theories. This enables not only faster and more scalable interpretation of materials data but also repositions the microscope as a tool for fabrication and manipulation at the atomic scale, opening up possibilities for nanoscale manufacturing.
Limited data and human intuition keep scientists essential, despite AI advances
While AI can already outperform humans in routine optimization tasks in the lab, Kalinin recognizes that the bigger barrier lies in replacing human intuition, especially when it comes to experimental design and decision-making. Most scientific datasets, unlike commercial “big data,” are narrow in scope, sparsely populated, and often structured by physical constraints. This undermines off-the-shelf machine learning methods, which assume broad coverage and dense sampling. As a result, scientific ML must be deeply integrated with domain knowledge and often requires physics-informed architectures to be effective.
AI for science needs standards to scale
Despite early successes in using AI to automate microscopy tasks, the field lacks standard datasets, evaluation benchmarks, and interoperable workflows. Kalinin attributes this to fragmented scientific goals, limited sharing of proprietary data, and a still-developing culture around open-source tools. Community-building efforts (like the autonomous microscopy hackathon leveraging digital twins) are critical steps forward, helping create the foundation for Kalinin’s long-term vision of fully autonomous self-driving labs.
Transcript
How did you get into microscopy as a field of study?
Charles: Okay, awesome. Today we have Sergei Kalinin joining us. He's a professor at UT Knoxville and chief scientist for AI in the physical sciences at Pacific Northwest National Lab. Sergei, thanks for joining.
Sergei V. Kalinin: My pleasure. So happy to talk with you, Charles.
Charles: I'm really excited for today's conversation because I know people often think a lot about material synthesis or get excited about new material discoveries, but often forget about the importance of material characterization, which underpins a lot of these discoveries. So maybe it'd be helpful to hear first from you: how did you get into microscopy as a field of study? I know you mentioned before that it started in your undergrad.
Sergei V. Kalinin: Absolutely. Actually, it was kind of an interesting story, which started when I was an undergraduate student back at Moscow State University in Russia in the mid-90s.
So that was the time when the first materials science department at Moscow State was being designed, and of course nobody knew what materials science was. The idea was to take as much as possible of the inorganic chemistry and solid state electrochemistry, add condensed matter physics on top of it, some elements of coordination chemistry, mechanics, and a copious amount of mathematics, from calculus all the way to functional analysis, and that actually became the materials science program.
And the interesting part was that essentially from year one as an undergraduate, research was a big part of the program. There was an exam on the research projects, and you could pass it if and only if you also asked questions, not only presented your own results. It was an interesting organization. A lot of my colleagues are now in the US, and some of them stayed back in Russia. But one of the ideas discussed at that time was to have the students go on foreign exchange programs, and as it happened, I spent half a year at POSTECH in South Korea. The value proposition there was: hey, there is this new method called atomic force microscopy, do you want to figure out what it is and maybe get some hands-on practice with it?
So I spent half a year in Korea, and that was an interesting experience. I didn't get much hands-on practice. I did, however, read pretty much all the papers that had been published in this field up to that moment. The field was new, the number of groups was limited, and it sounded like a really great method to explore matter on the atomic scale. As a second side effect, I also got a black belt in taekwondo, because you have to do something in the evenings as well. Anyway, with that, I returned to Moscow and then it was time to look for a graduate program. At that time I was already looking for places where I could make my second attempt at learning how to use atomic force microscopy. And this is how I ended up in Dawn Bonnell's group at the University of Pennsylvania. From that moment on, for the last 25 plus years, my career was either scanning probe microscopies or electron microscopies or both.
When and how did you start incorporating AI into your microscopy work? (03:32)
Charles: That's awesome. There's a ton there, both on getting in early to a field and on the foreign exchange and talent discovery. I'm curious, did you have a similar moment when you started incorporating AI into your work? Was there a six-month or one-year period when you kind of had an aha moment of jumping into that field as well?
Sergei V. Kalinin: You know, interestingly enough, pretty much yes. When I was a graduate student, I quickly realized that the part of scanning probe microscopy I'm interested in is not the one where you get a lot of images; it's the one where you have data streaming from the instrument and you figure out what this data means in terms of materials properties. Just getting a picture, I mean, a picture looks awesome, but what you really want to do is measure the functional properties of materials. It's not that easy, because many scanning probe microscopy techniques have what's called topographic crosstalk, so the data is strongly biased by variations in the surface morphology. But there are several types of techniques that allow you to measure quantitative properties like ferroelectricity, photovoltaic responses, electrostriction, properties like that. And a lot of my work from 1998 till maybe 2005 was dedicated to learning what the contrast formation mechanisms in SPM are.

Then, once I got to Oak Ridge as a Wigner fellow, we started to work on expanding the modalities of SPM measurements. A classical image is two dimensional, a classical spectrum is one dimensional, and a spectral image is three dimensional. It turned out that for certain modalities of SPM we can make them quantitative, and because they're quantitative we can build more complex spectroscopies on top of them. So around 2005-2007 we started to get three-, four-, even five-dimensional datasets. Once you get this much data, you basically have to figure out what it means. Typically we do that like this: we have data, we have a physical model, we dump the data into the model, and we want to learn something about the material from the data. But for many things we were doing there was no model. So we had to figure out ways to experiment with the data and get some insights from large-dimensional data sets. And that's exactly the proverbial large data problem.

We started with simple multivariate statistics. At that time that was not particularly easy, because even though it was just 20 years ago, you could not analyze the data set on your desktop; it was way too large for that. But we were able to make things work. Then I started to look at neural networks. At that time there was the book by Haykin on neural networks, and MATLAB had some simple neural network modules. Basically, we started to experiment. For example, in 2008-2009 we had a set of papers where we applied multivariate statistics to data, where we created a simple model, trained a neural network using this model, and then applied it to the experimental datasets. It didn't work particularly well, but it's the thought that counts, and we were able to actually publish it, so the record is there. We also did some interesting things where we took images of bacteria and trained our neural network to recognize the bacteria based on their spectra; we didn't understand what the spectra meant, but the spectra ended up being a very reliable fingerprint of the bacteria type. So those were typical examples of unsupervised, supervised, and theory-informed machine learning methods. If you look back in time, it looks like in 2009 we actually reported, almost within a year, some of the things that later grew into much more interesting areas.
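To make the multivariate-statistics step above concrete, here is a minimal sketch assuming a generic hyperspectral dataset; the array shapes, the random data, and the use of PCA are illustrative choices, not the group's actual pipeline.

```python
# Minimal sketch of multivariate analysis of a spectral image: unfold the
# spatial dimensions, extract a few dominant components, refold the scores
# into maps. Shapes and names are illustrative placeholders.
import numpy as np
from sklearn.decomposition import PCA

# Hypothetical spectral image: 64 x 64 spatial grid, 256-point spectrum per pixel
rng = np.random.default_rng(0)
data = rng.normal(size=(64, 64, 256))

nx, ny, ns = data.shape
unfolded = data.reshape(nx * ny, ns)       # (pixels, spectrum length)

pca = PCA(n_components=4)                  # keep a handful of components
scores = pca.fit_transform(unfolded)       # per-pixel weights of each component
components = pca.components_               # (4, spectrum length) endmember-like spectra

loading_maps = scores.reshape(nx, ny, -1)  # spatial maps of each component
print(pca.explained_variance_ratio_)
```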
At the beginning, it was not something we did because we believed there was a great vision behind it; it was a typical Friday night project. Basically, for about five years, for about a day a week, we experimented with neural network concepts. Most of that never went anywhere other than into my stack of documentation of things that didn't quite work out. But there is a considerable amount of publications on this early use of shallow networks for data analysis. Obviously, the early criticism at that time was that machine learning doesn't really relate to the physics of the system, and that's absolutely true. So we started to dig into the meaning of the neural network data analysis in terms of what it tells us about the physics of the system. And that was about the time when deep learning appeared, and we started to realize that it's not a hobby, it's actually a career choice.
Is there still more data than compute in microscopy today? (09:00)
Charles: Yeah, that's awesome that even in 2009 at Oak Ridge National Lab, you guys were spending Fridays and weekends trying to use neural networks on the huge amounts of data you were getting from microscopes. I'm actually curious, before we jump into the main bulk of the conversation on microscopy and the work you're doing there today: it seems to me like there's always been more data than compute when it comes to microscopy, and even now, with better instruments, we're still generating even larger amounts of data. Do you feel like that trend is still true today? And do you feel like that trend is going to continue?
Sergei V. Kalinin: That's actually not necessarily true. You see, the whole definition of big data is a little bit elusive, because big data doesn't mean that you have large data volumes. Generally, in the machine learning community, big data refers to data that covers or extensively samples a certain parameter space. So saying we have big data means that we know a lot about some specific system. We have big data about cats on the internet, right? Because everybody likes to post cats on Facebook or other social media. For materials science, very often we can have large volumes of data, especially in electron microscopy, but strictly speaking it's very high dimensional data, not big data, because we probe just one specific material. And that actually makes machine learning so difficult to use in science, because we can use ML primarily in scenarios where there is a lot of data. A lot of data means that the system is already well explored and the field is mature. Science by definition is going where no one has gone before and exploring new things. If we do that, we really don't have big data; we just have some ideas, some hypotheses of what may be happening there. There are ways to use machine learning in these scenarios, but these are very complex forms of machine learning that heavily rely on prior physical knowledge. Essentially, we use ML to formulate some hypotheses, and then our experiments become a way to falsify those hypotheses. But the classical paradigm of let's get a lot of data and it will tell us everything really doesn't work in science.
What's involved in the calibration and operation of microscopes? (11:21)
Charles: Yeah, with that, maybe we should start and motivate some of this problem, right? So we have over a hundred thousand electron microscopes in the world. Every single time a grad student wants to use one, they have to take some amount of time to calibrate and get the proper measurement. Roughly how long does that calibration take? Give us a sense of the dimensionality of the problem, the number of knobs and parameters you could tune that are required to get a correct image resolution.
Sergei V. Kalinin: So let's start with the scanning probe microscope, because that one is relatively simpler. The electron microscope is a very complex device: the electronics are really complicated, and it has multiple correctors, detectors, and so on and so forth. About 30 years ago, I think, Brown said that the electron microscope is essentially a small synchrotron, and it's true: the type of information we can get from electron microscopy is essentially equivalent to a synchrotron, except that it is basically done in a single instrument. Scanning probe microscopy, comparatively, is much, much simpler. It is essentially just a probe that interacts with the surface. We can operate this probe in the static regime, where we just measure forces, or we can vibrate the probe.
So the physics of the imaging process is relatively simple: it's just an oscillator in some force field, and what we want to do is learn the parameters of this force field. Now imagine you have a graduate student or a postdoc or a scientist sitting in front of the instrument. What you start with is actually taking a sample.
Notably, the sample doesn't come out of nowhere. Usually there is a lot of thinking that goes into figuring out what sample you want to look at, and there is a lot of work involved in the sample preparation, which takes a lot of time and effort, but let's assume we already have it. The first thing you are going to do is tune your microscope. You need to find the optimal imaging parameters. It will involve a little bit of struggle figuring out whether the probe is good, how it behaves away from the surface, what the surface scanning looks like. You start with regular topographic imaging, then you start to optimize additional information channels like potential or electromechanical response. Then, if you want to do quantitative measurements of some spectroscopic responses, you start to optimize the spectroscopy. As I mentioned, scanning probe microscopy is relatively simpler than electron microscopy and much, much cheaper, at least if we compare ambient systems to UHV microscopes. But there is one caveat. If you have the electron beam in the electron microscope, it can in principle create a hole in the material, so that's your beam damage, but you don't expect the electron beam itself to change. In scanning probe microscopy, it is relatively straightforward to damage or destroy your tip, and you can do it even if you just do topographic imaging; you can do it if you apply high biases or high forces. So a lot of the work of the microscopist is carefully tuning the system, optimizing the scanning conditions, and doing it very, very carefully in order to avoid tip damage. If that happens, you have to change your tip, you have to go to a different part of your sample.
It's much worse for the ultra high vacuum machines. The folks who work on scanning tunneling microscopes are ultra conservative, because if there is considerable damage to the tip, changing it takes several days. So nobody wants to do that. You always spend time and effort to carefully condition your tip; you can improve it a little bit. You want to stay in the regime where damage is minimal or not happening at all, and if you are good at it, you can use the same tip in STM for multiple months or years. For AFM, changing the tip takes half an hour, so essentially it's something you use as a consumable. Still, it takes time and effort. Anyway, the end result is that you tune your system, you have good scanning, you have good spectroscopy. Then the exploration of the system starts. You look at some part of the sample surface, you zoom in, you may decide to take a spectroscopy data set, you move to a different part of the surface. And what's remarkable about it is that the operations you execute on the instrument are usually very straightforward: tuning, zooming, taking images, taking spectra, adjusting the hyperparameters of the spectroscopy. But you execute these decisions hundreds of times during a day on the microscope. It's literally almost the same operations that you do many, many times in a row.
And it turns out that the fascinating part about this whole process is the decision-making logic that goes on in the mind of the operator. You start with optimization, then you go on to exploring the statistically significant features in the image. You see a lot of objects of a certain type, so you explore them to get a feeling for whether they're representative and what their properties are. Then you typically start to explore anomalies: you see some weird features and you want to have an idea of what they are. Or you may get to the point where you see interesting objects and you have a hypothesis about why these particular objects are interesting for you, what they tell you about the physics of the material, how they can help you improve the material. So the language in which you can express your actions, the hyper-language if you will, is usually very, very simple, but the thinking and the logic that go into the decision making run from very simple optimization all the way to fairly complex hypothesis testing. It's a really fascinating process, and interestingly, it applies not only to microscopy; it actually applies to much more complex systems such as self-driving labs, synthesis robots, and so on and so forth.
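As a rough illustration of the point above, that the action vocabulary is tiny while the decision logic carries all the interest, here is a toy sketch; the action names and the random placeholder policy are hypothetical stand-ins for the operator's (or an agent's) reasoning.

```python
# Toy sketch of the operator loop: a small, fixed vocabulary of primitive
# operations executed many times under some decision policy. The action
# names and the random policy are hypothetical placeholders.
import random
from enum import Enum, auto

class Action(Enum):
    TUNE = auto()       # optimize imaging parameters
    ZOOM = auto()       # zoom into a region of interest
    IMAGE = auto()      # acquire an image
    SPECTRUM = auto()   # take a point spectrum
    MOVE = auto()       # move to a different part of the sample

def choose_action(history):
    # Placeholder for the interesting part: the logic that moves from
    # optimization, to representative features, to anomalies, to hypotheses.
    return random.choice(list(Action))

history = []
for step in range(10):  # a real session executes hundreds of these decisions
    action = choose_action(history)
    history.append(action)
    print(f"step {step}: {action.name}")
```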
Are there standardized benchmarks for evaluating AI models on microscopy tasks? (17:47)
Charles: So you mentioned that there's this large amount of calibration that's required. And I think you've also mentioned previously that you expect an AI to eventually be about as good as an average human operator at one of these microscopes. It also sounds like there's significant downside risk, right? You want to avoid damaging one of the tips, particularly in the more expensive machines. Has anyone created a benchmark, a standardized set of basic samples, probably on a cheaper AFM, and a set of workflows that someone goes through, and then compared different AI models and their abilities? Because that's usually one of the key drivers of progress within any domain, right, having a benchmark dataset that people can compare against.
Sergei V. Kalinin: That's an excellent question. This is where we are going, but that's not where we are for the time being. In classical microscopy, we always have calibration samples, right? That's a typical example of a physical benchmark. We can use the calibration sample to establish the spatial resolution, to establish the energy resolution, to calibrate the signal in terms of relative signal strengths. These are best practices in pretty much any microscopy or characterization or physical measurement field, for that matter. They're well established, and obviously they go back tens and hundreds of years.
Once it comes to the human operating the microscope, this is not a mature field. For example, for many workflows that my group is developing, we have a strong feeling that we are somewhere around a factor of 10 to 30 compared to human performance, based on how long the data set takes, what our acceleration factors are, and so on and so forth. This, however, is not a benchmark yet, because if my group works on the development of the autonomous workflows and my group also works on the development of the benchmarks, we want to be really careful about evaluating our own workflows with our own benchmarks. The hope is that once these workflows become more popular and are deployed on more than a few instruments worldwide, there will be a community drive to define the proper benchmarks. Interestingly, the metrics for it can be created. For example, if we consider something like learning structure-property relationships, we can always acquire the full data set, then use the machine learning algorithm and see how fast we can learn and how rapidly we can approach the full knowledge. Sometimes this type of benchmark can be very frightening if you think about it. For example, if I want to learn a structure-property relationship, meaning build a predictive relationship between the local structure and, let's say, the local hysteresis loop slope, sometimes it turns out that I can learn it in 30 steps for a data set where the ground truth is 10,000 points. If you take the numbers at face value, you get acceleration by more than a factor of 30 for measurements that can typically take an hour. If you scale it, that suggests the potential of machine learning to operate with large data volumes and high-dimensional data sets is absolutely amazing. We can increase the efficiency of instrument use by at least a factor of 10, and for sufficiently complex problems probably by a factor of 100. So there is really disruptive potential here.
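One way the benchmark metric described above could be computed, as a hedged sketch: run an active-learning loop against a densely measured ground truth and track how quickly the reconstruction error falls with the number of measurements. The 1D toy property map, the Gaussian-process model, and the uncertainty-based sampling rule are illustrative assumptions, not the group's actual workflow.

```python
# Sketch of the "compare against the full data set" benchmark: actively sample
# a toy property map and report the error versus the dense ground truth.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

rng = np.random.default_rng(1)
X = np.linspace(0, 1, 500).reshape(-1, 1)               # stand-in for the dense ground-truth grid
y = np.sin(8 * X[:, 0]) + 0.05 * rng.normal(size=500)   # stand-in "property map"

measured = [0, 250, 499]                                 # seed measurements
gp = GaussianProcessRegressor(kernel=RBF(length_scale=0.1), alpha=1e-3)

for step in range(30):                                   # budget of 30 active steps
    gp.fit(X[measured], y[measured])
    mean, std = gp.predict(X, return_std=True)
    rmse = np.sqrt(np.mean((mean - y) ** 2))
    print(f"step {step:2d}: {len(measured):3d} points measured, RMSE {rmse:.3f}")
    nxt = int(np.argmax(std))                            # query the most uncertain location next
    if nxt not in measured:
        measured.append(nxt)
```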
Are we seeing AI being rolled out in national lab facilities? (21:25)
Charles: That's awesome. You worked at Oak Ridge National Lab previously, and the national labs have these large user facilities, including microscopy and characterization equipment. Are we starting to see AI being rolled out to these systems, and what could the potential efficiency improvement be? I mean, you're talking 10x, 100x; just give us a sense of how many users, how many minutes of productivity we're saving here.
Sergei V. Kalinin: So this is a very good question, and it's fairly difficult to answer. The reason is that we never use microscopy just for the sake of microscopy, right? We typically use microscopy to solve a specific materials science problem. So even if you have the same microscope and the same sample, if the person operating the microscope has different scientific drivers, they will structure their experiment very differently.
For example, if you have a sample of some material and you want to understand its mechanical properties, you would look at the dislocations. And if you want to understand corrosion, you may look at what happens at the grain boundaries, even though it is the same microscope and the same sample. So if we can learn structure-property relationships faster by a considerable factor, that accelerates one step of this workflow.
However, there are multiple other steps. First of all, we need to make the sample and put the sample in the microscope, which is actually a type of operation that takes a lot of time and requires human work, because it's a kind of fine motor skill. We always have to deal with things like Moravec's paradox: there are operations that humans are simply much better at than robots. Secondly, if I can do my task faster, that's great, but what really matters is the downstream impact.
And the biggest problem that we face in the microscopy community now is that we very often consider microscopy to be the omega of scientific discovery. We take the microscopic data set and we put it in the paper. We say that this data set is consistent with this hypothesis, it's not consistent with that hypothesis. So very often, getting the microscopic data is where the experiment stops.
It doesn't have to be this way. There are examples of fields like cryo-EM where the microscopy workflows connect directly to molecular discovery, drug discovery, and biology, and this is a field that is now growing exponentially. We have electron crystallography: it's very difficult to grow crystals of many biological molecules that are large enough for X-ray scattering, but pretty much all of them form crystals in the nanometer range, too small for X-rays but large enough for electrons. These are examples where the microscopy has a downstream application, and once any field starts to have a downstream application, the field starts to integrate and then develops much, much faster. So the key type of challenge that my group is working on now is connecting the microscopy to the downstream in terms of physics and materials discovery. A typical example here is the exploration of combinatorial libraries, right? Combinatorial libraries have been around for almost 50 years. There were several waves of interest in combi libraries, first in the 60s, then during the time when the superconductors appeared. If you take the MRS proceedings, the famous blue books from the year 2000, there are papers on machine learning and combinatorial libraries that sound very reasonable even by modern standards.
So why did combinatorial libraries go up and down? Very simple. It's great to synthesize a material that represents a cross section of the phase diagram: we get multiple compositions at the price of a slightly more complex experiment than making a simple sample. But unless we characterize it and learn what the physical properties are across this binary or ternary cross section, the gain is minimal.
If it takes us half a day to make a sample and three days to characterize the sample, then accelerating sample production to infinity really doesn't accelerate our total knowledge process at all, because the characterization is the bottleneck.
It turns out that scanning probe microscopy is a great way to study certain types of properties across these combinatorial spaces, for example ferroelectricity, photovoltaic responses, certain types of mechanical responses. And when I was envisioning my group at UT, the choice of instrument that I made was exactly the instrument for studying combinatorial libraries. We started with relatively simple binary libraries of bismuth ferrite, a kind of calibration sample that we use thanks to a collaboration with Ichiro Takeuchi's group at the University of Maryland. He is a person who has done combi research for about 25 years, or actually 30 years by now. And by now there is, it's not yet widely visible, but there is a renaissance of combinatorial research. A lot of our colleagues at NREL, the University of Colorado, Penn State, and the University of Maryland are starting to work on combinatorial research again. And the reason why now is the time, and if you will, the third time is the charm, is because now we can characterize these combinatorial libraries, and scanning probe will play a key role in that.
Could future AI systems allow users to prompt microscopes in natural language to calibrate and measure specific properties? (27:35)
Charles: Yeah, I think that's a great point about how for self-driving labs we need both of these loops, right, synthesis and characterization, at the same pace. Before we talk more about that, which I do want to get to: you mentioned earlier that the challenge is that microscopy isn't done in a vacuum, right? There is a wide diversity of tasks it's used for downstream. Can we imagine a future where we have unsupervised AI agents that are helping auto-calibrate microscopes? You can imagine that if you just provide a natural language prompt, I want to look at the grain boundaries here for this kind of property, the microscope can automatically calibrate for that. Because a lot of the work right now seems very material specific: we were able to use this kind of microscopy system to automatically characterize this kind of material property or set. Is that kind of a pipe dream, or is that something we can expect?
Sergei V. Kalinin: That's definitely something we can expect. So one of the difficulties here is that traditionally funding agencies like DOE or NSF put emphasis on the hypothesis-driven science, which of course is super important because science by definition is discovering new things.
There was a general belief that development of new methods is a little bit more applied, so DOE at some point supported it as a part of the user facilities. NSF seems to be supporting it through the TIP Directorate, but when everything is said and done, technique development falls into this gap between what academic research should do and what industrial research should do.
The most important thing about research, though, is that 95%, if not 99%, of what we are doing in the labs is actually optimization. It starts with optimizing how we do the paperwork or how we arrange the workbench. It continues with how we optimize the imaging if we run a microscope, and it includes things like optimizing the growth conditions. So yes, we do new science, but we never do it by boldly going where no one has gone before in a straight line.
We always get there through multiple connected optimization loops. And it turns out that I am personally very skeptical about the potential of current machine learning, whether it is large language models or anything else, to create fundamentally new knowledge. However, I am absolutely confident about the potential of these models to integrate prior human knowledge and in their capability to run effective optimization workflows. That practically means that if we learn how to cast the everyday activities that we do in the lab as optimization problems, this is where the true impact of machine learning will be. It can save us 90% or 95% of our effort, depending on the goal. The role of the large language models in this case is actually fairly subtle, because in order to optimize something, we need to know what to optimize. For example, if I want to build a microscope and improve the resolution in atomically resolved imaging, this is relatively straightforward, because if you see atoms, it is very easy to define a self-consistent measure of the resolution. If you see objects on the mesoscale, it's actually fairly difficult to define a measure of quality. Somehow, humans can look at an image and say that this image looks good and this image does not look good, but for a machine learning algorithm to do the same thing is actually very, very difficult.
So one of the things that my group has been working on is trying to first systematize what are called the human heuristics, in order to learn these measures of performance. And now we are starting collaborations where we use large language models to help us select these optimization targets. Now, you may ask a question: okay, this guy has of course been doing machine learning for experiments for 20 years, but obviously he is not working at Google or Amazon and he is not a CS major, so what does he know about machine learning? Well, there is a great book by Peter Norvig, who obviously knows everything about machine learning because he essentially wrote the Bible of it, and this book, which was published in 2020, so it's very recent, starts its introduction with a very memorable statement: until now, virtually all developments in machine learning assume that there is a clearly known measure of optimization, an objective function or reward. In other words, if we can formulate what it is that we want to optimize, then our machine learning colleagues can do it. But if we cannot do that, then it's not going to work. So in some sense, in order to make machine learning useful, our key task is to figure out what it is that we want to accomplish. What is the reward?
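As a minimal illustration of casting image quality as an explicit reward, here is one plausible, assumed choice for atomically resolved images: use the fraction of Fourier-space power that sits in the lattice peaks rather than in the slowly varying background. This is a sketch of the general idea, not a metric the group necessarily uses.

```python
# Minimal illustration of an explicit image-quality reward. For an atomically
# resolved image, one plausible (assumed) reward is the fraction of Fourier-space
# power away from the central low-frequency lobe: strong lattice peaks relative
# to a slowly varying background give a higher score.
import numpy as np

def lattice_peak_reward(image: np.ndarray, low_freq_cut: int = 3) -> float:
    """Reward = spectral power outside the central lobe / total power (a proxy metric)."""
    spectrum = np.abs(np.fft.fftshift(np.fft.fft2(image))) ** 2
    total = spectrum.sum()
    cy, cx = np.array(spectrum.shape) // 2
    # Zero out the central low-frequency block (slowly varying background)
    spectrum[cy - low_freq_cut: cy + low_freq_cut + 1,
             cx - low_freq_cut: cx + low_freq_cut + 1] = 0.0
    return float(spectrum.sum() / total)

# Synthetic "atomic lattice" vs. the same lattice swamped by a smooth background
x, y = np.meshgrid(np.linspace(0, 8 * np.pi, 128), np.linspace(0, 8 * np.pi, 128))
lattice = np.cos(x) * np.cos(y)
with_background = lattice + 5.0 * np.exp(-((x - 12) ** 2 + (y - 12) ** 2) / 50)

print("clean lattice reward   :", round(lattice_peak_reward(lattice), 3))
print("with background reward :", round(lattice_peak_reward(with_background), 3))
```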
How did the autonomous microscopy hackathon go, and what was the structure? (32:59)
Charles: Yeah, no, that's great. And I think it's really come out throughout this conversation that microscopy is not just an independent task, but a series of kind of embedded workflows that are somewhat hard to make legible. And that sounds like the work that you guys are setting out to do. Maybe on that point, you ran an autonomous microscopy hackathon recently. How did that go?
Sergei V. Kalinin: Oh, that went amazingly. It was an interesting exercise, because hackathons are a well-known method of developing new things in the machine learning community. Several of our colleagues at Argonne and the Acceleration Consortium in Canada have run hackathons on Bayesian optimization and large language models, and those were big successes. If I'm not mistaken, the one on Bayesian optimization attracted 34 groups, which is an amazing result. And then our thought was that maybe this is an opportunity for us to make an impact in the microscopy field.
So why is it difficult? It's difficult because microscopy traditionally has a culture of collaboration at the level of modeling and software development, but it was usually assumed that once you get the data, your data makes your career. Some communities are built around the idea of sharing data; some communities are at a stage where data sharing is still not a thing. The microscopy community doesn't have a tradition of data sharing.
At the same time, over the last maybe eight years, there has been very strong interest from the microscopy community in using machine learning methods, and a lot of groups have been involved in developing their own solutions. Very often, you see the same technological solution developed over and over and over again. Clearly, this is not the best way to use people's time and effort.
So we decided that maybe a hackathon is a way we can at least make a step towards integrating the community. And of course, our thought was to make it interesting, right? Typically, hackathons are structured around specific tasks or standard data sets. We decided that the future is active learning on the microscopes, and therefore, in addition to having the static data sets, we were going to provide digital twins of the microscopes that the hackathon participants could interact with as if they were microscopes. Obviously, digital twins are much simpler than the microscope, but if you don't know what's inside, I mean, a black box is a black box, right? Even if the content is simple, figuring out what it is is still a considerable challenge.
So the idea was to create this hackathon as a way to integrate the community around it. There are some problems that we're interested in that we don't know how to solve, and there is the expectation that people will come up with their own data sets and then we can study them. It was an interesting process. The preparation took about six months; it's almost like an iceberg, you need to build the foundation that's not visible. Once we put the announcement out, about 300 people expressed interest. Obviously, we didn't expect that there would be 300 participants. Typically, when we run machine learning schools, there can be half a thousand registrants, but the school will still attract about 100 people, so the show-up rate is about one out of five. With the hackathon, as you can imagine, when we started it we didn't know what to expect, because we had several teams and didn't know how many of them would actually show up. To our surprise, there were 80 people: people in Europe, a few people in Japan, people in Brazil, a lot of groups all over the US. So we ended up with essentially 80 people, with teams formed from all over the world. My colleagues at UT did an absolutely amazing job of helping people self-organize and build teams. My colleagues at Oak Ridge, especially Rama Vasudevan, did absolutely astounding work preparing for the hackathon by building the digital twins and the data sets. In the end, we had 20 submissions, which is about half of the Bayesian optimization one and about twice more than we expected, even in our best-case estimates. We had about 80 people involved, and the submissions were absolutely amazing, both in terms of the science they developed and in terms of the elegance of the solutions. So all I can say is, we are doing it again.
How were digital twins used in the hackathon, and how realistic are they for training AI models? (38:12)
Charles: Awesome, folks will have to stay tuned. So you mentioned these digital twins that you provided. I mean, for teams across the country, were they using their own microscopes or were they working with the digital twin? How did that work?
Sergei V. Kalinin: So you see, the digital twin in this case was a very, very simple emulator of the microscope. By now, we normally don't have to be in the same room as the microscope in order to run it; we can connect to it remotely. That has been a capability for quite a while, but the really big step forward in using this remote connection came during the COVID time, for very obvious reasons. And once we work on autonomous workflows, we can deploy the workflow on a pre-acquired data set, right? We can deploy it on the instrument, though obviously it's much higher risk to deploy it on the instrument. But what we can also do is take a pre-acquired data set and serve it as if it were a microscope. We informally call it the data regurgitator: all it can do is give out the stored data.
Or we can take the pre-acquired data set or a simulated data set and build a simple emulator of the microscope. For example, for scanning probe microscopy, the emulator can be a simple PID loop. If you take the PID loop and you add some tip shape function that can change as a function of time, you already have a system that's very rough but actually has a lot of the properties of a real microscope.
So now imagine that you build a machine learning algorithm that can control the instrument through some set of control knobs. Interestingly enough, the algorithm doesn't know what is inside the black box that represents the instrument; it can be the real microscope or it can be a very crude emulator. And it turns out that you can get 90% of the useful outcomes if you run your machine learning algorithm even on a very simple black box, because if you are learning something unknown, you don't know in advance whether it is something simple or something complex. Obviously, the real instruments show many more edge-case behaviors; it's roughly the same difference as building a primitive emulator of a self-driving car for simplified conditions versus real conditions. But you see, the level of risk in autonomous microscopy is much, much lower than in an autonomous car. So it's actually super useful.
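To make the emulator idea concrete, here is a toy sketch, assuming nothing about the actual hackathon twins: a "black box" that hides a PID feedback loop tracking a synthetic surface, with a tip that slowly blunts over time. The gains, the wear model, and the surface function are all illustrative.

```python
# Toy black box in the spirit described above: the controlling algorithm only
# calls measure() and cannot tell whether a real instrument, a stored data set,
# or this crude emulator sits behind it. All parameters are illustrative.
import numpy as np

class ToySPMEmulator:
    """Crude scanning-probe emulator: a PID loop tracking a synthetic surface,
    with a tip that slowly blunts (loses fine detail) over time."""

    def __init__(self, kp=0.5, ki=0.05, kd=0.1):
        self.kp, self.ki, self.kd = kp, ki, kd
        self.z = 0.0              # current feedback-tracked height
        self.integral = 0.0
        self.prev_error = 0.0
        self.scan_count = 0       # proxy for cumulative tip wear

    def measure(self, x):
        """Return the feedback-tracked height at position x."""
        self.scan_count += 1
        blur = 1.0 / (1.0 + 0.001 * self.scan_count)             # blunter tip, weaker fine detail
        target = np.sin(3 * x) + blur * 0.2 * np.sin(17 * x)     # synthetic topography
        error = target - self.z
        self.integral += error
        derivative = error - self.prev_error
        self.prev_error = error
        self.z += self.kp * error + self.ki * self.integral + self.kd * derivative
        return self.z

# Any control algorithm just calls measure(); it never sees what is inside.
twin = ToySPMEmulator()
line_scan = [twin.measure(x) for x in np.linspace(0, 2 * np.pi, 16)]
print(np.round(line_scan, 3))
```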
Can microscopes actually be used for material synthesis, and what is the future of synthesis-informed microscopy? (40:56)
Charles: Right, okay, that's awesome. I hadn't realized that we could do emulators for microscopes now. You mentioned, I think previously on one of your posts, that we can also use microscopes for synthesis, that by using the tip of the microscope you can induce phase changes in materials. Could you talk a little bit more about that? I mean, is there a future where autonomous microscopes are actually involved in the synthesis of materials?
Sergei V. Kalinin: So it's a super interesting topic, and again, there are several aspects to it. It is true that most of the time we see the microscope as an observational system.
But we also know that both scanning probe microscopes and electron microscopes can in fact change the material. The second thing we can do is connect the microscope to a synthesis system upstream. For example, if we have a combinatorial library, that means we have already fabricated all the samples over some binary or ternary cross-section. But there is no reason why the microscope cannot issue a command to a synthesis robot to synthesize a different sample. This makes sense to do if and only if the synthesis cycle is fast and comparable in length to the measurement cycle, but that's doable. For example, we are currently working with the group of Professor Mahshid Ahmadi at the University of Tennessee to essentially realize a scheme where the microscope measurements inform the synthesis robot of what material to make next, in the form of a droplet. The reason why it makes sense is that the droplet libraries that can be made are usually fairly rough, and it is much easier to use the microscope measurement to tell the synthesis robot what to do next than to make multiple things and stick them all under the microscope. So there is a reasonable, roughly factor-of-two expected increase in efficiency if, rather than exploring droplet libraries, we use the microscope to inform the next synthesis step.
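A minimal sketch of the measure-then-synthesize loop described above; the robot and microscope stand-in functions, the single composition variable, and the random suggestion rule are hypothetical placeholders meant only to show the loop structure, not the actual workflow with the Ahmadi group.

```python
# Loop structure only: synthesize a droplet, measure it, decide what to make next.
# All functions and the decision rule are hypothetical stand-ins.
import random

def synthesize_droplet(composition: float) -> float:
    """Stand-in for the synthesis robot; returns a handle to the new sample."""
    return composition

def measure_property(sample: float) -> float:
    """Stand-in for the microscope measurement; hidden optimum near 0.63."""
    return -(sample - 0.63) ** 2 + random.gauss(0, 0.01)

best, best_value = None, float("-inf")
candidates = [i / 20 for i in range(21)]        # coarse composition grid

for step in range(8):
    composition = random.choice(candidates)     # placeholder for a smarter suggestion rule
    sample = synthesize_droplet(composition)
    value = measure_property(sample)
    if value > best_value:
        best, best_value = composition, value
    print(f"step {step}: composition {composition:.2f}, response {value:.3f}")

print("best composition so far:", best)
```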
So that being said, we also can modify material under the microscope. So for scanning probe there are several channels of material modification. We can apply bias and induce electrochemical reactions. We can apply force and cause some form of mechanical degradation.
For electrochemical reactions, there are readily available observables: for example, we can measure currents, we can measure electromechanical responses. Unfortunately, we cannot measure chemistry using the scanning probe microscope very well, but there are systems that combine SPM with SEM, which has EDS detection. There are systems that combine SPM with ToF-SIMS, like the system run by my colleague Anton Ievlev, and systems that combine AFM with essentially nano-IR, which gives you some vibrational spectroscopy. As you can imagine, if I had to write my Christmas wish list, such a system would be close to the top of it, because we of course can access some of them at CNMS, but there is a big difference in what you can develop if the tool is in your own lab versus part of a user facility, meaning you can build serious things on top of it.
But the ultimate vision here is that we should be able to essentially go through the electrochemical transformation cycle for materials such as batteries and fuel cells and detect the associated chemical and volumetric changes. The reason why it is exciting is that the volume of material that we ultimately explore is on the level of nanometers: it's not even micrograms, it's well below nanograms. And if you can explore the electrochemistry of tens-of-nanometer volumes over short amounts of time, that's a transformative tool for how rapidly you can discover new materials for energy storage and conversion.
What are the capabilities of electron microscopes in manipulating matter at the atomic scale? (45:17)
With the electron microscope, it gets even more interesting, because it is well known that the electron beam can change the structure of a solid. I think it has been known since the time of Ernst Ruska, before World War II. Most of the time, this change has the character of essentially creating a hole in the sample; the word for it is beam damage. Once the aberration-corrected microscopes appeared, one thing they brought is the capability of reliable, atomically resolved imaging. Before 2010, for atomically resolved images you had to spend a lot of time and effort in the lab to get it done. With aberration correction becoming commercially available, you basically have it as something you can buy, admittedly for several million dollars, but then it generally works.
But the key capability of the aberration correction is that it becomes possible to focus your electron beam either on an individual atomic column, or atom if you talk about 2D materials, or between them. And it turns out that in certain material systems, we can effectively manipulate matter on the atomic level. For example, in the layered dichalcogenides, the MX2s, we can kick out sulfur atoms and see what remains inside the material. So you can go from MoS2 to MoS; of course, it's unstable, so it wants to form a cubic sulfide phase, and in this process it can create a certain pore. You can control this essentially atom by atom.
In systems such as silicon atoms in graphene, we can basically move the silicon atoms almost like we do with STM. And if we can move them, we can move two silicon atoms together and they form a cluster; we can build artificial molecules. For example, if you have platinum atoms, you can make a platinum silicide, a platinum disilicide, and so on and so forth. So it's really fascinating. In amorphous materials, it is possible to take silicon and basically crystallize it with the electron beam. So it's almost like 3D printing, but 3D printing on the atomic level.
So is it new or not? Well, many of these phenomena have been known for quite a while. For example, the crystallization of amorphous semiconductors by the electron beam has been known since the 80s and, to the best of my knowledge, was rediscovered at least four times based on the published sources. I suspect that in the electron microscopy community it is very well known as something that everybody knows about but nobody thinks is important. So what is the new opportunity here? The new opportunity is that we discovered, or rediscovered if you will, this phenomenon when I was still at Oak Ridge in 2014. The immediate parallel was that if we can use scanning tunneling microscopy to manipulate matter atom by atom, we should be able to do it with electron microscopy as well. Since my original field is SPM, I met at some point with folks like Don Eigler, who is very famous for the quantum corrals.
And basically, they started to think about atomic manipulation when they looked at an STM image where all the atoms looked like balls and one atom looked like a snowball, right? Snowballs cannot exist on the atomic scale. A snowball means that the atom moves around in some potential, and that basically means you can kick it out into a different potential well. Around 2012-2013, a lot of the electron microscopy data sets that we were looking at showed half of an atom. I mean, half of an atom cannot exist; it means that the atom was there during scanning, and then halfway through the scan the beam pushed it to a different location. And then the next thought, by analogy with STM, is that if we can tell the beam where to go, we may just learn how to move the atoms where we want them to be. That ended up being a very long quest, because again, this is a Friday night experiment until you get someone to support the work full-time, and unfortunately, Friday night experiments that involve computers, or, if you will, frogs floating in a magnetic field as in the case of Geim, are relatively cheap experiments. Friday night experiments involving electron microscopes are not cheap at all.
We've been returning to this topic over and over again, but ultimately, a lot of the things we built with machine learning agents controlling the electron microscope have exactly this atomic fabrication as their North Star goal. The idea is that humans can perform operations at a certain rate, let's say one operation per second. Admittedly, when I see my kids playing Ultimate Doom, I have a feeling they can do it at least three times faster, but I'm not in that weight category anymore. An electron microscope, though, can execute commands hundreds of times per second. So if we harness machine learning to control the electron beam and find out what the rules of the electron-beam-induced transformations are, we can build machines that can perform hundreds and thousands of atomic manipulation attempts per second.
Attempts don't all turn into successful actions, but you know, the sky is the limit. It all comes down to finding the right materials and the right conditions, doing the things that we as scientists are good at, again, because this is optimization, and that's what we are doing for a living most of the time.
What is the hardest thing for AI to automate in microscopy and scientific discovery? (51:29)
Charles: Yeah, that's a beautiful vision, that by making microscopes more efficient, we can actually use them to make atomic machines. I can't think of a better place to end, but the closing question we usually like to ask guests is: what do you see as the hardest or most difficult thing in your field for AI to automate or to understand?
Sergei V. Kalinin: The most difficult thing for AI to automate is the human decision-making logic, because human decision making is very complicated. And you know, it's almost funny, because when I started my career at Oak Ridge, I was 100% focused on making microscopes do better, because it's exciting, because we can learn interesting physics, and so on and so forth. And throughout my career, I was constantly pushed, and pushed myself, from how to make microscopes quantitative, to how to collect more data and figure out what it means, to how to expand the scope of things the microscope can do, to things like learning physics and atomic manipulation. You know, a question you may ask is: look at what our colleagues in astronomy do. They look at the stars, and based on scientific observations of the stars, they can learn correlative laws like Kepler's laws, they can learn Newtonian mechanics.
From the anomalies in Keplerian orbits we can predict that there is a Neptune somewhere, a Pluto somewhere out there, even though we cannot see them. There is an interesting discussion of whether general relativity could have been discovered from the anomalies of Mercury's motion. Maybe yes, maybe no, who knows. I have colleagues who believe that if Einstein hadn't come up with this idea, maybe we actually wouldn't have discovered relativity theory, who knows. But ultimately, a lot of physics started from astronomical observations.
Sergei V. Kalinin: Microscopes allow us to observe how nature behaves on the atomic level, and not only observe but also modify it. So learning from microscopy the rules by which the atomic world works is actually a huge challenge. That being said, I think this challenge is something we can accomplish, obviously given funding and given the capability of building the right teams, but it's something that's doable.
What is more complicated is to go from the relatively well-defined world of physical laws to the bigger picture of why people do science. As an interesting corollary, students and postdocs in my group spend a lot of time timing the individual operations of the microscope; if we build integrated workflows, we need to know how long they take and where the time goes.
The type of things that I work with, interestingly, deal a lot with the basic motivation of science. How do you define science as the process of learning the unknown? Because unless we can define it in a way that makes sense, we will never be able to relegate this type of job to a machine learning algorithm. In other words, humans operate based on a combination of knowledge, intuition, certain reward functions, gut feeling, and so on and so forth. Machine learning methods by definition can incorporate some aspects of this. For example, we can create machine learning methods for exploring the unknown, which is called curiosity learning: just try to understand as much as you can about the system. But intrinsically, machine learning doesn't have the reward function; it has to come from humans. And if humans cannot communicate this reward function to the machine learning algorithm in a way that makes it actionable, then, you know, I can buy a robot that can make a sandwich, but the robot ultimately doesn't need a sandwich. I need the sandwich. So that's the part which is difficult and which will probably always remain the interface between what humans are doing and what the machine-learning-type mathematical methods are doing.
Charles: Awesome. All right, Sergei, thanks so much for joining.
Sergei V. Kalinin: Fantastic. It was a pleasure to talk to you, and I'm looking forward to the next steps.