ML4Sci #17: Predicting the long-term stability of compact multiplanet systems; Weakly-Supervised DL of Heat Transport via Physics Informed Loss

Also, a robot designs photocatalysts

Hi, I’m Charles Yang and I’m sharing (roughly) weekly issues about applications of artificial intelligence and machine learning to problems of interest for scientists and engineers.

If you enjoy reading ML4Sci, send us a ❤️. Or forward it to someone who you think might enjoy it!

As COVID-19 continues to spread, let’s all do our part to help protect those who are most vulnerable to this epidemic. Wash your hands frequently (maybe after reading this?), wear a mask, check in on someone (potentially virtually), and continue to practice social distancing.

Predicting the long-term stability of compact multiplanet systems

Published July 13, 2020

Predicting the evolution of orbital systems is a difficult numerical problem with ties to chaos theory. This new work proposes SPOCK, an XGBoost-based model that classifies the long-term stability of orbital systems using features from only an initial subset of orbits. The authors generate a dataset of 10⁶ orbital systems and demonstrate the ML model’s improved speed and performance relative to other approximation approaches. To their credit, the authors also demonstrate a scenario where their model fails and explain, based on their physical understanding, why such edge cases pose difficulties given the features passed into the model.

See more: using deep learning to predict evolution of other chaotic systems


Weakly-Supervised Deep Learning of Heat Transport via Physics Informed Loss

Published August 21, 2018

A neat paper that encodes the heat transport differential equation into a convolution kernel. In other words, if you know the weak form of the differential equation and have constant boundary conditions, then you can derive a convolution kernel that can be used as a loss function for modelling the system without any training data (hence the weakly-supervised part).

Conversely, the authors also show that given some data, a CNN kernel can learn the differential equation that governs a system (given a constant boundary condition). This paper also borrows some ideas from Progressive Growing GANs to prevent the model from learning trivial solutions, showing how ideas from standard computer vision can also be applied to physics-based problems.
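To make the kernel-as-loss idea concrete, here is a minimal NumPy sketch (mine, not the authors’ code). It assumes the steady-state heat equation on a regular 2D grid and uses the standard 5-point Laplacian stencil as the kernel; the paper’s exact kernel and training setup may differ. The names `LAPLACE_KERNEL`, `physics_loss`, and `linear_field` are hypothetical, chosen only for illustration.

```python
import numpy as np

# Assumed 5-point stencil for the steady-state heat (Laplace) equation.
# This is an illustrative choice; the paper's kernel may differ.
LAPLACE_KERNEL = np.array([
    [0.00,  0.25, 0.00],
    [0.25, -1.00, 0.25],
    [0.00,  0.25, 0.00],
])


def physics_loss(temperature):
    """Mean squared stencil residual over the interior grid points.

    A field that satisfies the discretized equation gives a residual of
    zero everywhere, so no labelled solution is needed to score it.
    """
    t = np.asarray(temperature, dtype=float)
    rows, cols = t.shape[0] - 2, t.shape[1] - 2
    residual = np.zeros((rows, cols))
    # "Valid" 3x3 sliding window, written out with shifted slices.
    for i in range(3):
        for j in range(3):
            residual += LAPLACE_KERNEL[i, j] * t[i:i + rows, j:j + cols]
    return float(np.mean(residual ** 2))


def linear_field(n=16, hot=1.0, cold=0.0):
    """Exact steady state between a hot left wall and a cold right wall."""
    return np.tile(np.linspace(hot, cold, n), (n, 1))


if __name__ == "__main__":
    exact = linear_field()
    noisy = exact + np.random.default_rng(0).normal(size=exact.shape)
    print("loss of exact solution:", physics_loss(exact))
    print("loss of noisy field:   ", physics_loss(noisy))
```

In a training loop, the network’s predicted temperature map would take the place of `temperature`, with the boundary values held fixed, and the loss would be minimized by gradient descent in a deep learning framework rather than evaluated in NumPy.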

I’m usually excited when I see work about differential equations and deep learning because differential equations are so ubiquitous in modelling natural systems - curious to see if the convolution kernels could be derived for other common differential equations outside of physics.


🔒A Mobile Robotic Chemist

Published July 08, 2020

In ML4Sci #15, I covered BEAR: Bayesian Experimental Autonomous Researcher, where researchers autonomously explored a design space via 3D printing and a robot arm. This paper, from a multifunctional team at the University of Liverpool, UK, similarly combines two trends (robotics/automation and AI) to automate experimental discovery, but this time for photocatalytic systems for hydrogen production from water. As the tides of technology rise and recede, their impact will be felt differently, at different times, but when we examine things across disciplines, the patterns emerge. The question then becomes: which fields have not yet felt this disruption?

Reading this paper feels like a step into the future of science: the paper alternates between the authors proposing hypotheses and design spaces, and then detailing the robot’s automated experimentation. In silico experiments are placed in the same context as, and used to augment, experimental results. The authors also provide some insight into the high initial labor cost:

“The autonomous robot that we present here also requires half a day to set it up initially, but it then runs unattended over multiple days (1,000 experiments take 0.5 days of researcher time). It took an initial investment of time to build this workflow (approximately 2 years), but once operating with a low error rate, it can be used as a routine tool.”

Developing these kinds of scientific automation tools will require a unique combined understanding of scientific characterization, robotics, AI, and domain expertise. This multi-year effort from researchers across the University of Liverpool shows how such efforts might pay off.


📰In the News


Using AI to disentangle conspiracy theory narratives [PLOS One]. How large NLP models might help us to systematically understand ourselves.

From Google: GShard, a 600B-parameter transformer model that demonstrates state-of-the-art translation from 100 languages to English. Trained on 2048 TPUs for 4 days, further demonstrating how compute is a defensible barrier to entry.

Why General AI will not be realized [Nature Humanities and Social Sciences Communications]

Don’t ask if AI is good or fair, ask how it shifts power [Nature Worldview]

New open-source course on “Industrializing AI” called “Full Stack Deep Learning” from deep learning practitioners in industry.

Challenges of Comparing Human and Machine Perception [The Gradient]

OpenAI Scholars Spring 2020 Projects


Quantum computing: how conditions created by the COVID-19 shutdown are delivering ‘the best data we have ever seen’ [Nature News]

⚕️Secure, privacy-preserving, and federated ML in medical imaging [Nature Machine Intelligence Perspective]

Recordings from the 1st Workshop on Scientific Deep Learning hosted by researchers at UC Berkeley, LBL, and University of Washington

From the Google AI Team: Exploring Faster Screening with Fewer Tests via Bayesian Group Testing

Physics Meets ML seminar recordings. Check out the schedule and sign up for the Zoom link here.

The Science of Science

“The unplanned impact of mathematics”, an essay in Nature arguing for pure research. 7 stories of how pure math showed up in unexpected applications.

🗺️Map of Math [Quanta Magazine]

The Bandwagon, by Claude Shannon (1956). The founding father of modern information theory argues that information theory has become . . . overhyped. Seems relevant today.

Inventing the NIH. Excellent summary of what the NIH has accomplished and the history of how it came to be.

👩🏾‍🔬Unequal effects of the COVID-19 pandemic on scientists

🔋Fostering a sustainable community in batteries. I’ve previously discussed how COVID-19 will upend the way we do science. This paper is a good real-world example of how COVID-19 and remote work are accelerating technology-driven innovations in the way we organize scientists. I’d like to think this newsletter is part of that trend!

🌎Out in the World of Tech

NVIDIA eclipses Intel as most valuable US chipmaker [Reuters] (highlights how GPUs are becoming increasingly important in the chipmaking industry, particularly for AI)

Data, Compute, Labor [Ada Lovelace Institute]

🔓 Detecting Misinformation on WhatsApp without Breaking Encryption

By adding just one line of code, you can now run any TensorFlow model on the Google Cloud Platform (GCP). Google is trying to dominate both the open-source space and the cloud compute space by tying the two together. See also: Google Colaboratory notebooks with a subscription service for cloud compute; Kaggle migrates to Google Cloud for all notebooks.

Thanks for Reading!

I hope you’re as excited as I am about the future of machine learning for solving exciting problems in science. You can find the archive of all past issues here and click here to subscribe to the newsletter.

Have any questions, feedback, or suggestions for articles? Contact me at or on Twitter @charlesxjyang21