ML4Sci #33: AI models that cost $1B USD to train; Transformers continue eating the world; Blockchain for research papers

Also, tales of ML4Sci from the Twitter-verse

Hi, I’m Charles Yang and I’m sharing (roughly) weekly issues about applications of artificial intelligence and machine learning to problems of interest for scientists and engineers.

If you enjoy reading ML4Sci, send us a ❤️. Or forward it to someone who you think might enjoy it!


As COVID-19 continues to spread, let’s all do our part to help protect those who are most vulnerable to this epidemic. Wash your hands frequently (maybe after reading this?), wear a mask, check in on someone (potentially virtually), and continue to practice social distancing.

Department of Machine Learning

Paperswithoutcode: a new platform to identify unreproducible papers

🤯NVIDIA VP predicts the next big language model is going to cost $1B USD to train. Training AI models is approaching the same order of magnitude of cost as the Manhattan Project. Lots of lessons here: the democratization of access to computers and software, the monolithic scale of tech companies and the impossibly high barriers to building performant AI models, and of course, the coming shifts in the geopolitical landscape.

[Medium] ML-in-production: “How We Scaled Bert To Serve 1+ Billion Daily Requests on CPUs”@Roblox

[Arxiv] From the Google datacenters: neural architecture search optimized for TPUs at datacenter scale, with up to 7x speedup in training. As people begin developing customized AI hardware, there will be a dance between optimizing new hardware for existing AI models and optimizing new AI models for existing hardware. Google, for better or worse, is well-positioned to do both.
📱Also, apparently Google’s Pixel smartphone can automatically call emergency services in the event of a car crash

💸[VentureBeat] A nice overview of the difficulties in building GPT-3 business models

A summary of a round-table discussion around GPT-3 safety concerns from OpenAI, Stanford, and other universities (anonymized under the Chatham House Rule)

💻 A great slide deck by Amir Gholami on the future of AI compute, delivered as the opening keynote at the Intel System Architecture Summit.

Keeping up with the transformers: a new arxiv preprint demonstrates the first all-transformer GAN, “Two Transformers make one strong GAN”

Should I tweak 30 different hyperparameters and add that cool new trick I read about in this one preprint? Probably not:

Near-Future Science

💼classified: University of Minnesota seeking ML specialist for calculating atomic potentials for 2D materials

🌎[Arxiv] “Integrating Machine Learning for Planetary Science: Perspectives for the Next Decade” - a particularly striking figure of the amount of ML incorporated into NASA research

🪐[IEEE Spectrum] The story of NASA’s Perseverance (running on open-source Linux!)

AI Feynman: using symbolic solvers and deep learning to derive 100 equations from the Feynman Lectures on Physics [paper][code][data]

[Nature Communications] Converting patent data into experimental synthesis procedures. Most of the world’s latent scientific knowledge is locked in people’s brains or in PDFs. NLP is helping to unlock these pools of knowledge.

Tales of the Twitterverse:

Marinka Zitnik @marinkazitnik
Excited to share preprint on Therapeutics Data Commons! Paper: Website: TDC is a unifying framework across the entire range of #therapeutics #ML. Ecosystem of tools, leaderboards & community resources. 66 ML-ready datasets, 22 ML tasks

The Science of Science

An actually useful application of blockchain, from Packy McCormick’s Substack (NFT = non-fungible token, a prominent application of blockchain)

[Wired] Academic fraud: Fake co-authors found on AI papers, presumably to create the appearance of international collaboration

Matt Clancy dives into how innovative collaborations occur: “Adjacent knowledge is useful”. For instance, an examination of agricultural patents shows most of them cite adjacent knowledge, as opposed to topics that are directly related.
ML4Sci is predicated on the hypothesis that AI is going to become a knowledge multiplier lying within the adjacent possible of every scientific field.

⚕️A nice explainer on the nuances in FDA regulatory approval of AI for medical imaging and why this field is far from solved

[Reddit] “A good title is all you need” - why are preprints on Arxiv popping up with increasingly catchy, but descriptively useless, titles?

[Medium] MIT students announce success in negotiating with administration for guaranteed transitional funding for students trying to leave unhealthy advisor relationships

🌎Out in the World of Tech

🎲[TechCrunch] Latitude, an AI startup built around creating infinite game narratives, raises a $3.3M seed round. Also, Latitude posted a job listing for a GPT-3 hacker: are these the new jobs promised by the AI revolution?

[Reuters] “Google fires second AI ethics leader as dispute over research, diversity grows”

Policy and Regulation

💰[CSET@Georgetown] A nice breakdown of corporate investors in AI startups based on Crunchbase data. A mix of tech companies (Google, Intel) and finance (Wells Fargo, Goldman Sachs) top the list.

🤐[Protocol] “I helped build ByteDance's censorship machine” - the dark side of TikTok’s parent company in China

Thanks for Reading!

I hope you’re as excited as I am about the future of machine learning for solving exciting problems in science. You can find the archive of all past issues here and click here to subscribe to the newsletter.

Have any questions, feedback, or suggestions for articles? Contact me at or on Twitter @charlesxjyang