ML4Sci #34: 2 National Reports on AI in the US; Unpacking the Full-Stack of news from Facebook this week

Also, new interpretability methods+adversarial attacks on Neural Networks from OpenAI

Hi, I’m Charles Yang and I’m sharing (roughly) weekly issues about applications of artificial intelligence and machine learning to problems of interest for scientists and engineers.

If you enjoy reading ML4Sci, send us a ❤️. Or forward it to someone who you think might enjoy it!

Share ML4Sci


📝Nationalized AI Reports

Two big national reports on AI from the US came out this week.

The first is from the National Security Commission on AI (NSCAI), chaired by Eric Schmidt, former CEO of Google, which submitted its final report and set of recommendations to Congress. This comprehensive report is the best overview of where America, particularly its defense base, is heading in terms of AI. Here’s a brief list of some of the snippets that stood out to me:

  • the US should strive to “achieve a state of military AI readiness”[military] and “widespread integration of AI”[broader economy] by 2025

  • does not rule out the use of autonomous weapons: “Provided their use is authorized by a human commander or operator, properly designed and tested AI-enabled and autonomous weapon systems can be used in ways that are consistent with international humanitarian law.”

  • recommend the creation of a US Digital Service Academy

  • an entire chapter on “Establishing Justified Confidence in AI Systems”, which focuses on verification and robustness of AI systems

  • recommend a significant immigration overhaul, including doubling the number of employment-based green cards, creating an entrepreneur visa, and giving green cards to graduates of STEM doctoral programs

  • recommend a tightening of export controls for dual-use technologies and applications

Meanwhile, on the other side of the country, HAI@Stanford released its annual AI Index, which measures AI progress and trends. Some highlights:

  • 34% growth in AI papers from 2019-2020, outpacing the change from 2018-2019

  • The majority of US PhD students in AI chose to work in industry after graduating, as opposed to academia

  • Less than 6% of US PhD students in AI are African American or Hispanic

  • Reaching roughly human-level accuracy on ImageNet now requires only $7.43 of compute

  • AI has captured the attention of the US Congress

And a nice quote on the push and pull between NLP and CV progress:

“It’s interesting to note the dominance of the Transformers architecture, which started for machine translation but has become the de facto neural network architecture. More broadly, whereas NLP trailed vision in terms of adoption of deep learning, now it seems like advances in NLP are also driving vision.” —Percy Liang, Stanford University (Chapter 2, Expert Highlight)

Unpacking the Facebook stack

A significant portion of advanced AI research is now done by quasi-academic research groups hosted at large tech companies, which usually derive their revenue from the attention economy, i.e. ads. Over the past two weeks, I’ve come across the full stack of news from Facebook - ranging from cutting-edge AI techniques in self-supervised computer vision to a damning new article from MIT Technology Review on how Facebook has failed to address concerns about its social recommendation engine. It’s pretty wild to think that AI is being developed by people funded by a company that also has to create a playbook for employees explaining why the company they work for is not destroying America.

In research: The Facebook AI team developed a powerful self-supervised image recognition model called SEER (they also published this broader blog post on the increasing importance of self-supervised learning, which is worth reading), which reaches a new SOTA for self-supervised training on ImageNet. Advances in contrastive learning for computer vision mean that CV models can now learn from the vast amount of unlabelled image data on the web, much as NLP models learn from unlabelled text.
In regulation: this algorithm was probably trained on your Instagram photos! Unless, of course, you live in Europe, in which case your data was protected by GDPR. These issues of sovereignty, data provenance, and privacy are starting to seriously shape how large tech companies develop algorithms - I wonder if SEER is biased against European images because of the missing data?
In academia: University College London’s CS department is deeply entangled financially with Facebook - the tech monopolization of AI academia continues.
In society: Karen Hao at MIT Tech Review does a deep dive on AI at Facebook, nicely tying together the people, the algorithms, and the business models that have created the Facebook we have today.
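To make the contrastive-learning idea above concrete, here is a minimal, hypothetical NumPy sketch of an InfoNCE-style objective - the general family that self-supervised losses like SEER’s belong to, not Facebook’s actual implementation. Two augmented views of the same image should produce similar embeddings, while embeddings of different images get pushed apart:

```python
import numpy as np

def info_nce_loss(z1, z2, temperature=0.1):
    """InfoNCE-style contrastive loss: row i of z1 should match row i
    of z2 (two views of the same image) and repel every other row."""
    # L2-normalize embeddings so the dot product is cosine similarity
    z1 = z1 / np.linalg.norm(z1, axis=1, keepdims=True)
    z2 = z2 / np.linalg.norm(z2, axis=1, keepdims=True)
    logits = z1 @ z2.T / temperature                 # (N, N) similarity matrix
    # softmax cross-entropy where row i's correct "class" is column i
    logits = logits - logits.max(axis=1, keepdims=True)
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_probs))

rng = np.random.default_rng(0)
views = rng.normal(size=(4, 8))                      # 4 images, 8-d embeddings
loss_matched = info_nce_loss(views, views)           # views agree -> low loss
loss_mismatched = info_nce_loss(views, rng.normal(size=(4, 8)))
print(loss_matched, loss_mismatched)
```

The diagonal of the similarity matrix holds the matching pairs, so the loss is lowest when each image’s two views agree and all cross-image similarities are low - no labels required, which is why unlabelled web-scale image data suddenly becomes usable.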

And now, for the news!

Department of Machine Learning

[Distill] Incredible visualizations of “multimodal neurons” in neural networks by an OpenAI team. Their evidence for multimodal neurons also inspired new adversarial attacks.

A nice overview of graph neural networks and the limitations of the current message-passing paradigm
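For context, “message passing” just means that each node repeatedly aggregates its neighbors’ features and transforms the result. A toy, hypothetical NumPy sketch (mean aggregation with one shared weight matrix; real GNN layers vary in the aggregator and update function):

```python
import numpy as np

def message_passing_layer(H, A, W):
    """One message-passing round: each node takes the mean of its
    neighbors' feature vectors, applies a shared linear map W,
    then a ReLU nonlinearity."""
    deg = A.sum(axis=1, keepdims=True)       # node degrees
    agg = A @ H / np.maximum(deg, 1)         # mean over neighbors
    return np.maximum(agg @ W, 0)            # ReLU

# Toy graph: a path 0-1-2-3, with 2-d node features
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
H = np.arange(8, dtype=float).reshape(4, 2)
H1 = message_passing_layer(H, A, np.eye(2))  # node 0 now holds node 1's features
print(H1)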

🤖Folding clothes is a surprisingly hard benchmark - researchers from Berkeley demonstrate a real-world system that can actually fold different fabrics [video]

Near-Future Science

💼 Another classified: Group Lead of the ML team at Lawrence Berkeley National Lab - definitely a high-impact opportunity to work in ML4Sci!

🏆The Department of Energy’s Solar Energy Technology Office (SETO) announces $7.3M in awarded funding for AI solutions - awardees are mostly universities, national labs, and the startup Camus Energy

🪐How the Perseverance Mars Rover used AI to navigate and land

More exciting work on equivariant deep learning, which demonstrates SOTA accuracy and sample efficiency on molecular simulation tasks. See 👇 for a paper-in-a-thread

The Science of Science

💀Researchers who die young are more likely to be cited than their peers, a form of posthumous recognition

📈[NatureNews] Growing inequality in research citations: the top 1% of researchers capture more than 1/5 of total citations

💻[NatureNews] A poll of researchers showed that a strong majority prefer some form of virtual conference in the future. There’s a huge opportunity here for a platform that can provide the ease of digital conferences while reproducing the spontaneity of in-person ones. Virtual conferences are also more accessible and cheaper, and they significantly reduce carbon emissions from air travel

New-journal-alert: Living Journal of Computational Molecular Science, a journal that essentially lives on GitHub. Software is finally starting to change how we view scientific publishing, from static articles to living journals (with pull requests, too!)

🌎Out in the World of Tech

blog-of-the-week: “7 reasons not to join a startup and 1 reason to” from Chip Huyen, formerly at Snorkel

🏥 Commercializing AI is hard: IBM is considering selling off its Watson Health business. Recall that late last year Alibaba also shut down its AI lab, and Canada’s Element AI was sold off at a steeply devalued price

🏎️[LATimes] Internal Tesla emails reveal a disconnect between Elon Musk’s marketing claims of full self-driving Teslas and what Tesla is telling regulators

The good and bad from Google:
- backlash against Google continues after it fired the leads of its AI Ethics team: [VentureBeat] “AI ethics research conference suspends Google sponsorship”.
- 🌞 But Google Cloud announces a new method for certifying clean energy that focuses on time of use. This is an important step forward for companies trying to decarbonize, because it rewards developers of energy systems that can complement renewables that are only available at certain times of day, e.g. solar.

Policy and Regulation

🏴󠁵󠁳󠁶󠁡󠁿From my home state of Virginia - the state legislature passes a bipartisan ban on the use of facial recognition by local law enforcement

🕵️ The UK’s GCHQ (basically their NSA) releases a white paper embracing the use of AI in intelligence applications while emphasizing the importance of AI ethics.

Thanks for Reading!

I hope you’re as excited as I am about the future of machine learning for solving exciting problems in science. You can find the archive of all past issues here and click here to subscribe to the newsletter.

Have any questions, feedback, or suggestions for articles? Contact me at ml4science@gmail.com or on Twitter @charlesxjyang