Resources

This is a (growing) compilation of the resources we are aware of that are likely to be useful to people in the AI safety space.

Introductions to AI safety

We think these are great resources to share with others who might be interested in learning about AI safety in a low-cost way. They are also worth looking at before applying to our programme, to get a feel for what the content will cover. For more detail and more academic material, check out the more comprehensive introductions below.

For a popular audience
More academic introductions

These resources are useful for people who want a more comprehensive introduction to AI safety than those in the previous section, and they take a variety of approaches. Some are specifically targeted at machine learning practitioners, which we feel is useful for connecting elements of AI safety with the forefront of machine learning research.

  • Bibliography of research areas that need further attention, according to the Center for Human-Compatible AI (CHAI), UC Berkeley. On that site, you can set a priority threshold for which materials to show.

  • Unsolved problems in ML Safety – Dan Hendrycks. This paper frames many problems in AI safety in the context of modern machine learning systems. We think it’s a good introduction for machine learning academics hoping to learn more about machine learning problems they could help work on.

  • ML Safety Scholars course – Dan Hendrycks. This course takes a machine-learning-first approach to AI safety, which forms the basis of a closely related field Hendrycks terms ‘ML safety’. It may be a good way to learn safety-relevant ML techniques and concepts alongside ML safety concepts.

  • AI safety from first principles – Richard Ngo, an “attempt to put together the most compelling case for why the development of artificial general intelligence (AGI) might pose an existential threat”

  • Alignment forum curated sequences

A community member has compiled a longer list of introductions here, which you can use as you see fit.

Recommended books

Technical alignment focus:

  • Human Compatible by Stuart Russell (UC Berkeley) is our default recommendation for people who haven’t already read books about AI safety. We think this book provides the best and most focused introduction to the alignment problem, though it doesn’t cover a wide breadth of solutions.

  • The Alignment Problem by Brian Christian is our recommendation for people who are more familiar with AI safety arguments but not recent research in deep learning and reinforcement learning.

  • The Coming Wave by Mustafa Suleyman is the most up-to-date of these book recommendations. It mostly discusses misuse risks from AI (such as bioweapons, cyberattacks and political unrest), with a small amount on misalignment risk. The other books in this section were written before the current paradigm of massively scaled LLMs, so may feel more dated by comparison.
  • Superintelligence by Nick Bostrom is recommended to those who are familiar with both AI safety and recent research in ML, and who want a more detailed analysis of many safety-related topics. Be aware that it was written some time ago (2014) and is not highly grounded in modern ML systems.

Strategy & policy focus:

  • Chip War introduces the history of the semiconductor supply chain and the geopolitical tensions around AI chips. It mostly focuses on Taiwan and relations between the US and China (the two most important powers in the semiconductor industry).
  • The New Fire – US focus.

  • Four Battlegrounds – US focus.

Learning to code in Python

If you’re interested in working on technical AI safety via Machine Learning, it’s overwhelmingly likely that you’ll use Python. These are our recommended resources for picking up the Python programming language.
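
To give a concrete feel for what that looks like in practice, here is a minimal, hypothetical sketch (not taken from any of the resources above) of the kind of Python you might write early on in ML work: fitting a tiny logistic-regression classifier to toy data using only numpy.

```python
# A small, self-contained illustration (a sketch, not production code): train a
# logistic-regression classifier on toy data with plain gradient descent.
import numpy as np

rng = np.random.default_rng(0)

# Toy dataset: two Gaussian blobs labelled 0 and 1.
X = np.vstack([rng.normal(-1.0, 1.0, size=(100, 2)),
               rng.normal(+1.0, 1.0, size=(100, 2))])
y = np.concatenate([np.zeros(100), np.ones(100)])

w = np.zeros(2)  # weights
b = 0.0          # bias

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

learning_rate = 0.1
for step in range(500):
    p = sigmoid(X @ w + b)            # predicted probability of class 1
    grad_w = X.T @ (p - y) / len(y)   # gradient of the cross-entropy loss w.r.t. w
    grad_b = np.mean(p - y)           # gradient w.r.t. b
    w -= learning_rate * grad_w
    b -= learning_rate * grad_b

accuracy = np.mean((sigmoid(X @ w + b) > 0.5) == y)
print(f"Training accuracy: {accuracy:.2f}")
```

Most real alignment and ML engineering work uses libraries like PyTorch or JAX rather than hand-written numpy, but the underlying pattern of defining a model, a loss, and a gradient-based update loop is the same.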

ML courses (free online)

If you’re interested in working on technical AI safety, the current AI paradigm means that machine learning knowledge is essential. The resources below are not safety-specific, but they are a good place to start learning more about machine learning.

Basic introductions:

More advanced:

ML textbooks (mostly) free online

If you’re interested in working on technical AI safety, the current AI paradigm means that machine learning knowledge is essential. The textbooks below are not safety-specific, but they are a good place to start if you want to learn more about machine learning.

All are free online except the first.

Keeping up with AI safety research and communities

Communication forums

Research compilations

  • AI Alignment Forum; where many alignment researchers discuss and debate their work. It’s not highly accessible to beginners, and not everyone who contributes to alignment research posts on the forum, but it’s a good place to see what some researchers are discussing today.

  • Alignment Newsletter Database; a searchable database of papers on a variety of topics. This is useful if you’re trying to find relevant research on one topic in particular.

  • AI Safety Papers – a database of relevant AI safety papers. It’s unclear whether it is still being updated, but the existing database might be useful for searching earlier articles.

  • AI safety videos – Rob Miles

Podcasts about AI safety

  • AI x-risk research podcast (AXRP); Daniel Filan (UC Berkeley) interviews leading researchers in the field of AI safety.

  • 80,000 Hours’ catastrophic risks from Artificial Intelligence podcast; this series often includes prominent researchers on risks from frontier AI.

  • Alignment newsletter podcast; there haven’t been any new issues recently, but there’s a wealth of previous episodes that could be of interest (though many will be outdated).

  • Future of Life Institute AI Alignment podcast

  • Hear this idea podcast; occasionally this podcast includes episodes related to AI risk.

  • The Inside View podcast; this podcast strikes a middle ground between the 80,000 Hours podcast and AXRP in terms of how technical the content is. It also often has policy-oriented guests.

Newsletters

Technical safety focus

  • ML Safety Newsletter – by Dan Hendrycks. The latest news and research from the ML community, related to making ML safer.

  • AI Safety Google Group. This Google group connects people who want to share and receive AI safety news and opportunities; it is lightly maintained by 80,000 Hours, but mostly contributed to by individuals.

  • The Alignment Newsletter – by Rohin Shah (also see Alignment Newsletter database above)

  • AI Safety in China – stay up to date with AI safety news from China.

Strategy & policy focus

  • Import AI – Jack Clark rounds up the latest progress towards advanced AI. This newsletter mostly takes a policy angle, though includes technical advances.

  • policy.ai – by the Center for Security and Emerging Technology (CSET). A biweekly newsletter on AI policy.

  • Digital Bridge – Politico’s “weekly transatlantic tech newsletter”, which “uncovers the digital relationship between critical power-centers through exclusive insights and breaking news for global technology elites and political influencers.”

  • Jeffrey Ding’s ChinAI – by Jeffrey Ding. “ChinAI bets on the proposition that the people with the most knowledge and insight [on AI development in China] are Chinese people themselves who are sharing their insights in Chinese.”

Funding for technical AI safety and AI governance research

  • Open Philanthropy Undergraduate Scholarship

    • This program aims to provide support for highly promising and altruistically-minded students who are hoping to start an undergraduate degree at one of the top universities in the USA or UK, and who do not qualify as domestic students at these institutions for the purposes of admission and financial aid.

    • Applications due 12 Nov, 2021

  • Long-Term Future Fund – EA Funds

    • Applying to the EA Funds is an easy and flexible process, so we recommend you err on the side of applying if you’re not sure.

    • They have historically funded: up-skilling in a field to prepare for future work; movement-building programs; scholarships, academic teaching buy-outs, and additional funding for academics to free up their time; funding to make existing researchers more effective; direct work in AI; seed money for new organizations; and more

  • Early-career funding for individuals interested in improving the long-term future – Open Philanthropy

    • This program aims to provide support – primarily in the form of funding for graduate study, but also for other types of one-off career capital-building activities – for early-career individuals who want to pursue careers that help improve the long-term future

    • As with the EA Funds, applying is an easy and flexible process.

    • Applications due 21 January, 2022

  • Open Philanthropy AI Fellowship

    • A fellowship for full-time PhD students focused on artificial intelligence or machine learning.

    • Applications due by 29 October, 2021

  • Future of Life Institute – Grants

    • Four different grant opportunities: project proposals, PhD fellowships, post-doctoral fellowships, and an option for professors to join their AI Existential Safety community

    • The PhD fellowship is targeted towards people starting a PhD in 2022

  • Survival and Flourishing – this is not currently open

Collaboration and research opportunities

Research communities:

  • Effective Thesis – What if you could use your thesis to work on solving the most important problems our world faces? ET provides free coaching and guidance to students, from undergraduate to PhD level, who want to begin research careers that significantly improve the world.

Research resources:

Career resources

Internship opportunities

This list is limited and may fall out of date.

Sign up to our mailing list for the most relevant and up-to-date opportunities.

You can also view our opportunities board.

Relevant organisations (non-exhaustive)

This is our subjective summary of different organisations in the AI risk space. Last reviewed 2021-11-27. You will find more organisations on our opportunities board.

  • 2020 AI Alignment Literature Review and Charity Comparison – Larks

  • Companies / nonprofits

    • OpenAI

      • OpenAI was founded “to ensure that artificial general intelligence benefits all of humanity”. They openly commit to building safe AGI and have created some of the most impressive AI systems today, such as the large language model GPT-3 (now publicly accessible via their API) and CLIP.

      • Largely a technical company, OpenAI has a dedicated “alignment” team and a governance branch called the “AI Futures” team. Originally a non-profit, it now has a capped-profit arm.

    • DeepMind

      • A deep learning company aiming to “build safe AI systems, … [solve] intelligence, advance science and benefit humanity”. Founded in 2010 and acquired by Google in 2014; it is now a subsidiary of Alphabet Inc. (Google’s parent company).

      • They have demonstrated the power of deep reinforcement learning to beat human experts in games (AlphaGo, AlphaZero, etc), and are starting to make inroads in scientific discovery and medicine with e.g. AlphaFold (which tangibly accelerated the field of protein folding).

      • They are largely a technical company with a safety & ethics branch, and their policy research team includes Allan Dafoe, leader of GovAI (see below).

    • Anthropic 

      • A public benefit corporation founded in 2021 by a group of former OpenAI researchers, described as “an AI safety and research company that’s working to build reliable, interpretable, and steerable AI systems.”

      • A broad range of interests at this point – watch this space!

    • Machine Intelligence Research Institute (MIRI)

      • A non-profit institute doing foundational mathematical research, focused on making superintelligence go well. They are leaders in the safety field on topics such as learned optimisation (inner misalignment) and embedded agency.

    • Alignment Research Center

    • Nonlinear

      • Meta/grant-making organisation. “We systematically search for high impact strategies to reduce existential and suffering risks… [and] make it happen. An incubator for interventions.” They are currently in a research phase, and we don’t yet see many opportunities to obtain funding from them.

    • Redwood Research

      • Founded in 2021, its mission is “to align superhuman AI”, which it openly believes is more likely than not to be developed this century. They are currently working on instilling human values into language models, and doing some field-building work such as the ML for Alignment Bootcamp.

    • Center for Security and Emerging Technology (CSET)

    • AI Objectives Institute

    • Cooperative AI Foundation

    • Center on Long-Term Risk (CLR)

    • Centre for the Governance of AI (GovAI)

      • A new spin-out of the Future of Humanity Institute (FHI, see below). “We are building a global research community, dedicated to helping humanity navigate the transition to a world with advanced AI”

    • AI Impacts

  • Academic institutions

    • Center for Human-Compatible AI (CHAI)

      • A research institute in Berkeley, California, led by Stuart Russell, a long-time advocate for addressing the problem of control in AI and author of a best-selling AI textbook. They led the charge on inverse reinforcement learning, and are interested in a wide range of control-focused projects.

    • Future of Humanity Institute (FHI)

      • A research institute in Oxford, UK. They collaborate with Oxford DPhil students – Oxford’s name for PhD students – and house academics. Their interests are broad, spanning governance and alignment, and “include idealized reasoning, the incentives of AI systems, and the limitations of value learning.”

    • Centre for the Study of Existential Risk (CSER)

      • A research institute in Cambridge, UK, “dedicated to the study and mitigation of risks that could lead to human extinction or civilisational collapse”. They are largely governance-focused and consider the complex interactions of global risks.

    • David Krueger’s Research Group (University of Cambridge)

      • Set up in 2021. Krueger is interested in many alignment-related topics, and is an accomplished ML researcher.

  • Funding bodies (also see opportunities for funding section for more info)

  • Other institutes (not necessarily focused on safety)

  • Think tanks in the US that work on AI policy (non exhaustive):

Other resource lists
