AI Alignment (2024 Jun)

Unravelling Superposition: an online self-guided course

By Robin Stringer (Published on October 15, 2024)

This project was runner-up for the "Education and community building" prize on our AI Alignment (2024 Jun) course. The text below is an excerpt from the final project.

In the nascent field of AI safety and alignment, superposition is a key topic: it illustrates how elusive neural networks are, both in how they learn and in what they learn, and it motivates techniques that give us greater insight into them.

The notebooks in this course explain the concept and introduce some of the technical approaches in PyTorch that we can use to conduct practical research in the field.
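To give a flavour of the idea, here is a minimal PyTorch sketch. It is not taken from the course notebooks. It assumes the common working definition of superposition, in which a model represents more features than it has dimensions by assigning them non-orthogonal directions. The sizes (5 features, 2 dimensions) and the names `W`, `interference`, and `readout` are illustrative choices only.

```python
import torch

n_features, n_dims = 5, 2  # illustrative: more features than hidden dimensions

# Assign each feature a unit-length direction in the 2-D hidden space,
# evenly spaced around the unit circle (a hand-built, not learned, layout).
angles = torch.arange(n_features) * (2 * torch.pi / n_features)
W = torch.stack([torch.cos(angles), torch.sin(angles)], dim=1)  # (n_features, n_dims)

# Gram matrix of the feature directions: the diagonal is each feature's
# squared norm, the off-diagonal entries are overlaps between features.
gram = W @ W.T
interference = gram - torch.eye(n_features)

# Activate only feature 0, compress it into the hidden space, then read
# every feature back out along its own direction.
x = torch.zeros(n_features)
x[0] = 1.0
hidden = x @ W                      # shape (n_dims,)
readout = hidden @ W.T              # shape (n_features,)
relu_readout = torch.relu(readout)  # a ReLU removes the negative overlaps

print("max |interference|:", interference.abs().max().item())
print("raw readout:       ", readout)
print("ReLU readout:      ", relu_readout)
```

Feature 0 is recovered at full strength, but the other features also receive nonzero readouts even though they were never active. Because five directions cannot all be orthogonal in two dimensions, some interference of this kind is unavoidable, which is part of what makes superposed representations hard to interpret.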

If you are a technical practitioner familiar with Python and PyTorch, the code notebooks that illustrate the concept start at ‘Introduction to Superposition’.

For policy specialists, please have a look at the ‘Superposition for Policy Specialists’ section.

View the full project.