Projects

Here are the top project submissions from participants on our AI safety courses.

Participants worked on these projects for 4 weeks during the Project Sprint, applying what they learned on the course to their next steps in AI safety.

Featured

AI Safety Project

Competition Winners

AI Alignment (2024 Mar)

Fixing open issues in TransformerLens

By Anthony Duong

Compression and GenAI

By Raymond Tana

Replicating Toy Models of Universality

By Nathan Reed

Diffusion Model From Scratch

By Hong Jeon

Feature Visualization Learning Journey

By Caleb Sattler

Introducing Agent Tokens

By Bryson Tang

Mabel: On-origin AI detection

By Christopher Tardy

Can AI Call its Own Bluffs?

By Aleksei Rozhkov

Polysemanticity vs Superposition

By Noah Topper

Exploring MARL Safety in meltingpot

By Gema Parreño, Peter Francis, Cam Tice, Chris Pond, Yohan Mathew, Tomasz Steifer, Marina Levay

Understanding Transformer’s Induction Heads

By Natalia Burton

Representation Tuning

By Christopher Ackerman

A Research Agenda for Psychology and AI

By Carter Allen

Root Cause Analysis of AI Safety Incidents

By Simon Mylius

A Closer Look at “How to Catch an LLM Liar”

By Karolina Dabkowska

Using an SAE as a steering vector

By Nelson Gardner-Challis

Demonstration of AI Safety via Market Making

By Cameron Holmes

Sparse Features Through Time

By Rogan Inglis
