
Demonstration of AI Safety via Market Making

By Cameron Holmes (Published on June 19, 2024)

This project won the 'Scalable oversight' prize on our AI Alignment (March 2024) course.

Abstract

This notebook provides a practical (toy) implementation that demonstrates the AI Safety via Market Making (AISvMM) proposal.

This project is still relatively nascent and has some gaps: most notably, the RL and backpropagation steps are not yet implemented, and substantial future work is identified.

While this work falls short of evaluating the effectiveness of the proposal, it partially demonstrates the viability of AISvMM by showing that current open-weights LLMs are capable of acting as agents in the framework, and it provides a clear path to further work in the area.
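
For readers unfamiliar with the mechanics, the following is a minimal sketch of how the market-making interaction loop could be structured, not the notebook's actual implementation. It assumes two LLM-backed roles: a market maker that predicts the human's final answer to a question, and an adversary that produces arguments intended to move that prediction. The functions `query_market_maker` and `query_adversary` are hypothetical stand-ins for prompted calls to an open-weights model.

```python
# Sketch of one possible AISvMM interaction loop (assumptions noted in comments).
# `query_market_maker` and `query_adversary` are hypothetical callables standing
# in for prompted LLM calls; they are not part of the original notebook.

from typing import Callable, List, Tuple


def run_market(
    question: str,
    query_market_maker: Callable[[str, List[str]], float],
    query_adversary: Callable[[str, List[str], float], str],
    max_rounds: int = 10,
    tol: float = 0.01,
) -> Tuple[float, List[str]]:
    """Iterate the market until the prediction stabilises or rounds run out."""
    transcript: List[str] = []
    # The market maker's current "price": its predicted probability that the
    # human will ultimately answer "yes" to the question.
    prediction = query_market_maker(question, transcript)

    for _ in range(max_rounds):
        # The adversary sees the transcript and current price, and argues
        # in whichever direction it expects to move the prediction most.
        argument = query_adversary(question, transcript, prediction)
        transcript.append(argument)

        new_prediction = query_market_maker(question, transcript)

        # In the full proposal, the adversary is rewarded for the size of this
        # move, and the market maker is trained (the RL/backprop steps not yet
        # implemented in this project) to predict the human's final judgement.
        if abs(new_prediction - prediction) < tol:
            return new_prediction, transcript
        prediction = new_prediction

    return prediction, transcript
```

At equilibrium, the intent of the proposal is that the market maker's price converges to what the human would conclude after seeing all the arguments, which is what the toy notebook probes with current open-weights LLMs.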

Read the full piece here.
