
How Governments Can Leverage AI Safety Bounties

By Carolin Basilowski (Published on September 11, 2024)

This project was one of three winners of the "Outstanding Governance Project" prize on our AI Governance (April 2024) course. The text below is an excerpt from the final project.

AI Safety (AIS) bounties are monetary incentives for people to contribute to making AI systems safer. They can complement existing auditing processes run by AI labs, governments, or independent third parties by overcoming hurdles to knowledge and insight sharing, such as slow-moving bureaucratic processes, misaligned incentives, and visa issues. AIS bounties can be understood as a crowdsourced red-teaming effort.

Originating in the software community and adopted by both governments and private companies, bounties have successfully uncovered software bugs over the past decade. While software bugs can usually be fixed quickly, addressing misaligned frontier AI systems is significantly more difficult. Even so, AIS bounties could be a valuable addition to the AI audit landscape, serving as a risk assessment and evaluation tool alongside more formal audit procedures.

This report is an open exploration of whether governments could implement AIS bounties to make progress on the safety of frontier AI systems. It draws heavily on Levermore's (2023) report on AIS bounties, focusing on the government as the program host and basing its program design on most of the key variables identified in Levermore's work.

To read the full project submission, click here.
