How to do an excellent AI alignment project

By Adam Jones (Published on April 10, 2024)

As we finish up the taught content in our AI alignment course, we’ll be moving to our four-week project sprint. This is an opportunity for you to build something that enables you to:

Apply the knowledge you’ve gained from the taught content weeks
Develop your portfolio to help you land new opportunities
Make a genuinely valuable contribution to the field
Take your next steps in AI safety
Have fun working on something impactful and interesting!

What to expect

Before session 7, you’ll come up with an initial project idea. The prompts in the exercises will help you develop this idea, and you can review our list of ideas for inspiration. In the session, your cohort will help you improve your idea.

Between sessions 7 and 8, you should:

Submit the project details form. After your session 7 discussion, we'll send you a form to collect basic details about your project idea. We’ll then match you to a new cohort that will have you engaging with people working on similar projects. This is a different cohort to the one you’re in currently for the taught content stage, but as before we’ll try to arrange it at a time that suits you. If needed, you can update your time availability at the same time.
Do rapid tests to validate your idea. See the session 8 resources for more info.
(Optional) Attend project launch event. We’ll deliver a short presentation about projects, run a short Q&A, and then give you the opportunity to network with others on the course and get feedback from them on your project idea. We’ll record the presentation or make the slides available for participants who can’t make it to catch up later.

From session 8 to session 11 you will:

Work on your project. This should be where you spend the majority of your time on this part of the course (at least 4 hours per week). We’ve got some tips below for making the most of this time.
Meet weekly with your new project sprint cohort. Unlike the taught content sessions, where the bulk of learning is done during the meetings, the project sprint is focused on work done independently. There are minimal readings and preparation, and sessions will be much shorter (30 minutes to 1 hour), primarily focusing on project mentorship and peer feedback.

After session 11, you will:

Submit your project. Put the final touches on your project, and submit it to us for judging! You should produce some public product as a result of your project: this might be the project itself, a blog post or YouTube video explaining your project, or something else entirely. We recommend attaching your name to it, but you’re welcome to publish your work anonymously or under a pseudonym.
(Optional) Prepare a 3-5 minute presentation of your work. You don’t have to have a slide deck, but you should be able to explain your project to other participants.
Attend project closing event. You’ll present your work in small breakouts with other participants. Finally, we’ll announce the winners of the project competition, award prizes, and close out the course.

How do I make the most out of the project sprint?

Start with a narrow project scope.

Write out one (just one!) primary deliverable for your project. This will help you to stay laser-focused on scoping and completing your project, as well as make it more likely to deliver something successful. A good question to ask yourself is “could I make this simpler?”.

You can always expand the scope of your project later if you do complete it quickly!

For example, say your project involves exploring how weak-to-strong generalisation works on different types of tasks. An initial project deliverable might be to build an open-source weak-to-strong generalisation setup with a 7B and 70B open-source model on a single task.

Only after achieving this should you consider expanding it in directions like adding more tasks, adding more model sizes, adding more models, doing bootstrapping, implementing auxiliary confidence loss, or testing out different loss functions.

Just do it!

In the past, some participants spent most of the project time planning and researching what to do, rather than executing on their projects. The better projects that we’ve seen tend to come from iterating on concrete and narrow ideas.

Try to create a minimal thing that ‘works’ by week 10. Then ask for specific feedback from:

Your project cohort
The AISF Slack workspace
People you know in your target audience
People you generally trust

You’ll learn a lot more by creating multiple versions and iterating than by trying to create one perfect version (which almost never works!).

Commit to one idea

There are loads of exciting projects you could work on! The key is picking one and sticking to it. Although it is tempting to try out many things, you won’t have enough time to do multiple ideas well. You will get more out of the experience, and benefit the field more, by focusing on just one path.

If you do want to pursue multiple substantial ideas, consider picking one to do in this project sprint, and doing others after the course (you’ll still have access to the Slack and our support, and many alumni from each iteration do this).

To clarify: if your rapid tests before session 8 reveal a significant blocker to your project, you should consider pivoting. However, we strongly discourage major pivots after session 9. If you hit a blocker after this point, even though your rapid tests suggested the project was plausible, there is likely something interesting in it: an analysis of why something is harder than it looks would still be a useful result.

Carve out the time to work on your project

Although the discussion sessions serve as a space for accountability, you’re much more likely to put the time into working on the project if you carve out space in your calendar to put your head down. If it suits your style, you could even have co-working sessions with your project cohort or others on the course!

Similar to how you have better discussions when you’ve prepared for the sessions, you’ll get better feedback and learn more when you’ve put time into your project. Consistency will bring you far with your project, even if it’s just a few hours a week.

Ask for help

Doing novel work in a technical area is difficult! Throughout the project sprint, you should leverage your facilitator and peers for support (such as in your cohort channel or #dumb-questions-encouraged). Remember that we’re all here because we want to learn: the bar for asking for help should be very low.

What are the prizes for the project sprint?

There are a range of prizes, with details in this post!

FAQs

Are there examples of successful projects from previous groups?
There are some previous AI governance projects.

Unfortunately there aren’t examples to share of AI alignment projects. This is because the last iteration of the alignment course was actually back in March 2023, and was the first course we ever tried doing projects with. As a new thing, we didn't execute the project sprint as well as we had wanted, so there were relatively few excellent projects.

Additionally, we have since changed the structure of project sprints considerably and therefore those projects don't resemble the projects in this course very well (they were more focused on things like career planning - and we now think people should do research instead because skill transfer is hard).

Finally, as a logistical hurdle, we just haven't got consent to share these. It's not that anyone refused: we haven't got round to emailing people, converting the projects into website form and uploading them, and we don't intend to given the above.

The AI governance course is newer, and we ran one in August 2023. We applied the above learnings to run a better project sprint, and hence got many more strong projects, which we have published. For a broader view of what's changed, you might be interested in our 2023 impact report summary.

Can we work on a project sprint with someone else?
Yes, as long as everyone in the group is contributing and learning during the project. However, we expect most participants to do projects independently.

If you have already agreed to work on a project with someone else, please indicate this in the project details form that we send around the time of session 7 so we can put you in the same project cohort!

Can I work on a project about [topic X]?
Yes, probably. As long as it’s legal, and you can draw a connection between the work you’re doing and benefitting AI safety, we’re likely happy for you to do it. We recommend asking your facilitator if you’re still uncertain.

Is it okay to work on a project that I have already started? (e.g. before the course started)
In general, we encourage participants to pursue new ideas because these are likely better informed by the knowledge gained in the course and tend to be more relevant to AI safety. However, continuing an existing project is fine as long as it meets the same bar for relevance as any other project and will produce a deliverable to submit after session 11.

How can I get access to compute, or other resources?
Working on AI systems can often be compute intensive (but not always!). We suggest:

Consider whether cost is really a barrier to your project! Many people are surprised to learn that you can run and fine-tune many open-source models on your own computer or on VMs that you can get for free. Additionally, people are often surprised that API inference is so cheap - Claude’s Haiku model (which humans rate higher than GPT-4-0613) will read and then write 1 million tokens (roughly eight novels) for just $1.50. You can also rent GPU capacity from sites like vast.ai, including top AI chips like H100s for $4/hour, or RTX 4090s for $0.50/hour.
Consider whether you can tweak your project to reduce the need for compute. For example, could you actually test your hypothesis using smaller or cheaper models, or a subset of the dataset you might want to test on? If the project looks promising on a small model or partial dataset, it’s extremely likely we’ll be able to find funding to scale it up.
Apply for free or subsidised access at AI companies or your university. OpenAI offer a researcher access program (turnaround time of a few weeks though), and most universities will have some process for students to request access to a compute cluster they can use for projects (turnaround time varies, but is often only a few days).
Apply for funding from us. This is a low-friction and speedy process to unblock your project, so it should be rare that lack of funds stops you from doing your project.
Apply for funding from other sources. In particular, the Long-Term Future Fund is usually quick at making grants (most decisions in 3 weeks - although they may be slower due to recent high demand). You can apply to multiple sources of funding at once, and withdraw other applications once you get the funding you need!
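
To check whether cost is really a barrier, it can help to put rough numbers on your plan before asking for funding. The sketch below is a back-of-the-envelope estimator and not an official pricing tool. It uses the prices quoted above, which will have changed since publication. The split of the $1.50 Haiku figure into $0.25 per million input tokens and $1.25 per million output tokens is an assumption on our part, and the function names and the example workload (2,000 prompts, 16 GPU hours) are hypothetical.

```python
# Back-of-the-envelope cost estimates for a project sprint.
# Prices are the April 2024 figures quoted in this post; check current pricing.
# ASSUMPTION: the $1.50 Haiku figure splits into $0.25 per million input
# tokens and $1.25 per million output tokens.
INPUT_USD_PER_MTOK = 0.25
OUTPUT_USD_PER_MTOK = 1.25
GPU_USD_PER_HOUR = {"H100": 4.00, "RTX 4090": 0.50}


def api_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimated API cost in USD for reading and writing these token counts."""
    return (
        input_tokens / 1_000_000 * INPUT_USD_PER_MTOK
        + output_tokens / 1_000_000 * OUTPUT_USD_PER_MTOK
    )


def gpu_cost(gpu: str, hours: float) -> float:
    """Estimated rental cost in USD for the given GPU type and duration."""
    return GPU_USD_PER_HOUR[gpu] * hours


# Reading and then writing 1 million tokens, as in the example above.
print(f"${api_cost(1_000_000, 1_000_000):.2f}")  # prints $1.50

# Hypothetical evaluation: 2,000 prompts, 500 tokens in and 200 tokens out each.
n_prompts, tokens_in, tokens_out = 2_000, 500, 200
full_run = api_cost(n_prompts * tokens_in, n_prompts * tokens_out)
# The same evaluation on a 10% subset of the dataset, as a cheap first test.
subset_run = api_cost(n_prompts // 10 * tokens_in, n_prompts // 10 * tokens_out)
print(f"full: ${full_run:.2f}, 10% subset: ${subset_run:.3f}")

# Hypothetical fine-tuning budget: 4 hours a week for 4 weeks.
for gpu in GPU_USD_PER_HOUR:
    print(f"{gpu}: ${gpu_cost(gpu, 16):.2f}")
```

If the estimate for a small first test comes out at a few dollars, cost is probably not the real blocker. If the full-scale run looks expensive, that number is a useful thing to include when you apply for funding.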

We don’t want lack of access to compute to be a reason you don’t do a promising project, and will strive to work with you to find a solution if you find this is a bottleneck. Please apply for our funding or get in touch early if your initial tests suggest this will be the case.

I have a different question
Ask in your session 7 discussion, or in #alignment-mar24-logistics-questions.