What we changed for the June 2024 AI alignment course
We recently wrapped up the March 2024 iteration of our AI alignment course, and are onto the June 2024 iteration! This is a brief summary of the high-level changes we’ve made and why.[1]
Eliminating week 0
We used to have an ‘Introduction to machine learning’ week before the course, which we named ‘week 0’. It was meant to bring participants up to speed on the basics of machine learning and AI systems, particularly for people less familiar with these concepts. It had resources and exercises but no corresponding discussion session, and participants were expected to complete it in the same week as their icebreaker session.
This week caused a lot of confusion. People would mix up this content with their icebreaker session, and it set a bad precedent by suggesting that sessions didn’t line up with the content (when every other week, they do!).
We therefore eliminated week 0, moving its key content into week 1 (and moving some resources from the core readings to the further resources where we realised they weren’t strictly necessary to understand the rest of the course). We also bumped some week 1 content to week 2 where appropriate, to keep the amount of content per session balanced.
This makes all our sessions work the same: you read the resources, complete the exercises, and go through group activities in your discussion session.
Adding a robustness, unlearning and control session
At the end of the last course’s learning phase, we reflected on our course content (and some people provided feedback that got these changes over the line - thank you!):
- AI control being in our technical AI governance week kind of worked, but was a little out of place. There’s also lots to cover in AI governance, so the more we can move out of this week, the better.
- Adversarial robustness and machine unlearning are both actively pursued research agendas that were missing from the course - and that participants have expressed interest in. They’re important both for preventing catastrophic misuse and for preventing loss of control.
- Our AI alignment learning phase had 7 sessions instead of 8. Most of our courses have 8, so adding a week helps standardise things.
For this course, we’ve added a session on robustness, unlearning and control to cover all these topics.
Participants stay in the same cohort for the learning and project phases
On our March 2024 course, we did a much better job of integrating the projects with the learning phase than we had previously. However, there’s still much to improve here!
A mistake we think we made was rearranging the cohorts between the learning phase and the project phase. We thought this would be good because people doing similar projects could then help each other with greater understanding of the subject area. However, in practice moving people between cohorts confused participants, and they may have felt less connected to their new cohort (compared to the existing cohort they’d been meeting with for 8 weeks). It also created a lot of work for us: both finding new cohorts that worked for people, and dealing with queries about new meeting times and manual cohort switches.
Going forward, we’re going to keep people in the same cohort because:
- It’s less disorienting for participants
- There is greater connection and accountability for participants in an existing cohort
- People get to see a range of projects developing, rather than being pigeonholed into one specific area
- It gives us more time to focus on making the project phase better elsewhere
Focusing on communication skills in the projects
We received a huge number of high-quality project submissions on the last round of the course. We’re incredibly proud of our graduates for doing such great work!
For most projects, we think one of the main ways they could have been improved further was by communicating what they had achieved more clearly.
We haven’t fully planned out how we’re going to achieve this, but this will likely include:
- Creating template structures for people to use for write-ups. These will be optional, but will help participants who are uncertain what a good structure might look like.
- Continuing to include and test resources on good writing in the project phase resources.
- Signposting guidance on other forms of communication, such as creating recorded whiteboard or video presentations.
- Arranging peer review of each other’s work more explicitly, with a view to ensuring projects are communicated clearly.
Providing project examples
This isn’t really something we’ve done directly; rather, it makes use of the work of our brilliant March 2024 graduates.
The most common question we’re asked about projects is ‘Do you have examples of previous good projects?’
Our previous answer was effectively no,[2] but now we can say yes and point people to our project page!
We expect this will significantly help people understand the wide range of outputs they can work towards during the project phase and, in conjunction with our project guidance and project ideas list, will support participants in making great things!
Not making other drastic changes
Apart from the above, we don’t intend to make large changes. This is because:
- Feedback on the last iteration was excellent - if it ain’t broke, don’t fix it! From our March 2024 graduates:
- Sign up to this course NOW! It is quite simply the best course you will ever take. It's all arranged by an attentive team who are constantly getting feedback to improve the course. The interface for the resources is simple to use and the resources and exercises themselves slowly ramp up your understanding and knowledge fitting in perfectly with your expertly facilitated group sessions. This all culminates in you producing an impactful project which you are fully supported with. It's sure to look great on any CV or job application. ~James Bashford
- I want to transition my career into AI safety/ethics, but I don't have a technical background. I found the course to be the perfect way to gain a broad and comprehensive view of the various ways I could contribute to the field and to get an initial understanding of the technical aspects of alignment in an easy-to-understand manner. I'm starting a Master's degree in AI ethics in October, and this course provided an excellent introduction to potential lines of research that I could pursue, that could contribute to safety. I highly recommend this course to anyone looking to transition into the field, whether they have a technical background or not. ~Maria Hernandez
- As a Machine Learning engineer, I was looking for more education and chances to work on exciting projects. I believe AI Alignment and Safety are very important problems to solve. This course helped me learn a lot and meet amazing people who think like me. The experience was great. I got to discuss AI Safety concepts that really opened my eyes. I especially enjoyed talking about technical governance in AI Safety and learning about constitutional AI. These discussions were fascinating and very helpful. I recommend this course to anyone interested in AI, especially AI Alignment. Even if you don't have a background in AI, you can still ask important questions and offer new perspectives. If you care about AI and its safe development, you should definitely apply. ~Christopher Danz
- I entered the course as a software engineer who wanted to get involved in "AI safety", but wasn't fully sure what that meant. The course was a great survey of this burgeoning field, and I'm so grateful to have had the chance to explore it with a cohort full of thoughtful people from a variety of backgrounds, from research engineering to journalism. I also really appreciated the gentle guidance and mentorship of our facilitator, Diogo Cruz. I'm looking forward to applying what I learned to start contributing directly to this vital problem. ~Andrew Ash
- Coming from a background leading an Engineering team, and a with long-standing interest in AI, I was hoping to pick up an understanding of the technical aspects of AI Safety to inform a decision about whether to shift career into the field. The course was really well structured with excellent course materials and exercises. As well as rich discussion around the resources, the sessions included activities such as role-playing responses to scenarios in groups or trying to elicit certain behaviours from LLMs. The level of engagement amongst participants really surprised me and a strong sense of community built up on the Slack channels as well as in cohorts along the way. Huge thanks to the BlueDot team! ~Simon Mylius
- We want to focus on areas where we can have the greatest impact. As a small team of 5,[3] prioritisation is crucial to delivering amazing experiences. We’ve published some details about the work we’ll be doing over the next few months in our recent case for support.
Next steps
I’m incredibly excited about this iteration of the AI alignment course and expect it to be the best one yet. If you have feedback on the course, don’t hesitate to get in touch!
Footnotes
[1] We’ve also made a number of smaller changes to account for progress in the field over the last 3 months, like replacing specific resources and updating exercises. We’ll also continue to update the course week-to-week based on feedback from course participants and facilitators.
[2] Specifically, our usual answer was:
There are some previous AI governance projects. Unfortunately there aren’t examples to share of AI alignment projects.
This is because the iteration of the alignment course before March 2024 was actually back in March 2023, and was the first course we ever tried doing projects with (before then people just did the learning phase and that was it). As a new thing, we didn’t execute the project sprint as well as we had wanted, so there were relatively few excellent projects.
Additionally, because of this poor execution on our part, we changed the structure of project sprints considerably, so those projects don’t resemble the projects in this course very well (they were more focused on things like career planning, rather than actually doing research – and we think people should actually do research, because skill transfer is hard). We’d be worried that sharing the few excellent project examples from that round would confuse participants.
Finally, as a logistical hurdle, we just haven’t got consent to share these. Not that this was refused – we simply haven’t got round to emailing people, converting the projects into website form, uploading them, etc. – and we probably don’t intend to, given the above.
The AI governance course is newer; we ran one in August 2023. We took the learnings from it to build a better project sprint, and hence got many more strong projects, which we’ve published.
We wrote a blog article on our learnings and what we changed in 2023, which touches on adjacent topics: https://bluedot.org/blog/2023-impact-report/
[3] Want to help increase our capacity? At the time of writing, we’re looking to hire great people and raise more funds!