Activate Love – Steering AI Text Generation
This project won the 'Interactive deliverable' prize on our AI Alignment (March 2024) course.
AI text generation can seem magical and inscrutable, but recent research has shown that you can steer a model's output by modifying its activations. Even better, doing so is quite intuitive and fun!
This demo lets you enter a message and two prompts, and then steers the model's output towards one prompt and away from the other. You can also control the strength of the steering and choose which layer of the model to steer. Try it out and see what you can create!
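For the curious, here is a minimal toy sketch of the idea behind steering, not the demo's actual implementation. It assumes a "model" that is just a list of functions over plain vectors; the names `run_layers`, `steering_vector`, `steered_forward`, and `TOY_LAYERS` are hypothetical and exist only for illustration. The steering vector is the difference between the activations of the two prompts at the chosen layer, and it is added to the message's activations at that layer, scaled by the strength.

```python
from typing import Callable, List, Optional

Vector = List[float]
Layer = Callable[[Vector], Vector]


def run_layers(x: Vector, layers: List[Layer], start: int = 0,
               stop: Optional[int] = None) -> Vector:
    """Apply layers[start:stop] to x in order and return the result."""
    for layer in layers[start:stop]:
        x = layer(x)
    return x


def steering_vector(layers: List[Layer], layer_idx: int,
                    toward: Vector, away: Vector) -> Vector:
    """Difference of the two prompts' activations at the input of layer_idx."""
    act_toward = run_layers(toward, layers, stop=layer_idx)
    act_away = run_layers(away, layers, stop=layer_idx)
    return [t - a for t, a in zip(act_toward, act_away)]


def steered_forward(message: Vector, layers: List[Layer], layer_idx: int,
                    steer: Vector, strength: float) -> Vector:
    """Run the message, adding strength * steer to the activations at layer_idx."""
    hidden = run_layers(message, layers, stop=layer_idx)
    hidden = [h + strength * s for h, s in zip(hidden, steer)]
    return run_layers(hidden, layers, start=layer_idx)


# Toy stand-ins for model layers (assumption: real layers are transformer blocks).
def double(x: Vector) -> Vector:
    return [2.0 * v for v in x]


def shift(x: Vector) -> Vector:
    return [v + 1.0 for v in x]


TOY_LAYERS: List[Layer] = [double, shift]

if __name__ == "__main__":
    vec = steering_vector(TOY_LAYERS, 1, toward=[1.0, 0.0], away=[0.0, 1.0])
    print(steered_forward([1.0, 1.0], TOY_LAYERS, 1, vec, strength=0.5))
```

In a real language model the prompts are token sequences and the activations are per-token hidden states, so this sketch only mirrors the arithmetic. A strength of 0 leaves the output unchanged, and larger values push it further towards the first prompt.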
If you end up with something you like, feel free to share it with us on the community tab. We would love to see what you come up with!
You can use the »copy« button in the upper-right corner of the generated text box to copy your results to your clipboard. Have fun exploring the interface! 🚀
Learn more about the research behind this below. 📚