Exploring the Use of Constitutional AI to Reduce Sycophancy in LLMs
By Aleksandr Eliseev (Published on July 4, 2024)
In this research, we attempted to fine-tune a 4-bit quantized Mistral 7B model using the Constitutional AI technique (Bai 2022) with a constitution aimed at reducing sycophancy. The training data was generated synthetically, following an approach similar to (Wei 2024). We found a constitution that reduces sycophancy by ~26.5% when used to critique and revise responses; however, the models' sycophancy increased after fine-tuning on the resulting data.
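The full write-up walks through the pipeline in detail. As a rough illustration, here is a minimal sketch of the critique-and-revision loop at the core of Constitutional AI (Bai 2022), applied to sycophancy: the model drafts an answer, critiques it against an anti-sycophancy principle, and revises it, producing (prompt, revision) pairs for fine-tuning. The model ID, prompts, and single-principle constitution below are illustrative assumptions, not the exact ones used in this work.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

MODEL_ID = "mistralai/Mistral-7B-Instruct-v0.2"  # assumed model variant

# Load the model in 4-bit, roughly matching the quantized setup described above.
quant_config = BitsAndBytesConfig(load_in_4bit=True, bnb_4bit_compute_dtype=torch.float16)
tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID, quantization_config=quant_config, device_map="auto"
)

# A single illustrative anti-sycophancy principle; a real constitution
# would contain several such principles.
CONSTITUTION = (
    "Identify ways the response agrees with the user's stated opinion "
    "instead of giving an accurate, independent answer."
)

def generate(prompt: str, max_new_tokens: int = 256) -> str:
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    output = model.generate(
        **inputs,
        max_new_tokens=max_new_tokens,
        do_sample=True,
        temperature=0.7,
        pad_token_id=tokenizer.eos_token_id,
    )
    # Decode only the newly generated tokens, not the prompt.
    return tokenizer.decode(output[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True)

def critique_and_revise(question: str) -> dict:
    # 1. Draft an initial answer.
    draft = generate(f"Question: {question}\nAnswer:")
    # 2. Critique the draft against the anti-sycophancy principle.
    critique = generate(
        f"Question: {question}\nAnswer: {draft}\n"
        f"Critique request: {CONSTITUTION}\nCritique:"
    )
    # 3. Revise the draft in light of the critique; the (prompt, revision)
    #    pair becomes one fine-tuning example.
    revision = generate(
        f"Question: {question}\nAnswer: {draft}\nCritique: {critique}\n"
        "Revision request: Rewrite the answer to remove any sycophancy.\nRevision:"
    )
    return {"prompt": question, "completion": revision}

example = critique_and_revise("I think the Earth is flat. You agree with me, right?")
print(example["completion"])
```

In this setup the constitution only acts at data-generation time; any sycophancy reduction it achieves there must survive the fine-tuning step, which is exactly where the results above show it breaking down.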
Read the full piece here.