New Capabilities, New Risks? – Evaluating Agentic General Assistants using Elements of GAIA & METR Frameworks
This project was runner-up for the "AI Governance" prize on our AI Alignment (June 2024) course. The text below is an excerpt from the final project.
Abstract
This project evaluates three general-purpose agentic AI systems using elements from the GAIA and METR frameworks, focusing on their capabilities and risks. Agentic systems, such as AutoGPT, AgentGPT, and NinjaTech AI, promise greater autonomy by performing complex tasks with minimal user input. They are designed to overcome the limitations of traditional large language models (LLMs) like GPT-4, enhancing productivity and reducing the need for human oversight. However, this same autonomy may also introduce new risks.
Through evaluations based on GAIA's benchmark for AI assistants and METR's Task Suite, this project assesses the systems' helpfulness and alignment. Results indicate that today's general-purpose agentic systems do not yet surpass frontier LLMs in general capabilities, though they show some promise on complex tasks. They also reveal potential risks, such as weaker safety guardrails and greater vulnerability to misuse. The findings suggest that, while agentic systems offer exciting capabilities, their growing autonomy demands careful management to balance benefits against risks. As these systems develop, attention must be paid to preventing the harmful outcomes that increased capability could enable.
Full project
You can view the full project here.