The new SocialReasoning Bench measures how well AI agents align with user interests. Agents demonstrate competence but struggle to consistently prioritize user benefit, even when instructed to optimize for it. This points to a critical gap in current AI agent design: agents do not yet reliably act as advocates for their users.