SocialReasoning-Bench measures how well AI agents align with user interests. Agents perform tasks competently but often fail to optimize for user benefit, even when explicitly instructed to do so. This finding highlights a critical gap in AI safety and alignment research.