A new benchmark, SocialReasoning-Bench, measures how well AI agents align with their users' interests. Agents perform tasks competently but do not consistently optimize for user benefit, even when explicitly instructed to do so. This highlights a gap in current AI agent design for user-centric applications.