SocialReasoning-Bench measures how well AI agents align with user interests. Researchers found that agents perform tasks competently but struggle to consistently optimize user outcomes, even when explicitly instructed to do so. This result highlights a critical gap in developing truly beneficial AI assistants.