AI agents struggle to run fictional companies in a new 500-day survival test. Princeton researchers developed CEO-Bench and found that most current models fail financially, with a basic heuristic outperforming nearly all AI agents. The result highlights the gap between current AI capabilities and complex, strategic business decision-making.