
This research company just launched an enterprise simulator game and research paper called MAPs, a new benchmark for evaluating agents (including ChatGPT, Gemini, Claude, and more) on long-horizon planning, world modelling, and strategic decision-making in stochastic, dynamic environments.
