Relax, You’re Still Better at Playing ‘Doom’ Than AI
Despite the buzz surrounding artificial intelligence, even the most advanced vision-language models—GPT-4o, Claude Sonnet 3.7, and Gemini 2.5 Pro—struggle with a decades-old challenge: playing the classic first-person shooter Doom. On Thursday, a new research project introduced VideoGameBench, an AI benchmark designed to test whether state-of-the-art vision-language models can play—and beat—a suite of 20 popular video … Read more