Anthropic Uses Pokémon Red to Test Claude 3.7 Sonnet

In a recent blog post, Anthropic revealed that it used the classic Game Boy game Pokémon Red as a benchmark for its latest AI model, Claude 3.7 Sonnet. The company equipped the model with basic memory, screen pixel input, and function calls to press buttons and navigate through the game, allowing it to play continuously. 
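
The setup described above (screen input, a small memory, and function calls that press buttons) can be pictured as a simple loop. The following is only a minimal sketch under stated assumptions, not Anthropic's actual harness, which the article does not detail. `ToyEmulator`, `Memory`, `scripted_policy`, and `run_agent` are all hypothetical names, the "game" is a toy corridor rather than Pokémon Red, and the policy is a fixed rule standing in for a model call.

```python
from dataclasses import dataclass, field
from typing import Callable, List

# Button names the agent may "call"; a hypothetical list modeled on a Game Boy pad.
BUTTONS = ("up", "down", "left", "right", "a", "b", "start", "select")


@dataclass
class ToyEmulator:
    """Hypothetical stand-in for an emulator: a player walking a short corridor."""
    position: int = 0
    goal: int = 3

    def screen(self) -> List[int]:
        # Fake "pixels": a single row with a 1 where the player stands.
        return [1 if i == self.position else 0 for i in range(self.goal + 1)]

    def press(self, button: str) -> None:
        if button not in BUTTONS:
            raise ValueError(f"unknown button: {button}")
        if button == "right":
            self.position = min(self.position + 1, self.goal)
        elif button == "left":
            self.position = max(self.position - 1, 0)
        # Other buttons do nothing in this toy world.


@dataclass
class Memory:
    """Basic memory: keeps only the most recent notes."""
    notes: List[str] = field(default_factory=list)
    limit: int = 5

    def remember(self, note: str) -> None:
        self.notes.append(note)
        self.notes = self.notes[-self.limit:]


def scripted_policy(pixels: List[int], memory: Memory) -> str:
    # Placeholder for the model: a real agent would send the screen and its
    # notes to a model and receive a button-press function call in return.
    return "right"


def run_agent(emulator: ToyEmulator,
              policy: Callable[[List[int], Memory], str],
              max_actions: int = 100) -> int:
    """Observe, decide, act, remember; returns the number of actions taken."""
    memory = Memory()
    actions = 0
    while emulator.position < emulator.goal and actions < max_actions:
        pixels = emulator.screen()          # screen input
        button = policy(pixels, memory)     # decision (model call in a real setup)
        emulator.press(button)              # function call that presses a button
        memory.remember(f"step {actions}: pressed {button}")
        actions += 1
    return actions


if __name__ == "__main__":
    print(run_agent(ToyEmulator(), scripted_policy))
```

In this toy version the action count is the measure of progress, in the same spirit as the per-milestone action counts Anthropic reported; the `max_actions` cap is an illustrative safeguard, not something the article describes.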

This test showcased what Anthropic calls extended thinking: the model can take extra time and use more computing power to work through complex tasks. An earlier version, Claude 3.0 Sonnet, struggled to leave the starting area of Pallet Town. Claude 3.7 Sonnet, by contrast, battled three gym leaders and won their badges, a result Anthropic presents as evidence of improved reasoning and planning.

Anthropic noted that the model performed 35,000 actions to reach Lt. Surge, the last of those three gym leaders, which points to its capacity to sustain a very long sequence of operations toward a goal. The company did not disclose the computing resources or time required for each step, so the cost of this progress remains unclear.

Using Pokémon Red as a test may seem like a playful choice, but it is part of a long tradition of employing video games as benchmarks for AI performance. Recently, several platforms have emerged that evaluate AI abilities in games ranging from fighting games like Street Fighter to drawing games like Pictionary. Anthropic’s experiment is both a nod to gaming culture and a practical demonstration of how extended thinking helps Claude 3.7 Sonnet handle long, multi-step challenges. Other developers may well adopt similar gaming benchmarks to test their own systems.
