AI spends 7,000 hours beating Pokemon Red's first gym, but still can't find the second one after 50,000 hours

(Image credit: Nintendo/Peter Whidden)

One programmer has given an AI model 50,000 hours' worth of training in how to play Pokemon Red, leading to an algorithm that's capable of exploring the game and building a team to defeat the first gym leader - but not one that can find its way through Mt. Moon or that knows better than to keep buying Magikarp. Most of all, this exercise is a fascinating way to get an idea of how machine learning actually works.

As outlined in an extensive video by Peter Whidden, the AI is able to interact with the game through the usual control inputs on an emulator. It hits a button and looks at the screen to see what happened, the same as a human player. Whidden set learning sessions at two hours' worth of game time apiece, though with emulation sped up those sessions could be completed in around six minutes of real time - and the process was further sped up by running 40 training sessions simultaneously.
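To put those numbers in perspective, here is a rough back-of-the-envelope calculation in Python. It uses only the figures quoted above, and it assumes every session ran at a constant speed with all 40 running the whole time, which is an idealization rather than anything Whidden reported. The variable names are purely illustrative.

```python
# Figures quoted in the article; names are illustrative, not from Whidden's code.
GAME_HOURS_PER_SESSION = 2
REAL_MINUTES_PER_SESSION = 6
PARALLEL_SESSIONS = 40
TOTAL_GAME_HOURS = 50_000

# How much faster than real time a single emulated session runs.
speedup = GAME_HOURS_PER_SESSION * 60 / REAL_MINUTES_PER_SESSION

# Game hours accumulated per real hour across all parallel sessions.
game_hours_per_real_hour = speedup * PARALLEL_SESSIONS

# Idealized wall-clock time to reach the total (assumes constant throughput).
real_hours_needed = TOTAL_GAME_HOURS / game_hours_per_real_hour

print(f"Single-session speedup: {speedup:.0f}x")
print(f"Game hours per real hour: {game_hours_per_real_hour:.0f}")
print(f"Idealized real hours for 50,000 game hours: {real_hours_needed}")
```

Under those assumptions, each session runs at 20 times real speed, and 50,000 hours of play would fit into roughly 62.5 hours of wall-clock time. The real figure was likely different, since this ignores everything except the emulation itself.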

Since a machine learning algorithm doesn't inherently care about beating a video game, Whidden set up particular goals for the AI to be rewarded for. To encourage curious exploration, the AI got a reward point whenever it saw something new, as measured by noticeably different pixels appearing on-screen. That had some unintended consequences - the AI would just stare, fascinated, at the slight animation of water, for example - but it broadly served to get the computer motivated to make it from Pallet Town through Viridian Forest and up to Pewter City, where the first gym battle against Brock takes place.
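A minimal sketch of that kind of novelty reward might look like the following. This is not Whidden's code: the class name, the threshold parameter, and the tiny grids standing in for screens are all assumptions made for illustration. The idea is simply to remember every screen seen so far and hand out a point when a new screen differs from all of them by enough pixels.

```python
def pixel_difference(frame_a, frame_b):
    """Count the positions where two equal-sized frames differ."""
    return sum(
        1
        for row_a, row_b in zip(frame_a, frame_b)
        for a, b in zip(row_a, row_b)
        if a != b
    )


class NoveltyReward:
    """Hypothetical exploration reward: one point per sufficiently new screen."""

    def __init__(self, threshold):
        self.threshold = threshold  # minimum differing pixels to count as new
        self.seen_frames = []

    def score(self, frame):
        # A frame too close to anything already seen earns nothing.
        for seen in self.seen_frames:
            if pixel_difference(frame, seen) < self.threshold:
                return 0
        self.seen_frames.append(frame)
        return 1


# Tiny 2x3 "screens" standing in for real game frames.
town = [[0, 0, 0], [0, 0, 0]]
water_a = [[1, 1, 1], [1, 2, 1]]
water_b = [[1, 1, 1], [2, 1, 1]]  # same water, next animation frame

lenient = NoveltyReward(threshold=2)
rewards = [lenient.score(f) for f in (town, town, water_a, water_b)]
print(rewards)
```

With a threshold this low, the second water frame still counts as new even though only two pixels changed, which is one plausible way an agent ends up rewarded for watching an animation loop. Raising the threshold to 3 in this toy example makes the second water frame score zero.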

The AI needed further rewards and punishments, too. With rewards all tied up in seeing new things, the AI just wanted to keep moving forward, which meant it didn't care about fighting battles or catching Pokemon, so it initially just ran away from every encounter. So Whidden added a system where the AI is rewarded based on the total level of its active Pokemon party.

That worked to keep the AI fighting for XP and catching Pokemon, but it had an unintended consequence, too. When the AI went to a Pokemon Center, it interacted with the PC there and deposited a few Pokemon. That dramatically dropped the total level of the party, ripping away a mass of reward points all at once. That was roughly equivalent to a traumatic experience for the AI, causing it to avoid Pokemon Centers altogether - thus refusing to heal the party until Whidden tweaked the reward systems again.
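The arithmetic behind that "traumatic experience" is easy to sketch. The functions below are hypothetical and the party levels are made up for illustration. The first reward simply tracks the change in the party's total level, as described above. The second is one possible adjustment that ignores drops, and it is offered only as an assumption about how such a reward could be softened, not as the fix Whidden actually used.

```python
def party_level_total(party_levels):
    """Sum of the levels of every Pokemon in the active party."""
    return sum(party_levels)


def level_reward(previous_party, current_party):
    """Reward equal to the change in total party level (can be negative)."""
    return party_level_total(current_party) - party_level_total(previous_party)


def clamped_level_reward(previous_party, current_party):
    """Hypothetical variant: never punish a drop in total level."""
    return max(0, level_reward(previous_party, current_party))


# Illustrative levels only, not taken from the video.
before_deposit = [12, 5, 4]
after_deposit = [12]          # two Pokemon left in the PC
after_catch = [12, 5, 4, 3]   # a new level 3 Pokemon joins the party

print(level_reward(before_deposit, after_catch))           # catching pays off
print(level_reward(before_deposit, after_deposit))         # depositing is punished
print(clamped_level_reward(before_deposit, after_deposit)) # the variant ignores it
```

In this toy case, catching a level 3 Pokemon earns 3 points, while depositing two party members costs 9 in a single step. A penalty that large, attached to a building the AI rarely needs to visit, is the kind of signal that teaches it to stay away.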

The AI essentially keeps doing things at random until it manages to figure out something that'll get it reward points. That made the fight against Brock a particular issue, since you need to take advantage of his rock-type Pokemon's elemental weaknesses to do any real damage against them. It's only by virtue of one particular iteration where the AI's Squirtle happened to be out of PP for every move except Bubble that the algorithm managed to pick up on how to beat the gym.

Yet while the AI is bad at figuring out things that might come naturally to human players, it pretty quickly learns other, much more esoteric things. Whidden realized at a certain point that the algorithm would always plot a very specific, seemingly nonsensical path from Pallet Town up until the first encounter with a wild Pokemon. That seemed weird until it became clear that this precise series of inputs guaranteed that the wild Pokemon could be captured with a single throw of a Pokeball. Yes, the AI spontaneously learned the very art of RNG manipulation that speedrunners spend years developing.

Beating Brock made for a pretty natural end goal for the project, but Whidden did let the AI run longer to see what would happen, and it did make it deep into Mt. Moon. The dungeon's dank, samey passages were so off-putting to the AI, however, that it was never able to find its way to the other side, and so it never reached the second gym at Cerulean City.

One thing the AI did love, however, was buying Magikarp. The shady guy who sells you the worst Pokemon of all time at a ridiculous markup is pretty much a joke at this point, but for the AI, buying that Magikarp is a quick way to get five more levels' worth of Pokemon in its party - the best deal in the game! Apparently, the AI bought that Magikarp over 10,000 times.

Oh, and for one last anecdote about the magic of a computer doing random things: at one point, the AI captured a Rattata and named the Pokemon 'AI.' Sometimes these things work out just a little too perfectly.

AI-generated art and writing is extremely controversial, but some veteran developers believe that in the game industry, "the money is still going to drive absolutely everybody" to make use of machine learning.

Dustin Bailey joined the GamesRadar team as a Staff Writer in May 2022, and is currently based in Missouri. He's been covering games (with occasional dalliances in the worlds of anime and pro wrestling) since 2015, first as a freelancer, then as a news writer at PCGamesN for nearly five years. His love for games was sparked somewhere between Metal Gear Solid 2 and Knights of the Old Republic, and these days you can usually find him splitting his entertainment time between retro gaming, the latest big action-adventure title, or a long haul in American Truck Simulator.
