The Last AI Breakthrough DeepMind Made Before Google Bought It For $400m

The end is nigh. Humans have lost another key battle in the war against computer domination

The Physics arXiv Blog
Jan 29, 2014


The Atari 2600 games console has a special place in the hearts of any gamers who grew up in the 1970s. It popularised a number of games that changed the games industry for ever, such as Pong, Breakout and Space Invaders. Today, these games have legendary status and still play an important role in the gaming world.

One curious thing about these games is that computers themselves have never been very good at playing them in the same way as humans. That means playing them by looking at the monitor and judging actions accordingly. This kind of “hand-to-eye” co-ordination has always been a special human skill.

Not any more. Now, Volodymyr Mnih and pals at DeepMind Technologies in London say they’ve created a neural network that learns how to play video games in the same way as humans: using the computer equivalent of hand-to-eye co-ordination.

And not only is their neural network a handy player, it learns so well that it can actually beat expert human players in games such as Pong and Breakout.

The approach is relatively straightforward. These guys have set their neural network against a set of seven games for the Atari 2600 available on the Arcade Learning Environment. These are Pong, Breakout, Space Invaders, Seaquest, Beam Rider, Enduro and Q*bert.

At any instant during a game, a player can choose from a finite set of actions that the game allows: move to the left, move to the right, fire and so on. So the task for any player, human or otherwise, is to choose an action at each point in the game that maximises the eventual score.

That’s often easier said than done because the reward from any given action is not always immediately apparent. For example, taking cover from a space invader’s bomb does not increase the score but does allow it to increase later.
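One common way to put a number on this kind of delayed payoff is a discounted return, in which rewards that arrive later count for slightly less than rewards that arrive now. The sketch below is purely illustrative: the function name, the discount factor `gamma` and the reward sequences are assumptions made for the example, not figures from the paper.

```python
def discounted_return(rewards, gamma=0.99):
    """Sum the rewards, weighting the reward k steps ahead by gamma ** k."""
    total = 0.0
    # Work backwards so each step folds in the discounted value of what follows.
    for reward in reversed(rewards):
        total = reward + gamma * total
    return total


# Hypothetical reward sequences: taking cover scores nothing now but pays off later.
take_cover_then_shoot = [0, 0, 10]
get_hit_by_bomb = [0, 0, 0]

if __name__ == "__main__":
    print(discounted_return(take_cover_then_shoot))
    print(discounted_return(get_hit_by_bomb))
```

Seen this way, taking cover is worth more than standing still even though neither move changes the score straight away.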

So the gamer must learn from its actions. In other words, it must try different strategies, compare them and learn which to choose in future games.
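A textbook rule for this kind of trial-and-error learning is Q-learning, which keeps a running estimate of how good each action is in each situation and nudges that estimate after every move. The sketch below is a minimal table-based version under stated assumptions: the state labels, action names, learning rate `alpha`, discount `gamma` and exploration rate `epsilon` are all hypothetical, and it should not be read as the exact method Mnih and co use.

```python
import random
from collections import defaultdict


def choose_action(q, state, actions, epsilon, rng):
    """Mostly pick the best-known action, but explore at random now and then."""
    if rng.random() < epsilon:
        return rng.choice(actions)
    return max(actions, key=lambda a: q[(state, a)])


def q_update(q, state, action, reward, next_state, actions, alpha=0.1, gamma=0.99):
    """Move the estimate for (state, action) towards reward + discounted best next value."""
    best_next = max(q[(next_state, a)] for a in actions)
    target = reward + gamma * best_next
    q[(state, action)] += alpha * (target - q[(state, action)])


if __name__ == "__main__":
    actions = ["left", "right", "fire"]  # hypothetical action set
    q = defaultdict(float)               # unseen (state, action) pairs start at 0.0
    rng = random.Random(0)
    q_update(q, "s0", "fire", 1.0, "s1", actions)
    print(choose_action(q, "s0", actions, epsilon=0.0, rng=rng))
```

With `epsilon` set to zero the player always exploits what it has learned; a small positive value keeps it trying alternatives, which is the "try different strategies" part.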

All this is standard fare for a human player and straightforward for a computer too. What’s hard is making sense of the screen, a visual task that computers have never really taken to. (Most computer competitors play using direct inputs from the game parameters rather than from the screen.)

Mnih and co tackle this problem by first simplifying the visual problem. The Atari 2600 produces a series of frames that are each 210 x 160 pixels with a 128-colour palette. So these guys begin by converting each frame to greyscale and down-sampling it to a 110 x 84 image. This is further cropped to 84 x 84 pixels since the system requires a square input.
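The greyscale, down-sample and crop steps can be sketched in a few lines of plain Python. This is a rough illustration, not the authors' code: the luma weights are the common BT.601 convention, the nearest-neighbour sampling is a simplification, and the crop offset `top` is a made-up value standing in for whichever region covers the playing area.

```python
def to_grey(pixel):
    """Collapse an (r, g, b) pixel to one brightness value (BT.601 weights, an assumption)."""
    r, g, b = pixel
    return 0.299 * r + 0.587 * g + 0.114 * b


def preprocess(frame, out_h=110, out_w=84, crop=84, top=18):
    """Greyscale a frame, shrink it to out_h x out_w, then keep a crop x out_w band."""
    in_h, in_w = len(frame), len(frame[0])
    # Nearest-neighbour down-sampling: pick evenly spaced source rows and columns.
    rows = [i * in_h // out_h for i in range(out_h)]
    cols = [j * in_w // out_w for j in range(out_w)]
    small = [[to_grey(frame[r][c]) for c in cols] for r in rows]
    # Keep `crop` rows starting at the hypothetical offset `top`.
    return small[top:top + crop]


if __name__ == "__main__":
    # A blank white frame at the Atari's 210 x 160 resolution.
    frame = [[(255, 255, 255)] * 160 for _ in range(210)]
    square = preprocess(frame)
    print(len(square), len(square[0]))
```

The result is a square grid of brightness values, a far smaller input than the original colour frame.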

The neural network works by evaluating each image and assessing how it will change given any of the possible actions. It makes this assessment based on its experience of the past (although Mnih and co are tight-lipped about the secret sauce they use to achieve this).

Significantly, the computer has no advance knowledge of what the screen means. “Our agents only receive the raw RGB screenshots as input and must learn to detect objects on their own,” say Mnih and co.

The results are impressive. The neural network not only learns how to play all of the games but becomes pretty good at most of them too. “Our method achieves better performance than an expert human player on Breakout, Enduro and Pong and it achieves close to human performance on Beam Rider,” say Mnih and co.

What’s more, they achieved this performance without any special fine-tuning for any of the games played.

That’s significantly better than other attempts to build AI systems that can take on human players in these kinds of games. It’s also a serious threat to human domination of the video game world.

There is one crumb of hope for human gamers. The neural net cannot yet beat human experts at Q*bert, Seaquest and, most important of all, Space Invaders. So we have a few years yet before computer domination is total.

Ref: arxiv.org/abs/1312.5602 : Playing Atari with Deep Reinforcement Learning

This story was originally published under the title “Neural Net Learns Breakout Then Thrashes Human Gamers”
