In December 2017, London-based AI research company DeepMind created a sensation in the chess world by releasing 10 games of a 100-game match between its AI algorithm AlphaZero and Stockfish 8, the hitherto strongest chess-playing computer in the world.
A year later, the sensation became still greater when – as part of a peer-reviewed paper published in Science magazine – DeepMind released 210 games of a follow-up 1000-game match which AlphaZero had won by the princely score of 155 wins to 6 losses. Since IBM’s Deep Blue defeated World Champion Garry Kasparov in a match in 1997, the world’s best chess players have become accustomed to playing second fiddle to chess computers. Indeed, grandmasters invariably check their games and analysis with chess computers before and after their games. So why did AlphaZero cause such a stir?
Let’s start by looking at how traditional chess computers work. They search a wide range of possibilities to an unimaginable depth in every position. A chess computer like Stockfish analyses roughly 60 million moves a second (humans manage 4-5 moves a second!). This search is augmented by sophisticated heuristics, distilled from chess grandmaster knowledge, and Stockfish evaluates a couple of thousand chess factors in taking its decisions. Despite being explicitly programmed by humans, the play of such chess computers is somewhat inhuman – ugly, you might say – since their strength is derived from qualities which humans do not possess, namely relentlessly accurate, lightning-fast calculation. This strength is particularly potent in the defence of difficult positions.
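For readers who like to see the idea in code, here is a minimal sketch of the "search plus handcrafted evaluation" recipe described above. It is not Stockfish's code: it assumes a toy game (players alternately remove 1 to 3 stones, and whoever takes the last stone wins), and the names `search` and `handcrafted_eval` are purely illustrative. The search is a standard negamax with alpha-beta pruning that falls back on a human-written rule of thumb when it runs out of depth.

```python
from typing import List, Optional, Tuple

MAX_TAKE = 3  # toy rule (an assumption): remove 1 to 3 stones; taking the last stone wins


def legal_moves(stones: int) -> List[int]:
    """All stone counts the side to move may remove."""
    return [take for take in range(1, MAX_TAKE + 1) if take <= stones]


def handcrafted_eval(stones: int) -> float:
    """Human-supplied rule of thumb, scored from the side to move's point of view."""
    # Known pattern for this toy game: facing a multiple of 4 is bad news.
    return -0.5 if stones % (MAX_TAKE + 1) == 0 else 0.5


def search(
    stones: int, depth: int, alpha: float = -1.0, beta: float = 1.0
) -> Tuple[float, Optional[int], int]:
    """Negamax with alpha-beta pruning.

    Returns (score for the side to move, best move, positions visited).
    """
    if stones == 0:
        return -1.0, None, 1  # the opponent took the last stone: side to move has lost
    if depth == 0:
        return handcrafted_eval(stones), None, 1  # out of depth: trust the heuristic
    best_score, best_move, visited = -2.0, None, 1
    for take in legal_moves(stones):
        child_score, _, child_visited = search(stones - take, depth - 1, -beta, -alpha)
        visited += child_visited
        score = -child_score  # what is good for the opponent is bad for us
        if score > best_score:
            best_score, best_move = score, take
        alpha = max(alpha, score)
        if alpha >= beta:
            break  # the remaining moves cannot change the result
    return best_score, best_move, visited


if __name__ == "__main__":
    score, move, visited = search(stones=10, depth=10)
    print(f"best move: take {move}, score {score}, positions visited {visited}")
```

A real engine applies the same recipe to chess positions, with a vastly more elaborate evaluation and many more pruning tricks; the toy only shows how calculation and human-written heuristics fit together.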
AlphaZero shocked the chess world not only through its strength, but through its style of play. We were back in the Romantic era of the 1850s or once again watching Kasparov at his peak in the pre-computer age, boldly sacrificing material without inhibition to hunt the opponent’s king. It’s a style that humans intuitively understand and are inspired by. The paradox is that AlphaZero developed this human-like style without human input! AlphaZero is a self-learning algorithm; taught only the rules of chess, it developed its chess strategies through 44 million lightning-fast self-play games. In the 8 hours AlphaZero took to reach superhuman level, it essentially rediscovered – and often improved on – 300 years of human endeavour. Even more impressively, the algorithm is a general one that can be applied to any 2-player perfect information game. Indeed, AlphaZero performed similar feats in the other classic games of Go and shogi (Japanese chess).
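To make "taught only the rules" a little more concrete, here is a deliberately tiny sketch of learning by self-play. It is not AlphaZero's algorithm, which pairs a deep neural network with Monte Carlo tree search; as a stand-in it uses a simple lookup table and a toy game (players alternately remove 1 to 3 stones, and whoever takes the last stone wins). The function names and the parameters `games`, `max_start` and `explore` are illustrative assumptions. The program is given nothing but the legal moves and the winning condition, plays against itself, and gradually works out which moves win.

```python
import random
from typing import Dict, List, Tuple

MAX_TAKE = 3  # toy rule (an assumption): remove 1 to 3 stones; taking the last stone wins


def legal_moves(stones: int) -> List[int]:
    """All stone counts the side to move may remove."""
    return [take for take in range(1, MAX_TAKE + 1) if take <= stones]


def train_by_self_play(
    games: int = 20000, max_start: int = 12, explore: float = 0.3, seed: int = 0
) -> Dict[Tuple[int, int], float]:
    """Learn a value for every (stones, take) pair purely from self-play."""
    rng = random.Random(seed)
    value: Dict[Tuple[int, int], float] = {}  # unseen moves count as 0.0

    def best_reply_value(stones: int) -> float:
        return max(value.get((stones, take), 0.0) for take in legal_moves(stones))

    for _ in range(games):
        stones = rng.randint(1, max_start)  # vary the starting position
        while stones > 0:
            moves = legal_moves(stones)
            if rng.random() < explore:
                take = rng.choice(moves)  # occasionally try something new
            else:
                take = max(moves, key=lambda t: value.get((stones, t), 0.0))
            remaining = stones - take
            # Taking the last stone wins (+1); otherwise a move is worth the
            # negative of the opponent's best known reply.
            value[(stones, take)] = 1.0 if remaining == 0 else -best_reply_value(remaining)
            stones = remaining  # the same learner now moves for the other side
    return value


def learned_move(value: Dict[Tuple[int, int], float], stones: int) -> int:
    """The move the trained table prefers in a given position."""
    return max(legal_moves(stones), key=lambda t: value.get((stones, t), 0.0))


if __name__ == "__main__":
    table = train_by_self_play()
    for stones in (5, 6, 7, 10):
        print(f"{stones} stones: take {learned_move(table, stones)}")
```

Nobody tells the program that leaving the opponent a multiple of four stones is the winning idea; it finds that pattern through its own games, which is the spirit, if not the scale, of what AlphaZero did for chess.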
Why did AlphaZero succeed against Stockfish where humans failed? AlphaZero certainly didn’t do it by calculating more: at roughly 60,000 moves a second, it searches about a thousand times fewer positions than Stockfish does with its 60 million. AlphaZero’s neural net architecture – inspired by the human brain – seemed to combine the human ability to visualise a complicated long-term scenario with the machine-like strengths of accurate and consistent calculation. A great human player can play like this maybe twice in 100 games; AlphaZero was managing it the other 98!
These games gave professional players a great deal of food for thought. The chess computer which they had used extensively both to prepare their openings and verify their games had been heavily defeated. Moreover, the manner in which these games had been won suggested that Stockfish’s evaluation of strategically complicated, materially unbalanced situations with conflicts all over the board – “fuzzy” positions, you might say, in which the exact evaluation is impossible to calculate (even for Stockfish) and will only become clear in 20 or 30 moves – was rather less stellar than the rest of its amazing play. The open-source project Leela Chess Zero – heavily indebted to the DeepMind scientific paper of 2018 – brought AlphaZero-like thinking to the grandmasters’ personal computers, allowing them to test out ideas in their own laboratories.
It may be coincidence, but recent top-level tournaments have witnessed swathes of powerful, attacking games for which the description “AlphaZero-like” often seems like the only suitable one! Most impressively, World Champion Magnus Carlsen, who described AlphaZero’s games as “quite inspirational” and – perhaps slightly tongue-in-cheek – called AlphaZero one of his “heroes”, has started on an amazing run of form while frequently employing AlphaZero’s favourite strategies, most notably the early push of a rook’s pawn as close as possible to the opponent’s king. The early use of rook’s pawns was known from the games of the Danish player Bent Larsen, one of the great players of the 1970s. However, Larsen was considered a maverick (a Russian coach once told me that Larsen was regarded as a coffee-house player – in other words, not a serious player), so the technique remained pretty much his personal speciality… until now! The early advance of the rook’s pawn is just one of the positional themes examined in our book, Game Changer, which shows how AlphaZero’s self-learned techniques can be used to improve humans’ play.
We’ve shared our enthusiasm as chess players for AlphaZero’s play, but we mustn’t forget that DeepMind’s goal wasn’t to create the world’s strongest chess/Go/shogi-playing machine. In fact, the goal was to demonstrate the power of a generalisable algorithm that – given only minimal human input – could achieve a superhuman level through a self-learning process. Before we saw AlphaZero’s games, both Natasha and I had considered that the current traditional chess computers were close to optimal play; it was astonishing to see the new exploration space that AlphaZero discovered within chess. It felt as if, thanks to AlphaZero, the chess board had got bigger! This feeling convinces us that AlphaZero-like techniques could make a difference in real-world scientific problems in areas such as medicine or climate change. For example, the DeepMind algorithm AlphaFold recently placed clear first in the prestigious CASP competition, a global competition that assesses techniques for predicting protein structure. Scientists believe that the ability to predict a protein’s shape is fundamental to understanding its role within the body as well as providing insights into diseases such as Alzheimer’s. If AlphaZero-like techniques fulfil this promise, then it’s gratifying to think that all the hours we have wasted on our beautiful, unendingly complex game might in the end contribute to realising some of humanity’s greatest future discoveries!
Game Changer: AlphaZero’s Groundbreaking Chess Strategies and the Promise of AI by Matthew Sadler and Natasha Regan is out now.