The Machines That Play series has been broken into 7 parts. This is Part 6 of the series.
This series covers the history of Artificial Intelligence and games (until Deep Blue) and focuses on machines that played chess, checkers, and backgammon. The following topics are covered: how to build chess machines, Shannon's work on chess, Turing's work on chess, The Turk, El Ajedrecista, MANIAC, Bernstein chess program, Samuel's checkers, Mac Hack VI, Cray Blitz, BKG, HiTech, Chinook, Deep Thought, TD-Gammon, and Deep Blue.
Part 1: Machines That Play (Overview)
Part 2: Machines That Play (Building Chess Machines)
Part 3: Machines That Play (Chess: Before Deep Blue)
Part 4: Machines That Play (Deep Blue)
Part 5: Machines That Play (Post Deep Blue)
If you want a summary of the first 5 parts, focusing on the human elements, go here.
Part 6: Machines That Play (Checkers), this one
Part 7: Machines That Play (Backgammon)
Part 6: Machines That Play (Checkers)
This part will cover Samuel's checkers (1950s) and Jonathan Schaeffer's Chinook (1990s). Samuel's checkers was the first program to learn (before the term AI was coined). Chinook was the first computer program to win a world championship title against humans. Schaeffer also solved checkers in 2007; that part is coming soon.
Game complexity
Before we begin, let's look at some ways to measure game complexity.
The state-space complexity of a game is the number of legal game positions reachable from the initial position of the game.
The game tree size is the total number of possible games that can be played: the number of leaf nodes in the game tree rooted at the game's initial position.
The branching factor is the number of children at each node. For example, in chess, if a "node" is taken to be a legal position, then the average branching factor is estimated to be about 35. This means that, on average, a player has about 35 legal moves available at each turn. By comparison, the average branching factor for the game Go is 250!
Optimal status: it is not possible to perform better (some of these games were solved by humans)
Super-human: performs better than all humans
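To get a feel for how fast game trees grow with the branching factor, here is a back-of-the-envelope sketch in Python using the rough branching factors quoted above (the depths are arbitrary, chosen only for illustration):

    # A full search to depth d visits roughly b**d positions,
    # where b is the average branching factor.
    for b, game in [(35, "chess"), (250, "Go")]:
        for d in (2, 4, 8):
            print(f"{game}: ~{b**d:.1e} positions at depth {d}")

At depth 8 this already gives about 2.3e+12 positions for chess and 1.5e+19 for Go, which is why exhaustive search is hopeless and pruning is essential.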
Minimax can be traced to a 1913 paper by Ernst Zermelo, the father of modern set theory. The paper contained several errors and did not describe minimax correctly. He did, however, propose (but did not prove) what became known as Zermelo's Theorem: in any finite deterministic (i.e. one in which chance does not affect the decision-making process) perfect-information two-player game (such as chess) there are three possibilities: either the first player can force a win, or the second player can force a win, or both players can force a draw. When applied to chess, Zermelo's Theorem states "either White can force a win, or Black can force a win, or both sides can force at least a draw". We don't (yet) know which it is for chess. (Checkers, on the other hand, has been solved: either player can force a draw.)
Improving the minimax procedure
Von Neumann improved Zermelo's minimax theorem to include games involving imperfect information and games with more than two players. In 1944, John von Neumann and Oskar Morgenstern published Theory of Games and Economic Behavior, the groundbreaking book that created the research field of game theory. Originally, game theory addressed zero-sum games, but since then it has been applied to a wide range of behavioral relations and has become a framework for modeling scenarios in which conflicts of interest exist among the participants.
What does it mean to say a game has perfect information? Each player has complete knowledge of the game state. In two-player games, the players take alternate turns. Examples: Chess, Checkers, Connect Four, Go, Othello
What does it mean to say a game has imperfect information? Some of the game state is hidden from the players. Examples: Poker, Bridge
What does it mean to say a game is not deterministic? The game involves an element of chance. Example: Backgammon (it involves dice rolls)
Now, let's talk about the machines.
AI before AI (1952–1962)
Samuel's checkers
Game: Checkers
IBM 701: the first major commercial computer
In 1949, Arthur Samuel joined IBM's Poughkeepsie lab, and in 1952 he programmed IBM's first major computer, the IBM 701, to play checkers. It was the first game-playing program to run on a computer, and the first checkers program.
On February 24, 1956, Arthur Samuel demonstrated the program and played a game of checkers on television. Before the demonstration, Thomas J. Watson, Sr., the founder and president of IBM, predicted that the demonstration would make a strong impression and raise the price of IBM stock by 15 points. It did!
Arthur Samuel playing checkers on IBM 701
[Side note. See the videos titled The Computer Pioneers: The Development of the IBM 701: "The Computer Pioneers is a video oral history project produced by Richard Jay Solomon. This segment of the series discusses the development of the IBM 701 model computer, also known as the Defense Calculator, in the early 1950s. These interviews were conducted on July 12th, 1983 and feature several members of the IBM 701's development team including Jerrier Haddad, Clarence Frizzell, Nathan Rochester, and Richard Whalen."]
Samuel's checkers: The first machine to learn
By 1955, Samuel had done something groundbreaking; he had created a program that could learn, something that no one had done before, and this was demonstrated on television in 1956.
Samuel had been thinking about machine learning since he joined IBM and wanted to focus on building programs that could learn to play the game of checkers. In 1959 he coined the term machine learning as the "field of study that gives computers the ability to learn without being explicitly programmed". He later said, "I became so intrigued with this general problem of writing a program that would appear to exhibit intelligence that it was to occupy my thoughts during almost every free moment for the entire duration of my employment by IBM and indeed for some years beyond."
He published a seminal paper in 1959, titled Some Studies in Machine Learning Using the Game of Checkers, where he described how a machine could look ahead "by evaluating the resulting board positions much as a human player might do". The program started out losing to Samuel and eventually beat him.
He had programmed the "digital computer to behave in a way which, if done by human beings or animals, would be described as involving the process of learning."
In fact, the program had beaten its creator, and it had done that by learning in a relatively short amount of time: "a computer can be programmed so that it will learn to play a better game of checkers than can be played by the person who wrote the program. Furthermore, it can learn to do this in a remarkably short period of time (8 or 10 hours of machine-playing time)."
Imagine this:
8AM: Your computer program isn't any good at checkers. You teach it how to play.
9AM: You leave it alone and let it play by itself the entire day.
It plays against itself and learns from its mistakes.
6PM: You come back and play it. It beats you.
And that was in the 1950s, with the slow computers of that time. In some sense, this is the earliest glimpse of what lies ahead in games: no matter what the rate of improvement is for humans, once machines learn to learn, it will be hard to keep up with machines, whose progress will end up being measured exponentially. And ours won't.
Insanely Immense Search Trees (solution: alpha-beta pruning)
We know from Shannon's work, described in Machines That Play (Chess), that the main driver of a game-playing machine was a search tree over the set of all possible board positions reachable from the current state.
Checkers is extremely complex: it has roughly 500 billion billion possible positions (5 × 10²⁰), compared to chess (about 10⁴⁷) or Go (about 10²⁵⁰). Even though checkers is simpler than those games, it is still complex enough that a brute-force-only approach is impractical.
In his paper, Samuel said that "at our present stage of knowledge, the only practical approach, even with the help of the digital computer, will be through the development of heuristics which tend to ape human behavior."
Samuel's program was based on Shannon's minimax strategy to find the best move from a given current position.
Recall that in two-player games, the minimax algorithm determines the best move. Minimax is a recursive algorithm, and it chooses an optimal move based on the assumption that both players play optimally. Simply put, it tries to model what we do when we play: we think "If I make this move, then my opponent can only make these moves, then I will make this move, ..."
From Samuel's paper explaining minimax
The problem with a full minimax search is that it explores all parts of the tree, including the parts it doesn't need to. In other words, given a board position, human experts tend to "know" that some moves are irrelevant and some moves are good. They don't explore all moves; they prune out the irrelevant or bad ones up front.
Minimax example (full search): Circles represent the moves of the maximizing player, and squares represent the moves of the opponent (minimizing player). The values inside the circles and squares are the values computed by the minimax algorithm. The arrows mark the chosen moves, and the numbers on the left give the tree depth.
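To make the recursion concrete, here is a minimal minimax sketch in Python. The game interface (legal_moves, apply, is_terminal, evaluate) is hypothetical shorthand for illustration, not Samuel's actual code:

    def minimax(game, state, depth, maximizing):
        # Return the minimax value of state, searching depth plies ahead.
        if depth == 0 or game.is_terminal(state):
            return game.evaluate(state)  # heuristic score, from max player's view
        values = (minimax(game, game.apply(state, m), depth - 1, not maximizing)
                  for m in game.legal_moves(state))
        # The maximizing player takes the best child; the minimizer, the worst.
        return max(values) if maximizing else min(values)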
The rules of thumb that human experts use to prune out bad moves are called "heuristics". In games, heuristics are implemented to cut down on computation time. An (easy) example of a heuristic in chess: if a move leads to the player's king being in checkmate, then the algorithm should not look further down that path, because a player will never want to make that move. There is no point in exploring that branch; it is a waste of precious computational resources. Simple minimax search will still explore that path; alpha-beta pruning will not.
To attack the complexity problem, Samuel implemented alpha-beta pruning, and instead of searching each path to the game's end, he developed a scoring function based on the position of the board at any given time.
Alpha-beta pruning is an optimization of minimax. At a given node n, if a player has a better choice a (at a node further up the tree), then n will never be reached in actual play; by looking at just some successors of n, we can know enough about n to prune it out. The algorithm maintains two values, alpha and beta: alpha is the minimum score that the maximizing player is guaranteed, and beta is the maximum score that the minimizing player is guaranteed. Both players start with their worst possible score: alpha is negative infinity and beta is positive infinity. So when do we prune? Each time beta (the maximum score the minimizing player is guaranteed) becomes less than or equal to alpha (the minimum score the maximizing player is guaranteed), the remaining children of that node can be pruned. The chart below explains this:
Alpha-beta pruning example: There is no need to explore any of the paths whose edges are crossed out, since other moves (that will perform better) have already been found
The alpha-beta algorithm produces the same result as the minimax algorithm, but is more efficient because entire branches can be eliminated from the search process. This allows the program to evaluate the minimax search tree much deeper, while using the same resources. This combination of the minimax algorithm and alpha-beta pruning significantly reduced the total number of branches of the tree, which made it feasible to play games on a computer. Here's the (commonly referenced) basic idea:
"If you have an idea that is surely bad, don't take the time to see how truly awful it is." (Pat Winston)
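Here is the same search with alpha-beta cutoffs, sketched in Python over the same hypothetical game interface as the minimax example above. The break statements are the pruning: they abandon exactly the kind of crossed-out branches shown in the figure.

    def alphabeta(game, state, depth, alpha, beta, maximizing):
        # Computes the same value as minimax, but skips branches that
        # cannot affect the final decision.
        if depth == 0 or game.is_terminal(state):
            return game.evaluate(state)
        if maximizing:
            value = float("-inf")
            for m in game.legal_moves(state):
                value = max(value, alphabeta(game, game.apply(state, m),
                                             depth - 1, alpha, beta, False))
                alpha = max(alpha, value)  # best score max player can force so far
                if beta <= alpha:
                    break  # min player would never let play reach this node
            return value
        value = float("inf")
        for m in game.legal_moves(state):
            value = min(value, alphabeta(game, game.apply(state, m),
                                         depth - 1, alpha, beta, True))
            beta = min(beta, value)  # best score min player can force so far
            if beta <= alpha:
                break  # max player already has a better option elsewhere
        return value

The search starts from the worst case for both players: alphabeta(game, start, depth, float("-inf"), float("inf"), True).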
Introducing Machine Learning
In addition to minimax and alpha-beta pruning, Samuel used a fundamentally new and important component: learning. He used two main learning methods to create the first competent AI program: a) rote learning and b) learning by generalization.
Rote learning meant the program would save all of the board positions encountered during play, along with their scores. Then, when those positions needed to be evaluated again further down the tree, their scores could simply be looked up instead of recomputed, saving computing time. Even though this wasn't an advanced form of learning, the idea was for the program to use the saved time to search deeper. Rote learning produced slow but continuous improvement, which was most effective for openings and endgames.
From Samuel's paper explaining rote learning
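In modern terms, rote learning behaves like a cache of evaluated positions. Here is a sketch under that reading; the key() encoding and the cache-everything policy are illustrative assumptions, not Samuel's tape-based bookkeeping:

    position_cache = {}  # canonical board encoding -> previously computed score

    def evaluate_with_rote(game, state, depth):
        key = game.key(state)  # hypothetical canonical encoding of the board
        if key not in position_cache:
            # Unseen position: pay the full search cost once, then remember it.
            position_cache[key] = alphabeta(game, state, depth,
                                            float("-inf"), float("inf"), True)
        return position_cache[key]  # seen before: a free lookup

The time saved on repeated positions can then be spent searching deeper, which is why the effect showed up most strongly in openings and endgames, where the same positions recur across many games.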
Learning by generalization meant modifying the evaluation function based on previous games played by the program. This was groundbreaking because it showed that a program could learn by playing against previous versions of itself (which would be a key aspect of AlphaGo). This learning method came closest to the reinforcement learning technique later dubbed temporal-difference learning (TD-Lambda): the value of a state should equal the value of the states likely to follow it. (Side note: Samuel's method was conceptually the same as that used much later by Tesauro in TD-Gammon.) The version using generalized learning was able to develop a good middle game but remained weak in opening and endgame play. Later in the series, we will see how some of the most successful programs attacked these issues.
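Read through that modern lens, learning by generalization can be sketched as a linear evaluation function whose weights are nudged so that a position's score agrees with the score of the position that follows it. The features and learning rate below are illustrative assumptions; Samuel's actual update scheme differed in its details:

    # Hypothetical hand-crafted checkers features: piece advantage,
    # king advantage, mobility, center control.
    def features(state):
        return [state["piece_adv"], state["king_adv"],
                state["mobility"], state["center"]]

    weights = [0.0, 0.0, 0.0, 0.0]

    def evaluate(state):
        # Linear evaluation: a weighted sum of board features.
        return sum(w * f for w, f in zip(weights, features(state)))

    def td_update(state, next_state, lr=0.01):
        # Nudge evaluate(state) toward evaluate(next_state): a state's value
        # should agree with the value of the state it leads to.
        error = evaluate(next_state) - evaluate(state)
        for i, f in enumerate(features(state)):
            weights[i] += lr * error * f

Playing the program against itself generates an endless stream of (state, next_state) pairs to feed td_update, which is what makes self-play such a cheap source of training data.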
Samuel did his work despite the poor computational power of that period, while working essentially alone and doing his own programming. As he noted, the principles of machine learning verified by these experiments "are, of course, applicable to many other situations." Today there is no field that is untouched by machine learning.
Successive enhancements made by Samuel allowed the computer to improve and achieve good skill: it eventually beat Samuel, but it still could not beat the experts consistently.
It did defeat an expert player, Robert Nealy. However, there is controversy about whether Nealy lost a game in which a draw was available (i.e. he played poorly) or whether Samuel's program won outright (i.e. it played well).
The next year, Nealy won a six-game rematch 5–1.
Samuel's Checkers Program and Robert Nealy
Samuel's checkers demonstrated that a computer could be programmed to learn to play a game better than its creator. It could improve its performance by doing something that humans could not: playing thousands of games against itself for practice.
This was a monumental achievement for its day. But Samuel himself had no illusions about the strength of his AI program. Even though it is a milestone in AI, it was blown out of proportion by the media. Some said, "This work led to the false impression that checkers was a 'solved' game. As a result, researchers moved on to chess, and largely ignored checkers for over 25 years."
Samuel's work, while groundbreaking on its own, may have restricted research into checkers until 1989, when Jonathan Schaeffer began working on Chinook.
In 1990, Schaeffer reached out to Samuel to tell him about his program's massive improvements, but Samuel had just died. It was the end of one of the field's founders, the one who gave us the very first glimpse of machine intelligence.
Miles ahead of the strongest human players
Chinook (1989–1996)
Game: Checkers
Chinook meets human: Take 1
Is he really human or is he superhuman?
The year was 1991. Marion Tinsley had agreed to a friendly checkers match against Chinook. Tinsley had been the world's best checkers player for 40 years. It was said that he was "to checkers what Leonardo da Vinci was to science, what Michelangelo was to art and what Beethoven was to music."
After Samuel's work on checkers, there was a false impression that checkers was a "solved" game. As a result, researchers moved on to chess and mostly ignored checkers until Jonathan Schaeffer began working on Chinook in 1989. Schaeffer's goal was to develop a program capable of defeating the best checkers player.
Back to 1991: Chinook was playing the best player ever, Tinsley. The first nine games they played were all draws. In the tenth game, Chinook made a move that it thought was slightly advantageous. Seeing this, Tinsley remarked, "You're going to regret that." In an interview Schaeffer said, "And at the time, I was thinking, what the heck does he know, what could possibly go wrong?" It turned out that Tinsley began to pull ahead, and Chinook resigned. Schaeffer said (about Tinsley), "In his notes to the game, he later wrote that he had seen all the way to the end of the game and he knew he was going to win."
After the game, when Schaeffer looked back into the database, he discovered that from that move to the end of the game, if both sides played perfectly, he would lose every time. But to see that, a computer or a human would have to look 64 moves ahead. Tinsley seemed to have picked the only strategy that could have defeated Chinook from that point: he was able to see the win 64 moves ahead.
"I was absolutely stunned," Schaeffer said.
"How do you compete with somebody whose understanding of the game is so deep that he immediately knows through experience or knowledge or doing some amazing search that he was gonna win that position?"
Chinook meets human: Take 2
It's only a matter of time now
Chinook and Marion Tinsley met again in 1992 at the Man-Machine World Championship. It was the first time in history a human was playing a computer for a world championship title.
Chinook team (August 1992). From left to right: Duane Szafron, Joe Culberson, Paul Lu, Brent Knight, Jonathan Schaeffer, Rob Lake, and Steve Sutphen. Our checkers expert, Norman Treloar, is missing
Because of Tinsley's greatness, most players would play cautiously against him, hoping for a draw. Chinook, however, played a very different game. Regarding how Chinook played, Schaeffer said, "It played brash, aggressive moves; it walked on the edge of a precipice... It would do things people looked at and said, 'Man, is that program crazy?'"
Even though Chinook lost the match against Tinsley, it came close to defeating him. It played a stunning eighth game and won; it was Tinsley's sixth loss in 40 years.
According to The Atlantic, Schaeffer felt sad about Tinsley's loss and later wrote in his book:
"We're still members of the human race and Chinook defeating Tinsley in a single game means that it will only be a matter of time before computers will be supreme in checkers, and eventually in other games like chess."
But it wasn't just one single game. Chinook won again in game 16. No living player had defeated Tinsley more than once. Tinsley had lost only seven games in his entire 45-year career, two of them to Chinook.
Finally, a machine was becoming more perfect.
But Chinook had some kind of an error, which forced Schaeffer to resign the game. Schaeffer was devastated.
Chinook meets human: Take 3
Losing his divine backing
In 1994, Chinook, after having beaten another player, was ready to face Tinsley again. Scientific American reported, "The night before the match, ...Tinsley dreamt that God spoke to him and said, 'I like Jonathan, too,' which had led him to believe that he might have lost exclusive divine backing."
Chinook hadn't lost a game in its last 125 games, and since 1992, Schaeffer's team had spent thousands of hours improving Chinook.
Leading up to the match, Tinsley's stomach had been hurting. It hurt again the day of the match. After six games, all of which were draws, he needed to see a doctor, so Schaeffer took him to the hospital. The next day Tinsley was told there was a lump on his pancreas: he had pancreatic cancer. He withdrew.
Don Lafferty, rated the number two player in the world at the time, replaced Tinsley and played Chinook. Chinook won and became the first computer program in history to win a human world championship. Seven months later, Tinsley died.
Schaeffer's program never got to beat the best checkers player ever, which was why he had started it all in 1989; the best human player never truly lost a match to Chinook. Checkers programs had come a long way since the late 1980s, and Chinook played its last tournament in 1996, finishing "miles ahead of the strongest human players in the U.S. Championship."
But that wasnât enough for Schaeffer. He needed to make sure his program could beat the best player ever.
Humans are only almost perfect. "But my computer program is perfect."
From 1997 to 2001, Schaeffer suspended Chinook; he then resumed working on solving the game of checkers, which meant his program would always know the right move. It would be perfect.
In an interview with Scientific American, Schaeffer said, "From the end of the Tinsley saga in '94–'95 until 2007, I worked obsessively on building a perfect checkers program. The reason was simple: I wanted to get rid of the ghost of Marion Tinsley. People said to me, 'You could never have beaten Tinsley because he was perfect.'
"Well, yes, we would have beaten Tinsley because he was only almost perfect. But my computer program is perfect."
In 2007 Schaeffer and his team published a paper in the journal Science titled Checkers Is Solved: the program could no longer be defeated by anyone, human or otherwise.
"Eighteen years later, I'm finally done," Schaeffer said.
Solving Checkers (2007)
Game: Checkers
In 2007, the makers of Chinook published a paper in the journal Science announcing that Chinook had completely solved checkers: the program could no longer be defeated by anyone, human or otherwise. Jonathan Schaeffer and his team had been working on the problem since 1989. The article stated, "...checkers is now solved: Perfect play by both sides leads to a draw. This is the most challenging popular game to be solved to date, roughly one million times as complex as Connect Four." Checkers is the largest game that has been solved to date, with a search space of 5×10²⁰. "The number of calculations involved was 10¹⁴, which were done over a period of 18 years. The process involved from 200 desktop computers at its peak down to around 50."
Coming Soon