MENACE

The idea of building a machine that could learn to play tic-tac-toe came to Donald Michie in the 1960s. Since he had no computer available at the time, he realized his idea with 304 matchboxes and a large supply of beads. He named his machine MENACE (Matchbox Educable Noughts And Crosses Engine). What makes MENACE special is its learning algorithm, which lets it get better at tic-tac-toe the more it plays.

MENACE works by an early form of reinforcement learning, in which an agent (here MENACE) learns by interacting with its environment (the game of tic-tac-toe). Each matchbox stands for a board position, and the beads inside it stand for the moves available in that position. At the beginning, all of MENACE's possible moves are equally likely. The more games it plays, the more likely it becomes to choose the most promising moves. This rests on a reward system encoded in the number of beads. If MENACE's moves lead to a win, it is "rewarded" with extra beads for each move it made. If its moves lead to a loss, beads for those moves are removed as a "punishment".
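The bead mechanism described above can be sketched in a few lines of Python. This is only a minimal illustration, not a reconstruction of Michie's machine: the class name `MatchboxLearner`, its methods, the starting count of 3 beads per move, the reward of 3 beads, the penalty of 1 bead, and the rule for refilling an emptied box are all assumptions chosen for the example. Draws are simply left unchanged here.

```python
import random


class MatchboxLearner:
    """Hypothetical bead-based learner; all names and numbers are illustrative."""

    def __init__(self, initial_beads=3, reward=3, penalty=1, seed=None):
        self.initial_beads = initial_beads  # assumed starting beads per move
        self.reward = reward                # assumed beads added after a win
        self.penalty = penalty              # assumed beads removed after a loss
        self.boxes = {}                     # board state -> {move: bead count}
        self.history = []                   # (board, move) pairs of the current game
        self.rng = random.Random(seed)

    def choose_move(self, board):
        """Pick a move for `board`, a 9-character string of 'X', 'O' and ' '."""
        free = [i for i, cell in enumerate(board) if cell == " "]
        if not free:
            raise ValueError("board has no free squares")
        # Open the matchbox for this position, creating it on first visit.
        box = self.boxes.setdefault(board, {m: self.initial_beads for m in free})
        if sum(box.values()) == 0:
            # Assumption: an emptied box is refilled so a move can still be drawn.
            for m in box:
                box[m] = self.initial_beads
        moves = list(box)
        weights = [box[m] for m in moves]
        # Drawing a bead at random: more beads means a higher chance of that move.
        move = self.rng.choices(moves, weights=weights, k=1)[0]
        self.history.append((board, move))
        return move

    def finish_game(self, outcome):
        """Adjust beads for every move of the game; outcome is 'win' or 'loss'."""
        for board, move in self.history:
            box = self.boxes[board]
            if outcome == "win":
                box[move] += self.reward
            elif outcome == "loss":
                box[move] = max(0, box[move] - self.penalty)
        self.history.clear()
```

In use, `choose_move` would be called once per turn with the current board, and `finish_game("win")` or `finish_game("loss")` once at the end of the game. Because every move of the game is rewarded or punished together, moves that tend to appear in winning games gradually collect more beads and are drawn more often.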

