Minimax Game Tree Search

From Realpolitik.io
Jump to navigation Jump to search

In combinatorial game theory, the minimax algorithm provides a way to search a game tree in order to find the best move for a given player. From a given game state, the algorithm explores various lines of play, evaluates the terminal states from the perspective of each player, and then reasons backwards about which moves are best for each player who took a turn along each line. The minimax algorithm thus looks for the sequence of moves through the game tree that represents the players' best responses to each other's moves. The resulting sequence is called a principal variation and it can be visualized as the red path in the imaginary game tree below. The first step in the principal variation is the moving player's best move.

Principal variation.png

Because minimax game tree search is a well-known algorithm (see Russell and Norvig, ch. 6, which includes an example for multiplayer games), we do not explain it here but merely highlight the features that are specific to quantitative realism.

Move Order

Minimax search presumes that players move sequentially, rather than simultaneously. Simultaneous play would be a more natural choice for quantitative realism, because everyone would move, the law of motion would update agent sizes, and then everyone would move again. In contrast, sequential play introduces a fiction into the already abstract model, as we must somehow decide in what order the players are to move. Should they go "around the table," or completely at random, or with each round randomized? Should a player be able to move twice in a row? Should everyone get an equal number of turns, or should more powerful agents move more frequently? And should the law of motion update after each individual turn, or after each round?

These questions are significant because the answers can drastically change the behavior of the agents in the model. We have not yet resolved these questions, however, our current approach to simulating move sequences is as follows: Every player has an equal opportunity to move, but is chosen randomly. Within the game tree search of a particular move, a player's probability of moving is proportional to their PrinceRank, causing them to pay extra attention to the more important players. This topic is discussed in more detail here.

An alternative approach, not yet tried, would be to reduce the multi-player game to a two-player one where the focal player plays against everyone else. This so-called paranoid decision rule also uses minimax game tree search. At nodes where it is the other players' turn, the value to be minimized is the focal player's score minus the total of the other players' scores (see Sturtevant 2003, sec. 4.2).

Tree Traversal

Pruning

Tree traversal is based upon depth first search, using a single processor. The quality of the search for a principal variation depends upon the ability to prune away unpromising branches of the game tree. Game engines for two-person games, like chess, use alpha-beta pruning to reduce the size of the tree to be searched. However, alpha-beta pruning is not available for games involving three or more players. As a substitute, we use PrinceRank-based pruning: a node and its children are pruned from the game tree if the value of the node plus some delta value Δ is less than the best known minimax value for the applicable branch. The assumption is that PrinceRank is a reasonable proxy for the minimax value, and if that value is too low, it is unlikely that that node will be a best move. The value Δ is set as a global parameter. Leaf nodes are not pruned. More work is needed to refine this pruning algorithm.

Breadth

Because PrinceRank-based pruning is used, a maximum breadth can be set for each game tree node. Candidate moves are tested in order of their PrinceRank for the moving player, and a limit on the number of moves to be tested can be imposed.

Depth

The depth of the tree search is limited by the amount of time one is willing to wait for a result. Iterative deepening is not used because it degrades performance, presumably because the current implementation does not cache the game tree, which has to be re-traversed with each iteration.

Randomness and Indeterminacy

Because minimax search is based on random move orders, it is the primary source of randomness in the model. Randomization is also used within the minimax algorithm to break ties when two moves are of equal value. However, this turns out to rarely occur unless more than one agent dies, and tie-breaking situations can usually be avoided by making sure that no two agents have the same exact size when initiating a simulation.