Games, players, strategies & payoffs
A game in the sense of this volume is not a pastime; it is a model of interaction between agents whose outcomes are intertwined. The minimal data is a triple. There is a set of players \(N = \{1, 2, \ldots, n\}\). Each player \(i\) has a set of strategies \(S_i\) — the actions available to them. And each player has a payoff function \(u_i\) that assigns a real number to every combination of choices, one from each player:
For two players with finitely many actions, the game is a payoff matrix: rows are player 1's strategies, columns are player 2's, and each cell holds an ordered pair \((u_1, u_2)\). The canonical example is the Prisoner's Dilemma. Two suspects each choose to Cooperate (stay silent) or Defect (betray). Higher numbers are better; the cell entries are (row payoff, column payoff):
| row ↓ / col → | Cooperate | Defect |
|---|---|---|
| Cooperate | (2, 2) | (0, 3) |
| Defect | (3, 0) | (1, 1) |
A central modeling assumption runs through everything below: players are rational (each maximizes their own payoff) and this rationality is common knowledge (each knows the others are rational, knows that they know it, and so on). This is a strong, often unrealistic idealization — real humans deviate systematically, and behavioral game theory exists precisely to map those deviations. We will be honest about where the idealization bites. But it is the assumption that gives the predictions their teeth.
The tool that turns this raw data into prediction is the best response. Holding everyone else's choices \(s_{-i}\) fixed, player \(i\)'s best responses are the strategies that maximize their payoff:
Dominant strategies & iterated elimination
Sometimes a strategy is good no matter what anyone else does. Strategy \(s_i\) strictly dominates \(s_i'\) when it pays more against every possible choice of the opponents:
In the Prisoner's Dilemma, Defect strictly dominates Cooperate for both players: against a Cooperating opponent, \(3 > 2\); against a Defecting opponent, \(1 > 0\). Each player has a strictly dominant strategy — Defect — so the predicted outcome is mutual defection, paying \((1, 1)\). And here is the dilemma's sting: both would have preferred \((2, 2)\), but \((2, 2)\) is not stable, because each could unilaterally jump to \(3\) by defecting. Individual rationality drives the pair to a jointly worse outcome. This single fact underwrites everything from arms races to tragedy-of-the-commons depletion to why multi-agent AI systems trained only on self-interest can converge on collectively destructive policies.
Most games have no dominant strategy. But dominance still buys leverage through iterated elimination of strictly dominated strategies (IESDS): delete a dominated strategy, and in the smaller game some other strategy may now be dominated, so delete that too, and repeat. With strict dominance the order of deletion does not change the result. If the process leaves exactly one strategy per player, the game is dominance-solvable and we have a prediction without ever invoking equilibrium.
# Pure Nash equilibria of a 2x2 game by best response (EQ G1.2)
import numpy as np
# Prisoner's Dilemma, "higher is better". A = row payoffs, B = column payoffs.
A = np.array([[2, 0], # row plays Cooperate
[3, 1]]) # row plays Defect
B = np.array([[2, 3], # column payoffs, same cell layout
[0, 1]])
acts = ["Cooperate", "Defect"]
# A cell (i,j) is Nash if i maximizes row's payoff in column j
# AND j maximizes column's payoff in row i.
row_br = (A == A.max(axis=0, keepdims=True)) # best row responses per column
col_br = (B == B.max(axis=1, keepdims=True)) # best col responses per row
nash = row_br & col_br
print("row best-response mask (per column):\n", row_br.astype(int))
print("col best-response mask (per row):\n", col_br.astype(int))
for i in range(2):
for j in range(2):
if nash[i, j]:
print(f"PURE NASH -> (row={acts[i]}, col={acts[j]}), "
f"payoffs=({A[i,j]}, {B[i,j]})")
Nash equilibrium
Dominance handles only the easy games. The concept that handles all of them — John Nash's 1950 contribution, and the reason game theory became the lingua franca of economics, biology, and multi-agent AI — is a notion of mutual stability. A strategy profile \(s^\star\) is a Nash equilibrium when no single player can improve their payoff by unilaterally changing their own strategy, holding everyone else's fixed:
Read the definition as a test, not a recipe: given a candidate profile, you check it by asking each player in turn, "could you do better by switching, assuming the others stand pat?" If every answer is no, it is an equilibrium. The Prisoner's Dilemma's \((D, D)\) passes: from payoff \(1\), unilaterally cooperating drops you to \(0\). The cooperative \((C, C)\) fails: from \(2\), unilaterally defecting jumps you to \(3\).
Two cautions the textbooks insist on, and rightly. First, equilibria need not be unique — coordination games have several, and the theory alone does not say which one rational players will land on (this is the equilibrium selection problem, genuinely unsettled). Second, equilibrium is not optimality: the Prisoner's Dilemma equilibrium is Pareto-dominated by mutual cooperation — every player prefers the non-equilibrium outcome. Nash equilibrium predicts what self-interested rational agents will do, not what is collectively best. Conflating the two is the most common error in applying the concept.
Nash's theorem — proved with the Kakutani fixed-point theorem — guarantees that every finite game has at least one equilibrium, provided we allow mixed strategies (randomization over actions, §1.5). Existence is the deep result; the Prisoner's Dilemma happens to have a pure one, but games like Matching Pennies have an equilibrium only once randomization is permitted.
Zero-sum games & minimax
A two-player zero-sum game is pure conflict: whatever one player wins, the other loses, so \(u_1 + u_2 = 0\) in every cell. We can then describe the whole game by a single matrix \(A\), where \(A_{ij}\) is the row player's payoff (the column player's is \(-A_{ij}\)). The row player maximizes; the column player minimizes. This is the setting von Neumann solved in 1928, decades before the general Nash concept — and it is far better behaved.
The conservative move for the row player is to choose the strategy whose worst case is best — the maximin. The column player symmetrically picks the strategy whose worst case (from their side) is best — the minimax:
Von Neumann's Minimax Theorem is the crown jewel: in any finite two-player zero-sum game, once mixed strategies are allowed, the maximin and minimax are always equal. Their common value \(v\) is the value of the game, and the optimal mixed strategies are interchangeable and worst-case optimal:
# Solve a 2x2 zero-sum game's mixed strategy + value (EQ G1.6-G1.7)
import numpy as np
A = np.array([[ 1.0, -1.0], # Matching Pennies, row payoffs
[-1.0, 1.0]])
# First check for a pure saddle point.
maximin = A.min(axis=1).max() # best worst-case row
minimax = A.max(axis=0).min() # best worst-case column
print(f"pure maximin = {maximin:+.3f}, pure minimax = {minimax:+.3f}")
if np.isclose(maximin, minimax):
print("saddle point exists -> pure value =", maximin)
else:
a, b, c, d = A.ravel()
denom = a - b - c + d # EQ G1.7 denominator
p = (d - c) / denom # P(row plays top)
q = (d - b) / denom # P(col plays left)
v = (a*d - b*c) / denom # value of the game
print(f"no saddle -> mix needed")
print(f"row mix p* = ({p:.3f}, {1-p:.3f})")
print(f"col mix q* = ({q:.3f}, {1-q:.3f})")
print(f"value v* = {v:+.3f}")
# sanity: with these mixes the column is indifferent (both columns equal)
col_payoffs = np.array([p, 1-p]) @ A
print("row's mix makes columns equal:", np.round(col_payoffs, 6))
Mixed strategies
Matching Pennies has no pure equilibrium — any pure choice you make, the opponent can exploit. The escape is to randomize. A mixed strategy for player \(i\) is a probability distribution \(\sigma_i\) over their actions; a pure strategy is the degenerate case that puts all mass on one action. Payoffs become expected payoffs, and the strategy space is now the simplex \(\Delta(S_i)\) — for two actions, just the line segment of probabilities \((p, 1-p)\).
The key to solving for a mixed equilibrium is the indifference principle: if a player is randomizing between several actions in equilibrium, every action they put positive weight on must yield the same expected payoff. Otherwise they would shift all their probability to the better one — so the opponent must mix precisely so as to leave them indifferent. This flips the intuition inside out: your mixing probabilities are pinned down by making the other player indifferent, not yourself.
For a 2×2 zero-sum game with row-payoff matrix \(A = \begin{psmallmatrix} a & b \\ c & d \end{psmallmatrix}\) and no saddle point, the indifference conditions give closed forms. The row player mixes with probability \(p\) on the top row so that the column player's two columns pay equally; the column player mixes with \(q\) on the left column likewise:
The mixed equilibrium has a striking robustness: at \((p^\star, q^\star)\), neither player can be exploited, because each has made the other indifferent across all their options. There is nothing to grab. This is the precise sense in which a randomized strategy can be safer than any deterministic one — an idea that reappears, dressed differently, in adversarial robustness and in the stochastic policies of reinforcement learning.
One-shot games answer "what is stable?" — but life is repeated, and repetition changes everything. When the Prisoner's Dilemma is played again and again, cooperation can become rational through the shadow of the future, and entire strategies (tit-for-tat, grim trigger) emerge that have no meaning in a single round. Chapter 02 takes up repeated games, the Folk Theorem, and the cooperative side of game theory.
References
- von Neumann, J. & Morgenstern, O. (1944). Theory of Games and Economic Behavior.
- Nash, J. F. (1950). Equilibrium Points in n-Person Games.
- Nash, J. F. (1951). Non-Cooperative Games.
- von Neumann, J. (1928). Zur Theorie der Gesellschaftsspiele.
- Osborne, M. J. & Rubinstein, A. (1994). A Course in Game Theory.
- Easley, D. & Kleinberg, J. (2010). Networks, Crowds, and Markets.