The aim of this work is to provide an analytical framework for investigating the dynamics of the Ant Colony Optimization (ACO) metaheuristic, and to illustrate the applicability of this framework by comparing different encoding strategies for subset problems. ACO is an optimization technique inspired by observations of the behavior of biological ant colonies; it was introduced by Dorigo, Maniezzo and Colorni [2] and later developed into a metaheuristic for combinatorial optimization problems [1]. From a technical point of view, ACO can be considered as reinforcement learning on a graph that encodes the problem instance, the so-called construction graph, as formally defined in [3]. A solution to the given problem is encoded as a feasible walk in the construction graph. A feasible walk starts in a fixed initial node of the construction graph and has to satisfy the constraint that each node is visited at most once, i.e., already visited nodes are “tabu”. There may be additional, problem-specific rules declaring certain nodes tabu in certain circumstances. When no neighbor remains that is not tabu, the walk stops and is decoded as a solution to the given problem.

The probability pkl of going from a node k to an allowed neighbor node l is chosen proportional to the so-called pheromone τkl, a memory value storing how good the transition (k, l) has been in previous walks. Pheromone is initialized to a constant value. After the walk has finished, the pheromone is updated. There are several possible ways to do this. The “classical” update rule, which we consider in this work, is the following: First, set τkl = (1 − ρ)τkl for each edge (k, l), where ρ ∈ ]0, 1[ is the so-called evaporation factor. Then, increase the pheromone on the edges of the chosen walk x by an amount proportional to the fitness f(x) of the walk. The process of random walk construction and pheromone update is iterated.
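To make the procedure concrete, the following is a minimal sketch of one iteration of walk construction and the classical pheromone update described above. The graph representation (a neighbor dictionary and a pheromone dictionary keyed by edges), the function names, and the reinforcement constant c are illustrative assumptions, not part of the formal framework developed here.

```python
import random

def construct_walk(neighbors, tau, start):
    """Build a feasible walk: start at the initial node, never revisit a node,
    and pick each successor with probability proportional to its pheromone."""
    walk = [start]
    visited = {start}
    k = start
    while True:
        # nodes already visited are "tabu"
        allowed = [l for l in neighbors[k] if l not in visited]
        if not allowed:
            return walk  # no non-tabu neighbor left: the walk stops
        weights = [tau[(k, l)] for l in allowed]
        l = random.choices(allowed, weights=weights)[0]
        walk.append(l)
        visited.add(l)
        k = l

def update_pheromone(tau, walk, fitness, rho=0.1, c=1.0):
    """Classical update: evaporate on every edge by factor (1 - rho),
    then reinforce the edges of the chosen walk proportionally to its fitness.
    (c is an illustrative proportionality constant.)"""
    for edge in tau:
        tau[edge] *= (1.0 - rho)
    for k, l in zip(walk, walk[1:]):
        tau[(k, l)] += c * fitness
```

Iterating these two steps (possibly with several walks per round, as discussed next) reproduces the basic ACO loop; the decoding of the finished walk into a problem solution is problem-specific and omitted here.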
Instead of a single walk (“1 ant”), s > 1 walks may also be constructed, simultaneously or sequentially (“s ants”), in each iteration (“round”). In general, the procedure above need not converge. For an alternative pheromone update rule, where only the best walk seen so far is reinforced (“Global Best”), convergence results have been obtained ([3] – [5]). However, in theoretical research, the behavior of the ACO procedure under update rules other than Global Best is still largely unexplored. A notable exception is the article by Merkle and Middendorf [6], who analyze the ACO dynamics for a specific problem under an update rule where the best walk of each round is reinforced.