Environments that are mazes can be modeled as graphs: every position is a vertex (node, state), and every possible move is a directed edge (an arrow).

In mazes of the WOOD1 type there are theoretically 9 possible moves. We simplify the model further by considering only two kinds of positions: free space '.' and food 'F'.
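As an illustration of the graph model, the following minimal Python sketch turns a small maze into a dictionary that maps each position to the positions reachable in one move. The 3x3 layout, the names `MAZE`, `MOVES` and `build_graph`, the restriction to the eight neighboring cells, and the absence of wrap-around at the borders are all assumptions made for this example; if the ninth move is meant to be 'stay in place', `(0, 0)` could be added to `MOVES`.

```python
# Hypothetical 3x3 maze: '.' is free space, 'F' is food.
MAZE = [
    "...",
    "...",
    "..F",
]

# Assumed move set: the eight neighboring cells (row offset, column offset).
MOVES = [(dr, dc) for dr in (-1, 0, 1) for dc in (-1, 0, 1) if (dr, dc) != (0, 0)]


def build_graph(maze, moves=MOVES):
    """Map each position (row, col) to the list of positions reachable in one move."""
    rows, cols = len(maze), len(maze[0])
    graph = {}
    for r in range(rows):
        for c in range(cols):
            # Keep only moves that stay inside the maze (no wrap-around assumed).
            graph[(r, c)] = [
                (r + dr, c + dc)
                for dr, dc in moves
                if 0 <= r + dr < rows and 0 <= c + dc < cols
            ]
    return graph


graph = build_graph(MAZE)
```

In this representation each dictionary key is a vertex and each entry in its list is one directed edge, so a corner position has three outgoing edges while the center position has eight.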

The experimental setup measures the number of steps from the start position to the food. Every learning system should perform better than 'pure chance'. To compare a learning system with a random one, we use a random system A0 as the point of reference. This gives us 'empirical' results. But even for the random system we have to clarify the theoretical framework for computing the 'probable' outcomes of random walks.
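The empirical side of such a baseline can be sketched as a simulation: a walker picks one of the available moves uniformly at random until it reaches the food, and the step counts of many runs are averaged. This is only one possible reading of a random system like A0. The maze, the start position `(0, 0)`, the eight-neighbor move set, the step limit `max_steps`, and the function name `random_walk_steps` are assumptions chosen for illustration.

```python
import random

# Hypothetical 3x3 maze: '.' is free space, 'F' is food.
MAZE = ["...", "...", "..F"]

# Assumed move set: the eight neighboring cells (row offset, column offset).
MOVES = [(dr, dc) for dr in (-1, 0, 1) for dc in (-1, 0, 1) if (dr, dc) != (0, 0)]


def random_walk_steps(maze, start, rng, max_steps=10_000):
    """Count the steps a uniformly random walker needs to reach 'F' (capped at max_steps)."""
    rows, cols = len(maze), len(maze[0])
    r, c = start
    steps = 0
    while maze[r][c] != "F" and steps < max_steps:
        # Only moves that stay inside the maze are candidates (no wrap-around assumed).
        candidates = [
            (r + dr, c + dc)
            for dr, dc in MOVES
            if 0 <= r + dr < rows and 0 <= c + dc < cols
        ]
        r, c = rng.choice(candidates)
        steps += 1
    return steps


rng = random.Random(0)  # fixed seed so the experiment is repeatable
runs = [random_walk_steps(MAZE, (0, 0), rng) for _ in range(1000)]
mean_steps = sum(runs) / len(runs)
print(f"mean steps over {len(runs)} random walks: {mean_steps:.2f}")
```

The mean printed here is an empirical estimate only; a learning system would be judged against it by running the same start-to-food measurement.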

The question is which kinds of theoretical models are appropriate for describing the behavior of random walks.