Random System

(I am just rewriting this text)

As the overall structure we assume the world (W) described before (cf. 3.1), consisting of an environment (ENV) and at least one system (SYS), where the environment is realized as a grid of positions with properties. Here we further specialize to a WOOD1-style environment (cf. 3.3), and the system always occupies a definite position in the environment.

$WORLD \in ENV \times AGENT$ (7.1)
$ENV \subseteq POS \times PROP$ (7.2)
$AGENT \subseteq PERC \times ISTATES$ (7.3)
$ainp : (POS \times PROP) \longmapsto PERC$ (7.4)
$aout : ACT \longmapsto POS$ (7.5)
$POS \subseteq X \times Y$ (7.6)
$PROP = \{ Space, Object, Food, Border\}$ (7.7)
$Space = '.'$ (7.8)
$Object = 'O'$ (7.9)
$Food = 'F'$ (7.10)
$Border = 'BB'$ (7.11)

With this we can realize mappings like $aout(a)=(x,y)$, saying that the action $a$ of an agent maps the agent to the position $(x,y)$. Because every position of an environment in the world is associated with at least one property, the agent can be connected to the property of its own position or, if its perception allows this, to the properties of several positions.
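The structure (7.1)-(7.11) can be made concrete with a minimal sketch. Everything beyond the property symbols is an assumption made for illustration: the 3x3 grid, the four-move action set `MOVES`, the helper `prop_at()`, the choice to perceive only the four neighbouring cells, and the rule that only Space and Food cells can be entered. Note also that `aout()` here takes the current position as an extra argument, whereas (7.5) maps from $ACT$ alone.

```python
# Property symbols as defined in (7.7)-(7.11).
SPACE, OBJECT, FOOD, BORDER = ".", "O", "F", "BB"

# Hypothetical 3x3 environment: each position (x, y) carries one property.
ENV = {
    (0, 0): SPACE, (1, 0): OBJECT, (2, 0): SPACE,
    (0, 1): SPACE, (1, 1): SPACE,  (2, 1): FOOD,
    (0, 2): SPACE, (1, 2): SPACE,  (2, 2): SPACE,
}

# Assumed action set: four unit moves (y grows downwards).
MOVES = {"N": (0, -1), "E": (1, 0), "S": (0, 1), "W": (-1, 0)}


def prop_at(env, pos):
    """Property of a position; anything outside the grid counts as Border."""
    return env.get(pos, BORDER)


def ainp(env, pos):
    """Perception at pos: properties of the four neighbours in N, E, S, W order."""
    x, y = pos
    return tuple(prop_at(env, (x + dx, y + dy)) for dx, dy in MOVES.values())


def aout(env, pos, action):
    """Map an action to the resulting position; blocked moves leave pos unchanged."""
    dx, dy = MOVES[action]
    target = (pos[0] + dx, pos[1] + dy)
    return target if prop_at(env, target) in (SPACE, FOOD) else pos


print(ainp(ENV, (1, 1)))       # ('O', 'F', '.', '.')
print(aout(ENV, (1, 1), "E"))  # (2, 1)
print(aout(ENV, (1, 1), "N"))  # (1, 1), because the Object cell blocks the move
```

Treating every position outside the grid as Border keeps the environment dictionary small; an explicit ring of 'BB' cells would work equally well.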

While the mappings ainp() and aout() are meta functions working 'above' the environment and the agent, we also have to assume some functions that are internal to the agent. We assume the following two: decode() translates the perceived properties of the environment into 'agent-relevant values', and eval() evaluates the internal states with regard to rewards and recommended actions.

$decode : PERC \longmapsto ISTATES$ (7.12)
$eval : ISTATES \longmapsto REW \times ACT$ (7.13)
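One possible reading of (7.12) and (7.13) is sketched below. The numeric codes in `CODES`, the action names in `ACTIONS`, and the reward rule (reward 1 when food is perceived, otherwise 0) are assumptions chosen only to make the two signatures tangible; the text does not fix them. The function is called `evaluate()` to avoid shadowing Python's built-in `eval()`.

```python
# Assumed numeric codes for the property symbols of (7.7)-(7.11).
CODES = {".": 0, "O": 1, "F": 2, "BB": 3}

# Assumed action names, aligned with the order of the perceived cells.
ACTIONS = ("N", "E", "S", "W")


def decode(perc):
    """decode: PERC -> ISTATES. Translate perceived symbols into internal codes."""
    return tuple(CODES[p] for p in perc)


def evaluate(istates):
    """eval: ISTATES -> REW x ACT. Return a (reward, recommended action) pair."""
    for action, code in zip(ACTIONS, istates):
        if code == CODES["F"]:
            return 1, action  # food perceived: reward and a move towards it
    return 0, None            # no food perceived: no reward, no recommendation


istates = decode(("O", "F", ".", "."))
print(istates)            # (1, 2, 0, 0)
print(evaluate(istates))  # (1, 'E')
```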

The random agent $AGENT^{0}$ makes no use of this additional information. It only generates, by chance, a proposal for a new action, independent of the current perception and of any other internal states. Thus we have the following sequence of mappings:

  1. The agent $ a$ has a certain position $ (x,y)$ in an environment $ e$. Associated with the position is some property $ p$.
  2. The agent can perceive certain properties $\{p_{1}, ..., p_{k} \}$ of those positions that lie within its range of perception.
  3. The agent can decode these external properties.
  4. The agent can evaluate these decoded properties.
  5. The agent proposes a new action (in the random case this is just like throwing a die).
  6. The action is realized. If the action maps to a possible position in the environment, the agent is placed at this new position $(x',y')$. The cycle then starts again at step 1.
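The cycle above can be sketched for the random agent as follows. This is only an illustration under stated assumptions: the 3x3 grid, the four-move action set `MOVES`, the rule that only Space and Food cells count as possible positions, and the stopping condition (food reached or `max_actions` used up) are not taken from the text. Steps 2 to 4 are left out because the random agent ignores their results.

```python
import random

# Property symbols as defined in (7.7)-(7.11).
SPACE, OBJECT, FOOD, BORDER = ".", "O", "F", "BB"

# Assumed action set: four unit moves (y grows downwards).
MOVES = {"N": (0, -1), "E": (1, 0), "S": (0, 1), "W": (-1, 0)}

# Hypothetical 3x3 environment, one character per position.
GRID = ["...", ".O.", "..F"]
ENV = {(x, y): ch for y, row in enumerate(GRID) for x, ch in enumerate(row)}


def run_random_agent(env, start, max_actions, rng):
    """Repeat the cycle until food is reached or max_actions is used up.

    Returns (final position, number of actions taken).
    """
    pos = start
    for n in range(1, max_actions + 1):
        action = rng.choice(list(MOVES))       # step 5: throw the die
        dx, dy = MOVES[action]
        target = (pos[0] + dx, pos[1] + dy)
        # Step 6: the move is realized only if the target is a possible position.
        if env.get(target, BORDER) in (SPACE, FOOD):
            pos = target
        if env[pos] == FOOD:
            return pos, n
    return pos, max_actions


pos, n = run_random_agent(ENV, (0, 0), 1000, random.Random(0))
print(pos, n)
```

Passing a seeded `random.Random` instance makes a run repeatable, which helps when different agents are to be compared on the same environment.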

Possible measures of performance with such an agent could be:

  1. Number of actions needed to reach a certain goal (e.g. to find some food)
  2. The amount of reward after a predefined number of actions.
  3. Number of errors or faults, if these can be defined in advance.

With these measures it would be possible to establish an ordering between agents: those which need fewer actions, collect a larger amount of reward, or make fewer errors rank higher.
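Such an ordering can be illustrated with a small sketch. The record type `Performance`, the agent names, and all numbers are invented for illustration, and combining the three measures lexicographically (actions first, then reward, then errors) is just one assumed choice; the text does not say how the measures should be weighed against each other.

```python
from dataclasses import dataclass


@dataclass
class Performance:
    """Hypothetical record of the three measures for one agent."""
    name: str
    actions_to_goal: int
    reward: int
    errors: int


def rank(results):
    """Best first: fewer actions, then larger reward, then fewer errors."""
    return sorted(results, key=lambda r: (r.actions_to_goal, -r.reward, r.errors))


# Invented example values, not measured results.
results = [
    Performance("agent_b", actions_to_goal=40, reward=3, errors=2),
    Performance("agent_a", actions_to_goal=25, reward=3, errors=5),
    Performance("agent_c", actions_to_goal=40, reward=5, errors=1),
]
print([r.name for r in rank(results)])  # ['agent_a', 'agent_c', 'agent_b']
```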

Gerd Doeben-Henisch 2013-01-14