

IALGO SS03  ALGORITHMS AND DATASTRUCTURES  Lecture with Exercices

In the preceding lecture we have selected adaptive binary treestructures as the most interesting candidates for possible answeres to a lot of interesting realworld problems. This involves some basic theoretical concepts which we have to introduce here as necessary framework for the upcoming concrete structures.
From a practical point of view there are onyl a few basic properties of graphs which are important:
Some important possible properties of graphs
In an informal way one can describe these properties as follows:
Graphs := a set of vertices (nodes) which are connected by edges (arcs) with no special direction; not all vertices must be connected. Following the edges you are generating a path, which can lead back to a vertex which you have already passed.
Undirected/ directed := a graph becomes a directed graph, if you can follow the edges only in one direction
Cyclic/ acyclic := if you are following the edges of a graph and you will never return to a vertex you have already passed, then the graph is acyclic.
Not labeled/ labeled := a label is some information attached to a vertex (node). Example: you have a sheet of paper with vertices and edges drawn on it and the vertices e.g. have attached names of cities.
Not weighted/ weighted := a weight is some additional information attached to an edge. This is usual an information about the costs to pass this edge from one vertex to the other (e.g. 'the time to travel', the 'expense to reach', 'the work to do', 'the amount of energy to spent' etc.).
As you can easily see are these basic properties independent one from another; a graph can be undirected and acycliy, with or without labels or weights. Thus you can mix up these basic properties as you need them. Some of the more often occuring combinations are given in the following diagram:
Some combinations of graphproperties
As you can see from this diagram are trees a special case of graphs; trees are that subset of graphs which are directed and have only one root. The root is the only node from which you can reach all other nodes. Trees can then have additional properties like to be acyclic, labeled, weithted or to be binary. Binary means that from each node in a binary tree you have at most two possible continuations. Again it holds that all these properties are independed insofar you can mix these properties as needed.
Because of this independent character of most of these properties to be 'single rooted' depends from beeing directed it is difficult to give a complete systematic of formal concepts; thus we will introduce these concepts as possible 'building blocks' of larger structures.
First we will give some formale definitions of the mostly used concepts. As usual, there are several possibilities how to define a concept. We have tried to cover the common usage in the textbooks; but there are concepts which rarely are explicitly defined, thus we have given one possible formula. It is up to you to improve these concepts if this would be necessary for some special applications.
The basic concept is the concept of graphs.
(UNLABELED/ UNWEIGHTED) GRAPH(g) iff g = <V,E>
with
V := finite set of vertices (nodes)
E := finite set of edges (arcs)
E C V x V
In the general case it holds: REFLEXIVE(E): (x,x) in E &
SYMMETRICAL(E): (x,y) in E ==> (y,x) in E
Def: GRAPH(g) ==>
vertices(g) = g_{1}
Def: GRAPH(g)==>
edges(g) = g_{2}
Def: ADJACENT(x,y,k,g) iff GRAPH(g) & k in edges(g) & k = (x,y)
Def: CONNECT(k, x,y,g) iff ADJACENT(x,y,k,g)
Def: ENDPOINTS(x,y,k,g) iff ADJACENT(x,y,k,g)
Def: INCIDENT(x,y,k,g) iff ADJACENT(x,y,k,g)
Def: NEIGHBOURS(x,y) iff (E:k,g) ADJACENT(x,y,k,g)
Def: SIMPLEX(g) iff GRAPH(g) & (A:x,y)( x,y in vertices(g) ==> NEIGHBOURS(x,y))
Def: DIRECTED_GRAPH(g) DG(g) /* also called Digraph */ iff GRAPH(g) & ~SYMMETRICAL(edges(g)).
Def: HEAD_OF(x,k,g) iff ( DG(g) & k in edges(g) & x = pr1(k) )
Def: TAIL_OF(y,k,g) iff ( DG(g) & k in edges(g) & y = pr2(k) )
Def: LEADS_FROM_TO(x,y,k,g) iff HEAD_OF(x,k,g) & TAIL_OF(y,k,g)
Def: PREDECESSOR_OF(y,x,g) iff (E:k)( LEADS_FROM_TO(y,x,k,g) )
Def: SUCCCESSOR_OF(y,x,g) iff (E:k)( LEADS_FROM_TO(x,y,k,g) )
Def: indegree(x,g) = {y PREDECESSOR_OF(y,x,g) }
Def: PATH(p,g) iff DG(g) & SEQUENCE(p) & rng(p) C vertices(g) & (A:i,j)( i,j in dom(p) & j=i+1 ==> (p_{i},p_{j}) in edges(g))
Def: PATH(p,g) ==>
length(p) = dom(p)1 /* Number of edges, not vertices! */
Def: HEAD_OF_PATH(x,p,g) iff PATH(p,g) & x = p_{1}
Def: END_OF_PATH(y,p,g) iff PATH(p,g) & x = p_{length(p)+1}
Def: CYCLIC_PATH(p,g) iff PATH(p,g) & (E:x,y)( HEAD_OF_PATH(x,p,g) & END_OF_PATH(y,p,g) & x=y )
Def: GRAPH(g) ==>
successors(x,g) = {y (E:p)( PATH(p,g) & HEAD_OF_PATH(x,p,g) &
END_OF_PATH(y,p,g) )}
Def: SUBGRAPH(g',g) iff GRAPH(g) & (E:x,n,a)( g' = <n,a > & x in vertices(g) & n = {y successors(x,g)} & a = n x n cut edges(g))
Def: DIRECTED_ACYCLIC_GRAPH(g) DAG(g) iff DG(g) & ~(E:p)( CYCLIC_PATH(p,g))
The concept of graphs which is used in practice is usually more complex than shown before. This is caused by the fact that in many applications one is not only interested in the 'graph structure as such' (vertices connected by edges), but one has to connect the vertices and the edges with additional information. These additional information can be 'costs' (or 'weights') of passing from one vertex to another or can be information hosted by a vertex at a certain position represented as a vector of values, also called 'label'. Examples are the names of cities which are the labels of vertices in a graph or the keyvaluepairs of binary searchtrees.
To meet these requirements one has to enlarge the formal concept of graphs by additional labels and costs.
LABELED/ WEIGHTED GRAPH(g) LGRAPH(g) or LG(g) iff g = << V, R, VL, d1, EL, d2 >,E>
with
V := finite set of vertices (nodes)
R := finite set of root vertices
VL := finite set of vertex labels
d1 := dimensionality of vertex labels
EL := finite set of edgelabels ('costs', 'weights')
d2 := dimensionality of edge labels
E := finite set of edges (arcs)
R C V
E C ((V x VL^{d1}) x (V x VL^{d1})) x EL^{d2}
Thus an element of the set E has the general form: ((x,<vl_{1}, ...
,vl_{d1}>),(y,<vl_{1}, ... ,vl_{d1}>)),<el_{1}, ...
,el_{d2}>)
this represents an edge with vertices x and y and vertex x has the label <vl_{1},
... ,vl_{d1}> and vertex y has the label <vl_{1}, ...
,vl_{d1}> and the cost of passing from x to y (or back) is given by <el_{1}, ...
,el_{d2}>
Def: LGRAPH(g) ==>
vertices(g) = g_{1.1}
Def: LGRAPH(g) ==>
roots(g) = g_{1.2}
Def: LGRAPH(g) ==>
vertexlabels(g) = g_{1.3}
Def: LGRAPH(g) ==>
vertex_label_dimensionality(g) = g_{1.4}
values(g) = vertexlabels(g)
Def: LGRAPH(g) ==>
edgelabels(g) = g_{1.5}
costs(g) = weights(g) = edgelabels(g)
Def: LGRAPH(g) ==>
edge_label_dimensionality(g) = g_{1.6}
Def: LGRAPH(g)==>
edges(g) = g_{2}
Def: LGRAPH(g)==>
LABELED_EDGE(le,g) iff le in edges(g)
Def: LGRAPH(g)==>
LABELED_VERTEX(lv,le) iff LABELED_EDGE(le,g) & lv = proj1(proj1(le)) or lv = proj2(proj1(le))
Def: LGRAPH(g)==>
WEIGHTS(w,le) iff LABELED_EDGE(le,g) & w = proj2(le)
COSTS(c,le) iff WEIGHTS(w,le)
Def: LGRAPH(g) & LABELED_EDGE(le,g) ==>
head(le) = proj1(proj1(le))
tail(le) = proj2(proj1(le))
Def: LGRAPH(g)==>
NREFLEXIVE(E) iff E = edges(g) & (A:e)(LABELED_EDGE(e,g) ==> head(e) = tail(e) )
Def: LGRAPH(g)==>
NSYMMETRICAL(E) iff E = edges(g) & (A:e)(LABELED_EDGE(e,g) ==> (E:e')(head(e) = tail(e') & tail(e) =
head(e')) )
Axiom: LGRAPH(g)==> NREFLEXIVE(edges(g)) & NSYMMETRICAL(edges(g))
Axiom: LGRAPH(g)==> & r in roots(g) ==> ~(E:e)( e in edges(g) & r = tail(e) )
Def: LGRAPH(g)==>
ADJACENT(x,y,k) iff k in edges(g) & x= pro1(pro1(k)) & y = pro2(pro1(k))
Def: Def: LGRAPH(g)==>
CONNECT(k, x,y) iff ADJACENT(x,y,k)
ENDPOINTS(x,y,k) iff ADJACENT(x,y,k)
INCIDENT(x,y,k) iff ADJACENT(x,y,k)
Def: LGRAPH(g)==>
NEIGHBOURS(x,y) iff (E:k) ADJACENT(x,y,k)
Def: LSIMPLEX(g) iff LGRAPH(g) & (A:x,y)( x,y in vertices(g) ==> NEIGHBOURS(x,y))
Def: LABELED_DIRECTED_GRAPH(g) LDG(g) iff LGRAPH(g) & ~NSYMMETRICAL(edges(g)).
HEAD_OF(x,k,g) iff ( LDG(g) & k in edges(g) & x = head(k) )
TAIL_OF(y,k,g) iff ( LDG(g) & k in edges(g) & y = tail(k) )
LEADS_FROM_TO(x,y,k,g) iff ( HEAD_OF(x,k,g) & TAIL_OF(y,k,g) )
PREDECESSOR_OF(y,x,g) iff (E:k)( LEADS_FROM_TO(y,x,k,g) )
Def: SUCCCESSOR_OF(y,x,g) iff (E:k)( LEADS_FROM_TO(x,y,k,g) )
Def: LPATH(p,g) iff LDG(g) & SEQUENCE(p) & rng(p) C vertices(g) x vertex_labels(g)^{k} & (A:i,j)( i,j in dom(p) & j=i+1 ==> (E:w)( w in weights(g)^{s} & ((p_{i},p_{j}),w) in edges(g))
Def: LPATH(p,g) ==>
length(p) = dom(p)1 /* Number of edges, not vertices! */
Def: values(p,g) = { (i,v)  i in dm(p) & p_{i} in p & ((p_{i}, p_{j}),v) in edges(g) }
Def: HEAD_OF_LPATH(x,p,g) iff LPATH(p,g) & x = p_{1}
Def: END_OF_LPATH(y,p,g) iff LPATH(p,g) & y = p_{length(p)+1}
Def: LABELED_CYCLIC_PATH(p,g) iff LPATH(p,g) & (E:x,y)( HEAD_OF_LPATH(x,p,g) & END_OF_LPATH(y,p,g) & x=y )
Def: LABELED_DIRECTED_ACYCLIC_GRAPH(g) LDAG(g) iff LDG(g) & ~(E:p)( LABELED_CYCLIC_PATH(p,g))
Simply speaking is a labeled tree in the general case a labeled directed acyclic graph with only one rootvertex from which all vertices of the tree can be reached.
LABELED_TREE(t) LT(t) iff LDAG(t) & roots(t) = 1 & LSIMLPLEX(t)
There is some special terminology connected with trees different to graphs:
Def: LABELED_TREE(t) ==>
nodes(t) = vertices(t)
Def: PARENT(x,y) iff PREDECESSOR_OF(x,y)
Def: CHILD(y,x) iff SUCCCESSOR_OF(y,x)
Def: children(x) = {y CHILD(y,x) }.
Def: LEAF(l,t) iff ( LABELED_TREE(t) & l in nodes(t) & ~(E:e)( e in edges(t) & l = head(e) ) )
Def: LABELED_TREE(t) ==>
leafs(t) = {x LEAF(x,t) }
Def: LABELED_TREE(t) ==>
inner_nodes(t) = nodes(t)\leaves(t)
Def: LABELED_TREE(t) ==>
PATH_TO(p,x) iff LPATH(p,t) & (E:r)( r in roots(t) & HEAD_OF_LPATH(r,p,t) & END_OF_LPATH(x,p,t) )
Def: LABELED_TREE(t) ==>
PATH_FROM(p,x) iff LPATH(p,t) & (E:l)( LEAF(l,t) & HEAD_OF_LPATH(x,p,t) & END_OF_LPATH(l,p,t) & ~(E:l',p')(
LEAF(l',t) & LPATH(p',t) & HEAD_OF_LPATH(x,p',t) & END_OF_LPATH(l',p',t) & lenth(p) < length(p') ))
Def: LABELED_TREE(t) & PATH_TO(p,x) ==>
depth(x) = length(p)
Def: LABELED_TREE(t) & PATH_FROM(p,x) ==>
height(x) = length(p)
Def: LABELED_TREE(t) ==>
ORDER(n,t) iff (A:x)( x in nodes(t) ==> children(x) < n )
Def: LABELED_TREE(t) & PATH_FROM(p,x) ==>
height(t) = length(p) iff x in roots(t) & ~(E:p')( PATH_FROM(p',x ) & length(p') >
length(p) )
Def: COMPLETE(t) iff LABELED_TREE(t) & (E:n)( ORDER(n,t) &
(A:x,p)( PATH_TO(p,x) & LEAVE(x,t) ==> depth(x) = n ) &
(A:i,p)( PATH_TO(p,x) ) & ~LEAVE(x,t) & i in n & depth(x)= i ==> children(x) = n )
Trees are simplified forms of labeled trees:.
TREE(t) iff LT(t) & vertex_labels_dimensionality(t) = 0 & edge_labels_dimensionality(t) = 0
A binary tree can simply de defined as:
LABELED_BINARY_TREE(t) LBT(t) iff LABELED_TREE(t) & ORDER(2,t)
BINARY_TREE(t) BT(t) iff TREE(t) & ORDER(2,t)
A binary search tree is informally something 'between' a labeled binary tree and a binary tree. A binary search tree has no weights, but 2dimensional labels with the first elemet of the label called 'key' and the second element of the label called 'value'. These labels are distributed over the tree in a way that they follow some order.
Def: BINARY_SEARCH_TREE(t) BST(t) iff LABELED_BINARY_TREE(t) & vertex_labels_dimensionality(t) = 2 &
edge_labels_dimensionality(t) = 0
Def: BINARY_SEARCH_TREE(t) & LABELED_EDGE(e,g) & LABELED_VERTEX(v,e) ==>
label(v) = proj2(v)
key(v) = proj1(label(v))
value(v) = proj2(label(v))
Def: BINARY_SEARCH_TREE(t) & x in nodes(t) x
vertex_labels(t)^{vertex_labels_dimensionylity(t)} ==>
keys(x,t) = {k (E:y)( y in
successors(x,t) & k=key(y) ) }
Axiom: BINARY_SEARCH_TREE(t) & a in nodes(t) & x,y in children(a) ==> (A:i)(
i in keys(x,t) ==> ~(E:j)( j in keys(y,t) & j > i )) )
/* In this definition no 'leftrightorder' has bee used */
Will be explained in the next lectures when we will deal with AVLtrees and RedBlacktrees.
Many problems can succesfully be represented within the framework of graphs. In the following we are giving a short overview of some of these problems which are of practical importance. Some of these problems we will analyze further in the upcoming lectures, but not all, because the time is too short.
A digraph can be interpreted as a weakly ordered structure where the direction of the edge is representing the 'order'. In a topological sort one maps the set of vertices with cardinality V in a subset of the natural numbers with top_sort: V > V such, that the ordering holds in V. It follows from this, that a topological sorted graph can not have cycles, because in this case one would have a pair like (C,C) which is contradicting the ordering relation (see diagram).
Cyclic digraph with no topological order
More detailed:
Assume that an edge is a weakly ordered structure like '<'. It holds e.g. that TRANS(<) & ¬SYM(<). Thus if we translate the graph from the diagram into relational expressions we are getting the following subset: C < D, D < E, E < F and F <C. From C < D, D < E, E < F it follows by TANSITIVITY that C < F. But then we have a contradiction because C < F and F <C causes the statements SYM(<) and ¬SYM(<).
Thus it follows that a topological sorted graph is acyclic and therefore includes only one vertex x with indegree(x) = 0.
Therefore one can use an algorithm for testing of topological sorting to check the property of being cyclic or acyclic.
The transitive closure of a graph is connected to the problem of finding all connections between vertices of a graph. A possible definition that c is the reflexive transitive closure of g could be: RTC(c,g) iff GRAPH(g) with g = <V,E > & c = <V,E* > such that (v,v') in E* iff there is a path in g from v to v'.
In the diagram below there are the following elements in E*: {(C,D,), (C,F),(C,E),(C,C),(C,A),(C,B), (D,F), (D,E), (E,C)}
Digraph and transtitive closure
Thus to compute the transitive closure of a graph g can be a starting point to find some optimizations.
Traversing a graph g means generally to 'visit' every vertex from some arbitrarily selected starting point x at least once. A simple, linear version of such a search could be to take the set E of edges(g), organize the set as a list and look to every edge of E. If we declare the edges(g) in the beginning as the borderset distinguished from the visitedset, then the search continus until the borderset is empty.
One interesting aspect can be to search in a graph g all connected subgraphs. One can elaborate these searches to reveal more about the struture of the graph being searched.
In many cases is the edge (v,v') of a graph g connected with some properties which represent the 'length' or the 'costs' or something else of moving from vertex v to vertex v'. In these cases it is interesting to know the 'shortest' path in the sense to have the path with the minimal 'length' or minimal 'costs'.
A special case is the 'critical path'. In most projects depends the final vertex the goal mostly not only from one single path but from the interactions of different paths where one path is depending from some other paths. Here it is an interesting question what is the shortest path constrained by all the other paths.
Related to the question of shortest path in a graph g is the question of a minimum spanning tree. A spanning tree of an undirected graph g is a tree t formed by graph edges that connect all the vertices of graph g. The cost of the spanning tree ist the sum of the costs of the edges in the tree. The minimum spanning tree is a connected subgraph of g that spans all vertices at minimum cost. A graph g is connected if there exists a path p for all vertices in g. Because a minimum spanning tree of a subgraph of g exists only if the the subgraph of g is connected one can use the algorithm for the computation of mimum spanning trees as a test of connectivity.
Another interesting problem is the question of the computation of the flow of certain capacities (cars, passenger, datapakets, goods, visitors...) through a network.
A heritagegraph represents the relationship between parents and children over generations, where the vertices are labeled (names, dates, living/ dead,...) and with regard to these information one has to compute special predicates representing special types of relationships and the amount of inhertance someone can inherit by law.
A very common problem is the optimal mapping of employees with certain skills to tasks in projects, or of teaching people to lectures, resources and timeslots.
In the following you will find a list of topics taken from the preceding text. Try to worke out the following tasks with regard to some of the mentioned topics:
Create an examplegrah which could be an example for the selected topic..
Try to give a formal definition of this graph.
Try to find an algorithm to solve the problem.
Describe whiche data structure you are using to represent the graph and which data structures to support the actual searchoperations of the algorithm.
Compute the complexity of your algorithm.
In which way can you minimize the complexity of your algorithm with regard to your datastructures?
List of possible topics:
Topological Sorting
Transitive Closure
Traversing a Graph
Shortest Path
Critical Path
Minimal spanning Trees
Capacities in Networks
InheritanceGraphs
Mapping