Evolving Clustering Method (ECM)

The first method which will be introduced is the Evolving Clustering Method (ECM) as described in Kasabov (2002) [161]:146ff and Kasabov 2007 [151]:61ff. This method is quite satisfying for the task at hand but it will show up severe deficiencies -like other methods- in those cases which are most characteristic for our moving scenarios.

The main idea of the ECM method is to map data points into a finite set of categories $ c_{i} \in Cat$ which are defined by a center coordinate $ Cc_{i}$ with a radius $ Ra \in \Re$. While the center coordinate as well as the radius will be built up in an unsupervised manner some category threshold $ D_{thr}$ is assumed as some apriori knowledge. This apriori threshold induces a clear limit to the growth of the categories. It must hold:


$\displaystyle 0 \leq$ $\displaystyle Ra$ $\displaystyle < D_{thr}$ (8.1)

From this it is clear that the selection of the category threshold $ D_{thr}$ is decisive whether and how the ecm algorithm will identify categories (= 'clusters', 'regions', 'areas', 'patterns',...). The procedure of forming categories runs then as follows (cf. Kasabov 2002 [161]:146):

In this algorithm the distance $ D = \mid\mid x - c\mid\mid$ is understood as the Euclidean Distance defined as:


$\displaystyle \mid\mid x - y \mid\mid$ $\displaystyle =$ $\displaystyle [\frac{1}{n}\sum^{n}_{i=1}{\mid x_{i} - y_{i}\mid}^{2}]^{\frac{1}{2}}$ (8.2)

A user-defined scilab functiondoing this job is the following one:

//*******************************************************
// Computes the euclidean distance between two vectors x,y

function [d]=euclideanDistance(x,y)
  
  lx=length(x), ly=length(y)
  if lx <> ly then error('SIZE of VECTORS DO NOT MATCH'),
  else
    s=0
    for i=1:lx
    s=s+abs(x(i)-y(i))^2, 
  end
  d=sqrt(s/lx)
  end
  
endfunction
//***************************************************

Another often used method to define the similarity between two vectors v,w is the computation of the angle $ \theta$ between these two vectors:


$\displaystyle \textbf{v}\cdot \textbf{w}$ $\displaystyle =$ $\displaystyle \mid\mid\textbf{v}\mid\mid \mid\mid\textbf{w}\mid\mid cos \theta$ (8.3)
$\displaystyle cos \theta$ $\displaystyle =$ $\displaystyle \frac{\textbf{v} \textbf{w}}{\mid\mid\textbf{v}\mid\mid\mid\mid\textbf{w}\mid\mid}$ (8.4)
$\displaystyle \theta$ $\displaystyle =$ $\displaystyle cos^{-1} \left(\frac{\textbf{v} \textbf{w}}{\mid\mid\textbf{v}\mid\mid\mid\mid\textbf{w}\mid\mid}\right)$ (8.5)

But as a comparison shows (cf. table 8.2.1) is the angle-method less detailed than the euclidean distance measure, because the angle-method can not distinguish between different distances within the 'same' direction:

Nr. input x input y euclideanDistance() cosvec()
1 $ [2, 2]$ $ [2, 3]$ 0.7071068 $ 11.309932^{o}$
2 $ [2, 2]$ $ [3, 3]$ 1.0 $ 0^{o}$
3 $ [2, 2]$ $ [3, 2]$ 0.707106 $ 11.309932^{o}$
4 $ [2, 2]$ $ [3, 4]$ 1.5811388 $ 8.1301024^{o}$
5 $ [2, 2]$ $ [4, 4]$ 2.0 $ 0.0^{o}$
6 $ [2, 2]$ $ [4, 3]$ 1.5811388 $ 8.1301024^{o}$
7 $ [2, 2]$ $ [4, 5]$ 2.5495098 $ 6.3401917^{o}$
8 $ [2, 2]$ $ [4, 6]$ 3.1622777 $ 11.30993^{o}$
9 $ [2, 2]$ $ [5, 5]$ 3.0 $ 0.0^{o}$
10 $ [2, 2]$ $ [5, 6]$ 3.5355339 $ 5.1944289^{o}$

In scilab one can apply the ecm algorithm with the following command: CAT=ecm(INP3). This activates the user-defined function ecm with the input argument $ INP3$ (the normalized input data) and computes possible categories stored in $ CAT$. The category threshold is set to $ \theta=0.1$. After onle one pass (!) this algorithm gives three cluster centers which are located within the three objects (a description of the scilab implementation of the ecm algorithm is given in the mathematical appendix).

    0.2029629    0.3464794    
    0.6150446    0.6212005    
    0.8897657    0.2091188

Figure 8.3: Output data case 1 combined with input data; output black diamonds
\includegraphics[width=4.0in]{case1_output.eps}

Again one can use the plot-command in scilab to print the new data embedded into the old data:
plot2d(CAT(:,1), CAT(:,2), style=-4, rect=[-0.1,-0.1,1.1,1.1]) (cf. figure 8.3).

Within the ecm algorithm there is a special sub-algorithm linePartition() managing the adaptation of the old cluster center Cc to a new position following a line between the old cluster center $ Cc_{old}$ and the new datapoint $ INP_{i}$. This algorithms works as follows:

//*******************************************************
// Computes a new point P(x,y) which intersects a line between 
// P2(x2,y2) and P1(x1,y2) in the proportion n:m where
// n = P2P and m = PP1.

function [P]=linePartition(P1,P2,n)
  
  D=euclideanDistance(P1,P2)
  m=D-n
  lm=m/n  //lm denotes lambda
  
  x= (P1(1)+lm*P2(1))/(1+lm)  //P1(1) = x1, P1(2) = y1,...
  y = (P1(2)+lm*P2(2))/(1+lm)

  P=[x y]
  
endfunction
**********************************************************
Gerd Doeben-Henisch 2012-03-31