Chi Square Test

Knuth (1981) [179]:39ff gives three examples in which two dice are thrown 144 times per test:

DICETEST1 = [2 4 10 12 22 29 21 15 14 9 6]
DICETEST2 = [4 10 10 13 20 18 18 11 13 14 13]
DICETEST3 = [3 7 11 15 19 24 21 17 13 9 5]

With the assumed probabilities (first row: the possible sums of the two dice; second row: the probability of each sum)

PROBABILITIES = [2 3 4 5 6 7 8 9 10 11 12; 1/36 1/18 1/12 1/9 5/36 1/6 5/36 1/9 1/12 1/18 1/36]

this gives the following values with a chi-square test:

 -->N=144,SHOW=0, [CHISQUARE] =chisquare2(N,DICETEST1,PROBABILITIES,SHOW)
...
 CHISQUARE  =
 
    7.1458333  
 
-->N=144,SHOW=0, [CHISQUARE] =chisquare2(N,DICETEST2,PROBABILITIES,SHOW)
...
 CHISQUARE  =
 
    29.491667  
 
-->N=144,SHOW=0, [CHISQUARE] =chisquare2(N,DICETEST3,PROBABILITIES,SHOW)
...
 CHISQUARE  =
 
    1.1416667
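
The three values can be cross-checked independently of chisquare2, whose Scilab source is not reproduced in this section. The following is a minimal sketch in Python (not Scilab), assuming the usual Pearson form of the statistic, i.e. the sum over all categories of (observed - N*p)^2 / (N*p). The name chi_square_statistic and the use of exact fractions are choices made for this illustration only.

```python
from fractions import Fraction

# Probabilities for the sums 2..12 of two fair dice (second row of PROBABILITIES).
PROBS = [Fraction(k, 36) for k in (1, 2, 3, 4, 5, 6, 5, 4, 3, 2, 1)]

# Observed counts copied from the three test series above.
DICETEST1 = [2, 4, 10, 12, 22, 29, 21, 15, 14, 9, 6]
DICETEST2 = [4, 10, 10, 13, 20, 18, 18, 11, 13, 14, 13]
DICETEST3 = [3, 7, 11, 15, 19, 24, 21, 17, 13, 9, 5]


def chi_square_statistic(n, observed, probs):
    """Pearson statistic: sum over categories of (observed - n*p)^2 / (n*p)."""
    if len(observed) != len(probs):
        raise ValueError("observed and probs must have the same length")
    if sum(observed) != n:
        raise ValueError("observed counts must add up to n")
    # Exact rational arithmetic; converted to float only at the end.
    return float(sum((o - n * p) ** 2 / (n * p) for o, p in zip(observed, probs)))


if __name__ == "__main__":
    for name, data in (("DICETEST1", DICETEST1),
                       ("DICETEST2", DICETEST2),
                       ("DICETEST3", DICETEST3)):
        print(name, round(chi_square_statistic(144, data, PROBS), 7))
```

Under these assumptions the sketch should agree with the CHISQUARE values printed by the Scilab session to the displayed precision.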

The value '1.1416667' is far too low and the value '29.491667' is far too high, whereas the value '7.1458333' is acceptable according to the chi-square distribution (cf. [179]:41).
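
To see why the three values are judged so differently, one can look at where each falls in the chi-square distribution with 10 degrees of freedom (11 possible sums minus one). The sketch below is written in Python, not Scilab, and uses the closed form of the distribution function that holds only for an even number of degrees of freedom. The function name chi_square_cdf_even is hypothetical and not part of zcs_wood1.sce.

```python
import math


def chi_square_cdf_even(x, dof):
    """P(X <= x) for a chi-square variable with an even number of degrees of freedom.

    Uses the closed form 1 - exp(-x/2) * sum_{k=0}^{dof/2 - 1} (x/2)^k / k!.
    """
    if dof <= 0 or dof % 2 != 0:
        raise ValueError("this closed form needs a positive, even dof")
    if x <= 0:
        return 0.0
    half = x / 2.0
    tail = math.exp(-half) * sum(half ** k / math.factorial(k)
                                 for k in range(dof // 2))
    return 1.0 - tail


if __name__ == "__main__":
    dof = 11 - 1  # 11 possible sums (2..12), hence 10 degrees of freedom
    for v in (1.1416667, 7.1458333, 29.491667):
        p = chi_square_cdf_even(v, dof)
        print(f"V = {v:10.7f}  ->  P(X <= V) = {p:.4f}")
```

With this sketch the smallest value lies below the 1% point and the largest above the 99% point, while '7.1458333' lies between the 25% and 50% points, which matches the verdict above.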

If we apply the chi-square test to the actual program by combining the equidistribution test with the chi-square test, the results are not completely satisfying. In the program zcs_wood1.sce from April 19, 2010, we can distinguish three ways of generating random numbers and two versions of the chi-square test, which gives six combinations (cf. list below).

  1. countingrands(N,C,SHOW) and chisquare1(N,C,EQUISUMS,SHOW)
  2. countingrands(N,C,SHOW) and chisquare1b(N,C,EQUISUMS,SHOW)
  3. countingMatrix1(N,C,SHOW) and chisquare1(N,C,EQUISUMS,SHOW)
  4. countingMatrix1(N,C,SHOW) and chisquare1b(N,C,EQUISUMS,SHOW)
  5. countingMatrix2(N,C,SHOW) and chisquare1(N,C,EQUISUMS,SHOW)
  6. countingMatrix2(N,C,SHOW) and chisquare1b(N,C,EQUISUMS,SHOW)

If we apply these options and repeat every type five times, we can see (cf. tables below) that the two versions of the chi-square test behave identically, whereas the way the random numbers are generated makes a difference. Only one series of 5 runs has all values within the 'allowed' range of about '4.5' to '11.5': the series produced by the type '1' and type '2' configurations with N=384.
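
The combination of an equidistribution count with a chi-square test can be sketched as follows. This is an illustration in Python, not the Scilab code of countingrands, countingMatrix1, countingMatrix2, chisquare1 or chisquare1b, none of which is reproduced here. The function names, the generator (Python's random module), the seed and the category count C = 9 are all assumptions made for the example.

```python
import random


def count_categories(n, c, rng):
    """Draw n pseudo-random integers in 0..c-1 and count how often each occurs."""
    counts = [0] * c
    for _ in range(n):
        counts[rng.randrange(c)] += 1
    return counts


def chi_square_uniform(counts):
    """Chi-square statistic against the hypothesis of equally likely categories."""
    n = sum(counts)
    expected = n / len(counts)  # same expected count in every category
    return sum((o - expected) ** 2 for o in counts) / expected


if __name__ == "__main__":
    N, C = 384, 9               # C = 9 is an assumed category count
    rng = random.Random(2010)   # fixed, arbitrary seed so the runs are repeatable
    for run in range(1, 6):     # five runs per configuration, as in the tables
        counts = count_categories(N, C, rng)
        print(run, counts, round(chi_square_uniform(counts), 2))
```

Because the generator differs from the ones in zcs_wood1.sce, the printed values will not reproduce the tables; the sketch only shows how one run of such a test is put together.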

TYPE (N=81)    Run 1    Run 2    Run 3    Run 4    Run 5
1               4.88     7.77     6.22     2.44     6.88
2               4.88     7.77     6.22     2.44     6.88
3               6.88    17.11     2.44     8.88    10
4               6.88    17.11     2.44     8.88    10
5               9.11    10.66     5.33     6        4
6               9.11    10.66     5.33     6        4

TYPE (N=192)   Run 1    Run 2    Run 3    Run 4    Run 5
1               9.28    15.18     7.5      6.37     8.81
2               9.28    15.18     7.5      6.37     8.81
5               4.5      7.87     7.5      6.28     3
6               4.5      7.87     7.5      6.28     3

TYPE (N=384)   Run 1    Run 2    Run 3    Run 4    Run 5
1               9.46     9.18     6.42    10.73     5.34
2               9.46     9.18     6.42    10.73     5.34
5              18.28     4.59     4.45     6.42     7.82
6              18.28     4.59     4.45     6.42     7.82
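
The statement that only one series stays inside the band of about 4.5 to 11.5 can be checked mechanically against the tabulated values. The Python sketch below copies the numbers from the tables and groups the types in pairs, since both chi-square versions give identical rows. Treating the band limits as inclusive is an assumption, and the helper name series_inside_band is hypothetical.

```python
# Chi-square values copied from the tables above, keyed by (N, type pair).
SERIES = {
    (81, "1/2"): [4.88, 7.77, 6.22, 2.44, 6.88],
    (81, "3/4"): [6.88, 17.11, 2.44, 8.88, 10],
    (81, "5/6"): [9.11, 10.66, 5.33, 6, 4],
    (192, "1/2"): [9.28, 15.18, 7.5, 6.37, 8.81],
    (192, "5/6"): [4.5, 7.87, 7.5, 6.28, 3],
    (384, "1/2"): [9.46, 9.18, 6.42, 10.73, 5.34],
    (384, "5/6"): [18.28, 4.59, 4.45, 6.42, 7.82],
}

LOW, HIGH = 4.5, 11.5  # approximate 'allowed' band quoted in the text


def series_inside_band(series, low=LOW, high=HIGH):
    """Return the keys of all series whose five values all lie in [low, high]."""
    return [key for key, values in series.items()
            if all(low <= v <= high for v in values)]


if __name__ == "__main__":
    for n, types in series_inside_band(SERIES):
        print(f"N={n}, types {types}: all five runs inside [{LOW}, {HIGH}]")
```

On the tabulated data the only series returned is the one for types 1 and 2 with N=384.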

We draw the 'empirical' conclusion that the type '1' and type '2' configurations seem to stay within the theoretically accepted range of the chi-square distribution when the number of events is about 384.

Gerd Doeben-Henisch 2012-03-31