faqts : Computers : Programming : Languages : Tse : Math

+ Search
Add Entry AlertManage Folder Edit Entry Add page to http://del.icio.us/
Did You Find This Entry Useful?

4 of 14 people (29%) answered Yes
Recently 2 of 10 people (20%) answered Yes

Entry

TSE: Random: Number: Generator: Check: How random is random number generator? [chi square]

Oct 3rd, 2004 17:08
Knud van Eeden,


----------------------------------------------------------------------
--- Knud van Eeden --- 03 October 2004 - 10:40 pm --------------------

TSE: Random: Number: Generator: Check: How random is random number 
generator? [chi square]

---

The basic idea is that the frequency of your values should be about
the same as the frequency theoretically expected.

If that is the case then the sum of the squares of the differences
between the frequency of your values and frequency of the expected
values is almost zero.

Thus you should get a value near zero for each.

Thus is the sum of this values near zero (though this sum will
grow if more values are involved, as it are all positive numbers
because of squaring).

---
---

Method: Chi square method

1. Suppose your run this random generator to throw values 100000 times.

2. Each possible value of this random generator is by design between 1
   and &FFFF, or thus between 1 and 65535, and all chances are equal
   here to each other, or thus 1 / 65535.

3. You create an array with 65535 entries.

4. Each time you throw a particular value (e.g. 1532), a counter for
   this value is increased with 1, thus ++arrayvalue[ 1532 ].

   (you could simulate this by increasing the value on that linenumber
    1532 of a file with 1)

--- cut here: begin --------------------------------------------------

INTEGER I = 0

INTEGER J = 0

dim array[ 65535 ]

for I = 1 TO 100000

 J = FNYourRandomGenerator() // this should return an integer

 array[ J ] = array[ J ] + 1

endfor

--- cut here: end ----------------------------------------------------

4. What you expect for each value is at the end:

   1.

     ( total amount of throwing ) . ( probability for each throw )

   2. Here all values are 1 / 65535

   3. thus for this particular example here, you expect

      ( 100000 throws ) . ( ( 1 / 65535 ) per throw )

      or thus you expect each array value to be around
      100000 / 65535 or thus about 2.

      Thus you should see that every array value from 1 to 65535 should
      contain a value of around 2.

5. When finished throwing, you go through each array value, and you
   calculate

((value array[I]) - ( (total number of throws) . (chance per throw))^2
 --------------------------------------------------------------------
           ( (total number of throws) . (chance per throw)

6. or thus as a program:

--- cut here: begin --------------------------------------------------

DOUBLE sum = 0

DOUBLE v = 0

INTEGER I = 0

for I = 1 to 65535

v = (( value in this array[ I ]) -(100000 . ( 1/65535)))^2

v = v / (100000 . ( 1/65535))

sum = sum + v

endfor

--- cut here: end ----------------------------------------------------

check then in a table for degrees of freedom equals 100000 - 1, or thus
99999 for the found final sum value.

---

Now to understand the result, if you get a sum value of about zero, it
means you have a very good random generator (as what you get is equal
to what you expected, that is for each value a value of
100000 . 1/65535).

If the sum value is very much larger than zero, the less likely it is
your random generator is really random.

---
---

Hereby a program in TSE which will let you visually inspect the values:

--- cut here: begin --------------------------------------------------

 // library: math: set: number: random: seed <author>Sammy 
Mitchell</author> (filenamemacro=setmansr.s) [kn, ri, su, 03-10-2004 
17:31:29]

 PROC PROCMathSetNumberRandomSeed( INTEGER multiplierI )

  // e.g. PROC Main()

  // e.g.  PROCMathSetNumberRandomSeed( 0 )

  // e.g.  REPEAT

  // e.g.   Warn( FNMathGetNumberRandomI() )

  // e.g.  UNTIL FALSE

  // e.g. END

  // e.g.

  // e.g. <F12> Main()

  INTEGER hourI = 0

  INTEGER minuteI = 0

  INTEGER secondI = 0

  INTEGER hundredI = 0

  IF ( multiplierI == 0 )

   // Randomize using system clock

   GetTime( hourI, minuteI, secondI, hundredI )

   multiplierI = secondI * 100 + hundredI

   multiplierI = multiplierI * 0ffffh / 5999

  ELSE

   // Must NOT be greater than 0ffffh

   multiplierI = multiplierI & 0ffffh

  ENDIF

  // Add 1 IF even TO make it odd

  multiplierI = multiplierI | 1

  SetGlobalInt( "randproc_multiplier", multiplierI )

 END

 // library: math: get: number: rand <author>Sammy Mitchell</author> 
(filenamemacro=central.s) [kn, ri, su, 03-10-2004 17:31:45]

 INTEGER PROC FNMathGetNumberRandomI()

  INTEGER seed = 37584381

  INTEGER multiplier = GetGlobalInt( "randproc_multiplier" )

  INTEGER result = 0

  result = seed * multiplier

  multiplier = result & 0ffffh

  SetGlobalInt( "randproc_multiplier", multiplier )

  result = result & 0ffff00h

  result = result SHR 8

  RETURN( result )

 END

 // library: math: check: random: generator: test [kn, ri, su, 03-10-
2004 23:15:56]

 PROC PROCMathCheckRandomGeneratorTest( INTEGER maxI, STRING 
filenameS )

  // e.g. PROC Main()

  // e.g.  PROCMathCheckRandomGeneratorTest( 100000, "c:\ddd.txt" )

  // e.g. END

  // e.g.

  // e.g. <F12> Main()

  INTEGER I = 0

  INTEGER minI = 0

  INTEGER J = 0

  INTEGER valueI = 0

  STRING s[255] = ""

  INTEGER fileP = 0

  // dim array[ 65535 ]

  //

  FOR I = 1 TO 65535

   s = "array" + STR( I )

   valueI = SetGlobalInt( s, 0 )

  ENDFOR

  //

  PROCMathSetNumberRandomSeed( 0 )

  //

  FOR I = minI TO maxI

   //

   J = FNMathGetNumberRandomI() // this should return an integer

   //

   // increase the array value at that position J with 1

   //

   s = "array" + STR( J )

   message( s )

   valueI = GetGlobalInt( s )

   SetGlobalInt( s, valueI + 1 )

   //

  ENDFOR

  //

  // check the values (e.g. visually)

  //

  warn( "Showing the result" )

  //

  EditFile( filenameS )

  fileP = GetBufferId( filenameS )

  //

  FOR I = 1 TO 65535

   s = "array" + STR( I )

   valueI = GetGlobalInt( s )

   Message( I, " : ", valueI )

   AddLine( Format( STR( I ) : 8 ) + Format( STR( valueI ) : 5 ), 
fileP )

  ENDFOR

  //

 END

 PROC Main()

  PROCMathCheckRandomGeneratorTest( 100000, "c:\ddd.txt" )

 END

 <F12> Main()

--- cut here: end ----------------------------------------------------

---
---

I wrote a little test program, based upon my writing about the Chi
Square test, which basically increases a global integer array value
with 1 each time that integer value is generated by the random
generator

(you similate an array in TSE by building a string with concatenation,
e.g. "array" + "1", or thus "array1", "array" + "2", or thus "array2",
..., "array" + "65535", or thus "array65535" and set and get its
values).

You can see the generated results in the file 'c:\ddd.txt'.

And it showed clearly that a lot of values between 1 and 65535 are not
generated, thus equal to 0, while all others are or 6 or 7.

What you expect is each value around 100000 . ( 1 / 65535 ), or thus 0,
1 or 2 and possibly 3.

---

This is similar to throwing a dice 100000 times.

You expect that the one will occur 100000 / 6 times.
You expect that the two will occur 100000 / 6 times.
You expect that the three will occur 100000 / 6 times.
You expect that the four will occur 100000 / 6 times.
You expect that the five will occur 100000 / 6 times.
You expect that the six will occur 100000 / 6 times.

If you then create an
array of 6 entries, you should expect in each entry a value of 100000 .
( 1 / 6 ), or thus a value around 16667.

---

Or to use the analogy, you are throwing here a dice with 65535 sides.

If this dice is not biased, you expect each side to occur with an equal
frequency.

So each side should occur theoretically

(total amount of throws) . (chance on each side)

or thus here

(total amount of throws) . ( 1 / ( maximal number of sides ) )

or thus here

( 100000 ) . ( 1 / 65535 )

or thus

You expect that the value 1 will occur 100000 / 65535 times.
You expect that the value 2 will occur 100000 / 65535 times.
You expect that the value 3 will occur 100000 / 65535 times.
You expect that the value 4 will occur 100000 / 65535 times.
You expect that the value 5 will occur 100000 / 65535 times.
You expect that the value 6 will occur 100000 / 65535 times.
You expect that the value 7 will occur 100000 / 65535 times.
You expect that the value 8 will occur 100000 / 65535 times.
...
You expect that the value 65533 will occur 100000 / 65535 times.
You expect that the value 65534 will occur 100000 / 65535 times.
You expect that the value 65535 will occur 100000 / 65535 times.

If you then create an
array of 65535 entries, you should expect in each entry a value of

 100000 . ( 1 / 65535 ),

or thus a value around 2.

---

You do not even have to do the real chi square test, as you can
visually inspect the values, and see if your expectation is
fulfilled at least approximately. If it is not the case, your
dice looks to be biased, i.e. your random generator is not
really generating random values.

---

Here some values in c:\ddd.txt:

       1    6
       2    0
       3    6
       4    0
       5    6
       6    0
       7    6
       8    0
       9    6
      10    0
      11    6
      12    0
      13    6
      14    6
      15    0
      16    0
      17    6
      18    0
      19    6
      20    0
      21    6
      22    0
      23    6
      24    0
      25    6
      26    0
      27    6
      28    0
      29    6
      30    0
      31    6
      32    0
      33    6
      34    0
      35    6
      36    0
      37    6
      38    0
      39    6
      40    0
      41    6
      42    0
      43    6
      44    6
      45    0
      46    0
      47    7
      48    0
      49    6
      50    0
      51    6
      52    0
      53    6
      54    0
      55    6
      56    0
      57    6
      58    6
      59    0
      60    6
      61    0
      62    0
      63    6
      64    0
      65    6
      66    0
      67    6
      68    0
      69    6
      70    0
      71    7
      72    6
      73    0
      74    6
      75    0
      76    6
      77    0
      78    0
      79    7
      80    0
      81    6
      82    0
      83    6
      84    0
      85    6
      86    0
      87    6
      88    6
      89    0
      90    6
      91    0
      92    0
      93    6
      94    0
      95    6
      96    0
      97    6
      98    0
      99    6
     100    0
     101    6
     102    6
     103    0
     104    6
     105    0
     106    6
     107    0
     108    0
     109    6
     110    0
     111    6
     112    0
     113    6
     114    0
     115    6
     116    6
     117    0
     118    6
     119    0
     120    6
     121    0
     122    7
     123    0
     124    0
     125    6
     126    0
     127    6
     128    0
     129    6
     130    0
     131    6
     132    6
     133    0
     134    6
     135    0
     136    6
     137    0
     138    0
     139    6
     140    0
     141    6
     142    0
     143    6
     144    0
     145    6
     146    6
     147    0
     148    6
     149    0
     150    7
     151    0
     152    7
     153    0
     154    0
     155    6
     156    0
     157    6
     158    0
     159    7
     160    6
     161    0
     162    6
     163    0
     164    6
     165    0
     166    6
     167    0
     168    6
     169    0
     170    0
     171    6
     172    0
     173    6
     174    6
     175    0
     176    6
     177    0
     178    6
     179    0
     180    6
     181    0
     182    6
     183    0
     184    0
     185    6
     186    0
     187    6
     188    0
     189    6
     190    6
     191    0
     192    6
     193    0
     194    6
     195    0
     196    6
     197    0
     198    7
     ...    ...
   65516    0
   65517    0
   65518    0
   65519    6
   65520    0
   65521    0
   65522    0
   65523    0
   65524    0
   65525    6
   65526    0
   65527    0
   65528    0
   65529    6
   65530    0
   65531    0
   65532    0
   65533    6
   65534    0
   65535    0

---

Given that I have not made any mistakes in my programming, in my
opinion it looks like this is not a random generator at the moment, I
think so.

As what you should expect, based on equal probability for each value,
is instead something of a value around 2, like:

       1    1
       2    2
       3    1
       4    2
       5    1
       6    2
       7    1
       8    2
       9    1
      12    2
      11    1
      12    2
      13    1
      14    1
      15    2
      16    2
      17    1
      18    2
      19    1
      22    2
      21    1
      22    2
      23    1
      24    2
      25    1
      26    2
      27    1
      28    2
      29    1
      32    2
      31    1
      32    2
      33    1
      34    2
      35    1
      36    2
      37    1
      38    2
      39    1
      42    2
      41    1
      42    2
      43    1
      44    1
      45    2
      46    2
      47    2
      48    2
      49    1
      52    2
      51    1
      52    2
      53    1
      54    2
      55    1
      56    2
      57    1
      58    1
      59    2
      62    1
      61    2
      62    2
      63    1
      64    2
      65    1
      66    2
      67    1
      68    2
      69    1
      72    2
      71    2
      72    1
      73    2
      74    1
      75    2
      76    1
      77    2
      78    2
      79    2
      82    2
      81    1
      82    2
      83    1
      84    2
      85    1
      86    2
      87    1
      88    1
      89    2
      92    1
      91    2
      92    2
      93    1
      94    2
      95    1
      96    2
      97    1
      98    2
      99    1
     122    2
     121    1
     122    1
     123    2
     124    1
     125    2
     126    1
     127    2
     128    2
     129    1
     112    2
     111    1
     112    2
     113    0
     114    2
     115    1
     116    1
     117    2
     118    1
     119    2
     122    1
     121    2
     122    2
     123    2
     124    2
     125    1
     126    2
     127    1
     128    2
     129    1
     132    2
     131    1
     132    1
     133    2
     134    1
     135    2
     136    1
     137    2
     138    2
     139    1
     142    2
     141    1
     142    2
     143    1
     144    2
     145    1
     146    1
     147    2
     148    1
     149    2
     152    2
     151    2
     152    2
     153    2
     154    2
     155    1
     156    2
     157    1
     158    2
     159    2
     162    1
     161    2
     162    1
     163    2
     164    1
     165    2
     166    1
     167    2
     168    1
     169    2
     172    0
     171    1
     172    2
     173    1
     174    1
     175    2
     176    1
     177    2
     178    1
     179    2
     182    1
     181    2
     182    1
     183    2
     184    2
     185    1
     186    2
     187    1
     188    2
     189    1
     192    1
     191    2
     192    1
     193    0
     194    1
     195    2
     196    1
     197    2
     198    2
     ...    ...
   65516    1
   65517    2
   65518    1
   65519    2
   65520    2
   65521    1
   65522    2
   65523    1
   65524    2
   65525    1
   65526    2
   65527    2
   65528    1
   65529    3
   65530    2
   65531    1
   65532    1
   65533    2
   65534    0
   65535    2

---
---

Note:

About using the internal clock: this is certainly not a random
function, but a linearly growing sawtooth (which resets each say 60
units to 0 again). You are basically sampling values from a sawtooth
so.

  /|   /|   /|   /|       /|
 / |  / |  / |  / |      / |
/  | /  | /  | /  | ... /  |

---
---

Internet: see also:

---

Math: Number: Random: Link: Overview: Can you give an overview of 
links about random numbers?
http://www.faqts.com/knowledge_base/view.phtml/aid/31689/fid/1712

----------------------------------------------------------------------