Notes 

(0012434)

wiz

080208 21:02


uploaded a test based on Nicolas's comments about
all large integer randoms being odd.
A fix should make this test pass (or help debug the test! ;)
A fix might also suggest a more extensive test, but this will do for a start.
When the test fails, debug it and explore the rejects.
Yours in curiosity and service, Jerome Peace 


(0012477)

wiz

081008 21:38


more analysis:
The limitation is not necessarily in the generator but in how nextInt: combines the floating point and large integer arithmetic.
And of course this is a problem:
24 factorial + 1000000 = (24 factorial * 1.0)
will answer true,
while
(24 factorial - (24 factorial * 1.0) asInteger) = 0
will answer false.
There is a truncation difficulty with floats.
I say difficulty rather than bug because it is not spec'ed to work otherwise.
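The same truncation effect can be sketched in Python, whose floats are the same IEEE 754 doubles (an illustration, not Squeak code):

```python
import math

n = math.factorial(24)     # 620448401733239439360000, needs about 79 bits
f = float(n)               # rounded to the nearest 53-bit-mantissa double

# 1000000 is smaller than the spacing between doubles at this magnitude,
# so the comparison cannot see the difference:
assert float(n + 1000000) == f

# But converting the float back does not recover n exactly:
assert int(f) != n
print(n - int(f))          # the round-off error swallowed by the float
```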
======
I finally got around to looking inside the Random class to see exactly what
it was doing.
Basically it generates a 31-bit random integer from a 31-bit seed
for each next request.
It answers a float based on seed / ((2 raisedTo: 31) - 1). So the float looks like 53 bits of randomness but really represents only 31.
Furthermore, the internal state of the generator is just the 31-bit seed.
The Random protocol does not provide direct access to the seed, and the method for generating the next seed, #nextValue, is categorized as private.
nextValue returns a 31-bit integer between 0 and the modulus.
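As a rough model, the behaviour described above matches a Lehmer-style multiplicative congruential generator. A Python sketch follows; the constants a = 16807 and m = 2^31 - 1 are the classic Park-Miller "minimal standard" values, assumed here for illustration rather than read out of the image:

```python
A = 16807             # multiplier (Park-Miller "minimal standard"; an assumption)
M = (1 << 31) - 1     # modulus 2^31 - 1, so the whole state is 31 bits

def next_value(seed):
    """One private step: a new 31-bit seed in [1, M - 1]."""
    return (A * seed) % M

def next_float(seed):
    """What #next would answer: the new seed and seed / (2^31 - 1)."""
    seed = next_value(seed)
    return seed, seed / M

seed, x = next_float(42)
assert 0.0 < x < 1.0
# Every float produced is k / M for some integer k, so at most M - 1
# distinct values exist: 31 bits of randomness dressed up as 53.
```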
===
So a solution to the problem would require LargePositiveInteger to have its own method for #atRandom:, and in that method to do all the arithmetic as integers, i.e. not mix float and integer arithmetic, to avoid the floating point round off.
Then it would require a generator that could produce a large number of uncorrelated random integers on separate passes.
The protocol for Random should probably be expanded to allow an integer to be returned as the seed gets updated. Or the new generator could reimplement #getInt: to handle LargeIntegers gracefully.
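The all-integer approach can be sketched in Python; the helper names here, and the use of the stdlib generator as a stand-in for the underlying 31-bit source, are assumptions for illustration only:

```python
import random  # stdlib generator stands in for the hypothetical 31-bit source

BITS = 31

def next_chunk():
    """Stand-in for one 31-bit draw from the underlying generator."""
    return random.getrandbits(BITS)

def next_int(n):
    """Uniform integer in [1, n] using integer arithmetic only.

    Concatenates as many 31-bit chunks as needed, then rejection-samples,
    so no floating point (and no modulo bias) is involved.
    """
    k = max(1, (n.bit_length() + BITS - 1) // BITS)   # chunks needed to cover n
    limit = 1 << (k * BITS)
    while True:
        r = 0
        for _ in range(k):
            r = (r << BITS) | next_chunk()
        # reject the top sliver that would bias the modulo
        if r < limit - (limit % n):
            return r % n + 1

big = 10 ** 40
sample = next_int(big)
assert 1 <= sample <= big
# Low-order bits are now uniformly distributed: both odd and even results occur.
```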



(0012503)

wiz

081608 20:32


uploaded RandomnextLargeInt.st
This is just a patched version of nextInt: which does all the arithmetic
with integers rather than floats.
This will return randoms with the low-order bits distributed.
And it demonstrates that the problem is with the floating point/integer arithmetic.
Obviously there is also a limitation in that no more than (2 raisedTo: 31) - 1 choices can be had from any one random.
And it raises some questions.
Examining a random shows that all of the parameters are stored as floating point. Even the seed is stored in floating point form after the first iteration of next is performed.
So why is the algorithm limiting itself to 31 bits of randomness?
How many bits would be practical to get out of the generator, considering everything is stored as fp with a 53-bit mantissa?
How (or where; this is the age of the internet) would you find one?
Alternately, could a faster version of this random generator be written using only small integer arithmetic? (I suspect the practical answer would be no, or not worthwhile, but the question itself is worth raising.)
Yours in service and curiosity, Jerome Peace 


(0012511)

nicolas cellier

082208 21:54


Random's comments clearly state why Floats are used.
Random>>#nextValue's comment clearly states the 31-bit limit.
You clearly know that the provided method in Integer arithmetic won't work,
because you clearly have to subclass Random for the LargeInteger case...
By the way, (a / b) truncated = (a quo: b),
while (a / b) floor = (a // b).
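The distinction only bites for negative operands; in Python terms (an aside for illustration, not Squeak code):

```python
import math

a, b = -7, 2
# Smalltalk (a / b) truncated = (a quo: b): truncation toward zero
assert math.trunc(a / b) == -3
# Smalltalk (a / b) floor = (a // b): flooring toward negative infinity
assert a // b == -4
# For non-negative operands the two agree, which is why mixing them up
# goes unnoticed in code that only handles positive ranges.
assert math.trunc(7 / 2) == 7 // 2 == 3
```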
Not so clear, however, is how to implement a GOOD PRNG capable of arbitrary Interval size.
GOOD means at least:
 uniform density
 cycles not too short (longer than the targeted LargeInteger)
 whiteness, that is, uncorrelated sequences.
For a silly example of what whiteness is about, (seed := seed + 1 \\ m) is uniform and has long cycles (of length m) whatever the seed, but is absolutely not a good candidate regarding correlation!
To have an idea about correlation properties, generate a pseudo random sequence seq of length N, and compute the autocorrelation like this:
avg := seq avg.
autocorr := Array new: N.
0 to: N - 1 do: [:j |
	autocorr at: j + 1 put:
		((1 to: N) detectSum: [:i | ((seq at: i) - avg) * ((seq atWrap: i + j) - avg)]) / (N - 1)].
If you plot (autocorr at: j + 1) versus j, the first element of autocorr ideally should be a high value (the variance of seq), while every other element should be near zero.
For a uniform distribution between 0 and m - 1, the average is (m - 1)/2, while the variance is (m squared - 1)/12.
PRNGs are usually cyclic, so if N grows enough, you'll find the exact same sequence again and get a periodic autocorrelation function.
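The check above can be rendered in Python and applied to the "silly" counter generator, showing a sequence that is perfectly uniform yet fails whiteness badly (N and m are arbitrary small choices for illustration):

```python
def autocorr(seq):
    """Circular autocorrelation, mirroring the Smalltalk snippet above."""
    n = len(seq)
    avg = sum(seq) / n
    return [
        sum((seq[i] - avg) * (seq[(i + j) % n] - avg) for i in range(n)) / (n - 1)
        for j in range(n)
    ]

m = N = 64
counter = [(s + 1) % m for s in range(N)]   # seed := seed + 1 \\ m
ac = autocorr(counter)

# Lag 0 is the sample variance -- large, as expected:
assert ac[0] > 0
# But lag 1 is nearly as large: strong correlation, i.e. no whiteness.
# For a white sequence ac[1] would be near zero.
assert abs(ac[1]) > ac[0] / 2
```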
Be prepared for more maths, notably group theory, number theory etc...
Good luck. 


(0012551)

wiz

083008 01:02


Hi Nicolas,
Thanks for chiming in. Useful knowledge to be reminded of.
Not too surprised to find the routine doesn't completely work.
I am still on the searching/learning curve here.
It was not meant as a solution but as a demo of the float/integer difficulty.
Also, for positive numbers truncation and floor are identical cousins. So I did not try too hard to distinguish them, because the code I was swiping from didn't.
The Number class takes care of sign removal/addition before delegating getInt to LargePositiveInteger.
And again, this is analysis (and solution searching). I don't claim to have reached the end of that phase.
=====
I am also curious if a seed equal to the modulus can be produced in the chain.
Since the modulus is what next divides by to obtain the fraction, if the modulus (16r7FFFFFFF) can be produced then the interval that next returns would be [0 1], or maybe (0 1], and not what the comment says, [0 1).
I don't think next can produce a 0, since the seed cannot be zero unless it is seeded as such. The lowest it could go would be 1/modulus.
Can you tell me how to run the random generator backwards, i.e. get a previous value from a seed? (s.t.
us := Random new.
(Random seed: us prevValue) nextValue = us seed )
I looked at it a little but got lost.
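On the "running backwards" question: if the generator really is a Lehmer multiplicative congruential one (seed := seed * a \\ m, which is an assumption about Squeak's implementation, with the Park-Miller constants assumed as well), it can be inverted exactly, because a is coprime to the prime modulus; the previous seed is the current seed times the modular inverse of a:

```python
A = 16807              # assumed Park-Miller multiplier
M = (1 << 31) - 1      # prime modulus 2^31 - 1

A_INV = pow(A, -1, M)  # modular inverse (Python 3.8+): A * A_INV is 1 mod M

def next_value(seed):
    """Step the Lehmer generator forward one state."""
    return (A * seed) % M

def prev_value(seed):
    """Step the Lehmer generator backward one state."""
    return (A_INV * seed) % M

s = 123456789
assert prev_value(next_value(s)) == s
assert next_value(prev_value(s)) == s
```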
Again thanks for your input.
Yours in curiosity and service, Jerome Peace 
