SYSTEM WARNING: Creating default object from empty value

Mantis - Squeak
Viewing Issue Advanced Details
7142 Kernel minor always 08-02-08 15:50 08-30-08 01:02
Cerebus  
 
normal  
new 3.10.2  
open  
none    
none  
0007142: LargerInteger>>atRandom produces insufficient random bits
(2 raisedTo: 128) atRandom

yields:

B2B5A021656B4000000000000000001
4B223F08964480000000000000000001
B5584CC16AB098000000000000000001
B8172E43702E60000000000000000001

and so on.

The issue is Random produces Floats, which have only 53 bits of entropy. A fix could be to add a #bits: method to Random that supplies a requested number of bits of entropy (with a concurrent change in #atRandom), or to alter LargePositiveInteger to handle the way Random produces bits.
The Cryptography Team addressed this by implementing a new random generator class (RandomGenerator abstract class and a SecureRandom subclass). However, the entropy needs for cryptography are stringent and the PRNG as implemented in Random may not be sufficient for that use.

child of 0007152new  Mother of squeak roundoff and truncation errors. 
child of 0007192new  A mother for issues with Random and random number generators in general 
 testAtRandomOddity-wiz.1.cs [^] (1,531 bytes) 08-02-08 20:54
 Random-nextLargeInt.st [^] (402 bytes) 08-16-08 20:17

Notes
(0012434)
wiz   
08-02-08 21:02   
uploaded a test based on Nicolas's comments about
all large integer randoms being odd.

A fix should make this test pass. ( or help debug the test! ;-)

A fix might also suggest a more extensive test but this will do for a start.

When the test fails debug it and explore the rejects.

Yours in curiosity and service, --Jerome Peace
(0012477)
wiz   
08-10-08 21:38   
more analysis:

The limitation is not necessarily in the generator but in the combining of the floating point and large integer in nextInt:

and of course this is a problem

24 factorial + 1000000 = (24 factorial * 1.0)
will answer true.

while
( 24 factorial - (24 factorial * 1.0 ) asInteger ) = 0
will answer false.

There is a truncation difficulty with floats.

I say difficulty rather than bug because it is not spec'ed to work otherwise.

======

I finally got around to looking inside the Random class to see exactly what
it was doing.

Basiclly it generates a 31 bit random integer from a 31 bit seed
for each next request.
It answers a float based on seed / ((2 raisedTo: 31) -1) . So the float looks like 53 bits of randomness but really represents only 31.

Furthermore the internal state of the generator is just the seed 31 bits.

The random protocol does not provide direct access to the seed. and the method for generating the next seed #nextValue is catagorized as private.

nextValue returns a 31 bit integer between 0 and the modulus.

===

So a solution to the problem would require LargePositiveInteger to have its own method for #atRandom: and in that method to do all the arithmetic as integers i..e. not mix float and integer aritmetic to avoid the floating point round off.

Then it would require a generator that could produce a large number of uncorrelated random integers on a separate passes.

The protocol for Random should probably be expanded to allow for an integer to be returned as the seed gets updated. Or the new generator could reimplement #getInt: to handle largeInts gracefully.

(0012503)
wiz   
08-16-08 20:32   
uploaded Random-nextLargeInt.st

This is just a patched version of nextInt: which does all the arithmetic
with integers rather than floats.

This will return randoms with the low order bits distributed.
And it demonstrates the problem is with the floating point/ integer arithmetic.

Obviously there is also a limitation in that no more that 2^31-1 choices can be had by any one random.

And it raises some questions.
Examining a random show that all of the parameters are stored as floating points. Even the seed is stored in floating point form after the first iteration of next is performed.

So why is algorithm limiting itself to 31 bits of randomness?
How many bits would be practical to get out of the generator considerng
everything is stored as fp with a 53 bit mantissa?
How (Where, this is the age of the internet) would you find one.

Alternately, could a faster version of this rand generator be writen by using only small integer arithmetic? (I suspect the practical answer would be no or not worthwile, but the question itself is worth raising.)

Yours in service and curiosity, --Jerome Peace
(0012511)
nicolas cellier   
08-22-08 21:54   
Random comments clearly states why Float are used.
Random>>#nextValue comments clearly states the 31 bit limit.

You clearly know that provided method in Integer arithmetic won't work,
Because you clearly have to subclass Random for the LargeInteger case...

By the way, (a / b) truncated = (a quo: b),
while (a / b) floor = (a // b).

Not so clear however how to implement a GOOD PRNG able of arbitrary Interval size.
GOOD means at least:
- uniform density
- cycles not too short (larger than the targeted LargeInteger)
- whiteness, that is uncorrelated sequences,

For a silly example of what whiteness is about, (seed := seed + 1 \\ m) is uniform, has long sequences (of length m) whatever the seed, but is absolutely not a good candidate regarding correlation!

To have an idea about correlation properties, generate a pseudo random sequence seq of length N, and compute autocorrelation like this:

avg := seq avg.
autocorr := Array new: N.
0 to: N - 1 do: [:j |
   autocorr at: j + 1 put:
     ((1 to: N) detectSum: [:i | (seq at: i) - avg * ((seq atWrap: i + j) - avg)]) / (N-1)].

If you plot (autocorr at: j) versus j, the first element of autocorr ideally should be a high value (seq variance), while every other element should be near zero.

For a uniform distribution between 0 and m-1, average is (m-1)/2, while variance is (m/2) squared / 12.

PRNG are usually cyclic, so if N grows enough, you'll find the exact same sequence again and get a periodic autocorrelation function.
Be prepared for more maths, notably group theory, number theory etc...
Good luck.
(0012551)
wiz   
08-30-08 01:02   
Hi Nicholas,

Thank for chiming in. Useful knowledge to be reminded of.


Not too surprised to find the routine doesn't completely work.
I am still on the searching/learning curve here.

It was not meant as a solution but as a demo of the float/integer difficulty.
Also for positives numbers truncation and floor are identical cousins. So I did not try to hard to distinguish because the code I was swiping from didn't.

 Number class takes care of sign removal/addition before delagating getInt to LargePositiveInteger.

And again this is analysis (and solution searching). I don't claim to have reached the end of that phase.

=====

I am also curious If the seed equal to the modullous is possible to produce in the chain.

Since the modulous is what next divides by to obtain the fraction., if the modulus (16r7FFFFFFF) can be produced then the interval that next returns would be [0 1] or maybe (0 1] and not what the comment says [0 1).
I don't think next can produce a 0 since the seed can not be zero unless it is seeded as such. The lowest it could go would be 1/modulus.





Can you tell how to run the random generator backwards. I.e. get a previous value from a seed? (s.t.
us := Random new.
( Random seed: us prevValue ) nextValue = us seed )

I looked at ita little but got lost.

Again thanks for your input.

Yours in curiosity and service, --Jerome Peace