Mantis Bugtracker
  

Viewing Issue Simple Details Jump to Notes ] View Advanced ] Issue History ] Print ]
ID Category Severity Reproducibility Date Submitted Last Update
0007142 [Squeak] Kernel minor always 08-02-08 15:50 08-30-08 01:02
Reporter Cerebus View Status public  
Assigned To
Priority normal Resolution open  
Status new   Product Version 3.10.2
Summary 0007142: LargerInteger>>atRandom produces insufficient random bits
Description (2 raisedTo: 128) atRandom

yields:

B2B5A021656B4000000000000000001
4B223F08964480000000000000000001
B5584CC16AB098000000000000000001
B8172E43702E60000000000000000001

and so on.

The issue is Random produces Floats, which have only 53 bits of entropy. A fix could be to add a #bits: method to Random that supplies a requested number of bits of entropy (with a concurrent change in #atRandom), or to alter LargePositiveInteger to handle the way Random produces bits.
Additional Information The Cryptography Team addressed this by implementing a new random generator class (RandomGenerator abstract class and a SecureRandom subclass). However, the entropy needs for cryptography are stringent and the PRNG as implemented in Random may not be sufficient for that use.

Attached Files  testAtRandomOddity-wiz.1.cs [^] (1,531 bytes) 08-02-08 20:54
 Random-nextLargeInt.st [^] (402 bytes) 08-16-08 20:17

- Relationships

SYSTEM WARNING: Creating default object from empty value

child of 0007152new  Mother of squeak roundoff and truncation errors. 
child of 0007192new  A mother for issues with Random and random number generators in general 

- Notes
(0012434 - 343 - 397 - 397 - 397 - 397 - 397)
wiz
08-02-08 21:02

uploaded a test based on Nicolas's comments about
all large integer randoms being odd.

A fix should make this test pass. ( or help debug the test! ;-)

A fix might also suggest a more extensive test but this will do for a start.

When the test fails debug it and explore the rejects.

Yours in curiosity and service, --Jerome Peace
 
(0012477 - 1669 - 1909 - 1909 - 1909 - 1909 - 1909)
wiz
08-10-08 21:38
edited on: 08-10-08 22:01

more analysis:

The limitation is not necessarily in the generator but in the combining of the floating point and large integer in nextInt:

and of course this is a problem

24 factorial + 1000000 = (24 factorial * 1.0)
will answer true.

while
( 24 factorial - (24 factorial * 1.0 ) asInteger ) = 0
will answer false.

There is a truncation difficulty with floats.

I say difficulty rather than bug because it is not spec'ed to work otherwise.

======

I finally got around to looking inside the Random class to see exactly what
it was doing.

Basiclly it generates a 31 bit random integer from a 31 bit seed
for each next request.
It answers a float based on seed / ((2 raisedTo: 31) -1) . So the float looks like 53 bits of randomness but really represents only 31.

Furthermore the internal state of the generator is just the seed 31 bits.

The random protocol does not provide direct access to the seed. and the method for generating the next seed #nextValue is catagorized as private.

nextValue returns a 31 bit integer between 0 and the modulus.

===

So a solution to the problem would require LargePositiveInteger to have its own method for #atRandom: and in that method to do all the arithmetic as integers i..e. not mix float and integer aritmetic to avoid the floating point round off.

Then it would require a generator that could produce a large number of uncorrelated random integers on a separate passes.

The protocol for Random should probably be expanded to allow for an integer to be returned as the seed gets updated. Or the new generator could reimplement #getInt: to handle largeInts gracefully.

 
(0012503 - 1132 - 1252 - 1252 - 1252 - 1252 - 1252)
wiz
08-16-08 20:32

uploaded Random-nextLargeInt.st

This is just a patched version of nextInt: which does all the arithmetic
with integers rather than floats.

This will return randoms with the low order bits distributed.
And it demonstrates the problem is with the floating point/ integer arithmetic.

Obviously there is also a limitation in that no more that 2^31-1 choices can be had by any one random.

And it raises some questions.
Examining a random show that all of the parameters are stored as floating points. Even the seed is stored in floating point form after the first iteration of next is performed.

So why is algorithm limiting itself to 31 bits of randomness?
How many bits would be practical to get out of the generator considerng
everything is stored as fp with a 53 bit mantissa?
How (Where, this is the age of the internet) would you find one.

Alternately, could a faster version of this rand generator be writen by using only small integer arithmetic? (I suspect the practical answer would be no or not worthwile, but the question itself is worth raising.)

Yours in service and curiosity, --Jerome Peace
 
(0012511 - 1603 - 1835 - 1835 - 1835 - 1835 - 1835)
nicolas cellier
08-22-08 21:54

Random comments clearly states why Float are used.
Random>>#nextValue comments clearly states the 31 bit limit.

You clearly know that provided method in Integer arithmetic won't work,
Because you clearly have to subclass Random for the LargeInteger case...

By the way, (a / b) truncated = (a quo: b),
while (a / b) floor = (a // b).

Not so clear however how to implement a GOOD PRNG able of arbitrary Interval size.
GOOD means at least:
- uniform density
- cycles not too short (larger than the targeted LargeInteger)
- whiteness, that is uncorrelated sequences,

For a silly example of what whiteness is about, (seed := seed + 1 \\ m) is uniform, has long sequences (of length m) whatever the seed, but is absolutely not a good candidate regarding correlation!

To have an idea about correlation properties, generate a pseudo random sequence seq of length N, and compute autocorrelation like this:

avg := seq avg.
autocorr := Array new: N.
0 to: N - 1 do: [:j |
   autocorr at: j + 1 put:
     ((1 to: N) detectSum: [:i | (seq at: i) - avg * ((seq atWrap: i + j) - avg)]) / (N-1)].

If you plot (autocorr at: j) versus j, the first element of autocorr ideally should be a high value (seq variance), while every other element should be near zero.

For a uniform distribution between 0 and m-1, average is (m-1)/2, while variance is (m/2) squared / 12.

PRNG are usually cyclic, so if N grows enough, you'll find the exact same sequence again and get a periodic autocorrelation function.
Be prepared for more maths, notably group theory, number theory etc...
Good luck.
 
(0012551 - 1417 - 1626 - 1626 - 1626 - 1626 - 1626)
wiz
08-30-08 01:02

Hi Nicholas,

Thank for chiming in. Useful knowledge to be reminded of.


Not too surprised to find the routine doesn't completely work.
I am still on the searching/learning curve here.

It was not meant as a solution but as a demo of the float/integer difficulty.
Also for positives numbers truncation and floor are identical cousins. So I did not try to hard to distinguish because the code I was swiping from didn't.

 Number class takes care of sign removal/addition before delagating getInt to LargePositiveInteger.

And again this is analysis (and solution searching). I don't claim to have reached the end of that phase.

=====

I am also curious If the seed equal to the modullous is possible to produce in the chain.

Since the modulous is what next divides by to obtain the fraction., if the modulus (16r7FFFFFFF) can be produced then the interval that next returns would be [0 1] or maybe (0 1] and not what the comment says [0 1).
I don't think next can produce a 0 since the seed can not be zero unless it is seeded as such. The lowest it could go would be 1/modulus.





Can you tell how to run the random generator backwards. I.e. get a previous value from a seed? (s.t.
us := Random new.
( Random seed: us prevValue ) nextValue = us seed )

I looked at ita little but got lost.

Again thanks for your input.

Yours in curiosity and service, --Jerome Peace
 

- Issue History
Date Modified Username Field Change
08-02-08 15:50 Cerebus New Issue
08-02-08 15:50 Cerebus Issue Monitored: Cerebus
08-02-08 20:54 wiz File Added: testAtRandomOddity-wiz.1.cs
08-02-08 21:02 wiz Note Added: 0012434
08-10-08 21:38 wiz Note Added: 0012477
08-10-08 22:01 wiz Note Edited: 0012477
08-10-08 22:25 wiz Relationship added child of 0007152
08-16-08 20:17 wiz File Added: Random-nextLargeInt.st
08-16-08 20:32 wiz Note Added: 0012503
08-22-08 21:54 nicolas cellier Note Added: 0012511
08-30-08 01:02 wiz Note Added: 0012551
09-15-08 21:33 wiz Relationship added child of 0007192


Mantis 1.0.8[^]
Copyright © 2000 - 2007 Mantis Group
70 total queries executed.
43 unique queries executed.
Powered by Mantis Bugtracker