Mantis Bugtracker
  

Viewing Issue Simple Details Jump to Notes ] View Advanced ] Issue History ] Print ]
ID Category Severity Reproducibility Date Submitted Last Update
0001876 [Squeak] Collections minor always 10-06-05 18:45 02-06-11 23:48
Reporter MarcusDenker View Status public  
Assigned To andreas
Priority normal Resolution fixed  
Status closed   Product Version 3.9
Summary 0001876: [ENH] identityHash
Description In 2001, Scott A Crosby posted a couple of changesets to fix the bad scalability of collections that
use identityhash.

The idea: put the primitive into a method "hashBits", then define "identityHash" as hashBits*primNumber.
So we do the upscaling directly in the hash method, not later in scanFor:

The good effect: scanFor: does not need to test for size, all object that use identityHash get useable
hash values.

Negative: #identityHash now is slower due to the multiplication... but that should not be noticable.

Phillipe did a test of the idea by implementing #hash like that and using OrderdCollection. The
picture attached show the result.

TODO: check if such a big scale is needed... maybe a smaller number is enough?
Additional Information
Attached Files  objectchart.png [^] (17,200 bytes) 10-06-05 18:45
 001IdentityHash.cs [^] (412 bytes) 10-06-05 19:02
 002IdentityHash.cs [^] (1,921 bytes) 10-06-05 19:03
 003dentityHash.cs [^] (246 bytes) 10-06-05 19:03
 004IdentityHash.cs [^] (2,416 bytes) 10-06-05 19:03
 IdentityHashFixForDystemTrace.cs [^] (1,319 bytes) 10-06-05 19:03
 hiccups.png [^] (21,542 bytes) 02-19-06 15:36
 new_dictionary.png [^] (29,068 bytes) 02-21-06 20:28

- Relationships

- Notes
(0002786 - 882 - 945 - 945 - 945 - 945 - 945)
pmm
10-06-05 22:56
edited on: 10-06-05 22:59

Some clarification:

The picture shows the time for accessing 100 Objects in a Dictionary in relation to the size of the Dictionary.
The Object line is the access time as is. The PrimeObject line is the access time when multiplying #identityHash by 100799.
It should be contant, because the number of Objects accessed (100) is constant.
The OrderedCollection is just used to store the keys.

The code (originally posted by Lukas):
| test ord |
test := Dictionary new.
[ test size >= 10000] whileFalse: [
    ord := OrderedCollection new: 100.
    100 timesRepeat: [ 
        test at: ( ord add: Object new ) put: nil ].
    Transcript
        show: test size asString;
        tab.
    Transcript show:
        [
            10 timesRepeat: [ 
                ord do: [ :each | test at: each ] ]
        ] timeToRun asString.
    Transcript cr ]


 
(0003646 - 495 - 594 - 594 - 594 - 594 - 594)
andreas
01-28-06 20:51
edited on: 01-28-06 20:52

Just stumbled across this. Two things to consider:
- If the problem is in Dictionary (contrary to IDDict), why not implement Object>>hash as "self identityHash * BigPrime"?
- There is still a scalability problem in IDSet/IDDict due to an unfortunate choice of the initial size of sets. This should really be changed to something like:
  Set class>>new
     ^self basicNew init: 5.
(try the benchmark with an IDDict with and without this and notice the major difference around 4k elements)

 
(0003906 - 200 - 200 - 200 - 200 - 200 - 200)
pmm
02-19-06 15:40

Yes, Set new is a different issue. See hiccups.png. It shows this is not fixed by multiplying #identityHash by 100799. But changing Set new doesn't fix Dictionaries. So they should both be integrated.
 
(0004774 - 69 - 81 - 81 - 81 - 81 - 81)
ducasse
04-22-06 13:43

Ok guys
I would like to conclude....
So could you give some code :)
 
(0012857 - 77 - 109 - 109 - 109 - 109 - 109)
Keith_Hodges
12-17-08 04:17

"fix begin"
Installer mantis bug: 1876 fix: '004IdentityHash.cs'.
"fix end"
 
(0013858 - 465 - 643 - 643 - 643 - 643 - 643)
nicolas cellier
09-07-10 18:46
edited on: 09-07-10 18:46

This has been resolved in trunk (4.1) with introduction of this method:
scaledIdentityHash
    "For identityHash values returned by primitive 75, answer
    such values times 2^18. Otherwise, match the existing
    identityHash implementation"

    ^self identityHash * 262144 "bitShift: 18"

#scaledIdentityHash must be used as hash index (IdentityDictionary, IdentitySet, ...)
#identityHash can still be used for printing or when collection is known small enough.

 

- Issue History
Date Modified Username Field Change
10-06-05 18:45 MarcusDenker New Issue
10-06-05 18:45 MarcusDenker File Added: objectchart.png
10-06-05 18:46 MarcusDenker Description Updated
10-06-05 19:02 MarcusDenker File Added: 001IdentityHash.cs
10-06-05 19:03 MarcusDenker File Added: 002IdentityHash.cs
10-06-05 19:03 MarcusDenker File Added: 003dentityHash.cs
10-06-05 19:03 MarcusDenker File Added: 004IdentityHash.cs
10-06-05 19:03 MarcusDenker File Added: IdentityHashFixForDystemTrace.cs
10-06-05 22:36 pmm Issue Monitored: pmm
10-06-05 22:56 pmm Note Added: 0002786
10-06-05 22:59 pmm Note Edited: 0002786
01-28-06 20:51 andreas Note Added: 0003646
01-28-06 20:52 andreas Note Edited: 0003646
02-19-06 15:36 pmm File Added: hiccups.png
02-19-06 15:40 pmm Note Added: 0003906
02-21-06 20:28 pmm File Added: new_dictionary.png
04-22-06 13:43 ducasse Note Added: 0004774
12-17-08 04:17 Keith_Hodges Note Added: 0012857
12-17-08 04:17 Keith_Hodges Issue Monitored: Keith_Hodges
12-17-08 04:18 Keith_Hodges Status new => acknowledged
01-09-09 23:54 Keith_Hodges Status acknowledged => testing
01-10-09 03:39 Keith_Hodges Status testing => resolved
01-10-09 03:39 Keith_Hodges Fixed in Version  => 3.11
01-10-09 03:39 Keith_Hodges Resolution open => fixed
01-10-09 03:39 Keith_Hodges Assigned To  => Keith_Hodges
01-10-09 03:41 Keith_Hodges Status resolved => testing
10-03-09 19:34 Keith_Hodges Status testing => assigned
10-03-09 19:34 Keith_Hodges Assigned To Keith_Hodges => andreas
09-07-10 18:46 nicolas cellier Status assigned => resolved
09-07-10 18:46 nicolas cellier Fixed in Version 3.11 => 4.1
09-07-10 18:46 nicolas cellier Note Added: 0013858
09-07-10 18:46 nicolas cellier Note Edited: 0013858
02-06-11 23:48 leves Status resolved => closed


Mantis 1.0.8[^]
Copyright © 2000 - 2007 Mantis Group
100 total queries executed.
46 unique queries executed.
Powered by Mantis Bugtracker