Mantis Bugtracker
  

Viewing Issue Advanced Details Jump to Notes ] View Simple ] Issue History ] Print ]
ID Category Severity Reproducibility Date Submitted Last Update
0007109 [Squeak] VM minor always 07-03-08 23:56 05-16-11 12:06
Reporter nicolas cellier View Status public  
Assigned To lewis
Priority normal Resolution open Platform
Status testing   OS
Projection none   OS Version
ETA none Fixed in Version Product Version 3.10
  Product Build
Summary 0007109: [ENH] Speed-up LargeInteger bitShift:
Description Positive (left) shift are optimized when byte-aligned
Negative (right) shifts are not!

You can see this with a code like this:
| x |
x := SmallInteger maxVal raisedTo: 1000.
x digitLength. x highBit.
{Time millisecondsToRun: [10000 timesRepeat: [x bitShift: -12000]].
Time millisecondsToRun: [10000 timesRepeat: [x bitShift: -11997]].
}
=> #(324 376) Not much difference whether byte aligned or not

While:
| x |
x := SmallInteger maxVal raisedTo: 100.
{Time millisecondsToRun: [10000 timesRepeat: [x bitShift: 12000]].
Time millisecondsToRun: [10000 timesRepeat: [x bitShift: 11997]].
}
=> #(90 180) a big difference whether byte aligned or not
Steps To Reproduce
Additional Information bitShifts should be fast, because often used in tight loops
(I need them for fast divide and conquer algorithms).
I can as well write a byteShift: more efficient than bitShift: (see below)
But it would be preferable to improve LargeIntegersPlugin

The byteShift: code is here if you wanna try:
LargePositiveInteger>>byteShift: n
    "shift by bytes, which is faster than bits when n < 0"
    | shifted |
    n = 0 ifTrue: [^ self].
    n > 0 ifTrue: [
        shifted := self class new: self digitLength + n.
        shifted replaceFrom: n+1 to: self digitLength + n with: self startingAt: 1.
        ^ shifted].
    self digitLength + n <= 0 ifTrue: [^0].
    shifted := self class new: self digitLength + n.
    shifted replaceFrom: 1 to: self digitLength + n with: self startingAt: 1 - n.
    ^ shifted normalize

Which could also be written:
LargePositiveInteger>>byteShift: n
    "shift by bytes, which is faster than bits when n < 0"
    | shifted |
    n = 0 ifTrue: [^ self].
    self digitLength + n <= 0 ifTrue: [^0].
    shifted := self class new: self digitLength + n.
    shifted replaceFrom: (1 max: 1 + n) to: self digitLength + n with: self
startingAt: (1 max: 1 - n).
    ^ shifted normalize
Attached Files  LargeIntegersPlugin-SpeedUp-BitShift-Patch-M7109-nice.1.cs [^] (3,852 bytes) 07-04-08 00:39
 LargeIntegersPlugin-SpeedUp-BitShift-Test-M7109-nice.1.cs [^] (1,423 bytes) 07-04-08 00:40

- Relationships

- Notes
(0012331 - 1994 - 2247 - 2247 - 2247 - 2247 - 2247)
nicolas cellier
07-04-08 00:26
edited on: 07-04-08 19:24

I loaded VMMaker 3.8b6 from SqueakMap.
I took a look at the plugin loaded.
I added a test for rshift on byte boundary.
I simplified the code a little bit (ah ah) by removing some variables:
I worked on 16 bits directly rather than operating twice on the same digit to fit in 8 bits.
I changed all bitShift: into explicit << and >> inside loop
(bitShift: costs a runtime test when the shift is a variable and direction cannot be infered).
I opened VMMakerTool openInWorld.
I generated LargeIntegersPlugin as external (just a hack to generate the minimum)
then copied in src/vm/intplugins, compiled installed executed tested.

Yet, the test passes and the image does not crash...

This is a very simple naive improvment, but yet some gain running this test:

| x |
x := SmallInteger maxVal raisedTo: 100.
OrderedCollection new
 add: (Time millisecondsToRun: [100000 timesRepeat: [x bitShift: -1600]]);
 add: (Time millisecondsToRun: [100000 timesRepeat: [x bitShift: -1597]]);
 add: (Time millisecondsToRun: [100000 timesRepeat: [x bitShift: 1597]]);
 add: (Time millisecondsToRun: [100000 timesRepeat: [x bitShift: 1600]]);
 asArray

Elder 3.10-1 unix VM: #(505 550 947 495) #(503 554 933 489)
Patched 3.10-1nice VM: #(357 416 673 495) #(362 417 675 498)
VW 7.6 reference VM: #(130 138 356 344) #(124 139 347 348)

Still much room for improvment! (though silly test measures GC activity as well)

I guess we can safely write (uchar at: i put: x) instead of (uchar at: i put: (x bitAnd: 255)) (don't know if compiler eliminates that...).
Didn't do it, because i thought a Simulator might fail...

Much more could be obtained by using a OS optimized memcopy() in case of byte alignment, and machine specific code for example fetching bytes 4 by 4 or 8 by 8 on 64 bits (some kind of loop unrolling). Of course, one has to deal carefully with proper unsigned declarations, endianness etc...
Quite sure some of the enhancements can be written in SLang though...

 
(0013846 - 531 - 686 - 686 - 686 - 686 - 686)
nicolas cellier
08-29-10 18:45

The patch also is COG-ready
I just tried above VMMaker-oscog.26
(same source modulo <var:> declaration)

| x |
x := SmallInteger maxVal raisedTo: 100.
#(-1600 -1597 1597 1600) collect: [ :shift |
       [ 1 to: 100000 do: [ :i | x bitShift: shift ] ] timeToRun ].


What I get on my Mac with deployment build is
Original COG
#(116 114 219 65)
#(116 113 219 65)
COG + 7109
#(47 77 120 61)
#(46 75 120 61)

VWNC7.7 for reference:
#(62.578 milliseconds 71.045 milliseconds 178.985 milliseconds 159.213 milliseconds)
 
(0014106 - 147 - 159 - 159 - 159 - 159 - 159)
lewis
05-16-11 12:06

Tested on 32 and 64-bit HOST, standard interpreter, and 64-bit image, standard interpreter, all tests pass.

Added to SqS/VMMaker VMMaker-dtl.234
 

- Issue History
Date Modified Username Field Change
07-03-08 23:56 nicolas cellier New Issue
07-04-08 00:26 nicolas cellier Note Added: 0012331
07-04-08 00:39 nicolas cellier File Added: LargeIntegersPlugin-SpeedUp-BitShift-Patch-M7109-nice.1.cs
07-04-08 00:40 nicolas cellier File Added: LargeIntegersPlugin-SpeedUp-BitShift-Test-M7109-nice.1.cs
07-04-08 17:26 KenCausey Assigned To  => tim
07-04-08 17:26 KenCausey Status new => assigned
07-04-08 17:26 KenCausey Category Kernel => VM
07-04-08 19:24 nicolas cellier Note Edited: 0012331
01-22-09 02:12 lewis Issue Monitored: lewis
05-14-10 22:02 lewis Assigned To tim => lewis
08-29-10 18:45 nicolas cellier Note Added: 0013846
05-16-11 12:06 lewis Note Added: 0014106
05-16-11 12:06 lewis Status assigned => testing


Mantis 1.0.8[^]
Copyright © 2000 - 2007 Mantis Group
56 total queries executed.
36 unique queries executed.
Powered by Mantis Bugtracker