Mantis - Squeak
Viewing Issue Advanced Details
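ID: 6755
Category: Collections
Severity: feature
Reproducibility: N/A
Date Submitted: 11-06-07 23:25
Last Update: 11-09-07 22:55
Reporter: nicolas cellier
Priority: normal
Status: new
Product Version: 3.10
Resolution: open
Projection: none
ETA: none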
0006755: [ENH] Connect EndOfStream
EndOfStream Exception seems to be unused in 3.10

My understanding was that (if ever signalled):

  [[aStream next doSomething] repeat]
      on: EndOfStream do: [:exc | exc return: nil].

would be more efficient than testing atEnd at each loop:

  [aStream atEnd]
      whileFalse: [aStream next doSomething].

Maybe isNil test is efficient too but I want to be able to stream on nil!

  | nxt |
  [nxt := aStream next.
  nxt == nil]
      whileFalse: [nxt doSomething].

This has been proposed many times but never connected...

http://lists.squeakfoundation.org/pipermail/squeak-dev/2000-May/014403.html
http://lists.squeakfoundation.org/pipermail/squeak-dev/2000-May/020882.html

http://aspn.activestate.com/ASPN/Mail/Message/squeak-list/1924925

http://lists.squeakfoundation.org/pipermail/squeak-harvest/2004-January/001114.html
http://lists.squeakfoundation.org/pipermail/squeak-harvest/2004-October/006974.html
http://lists.squeakfoundation.org/pipermail/squeak-harvest/2004-October/007009.html

According to the last reference, it would break some code.
I guess that is because such code catches too widely:
    [...] on: Error do: [...]

So a simple solution would be to change the superclass of EndOfStream from Error to Notification.
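
A minimal workspace sketch of why the superclass matters. It uses a plain Notification as a stand-in for a reparented EndOfStream (an assumption made so that the snippet does not depend on the patch): a wide Error handler does not catch it, and when unhandled the signal simply answers nil and execution continues.

  | caught result |
  caught := false.
  "Stand-in: Notification plays the role of a reparented EndOfStream."
  result := [Notification signal. #resumed]
      on: Error
      do: [:exc | caught := true. exc return: #caught].
  "The Error handler is never entered, so caught stays false
  and the protected block runs to its end."

With EndOfStream under Error, the same wide handler would intercept the signal, which is presumably what breaks existing code.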

Please find a patch attached.
 ConnectEndOfStream-M6755-nice.1.cs (3,315 bytes) 11-06-07 23:45
 RepeatUntil-M6755-nice.1.cs (511 bytes) 11-07-07 00:53
 EndOfStream-M6755-nice-PerformanceTest.1.cs (8,907 bytes) 11-08-07 22:51
 NumberReadFromDoNotReadPastEnd-M6755-nice.1.cs (945 bytes) 11-09-07 22:55

Notes
(0011427)
nicolas cellier   
11-07-07 00:13   
This ugly code:

  [[aStream next doSomething] repeat]
      on: EndOfStream do: [:exc | exc return: nil].

would be smarter like this:

  [aStream next doSomething] repeatUntil: EndOfStream.

See uploaded item 2
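
One way such a method could look is sketched below. This is a hypothetical reconstruction built from the "ugly" form above, not the content of the uploaded change set; it assumes the method is defined on the block class (BlockContext in 3.10).

  repeatUntil: anExceptionClass
      "Hypothetical sketch: evaluate the receiver repeatedly until
      anExceptionClass is signalled, then answer nil."
      ^[self repeat]
          on: anExceptionClass
          do: [:exc | exc return: nil]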


PS: UNFORTUNATELY, repeat loops are not inlined by the old Squeak Compiler...
This tends to degrade the performance of tight loops.

(0011430)
nicolas cellier   
11-08-07 23:18   
I uploaded some micro-benchmark tests, as initiated by Andreas Raab at http://lists.squeakfoundation.org/pipermail/squeak-dev/2007-November/122434.html

These tests answer important questions:
- what performance gain/drop can we expect?
- would it be worth rewriting some Stream loops if change accepted?

They don't answer this one:
- what is the real impact on performance with real stream usage?

I added the nextIfAtEnd: block loop as proposed by Matthew and Jason on the same thread.
The tests clearly show that this approach is bad for performance, as Paolo and I already said on the same thread.

In results below:
nil) is current ^nil implementation.
eos) is ^EndOfStream signal
blk) is ^self nextIfAtEnd: [nil].
rtb) is ^self nextIfAtEnd: [^nil]. (non local return)
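
For orientation, the blk) and rtb) variants might be sketched as follows. This is an illustrative guess, not the code of the uploaded change set; it assumes nextIfAtEnd: is defined on ReadStream and uses the instance variables collection, position and readLimit inherited from PositionableStream.

  nextIfAtEnd: aBlock
      "Hypothetical sketch: answer the next element,
      or the value of aBlock when the stream is at its end."
      ^position >= readLimit
          ifTrue: [aBlock value]
          ifFalse: [collection at: (position := position + 1)]

  next
      "blk) variant: the block answers nil."
      ^self nextIfAtEnd: [nil]

  next
      "rtb) variant: the block returns nil non-locally from next."
      ^self nextIfAtEnd: [^nil]

Both variants create a block and send an extra message on every call to next, which would be consistent with their higher timings below.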


On my 1GHz Athlon linux VM I get:

-------------------------------------
TESTS ON SHORT STRING 10 chars
-------------------------------------
implementation nil) pastEndTest = 88
implementation nil) atEndTest = 106

implementation eos) pastEndTest = 465
implementation eos) atEndTest = 105
implementation eos) ExceptionTest = 555

implementation blk) pastEndTest = 510
implementation blk) atEndTest = 482
implementation blk) BlockTest = 323

implementation rtb) pastEndTest = 510
implementation rtb) atEndTest = 487
implementation rtb) BlockTest = 326

-------------------------------------
TESTS ON ByteArray 100 bytes
-------------------------------------
implementation nil) pastEndTest = 72
implementation nil) atEndTest = 92

implementation eos) pastEndTest = 110
implementation eos) atEndTest = 94
implementation eos) ExceptionTest = 111

implementation blk) pastEndTest = 450
implementation blk) atEndTest = 468
implementation blk) BlockTest = 276

implementation rtb) pastEndTest = 452
implementation rtb) atEndTest = 472
implementation rtb) BlockTest = 280

-------------------------------------
TESTS ON ByteArray 1000 bytes
-------------------------------------
implementation nil) pastEndTest = 702
implementation nil) atEndTest = 918

implementation eos) pastEndTest = 756
implementation eos) atEndTest = 928
implementation eos) ExceptionTest = 675

implementation blk) pastEndTest = 4458
implementation blk) atEndTest = 4675
implementation blk) BlockTest = 2736

implementation rtb) pastEndTest = 4475
implementation rtb) atEndTest = 4722
implementation rtb) BlockTest = 2767

[atEnd] whileFalse loops are unchanged, of course, since they do not read past the end.
[==nil] whileFalse pastEnd loops degrade if no rewrite is applied.
The degradation becomes negligible at about 1000 elements (5% for an empty loop, and of course less for a loop doing real processing).

Rewriting could benefit atEnd loops over 130 elements (not shown here)
and pastEnd loops over 400 elements (not shown either).
However, it would be worthwhile only on large streams (files).

The real open question remains the use of ==nil pastEnd tests on short streams in tight loops...
Given 223 users of ReadStream and 6603 senders of ==, that is not easy to analyze.

(0011432)
nicolas cellier   
11-09-07 22:31   
A way to find real senders is to
- load ConnectEndOfStream-M6755-nice.1.cs
- execute instrumenting hack below in a workspace,
- then play your favourite activities...
If you obtain huge tallies, you can post them to ncellier at ifrance dot com.
Thanks.

tally := MessageTally new.
tally spyEvery: 100 on: ['em ezilaitini ot si siht' reverse].
tally class: World class method: World class>>#doOneCycle.
   "middle lines begin"
tallyEnd := false.
[(Delay forSeconds: 300) wait. tallyEnd := true] fork.
[[World doOneCycle. Processor yield. tallyEnd] whileFalse]
        on: EndOfStream
        do: [:exc | tally tally: exc signalerContext by: 1.
            exc resume].
(StringHolder new contents:
        (String streamContents: [:s | tally report: s]))
    openLabel: 'EndOfStream Spy Results'.
   "middle lines end"
tally close.

"Note: If you do not execute tally close and then execute the middle lines again, I think the results will accumulate."