Notes |
|
(0010928)
|
andreas
|
07-24-07 07:29
|
|
"Change Set: SafeDelay
Date: 23 July 2007
Author: Andreas Raab
This change set fixes a set of severe problems with concurrent use of Delay. Previously, many of the delay-internal structures were modified by the calling process which made it susceptible to being terminated in the middle of manipulating these structures and leave Delay (and consequently the entire system) in an inconsistent state.
This change set fixes this problem by moving *all* manipulation of Delay's internal structures out of the calling process. As a side-effect it also removes the requirement of Delays being limited to SmallInteger range; the new code has no limitation on the duration of a delay.
No tests are provided since outside of true asynchronous environments (networks) it is basically impossible to recreate the situation reliably." |
|
|
(0010931)
|
andreas
|
07-24-07 16:05
|
|
Updated the change set to one including the class definition. |
|
|
(0010936)
|
Keith_Hodges
|
07-25-07 17:17
|
|
To install this fix use: Installer mantis ensureFix: '5676 make Delay thread safe'.
"fix begin"
Installer mantis bug: 6576 fix: 'SafeDelay-2.cs'.
"fix end"
|
|
|
(0011083)
|
andreas
|
08-31-07 03:38
|
|
I've added a set of tweaks that should go on top of the previous changes:
"Change Set: DelayTweaks-ar
Date: 30 August 2007
Author: Andreas Raab
Some more delay tweaks:
- Ensure proper unscheduling of delays which are terminated in the wait
- Only unschedule delays that are beingWaitedOn
- Bullet-proof handleTimerEvent so that it only polls the clock once (each poll is a potential cause for msecs clock overflow) and cope with an obscure race condition" |
|
|
(0011091)
|
edgardec
|
09-04-07 11:13
|
|
This now is 7146DelayTweaks-ar.cs and was in updates for 3.10
Thanks Andreas ! |
|
|
(0011095)
|
edgardec
|
09-07-07 09:50
|
|
Reminder sent to: andreas Andreas:
I sorry to tell i should remove the .cs from updates stream
I found I can't save the image and all freezes.
It's happening on PPC Mac OS X with John VM.,
If you have new files for me to test, I put again when works |
|
|
(0011096)
|
andreas
|
09-07-07 21:34
|
|
Without some more detail there is nothing I can do. The only thing I can say is that we run this code in our production servers 24/7 and that we run in our production clients and that we run it under load and it works. |
|
|
(0011100)
|
edgardec
|
09-08-07 10:12
|
|
Reminder sent to: andreas I take the 7143 image, filed in the two changes sets and again I can't save the image.
This time I try on my old Windows 98 Pc whit SqueakVM-Win32-3.10.4.
I appreciate any leads why you have many prodution things and I can't save.
|
|
|
(0011104)
|
andreas
|
09-11-07 02:08
|
|
I finally got around to look at this problem and it turns out it's a weird problem with (as one would expect in Delay) all sorts of tricky implications. It worked for us only because all of our images are contaminated by artifacts of earlier delay problems. We are currently testing a fix for all the problems; I'll let you know how it goes. |
|
|
(0011218)
|
GazzaGuru
|
09-28-07 15:14
|
|
I've also noticed that these changes bring back a resumption time bug. Try interrupting [true] whileTrue. Then, after saving the image, try again...
C.f. http://bugs.squeak.org/view.php?id=381 [^] |
|
|
(0011230)
|
andreas
|
10-01-07 08:30
|
|
Just as an FYI, this still isn't resolved completely. The problem reported by Edgar can be fixed simply by changing system startup/shutdown (turns out that delay is shut down too early which causes a problem when Sensor is shut down) but it seems as if there is still another problem that leads to Delay's AccessProtect mutex being signaled multiple times. I still haven't tracked this one down - I have a suspicion and added a guard but it hasn't fired yet. |
|
|
(0011235)
|
edgardec
|
10-05-07 10:07
|
|
Reminder sent to: andreas Could send or put here the right startup order ?
I try to change doing
Smalltalk removeFromStartUpList: Delay.
Smalltalk addToStartUpList: Delay
And with Sensor, but don't work.
Too deep for me :-) |
|
|
(0011236)
|
GazzaGuru
|
10-05-07 21:48
|
|
Just a test for "allegedly" correct operation..
s _ Semaphore forMutualExclusion.
i _ 1.
"Loop while there's not too many signals on the semaphore (should be
0 or 1)"
[[(s instVarNamed: #excessSignals) < 2] whileTrue: [
"Fork two processes to do things inside the semaphore"
p _ [[true] whileTrue: [s critical: [(Delay forMilliseconds: 10) wait]]]
forkNamed: 'p', i printString.
q _ [[true] whileTrue: [s critical: [(Delay forMilliseconds: 10) wait]]]
forkNamed: 'q', i printString.
"Increment the counter just to make it easy to ID the processes" i _ i + 1.
"Delay to give the processes a chance to resume and potentially get
into the critical on the Semaphore"
(Delay forMilliseconds: 500 atRandom) wait.
p terminate.
(Delay forMilliseconds: 500 atRandom) wait.
q terminate.
"After terminating the two processes, the excess signals should be 1
on the Semaphore because
nothing is in critical any more, assuming the processes terminated
properly and the unwind happened
as it should, but... "
"In our images with Andreas' recent Delay and Semaphore fixes, we
always get one of these two
error conditions (although normally too many signals). This would
cause deadlock in 'real world' situations"
(s instVarNamed: #excessSignals) = 0 ifTrue: [
WorldState addDeferredUIMessage: [self inform: 'Too few signals
after ', i printString, ' loops'].
self halt]].
self inform: 'Too many signals after ', i printString, ' loops'] fork |
|
|
(0011238)
|
andreas
|
10-06-07 01:22
|
|
DelayStartup-ar should bring order in the Delay startup/shutdown. Also note the changes to 0006588
|
|
|
(0011240)
|
edgardec
|
10-06-07 10:05
|
|
Thanks Andreas !
I put again 7146DelayTweaks-ar.cs and was in updates for 3.10.
Now I could save the image and continue the updates |
|
|
(0011256)
|
edgardec
|
10-08-07 12:16
|
|
Seems this is causing the "go limbo" on both Mac and Windows XP when I try to run the 'all test" of SUnit.
At this moment I must delete from updates again until was solved.
I also try the related 6588, but still I could complete the all test run.
People wishing this must try the Installer way and not our updates way until Ralph and Andreas agree how to deal this important issue |
|
|
(0012804)
|
KenCausey
|
11-25-08 23:12
|
|
Harvested as update 7144 and released with Squeak 3.10. |
|