|ID||Category||Severity||Reproducibility||Date Submitted||Last Update|
|0006830||[Squeak] Kernel||major||always||12-28-07 21:38||08-14-12 00:30|
|Summary||0006830: A mutex can wind up with a semaphore with more than 1 excessSignal|
I am new to Smalltalk and was playing around with processes to see how they work, and I've encountered something that should never happen.
It may be related to this bug: http://bugs.squeak.org/view.php?id=6576 since I am using Delay.
The semaphore in a mutex should never go above 1 excess signal.
This script here will cause that to happen if you put it into a Transcript window and "do it":
|p keepGoing i t resume suspend stop semcounts m |
m := Mutex new.
p := Processor activeProcess.
keepGoing := true.
i := 0.
t := Transcript.
m critical: [t cr; cr; cr; cr; cr].
m critical: [ t cr; show: 'init: ', Processor activeProcess oopString; cr].
semcounts := ''.
semcounts := 'init:', m excessSignals asString, ' ', semcounts.
resume := [
m critical: [ t cr; show: 'resume: ', Processor activeProcess oopString; cr].
semcounts := 'resume1:', m excessSignals asString, ' ', semcounts.
(Delay forSeconds: 20) wait.
m critical: [ t cr; show: 'resuming'; cr.
p resume]. "assumed: this line is missing from the posted listing"
semcounts := 'resume2:', m excessSignals asString, ' ', semcounts.
] fork. "assumed: block close and fork are missing from the posted listing"
suspend := [
m critical: [ t cr; show: 'pause: ', Processor activeProcess oopString; cr].
(Delay forSeconds: 10) wait.
m critical: [ t cr; show: 'pausing'; cr.
p suspend]. "assumed: this line is missing from the posted listing"
semcounts := 'suspend:', m excessSignals asString, ' ', semcounts.
] fork. "assumed: block close and fork are missing from the posted listing"
stop := [
m critical: [ t cr; show: 'stop: ', Processor activeProcess oopString; cr].
(Delay forSeconds: 30) wait.
m critical: [ t cr; show: 'stopping'; cr.
keepGoing := false].
semcounts := 'stop:', m excessSignals asString, ' ', semcounts.
] fork. "assumed: block close and fork are missing from the posted listing"
m critical: [ t cr; show: 'end: ', Processor activeProcess oopString; cr].
semcounts := 'end:', m excessSignals asString, ' ', semcounts.
[keepGoing] whileTrue: [ m critical: [t show: i; space].
i := i + 1].
semcounts := 'beforedone:', m excessSignals asString, ' ', semcounts.
(Delay forSeconds: 10) wait.
m critical: [ t cr; show: 'done.';cr.].
semcounts := 'done:', m excessSignals asString, ' ', semcounts.
Transcript show: semcounts asString.
The output of the last line is: done:2 beforedone:2 stop:1 resume2:1 suspend:1 resume1:0 end:1 init:1
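For reading those counts, here is a minimal sketch (not part of the original report) of how the count is expected to behave on a plain mutual-exclusion semaphore, which is what the Mutex accessor below delegates to. It reads the instance variable with `instVarNamed:` so that it does not depend on the attached change set. The variable names `sema` and `counts` are illustrative only.

```smalltalk
| sema counts |
"Semaphore forMutualExclusion starts with exactly one excess signal."
sema := Semaphore forMutualExclusion.
counts := OrderedCollection new.
"Free: one excess signal is available."
counts add: (sema instVarNamed: 'excessSignals').
"Held: critical: consumed the signal, so the count drops to zero."
sema critical: [counts add: (sema instVarNamed: 'excessSignals')].
"Free again: critical: signalled on the way out."
counts add: (sema instVarNamed: 'excessSignals').
Transcript cr; show: counts printString.
```

On a healthy semaphore the three recorded values should be 1, 0 and 1, so any 2 in the output above means a signal was issued without a matching wait.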
I have run a version of this that stores the mutex in the Smalltalk dictionary for reuse, and if it's reused it hangs the system. In fact, that's why I originally started trying to figure out what was going on: it was hanging on the second run every time.
Also, notice the commented out "Processor yield." If you uncomment that then the program will function as expected (i.e., excessSignals will not go over 1). That may help diagnose the problem.
This happens in 3.9 and 3.10
I added 2 methods to the system to expose excessSignals. You need to fileIn the attached change set to get them, or you can add them manually from here:
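!Mutex methodsFor: '*miles-debug' stamp: 'mbg 12/28/2007 13:19'!
excessSignals
	"Selector line restored; assumed to match the 'm excessSignals' sends in the script."
	^semaphore excessSignals.! !
!Semaphore methodsFor: '*miles-debug' stamp: 'mbg 12/28/2007 13:19'!
excessSignals
	"Selector and body restored; assumed to be a plain accessor for the instance variable."
	^excessSignals! !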
miles-debug.1.cs (714 bytes) 12-28-07 21:38
mutex_bug.st (1,707 bytes) 12-28-07 21:42
(0014215) andreas
edited on: 02-06-12 10:36
Just stumbled over this. The example is a bit obscure but the problem is an old one, namely that suspend and resume of processes waiting on semaphores doesn't work properly. To wit:
| sema process |
sema := Semaphore new.
process := [sema critical: []] fork.
(Delay forSeconds: 1) wait.
When the process is suspended, it is taken off the semaphore's list. When it is resumed, it is not put back onto that list (which would be very difficult to do), but the code in Semaphore>>critical: will signal the semaphore regardless.
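A fuller sketch of that sequence, for anyone who wants to watch the count change. The suspend/resume steps and the final read are filled in as assumptions from the description above, and the count is read with `instVarNamed:` so no extra accessor is needed. Whether the extra signal actually appears depends on the image and VM version.

```smalltalk
| sema process |
"A new Semaphore has no excess signals, so critical: blocks in its wait."
sema := Semaphore new.
process := [sema critical: []] fork.
"Give the forked process time to reach the wait."
(Delay forSeconds: 1) wait.
"Assumed step: suspending takes the process off the semaphore's list."
process suspend.
"Assumed step: resuming lets it continue past the wait without a signal."
process resume.
"Give it time to leave critical:, which signals the semaphore on exit."
(Delay forSeconds: 1) wait.
Transcript cr; show: 'excessSignals: ', (sema instVarNamed: 'excessSignals') printString.
```

Nothing in this snippet signals the semaphore directly, so any non-zero count at the end comes from the resumed process leaving `critical:`.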
Basically, external process manipulation is always tricky and should be done with great care, or better, left only for the debugger.
(0014234) laza
Basically I read this as "not fixable".
(0014247) leves
I have an idea how to fix it. I wrote a mail about it once, but can't find it now. The idea is to add another variable to Process in order to hold a reference to the Semaphore it was waiting for while it's suspended. When the process is resumed, the process can #wait again for the same Semaphore. I started hacking the VM and got a partially working solution (the suspend/resume mechanism is quite spread out across many methods), but the code is probably gone due to a hard disk failure. It might still be available in the backups, but I don't have access to them right now.
|12-28-07 21:38||orgsow||New Issue|
|12-28-07 21:38||orgsow||File Added: miles-debug.1.cs|
|12-28-07 21:42||orgsow||File Added: mutex_bug.st|
|12-10-11 03:56||lewis||Issue Monitored: lewis|
|02-06-12 10:35||andreas||Note Added: 0014215|
|02-06-12 10:36||andreas||Note Edited: 0014215|
|07-14-12 11:12||laza||Status||new => resolved|
|07-14-12 11:12||laza||Resolution||open => not fixable|
|07-14-12 11:12||laza||Assigned To||=> laza|
|07-14-12 11:12||laza||Note Added: 0014234|
|07-14-12 11:12||laza||Status||resolved => closed|
|08-14-12 00:30||leves||Assigned To||laza => leves|
|08-14-12 00:30||leves||Status||closed => feedback|
|08-14-12 00:30||leves||Resolution||not fixable => reopened|
|08-14-12 00:30||leves||Note Added: 0014247|