Mantis Bugtracker

ID: 0004882
Category: [Squeak] VM
Severity: block
Reproducibility: random
Date Submitted: 09-12-06 12:04
Last Update: 09-11-07 23:35
Reporter: renggli
View Status: public
Assigned To:
Priority: normal
Resolution: fixed
Status: closed
Product Version: 3.9
Summary: 0004882: VM lockup
Description: The VM randomly locks up on an Intel Mac. Even moving the VM window around is sluggish. CPU usage is below 1%. Memory usage is normal.

There is nothing special about the Squeak processes, everything looks fine there.

However the C stack looks strange:

(gdb) bt
#0 0x9001aafc in select ()
#1 0x000d72a7 in aioPoll ()
Previous frame inner to this frame (corrupt stack?)

- Relationships

related to 0006588 (closed): Broken Semaphore>>critical: leads to frozen processes in Delay
related to 0006581 (closed, andreas): Image freezes (background processes like Seaside make no progress) and Squeak hogs CPU

- Notes
(0007116) johnmci
09-14-06 19:16

So which VM is this? Version, etc.?
(0007117) renggli
09-14-06 19:23

As it says in the report: Squeak3.8.12beta4U
(0007118) renggli
09-14-06 19:23

Intel OS X
(0007136) johnmci
09-14-06 21:07

Ah, well, I'm not sure I saw anything in the report (which report?) that said it was Squeak3.8.12beta4U. Do you have other diagnostic information? Calling printAllStacks() from gdb is also helpful to see what the Smalltalk processes are doing. In addition, Apple's sample program or Shark will show the procedure call trees, which helps to decide what is going on.

I'll note that in 3.8.12 we moved Squeak from its own pthread to the main UI thread, which means it runs in conjunction with UI event processing.
(0007141) renggli
09-14-06 21:55

To see the platform and VM information you have to go to "View Advanced"; there is a link at the top right. I don't have other diagnostic information right now, but I will do what you suggest the next time this happens. I did call printAllStacks() but didn't see anything special; as far as I remember, it looked like it always does. Thanks.
(0007325) cdrick
09-22-06 18:24
edited on: 10-02-06 14:22

I think I have a similar problem, and quite often with the latest 3.9g image:
almost no CPU usage and the UI frozen. I cannot access the stack with <cmd> .

- on Linux, VM 3.9-5

- and also on Windows
(there, a console sometimes opens, and when I move the mouse over the main window I get endless messages saying:
WARNING: event buffer overflow)

What is strange is that while it is frozen I can still use Seaside apps, and very often Squeak unfreezes after I play with Seaside apps!

To regain control, I use RFB and the Seaside interface that allows suspending the UI process and restarting it.

(0007352) cdrick
09-25-06 14:29

This looks similar to the problem described by Bert.

Here is a snapshot:

I haven't been able to get at the cause for the freezes, yet, for
impara's squeaksource installation (3.7 image, stock Linux 3.7-7 VM).

We guess it's related to a networking problem in the VM. Because when it
freezes, I just need to VNC into the machine, move the mouse pointer,
and it comes alive again.

The freezes regularly occur under heavy load, that is, multiple package
uploads causing a save of the in-memory meta data. My gut feeling is
this is related to async i/o, which somehow blocks, and is woken up
again when an X11 network packet arrives.

OTOH, I haven't heard of similar reports for other seaside apps. OTTH,
the usage pattern in squeaksource might be unique, nobody else is
serializing a multi-megabyte object network for each modification.

- Bert -
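
Bert's guess above (an asynchronous I/O wait that blocks until some descriptor becomes readable) can be illustrated with a minimal Python sketch. This is not the VM's aioPoll code: the helper name poll_once is hypothetical, and a pipe stands in for the X11 connection as an assumption made purely for illustration.

```python
import os
import select


def poll_once(fds, timeout_s):
    """Wait until one of fds is readable or timeout_s elapses.

    Returns the list of readable descriptors (empty on timeout).
    """
    readable, _, _ = select.select(fds, [], [], timeout_s)
    return readable


if __name__ == "__main__":
    # Assumption: a pipe plays the role of the X11 socket.
    read_end, write_end = os.pipe()
    print(poll_once([read_end], 0.05))   # nothing pending, so the wait times out
    os.write(write_end, b"x")            # an "event" arrives
    print(poll_once([read_end], 0.05))   # the wait now returns right away
    os.close(read_end)
    os.close(write_end)
```

With a long or missing timeout, a wait like this consumes almost no CPU and only returns when input arrives, which would match both the low CPU usage and the "move the mouse and it comes alive" behaviour reported here.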
(0007353) johnmci
09-25-06 16:04

A VM for Squeak with message send instrumentation might be useful to figure out what is going on...

Oct 29th, 2005. Coming out of a conversation I had with Tim and Michael at a Sophie team meeting, I built a Mac OS X Carbon VM that records message send information from the interpreter bytecode loop. Unlike image-based solutions, this solution shows the sequence of actual message sends as the interpreter processes them. Needless to say, tracing generates hundreds of MB of data, which is written to the messageTraceFile.txt file found in your VM directory. The two change sets allow you to build a patched VM and invoke:

Smalltalk setVMStatsTraceMessageSendLevels: 1.

to set message send recording on. Note that you pass an integer so you can say

Smalltalk setVMStatsTraceMessageSendLevels: 5.

to set the number of levels in the call stack to print to 5.

Smalltalk setVMStatsTraceMessageSendLevels: 0.

turns the recording off and closes the file. The file is always opened for writing with truncation.

Lastly, under OS X (Unix/Darwin) you can send a signal with kill to turn recording on.

kill -USR1 19278 {Pick a squeak VM process id}

kill -USR1 19278 {increment the level by one, go to 2}
kill -USR1 19278 {increment the level by one, go to 3}

kill -USR2 19278 {turn recording off, and close the file}
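
The signal protocol described above (each USR1 raises the trace level by one, USR2 switches recording off) can be sketched in Python as follows. This is only an illustration of the control logic, not the VM's implementation; the class name TraceControl and its methods are hypothetical, and the file handling done by the real VM is left out.

```python
import signal


class TraceControl:
    """Tracks a message-send trace depth; 0 means recording is off."""

    def __init__(self):
        self.level = 0

    def on_usr1(self, signum=None, frame=None):
        # Each SIGUSR1 raises the trace depth by one (0 -> 1 turns recording on).
        self.level += 1

    def on_usr2(self, signum=None, frame=None):
        # SIGUSR2 turns recording off (the VM would also close its trace file).
        self.level = 0

    def install(self):
        # SIGUSR1 and SIGUSR2 exist on Unix-like systems only.
        signal.signal(signal.SIGUSR1, self.on_usr1)
        signal.signal(signal.SIGUSR2, self.on_usr2)


if __name__ == "__main__":
    control = TraceControl()
    # Call the handlers directly instead of sending real signals.
    control.on_usr1()
    control.on_usr1()
    control.on_usr1()
    print(control.level)
    control.on_usr2()
    print(control.level)
```

After install() has been called, running kill -USR1 against the process id would drive the same handlers.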

If there is sufficient interest we will consider incorporating this logic into other platform VMs. We could also use some post-processing applications to read the raw data and construct interesting views of it.

Found in the experimental directory as or higher [^]

I'll note that David Lewis did some work with the OSProcess plugin, a Unix plugin runnable from the OS X Carbon VM, to allow signalling:

From: "David T. Lewis"
Date: November 8, 2005 10:10:00 AM PST
To: The general-purpose Squeak developers list
Subject: VMStatsTraceControl (was: Instrumenting Send and what is StrikeFontSet>displayString:on:from:to:at:kern:baselineY: doing?)
Reply-To: The general-purpose Squeak developers list


This is an updated version of VMStatsTraceControl-dtl for controlling
your VM stats logging. It is updated to take advantage of OSP 4.0.1 and
OSPP 4.0.1, which provide primitives to identify the (platform-dependent)
SIGUSR1 and SIGUSR2 signal numbers. It is backward compatible, using the
default values 10 and 12 respectively if the primitives are not available.

It also provides convenience methods VMStatsTraceControl class>>incrSigNum:
and VMStatsTraceControl class>>clearSigNum: to control the signals
to be monitored.

VMStatsTraceControl-dtl.9.cs.gz is the updated change set. I am also
attaching the original (unchanged) VMStatsInterpreterChanges-dtl.cs.gz
for completeness.
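
The fallback described in this message (ask the platform for the SIGUSR1 and SIGUSR2 numbers, and use 10 and 12 when they cannot be looked up) can be sketched in Python. The function name usr_signal_numbers and its sig_module parameter are hypothetical; the sketch only mirrors the lookup-with-default idea, not the OSProcess primitives themselves.

```python
import signal

# Fallback numbers quoted in the message above; real values are platform-dependent.
DEFAULT_SIGUSR1 = 10
DEFAULT_SIGUSR2 = 12


def usr_signal_numbers(sig_module=signal):
    """Return (SIGUSR1, SIGUSR2) as ints, falling back to the defaults."""
    usr1 = int(getattr(sig_module, "SIGUSR1", DEFAULT_SIGUSR1))
    usr2 = int(getattr(sig_module, "SIGUSR2", DEFAULT_SIGUSR2))
    return usr1, usr2


if __name__ == "__main__":
    print(usr_signal_numbers())
```

Passing the module as a parameter is just a convenience here, so that the fallback path can be exercised with a stand-in object that lacks the constants.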

(0007892) Ron
10-26-06 18:18

I am having the same issue. Most of the time I get the image back by accessing a Seaside web app. Sometimes I asked for the VM version to see whether the VM was still alive, and then ended up getting "WARNING: event buffer overflow" errors. This is not related to high traffic. I have very little going on, and it usually only happens when I leave the image running for some time and come back to it.

Once, with a halt in the code, I got back the method below; it looked like it was stalled on wait. I believe the value of wait was 9. I don't know whether this was coincidental or part of this problem. It could just be my hitting cmd-. while it was frozen.

VM: Win32 - a SmalltalkImage
Image: Squeak3.9alpha [latest update: 0007051]
VM Version: Squeak 3.7.1 (release) from Sep 23 2004
Compiler: gcc 2.95.2 19991024 (release)

WorldState >> interCyclePause: milliSecs
    "delay enough that the previous cycle plus the amount of delay will equal milliSecs. If the cycle is already expensive, then no delay occurs. However, if the system is idly waiting for interaction from the user, the method will delay for a proportionally long time and cause the overall CPU usage of Squeak to be low."

    | currentTime wait |

    (lastCycleTime notNil and: [CanSurrenderToOS ~~ false]) ifTrue: [
        currentTime _ Time millisecondClockValue.
        wait _ lastCycleTime + milliSecs - currentTime.
        (wait > 0 and: [wait <= milliSecs])
            ifTrue: [(Delay forMilliseconds: wait) wait]].

    lastCycleTime _ Time millisecondClockValue.
    CanSurrenderToOS _ true.
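
The wait computation in the method above can be restated as a small Python function to make the arithmetic easier to check. This is a sketch only: the name inter_cycle_wait and its parameters are hypothetical, the CanSurrenderToOS flag is left out, and the sample clock values are made up to reproduce the wait of 9 mentioned in this note.

```python
def inter_cycle_wait(last_cycle_ms, target_ms, now_ms):
    """Milliseconds to sleep so a UI cycle lasts about target_ms; 0 for no delay."""
    if last_cycle_ms is None:
        return 0
    wait = last_cycle_ms + target_ms - now_ms
    # Delay only when the wait is positive and no longer than one full cycle,
    # mirroring the (wait > 0 and: [wait <= milliSecs]) guard.
    return wait if 0 < wait <= target_ms else 0


if __name__ == "__main__":
    # Hypothetical clock values: the cycle started at 1000 ms, the target
    # cycle length is 20 ms, and the clock now reads 1011 ms.
    print(inter_cycle_wait(1000, 20, 1011))
```

A wait of 9 is therefore an ordinary value for an idle image, which fits the suspicion that the halt simply caught the UI process in its normal Delay.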
(0007893) Ron
10-26-06 18:40

Image locked up without Seaside running:

OK, an interesting note to add: I got locked up with the same symptoms without having Seaside running at all. I do not save my image with Seaside started, so this image has never had it running. I do have Seaside loaded in the image, but in the time between starting the image, writing the previous note on this bug and going back to it, it had locked up. I went to dump the call stack and got:

# Debug console
# To close: F2 -> 'debug options' -> 'show output console'
# To disable: F2 -> 'debug options' -> 'show console on errors'
615362432 >idleProcess
615217556 [] in >startUp
615217648 [] in BlockContext>newProcess

then got many

WARNING: event buffer overflow

moving the mouse over the window.
(0007895) Ron
10-27-06 00:00


Ok after a little poking around I noticed:

CTPusher >> startUp
    self shutDown.
    delay := Delay forSeconds: 15.
    process := SeasidePlatformSupport
        withName: 'ping'
        withLowestPriorityFork: [ [ self pingProcess ] repeat ]

I've been working with Squeak for a long time now and never had the image freeze up on me before. The only thing I did differently was to load the Comet application.

In the bug report I noted that I was able to reproduce the error without starting Seaside. The above code seemed like a good candidate, so I terminated the process. I was not able to get the image to freeze up after this. I tried changing the priority from 10 to 11 so that it would not run concurrently with the idle process, and it still had not frozen up on me.

Does this process have to run at the same priority as idle? Is there a problem with a deadlock in the priority switching process of the VM?

Ron Teitelbaum
(0008698) cdrick
12-13-06 15:13

Since the new version of Comet, my image doesn't freeze anymore ;)
(0010933) johnmci
07-24-07 21:55

As noted by Andreas the problem seems to be an update issue in Delay when the process doing the update gets terminated as it's sorting the Delay collection. This leads to the Delay being confused and everything stops running.
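
The failure mode described in this note (a process is terminated in the middle of updating a guarded collection, and everyone else then waits forever) can be illustrated with a Python analogy. It is not Squeak's Delay or Semaphore code: the DelayQueue class and its methods are hypothetical, and a raised exception stands in for process termination.

```python
import threading


class DelayQueue:
    """Toy stand-in for a timer queue guarded by a lock (not Squeak's Delay class)."""

    def __init__(self):
        self.guard = threading.Lock()
        self.pending = []

    def unsafe_add(self, wake_time, interrupt=False):
        # If the caller is interrupted mid-update, the guard is never released.
        self.guard.acquire()
        self.pending.append(wake_time)
        if interrupt:
            raise RuntimeError("terminated mid-update")
        self.pending.sort()
        self.guard.release()

    def safe_add(self, wake_time, interrupt=False):
        # "with" releases the guard even when the update is cut short.
        with self.guard:
            self.pending.append(wake_time)
            if interrupt:
                raise RuntimeError("terminated mid-update")
            self.pending.sort()


def guard_is_stuck(queue, add, wake_time):
    """Run add with a simulated termination; report whether the guard stays held."""
    try:
        add(wake_time, interrupt=True)
    except RuntimeError:
        pass
    return queue.guard.locked()


if __name__ == "__main__":
    queue = DelayQueue()
    print(guard_is_stuck(queue, queue.safe_add, 5))
    print(guard_is_stuck(queue, queue.unsafe_add, 3))
```

Once the guard is stuck, every later attempt to schedule a delay blocks, which is one way an image can appear frozen while using almost no CPU.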
(0010958) al
08-02-07 11:57

The problems described here seem to be covered by one of the two following reports:
- [^]
- [^]

- Issue History
Date Modified   Username  Field                     Change
09-12-06 12:04  renggli   New Issue
09-12-06 12:04  renggli   Issue Monitored: renggli
09-14-06 19:16  johnmci   Note Added: 0007116
09-14-06 19:23  renggli   Note Added: 0007117
09-14-06 19:23  renggli   Note Added: 0007118
09-14-06 21:07  johnmci   Note Added: 0007136
09-14-06 21:55  renggli   Note Added: 0007141
09-22-06 18:24  cdrick    Note Added: 0007325
09-25-06 14:00  cdrick    Note Edited: 0007325
09-25-06 14:29  cdrick    Note Added: 0007352
09-25-06 16:04  johnmci   Note Added: 0007353
10-02-06 14:22  cdrick    Note Edited: 0007325
10-26-06 18:18  Ron       Note Added: 0007892
10-26-06 18:18  Ron       Issue Monitored: Ron
10-26-06 18:40  Ron       Note Added: 0007893
10-27-06 00:00  Ron       Note Added: 0007895
12-13-06 15:13  cdrick    Note Added: 0008698
07-24-07 21:55  johnmci   Note Added: 0010933
07-24-07 21:55  johnmci   Status                    new => resolved
07-24-07 21:55  johnmci   Resolution                open => fixed
08-02-07 11:57  al        Note Added: 0010958
08-03-07 00:29  wiz       Relationship added        related to 0006588
08-03-07 00:29  wiz       Relationship added        related to 0006581
09-11-07 23:35  tim       Status                    resolved => closed
