Implement mechanism to catch hangs and launch Talkback



Core Graveyard
Talkback Client
17 years ago
7 years ago


(Reporter: selmer (gone), Assigned: jay)


Windows NT





17 years ago
Much of the feedback from Netscape 6 indicates that there are hangs that many
people are hitting.  We currently have no mechanism for collecting data about
hangs.  If we could trap these hangs into talkback, we would be able to get
information that helps us fix them.  Even just getting people we can contact to
reproduce problems would be an enormous help.

http://client/mojo/feedback/newsgroup_feedback_overview.html

I talked this over with Chris Saari and he believes it's straightforward to trap
at least one class of hangs by having a timer that works just like the busy
cursor does on the Mac.  After some set number of seconds where we have not
returned to the event loop, this timer fires and either drops into talkback (if
we're very sure it's always the right thing to do) or puts up a dialog asking
the user how they'd like to proceed (if we think that's safe or we're not always

Chris doesn't know how to get into talkback, but can help with hooking up the
event stuff.  Someone would need to own the UI if we did a dialog.  It would
help to be able to tell talkback this report was due to a hang rather than
relying on the user to say that in their comments.

Comment 1

17 years ago
This mechanism will work, except we won't get a stack trace. With our qfa
component we can trigger an artificial incident and bring up the customized
talkback dialog box (using the talkback server UI). By doing this, data will be
maintained by the talkback servers. Note: the customized Talkback UI is not available
for Mac.

Comment 2

17 years ago
There's no way to force a stacktrace?  What if we forced a crash by explicitly
dereferencing a null pointer?  :-)

Comment 3

17 years ago
I meant a valid stack trace. We can always create an artificial crash. Will that
be useful? One major problem with that is it may make the system unstable.


Comment 4

17 years ago
See bug 62447 for some discussion about how to intentionally cause a crash.

Comment 5

17 years ago
Chris Saari,  Chofmann claims that the method you proposed is not sufficient
because it's Mac-specific in some way (I forget the details.)  Could we get some
of that discussion here in the bug?

Comment 6

17 years ago
I think it would work XP, assuming your hung app was still going through the
event loop. If not, it might still work on Windows and/or Linux if we're
processing timers asynchronously (I'm not familiar with how timers are implemented
there), i.e. they're not processed as part of the event loop.

If you're hung at some interrupt level higher than timers, well, your life
sucks. I highly doubt that though.

Comment 7

17 years ago
this would be great if we could do it.  how long is "too long to be
away from the UI event loop"?   what part of the code monitors, or could
monitor such lack of activity?

Comment 8

17 years ago
In the timer callback you check the machine's tick count and store it. In the
event processing loop, you store the current time. If more than, say, 10 seconds
have passed between the last time you were in the event loop and the current
time during the timer callback, you may wish to consider doing something about it.

This only catches bugs that stop you from going through the event loop. It is
entirely possible to make the app appear hung, yet still be going through the
event loop. I cite my many 0.9.1 command dispatching/handling bugs.
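
The check described in this comment could be sketched roughly as follows. This is only an illustration under stated assumptions, not Talkback or Mozilla code: the `HangWatchdog` class, its method names, and the `on_hang` callback are hypothetical, and a background thread stands in for a timer that is processed outside the event loop.

```python
import threading
import time


class HangWatchdog:
    """Hypothetical sketch: flag a hang when the event loop stops checking in."""

    def __init__(self, threshold_s=10.0, poll_s=1.0, on_hang=None):
        self.threshold_s = threshold_s          # "say, 10 seconds"
        self.poll_s = poll_s                    # how often the timer fires
        self.on_hang = on_hang or (lambda stalled_s: None)
        self._last_beat = time.monotonic()
        self._lock = threading.Lock()
        self._stop = threading.Event()
        self._thread = threading.Thread(target=self._run, daemon=True)

    def heartbeat(self):
        """Called from the event loop on every iteration."""
        with self._lock:
            self._last_beat = time.monotonic()

    def stalled_for(self):
        """Seconds elapsed since the event loop last checked in."""
        with self._lock:
            return time.monotonic() - self._last_beat

    def start(self):
        self._thread.start()

    def stop(self):
        self._stop.set()
        self._thread.join()

    def _run(self):
        # Plays the role of the timer callback; runs outside the event loop.
        while not self._stop.wait(self.poll_s):
            stalled = self.stalled_for()
            if stalled > self.threshold_s:
                self.on_hang(stalled)
                # Re-arm so a continuing hang is reported at most once
                # per threshold period.
                self.heartbeat()
```

The event loop would call `heartbeat()` once per iteration, and `on_hang` is where a hang report or a user dialog would be triggered. This only detects hangs that keep the app out of the event loop; an app that appears hung while still cycling through the loop would go unnoticed.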

Comment 9

17 years ago
Are you storing the system time?  There may be scenarios that
we aren't thinking of where this might cause problems:
  * What about sleep functions on laptops?
    The app will be suspended but "current time" will keep going.
  * What if I reset the time on my machine?
    Sudden time jumps might trigger a crash.  Daylight savings time?

Maybe we could use alecf's mozilla timer service instead of system time.

If this goes in, we should make it tunable/turn-off-able with a pref.
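
One way to guard against the clock problems raised here is sketched below. It is an assumption-laden illustration, not the Mozilla timer service: `classify_gap` and its parameters are hypothetical names, the readings are assumed to come from a monotonic clock such as Python's `time.monotonic()` (which is not affected by system clock updates, though whether it advances during machine sleep varies by platform), and the `suspend_s` cutoff is an invented heuristic for treating very large gaps as sleep rather than a hang.

```python
def classify_gap(last_beat, now, threshold_s=10.0, suspend_s=300.0, enabled=True):
    """Classify the gap between two clock readings (in seconds).

    Returns "ok", "hang", or "ignored". All names and defaults here are
    hypothetical; `enabled` stands in for the pref that turns the check off.
    """
    if not enabled:
        return "ignored"
    gap = now - last_beat
    if gap < 0 or gap > suspend_s:
        # Negative gap: the clock went backwards. Very large gap: assumed
        # to be machine sleep or a clock jump rather than a real hang.
        return "ignored"
    return "hang" if gap > threshold_s else "ok"
```

The trade-off in this sketch is that a genuine hang lasting longer than `suspend_s` would also be ignored, so that cutoff would need tuning alongside the threshold.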


17 years ago
Blocks: 79151

Comment 10

16 years ago
The Microsoft Error Reporting Tool catches hangs.

Comment 11

16 years ago
I thought I saw somewhere that this feature is already present.
Is it fixed??
*** Bug 179855 has been marked as a duplicate of this bug. ***

Comment 13

13 years ago
re comment 11, this isn't fixed from my limited POV


13 years ago
Blocks: 238292

Comment 14

12 years ago
Currently on the trunk nightlies, (in my experience) there have been more hangs than usual, so implementing this is more important than it's been for a while.

I would suggest, rather than trying to identify when the CPU is too busy for too long or the cursor is in some mode or another, just catch a KILL signal (as distinct from a normal term signal, or whatever they call it on Windows), because if it's hung, eventually someone has to kill it.  Win XP already does this and offers to send reports to MS.

Comment 15

10 years ago
idea for breakpad?  
not that it could "catch" a hang, but perhaps a tool that can be user initiated ...
Assignee: namachi → jay


9 years ago
Component: Talkback Client → Talkback Client
Product: Core → Core Graveyard
Talkback isn't used anymore; resolving as INVALID.
Last Resolved: 7 years ago
Resolution: --- → INVALID

Comment 17

7 years ago
See bug 429592 for the same idea in Breakpad.