Closed Bug 589496 Opened 10 years ago Closed 9 years ago
No longer getting stacks from shutdown hangs on Windows, causes "Shutdown | application timed out after 330 seconds with no output" with no clue about cause
As of 2010-08-16, in http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1281950474.1281953347.13418.gz, we were getting stacks after the 300 second shutdown timeout that's the result of bug 523319, but since some time after that (yay for ignoring the puzzling orange!), we've been getting things like http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1282423115.1282426931.18779.gz instead, with the 300 second timeout followed by a 1200 second no output timeout.
As loading those logs is painful, here are excerpts:

2010-08-16, get a stack:

--DOCSHELL 05C2DB60 == 3
--DOCSHELL 06029CD0 == 2
NEXT ERROR TEST-UNEXPECTED-FAIL | Shutdown | application timed out after 330 seconds with no output
INFO | automation.py | Application ran for: 0:38:15.734000
INFO | automation.py | Reading PID log: c:\docume~1\cltbld\locals~1\temp\tmpsil9i-pidlog
==> process 3728 launched child process 1056
==> process 3728 launched child process 3660
INFO | automation.py | Checking for orphan process with PID: 1056
INFO | automation.py | Checking for orphan process with PID: 3660
PROCESS-CRASH | Shutdown | application crashed (minidump found)
Operating system: Windows NT 5.2.3790 Service Pack 2
CPU: x86 GenuineIntel family 6 model 23 stepping 8 1 CPU
Crash reason: EXCEPTION_ACCESS_VIOLATION
Crash address: 0x0
Thread 35 (crashed)
0 crashinjectdll.dll!CrashingThread(void *) [crashinjectdll.cpp:ce4d646e8a1c : 13 + 0x3]
...

----------------

2010-08-21, no stack:

--DOMWINDOW == 12 (0AA40560) [serial = 1846] [outer = 00000000] [url = about:blank]
--DOCSHELL 088DD6E8 == 2
TEST-UNEXPECTED-FAIL | Shutdown | application timed out after 330 seconds with no output
command timed out: 1200 seconds without output

----------------

AFAICT we're downloading and unpacking the symbol files the same in both logs, the crash zip files are similar in size, and the call to runtests.py is the same. Got a VM both times. The difference is that runtests is not outputting anything after detecting the timeout.

Regression from http://hg.mozilla.org/mozilla-central/log/cba4071f3551/build/automation.py.in ?
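For anyone sorting through a pile of these logs, the visible difference between the two excerpts is whether a PROCESS-CRASH line shows up after the timeout line. A rough Python sketch of that check follows; the function name and the three labels are made up for illustration and are not part of any Mozilla tooling, and the marker strings are simply copied from the excerpts above.

```python
# Marker strings copied from the log excerpts in this bug.
TIMEOUT_MARKER = "application timed out after"
CRASH_MARKER = "PROCESS-CRASH"


def classify_shutdown_log(text):
    """Classify a log by whether a crash report follows the timeout.

    Hypothetical helper: returns 'no-timeout', 'timeout-with-stack',
    or 'timeout-without-stack'.
    """
    timeout_at = text.find(TIMEOUT_MARKER)
    if timeout_at == -1:
        return "no-timeout"
    # Only count a crash report that appears after the timeout line.
    if text.find(CRASH_MARKER, timeout_at) != -1:
        return "timeout-with-stack"
    return "timeout-without-stack"
```

Run over the full text of a log, this would separate the 2010-08-16 style (stack present) from the 2010-08-21 style (timeout, then nothing until the 1200 second kill). It only looks for the two markers and does not verify that the minidump was actually symbolicated.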
That was my first thought too, but I don't see any smoking gun there.
Pretty sure I'm terribly unlucky here with the try server and am hitting this on three runs in a row:
http://tinderbox.mozilla.org/showlog.cgi?log=MozillaTry/1282676248.1282680355.8855.gz&fulltext=1
http://tinderbox.mozilla.org/showlog.cgi?log=MozillaTry/1282666795.1282670609.22992.gz&fulltext=1
http://tinderbox.mozilla.org/showlog.cgi?log=MozillaTry/1282667189.1282671016.25419.gz&fulltext=1
Component: Release Engineering → General
Product: mozilla.org → Core
QA Contact: release → general
Version: other → Trunk
Summary: No longer getting stacks from shutdown hangs on Windows? → No longer getting stacks from shutdown hangs on Windows, causes "Shutdown | application timed out after 330 seconds with no output" with no clue about cause
(In reply to comment #1)
> AFAICT we're downloading and unpacking the symbol files the same in both logs,
> the crash zip files are similar in size, and the call to runtests.py is the
> same. Got a VM both times.

Hmm. I guess that rules out crashinject not working on a different OS version or something easy like that. What sorts of OPSI rollouts went live around this time period? It's possible a configuration change on the machines caused it to stop working.

Looking at the code, though:
http://mxr.mozilla.org/mozilla-central/source/build/win32/crashinject.cpp
It's pretty good about printing errors in most cases. In addition, in automation.py, if the exe didn't exist, or exited with an error code, I'd expect it to fall through and print "Can't trigger Breakpad, just killing process":
http://mxr.mozilla.org/mozilla-central/source/build/automation.py.in#693
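To make the expected fall-through concrete, here is a minimal Python sketch of the behavior described in this comment. It is not the actual automation.py code: the function name, its parameters, and the assumption that crashinject takes the target PID as its only argument are all illustrative. Only the fallback message is taken from the comment.

```python
import os
import subprocess


def kill_and_get_stack(pid, crashinject_path, log=print,
                       run=subprocess.call, kill=os.kill,
                       exists=os.path.exists):
    """Try to crash `pid` with crashinject so a minidump gets written.

    Hypothetical sketch. Returns True if the helper ran and exited
    with status 0; otherwise logs the fallback message, kills the
    process outright, and returns False.
    """
    if exists(crashinject_path):
        # Assumption: the helper is invoked as `crashinject <pid>`.
        if run([crashinject_path, str(pid)]) == 0:
            return True
    # Helper missing, or it exited with an error code.
    log("Can't trigger Breakpad, just killing process")
    kill(pid, 9)
    return False
```

The point of the sketch is that both failure modes (missing exe, nonzero exit) end in a printed line, so a log with no output at all after the timeout suggests the harness got stuck somewhere other than this branch.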
Does crashinject work properly for non-Shutdown hangs?
It worked fine in my testing. It literally just injects a thread into the program that intentionally crashes, so it shouldn't matter what the app is doing.
Let's just take all those logs starred as "No longer getting stacks" which all have stacks as evidence that it fixed itself (and that nobody ever actually opens a log).
Status: NEW → RESOLVED
Closed: 9 years ago
Resolution: --- → WORKSFORME