Closed
Bug 1139597
Opened 9 years ago
Closed 6 years ago
Intermittent PROCESS-CRASH | Main app process exited normally | application crashed [None]
Categories
(Core :: XPCOM, defect, P3)
Tracking
()
RESOLVED
INVALID
People
(Reporter: RyanVM, Unassigned)
References
Details
(Whiteboard: [stockwell unknown])
08:42:52 INFO - TEST-INFO | Main app process: exit 0 08:42:52 INFO - runtests.py | Application ran for: 0:02:25.307033 08:42:52 INFO - zombiecheck | Reading PID log: /var/folders/4q/slhhrqs92_l1cnypjyf7mbxh00000w/T/tmpT1DQe0pidlog 08:42:52 INFO - mozcrash Downloading symbols from: https://ftp-ssl.mozilla.org/pub/mozilla.org/firefox/tinderbox-builds/mozilla-inbound-macosx64/1425479998/firefox-39.0a1.en-US.mac.crashreporter-symbols.zip 08:43:33 INFO - mozcrash Saved minidump as /builds/slave/talos-slave/test/build/blobber_upload_dir/76A70A28-D2AA-42EB-975B-03BE94580FF0.dmp 08:43:33 INFO - mozcrash Saved app info as /builds/slave/talos-slave/test/build/blobber_upload_dir/76A70A28-D2AA-42EB-975B-03BE94580FF0.extra 08:43:33 WARNING - PROCESS-CRASH | Main app process exited normally | application crashed [None] 08:43:33 INFO - Crash dump filename: /var/folders/4q/slhhrqs92_l1cnypjyf7mbxh00000w/T/tmpo47v6k.mozrunner/minidumps/76A70A28-D2AA-42EB-975B-03BE94580FF0.dmp 08:43:33 INFO - Operating system: Mac OS X 08:43:33 INFO - 10.10.2 14C109 08:43:33 INFO - CPU: amd64 08:43:33 INFO - family 6 model 42 stepping 7 08:43:33 INFO - 8 CPUs 08:43:33 INFO - Crash reason: EXC_SOFTWARE / SIGABRT 08:43:33 INFO - Crash address: 0x7fff912fb4de 08:43:33 INFO - Thread 0 08:43:33 INFO - 0 XUL!nsCycleCollector::FreeSnowWhite(bool) [nsISupportsImpl.h:a190978e94f1 : 314 + 0x4] 08:43:33 INFO - rbx = 0x000000010a9e9000 r12 = 0x0000000153489cf8 08:43:33 INFO - r13 = 0x0000000153489008 r14 = 0x0000000000000d08 08:43:33 INFO - r15 = 0x000000012fd9fae0 rip = 0x0000000100490cd1 08:43:33 INFO - rsp = 0x00007fff5fbfc0b0 rbp = 0x00007fff5fbfc140 08:43:33 INFO - Found by: given as instruction pointer in context 08:43:33 INFO - 1 XUL!nsCycleCollector_doDeferredDeletion() [nsCycleCollector.cpp:a190978e94f1 : 4160 + 0x4] 08:43:33 INFO - rbx = 0x000000a6d5230929 r12 = 0x000000010a916128 08:43:33 INFO - r13 = 0x00007fff5fbfc267 r14 = 0x000000010a916100 08:43:33 INFO - r15 = 0x00000001104f9580 rip = 0x0000000100493aba 08:43:33 INFO - rsp = 0x00007fff5fbfc150 rbp = 0x00007fff5fbfc150 08:43:33 INFO - Found by: call frame info 08:43:33 INFO - 2 XUL!AsyncFreeSnowWhite::Run() [XPCJSRuntime.cpp:a190978e94f1 : 228 + 0x4] 08:43:33 INFO - rbx = 0x000000a6d5230929 r12 = 0x000000010a916128 08:43:33 INFO - r13 = 0x00007fff5fbfc267 r14 = 0x000000010a916100 08:43:33 INFO - r15 = 0x00000001104f9580 rip = 0x0000000100bc382f 08:43:33 INFO - rsp = 0x00007fff5fbfc160 rbp = 0x00007fff5fbfc180 08:43:33 INFO - Found by: call frame info 08:43:33 INFO - 3 XUL!nsThread::ProcessNextEvent(bool, bool*) [nsThread.cpp:a190978e94f1 : 855 + 0x5] 08:43:33 INFO - rbx = 0x000000010a916168 r12 = 0x000000010a916128 08:43:33 INFO - r13 = 0x00007fff5fbfc267 r14 = 0x000000010a916100 08:43:33 INFO - r15 = 0x000000010a916100 rip = 0x00000001004e867d 08:43:33 INFO - rsp = 0x00007fff5fbfc190 rbp = 0x00007fff5fbfc250 08:43:33 INFO - Found by: call frame info 08:43:33 INFO - 4 XUL!NS_ProcessPendingEvents(nsIThread*, unsigned int) [nsThreadUtils.cpp:a190978e94f1 : 207 + 0xd] 08:43:33 INFO - rbx = 0x0000000000000000 r12 = 0x00000000000aeef5 08:43:33 INFO - r13 = 0x00007fff5fbfc267 r14 = 0x0000000000000014 08:43:33 INFO - r15 = 0x000000010a916100 rip = 0x0000000100509686 08:43:33 INFO - rsp = 0x00007fff5fbfc260 rbp = 0x00007fff5fbfc290 08:43:33 INFO - Found by: call frame info 08:43:33 INFO - 5 XUL!nsBaseAppShell::NativeEventCallback() [nsBaseAppShell.cpp:a190978e94f1 : 98 + 0xe] 08:43:33 INFO - rbx = 0x000000010a92fde0 r12 = 0x0000000000000000 08:43:33 INFO - r13 = 0x0000000000001d03 r14 = 0x000000010a916100 08:43:33 INFO - r15 = 0x000000010a92fd00 rip = 0x0000000101ffdf97 08:43:33 INFO - rsp = 0x00007fff5fbfc2a0 rbp = 0x00007fff5fbfc2c0 08:43:33 INFO - Found by: call frame info 08:43:33 INFO - 6 XUL!nsAppShell::ProcessGeckoEvents(void*) [nsAppShell.mm:a190978e94f1 : 377 + 0x7] 08:43:33 INFO - rbx = 0x000000010a931000 r12 = 0x0000000108f0b440 08:43:33 INFO - r13 = 0x0000000000001d03 r14 = 0x0000000108f0b458 08:43:33 INFO - r15 = 0x000000010a92fde0 rip = 0x0000000102044cb2 08:43:33 INFO - rsp = 0x00007fff5fbfc2d0 rbp = 0x00007fff5fbfc310 08:43:33 INFO - Found by: call frame info
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment 3•9 years ago
|
||
[Mass Closure] Closing bug as the WORKSFORME as the intermittent failure has not been seen for 45+ days If this has been closed and you feel that it should Not have been closed, please reopen and add [leave open] to the whiteboard.
Updated•9 years ago
|
Status: NEW → RESOLVED
Closed: 9 years ago
Resolution: --- → WORKSFORME
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment 12•9 years ago
|
||
Reopening as per recent comments.
Status: RESOLVED → REOPENED
Resolution: WORKSFORME → ---
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment 23•9 years ago
|
||
Ted, I'm not sure if there's anything useful here that you can do something with, but it looks like minidump_stackwalk is failing to get a stack somehow.
Flags: needinfo?(ted)
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Legacy TBPL/Treeherder Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment 62•8 years ago
|
||
(In reply to Andrew McCreight [:mccr8] from comment #23) > Ted, I'm not sure if there's anything useful here that you can do something > with, but it looks like minidump_stackwalk is failing to get a stack somehow. I poked at a few of the minidumps from the logs, and they're basically all malformed. The Windows ones are completely empty, and the OS X ones have a valid header but are missing the information in the minidump directory that points to the actual useful contents. This is why minidump_stackwalk fails to print anything. This usually means memory was trashed, but in this case, since the main app process is shutting down cleanly, it might be a race between the child process crashing and the parent process exiting. It's possible we only wind up writing part of the dump before the main process exits, or something like that.
Flags: needinfo?(ted)
Comment 63•8 years ago
|
||
Thanks for taking a look, Ted. That sounds plausible. I looked at all of the crashes in the last week or so and they all had [None] so I'm going to change the signature. A number of them were hitting assertion failures along the lines of: Assertion failure: lock != NULL, at c:/builds/moz2_slave/fx-team-w32-d-0000000000000000/build/src/nsprpub/pr/src/threads/combined/prulock.c:211
Summary: Intermittent OSX 10.10 PROCESS-CRASH | Main app process exited normally | application crashed [None][@ nsCycleCollector::FreeSnowWhite(bool)] → Intermittent OSX 10.10 PROCESS-CRASH | Main app process exited normally | application crashed [None]
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment 69•8 years ago
|
||
This occurred only on OS X 10.6 and Windows 7 in the last month.
Summary: Intermittent OSX 10.10 PROCESS-CRASH | Main app process exited normally | application crashed [None] → Intermittent PROCESS-CRASH | Main app process exited normally | application crashed [None]
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment 76•8 years ago
|
||
Bulk assigning P3 to all open intermittent bugs without a priority set in Firefox components per bug 1298978.
Priority: -- → P3
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment 88•6 years ago
|
||
There are 44 failures in the past 7 days. Platforms: Linux opt/debug, Linux x64 opt/debug, linux64-qr opt/debug, linux64-stylo-disabled opt/debug, OS X 10.10 opt/debug and linux32-stylo-disabled debug. Recent log: https://treeherder.mozilla.org/logviewer.html#?repo=autoland&job_id=155746958 [task 2018-01-11T22:20:03.840Z] 22:20:03 INFO - Buffered messages logged at 22:13:37 17221 [task 2018-01-11T22:20:03.842Z] 22:20:03 INFO - SpawnTask.js | Entering test test_drawWindow 17222 [task 2018-01-11T22:20:03.843Z] 22:20:03 INFO - Extension loaded 17223 [task 2018-01-11T22:20:03.844Z] 22:20:03 INFO - Extension loaded 17224 [task 2018-01-11T22:20:03.845Z] 22:20:03 INFO - Buffered messages finished 17225 [task 2018-01-11T22:20:03.846Z] 22:20:03 ERROR - TEST-UNEXPECTED-TIMEOUT | toolkit/components/extensions/test/mochitest/test-oop-extensions/test_ext_contentscript_canvas.html | application timed out after 370 seconds with no output 17226 [task 2018-01-11T22:20:03.860Z] 22:20:03 ERROR - Force-terminating active process(es). 17227 [task 2018-01-11T22:20:03.861Z] 22:20:03 INFO - Determining child pids from psutil... 17228 [task 2018-01-11T22:20:03.862Z] 22:20:03 INFO - [3341, 3370] 17229 [task 2018-01-11T22:20:03.863Z] 22:20:03 INFO - ==> process 3236 launched child process 3259 17230 [task 2018-01-11T22:20:03.864Z] 22:20:03 INFO - ==> process 3236 launched child process 3287 17231 [task 2018-01-11T22:20:03.866Z] 22:20:03 INFO - ==> process 3236 launched child process 3341 17232 [task 2018-01-11T22:20:03.867Z] 22:20:03 INFO - ==> process 3236 launched child process 3370 17233 [task 2018-01-11T22:20:03.868Z] 22:20:03 INFO - ==> process 3236 launched child process 3410 17234 [task 2018-01-11T22:20:03.869Z] 22:20:03 INFO - Found child pids: set([3370, 3259, 3410, 3341, 3287]) 17235 [task 2018-01-11T22:20:03.870Z] 22:20:03 INFO - Failed to get child procs 17236 [task 2018-01-11T22:20:03.871Z] 22:20:03 INFO - Killing process: 3370 17237 [task 2018-01-11T22:20:03.873Z] 22:20:03 INFO - TEST-INFO | started process screentopng 17238 [task 2018-01-11T22:20:04.213Z] 22:20:04 INFO - TEST-INFO | screentopng: exit 0 17239 [task 2018-01-11T22:20:04.214Z] 22:20:04 INFO - Killing process: 3259 17240 [task 2018-01-11T22:20:04.215Z] 22:20:04 INFO - Not taking screenshot here: see the one that was previously logged 17241 [task 2018-01-11T22:20:04.215Z] 22:20:04 INFO - Can't trigger Breakpad, process no longer exists 17242 [task 2018-01-11T22:20:04.216Z] 22:20:04 INFO - Killing process: 3410 17243 [task 2018-01-11T22:20:04.217Z] 22:20:04 INFO - Not taking screenshot here: see the one that was previously logged 17244 [task 2018-01-11T22:20:04.217Z] 22:20:04 INFO - Can't trigger Breakpad, process no longer exists 17245 [task 2018-01-11T22:20:04.218Z] 22:20:04 INFO - Killing process: 3341 17246 [task 2018-01-11T22:20:04.219Z] 22:20:04 INFO - Not taking screenshot here: see the one that was previously logged 17247 [task 2018-01-11T22:20:04.219Z] 22:20:04 INFO - Killing process: 3287 17248 [task 2018-01-11T22:20:04.220Z] 22:20:04 INFO - Not taking screenshot here: see the one that was previously logged 17249 [task 2018-01-11T22:20:04.221Z] 22:20:04 INFO - Can't trigger Breakpad, process no longer exists 17250 [task 2018-01-11T22:20:04.221Z] 22:20:04 INFO - Killing process: 3236 17251 [task 2018-01-11T22:20:04.221Z] 22:20:04 INFO - Not taking screenshot here: see the one that was previously logged 17252 [task 2018-01-11T22:20:04.516Z] 22:20:04 INFO - psutil found pid 3236 dead 17253 [task 2018-01-11T22:20:04.686Z] 22:20:04 WARNING - runtests.py | Failed to get app exit code - running/crashed? 17254 [task 2018-01-11T22:20:04.689Z] 22:20:04 INFO - TEST-INFO | Main app process: exit 0 17255 [task 2018-01-11T22:20:04.690Z] 22:20:04 INFO - runtests.py | Application ran for: 0:07:48.650529 17256 [task 2018-01-11T22:20:04.691Z] 22:20:04 INFO - zombiecheck | Reading PID log: /tmp/tmpmZubAOpidlog 17257 [task 2018-01-11T22:20:04.692Z] 22:20:04 INFO - ==> process 3236 launched child process 3259 17258 [task 2018-01-11T22:20:04.694Z] 22:20:04 INFO - ==> process 3236 launched child process 3287 17259 [task 2018-01-11T22:20:04.695Z] 22:20:04 INFO - ==> process 3236 launched child process 3341 17260 [task 2018-01-11T22:20:04.696Z] 22:20:04 INFO - ==> process 3236 launched child process 3370 17261 [task 2018-01-11T22:20:04.697Z] 22:20:04 INFO - ==> process 3236 launched child process 3410 :froydnj Could you please take a look?
Flags: needinfo?(nfroyd)
Whiteboard: [stockwell needswork]
Comment 89•6 years ago
|
||
From the log in comment 88: The uploaded screenshot shows a tab crashed page: https://public-artifacts.taskcluster.net/Ymt8IHhhRgChqIOpKZqqXw/0/public/test_info//mozilla-test-fail-screenshot_5YFdB5.png That seems bad. I thought we were setting `MOZ_CRASHREPORTER_SHUTDOWN=1` to force the test harness to quit when this happens? From the log snippet above, looks like maybe the browser process was hung waiting on a content process, and then our cleanup routine did `kill -ABRT` on that content process, which unstuck the browser? You can see that the browser pid here was 3236, and after we loop through all of the pids it launched we try to kill it and hit "psutil found pid 3236 dead". I'm not quite sure what's going on with the `application crashed [None]` bit--we get a couple of other sensible looking stacks in the log: the one for the browser process shows it just hanging out spinning the event loop, and there's one for a content process (which looks like the actual underlying problem here) which starts out: [task 2018-01-11T22:20:25.106Z] 22:20:25 INFO - Thread 26 (crashed) [task 2018-01-11T22:20:25.107Z] 22:20:25 INFO - 0 firefox!MOZ_CrashOOL [Assertions.cpp:adb5fab118c5 : 33 + 0x0] [task 2018-01-11T22:20:25.108Z] 22:20:25 INFO - eip = 0x0804d213 esp = 0xdcefea20 ebp = 0xdcefea28 ebx = 0xf398394a [task 2018-01-11T22:20:25.109Z] 22:20:25 INFO - esi = 0x08086000 edi = 0xf54a1000 eax = 0x08086980 ecx = 0xf7560864 [task 2018-01-11T22:20:25.110Z] 22:20:25 INFO - edx = 0x00000000 efl = 0x00210282 [task 2018-01-11T22:20:25.111Z] 22:20:25 INFO - Found by: given as instruction pointer in context [task 2018-01-11T22:20:25.111Z] 22:20:25 INFO - 1 libxul.so!nsAutoOwningThread::AssertCurrentThreadOwnsMe [nsISupportsImpl.cpp:adb5fab118c5 : 46 + 0x12] [task 2018-01-11T22:20:25.112Z] 22:20:25 INFO - eip = 0xef62e01d esp = 0xdcefea30 ebp = 0xdcefea48 ebx = 0xf54a1000 [task 2018-01-11T22:20:25.113Z] 22:20:25 INFO - esi = 0xdf611200 edi = 0xf54a1000 [task 2018-01-11T22:20:25.114Z] 22:20:25 INFO - Found by: call frame info [task 2018-01-11T22:20:25.114Z] 22:20:25 INFO - 2 libxul.so!mozilla::net::nsStandardURL::Release [nsISupportsImpl.h:adb5fab118c5 : 69 + 0x6] [task 2018-01-11T22:20:25.115Z] 22:20:25 INFO - eip = 0xef7400f9 esp = 0xdcefea50 ebp = 0xdcefea78 ebx = 0xdf690640 [task 2018-01-11T22:20:25.116Z] 22:20:25 INFO - esi = 0xdf611200 edi = 0xf54a1000 [task 2018-01-11T22:20:25.116Z] 22:20:25 INFO - Found by: call frame info The bogus minidump might just be an artifact of something racing during shutdown?
Comment 90•6 years ago
|
||
https://treeherder.mozilla.org/logviewer.html#?repo=autoland&job_id=155746958&lineNumber=20426 Redirecting to Valentin for the nsStandardURL issue from comment 88. I don't see the same URL failures in other logs like the brasstacks links in comment 87, but we might as well start somewhere...
Flags: needinfo?(nfroyd) → needinfo?(valentin.gosu)
Comment 91•6 years ago
|
||
This looks like jar channel gets released off-MT and so does the nsStandardURL. This is probably due to bug 1373708 (which got backed out). Gary, note that we should use nsMainThreadPtrHandle<T> in nsJARChannel for all URI types if the channel will be released OMT.
Flags: needinfo?(valentin.gosu)
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment 94•6 years ago
|
||
Everything starred as this over the past two weeks either has another bug it should be starred as, or should have another bug filed for it, or was bustage and shouldn't have been starred at all. This bug is nothing but an attractive nuisance.
Status: REOPENED → RESOLVED
Closed: 9 years ago → 6 years ago
Keywords: crash,
intermittent-failure
Resolution: --- → INVALID
Comment hidden (Intermittent Failures Robot) |
You need to log in
before you can comment on or make changes to this bug.
Description
•