Closed
Bug 1379081
Opened 7 years ago
Closed 4 years ago
Intermittent test_tabbar_session_restore_button.py TestTabbarSessionRestoreButton.test_click_restore | application crashed [@ mozilla::(anonymous namespace)::RunWatchdog(void*)]
Categories
(Core :: General, defect)
Core
General
Tracking
()
RESOLVED
INCOMPLETE
People
(Reporter: intermittent-bug-filer, Unassigned)
References
(Depends on 1 open bug, Blocks 1 open bug)
Details
(4 keywords)
Filed by: cbook [at] mozilla.com https://treeherder.mozilla.org/logviewer.html#?job_id=112436127&repo=autoland https://queue.taskcluster.net/v1/task/VLDF69ItTRivRNbFqMdGRQ/runs/0/artifacts/public/logs/live_backing.log
Comment 1•7 years ago
|
||
This is actually a hang during shutdown of the application. I see a couple of warnings/errors and a final assertion: 19:02:39 INFO - 1499392959713 Marionette TRACE 2 -> [0,174,"quit",{"flags":["eAttemptQuit"]}] 19:02:39 INFO - 1499392959716 Marionette INFO New connections will no longer be accepted 19:02:39 INFO - [Parent 1612] WARNING: NS_ENSURE_TRUE(weak) failed: file /home/worker/workspace/build/src/dom/base/nsFrameMessageManager.cpp, line 343 19:02:40 INFO - --DOCSHELL 0x12527f000 == 8 [pid = 1612] [id = {51337ebd-ca6a-104e-b30e-499164194a11}] 19:02:40 INFO - 1499392960284 Marionette TRACE 2 <- [1,174,null,{"cause":"shutdown"}] 19:02:40 INFO - [Parent 1612] WARNING: cannot post event if not initialized: file /home/worker/workspace/build/src/netwerk/protocol/http/nsHttpConnectionMgr.cpp, line 290 19:02:40 INFO - [Parent 1612] WARNING: NS_ENSURE_TRUE(weak) failed: file /home/worker/workspace/build/src/dom/base/nsFrameMessageManager.cpp, line 343 19:02:40 INFO - 1499392960355 Marionette DEBUG Closed connection 2 19:02:40 INFO - [Child 1613] WARNING: nsAppShell::Exit() called redundantly: file /home/worker/workspace/build/src/widget/cocoa/nsAppShell.mm, line 690 19:02:40 INFO - ###!!! [Parent][RunMessage] Error: Channel closing: too late to send/recv, messages will be lost 19:02:40 INFO - [Child 1613] WARNING: NS_ENSURE_TRUE(maybeContext) failed: file /home/worker/workspace/build/src/xpcom/threads/nsThread.cpp, line 1040 19:02:40 INFO - [Child 1614] WARNING: nsAppShell::Exit() called redundantly: file /home/worker/workspace/build/src/widget/cocoa/nsAppShell.mm, line 690 19:02:40 INFO - ###!!! [Parent][RunMessage] Error: Channel closing: too late to send/recv, messages will be lost 19:02:40 INFO - [Child 1614] WARNING: NS_ENSURE_TRUE(maybeContext) failed: file /home/worker/workspace/build/src/xpcom/threads/nsThread.cpp, line 1040 19:02:40 INFO - [Child 1615] WARNING: nsAppShell::Exit() called redundantly: file /home/worker/workspace/build/src/widget/cocoa/nsAppShell.mm, line 690 19:02:40 INFO - [Child 1615] WARNING: NS_ENSURE_TRUE(maybeContext) failed: file /home/worker/workspace/build/src/xpcom/threads/nsThread.cpp, line 1040 19:02:40 INFO - [Child 1613] WARNING: '!mMainThread', file /home/worker/workspace/build/src/xpcom/threads/nsThreadManager.cpp, line 330 [..] 19:02:40 INFO - Assertion failure: 0 == rv, at /home/worker/workspace/build/src/nsprpub/pr/src/pthreads/ptthread.c:870 19:02:40 INFO - Redirecting call to abort() to mozalloc_abort 19:02:40 INFO - Hit MOZ_CRASH() at /home/worker/workspace/build/src/memory/mozalloc/mozalloc_abort.cpp:33 19:02:40 INFO - nsStringStats 19:02:40 INFO - => mAllocCount: 18517 19:02:40 INFO - => mReallocCount: 344 19:02:40 INFO - => mFreeCount: 18517 19:02:40 INFO - => mShareCount: 15093 19:02:40 INFO - => mAdoptCount: 1967 19:02:40 INFO - Assertion failure: 0 == rv, at /home/worker/workspace/build/src/nsprpub/pr/src/pthreads/ptthread.c:292 19:02:40 INFO - Redirecting call to abort() to mozalloc_abort 19:02:40 INFO - Hit MOZ_CRASH() at /home/worker/workspace/build/src/memory/mozalloc/mozalloc_abort.cpp:33 [..] 19:02:41 INFO - *** UTM:SVC TimerManager:registerTimer called after profile-before-change notification. Ignoring timer registration for id: telemetry_modules_ping 19:02:43 INFO - --DOCSHELL 0x118aaa800 == 7 [pid = 1612] [id = {4aff02b3-457b-d14f-8ed3-9f932689ed2e}] 19:02:43 INFO - --DOCSHELL 0x126b71800 == 6 [pid = 1612] [id = {73ba26ce-d4f9-fb45-8d34-ee8c815103e3}] 19:02:43 INFO - 1499392963204 Marionette DEBUG Received observer notification "xpcom-shutdown" 19:02:43 INFO - 1499392963205 Marionette DEBUG Received observer notification "xpcom-shutdown" 19:02:43 INFO - [Parent 1612] WARNING: nsAppShell::Exit() called redundantly: file /home/worker/workspace/build/src/widget/cocoa/nsAppShell.mm, line 690 19:02:43 INFO - [Parent 1612] WARNING: NS_ENSURE_TRUE(maybeContext) failed: file /home/worker/workspace/build/src/xpcom/threads/nsThread.cpp, line 1040 19:02:44 INFO - [Parent 1612] WARNING: '!mMainThread', file /home/worker/workspace/build/src/xpcom/threads/nsThreadManager.cpp, line 330 19:02:44 INFO - [Parent 1612] WARNING: '!mMainThread', file /home/worker/workspace/build/src/xpcom/threads/nsThreadManager.cpp, line 330 Benjamin, is that information helpful to see what could have been caused this hang?
status-firefox56:
--- → affected
Component: Marionette → Startup and Profile System
Flags: needinfo?(benjamin)
Product: Testing → Toolkit
Version: Version 3 → unspecified
Comment 2•7 years ago
|
||
Without stacks this is mostly meaningless. Automation should be providing stacks from assertions (typically because they save minidumps). For the hang itself, if the harness is killing the process after a hang it should take a minidump before it kills it off and dump the stack.
Flags: needinfo?(benjamin)
Comment 3•7 years ago
|
||
Hm, we have a 65s timeout for the process to be killed. Isn't Firefox killed after 60s if a thread hangs? Has this timeout been increased? Or should we wait a bit longer similar to application update which has 70s set?
Flags: needinfo?(benjamin)
Comment 4•7 years ago
|
||
Don't count on the internal shutdown hang monitor to be reliable. Depending on the context, it might itself deadlock or otherwise fail. The harness should do a dump+kill procedure after some reasonable timeout.
Flags: needinfo?(benjamin)
Comment 5•7 years ago
|
||
(In reply to Benjamin Smedberg [:bsmedberg] from comment #4) > Don't count on the internal shutdown hang monitor to be reliable. Depending > on the context, it might itself deadlock or otherwise fail. The harness > should do a dump+kill procedure after some reasonable timeout. Which timeout would you find reasonable? I don't think it would hurt to increase it to somewhat 90 or even 120s.
Comment 6•7 years ago
|
||
Oh, I don't care about the actual value. It could be 30s for all I care. What I care about is the harness collecting a stack.
Updated•7 years ago
|
Component: Startup and Profile System → Untriaged
Product: Toolkit → Core
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Updated•7 years ago
|
Status: NEW → RESOLVED
Closed: 7 years ago
Keywords: bulk-close-intermittents
Resolution: --- → INCOMPLETE
Comment 11•7 years ago
|
||
The problem here is actually a too long shutdown which causes a crash: 01:18:42 INFO - Hit MOZ_CRASH(Shutdown too long, probably frozen, causing a crash.) at /builds/worker/workspace/build/src/toolkit/components/terminator/nsTerminator.cpp:160 01:18:42 INFO - #01: PR_GetThreadName[/Users/cltbld/tasks/task_1505808525/build/application/NightlyDebug.app/Contents/MacOS/libnss3.dylib +0x144389] 01:18:42 INFO - #02: _pthread_body[/usr/lib/system/libsystem_pthread.dylib +0x405a] 01:18:42 INFO - #03: _pthread_body[/usr/lib/system/libsystem_pthread.dylib +0x3fd7]
Blocks: RunWatchdogShutdownhang
Status: RESOLVED → REOPENED
status-firefox57:
--- → affected
Keywords: bulk-close-intermittents → crash
Resolution: INCOMPLETE → ---
Summary: Intermittent test_tabbar_session_restore_button.py TestTabbarSessionRestoreButton.test_click_restore | IOError: Process killed because a requested application quit did not happen within 65s. Check gecko.log for errors. → Intermittent test_tabbar_session_restore_button.py TestTabbarSessionRestoreButton.test_click_restore | application crashed [@ mozilla::(anonymous namespace)::RunWatchdog(void*)]
Comment 12•7 years ago
|
||
Moving from Core::Untriaged to Core::General https://bugzilla.mozilla.org/show_bug.cgi?id=1407598
Component: Untriaged → General
Updated•4 years ago
|
Status: REOPENED → RESOLVED
Closed: 7 years ago → 4 years ago
Resolution: --- → INCOMPLETE
You need to log in
before you can comment on or make changes to this bug.
Description
•