Closed Bug 1764903 Opened 3 years ago Closed 2 years ago

Intermittent Wd TEST-UNEXPECTED-TIMEOUT | expected OK [shutdown hang in CrashManager.jsm]

Categories

(Remote Protocol :: Marionette, defect, P5)

defect

Tracking

(Not tracked)

RESOLVED INCOMPLETE

People

(Reporter: intermittent-bug-filer, Unassigned)

References

(Depends on 1 open bug)

Details

(Keywords: intermittent-failure)

Crash Data

Filed by: nfay [at] mozilla.com
Parsed log: https://treeherder.mozilla.org/logviewer?job_id=374623022&repo=mozilla-central
Full log: https://firefox-ci-tc.services.mozilla.com/api/queue/v1/task/eTnzFrIBQu2UsHX0rXU-_w/runs/0/artifacts/public/logs/live_backing.log


[task 2022-04-15T09:55:56.635Z] 09:55:56     INFO - PID 1688 | [Child 4541, Main Thread] WARNING: IPC message 'PContent::Msg_UpdateChildScalars' discarded: actor cannot send: file /builds/worker/checkouts/gecko/ipc/glue/ProtocolUtils.cpp:506
[task 2022-04-15T09:55:56.636Z] 09:55:56     INFO - PID 1688 | [Child 4541, Main Thread] WARNING: '!ipcActor->SendUpdateChildScalars(scalarsToSend)', file /builds/worker/checkouts/gecko/toolkit/components/telemetry/core/ipc/TelemetryIPCAccumulator.cpp:285
[task 2022-04-15T09:55:56.637Z] 09:55:56     INFO - PID 1688 | [Child 4541, Main Thread] WARNING: IPC message 'PContent::Msg_RecordDiscardedData' discarded: actor cannot send: file /builds/worker/checkouts/gecko/ipc/glue/ProtocolUtils.cpp:506
[task 2022-04-15T09:55:56.638Z] 09:55:56     INFO - PID 1688 | [Child 4541, Main Thread] WARNING: '!ipcActor->SendRecordDiscardedData(discardedData)', file /builds/worker/checkouts/gecko/toolkit/components/telemetry/core/ipc/TelemetryIPCAccumulator.cpp:296
[task 2022-04-15T09:56:00.378Z] 09:56:00     INFO - PID 1688 | JavaScript error: resource://gre/modules/CrashManager.jsm, line 898: AbortError: IOUtils: Shutting down and refusing additional I/O tasks
[task 2022-04-15T09:56:20.661Z] 09:56:20     INFO - PID 1688 | XPCOM_MEM_BLOAT_LOG: /tmp/tmp80k389wa.mozrunner/runtests_leaks_1177.log
[task 2022-04-15T09:56:20.661Z] 09:56:20     INFO - PID 1688 | Writing to log: /tmp/tmp80k389wa.mozrunner/runtests_leaks_1177.log
[task 2022-04-15T09:56:20.662Z] 09:56:20     INFO - PID 1688 | RunWatchdog: Mainthread nested event loops during hang:
[task 2022-04-15T09:56:20.662Z] 09:56:20     INFO - PID 1688 |  --- CompositorThreadHolder::Shutdown
[task 2022-04-15T09:56:21.662Z] 09:56:21     INFO - PID 1688 | Hit MOZ_CRASH(Shutdown hanging after all known phases and workers finished.) at /builds/worker/checkouts/gecko/toolkit/components/terminator/nsTerminator.cpp:256
[task 2022-04-15T09:56:21.668Z] 09:56:21     INFO - STDOUT: Initializing stack-fixing for the first stack frame, this may take a while...
[task 2022-04-15T09:56:21.834Z] 09:56:21     INFO - TEST-UNEXPECTED-TIMEOUT | /webdriver/tests/bidi/browsing_context/navigate/wait.py | expected OK
[task 2022-04-15T09:56:21.835Z] 09:56:21     INFO - TEST-INFO took 80004ms
[task 2022-04-15T09:56:21.878Z] 09:56:21 CRITICAL - Test harness output was not a valid structured log message: 
[task 2022-04-15T09:56:21.878Z] 09:56:21 CRITICAL - Task exception was never retrieved
[task 2022-04-15T09:56:21.879Z] 09:56:21 CRITICAL - future: <Task finished coro=<Transport.read_messages() done, defined at /builds/worker/workspace/build/tests/web-platform/tests/tools/webdriver/webdriver/bidi/transport.py:71> exception=InvalidStateError('invalid state',)>
[task 2022-04-15T09:56:21.880Z] 09:56:21 CRITICAL - Traceback (most recent call last):
[task 2022-04-15T09:56:21.880Z] 09:56:21 CRITICAL -   File "/builds/worker/workspace/build/tests/web-platform/tests/tools/webdriver/webdriver/bidi/transport.py", line 76, in read_messages
[task 2022-04-15T09:56:21.880Z] 09:56:21 CRITICAL -     await self.handle(msg)
[task 2022-04-15T09:56:21.881Z] 09:56:21 CRITICAL -   File "/builds/worker/workspace/build/tests/web-platform/tests/tools/webdriver/webdriver/bidi/transport.py", line 64, in handle
[task 2022-04-15T09:56:21.881Z] 09:56:21 CRITICAL -     await self.msg_handler(data)
[task 2022-04-15T09:56:21.882Z] 09:56:21 CRITICAL -   File "/builds/worker/workspace/build/tests/web-platform/tests/tools/webdriver/webdriver/bidi/client.py", line 170, in on_message
[task 2022-04-15T09:56:21.882Z] 09:56:21 CRITICAL -     future.set_result(data["result"])
[task 2022-04-15T09:56:21.882Z] 09:56:21 CRITICAL - asyncio.base_futures.InvalidStateError: invalid state
[task 2022-04-15T09:56:21.882Z] 09:56:21 CRITICAL - Task exception was never retrieved
[task 2022-04-15T09:56:21.882Z] 09:56:21 CRITICAL - future: <Task finished coro=<Transport.read_messages() done, defined at /builds/worker/workspace/build/tests/web-platform/tests/tools/webdriver/webdriver/bidi/transport.py:71> exception=InvalidStateError('invalid state',)>
[task 2022-04-15T09:56:21.882Z] 09:56:21 CRITICAL - Traceback (most recent call last):
[task 2022-04-15T09:56:21.883Z] 09:56:21 CRITICAL -   File "/builds/worker/workspace/build/tests/web-platform/tests/tools/webdriver/webdriver/bidi/transport.py", line 76, in read_messages
[task 2022-04-15T09:56:21.883Z] 09:56:21 CRITICAL -     await self.handle(msg)
[task 2022-04-15T09:56:21.883Z] 09:56:21 CRITICAL -   File "/builds/worker/workspace/build/tests/web-platform/tests/tools/webdriver/webdriver/bidi/transport.py", line 64, in handle
[task 2022-04-15T09:56:21.883Z] 09:56:21 CRITICAL -     await self.msg_handler(data)
[task 2022-04-15T09:56:21.883Z] 09:56:21 CRITICAL -   File "/builds/worker/workspace/build/tests/web-platform/tests/tools/webdriver/webdriver/bidi/client.py", line 170, in on_message
[task 2022-04-15T09:56:21.883Z] 09:56:21 CRITICAL -     future.set_result(data["result"])
[task 2022-04-15T09:56:21.883Z] 09:56:21 CRITICAL - asyncio.base_futures.InvalidStateError: invalid state
[task 2022-04-15T09:56:27.887Z] 09:56:27     INFO - PID 1688 | #01: mozilla::(anonymous namespace)::RunWatchdog(void*) [toolkit/components/terminator/nsTerminator.cpp:256]
[task 2022-04-15T09:56:27.897Z] 09:56:27     INFO - PID 1688 | #02: _pt_root [nsprpub/pr/src/pthreads/ptthread.c:204]
[task 2022-04-15T09:56:27.898Z] 09:56:27     INFO - PID 1688 | #03: set_alt_signal_stack_and_start(PthreadCreateParams*) [/builds/worker/workspace/build/application/firefox/firefox + 0xe5553]
[task 2022-04-15T09:56:27.898Z] 09:56:27     INFO - PID 1688 | #04: ??? [/lib/x86_64-linux-gnu/libpthread.so.0 + 0x76db]
[task 2022-04-15T09:56:27.900Z] 09:56:27     INFO - PID 1688 | #05: clone [/lib/x86_64-linux-gnu/libc.so.6 + 0x121a3f]
[task 2022-04-15T09:56:27.901Z] 09:56:27     INFO - PID 1688 | #06: ??? (???:???)
[task 2022-04-15T09:56:27.901Z] 09:56:27     INFO - PID 1688 | ExceptionHandler::GenerateDump cloned child 4689
[task 2022-04-15T09:56:27.902Z] 09:56:27     INFO - PID 1688 | ExceptionHandler::SendContinueSignalToChild sent continue signal to child
[task 2022-04-15T09:56:27.902Z] 09:56:27     INFO - PID 1688 | ExceptionHandler::WaitForContinueSignal waiting for continue signal...
[task 2022-04-15T09:56:27.903Z] 09:56:27     INFO - PID 1688 | [Child 4541, IPC I/O Child] WARNING: [37459C836AAE4E8.3F4183DC2893607A]: Ignoring message 'EVENT_MESSAGE' to peer 1.1 due to a missing broker: file /builds/worker/checkouts/gecko/ipc/glue/NodeController.cpp:348
[task 2022-04-15T09:56:27.903Z] 09:56:27     INFO - PID 1688 | [Child 4541, IPC I/O Child] WARNING: [37459C836AAE4E8.3F4183DC2893607A]: Ignoring message 'EVENT_MESSAGE' to peer 1.1 due to a missing broker: file /builds/worker/checkouts/gecko/ipc/glue/NodeController.cpp:348
[task 2022-04-15T09:56:27.904Z] 09:56:27     INFO - PID 1688 | [Child 4541, IPC I/O Child] WARNING: [37459C836AAE4E8.3F4183DC2893607A]: Ignoring message 'EVENT_MESSAGE' to peer 1.1 due to a missing broker: file /builds/worker/checkouts/gecko/ipc/glue/NodeController.cpp:348
[task 2022-04-15T09:56:27.904Z] 09:56:27     INFO - PID 1688 | [Child 4541, IPC I/O Child] WARNING: [37459C836AAE4E8.3F4183DC2893607A]: Ignoring message 'EVENT_MESSAGE' to peer 1.1 due to a missing broker: file /builds/worker/checkouts/gecko/ipc/glue/NodeController.cpp:348
[task 2022-04-15T09:56:27.905Z] 09:56:27     INFO - PID 1688 | [Child 4541, IPC I/O Child] WARNING: [37459C836AAE4E8.3F4183DC2893607A]: Ignoring message 'EVENT_MESSAGE' to peer 1.1 due to a missing broker: file /builds/worker/checkouts/gecko/ipc/glue/NodeController.cpp:348
[task 2022-04-15T09:56:27.906Z] 09:56:27     INFO - PID 1688 | [Child 4541, IPC I/O Child] WARNING: [37459C836AAE4E8.3F4183DC2893607A]: Ignoring message 'EVENT_MESSAGE' to peer 1.1 due to a missing broker: file /builds/worker/checkouts/gecko/ipc/glue/NodeController.cpp:348
[task 2022-04-15T09:56:27.906Z] 09:56:27     INFO - PID 1688 | [Child 4541, IPC I/O Child] WARNING: [37459C836AAE4E8.3F4183DC2893607A]: Ignoring message 'EVENT_MESSAGE' to peer 1.1 due to a missing broker: file /builds/worker/checkouts/gecko/ipc/glue/NodeController.cpp:348
[task 2022-04-15T09:56:27.907Z] 09:56:27     INFO - PID 1688 | [Child 4541, Main Thread] WARNING: IPC message 'PContent::Msg_GraphicsError' discarded: actor cannot send: file /builds/worker/checkouts/gecko/ipc/glue/ProtocolUtils.cpp:506
[task 2022-04-15T09:56:27.907Z] 09:56:27     INFO - PID 1688 | [GFX1-]: Receive IPC close with reason=AbnormalShutdown
[task 2022-04-15T09:56:27.908Z] 09:56:27  WARNING - Firefox didn't exit cleanly, not processing leak logs
[task 2022-04-15T09:56:27.910Z] 09:56:27     INFO - STDOUT: ======================== 12 passed in 79.95s (0:01:19) =========================
[task 2022-04-15T09:56:27.910Z] 09:56:27     INFO - Closing logging queue
[task 2022-04-15T09:56:27.910Z] 09:56:27     INFO - queue closed
[task 2022-04-15T09:56:27.913Z] 09:56:27     INFO - PID 4696 | 1650016587911	geckodriver	INFO	Listening on 127.0.0.1:49014
[task 2022-04-15T09:56:27.914Z] 09:56:27     INFO - Starting runner
[task 2022-04-15T09:56:28.168Z] 09:56:28     INFO - TEST-START | /webdriver/tests/bidi/errors/errors.py

Shutdown hang of Firefox

[task 2022-04-15T09:55:17.363Z] 09:55:17     INFO - PID 1688 | [Child 4307, Main Thread] WARNING: NS_ENSURE_TRUE(Preferences::InitStaticMembers()) failed: file /builds/worker/checkouts/gecko/modules/libpref/Preferences.cpp:4464
[task 2022-04-15T09:55:31.727Z] 09:55:31     INFO - PID 1688 | console.error: (new Error("SessionFile is closed", "resource:///modules/sessionstore/SessionFile.jsm", 379))
[task 2022-04-15T09:55:56.635Z] 09:55:56     INFO - PID 1688 | [Child 4541, Main Thread] WARNING: IPC message 'PContent::Msg_UpdateChildScalars' discarded: actor cannot send: file /builds/worker/checkouts/gecko/ipc/glue/ProtocolUtils.cpp:506
[task 2022-04-15T09:55:56.636Z] 09:55:56     INFO - PID 1688 | [Child 4541, Main Thread] WARNING: '!ipcActor->SendUpdateChildScalars(scalarsToSend)', file /builds/worker/checkouts/gecko/toolkit/components/telemetry/core/ipc/TelemetryIPCAccumulator.cpp:285
[task 2022-04-15T09:55:56.637Z] 09:55:56     INFO - PID 1688 | [Child 4541, Main Thread] WARNING: IPC message 'PContent::Msg_RecordDiscardedData' discarded: actor cannot send: file /builds/worker/checkouts/gecko/ipc/glue/ProtocolUtils.cpp:506
[task 2022-04-15T09:55:56.638Z] 09:55:56     INFO - PID 1688 | [Child 4541, Main Thread] WARNING: '!ipcActor->SendRecordDiscardedData(discardedData)', file /builds/worker/checkouts/gecko/toolkit/components/telemetry/core/ipc/TelemetryIPCAccumulator.cpp:296
[task 2022-04-15T09:56:00.378Z] 09:56:00     INFO - PID 1688 | JavaScript error: resource://gre/modules/CrashManager.jsm, line 898: AbortError: IOUtils: Shutting down and refusing additional I/O tasks
[task 2022-04-15T09:56:20.661Z] 09:56:20     INFO - PID 1688 | XPCOM_MEM_BLOAT_LOG: /tmp/tmp80k389wa.mozrunner/runtests_leaks_1177.log
[task 2022-04-15T09:56:20.661Z] 09:56:20     INFO - PID 1688 | Writing to log: /tmp/tmp80k389wa.mozrunner/runtests_leaks_1177.log
[task 2022-04-15T09:56:20.662Z] 09:56:20     INFO - PID 1688 | RunWatchdog: Mainthread nested event loops during hang:
[task 2022-04-15T09:56:20.662Z] 09:56:20     INFO - PID 1688 |  --- CompositorThreadHolder::Shutdown
[task 2022-04-15T09:56:21.662Z] 09:56:21     INFO - PID 1688 | Hit MOZ_CRASH(Shutdown hanging after all known phases and workers finished.) at /builds/worker/checkouts/gecko/toolkit/components/terminator/nsTerminator.cpp:256

I wonder if this is IPC related and relates to warning log entry for PContent::Msg_RecordDiscardedData'. But maybe something else in-between these calls might be triggering it.

Lets see if we get some more hangs like that. If not we can close as dupe for bug 1630162.

See Also: → 1630162

This is a hang specifically to the CrashManager.jsm module. Bug 1734543 covers this scenario.

Depends on: 1734543
Summary: Intermittent TEST-UNEXPECTED-TIMEOUT | /webdriver/tests/bidi/browsing_context/navigate/wait.py | expected OK → Intermittent TEST-UNEXPECTED-TIMEOUT | /webdriver/tests/bidi/browsing_context/navigate/wait.py | expected OK [shutdown hang in CrashManager.jsm]
Status: NEW → RESOLVED
Closed: 3 years ago
Resolution: --- → INCOMPLETE

This failure doesn't happen right now because the underlying bug 1734543 hasn't been fixed yet and we further delay the start of a WebDriver session. I'm fine with having it closed as incomplete for now. We can reopen if needed at a later time.

Status: RESOLVED → REOPENED
Resolution: INCOMPLETE → ---
Summary: Intermittent TEST-UNEXPECTED-TIMEOUT | /webdriver/tests/bidi/browsing_context/navigate/wait.py | expected OK [shutdown hang in CrashManager.jsm] → Intermittent Wd TEST-UNEXPECTED-TIMEOUT | expected OK [shutdown hang in CrashManager.jsm]

Copying crash signatures from duplicate bugs.

Crash Signature: [@ NS_DebugBreak + 0xb0f]
Status: REOPENED → RESOLVED
Closed: 3 years ago3 years ago
Resolution: --- → INCOMPLETE
Status: RESOLVED → REOPENED
Resolution: INCOMPLETE → ---

(In reply to Treeherder Bug Filer from comment #18)

New failure instance: https://treeherder.mozilla.org/logviewer?job_id=402071461&repo=mozilla-central

This is actually bug 1809845.

Status: REOPENED → RESOLVED
Closed: 3 years ago2 years ago
Resolution: --- → INCOMPLETE
Moving bug to Remote Protocol::Marionette component per bug 1815831.
Component: geckodriver → Marionette
Product: Testing → Remote Protocol
Status: RESOLVED → REOPENED
Resolution: INCOMPLETE → ---

(In reply to Treeherder Bug Filer from comment #21)

New failure instance: https://treeherder.mozilla.org/logviewer?job_id=416834216&repo=mozilla-beta

No error can be easily seen given that due to bug 1834716 log parsing doesn't work. I've re-classified.

Status: REOPENED → RESOLVED
Closed: 2 years ago2 years ago
Resolution: --- → INCOMPLETE
Status: RESOLVED → REOPENED
Resolution: INCOMPLETE → ---

The last failure was again a wrongly classified failure.

Status: REOPENED → RESOLVED
Closed: 2 years ago2 years ago
Resolution: --- → INCOMPLETE
You need to log in before you can comment on or make changes to this bug.