Closed Bug 1805761 Opened 1 year ago Closed 1 year ago

Intermittent [DO NOT USE FOR CLASSIFICATION] WARNING: Process "xxx" hanging at shutdown; attempting crash report (fatal error): file Z:/task_167103636611679/build/src/ipc/chromium/src/chrome/common/process_watcher_win.cc:154

Categories

(Core :: IPC, defect, P5)

defect

Tracking

()

RESOLVED FIXED
110 Branch
Tracking Status
firefox-esr102 --- unaffected
firefox108 --- unaffected
firefox109 --- unaffected
firefox110 --- fixed

People

(Reporter: intermittent-bug-filer, Assigned: jld)

References

(Regression)

Details

(Keywords: intermittent-failure, regression)

Attachments

(1 file)

Filed by: imoraru [at] mozilla.com
Parsed log: https://treeherder.mozilla.org/logviewer?job_id=399644640&repo=mozilla-central
Full log: https://firefox-ci-tc.services.mozilla.com/api/queue/v1/task/d2jmJ_VHSbmm71dQRPmgEg/runs/0/artifacts/public/logs/live_backing.log


[task 2022-12-14T18:18:53.440Z] 18:18:53     INFO - PID 8108 | WARNING: Error: Phase "profile-before-change" is finished, it is too late to register completion condition "OS.File: flush I/O queued before profileBeforeChange"
[task 2022-12-14T18:18:53.440Z] 18:18:53     INFO - PID 8108 | WARNING: addBlocker@resource://gre/modules/AsyncShutdown.jsm:718:15
[task 2022-12-14T18:18:53.440Z] 18:18:53     INFO - PID 8108 | addBlocker@resource://gre/modules/AsyncShutdown.jsm:523:26
[task 2022-12-14T18:18:53.440Z] 18:18:53     INFO - PID 8108 | addBlocker@resource://gre/modules/AsyncShutdown.jsm:462:15
[task 2022-12-14T18:18:53.440Z] 18:18:53     INFO - PID 8108 | setupShutdown@resource://gre/modules/osfile/osfile_async_front.jsm:1548:28
[task 2022-12-14T18:18:53.440Z] 18:18:53     INFO - PID 8108 | @resource://gre/modules/osfile/osfile_async_front.jsm:1568:16
[task 2022-12-14T18:18:53.440Z] 18:18:53     INFO - PID 8108 | @resource://gre/modules/osfile.jsm:12:30
[task 2022-12-14T18:18:53.440Z] 18:18:53     INFO - PID 8108 | @resource://gre/modules/TelemetryStorage.jsm:19:28
[task 2022-12-14T18:18:53.440Z] 18:18:53     INFO - PID 8108 | removeAbortedSessionPing@resource://gre/modules/TelemetryControllerParent.jsm:721:5
[task 2022-12-14T18:18:53.448Z] 18:18:53     INFO - PID 8108 | removeAbortedSessionPing@resource://gre/modules/TelemetryControllerParent.jsm:249:17
[task 2022-12-14T18:18:53.448Z] 18:18:53     INFO - PID 8108 | shutdownChromeProcess/cleanup/<@resource://gre/modules/TelemetrySession.jsm:1235:42
[task 2022-12-14T18:18:53.450Z] 18:18:53     INFO - PID 8108 | observe@resource://gre/modules/AsyncShutdown.jsm:572:16
[task 2022-12-14T18:18:53.450Z] 18:18:53     INFO - PID 8108 | JavaScript error: resource://gre/modules/AsyncShutdown.jsm, line 718: Error: Phase "profile-before-change" is finished, it is too late to register completion condition "OS.File: flush I/O queued before profileBeforeChange"
[task 2022-12-14T18:18:53.450Z] 18:18:53     INFO - PID 8108 | JavaScript error: resource://gre/modules/AsyncShutdown.jsm, line 718: Error: Phase "profile-before-change" is finished, it is too late to register completion condition "OS.File: flush I/O queued before profileBeforeChange"
[task 2022-12-14T18:18:53.450Z] 18:18:53     INFO - PID 8108 | JavaScript error: resource://gre/modules/AsyncShutdown.jsm, line 718: Error: Phase "profile-before-change" is finished, it is too late to register completion condition "OS.File: flush I/O queued before profileBeforeChange"
[task 2022-12-14T18:18:53.450Z] 18:18:53     INFO - PID 8108 | JavaScript error: resource://gre/modules/AsyncShutdown.jsm, line 718: Error: Phase "profile-before-change" is finished, it is too late to register completion condition "OS.File: flush I/O queued before profileBeforeChange"
[task 2022-12-14T18:18:53.455Z] 18:18:53     INFO - PID 8108 | JavaScript error: resource://gre/modules/AsyncShutdown.jsm, line 718: Error: Phase "profile-before-change" is finished, it is too late to register completion condition "OS.File: flush I/O queued before profileBeforeChange"
[task 2022-12-14T18:18:54.322Z] 18:18:54     INFO - PID 8108 | [Parent 796, IPC I/O Parent] WARNING: Process 4840 may be hanging at shutdown; will wait for up to 8000ms: file Z:/task_167103636611679/build/src/ipc/chromium/src/chrome/common/process_watcher_win.cc:119
[task 2022-12-14T18:19:02.323Z] 18:19:02     INFO - PID 8108 | [Parent 796, IPC I/O Parent] WARNING: Process 4840 hanging at shutdown; attempting crash report (fatal error): file Z:/task_167103636611679/build/src/ipc/chromium/src/chrome/common/process_watcher_win.cc:154
[task 2022-12-14T18:19:02.326Z] 18:19:02     INFO - PID 8108 | [Parent 796, IPC I/O Parent] WARNING: CreateRemoteThread: error 5: file Z:/task_167103636611679/build/src/ipc/chromium/src/chrome/common/process_watcher_win.cc:204
[task 2022-12-14T18:19:02.791Z] 18:19:02     INFO - PID 8108 | [Parent 796, IPC I/O Parent] WARNING: Process 2856 hanging at shutdown; attempting crash report (fatal error): file Z:/task_167103636611679/build/src/ipc/chromium/src/chrome/common/process_watcher_win.cc:154
[task 2022-12-14T18:19:02.802Z] 18:19:02     INFO - PID 8108 | [Parent 796, IPC I/O Parent] WARNING: CreateRemoteThread: error 5: file Z:/task_167103636611679/build/src/ipc/chromium/src/chrome/common/process_watcher_win.cc:204
[task 2022-12-14T18:19:19.842Z] 18:19:19     INFO - PID 8108 | 1671041959843	geckodriver::browser	DEBUG	Browser process stopped: exit code: 0
[task 2022-12-14T18:19:19.845Z] 18:19:19     INFO - PID 8108 | 1671041959844	webdriver::server	DEBUG	<- 200 OK {"value":null}
[task 2022-12-14T18:19:19.846Z] 18:19:19     INFO - STDOUT: ============================= 5 passed in 32.25s ==============================
[task 2022-12-14T18:19:19.849Z] 18:19:19     INFO - .....
[task 2022-12-14T18:19:19.849Z] 18:19:19     INFO - TEST-OK | /webdriver/tests/fullscreen_window/fullscreen.py | took 32772ms
[task 2022-12-14T18:19:19.850Z] 18:19:19     INFO - TEST-START | /webdriver/tests/fullscreen_window/stress.py

This failure line [Parent 796, IPC I/O Parent] WARNING: Process 4840 hanging at shutdown; attempting crash report (fatal error): file Z:/task_167103636611679/build/src/ipc/chromium/src/chrome/common/process_watcher_win.cc:154 appears often on different jobs.
Sometimes it spams this line and the real failure doesn't appear in treeherder. example: https://treeherder.mozilla.org/logviewer?job_id=399644640&repo=mozilla-central&lineNumber=37399

Hi Jed! Can you please take a look at this?
Thank you!

These failures seem to all be on Windows code coverage builds, running the WebDriver spec conformance tests in headless mode. I'm not sure what part of that combination is important, but I could believe that ccov inherently needs a longer timeout. It's a little weirder, but not implausible, that it's only this test suite and happens in every run ­— the message isn't being spammed in a loop, it's just that every test case in this suite starts and stops the browser, and every(?) time some of the child processes appear to be hanging on shutdown.

It's unexpected that we're not getting an actual crash report (PROCESS-CRASH, etc.) here; browser/config/mozconfigs/win64/code-coverage implies that the crash reporter is active on those builds.

I'll see how much of this I can reproduce….

Assignee: nobody → jld
Regressed by: 1793525

Set release status flags based on info from the regressing bug 1793525

The CreateRemoteThread: error 5 lines answer part of this: we're getting ERROR_ACCESS_DENIED trying to inject a thread into the process to crash it. I can reproduce that locally with a ccov build but not a regular build (and it does seem to be ccov itself rather than the use of --disable-sandbox). I don't know why this is happening, nor if it's something we could fix.

I've used Try to see what happens if I increase the timeout from 8s to 80s on Windows ccov builds, and the wdspec tests pass (but some of the child processes seem to need up to 20s of that timeout; note that we normally kill the parent process (nsTerminator) after 60s but on ccov builds that's increased to 180s). So I think I can land that and file another bug to investigate the CreateRemoteThread failure.

Duplicate of this bug: 1805903
See Also: → 1806050
Pushed by csabou@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/d6814aa8f90e
Increase the child process shutdown timeout for Windows ccov. r=ipc-reviewers,nika
Status: NEW → RESOLVED
Closed: 1 year ago
Resolution: --- → FIXED
Target Milestone: --- → 110 Branch
See Also: → 1822728
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: