win ccov noise in logs with "Process X hanging at shutdown; attempting crash report (fatal error)" omitting the relevant failure lines
Categories
(Core :: IPC, defect)
Tracking
()
Release | Tracking | Status |
---|---|---|
firefox-esr102 | --- | unaffected |
firefox108 | --- | unaffected |
firefox109 | --- | unaffected |
firefox110 | --- | affected |
People
(Reporter: CosminS, Unassigned)
References
(Regression)
Details
(Keywords: regression)
There are logs on Windows ccov that have a lot of lines like this one:
INFO - PID 6244 | [Parent 8752, IPC I/O Parent] WARNING: Process 4228 hanging at shutdown; attempting crash report (fatal error): file Z:/task_167109748916971/build/src/ipc/chromium/src/chrome/common/process_watcher_win.cc:154
Failure log: https://treeherder.mozilla.org/logviewer?job_id=399723335&repo=mozilla-central&lineNumber=34968
Treeherder picks up each of these lines as a failure line. Because there are so many of them, the actual failure that turned the job orange is no longer suggested in the Failure summary tab of Treeherder, e.g.: TEST-UNEXPECTED-TIMEOUT | /webdriver/tests/get_named_cookie/get.py | expected OK
This is an inconvenience when sheriffing: one has to open the log every time and search for the actual failure, which costs time while classifying failures.
I think this all started after Bug 1793525 reached central. Jed, could you please have a look over it? Thank you.
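Until the noise is fixed, a downloaded raw log can be filtered locally to get back to the real failure. This is only a rough sketch: the two patterns are taken from the lines quoted above, the helper names are made up for illustration, and none of this is part of Treeherder itself.

```python
import re

# Noise pattern, copied from the warning quoted in this bug.
NOISE = re.compile(r"Process \d+ hanging at shutdown; attempting crash report")
# Failure marker, as in the TEST-UNEXPECTED-TIMEOUT example above.
FAILURE = re.compile(r"TEST-UNEXPECTED-[A-Z]+")


def count_noise(lines):
    """Count the shutdown-hang warnings in an iterable of log lines."""
    return sum(1 for line in lines if NOISE.search(line))


def real_failures(lines):
    """Return lines with a TEST-UNEXPECTED-* marker that are not shutdown-hang noise."""
    return [
        line.rstrip("\n")
        for line in lines
        if FAILURE.search(line) and not NOISE.search(line)
    ]


# Small hand-made sample built from the lines quoted in this bug.
SAMPLE_LOG = [
    "INFO - PID 6244 | [Parent 8752, IPC I/O Parent] WARNING: Process 4228 "
    "hanging at shutdown; attempting crash report (fatal error): file "
    "Z:/task_167109748916971/build/src/ipc/chromium/src/chrome/common/"
    "process_watcher_win.cc:154\n",
    "INFO - PID 6244 | [Parent 8752, IPC I/O Parent] WARNING: Process 4228 "
    "hanging at shutdown; attempting crash report (fatal error): file "
    "Z:/task_167109748916971/build/src/ipc/chromium/src/chrome/common/"
    "process_watcher_win.cc:154\n",
    "TEST-UNEXPECTED-TIMEOUT | /webdriver/tests/get_named_cookie/get.py | expected OK\n",
]

if __name__ == "__main__":
    print("noise lines:", count_noise(SAMPLE_LOG))
    for failure in real_failures(SAMPLE_LOG):
        print(failure)
```

For a real log, pass an open file object instead of SAMPLE_LOG; both helpers only iterate over lines.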
Comment 1•1 year ago
Set release status flags based on info from the regressing bug 1793525
Comment 2•1 year ago
I'll probably dup this onto bug 1805761, but to summarize:
- Windows ccov builds seem to need longer for child processes to shut down, at least in some cases. (The timeout might also need to be increased on other build types, or even in general, once we have more data.) This is simple to fix.
- The crash reports mentioned in the messages don't work on Windows ccov builds, because my attempt to cause a crash by injecting a thread fails with ERROR_ACCESS_DENIED. This happens on ccov builds and, as far as I can tell, only ccov builds; I don't know why yet. Maybe this isn't fixable and we should just use TerminateProcess or ignore the situation entirely. But, once I fix the timeout so we're not spamming false positives on the wdspec tests, this won't be very high urgency to fix.
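To make the two options above concrete, here is a minimal sketch of "wait longer on ccov, then fall back to terminating the child". It is written in Python for illustration only and is not the code in process_watcher_win.cc. The function names, the base timeout, and the ccov multiplier are all hypothetical; no actual values are given in this bug.

```python
import subprocess
import sys


def shutdown_timeout(base_seconds, is_ccov, ccov_factor=4.0):
    """Return the shutdown timeout, scaled up for ccov builds.

    ccov_factor is a made-up placeholder, not a value from this bug.
    """
    return base_seconds * ccov_factor if is_ccov else base_seconds


def wait_or_kill(proc, timeout_seconds):
    """Wait for a child to exit; kill it if the timeout expires.

    Returns "exited" if the child stopped on its own, "killed" otherwise.
    """
    try:
        proc.wait(timeout=timeout_seconds)
        return "exited"
    except subprocess.TimeoutExpired:
        # On Windows, Popen.kill() calls TerminateProcess; on POSIX it sends SIGKILL.
        proc.kill()
        proc.wait()  # reap the child so it does not linger
        return "killed"


if __name__ == "__main__":
    # A child that sleeps far longer than the timeout, standing in for a hung process.
    child = subprocess.Popen([sys.executable, "-c", "import time; time.sleep(60)"])
    print(wait_or_kill(child, shutdown_timeout(0.5, is_ccov=True)))
```

The fallback skips the crash report entirely, which matches the "just use TerminateProcess" option rather than the thread-injection approach that fails with ERROR_ACCESS_DENIED.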