Closed Bug 1835219 Opened 1 years ago Closed 1 year ago

Intermittent TCw TEST-UNEXPECTED-TIMEOUT | /css/selectors/has-style-sharing-005.html | single tracking bug

Categories

(Testing :: Code Coverage, defect, P5)

defect

Tracking

(firefox-esr102 unaffected, firefox113 unaffected, firefox114 unaffected, firefox115 wontfix, firefox116 wontfix)

RESOLVED INCOMPLETE
Tracking Status
firefox-esr102 --- unaffected
firefox113 --- unaffected
firefox114 --- unaffected
firefox115 --- wontfix
firefox116 --- wontfix

People

(Reporter: intermittent-bug-filer, Unassigned)

References

(Regression)

Details

(Keywords: intermittent-failure, intermittent-testcase, regression)

Filed by: imoraru [at] mozilla.com
Parsed log: https://treeherder.mozilla.org/logviewer?job_id=417143171&repo=mozilla-central
Full log: https://firefox-ci-tc.services.mozilla.com/api/queue/v1/task/YOBIy63NRAyufOj8TI5t1w/runs/0/artifacts/public/logs/live_backing.log


[task 2023-05-26T06:15:59.209Z] 06:15:59     INFO - PID 9128 | [CodeCoverage] JS flush completed.
[task 2023-05-26T06:16:00.674Z] 06:16:00     INFO - PID 9128 | 1685081760679	Marionette	INFO	Testing http://web-platform.test:8000/css/selectors/has-style-sharing-005.html == http://web-platform.test:8000/css/selectors/has-style-sharing-005-ref.html
[task 2023-05-26T06:16:00.847Z] 06:16:00     INFO - PID 9128 | 1685081760853	Marionette	INFO	No differences allowed
[task 2023-05-26T06:16:00.879Z] 06:16:00     INFO - PID 9128 | [CodeCoverage] Requested flush for 9028.
[task 2023-05-26T06:16:10.498Z] 06:16:10     INFO - PID 9128 | [CodeCoverage] flush completed.
[task 2023-05-26T06:16:10.686Z] 06:16:10     INFO - PID 9128 | [CodeCoverage] JS flush completed.
[task 2023-05-26T06:16:10.690Z] 06:16:10     INFO - PID 9128 | [CodeCoverage] Requested flush for 5872.
[task 2023-05-26T06:16:10.690Z] 06:16:10     INFO - PID 9128 | [CodeCoverage] Requested flush for 9116.
[task 2023-05-26T06:16:10.691Z] 06:16:10     INFO - PID 9128 | [CodeCoverage] Requested flush for 5636.
[task 2023-05-26T06:16:10.691Z] 06:16:10     INFO - PID 9128 | [CodeCoverage] Requested flush for 6756.
[task 2023-05-26T06:16:17.985Z] 06:16:17     INFO - PID 9128 | console.error: (new Error("Polling for changes failed: Unexpected content-type \"text/plain;charset=US-ASCII\".", "resource://services-settings/remote-settings.sys.mjs", 324))
[task 2023-05-26T06:16:21.751Z] 06:16:21     INFO - Got timeout in harness
[task 2023-05-26T06:16:21.754Z] 06:16:21     INFO - TEST-UNEXPECTED-TIMEOUT | /css/selectors/has-style-sharing-005.html | TestRunner hit external timeout (this may indicate a hang)
[task 2023-05-26T06:16:21.754Z] 06:16:21     INFO - TEST-INFO took 35012ms
[task 2023-05-26T06:16:21.754Z] 06:16:21     INFO - No more tests
[task 2023-05-26T06:16:24.811Z] 06:16:24     INFO - PID 9128 | [CodeCoverage] flush completed.
[task 2023-05-26T06:16:24.827Z] 06:16:24     INFO - PID 9128 | [CodeCoverage] JS flush completed.
[task 2023-05-26T06:16:27.181Z] 06:16:27     INFO - PID 9128 | [CodeCoverage] flush completed.
[task 2023-05-26T06:16:27.228Z] 06:16:27     INFO - PID 9128 | [CodeCoverage] JS flush completed.
[task 2023-05-26T06:16:31.199Z] 06:16:31     INFO - PID 9128 | JavaScript error: chrome://remote/content/shared/TabManager.sys.mjs, line 107: TypeError: can't access property "gBrowser", win is null
[task 2023-05-26T06:16:31.224Z] 06:16:31     INFO - PID 9128 | 1685081791223	Marionette	INFO	Stopped listening on port 65280
[task 2023-05-26T06:16:35.298Z] 06:16:35     INFO - PID 9128 | [CodeCoverage] flush completed.
[task 2023-05-26T06:16:35.309Z] 06:16:35     INFO - PID 9128 | [CodeCoverage] JS flush completed.
[task 2023-05-26T06:16:49.387Z] 06:16:49     INFO - PID 9128 | WARNING: At least one completion condition is taking too long to complete. Conditions: [{"name":"ContentParent: id=2469b131800","state":{"remoteTypePrefix":"prealloc"},"filename":"Z:/task_168507398913774/build/src/dom/ipc/ContentParent.cpp","lineNumber":3910,"stack":""}] Barrier: profile-before-change
[task 2023-05-26T06:16:49.576Z] 06:16:49     INFO - PID 9128 | [CodeCoverage] flush completed.
[task 2023-05-26T06:16:49.588Z] 06:16:49     INFO - PID 9128 | [CodeCoverage] JS flush completed.
[task 2023-05-26T06:16:50.988Z] 06:16:50     INFO - PID 9128 | [Parent 9028, IPC I/O Parent] WARNING: Process 6756 may be hanging at shutdown; will wait for up to 80000ms: file Z:/task_168507398913774/build/src/ipc/chromium/src/chrome/common/process_watcher_win.cc:129
[task 2023-05-26T06:16:51.613Z] 06:16:51     INFO - PID 9128 | [Parent 9028, IPC I/O Parent] WARNING: Process 8536 may be hanging at shutdown; will wait for up to 79375ms: file Z:/task_168507398913774/build/src/ipc/chromium/src/chrome/common/process_watcher_win.cc:129
[task 2023-05-26T06:16:53.925Z] 06:16:53     INFO - Browser exited with return code 0

Set release status flags based on info from the regressing bug 1793012

:dshin, since you are the author of the regressor, bug 1793012, could you take a look?

For more information, please visit BugBot documentation.

Hm.. code coverage runs seem to take an extraordinary amount of time:

[task 2023-05-26T06:11:58.802Z] 06:11:58     INFO - TEST-PASS | /css/selectors/has-style-sharing-001.html | took 25131ms
[task 2023-05-26T06:13:00.278Z] 06:13:00     INFO - TEST-PASS | /css/selectors/has-style-sharing-002.html | took 23563ms
[task 2023-05-26T06:14:01.520Z] 06:14:01     INFO - TEST-PASS | /css/selectors/has-style-sharing-003.html | took 24268ms
[task 2023-05-26T06:15:08.920Z] 06:15:08     INFO - TEST-PASS | /css/selectors/has-style-sharing-004.html | took 24350ms
[task 2023-05-26T06:16:21.754Z] 06:16:21     INFO - TEST-INFO took 35012ms

The last failing test does take extra long.
There's nothing specific in :has code branch that would get stuck, so not sure if this indicates an actual bug.

Also of note is that similar issues were recently(-ish) filed for win11 code coverage builds, with similar output (e.g. bug 1834845). Perhaps a timeout threshold issue?

Flags: needinfo?(dshin)

Most recent log ( https://firefoxci.taskcluster-artifacts.net/YOBIy63NRAyufOj8TI5t1w/0/public/logs/live_backing.log) has the following:

[taskcluster 2023-05-26T06:07:19.202Z] Worker Type (gecko-t/win11-64-2009-ssd) settings:
[...]
[task 2023-05-26T06:15:46.740Z] 06:15:46     INFO - TEST-START | /css/selectors/has-style-sharing-005.html
[...]
[task 2023-05-26T06:16:21.751Z] 06:16:21     INFO - Got timeout in harness
[task 2023-05-26T06:16:21.754Z] 06:16:21     INFO - TEST-UNEXPECTED-TIMEOUT | /css/selectors/has-style-sharing-005.html | TestRunner hit external timeout (this may indicate a hang)
[task 2023-05-26T06:16:21.754Z] 06:16:21     INFO - TEST-INFO took 35012ms

The other tests in the log all seem to have taken ~25 seconds each, so this one lasting >35 seconds doesn't seem out-of-place.

It looks like the whole run was just under 10 minutes (first logging at 6:07am, first test actually gets run at 06:11am, and then "timeout in harness" at 6:16am.)

So I think nothing is obviously amiss here with the test or with gecko itself. It seems like we either need to increase the timeout threshold (9 minutes seems a bit low, particularly if it takes 4 minuites to actually start the first test?); or, we need to subdivide this bucket so that we're not at risk of hitting it.

Component: CSS Parsing and Computation → Code Coverage
Product: Core → Testing

Meant to say, the issue per the log here seems to be "Got timeout in harness", which it notes "may indicate a hang" but in this case it doesn't seem to indicate a hang; we're still running tests up to the last second, and it's not like we got stuck in a test for an inordinate amount of time. It seems the time-needed-for-code-coverage-and-these-tests is substantially longer than the 9 minutes (really, 5 minutes after 4 minutes of spin-up time) that we're apparently allotting for this task.

(I'm also not sure whether this belongs in Testing|Code Coverage vs. in Release Engineering like e.g. bug 1799052. But in any case, the fix here is probably to increase the overall harness timeout threshold for this task.)

Set release status flags based on info from the regressing bug 1793012

Status: NEW → RESOLVED
Closed: 1 year ago
Resolution: --- → INCOMPLETE
You need to log in before you can comment on or make changes to this bug.