Closed Bug 1886772 Opened 1 year ago Closed 1 year ago

Intermittent [tier 2] /css/css-layout-api/constraints/fixed-block-size-fixed.https.html | single tracking bug

Categories

(Core :: Layout, defect, P5)

Tracking

()

RESOLVED INCOMPLETE

People

(Reporter: intermittent-bug-filer, Assigned: jmaher)

Details

(Keywords: intermittent-failure, intermittent-testcase)

Attachments

(1 file)

Filed by: imoraru [at] mozilla.com
Parsed log: https://treeherder.mozilla.org/logviewer?job_id=451665470&repo=mozilla-central
Full log: https://firefox-ci-tc.services.mozilla.com/api/queue/v1/task/ZZD16AJeR5elDPRW3nqbhQ/runs/0/artifacts/public/logs/live_backing.log
Reftest URL: https://hg.mozilla.org/mozilla-central/raw-file/tip/layout/tools/reftest/reftest-analyzer.xhtml#logurl=https://firefox-ci-tc.services.mozilla.com/api/queue/v1/task/ZZD16AJeR5elDPRW3nqbhQ/runs/0/artifacts/public/logs/live_backing.log&only_show_unexpected=1


[task 2024-03-21T14:02:57.350Z] 14:02:57     INFO - TEST-START | /css/css-layout-api/constraints/fixed-block-size-fixed.https.html
[task 2024-03-21T14:02:57.351Z] 14:02:57     INFO - PID 5844 | 1711029777350	Marionette	INFO	Testing https://web-platform.test:8443/css/css-layout-api/constraints/fixed-block-size-fixed.https.html == http://web-platform.test:8000/css/css-layout-api/green-square-ref.html
[task 2024-03-21T14:02:59.883Z] 14:02:59     INFO - TEST-UNEXPECTED-TIMEOUT | /css/css-layout-api/constraints/fixed-block-size-fixed.https.html | expected FAIL
[task 2024-03-21T14:02:59.883Z] 14:02:59     INFO - TEST-INFO expected FAIL | took 2534ms
[task 2024-03-21T14:02:59.893Z] 14:02:59     INFO - PID 5844 | 1711029779892	Marionette	INFO	Stopped listening on port 53471
[task 2024-03-21T14:03:13.483Z] 14:03:13     INFO - Browser exited with return code 0
[task 2024-03-21T14:03:13.484Z] 14:03:13     INFO - Closing logging queue
[task 2024-03-21T14:03:13.484Z] 14:03:13     INFO - queue closed
[task 2024-03-21T14:03:13.533Z] 14:03:13     INFO - Application command: Z:\task_171102798689494\build\application\firefox\firefox.exe -marionette about:blank --wait-for-browser -profile C:\Users\task_171102798689494\AppData\Local\Temp\tmp97i_sffw
[task 2024-03-21T14:03:13.540Z] 14:03:13     INFO - PID 3864 | 1711029772581	Marionette	INFO	Marionette enabled
[task 2024-03-21T14:03:13.541Z] 14:03:13     INFO - PID 3864 | 1711029772663	Marionette	INFO	Listening on port 53563
[task 2024-03-21T14:03:13.541Z] 14:03:13     INFO - Starting runner
[task 2024-03-21T14:03:14.572Z] 14:03:14     INFO - TEST-START | /css/css-layout-api/constraints/fixed-block-size-flex-basis-vrl.https.html

If I'm reading the log right, this test was judged as a TIMEOUT after only 2.5 seconds (took 2534ms)... that seems suspiciously low for a "test timed out" judgement. I thought 10s was our default/lowest timeout threshold for WPTs?

jgraham or jmaher, do you know what might be going on here? Do we actually have a ~2.5 second timeout threshold for these tests? (If so, we should probably increase that; some of the nearby tests are taking e.g. 1732ms, 1976ms, 2229ms, if I scroll around a bit in the log -- super close to whatever threshold this test happened to trip over.)

Flags: needinfo?(james)

The problem is that this test is put in a backlog job, where the timeout is overridden in mozharness: https://searchfox.org/mozilla-central/source/testing/mozharness/scripts/web_platform_tests.py#443
The tests are never validated with such a short timeout, and I agree that it seems problematic.

Flags: needinfo?(james)

Thanks, that makes some sense. Adding some additional detail:

(In reply to James Graham [:jgraham] from comment #3)

The problem is that this is put in a backlog job

That happens via the implementation-status: backlog annotation here, from bug 1572820.

It seems we do this for about 100 directories of tests, based on a searchfox search. I would bet that many/most of those tests are simply "expected: FAIL" (because we don't implement some feature), and the 0.25x timeout multiplier from bug 1643177 puts them at risk of intermittent failures, with the resulting confusion and bug-spam like this bug here. For tests that are simply expected-failure, the reduced multiplier turns every test into a potential infrequent intermittent timeout, given that (per comment 1) many of the tests run extremely close to the reduced 2.5s timeout threshold. I'd bet we've already annotated some such tests as expected-timeout, not because of a "real" timeout but only because they occasionally overshoot this reduced 2.5s threshold.
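To make the margin concrete, here is a rough arithmetic sketch. It assumes the 10s default timeout mentioned in comment 1 and the 0.25x backlog multiplier; the function names are hypothetical and are not taken from wptrunner or mozharness. The durations are the ones quoted from the log earlier in this bug.

```python
# Illustrative arithmetic only; the constants come from the comments in this bug.
DEFAULT_TIMEOUT_S = 10.0     # default WPT timeout cited in comment 1 (assumed)
BACKLOG_MULTIPLIER = 0.25    # reduced multiplier applied to backlog jobs


def effective_timeout_ms(base_timeout_s, multiplier):
    """Return the timeout threshold in milliseconds after scaling."""
    return base_timeout_s * multiplier * 1000.0


def headroom_ms(duration_ms, threshold_ms):
    """Positive: finished under the threshold; negative: overshot it."""
    return threshold_ms - duration_ms


threshold = effective_timeout_ms(DEFAULT_TIMEOUT_S, BACKLOG_MULTIPLIER)

# Durations quoted from the log: three nearby tests, then this bug's failure.
for duration in (1732, 1976, 2229, 2534):
    print(f"{duration} ms -> headroom {headroom_ms(duration, threshold):+.0f} ms")
```

Under these assumptions the threshold is 2500 ms, so the 2534 ms run overshoots it only slightly, and the nearby tests have a few hundred milliseconds of headroom at most.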

Two ideas for ways forward here:
(1) If possible, it seems like we should only apply bug 1643177's timeout multiplier for tests that are explicitly annotated as expected timeout (which seems to be what bug 1643177 comment 0 was assuming about all of the backlog tests -- "Since the tests in the backlog side of things are expected to time out by default,[...]").

(2) One less-elegant solution would be to simply treat every test in the backlog task as having TIMEOUT included in its list of acceptable results, whether due to an actual timeout in the test or due to our reduced time threshold. That would feel pretty clumsy, though, and could give us reduced test-coverage for real regressions where we inadvertently introduce a perma-hang that one of the backlog tests happens to trigger.

jmaher, does proposal (1) make sense to you / do you have other ideas about what we could do here?

Flags: needinfo?(jmaher)

I like #1, but I am unsure of how to proceed here. Possibly we pass a backlog parameter in to the harness and, before running each test, set the multiplier only if the test's possible expectations include TIMEOUT. I am hunting around and will continue for a bit more.

Flags: needinfo?(jmaher)

I assume the code around https://searchfox.org/mozilla-central/source/testing/web-platform/tests/tools/wptrunner/wptrunner/testrunner.py#655 would need to be modified. I am not sure if we have access to the expected test results here.

We should be able to use self.state.test.expected() to get what is expected; we may also need a call to self.state.test.known_intermittent. Of course, a backlog variable needs to be plumbed down there.
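A minimal sketch of how proposal (1) might look, not the actual wptrunner code. `FakeTest` is a hypothetical stand-in for the test object, and `timeout_multiplier_for` and its `backlog` parameter are illustrative names; it only assumes that the expected status and the known-intermittent statuses can be read from the test, as suggested above.

```python
from dataclasses import dataclass, field


@dataclass
class FakeTest:
    """Hypothetical stand-in for a wptrunner test object (an assumption)."""
    expected_status: str = "PASS"
    intermittent_statuses: list = field(default_factory=list)

    def expected(self):
        return self.expected_status

    def known_intermittent(self):
        return list(self.intermittent_statuses)


def timeout_multiplier_for(test, backlog, base_multiplier=1.0,
                           backlog_multiplier=0.25):
    """Shorten the timeout only for backlog tests that may time out anyway."""
    if not backlog:
        return base_multiplier
    # Collect every status the metadata treats as acceptable for this test.
    acceptable = {test.expected(), *test.known_intermittent()}
    if "TIMEOUT" in acceptable:
        return base_multiplier * backlog_multiplier
    # Expected FAIL/PASS/etc.: keep the normal timeout to avoid bogus timeouts.
    return base_multiplier


# Example: an expected-FAIL backlog test keeps its full timeout.
print(timeout_multiplier_for(FakeTest("FAIL"), backlog=True))
print(timeout_multiplier_for(FakeTest("TIMEOUT"), backlog=True))
```

With this shape, a test like the one in this bug (expected FAIL) would keep the full timeout in the backlog job, while tests annotated as expected or intermittent TIMEOUT would still get the shortened one.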

Assignee: nobody → jmaher
Status: NEW → ASSIGNED
Status: ASSIGNED → RESOLVED
Closed: 1 year ago
Resolution: --- → INCOMPLETE