Bug 1886772 Comment 4 Edit History

Note: The actual edited comment in the bug view page will always show the original commenter’s name and original timestamp.

Original comment by

Daniel Holbert [:dholbert]

on 2024-03-26 11:15:14 PDT

Thanks, that makes some sense.  Adding some additional detail:

(In reply to James Graham [:jgraham] from comment #3)
> The problem is that this is put in a backlog job

That happens via the `implementation-status: backlog` annotation [here](https://searchfox.org/mozilla-central/rev/47a0a01e1f7ad0451c6ba6c790d5c6855df512c1/testing/web-platform/meta/css/css-layout-api/__dir__.ini#1), from bug 1572820.

It seems we do this for about 100 directories of tests, based on a searchfox search.  I would bet that many/most of those tests are simply "expected: FAIL" (due to us not implementing some feature), and the 0.25x timeout multiplier puts them at risk of triggering intermittent failures and resulting confusion & bug-spam like this bug here.

Two ideas for ways forward here:
(1) If possible, it seems like we should only apply bug 1643177's timeout multiplier **for tests that are explicitly annotated as expected timeout** (which seems to be what bug 1643177 comment 0 was assuming about all of the backlog tests -- "Since the tests in the backlog side of things are expected to time out by default,[...]").  For tests that are simply expected-failure, the reduced timeout multiplier (bug 1643177) turns every test into a potential infrequent intermittent-timeout, given that (per comment 1) many of the tests are *extremely* close to tripping over the reduced 2.5s timeout threshold.

(2) One less-elegant solution would be to simply treat every test in the backlog task as having `TIMEOUT` included in its list of acceptable results, whether due to an actual timeout in the test or due to our reduced time threshold.  That would feel pretty clumsy, though, and could give us reduced test-coverage for real regressions where we inadvertently introduce a perma-hang that one of the backlog tests happens to trigger.

jmaher, does proposal (1) make sense to you / do you have other ideas about what we could do here?

Revision 1 by

Daniel Holbert [:dholbert]

on 2024-03-26 11:16:12 PDT

Thanks, that makes some sense.  Adding some additional detail:

(In reply to James Graham [:jgraham] from comment #3)
> The problem is that this is put in a backlog job

That happens via the `implementation-status: backlog` annotation [here](https://searchfox.org/mozilla-central/rev/47a0a01e1f7ad0451c6ba6c790d5c6855df512c1/testing/web-platform/meta/css/css-layout-api/__dir__.ini#1), from bug 1572820.

It seems we do this for about 100 directories of tests, based on a searchfox search.  I would bet that many/most of those tests are simply "expected: FAIL" (due to us not implementing some feature), and the 0.25x timeout multiplier puts them at risk of triggering intermittent failures and resulting confusion & bug-spam like this bug here. (For tests that are simply expected-failure, the reduced timeout multiplier (bug 1643177) turns every test into a potential infrequent intermittent-timeout, given that (per comment 1) many of the tests are *extremely* close to tripping over the reduced 2.5s timeout threshold.)

Two ideas for ways forward here:
(1) If possible, it seems like we should only apply bug 1643177's timeout multiplier **for tests that are explicitly annotated as expected timeout** (which seems to be what bug 1643177 comment 0 was assuming about all of the backlog tests -- "Since the tests in the backlog side of things are expected to time out by default,[...]"). 

(2) One less-elegant solution would be to simply treat every test in the backlog task as having `TIMEOUT` included in its list of acceptable results, whether due to an actual timeout in the test or due to our reduced time threshold.  That would feel pretty clumsy, though, and could give us reduced test-coverage for real regressions where we inadvertently introduce a perma-hang that one of the backlog tests happens to trigger.

jmaher, does proposal (1) make sense to you / do you have other ideas about what we could do here?

Revision 2 by

Daniel Holbert [:dholbert]

on 2024-03-26 11:25:27 PDT

Thanks, that makes some sense.  Adding some additional detail:

(In reply to James Graham [:jgraham] from comment #3)
> The problem is that this is put in a backlog job

That happens via the `implementation-status: backlog` annotation [here](https://searchfox.org/mozilla-central/rev/47a0a01e1f7ad0451c6ba6c790d5c6855df512c1/testing/web-platform/meta/css/css-layout-api/__dir__.ini#1), from bug 1572820.

It seems we do this for about 100 directories of tests, based on a searchfox search.  I would bet that many/most of those tests are simply "expected: FAIL" (due to us not implementing some feature), and the 0.25x timeout multiplier puts them at risk of triggering intermittent failures and resulting confusion & bug-spam like this bug here. (For tests that are simply expected-failure, the reduced timeout multiplier (bug 1643177) turns every test into a potential infrequent intermittent-timeout, given that (per comment 1) many of the tests are *extremely* close to tripping over the reduced 2.5s timeout threshold.  I'd bet we've already proceeded to annotate some such tests as expected-timeout, not due to a "real" timeout but only because they overshoot this reduced 2.5s threshold occasionally.)

Two ideas for ways forward here:
(1) If possible, it seems like we should only apply bug 1643177's timeout multiplier **for tests that are explicitly annotated as expected timeout** (which seems to be what bug 1643177 comment 0 was assuming about all of the backlog tests -- "Since the tests in the backlog side of things are expected to time out by default,[...]"). 

(2) One less-elegant solution would be to simply treat every test in the backlog task as having `TIMEOUT` included in its list of acceptable results, whether due to an actual timeout in the test or due to our reduced time threshold.  That would feel pretty clumsy, though, and could give us reduced test-coverage for real regressions where we inadvertently introduce a perma-hang that one of the backlog tests happens to trigger.
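For illustration only, here is a minimal sketch of how the two proposals differ. It is not wptrunner code: `TestMeta`, `effective_timeout`, and `acceptable_results_proposal_2` are hypothetical names, and the 10s base timeout is an assumption inferred from the 0.25x multiplier producing the 2.5s threshold mentioned above.

```python
from dataclasses import dataclass

# Assumption: a 10s base timeout, inferred from 0.25 x base = 2.5s.
BASE_TIMEOUT_S = 10.0
# The reduced multiplier applied to backlog tests (bug 1643177).
BACKLOG_MULTIPLIER = 0.25


@dataclass(frozen=True)
class TestMeta:
    """Hypothetical stand-in for a test's metadata, not wptrunner's real model."""
    expected: frozenset = frozenset({"FAIL"})
    backlog: bool = True


def effective_timeout(meta: TestMeta) -> float:
    """Proposal (1): shorten the timeout only for backlog tests that are
    explicitly annotated as expected-timeout."""
    if meta.backlog and "TIMEOUT" in meta.expected:
        return BASE_TIMEOUT_S * BACKLOG_MULTIPLIER
    return BASE_TIMEOUT_S


def acceptable_results_proposal_2(meta: TestMeta) -> frozenset:
    """Proposal (2): keep the reduced timeout for all backlog tests, but
    treat TIMEOUT as an acceptable result for every one of them."""
    if meta.backlog:
        return meta.expected | {"TIMEOUT"}
    return meta.expected


expected_fail = TestMeta(expected=frozenset({"FAIL"}))
expected_timeout = TestMeta(expected=frozenset({"TIMEOUT"}))

# Under proposal (1), an expected-FAIL backlog test keeps the full timeout,
# so it no longer runs close to the 2.5s threshold.
print(effective_timeout(expected_fail))
print(effective_timeout(expected_timeout))
# Under proposal (2), the same expected-FAIL test would also accept TIMEOUT,
# which is what could hide a real perma-hang regression.
print(sorted(acceptable_results_proposal_2(expected_fail)))
```

In this sketch, proposal (1) changes which tests get the short timeout, while proposal (2) leaves the timeout alone and widens what counts as a pass.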

jmaher, does proposal (1) make sense to you / do you have other ideas about what we could do here?
