1787569 - Intermittent TVw ::: Test verification FAIL | TinderboxPrint: Per-test run of .../outline-valid-mandatory.html<br/>: FAILURE

Reporter

Description

•

2 years ago

treeherder

Filed by: imoraru [at] mozilla.com
Parsed log: https://treeherder.mozilla.org/logviewer?job_id=388620274&repo=autoland
Full log: https://firefox-ci-tc.services.mozilla.com/api/queue/v1/task/BOnYSrB7TfGBDJfrp8xQOA/runs/0/artifacts/public/logs/live_backing.log
Reftest URL: https://hg.mozilla.org/mozilla-central/raw-file/tip/layout/tools/reftest/reftest-analyzer.xhtml#logurl=https://firefox-ci-tc.services.mozilla.com/api/queue/v1/task/BOnYSrB7TfGBDJfrp8xQOA/runs/0/artifacts/public/logs/live_backing.log&only_show_unexpected=1

[task 2022-08-26T20:09:50.561Z] 20:09:50     INFO - ## Slow tests ##
[task 2022-08-26T20:09:50.561Z] 20:09:50     INFO - 
[task 2022-08-26T20:09:50.561Z] 20:09:50     INFO - |                        Test                        | Result | Longest duration (ms) | Timeout (ms) |
[task 2022-08-26T20:09:50.561Z] 20:09:50     INFO - |----------------------------------------------------|--------|-----------------------|--------------|
[task 2022-08-26T20:09:50.562Z] 20:09:50     INFO - | `/css/css-ui/parsing/outline-valid-mandatory.html` | `OK`   | `9294`                | `10000`      |
[task 2022-08-26T20:09:50.562Z] 20:09:50     INFO - 
[task 2022-08-26T20:09:50.562Z] 20:09:50     INFO - ::: Running tests in a loop 10 times : PASS
[task 2022-08-26T20:09:50.563Z] 20:09:50     INFO - ::: Running tests in a loop with restarts 5 times : PASS
[task 2022-08-26T20:09:50.563Z] 20:09:50     INFO - ::: Running tests in a loop 10 times with flags chaos_mode_flags=0xfb : PASS
[task 2022-08-26T20:09:50.563Z] 20:09:50     INFO - ::: Running tests in a loop with restarts 5 times with flags chaos_mode_flags=0xfb : FAIL
[task 2022-08-26T20:09:50.564Z] 20:09:50     INFO - :::
[task 2022-08-26T20:09:50.564Z] 20:09:50    ERROR - ::: Test verification FAIL
[task 2022-08-26T20:09:50.564Z] 20:09:50     INFO - :::
[task 2022-08-26T20:09:50.867Z] 20:09:50    ERROR - Return code: 1
[task 2022-08-26T20:09:50.867Z] 20:09:50    ERROR - # TBPL FAILURE #
[task 2022-08-26T20:09:50.867Z] 20:09:50  WARNING - setting return code to 2
[task 2022-08-26T20:09:50.867Z] 20:09:50    ERROR - TinderboxPrint: Per-test run of .../outline-valid-mandatory.html<br/>: FAILURE
[task 2022-08-26T20:09:50.868Z] 20:09:50     INFO - Running post-action listener: _package_coverage_data
[task 2022-08-26T20:09:50.868Z] 20:09:50     INFO - Running post-action listener: _resource_record_post_action
[task 2022-08-26T20:09:50.868Z] 20:09:50     INFO - Running post-action listener: process_java_coverage_data
[task 2022-08-26T20:09:50.868Z] 20:09:50     INFO - Running post-action listener: stop_device
[task 2022-08-26T20:09:50.868Z] 20:09:50     INFO - [mozharness: 2022-08-26 20:09:50.868281Z] Finished run-tests step (success)
[task 2022-08-26T20:09:50.868Z] 20:09:50     INFO - Running post-run listener: _resource_record_post_run
[task 2022-08-26T20:09:50.966Z] 20:09:50     INFO - Total resource usage - Wall time: 538s; CPU: 4%; Read bytes: 94969856; Write bytes: 844189696; Read time: 1445; Write time: 2919
[task 2022-08-26T20:09:50.966Z] 20:09:50     INFO - TinderboxPrint: CPU usage<br/>4.1%
[task 2022-08-26T20:09:50.966Z] 20:09:50     INFO - TinderboxPrint: I/O read bytes / time<br/>94,969,856 / 1,445
[task 2022-08-26T20:09:50.967Z] 20:09:50     INFO - TinderboxPrint: I/O write bytes / time<br/>844,189,696 / 2,919
[task 2022-08-26T20:09:50.967Z] 20:09:50     INFO - TinderboxPrint: CPU idle<br/>6,181.1 (95.7%)
[task 2022-08-26T20:09:50.967Z] 20:09:50     INFO - TinderboxPrint: CPU system<br/>106.5 (1.6%)
[task 2022-08-26T20:09:50.967Z] 20:09:50     INFO - TinderboxPrint: CPU user<br/>169.3 (2.6%)
[task 2022-08-26T20:09:50.967Z] 20:09:50     INFO - TinderboxPrint: Swap in / out<br/>334,327,808 / 0
[task 2022-08-26T20:09:50.967Z] 20:09:50     INFO - pull - Wall time: 0s; CPU: Can't collect data; Read bytes: 0; Write bytes: 0; Read time: 0; Write time: 0
[task 2022-08-26T20:09:50.967Z] 20:09:50     INFO - start-emulator - Wall time: 0s; CPU: Can't collect data; Read bytes: 0; Write bytes: 0; Read time: 0; Write time: 0
[task 2022-08-26T20:09:50.968Z] 20:09:50     INFO - verify-device - Wall time: 0s; CPU: Can't collect data; Read bytes: 0; Write bytes: 0; Read time: 0; Write time: 0
[task 2022-08-26T20:09:50.968Z] 20:09:50     INFO - install - Wall time: 19s; CPU: 14%; Read bytes: 265295360; Write bytes: 257552384; Read time: 17141; Write time: 515
[task 2022-08-26T20:09:50.975Z] 20:09:50     INFO - run-tests - Wall time: 520s; CPU: 4%; Read bytes: 83832832; Write bytes: 577069056; Read time: 1219; Write time: 2321
[task 2022-08-26T20:09:51.101Z] 20:09:51  WARNING - returning nonzero exit status 2
[taskcluster 2022-08-26T20:09:51.378Z]    Exit Code: 2
[taskcluster 2022-08-26T20:09:51.378Z]    User Time: 3m3.825947s
[taskcluster 2022-08-26T20:09:51.378Z]  Kernel Time: 53.171067s
[taskcluster 2022-08-26T20:09:51.378Z]    Wall Time: 10m15.833135s
[taskcluster 2022-08-26T20:09:51.378Z]       Result: FAILED
[taskcluster 2022-08-26T20:09:51.378Z] === Task Finished ===
[taskcluster 2022-08-26T20:09:51.378Z] Task Duration: 10m15.836404s
[taskcluster 2022-08-26T20:09:51.476Z] Uploading artifact public/logs/localconfig.json from file logs/localconfig.json with content encoding "gzip", mime type "application/json" and expiry 2023-08-26T19:35:59.001Z
[taskcluster 2022-08-26T20:09:51.685Z] Uploading artifact public/test_info/resource-usage.json from file build/blobber_upload_dir/resource-usage.json with content encoding "gzip", mime type "application/json" and expiry 2023-08-26T19:35:59.001Z
[taskcluster 2022-08-26T20:09:52.449Z] Uploading artifact public/test_info/wpt-test1_errorsummary.log from file build/blobber_upload_dir/wpt-test1_errorsummary.log with content encoding "gzip", mime type "text/plain" and expiry 2023-08-26T19:35:59.001Z
[taskcluster 2022-08-26T20:09:52.556Z] Uploading artifact public/test_info/wpt-test1_raw.log from file build/blobber_upload_dir/wpt-test1_raw.log with content encoding "gzip", mime type "text/plain" and expiry 2023-08-26T19:35:59.001Z
[taskcluster 2022-08-26T20:09:52.836Z] Uploading artifact public/test_info/wpt_instruments.txt from file build/blobber_upload_dir/wpt_instruments.txt with content encoding "gzip", mime type "text/plain; charset=utf-8" and expiry 2023-08-26T19:35:59.001Z
[taskcluster 2022-08-26T20:09:52.928Z] Uploading artifact public/test_info/wptreport.json from file build/blobber_upload_dir/wptreport.json with content encoding "gzip", mime type "application/json" and expiry 2023-08-26T19:35:59.001Z
[taskcluster 2022-08-26T20:09:53.059Z] Uploading redirect artifact public/logs/live.log to URL https://firefox-ci-tc.services.mozilla.com/api/queue/v1/task/BOnYSrB7TfGBDJfrp8xQOA/runs/0/artifacts/public%2Flogs%2Flive_backing.log with mime type "text/plain; charset=utf-8" and expiry 2023-08-26T19:35:59.001Z
[taskcluster:error] exit status 2

BugBot [:suhaib / :marco/ :calixte]

Comment 1

•

2 years ago

:connor.pearson, since you are the author of the regressor, bug 1565111, could you take a look?
For more information, please visit auto_nag documentation.

Flags: needinfo?(connor.pearson)

BugBot [:suhaib / :marco/ :calixte]

Comment 2

•

2 years ago

Set release status flags based on info from the regressing bug 1565111

status-firefox104: --- → unaffected

status-firefox105: --- → unaffected

status-firefox106: --- → affected

status-firefox-esr102: --- → unaffected

status-firefox-esr91: --- → unaffected

Comment hidden (Intermittent Failures Robot)

Connor Pearson

Comment 4

•

2 years ago

I'm not quite sure how to interpret this result as I can't reproduce it locally and I'm not too familiar with treeherder.

The fix for bug 1565111 did re-enable this test and make some small changes, but without knowing more I'm not sure if it caused this issue.

:dholbert, I see you are the triage owner for this bug. Could you advise me on how to handle this?

Flags: needinfo?(connor.pearson) → needinfo?(dholbert)

Daniel Holbert [:dholbert]

Comment 5

•

2 years ago

Hi Connor!

Demystifying things slightly: any time you touch a test, it gets run in "Test-Verify" mode, which is an extra-robust series of test runs (running it under a variety of conditions to see if it can be made to fail). This aims to "flush out" intermittent failures aggressively and early; any failures that turn up are presumed to be latent bugs in the test and/or the code that would may come up intermittently in the future during normal test runs.

You can run test-verify mode locally using ./mach test --verify path/to/test.

In this case, it looks like the we failed during the part of the test-verify running in "chaos mode", but it's not obvious to me what the failure is -- I don't see any unexpected results in that file (usually they're prominently flagged with the word unexpected in the log).

My guess is that the failure here is just that the test is dangerously close to the timeout value (at least, when run in chaos mode):

20:09:50     INFO - ## Slow tests ##
20:09:50     INFO - 
20:09:50     INFO - |                        Test                        | Result | Longest duration (ms) | Timeout (ms) |
20:09:50     INFO - |----------------------------------------------------|--------|-----------------------|--------------|
20:09:50     INFO - | `/css/css-ui/parsing/outline-valid-mandatory.html` | `OK`   | `9294`                | `10000`      |

9294ms is dangerously close to 10000ms (9 sec vs 10 sec), which indicates that this test is likely to occasionally time out on certain configurations.

We can address this by simply requesting a longer timeout for this test by adding this at the top of the test:

<meta name="timeout" content="long">

I suspect that'll take care of this.

Flags: needinfo?(dholbert)

Daniel Holbert [:dholbert]

Comment 6

•

2 years ago

•

Edited

...BUT, let's not actually add that; it'd probably be silly.

It looks like the test completes ~instantly (usually less than 100ms) when run "normally" (not in chaos mode), and it takes on the order of 50x to 100x the regular duration when run in chaos mode. (50x = comparing 70ms to 80ms vs. 5000ms, with 9294ms being the worst as flagged as a test failure, which is over 100x the average test duration in non-chaos-mode).

Quoting the log -- the first section (not in chaos mode) gets test-durations in the 100ms range, usually a bit lower:

Running test verification step "Running tests in a loop 10 times"...
[...]
TEST-OK | /css/css-ui/parsing/outline-valid-mandatory.html | took 133ms
TEST-OK | /css/css-ui/parsing/outline-valid-mandatory.html | took 88ms
TEST-OK | /css/css-ui/parsing/outline-valid-mandatory.html | took 80ms
TEST-OK | /css/css-ui/parsing/outline-valid-mandatory.html | took 85ms
TEST-OK | /css/css-ui/parsing/outline-valid-mandatory.html | took 78ms
TEST-OK | /css/css-ui/parsing/outline-valid-mandatory.html | took 71ms
TEST-OK | /css/css-ui/parsing/outline-valid-mandatory.html | took 71ms
TEST-OK | /css/css-ui/parsing/outline-valid-mandatory.html | took 74ms
TEST-OK | /css/css-ui/parsing/outline-valid-mandatory.html | took 72ms
TEST-OK | /css/css-ui/parsing/outline-valid-mandatory.html | took 79ms

But then in chaos mode, the test duration is > 3s, sometimes several times that (which puts us close to 10s hence the test harness worrying):

Running test verification step "Running tests in a loop 10 times with flags chaos_mode_flags=0xfb"...
[...]
TEST-OK | /css/css-ui/parsing/outline-valid-mandatory.html | took 5445ms
TEST-OK | /css/css-ui/parsing/outline-valid-mandatory.html | took 3790ms
TEST-OK | /css/css-ui/parsing/outline-valid-mandatory.html | took 3300ms
TEST-OK | /css/css-ui/parsing/outline-valid-mandatory.html | took 3401ms
TEST-OK | /css/css-ui/parsing/outline-valid-mandatory.html | took 3143ms
TEST-OK | /css/css-ui/parsing/outline-valid-mandatory.html | took 3304ms
TEST-OK | /css/css-ui/parsing/outline-valid-mandatory.html | took 3064ms
TEST-OK | /css/css-ui/parsing/outline-valid-mandatory.html | took 3104ms
TEST-OK | /css/css-ui/parsing/outline-valid-mandatory.html | took 3057ms
TEST-OK | /css/css-ui/parsing/outline-valid-mandatory.html | took 3364ms
[...]
Running test verification step "Running tests in a loop with restarts 5 times with flags chaos_mode_
flags=0xfb"...
[...]
TEST-OK | /css/css-ui/parsing/outline-valid-mandatory.html | took 5746ms
TEST-OK | /css/css-ui/parsing/outline-valid-mandatory.html | took 9294ms
TEST-OK | /css/css-ui/parsing/outline-valid-mandatory.html | took 5473ms
TEST-OK | /css/css-ui/parsing/outline-valid-mandatory.html | took 5552ms
TEST-OK | /css/css-ui/parsing/outline-valid-mandatory.html | took 5262ms

Bottom line: the "worrisome" long test-duration here isn't really realistic since it's chaos-mode-only and is about 100x the test's regular duration. To the extent that chaos mode is trying to test edge cases and flush out possible-but-unlikely scenarios, it's useful, but it seems like the durations that it produces aren't especially realistic given the order of magnitude difference.

Probably we should run WPT tests with a longer timeout allowance during chaos mode. It doesn't feel like it makes sense that we'd have to flag every 100ms-regular-duration test as needing a longer timeout (particularly if it means adding a crufty meta tag to the test itself), just to appease chaos mode & test-verify.

(I think dshin and I discussed something similar a while back; CC'ing him as an FYI. [EDIT: yeah, I was remembering running into this in bug 1765426.])

Daniel Holbert [:dholbert]

Comment 7

•

2 years ago

It looks like this is a known issue, tracked in bug 1734020.

This is essentially just an instance of that bug, and I don't think we need to take any action here. (Nothing particularly bad will happen / no further test-failures will occur here and cause orange on TreeHerder, except as a possible one-off, for commits that make further modifications to this test, which will trigger another Test-Verify run & might hit this again.)

Depends on: tv-chaosmode-timeout-wpt

Connor Pearson

Comment 8

•

2 years ago

Thanks for the explanation!

Daniel Holbert [:dholbert]

Updated

•

2 years ago

Status: NEW → RESOLVED

Closed: 2 years ago

No longer depends on: tv-chaosmode-timeout-wpt

Resolution: --- → DUPLICATE

Bugzilla

Intermittent TVw ::: Test verification FAIL | TinderboxPrint: Per-test run of .../outline-valid-mandatory.html<br/>: FAILURE

Categories

(Core :: Layout, defect, P5)

Tracking

()

People

(Reporter: intermittent-bug-filer, Unassigned)

References

(Regression)

Details

(Keywords: intermittent-failure, regression)

Crash Data

Security

(public)

User Story

Description

Comment 1

Comment 2

Comment 3

Comment 4

Comment 5

Comment 6

Comment 7

Comment 8

Updated