Closed Bug 1851896 Opened 1 year ago Closed 1 year ago

Intermittent [TV] ::: Test verification FAIL | TinderboxPrint: Per-test run of .../text-input-vertical-overflow-no-scroll.html<br/>: FAILURE

Categories

(Testing :: web-platform-tests, defect, P5)

Tracking

(firefox-esr102 unaffected, firefox-esr115 unaffected, firefox117 unaffected, firefox118 unaffected, firefox119 fixed)

RESOLVED FIXED
119 Branch

People

(Reporter: intermittent-bug-filer, Unassigned)

Details

(Keywords: intermittent-failure, regression, test-verify-fail)

Filed by: nbeleuzu [at] mozilla.com
Parsed log: https://treeherder.mozilla.org/logviewer?job_id=428171091&repo=autoland
Full log: https://firefox-ci-tc.services.mozilla.com/api/queue/v1/task/cYsaFoR1QpKycqoKpgA0CA/runs/0/artifacts/public/logs/live_backing.log
Reftest URL: https://hg.mozilla.org/mozilla-central/raw-file/tip/layout/tools/reftest/reftest-analyzer.xhtml#logurl=https://firefox-ci-tc.services.mozilla.com/api/queue/v1/task/cYsaFoR1QpKycqoKpgA0CA/runs/0/artifacts/public/logs/live_backing.log&only_show_unexpected=1


[task 2023-09-06T18:52:41.462Z] 18:52:41     INFO - | `/css/css-writing-modes/forms/text-input-vertical-overflow-no-scroll.html` | `input[type=number] in sideways-rl: typing characters in input should not cause the page to scroll`   | **FAIL: 8/10, PASS: 2/10** | `assert_equals: Typing lots of characters in input did not cause scrolling expected 177 but got 596` |
[task 2023-09-06T18:52:41.462Z] 18:52:41     INFO - 
[task 2023-09-06T18:52:41.463Z] 18:52:41     INFO - ::: Running tests in a loop 10 times : FAIL
[task 2023-09-06T18:52:41.463Z] 18:52:41     INFO - :::
[task 2023-09-06T18:52:41.463Z] 18:52:41    ERROR - ::: Test verification FAIL
[task 2023-09-06T18:52:41.463Z] 18:52:41     INFO - :::
[task 2023-09-06T18:52:41.948Z] 18:52:41     INFO - Return code: 1
[task 2023-09-06T18:52:41.948Z] 18:52:41  WARNING - setting return code to 2
[task 2023-09-06T18:52:41.949Z] 18:52:41    ERROR - TinderboxPrint: Per-test run of .../text-input-vertical-overflow-no-scroll.html<br/>: FAILURE
[task 2023-09-06T18:52:41.949Z] 18:52:41     INFO - Running post-action listener: _package_coverage_data
[task 2023-09-06T18:52:41.949Z] 18:52:41     INFO - Running post-action listener: _resource_record_post_action
[task 2023-09-06T18:52:41.950Z] 18:52:41     INFO - Running post-action listener: process_java_coverage_data
[task 2023-09-06T18:52:41.950Z] 18:52:41     INFO - Running post-action listener: stop_device
[task 2023-09-06T18:52:41.950Z] 18:52:41     INFO - [mozharness: 2023-09-06 18:52:41.950109Z] Finished run-tests step (success)
[task 2023-09-06T18:52:41.950Z] 18:52:41     INFO - Running post-run listener: _resource_record_post_run
[task 2023-09-06T18:52:42.037Z] 18:52:42     INFO - Total resource usage - Wall time: 45s; CPU: 46%; Read bytes: 0; Write bytes: 879271936; Read time: 0; Write time: 325152
[task 2023-09-06T18:52:42.037Z] 18:52:42     INFO - TinderboxPrint: CPU usage<br/>45.9%
[task 2023-09-06T18:52:42.037Z] 18:52:42     INFO - TinderboxPrint: I/O read bytes / time<br/>0 / 0
[task 2023-09-06T18:52:42.037Z] 18:52:42     INFO - TinderboxPrint: I/O write bytes / time<br/>879,271,936 / 325,152
[task 2023-09-06T18:52:42.037Z] 18:52:42     INFO - TinderboxPrint: CPU idle<br/>97.1 (54.3%)
[task 2023-09-06T18:52:42.037Z] 18:52:42     INFO - TinderboxPrint: CPU system<br/>7.7 (4.3%)
[task 2023-09-06T18:52:42.038Z] 18:52:42     INFO - TinderboxPrint: CPU user<br/>73.7 (41.2%)
[task 2023-09-06T18:52:42.038Z] 18:52:42     INFO - TinderboxPrint: Swap in / out<br/>0 / 0
[task 2023-09-06T18:52:42.038Z] 18:52:42     INFO - pull - Wall time: 0s; CPU: Can't collect data; Read bytes: 0; Write bytes: 0; Read time: 0; Write time: 0
[task 2023-09-06T18:52:42.038Z] 18:52:42     INFO - start-emulator - Wall time: 0s; CPU: Can't collect data; Read bytes: 0; Write bytes: 0; Read time: 0; Write time: 0
[task 2023-09-06T18:52:42.038Z] 18:52:42     INFO - verify-device - Wall time: 0s; CPU: Can't collect data; Read bytes: 0; Write bytes: 0; Read time: 0; Write time: 0
[task 2023-09-06T18:52:42.038Z] 18:52:42     INFO - install - Wall time: 12s; CPU: 25%; Read bytes: 0; Write bytes: 23830528; Read time: 0; Write time: 1480
[task 2023-09-06T18:52:42.038Z] 18:52:42     INFO - run-tests - Wall time: 34s; CPU: 53%; Read bytes: 0; Write bytes: 855441408; Read time: 0; Write time: 323672
[task 2023-09-06T18:52:42.046Z] 18:52:42  WARNING - returning nonzero exit status 2
Keywords: regression
Regressed by: 1802466

Set release status flags based on info from the regressing bug 1802466

:dholbert, since you are the author of the regressor, bug 1802466, could you take a look?

Odd. I can't reproduce this locally yet, FWIW. Strictly speaking, this isn't a "regression", since the test was already flaky and my patch made it not-flaky (and this Test-Verify output seems to be saying "no, it's still flaky").

Having said that, a few observations:

  • In the linked parsed log, some of the failures look like bug 1851066 (expected 177 but got 176); but others are more substantial (e.g. expected 283 but got 596)
  • I slightly wonder whether test-verify tasks really run with up-to-date code, or whether they run some other snapshot. If this one ran against a slightly old build from before the actual Gecko fix, that would explain the failure. The linked log has this at the top:
Image 'public/image.tar.zst' from task 'SsRaYS2QT-CLJaZ8VWd3fA' loaded.  Using image ID sha256:eac10505c38e624da8f74c8e6119059426a6742b8d756b87995d3e7c548e53f5.

I'm not sure what build that corresponds to. I don't immediately see either the SsRaYS2QT-CLJaZ8VWd3fA or the eac10505c38e624da8f74c8e6119059426a6742b8d756b87995d3e7c548e53f5 string in the actual corresponding build log from that same push, which makes me suspicious that maybe we're testing an old build.

...oh. Looking closer, the push that this TV run is on is the push where my patch was initially backed out. So... yes, that reintroduced our flakiness in the test that my patch was fixing, and it brought back some TV-exposed intermittent failures.

So: this bug is in fact fixed by bug 1802466, not caused by bug 1802466. The test-verify failures here were presumably present before my initial landing in bug 1802466 (though that's non-obvious since TV only runs when a test gets touched); they came back when that initial landing was backed out (which counted as touching this test & hence kicked off some TV runs); and they should be gone again now that bug 1802466 is back in.

Here's a Try run to double-check (green aside from one orange just from the test running long in chaos mode):
https://treeherder.mozilla.org/jobs?repo=try&revision=c1b042f0581d9e80171b918c12ad1a6bebb93072

Resolving this as FIXED-by bug 1802466 rather than regressed-by bug 1802466.
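
For anyone triaging similar logs: the distinction drawn above between off-by-one failures (expected 177 but got 176) and the more substantial ones (expected 283 but got 596) can be sketched roughly as below. This is only an illustration, not part of mozharness or wptrunner; the helper name `classify_mismatch` and the `tolerance` parameter are hypothetical, and the regex assumes the "expected N but got M" wording seen in the assert_equals messages in this bug.

```python
import re

# Hypothetical helper for illustration; assumes the "expected N but got M"
# wording that appears in the assert_equals messages quoted in this bug.
ASSERT_RE = re.compile(r"expected (-?\d+) but got (-?\d+)")


def classify_mismatch(message, tolerance=1):
    """Return (delta, label) for an assert_equals message, or None if it
    does not contain an "expected N but got M" pair."""
    match = ASSERT_RE.search(message)
    if match is None:
        return None
    expected, actual = int(match.group(1)), int(match.group(2))
    delta = actual - expected
    # A delta within the tolerance looks like the off-by-one case;
    # anything larger is treated as a substantial mismatch.
    label = "near-miss" if abs(delta) <= tolerance else "substantial"
    return delta, label


if __name__ == "__main__":
    for msg in (
        "expected 177 but got 176",
        "expected 283 but got 596",
    ):
        print(msg, "->", classify_mismatch(msg))
```

The default tolerance of 1 is an arbitrary choice made to match the off-by-one case mentioned above; it is not a threshold the test harness itself uses.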

Status: NEW → RESOLVED
Closed: 1 year ago
Depends on: 1802466
Flags: needinfo?(dholbert)
No longer regressed by: 1802466
Resolution: --- → FIXED
Target Milestone: --- → 119 Branch