Intermittent browser/components/shopping/tests/browser/browser_ui_telemetry.js | single tracking bug
Categories
(Firefox Graveyard :: Shopping, defect, P5)
Tracking
(firefox129 fixed, firefox130 fixed)
People
(Reporter: intermittent-bug-filer, Assigned: aminomancer)
References
Details
(Keywords: intermittent-failure, intermittent-testcase, Whiteboard: [stockwell disable-recommended])
Attachments
(2 files)
Filed by: csabou [at] mozilla.com
Parsed log: https://treeherder.mozilla.org/logviewer?job_id=430438837&repo=autoland
Full log: https://firefox-ci-tc.services.mozilla.com/api/queue/v1/task/dHZcYHtLSPamKHNd-qI5hA/runs/0/artifacts/public/logs/live_backing.log
[task 2023-09-27T04:30:15.010Z] 04:30:15 INFO - TEST-PASS | browser/components/shopping/tests/browser/browser_ui_telemetry.js | {"category":"shopping","extra":{"source":"addressBarIcon"},"name":"surface_closed"} deepEqual {"category":"shopping","name":"surface_closed","extra":{"source":"addressBarIcon"}} -
[task 2023-09-27T04:30:15.013Z] 04:30:15 INFO - Leaving test bound test_close_telemetry_recorded
[task 2023-09-27T04:30:15.013Z] 04:30:15 INFO - Buffered messages finished
[task 2023-09-27T04:30:15.013Z] 04:30:15 INFO - TEST-UNEXPECTED-FAIL | browser/components/shopping/tests/browser/browser_ui_telemetry.js | This test exceeded the timeout threshold. It should be rewritten or split up. If that's not possible, use requestLongerTimeout(N), but only as a last resort. -
[task 2023-09-27T04:30:15.013Z] 04:30:15 INFO - GECKO(9670) | MEMORY STAT | vsize 3148MB | residentFast 367MB | heapAllocated 219MB
[task 2023-09-27T04:30:15.014Z] 04:30:15 INFO - TEST-OK | browser/components/shopping/tests/browser/browser_ui_telemetry.js | took 47695ms
[task 2023-09-27T04:30:15.015Z] 04:30:15 INFO - checking window state
[task 2023-09-27T04:30:15.015Z] 04:30:15 INFO - TEST-START | browser/components/shopping/tests/browser/browser_unanalyzed_product.js
| Comment hidden (Intermittent Failures Robot) |
Comment 2•2 years ago
Update
There have been 37 failures within the last 7 days:
- 10 failures on Linux 18.04 x64 WebRender debug/opt
- 6 failures on Linux 18.04 x64 WebRender Shippable opt
- 3 failures on Linux 18.04 x64 WebRender tsan opt
- 6 failures on OS X 10.15 WebRender debug/opt
- 6 failures on Windows 11 x86 22H2 WebRender debug/opt
- 6 failures on Windows 11 x64 22H2 WebRender debug/opt
Recent log: https://treeherder.mozilla.org/logviewer?job_id=431408743&repo=autoland&lineNumber=14279
Jared, can you assign this to someone?
Thank you.
Comment 21•2 years ago
Hi Jared! Can you please take a look at this? I think the recent spike in failures here is caused by the changes from Bug 1868602. The new failure line TEST-UNEXPECTED-FAIL | browser/components/shopping/tests/browser/browser_ui_telemetry.js | Uncaught exception in test bound test_shopping_sidebar_displayed - at chrome://mochitests/content/browser/browser/components/shopping/tests/browser/browser_ui_telemetry.js:197 - TypeError: can't access property "length", tabSwitchEvents is null appeared after that bug landed.
Thank you!
Comment 25•2 years ago
This is failing frequently on Linux runs. At the rate it is failing, it should be fixed soon or it will be disabled on Linux. https://treeherder.mozilla.org/intermittent-failures/bugdetails?startday=2024-01-01&endday=2024-01-16&tree=trunk&failurehash=all&bug=1855360
Recent failure log: https://treeherder.mozilla.org/logviewer?job_id=443543302&repo=autoland
Comment 28•2 years ago
Comment 29•2 years ago
Comment 30•2 years ago
Comment 31•2 years ago
| bugherder | ||
| Assignee | ||
Comment 43•1 year ago
If you run the whole shopping browser test manifest locally and watch it as it runs, you may see a point where browser_ui_telemetry.js hangs for 30+ seconds, and you may get the "exceeded the timeout threshold" failure at the end. I'm not sure if it's consistently reproducible on every platform, but it is 100% consistent for me on Windows 10.
I encountered this before with a different test, and my fix was simply to add Services.fog.testResetFOG() after every subtest. I don't know exactly why that worked. My intuition at the time was that some kind of data was accumulating, and once it crossed a certain threshold, finally calling Services.fog.testResetFOG() would cause a hang as it stumbled over something. Whereas if you call that method often enough, it never crosses that threshold, so it never hangs. It's not simply that you're replacing one long hang with 15 small hangs - the total duration of the test was greatly reduced by placing these calls all over the place. And I also found that removing tests didn't alter the duration, until I removed a certain number, at which point the hang completely disappeared. So there seems to be a certain threshold where you go from no hang to a really long one.
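As a rough illustration of the "reset after every subtest" pattern described above, here is a minimal sketch. The helper name withFogReset and its services parameter are hypothetical and are not part of the mochitest harness; the only real API assumed is Services.fog.testResetFOG(), and the sketch also assumes Services is reachable as a global in the test scope.

```javascript
// Hypothetical helper, not part of the mochitest harness: wrap a subtest so
// that FOG is reset as soon as the subtest finishes, whether it passes or throws.
// `services` defaults to the global `Services` object available in Firefox
// chrome code; it is a parameter only so the sketch can run elsewhere.
function withFogReset(subtest, services = globalThis.Services) {
  async function wrapped() {
    try {
      await subtest();
    } finally {
      // Reset after every subtest so that whatever is accumulating is cleared often.
      services.fog.testResetFOG();
    }
  }
  // Keep the subtest's name so log lines that print the task name may still identify it.
  Object.defineProperty(wrapped, "name", { value: subtest.name });
  return wrapped;
}

// Inside a browser mochitest, usage might look like this (commented out
// because add_task only exists inside the harness):
//
// add_task(withFogReset(async function test_close_telemetry_recorded() {
//   /* subtest body */
// }));
```

Using try/finally means the reset still happens when a subtest throws, so one failing subtest does not leave data behind for the next one.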
This particular test already has very frequent Services.fog.testResetFOG() calls. So it doesn't actually fail individually. It has to be run in combination with other tests for the hang and the timeout threshold failure to happen. I think some other tests in this manifest are accumulating this unknown FOG data, and then browser_ui_telemetry.js is the test that's hanging because it's the first one in a long time to call testResetFOG(). That was exactly what happened with my previous test, except that it happened in the context of one test file. All the earlier subtests didn't hang, because they didn't call testResetFOG. It was only the first testResetFOG call that ran into problems.
So I think this failure can be avoided by adding testResetFOG to one or some of the tests in this manifest that come between browser_shopping_onboarding.js and browser_ui_telemetry.js (since those 2 already call it). The placement would be kind of arbitrary, since it would not be necessary for any actual assertions. But if it's causing 30-second hangs for me, I imagine that adds up considerably on CI, so I suspect this is worthwhile, even if it's a bit hacky.
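One way the proposed workaround could look in an intermediate test file is sketched below. This is only an assumption about the shape of such a change, not the patch that landed. The helper scheduleFogReset is hypothetical; registerCleanupFunction and Services.fog.testResetFOG() are the real harness and FOG calls it is meant to be used with.

```javascript
// Hypothetical helper: schedule a FOG reset for the end of the current test file.
// `registerCleanup` stands in for the harness's registerCleanupFunction, and
// `services` for the global Services object; both are parameters only so the
// sketch can run outside Firefox.
function scheduleFogReset(registerCleanup, services) {
  registerCleanup(() => {
    // Not needed for any assertion; it only clears accumulated FOG data
    // before later tests in the manifest run.
    services.fog.testResetFOG();
  });
}

// Inside one of the test files between browser_shopping_onboarding.js and
// browser_ui_telemetry.js, the call might be:
//
// scheduleFogReset(registerCleanupFunction, Services);
```

Putting the reset in a cleanup function keeps it out of the subtests themselves, which makes it clearer that it is a workaround rather than part of what the test verifies.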
ni?chutten because I think we discussed this briefly several months ago, and I seem to recall you having more information about the underlying cause of the issue. If that can be fixed in a more direct manner, it'd probably be preferable to adding testResetFOG for reasons other than its intended purpose. Thanks!
Comment 44•1 year ago
Yeah, that was bug 1833453 which I couldn't prioritize at the time. It was down to there being so many pending pings on disk, and disk not being as rapid as we might like. Might be the same here, not sure.
Comment 46•1 year ago
@Shane, it's perma on win/mac, after bug 1900486 landed.
| Assignee | ||
Comment 47•1 year ago
Thanks, that makes sense. I was running the tests after bug 1900486. So it's my regression, presumably due to the extra test I added. Seems like the fix will still be the same.
| Assignee | ||
Comment 51•1 year ago
Comment 58•1 year ago
Comment 60•1 year ago
| bugherder | ||
Comment 69•1 year ago
Comment 70•1 year ago
For posterity, try push.
Comment 72•1 year ago
| bugherder | ||
Comment 73•1 year ago
The patch landed in nightly and beta is affected.
:CosminS, is this bug important enough to require an uplift?
- If yes, please nominate the patch for beta approval.
- If no, please set status-firefox129 to wontfix.
For more information, please visit BugBot documentation.
Comment 76•1 year ago
Shane, if you think the fix here needs to get into beta, where it is permafailing, please coordinate with dmeehan.
| Assignee | ||
Comment 78•1 year ago
Is permafailing bad enough to justify an uplift? I would normally not uplift non-user-facing issues, but I'd defer to your judgment on CI matters.
Comment 80•1 year ago
(In reply to Shane Hughes [:aminomancer] from comment #78)
> Is permafailing bad enough to justify an uplift? I would normally not uplift non-user-facing issues, but I'd defer to your judgment on CI matters.
In general, we aim to minimize test failures in all branches when it makes sense.
This is a test-only change, so I can push it without an uplift request. I'll take it in my next push to beta and esr128.
Comment 81•1 year ago
| uplift | ||
Comment 82•1 year ago
Donal, the fix is in https://bugzilla.mozilla.org/show_bug.cgi?id=1855360#c60. I'll let you decide if that needs an uplift request. Comment 69 just enables the test after the fix. Thank you.
Comment 83•1 year ago
| uplift | ||