Intermittent browser/components/shopping/tests/browser/browser_ui_telemetry.js | single tracking bug
Categories
(Firefox :: Shopping, defect, P5)
People
(Reporter: intermittent-bug-filer, Assigned: aminomancer)
References
Details
(Keywords: intermittent-failure, intermittent-testcase, Whiteboard: [stockwell disable-recommended])
Attachments
(2 files)
Filed by: csabou [at] mozilla.com
Parsed log: https://treeherder.mozilla.org/logviewer?job_id=430438837&repo=autoland
Full log: https://firefox-ci-tc.services.mozilla.com/api/queue/v1/task/dHZcYHtLSPamKHNd-qI5hA/runs/0/artifacts/public/logs/live_backing.log
[task 2023-09-27T04:30:15.010Z] 04:30:15 INFO - TEST-PASS | browser/components/shopping/tests/browser/browser_ui_telemetry.js | {"category":"shopping","extra":{"source":"addressBarIcon"},"name":"surface_closed"} deepEqual {"category":"shopping","name":"surface_closed","extra":{"source":"addressBarIcon"}} -
[task 2023-09-27T04:30:15.013Z] 04:30:15 INFO - Leaving test bound test_close_telemetry_recorded
[task 2023-09-27T04:30:15.013Z] 04:30:15 INFO - Buffered messages finished
[task 2023-09-27T04:30:15.013Z] 04:30:15 INFO - TEST-UNEXPECTED-FAIL | browser/components/shopping/tests/browser/browser_ui_telemetry.js | This test exceeded the timeout threshold. It should be rewritten or split up. If that's not possible, use requestLongerTimeout(N), but only as a last resort. -
[task 2023-09-27T04:30:15.013Z] 04:30:15 INFO - GECKO(9670) | MEMORY STAT | vsize 3148MB | residentFast 367MB | heapAllocated 219MB
[task 2023-09-27T04:30:15.014Z] 04:30:15 INFO - TEST-OK | browser/components/shopping/tests/browser/browser_ui_telemetry.js | took 47695ms
[task 2023-09-27T04:30:15.015Z] 04:30:15 INFO - checking window state
[task 2023-09-27T04:30:15.015Z] 04:30:15 INFO - TEST-START | browser/components/shopping/tests/browser/browser_unanalyzed_product.js
Comment 2•1 year ago
Update
There have been 37 failures within the last 7 days:
- 10 failures on Linux 18.04 x64 WebRender debug/opt
- 6 failures on Linux 18.04 x64 WebRender Shippable opt
- 3 failures on Linux 18.04 x64 WebRender tsan opt
- 6 failures on OS X 10.15 WebRender debug/opt
- 6 failures on Windows 11 x86 22H2 WebRender debug/opt
- 6 failures on Windows 11 x64 22H2 WebRender debug/opt
Recent log: https://treeherder.mozilla.org/logviewer?job_id=431408743&repo=autoland&lineNumber=14279
Jared, can you assign this to someone?
Thank you.
Comment 21•8 months ago
Hi Jared! Can you please take a look at this? I think the recent spike in failures here is caused by the changes from Bug 1868602. The following new failure line appeared after that bug landed:
TEST-UNEXPECTED-FAIL | browser/components/shopping/tests/browser/browser_ui_telemetry.js | Uncaught exception in test bound test_shopping_sidebar_displayed - at chrome://mochitests/content/browser/browser/components/shopping/tests/browser/browser_ui_telemetry.js:197 - TypeError: can't access property "length", tabSwitchEvents is null
Thank you!
Comment 25•8 months ago
This is failing frequently on Linux runs. At the current failure rate, it should be fixed or it will be disabled on Linux. https://treeherder.mozilla.org/intermittent-failures/bugdetails?startday=2024-01-01&endday=2024-01-16&tree=trunk&failurehash=all&bug=1855360
Recent failure log: https://treeherder.mozilla.org/logviewer?job_id=443543302&repo=autoland
Comment 28•8 months ago
Comment 29•8 months ago
Comment 30•8 months ago
Comment 31•8 months ago
bugherder |
Comment 43•3 months ago (Assignee)
If you run the whole shopping browser test manifest locally and watch it as it runs, you may see a point where browser_ui_telemetry.js hangs for 30+ seconds, and you may get the "exceeds the timeout threshold" failure at the end. I'm not sure if it's consistently reproducible on every platform, but it is 100% consistent for me on Windows 10.
I encountered this before with a different test, and my fix was simply to add Services.fog.testResetFOG() after every subtest. I don't know exactly why that worked. My intuition at the time was that some kind of data was accumulating, and once it crossed a certain threshold, finally calling Services.fog.testResetFOG() would cause a hang as it stumbled over something. Whereas if you call that method often enough, it never crosses that threshold, so it never hangs. It's not simply that you're replacing one long hang with 15 small hangs - the total duration of the test was greatly reduced by placing these calls all over the place. And I also found that removing tests didn't alter the duration, until I removed a certain number, at which point the hang completely disappeared. So there seems to be a certain threshold where you go from no hang to a really long one.
This particular test already has very frequent Services.fog.testResetFOG() calls. So it doesn't actually fail individually. It has to be run in combination with other tests for the hang and the timeout threshold failure to happen. I think some other tests in this manifest are accumulating this unknown FOG data, and then browser_ui_telemetry.js is the test that's hanging because it's the first one in a long time to call testResetFOG(). That was exactly what happened with my previous test, except that it happened in the context of one test file. All the earlier subtests didn't hang, because they didn't call testResetFOG. It was only the first testResetFOG call that ran into problems.
So I think this failure can be avoided by adding testResetFOG to one or some of the tests in this manifest that come between browser_shopping_onboarding.js and browser_ui_telemetry.js (since those 2 already call it). The placement would be kind of arbitrary, since it would not be necessary for any actual assertions. But if it's causing 30-second hangs for me, I imagine that adds up considerably on CI, so I suspect this is worthwhile, even if it's a bit hacky.
ni?chutten because I think we discussed this briefly several months ago, and I seem to recall you having more information about the underlying cause of the issue. If that can be fixed in a more direct manner, it'd probably be preferable to adding testResetFOG for reasons other than its intended purpose. Thanks!
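The workaround described in this comment (resetting FOG after each subtest rather than once after a long stretch) could be sketched roughly as below. This is an illustration only, not the patch that landed: the helper name runSubtestWithFOGReset is hypothetical, and the services parameter is an assumption standing in for the Services global that mochitests provide. The only real API relied on is Services.fog.testResetFOG(), which the comment names.

```javascript
// Hypothetical helper (not from the actual patch): runs one subtest and then
// resets FOG, whether the subtest passed or threw.
// `services` is assumed to be the mochitest `Services` global, or any object
// exposing `fog.testResetFOG()`.
async function runSubtestWithFOGReset(services, subtest) {
  try {
    await subtest();
  } finally {
    // Per the hypothesis in this comment, resetting often keeps whatever
    // data accumulates from reaching the point where one reset hangs.
    services.fog.testResetFOG();
  }
}

// Possible use inside a browser mochitest (assumption, shown as a comment
// because `add_task` and `Services` only exist in the test harness):
//
//   add_task(() =>
//     runSubtestWithFOGReset(Services, async () => {
//       /* subtest body and assertions */
//     })
//   );
```

Taking the services object as a parameter is just a convenience so the helper can be exercised outside the harness; in a real test file one would more likely call Services.fog.testResetFOG() directly at the end of each task.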
Comment 44•3 months ago
Yeah, that was bug 1833453 which I couldn't prioritize at the time. It was down to there being so many pending pings on disk, and disk not being as rapid as we might like. Might be the same here, not sure.
Comment 46•3 months ago
@Shane, it's perma on win/mac, after bug 1900486 landed.
Comment 47•3 months ago (Assignee)
Thanks, that makes sense. I was running the tests after bug 1900486. So it's my regression, presumably due to the extra test I added. Seems like the fix will still be the same.
Comment 51•3 months ago (Assignee)
Comment 58•3 months ago
Comment 60•2 months ago
bugherder |
Comment 69•2 months ago
Comment 70•2 months ago
For posterity, try push.
Comment 72•2 months ago
bugherder |
Comment 73•2 months ago
The patch landed in nightly and beta is affected.
:CosminS, is this bug important enough to require an uplift?
- If yes, please nominate the patch for beta approval.
- If no, please set status-firefox129 to wontfix.
For more information, please visit BugBot documentation.
Comment 76•2 months ago
Shane, if you think the fix here needs to get into beta, where it is permafailing, please coordinate with dmeehan.
Comment 78•2 months ago (Assignee)
Is permafailing bad enough to justify an uplift? I would normally not uplift non-user-facing issues, but I'd defer to your judgment on CI matters.
Comment 80•2 months ago
(In reply to Shane Hughes [:aminomancer] from comment #78)
> Is permafailing bad enough to justify an uplift? I would normally not uplift non-user-facing issues, but I'd defer to your judgment on CI matters.
In general, we aim to minimize test failures in all branches when it makes sense.
This is a test-only change, so I can push it without an uplift request. I'll take it in my next push to beta and esr128.
Comment 81•2 months ago
uplift |
Comment 82•2 months ago
Donal, the fix is in https://bugzilla.mozilla.org/show_bug.cgi?id=1855360#c60. I'll let you decide if that needs an uplift request. Comment 69 just enables the test after the fix. Thank you.
Comment 83•2 months ago
uplift |