Open Bug 1776962 Opened 2 years ago Updated 2 days ago

Intermittent dom/workers/test/test_sharedworker_event_listener_leaks.html | single tracking bug

Categories

(Core :: DOM: Workers, defect, P3)

Tracking

REOPENED
Tracking Status
firefox-esr91 --- unaffected
firefox-esr102 --- unaffected
firefox102 --- unaffected
firefox103 --- unaffected
firefox104 --- affected

People

(Reporter: jmaher, Assigned: asuth)

References

(Regression)

Details

(Keywords: intermittent-failure, intermittent-testcase, regression, Whiteboard: [retriggered])

Attachments

(1 file)

No description provided.

Additional information about this bug's failures and frequency patterns can be found by running: ./mach test-info failure-report --bug 1776962

Set release status flags based on info from the regressing bug 1685627

(In reply to Natalia Csoregi [:nataliaCs] from comment #2)

First occurrence: https://treeherder.mozilla.org/jobs?repo=autoland&searchStr=Linux%2C18.04%2Cx64%2CWebRender%2Cdebug%2CMochitests%2Cwith%2Ccross-origin%2Cand%2Cfission%2Cenabled%2Ctest-linux1804-64-qr%2Fdebug-mochitest-plain-xorig%2C2&tochange=910652b5335034e44b15635289f56c63f6b0a6c6&fromchange=5956f91b3975ff848ab22b5b2abbd524422ede6d&group_state=expanded&selectedTaskRun=X6qtJ8TQSkOTCsWoST30gw.0

Luca, can you take a look please?

Sure thing,
The patch that landed from Bug 1685627 should basically be a no-op for web content workers and should only affect workers created from a moz-extension URL worker script.

But I'll dig a bit more to gather details that either confirm it is the regressing change or rule it out with certainty; it's possible that I'm just not seeing the correlation between the two yet.

I think the regression finding may have stopped a little early on the test jobs, as it doesn't look like the patch could have introduced the problem[1]. After looking at this with :rpl, I think at least one obvious thing we can do to improve the situation is to automatically close the SharedWorker binding in DOMEventTargetHelper::DisconnectFromOwner, which will drop the mWindow strong ref; that ref is cycle-collected, but it doesn't need to wait that long.

1: That is, I think if tests had been run on jobs older than https://treeherder.mozilla.org/jobs?repo=autoland&searchStr=Linux%2C18.04%2Cx64%2CWebRender%2Cdebug%2CMochitests%2Cwith%2Ccross-origin%2Cand%2Cfission%2Cenabled%2Ctest-linux1804-64-qr%2Fdebug-mochitest-plain-xorig%2C2&tochange=910652b5335034e44b15635289f56c63f6b0a6c6&fromchange=5956f91b3975ff848ab22b5b2abbd524422ede6d&group_state=expanded&selectedTaskRun=X6qtJ8TQSkOTCsWoST30gw.0 we would find that the test failures continued to happen.

Assignee: nobody → bugmail
Status: NEW → ASSIGNED
Flags: needinfo?(lgreco)
Severity: normal → S3

SharedWorker can be a better lifecycle participant by closing itself
promptly when its global disconnects its DETHs. This avoids the need
to wait for CC.

Pushed by bugmail@asutherland.org:
https://hg.mozilla.org/integration/autoland/rev/508fcd4fae2f
Close SharedWorker on DETH disconnect. r=dom-worker-reviewers,smaug
Status: ASSIGNED → RESOLVED
Closed: 8 months ago
Resolution: --- → FIXED
Target Milestone: --- → 121 Branch

This is still happening, at least with this failure line: TEST-UNEXPECTED-FAIL | dom/workers/test/test_sharedworker_event_listener_leaks.html | iframe content window should be garbage collected - SharedWorker default

Hi Andrew! Can you please take another look at this?
Thank you!

Status: RESOLVED → REOPENED
Flags: needinfo?(bugmail)
Resolution: FIXED → ---
Target Milestone: 121 Branch → ---

Okay, so I think the problem here is systemic and not specific to SharedWorkers; looking at the search for symbol:#checkForEventListenerLeaks, every single test using it has had intermittent failures in the last 30 days. Unfortunately, nothing is jumping out at me immediately. Something like the AbortController test_event_listener_leaks.html test is potentially a better test to diagnose with, since it does not involve IPC.

I'll ask around if anyone has any ideas.

Flags: needinfo?(bugmail)

It looks like patch D213816 is making this permafail on try. I'm not sure if it is exactly the same thing that appears to leak, but it sounds likely that deferring the destruction of the PresShell might keep alive some pointer that would otherwise have been unlinked inside mPresShell->Destroy().

See Also: → 1902728

Does it make all the other leak tests permafail too, or just SharedWorker? (I see it's a "try auto" run, so my question is whether you explicitly ran the other tests too, assuming they weren't picked by try auto.)
