Intermittent TEST-UNEXPECTED-TIMEOUT /webmessaging/broadcastchannel/blobs.html | single tracking bug
Categories
(Core :: DOM: Core & HTML, defect, P2)
Tracking
()
Tracking | Status | |
---|---|---|
firefox-esr115 | --- | unaffected |
firefox120 | --- | unaffected |
firefox121 | --- | fixed |
firefox122 | --- | fixed |
People
(Reporter: intermittent-bug-filer, Assigned: edenchuang)
References
(Regression)
Details
(Keywords: intermittent-failure, intermittent-testcase, regression)
Attachments
(2 files)
48 bytes,
text/x-phabricator-request
|
dmeehan
:
approval-mozilla-beta+
|
Details | Review |
48 bytes,
text/x-phabricator-request
|
dmeehan
:
approval-mozilla-beta+
|
Details | Review |
Filed by: csabou [at] mozilla.com
Parsed log: https://treeherder.mozilla.org/logviewer?job_id=436999716&repo=mozilla-central
Full log: https://firefox-ci-tc.services.mozilla.com/api/queue/v1/task/Dox1L-eZSjauZqHGuXIlIw/runs/0/artifacts/public/logs/live_backing.log
[task 2023-11-21T07:34:32.898Z] 07:34:32 INFO - TEST-START | /wasm/jsapi/memory/grow.any.worker.html
[task 2023-11-21T07:34:32.911Z] 07:34:32 INFO - Closing window 0bac6d09-aa82-4662-ae54-9a6976be77f6
[task 2023-11-21T07:34:32.996Z] 07:34:32 INFO -
[task 2023-11-21T07:34:32.996Z] 07:34:32 INFO - TEST-PASS | /webmessaging/broadcastchannel/blobs.html | Blobs work on BroadcastChannel
[task 2023-11-21T07:34:32.996Z] 07:34:32 INFO - TEST-UNEXPECTED-TIMEOUT | /webmessaging/broadcastchannel/blobs.html | Blobs work with workers on BroadcastChannel - Test timed out
[task 2023-11-21T07:34:32.996Z] 07:34:32 INFO - TEST-UNEXPECTED-TIMEOUT | /webmessaging/broadcastchannel/blobs.html | expected OK
[task 2023-11-21T07:34:32.996Z] 07:34:32 INFO - TEST-INFO took 10678ms
[task 2023-11-21T07:34:32.999Z] 07:34:32 INFO - PID 7080 | 1700552072998 Marionette INFO Stopped listening on port 49243
[task 2023-11-21T07:34:33.051Z] 07:34:33 INFO - ...................
[task 2023-11-21T07:34:33.051Z] 07:34:33 INFO - TEST-OK | /wasm/jsapi/memory/grow.any.worker.html | took 149ms
[task 2023-11-21T07:34:33.051Z] 07:34:33 INFO - TEST-START | /wasm/jsapi/memory/grow.any.html
Comment 1•8 months ago
|
||
Still failing after changes in https://bugzilla.mozilla.org/show_bug.cgi?id=1864459#c8 reached central.
Assignee | ||
Comment 2•8 months ago
|
||
Found an edge case that the Worker is GCed immediately.
Worker creation and WorkerThreadPrimaryRunnable dispatching.
0:13.86 pid:15848 [Child 15856: Main Thread]: D/WorkerPrivate WorkerPrivate::WorkerPrivate [1286cab00]
0:13.86 pid:15848 [Child 15856: Main Thread]: D/WorkerPrivate WorkerPrivate::Start [1286cab00]
0:13.86 pid:15848 [Child 15856: Main Thread]: V/WorkerRunnable WorkerRunnable::DispatchInternal [1057ea9d0]
0:13.86 pid:15848 [Child 15856: Main Thread]: V/WorkerPrivate WorkerPrivate::DispatchLockHeld [1286cab00] runnable: 1057ea9d0
0:13.86 pid:15848 [Child 15856: Main Thread]: V/WorkerPrivate WorkerPrivate::DispatchLockHeld [1286cab00] runnable 1057ea9d0 is queued in mPreStartRunnables
worker.postMessage in the test. The WorkerRunnable would be saved in mPreStartRunnables.
0:13.86 pid:15848 postMessage channel to worker
0:13.86 pid:15848 [Child 15856: Main Thread]: V/WorkerRunnable WorkerRunnable::DispatchInternal [127922980]
0:13.87 pid:15848 [Child 15856: Main Thread]: V/WorkerPrivate WorkerPrivate::DispatchLockHeld [1286cab00] runnable: 127922980
0:13.87 pid:15848 [Child 15856: Main Thread]: V/WorkerPrivate WorkerPrivate::DispatchLockHeld [1286cab00] runnable 127922980 is queued in mPreStartRunnables
0:13.87 pid:15848 postMessage blob to worker
0:13.87 pid:15848 [Child 15856: Main Thread]: V/WorkerRunnable WorkerRunnable::DispatchInternal [127922a20]
0:13.87 pid:15848 [Child 15856: Main Thread]: V/WorkerPrivate WorkerPrivate::DispatchLockHeld [1286cab00] runnable: 127922a20
0:13.87 pid:15848 [Child 15856: Main Thread]: V/WorkerPrivate WorkerPrivate::DispatchLockHeld [1286cab00] runnable 127922a20 is queued in mPreStartRunnables
WorkerThreadPrimaryRunnable::Run()
We suppose that runnable in mPreStartRunnables would be moved to the thread event queue after WorkerPrivate::SetWorkerPrivateInWorkerThread(). So now WorkerPrivate::mThread->HasPendingEvents() should be true.
0:13.87 pid:15848 [Child 15856: DOM Worker]: D/WorkerPrivate WorkerPrivate::SetWorkerPrivateInWorkerThread [1286cab00]
0:13.87 pid:15848 [Child 15856: DOM Worker]: D/WorkerPrivate WorkerPrivate::AddWorkerRef [1286cab00] mStatus: 0, aFailStatus: (3)
0:13.87 pid:15848 [Child 15856: DOM Worker]: D/WorkerPrivate WorkerPrivate::DoRunLoop [1286cab00]
0:13.87 pid:15848 [Child 15856: DOM Worker]: V/WorkerPrivate WorkerPrivate::DoRunLoop [1286cab00] mStatus 1 before getting events to run
0:13.87 pid:15848 [Child 15856: DOM Worker]: D/WorkerPrivate Worker 1286cab00 scheduled periodic GC timer
0:13.87 pid:15848 [Child 15856: DOM Worker]: V/WorkerRunnable WorkerRunnable::Run [1057ea9d0]
0:13.87 pid:15848 [Child 15856: DOM Worker]: V/WorkerRef IPCWorkerRef::IPCWorkerRef [1057eb290] aWorkerPrivate: 1286cab00 aName: ClientManagerChild
0:13.87 pid:15848 [Child 15856: DOM Worker]: D/WorkerPrivate WorkerPrivate::AddWorkerRef [1286cab00] mStatus: 1, aFailStatus: (3)
0:13.87 pid:15848 [Child 15856: DOM Worker]: V/WorkerRef IPCWorkerRef::SetActorCount [1057eb290] aCount: 1
0:13.87 pid:15848 [Child 15856: DOM Worker]: V/WorkerPrivate WorkerPrivate::AdjustNonblockingCCBackgroundActors [1286cab00] (1/0)
In the meantime, the main thread checks if the worker can be GCed. And the answer is "Yes," but it should be "No."
0:13.87 pid:15848 [Child 15856: Main Thread]: V/WorkerPrivate WorkerPrivate::IsEligibleForCC [1286cab00]
0:13.87 pid:15848 [Child 15856: Main Thread]: V/WorkerPrivate mMainThreadEventTarget: empty
0:13.87 pid:15848 [Child 15856: Main Thread]: V/WorkerPrivate mMainThreadEventTargetForMessaging: empty
0:13.87 pid:15848 [Child 15856: Main Thread]: V/WorkerPrivate mMainThreadDebuggerEventTarget: empty
0:13.87 pid:15848 [Child 15856: Main Thread]: V/WorkerPrivate mCCFlagSaysEligible: true
0:13.87 pid:15848 [Child 15856: Main Thread]: V/WorkerPrivate hasShutdownTasks: false
We suppose hasPendingEvents should be "true" here, but it is not.
0:13.87 pid:15848 [Child 15856: Main Thread]: V/WorkerPrivate hasPendingEvents: false
Then GC the worker, so the status changes to "Canceling"
0:13.87 pid:15848 [Child 15856: Main Thread]: D/WorkerPrivate WorkerPrivate::DispatchControlRunnable [1286cab00] runnable 102babf70
0:13.87 pid:15848 [Child 15856: DOM Worker]: V/WorkerRunnable WorkerRunnable::Run [102babf70]
0:13.87 pid:15848 [Child 15856: DOM Worker]: D/WorkerPrivate WorkerPrivate::NotifyInternal [1286cab00] mStatus: 1, aStatus: 3
0:13.87 pid:15848 [Child 15856: DOM Worker]: D/WorkerPrivate WorkerPrivate::NotifyWorkerRefs [1286cab00] aStatus: 3
0:13.87 pid:15848 [Child 15856: DOM Worker]: D/WorkerPrivate WorkerPrivate::NotifyWorkerRefs [1286cab00] WorkerRefs(WeakWorkerRef 1057eab60)
0:13.87 pid:15848 [Child 15856: DOM Worker]: D/WorkerPrivate WorkerPrivate::RemoveWorkerRef [1286cab00] aWorkerRef: 1057eab60
0:13.87 pid:15848 [Child 15856: DOM Worker]: D/WorkerPrivate WorkerPrivate::NotifyWorkerRefs [1286cab00] WorkerRefs(ClientManagerChild 1057eb290)
0:13.87 pid:15848 [Child 15856: DOM Worker]: D/WorkerPrivate WorkerPrivate::NotifyWorkerRefs [1286cab00] WorkerRefs(WeakWorkerRef 102bf5f80)
0:13.87 pid:15848 [Child 15856: DOM Worker]: D/WorkerPrivate WorkerPrivate::RemoveWorkerRef [1286cab00] aWorkerRef: 102bf5f80
0:13.87 pid:15848 [Child 15856: DOM Worker]: D/WorkerPrivate WorkerPrivate::NotifyWorkerRefs [1286cab00] WorkerRefs(ScriptLoader 102bf60c0)
0:13.87 pid:15848 [Child 15856: DOM Worker]: D/WorkerPrivate WorkerPrivate::NotifyInternal [1286cab00] abort script
Assignee | ||
Comment 3•8 months ago
|
||
Assignee | ||
Comment 4•8 months ago
|
||
Comment 5•8 months ago
|
||
Set release status flags based on info from the regressing bug 1836700
Assignee | ||
Updated•8 months ago
|
Pushed by echuang@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/e97f959d1225 Allow WorkerThread::HasPendingEvents() can be accessed on the parent thread. r=asuth
Comment 7•8 months ago
|
||
bugherder |
Comment 8•8 months ago
|
||
When double-checking things I realized we probably still have a gap with HasPendingEvents(). My original (unposted) proposal before HasPendingEvents was that we would set a flag on dispatch and clear it once WorkerPrivate::DoRunLoop had drained the events. At a high level, that was conceptually simple, but practically speaking it would end up much more messy and confusing, especially because of locking concerns like those that give rise to mOtherThreadsDispatchingViaEventTarget. So when I realized HasPendingEvents would do most of the work I proposed that, but...
HasPendingEvents() is great for letting us know as soon as a runnable is scheduled without us having to do anything extra. But it can start returning false as soon as ProcessNextEvent pulls the event off the queue to run it. We need to make sure that we set a flag that will cause IsEligibleForCC to return false that will at least cover:
- From before calling any function that will pull an event off an event queue that we intend to process.
- Until we've fully completed running the event.
The logic for this doesn't need to be particularly complex. In fact, I think one potentially quite simple thing we could do is.
- Add an mMutex guarded
mWorkerLoopIsIdle
boolean. - IsEligibleForCC only returns true if
mWorkerLoopIsIdle
is true. - Bracket the call to WaitForWorkerEvents in WorkerPrivate::DoRunLoop so it looks like:
mWorkerLoopIsIdle = true;
WaitForWorkerEvents();
mWorkerLoopIsIdle = false;
Note that I propose setting the flag outside of WaitForWorkerEvents because this way we only set the flag if we're in the top level loop, as the other 3 of 4 calls to the method would not be indicative of the worker being idle.
Assignee | ||
Comment 9•8 months ago
|
||
Comment 10•8 months ago
|
||
Pushed by echuang@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/f4fae08660c1 Add WorkerPrivate::mWorkerLoopIsIdle to indicate the Worker is idle and could be GCed. r=asuth
Comment 11•8 months ago
|
||
bugherder |
Assignee | ||
Comment 12•8 months ago
|
||
Comment on attachment 9364892 [details]
Bug 1865774 - Add WorkerPrivate::mWorkerLoopIsIdle to indicate the Worker is idle and could be GCed. r=asuth
Beta/Release Uplift Approval Request
- User impact if declined: The worker might be GCed unexpectedly, and significant functionality of the GCed worker could be interrupted by the unexpected GC.
- Is this code covered by automated tests?: Unknown
- Has the fix been verified in Nightly?: No
- Needs manual test from QE?: No
- If yes, steps to reproduce:
- List of other uplifts needed: None
- Risk to taking this patch: Low
- Why is the change risky/not risky? (and alternatives if risky): The patches are low risk since the patches just make the pending event checking can be done on the parent thread of the worker, which is the place to judge if the Worker should be GCed or not. And making GC a Worker can only happen when the worker is idle.
- String changes made/needed: No
- Is Android affected?: Yes
Assignee | ||
Updated•8 months ago
|
Comment 13•8 months ago
•
|
||
Comment on attachment 9364716 [details]
Bug 1865774 - Allow WorkerThread::HasPendingEvents() can be accessed on the parent thread. r=asuth
Approved for 121.0b3
Comment 14•8 months ago
•
|
||
Comment on attachment 9364892 [details]
Bug 1865774 - Add WorkerPrivate::mWorkerLoopIsIdle to indicate the Worker is idle and could be GCed. r=asuth
Approved for 121.0b3
Comment 15•8 months ago
|
||
uplift |
https://hg.mozilla.org/releases/mozilla-beta/rev/02d92178bed6 https://hg.mozilla.org/releases/mozilla-beta/rev/543d00fd861d
Updated•8 months ago
|
Comment hidden (Intermittent Failures Robot) |
Comment hidden (Intermittent Failures Robot) |
Description
•