Service Worker doesn't terminate after a period of time when thread blocking
Categories
(Core :: DOM: Service Workers, defect, P2)
Tracking
()
People
(Reporter: ben.kay01, Assigned: perry)
Details
Attachments
(2 files)
47 bytes,
text/x-phabricator-request
|
RyanVM
:
approval-mozilla-beta+
|
Details | Review |
47 bytes,
text/x-phabricator-request
|
RyanVM
:
approval-mozilla-beta+
|
Details | Review |
User Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/79.0.3937.0 Safari/537.36
Firefox for Android
Steps to reproduce:
We gave a service worker an infinite loop outside the event handlers, it locked up the service worker thread, blocking calls out to check for updates and essentially killed all service worker functionality with no easy way to recover as a user.
Please see https://static-misc-2.glitch.me/blocking-sw/ that was made by Jake Archibald from Google as an example where the service worker served never resolves the promise to install and blocks the thread using a while (true).
Actual results:
Service worker is blocked, and cannot update. even on a client refresh.
Expected results:
If you visit https://static-misc-2.glitch.me/blocking-sw/ on Chrome, Chrome kills the thread after 60 seconds unless dev tools is open.
I would expect Firefox kill the service worker thread after a timeout period.
Comment 1•5 years ago
|
||
Bugbug thinks this bug should belong to this component, but please revert this change in case of error.
Updated•5 years ago
|
I raised this same bug against webkit https://bugs.webkit.org/show_bug.cgi?id=202992 which has now been resolved,
Any update on where this is up to?
Assignee | ||
Comment 3•4 years ago
|
||
(In reply to ben.kay01 from comment #2)
I raised this same bug against webkit https://bugs.webkit.org/show_bug.cgi?id=202992 which has now been resolved,
Any update on where this is up to?
Our kill timer is set to 5 minutes at the moment, but I think we can reduce it to one minute to be more consistent.
Assignee | ||
Comment 4•4 years ago
|
||
This gives each ServiceWorker a total of 1 minute to finish doing "something"
(unless the timeout is reset by something else the ServiceWorker needs to do).
1 minute is chosen to be roughly consistent with Chrome/Safari (as opposed to
a total of 5 minutes 30 seconds).
Assignee | ||
Comment 5•4 years ago
|
||
- Make the worker thread's associated WorkerPrivate available from
RemoteWorkerChild' "Pending" state (which is the state while the script
is evaluating). WorkerPrivate is needed to forcefully terminate the worker,
and previously it wasn't available until the script finished evaluating, so
a script likewhile (true);
wouldn't be able to be stopped because its
WorkerPrivate wouldn't ever be available. - Refactor some RemoteWorkerChild states to clean up WorkerPrivate/WeakWorkerRef.
Depends on D59245
Assignee | ||
Comment 6•4 years ago
|
||
Verified patches against https://static-misc-2.glitch.me/blocking-sw/, which should say the worker is killed in ~60 seconds.
https://treeherder.mozilla.org/#/jobs?repo=try&revision=84d541aa05a259a5e47edee8838e6288ac6d34c1
Pushed by pjiang@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/a6bc584681e4 decrease "idle extended timeout" to 30 seconds r=dom-workers-and-storage-reviewers,asuth https://hg.mozilla.org/integration/autoland/rev/dc9c01d8dfb1 be able to terminate remote workers while their script is evaluating r=dom-workers-and-storage-reviewers,asuth
Comment 8•4 years ago
|
||
bugherder |
https://hg.mozilla.org/mozilla-central/rev/a6bc584681e4
https://hg.mozilla.org/mozilla-central/rev/dc9c01d8dfb1
Assignee | ||
Comment 9•4 years ago
|
||
Comment on attachment 9119600 [details]
Bug 1588838 - be able to terminate remote workers while their script is evaluating r?#dom-workers-and-storage
Beta/Release Uplift Approval Request
- User impact if declined: ServiceWorker scripts with "very long running execution" in their initial script evaluation (e.g. infinite loop) won't be able to be interrupted. This means that the ServiceWorker's thread will be completely blocked. This also causes a shutdown crash with the ServiceWorkerShutdownBlocker because we won't be able to terminate the worker.
The answer to the next question is more accurately "no, but soon yes" - there's WPTs waiting to be synced with upstream introduced by WebKit (after which they'll be synced with our repo).
- Is this code covered by automated tests?: No
- Has the fix been verified in Nightly?: No
- Needs manual test from QE?: Yes
- If yes, steps to reproduce: https://bugzilla.mozilla.org/show_bug.cgi?id=1588838#c0 are STR. The page should eventually display a time that's approximately 60 seconds.
- List of other uplifts needed: None
- Risk to taking this patch: Medium
- Why is the change risky/not risky? (and alternatives if risky): In theory not risky because we're just making a handle to the worker thread available while the worker's script is evaluating so we can terminate it if we want, and this is the only scenario in which we touch this new data, so there's low risk of misusing it somehow. Setting as medium risk because there's a decent number of LOC changed.
- String changes made/needed:
Assignee | ||
Updated•4 years ago
|
Comment 10•4 years ago
|
||
Comment on attachment 9119599 [details]
Bug 1588838 - decrease "idle extended timeout" to 30 seconds r?#dom-workers-and-storage
Reduces the chances of a stuck service worker blocking shutdown and breaking other service workers. Approved for 73.0b3.
Updated•4 years ago
|
Comment 11•4 years ago
|
||
bugherder uplift |
https://hg.mozilla.org/releases/mozilla-beta/rev/aadd25bc7eae
https://hg.mozilla.org/releases/mozilla-beta/rev/2a0c019d7195
Updated•4 years ago
|
Comment 12•4 years ago
|
||
Reproduced the issue with Firefox 71.0a1 (20191015213743) on Windows 10x64. Visiting https://static-misc-2.glitch.me/blocking-sw/ keeps the service worker in "Registering" state for an Infinite time in about:debugging-> This Nightly -> Service Workers and "pending" message on the visited page.
Using Firefox 74.0a1 (20200109213942) and Firefox 73.0b3 (20200110003145) with Windows 10x64, macOS 10.15 and Ubuntu 16.04 on the same website, terminates the worker in about:debugging-> This Nightly -> Service Worker and "rejected in 60 seconds" message is displayed.
Description
•