Closed
Bug 1317266
Opened 8 years ago
Closed 8 years ago
registration.installing gets unexpectedly cycle collected when the SW switches states
Categories
(Core :: DOM: Service Workers, defect)
Core
DOM: Service Workers
Tracking
()
RESOLVED
FIXED
mozilla53
Tracking | Status | |
---|---|---|
firefox53 | --- | fixed |
People
(Reporter: catalinb, Assigned: catalinb)
References
Details
Attachments
(3 files, 4 obsolete files)
6.46 KB,
patch
|
Details | Diff | Splinter Review | |
3.52 KB,
patch
|
bkelly
:
review+
|
Details | Diff | Splinter Review |
13.74 KB,
patch
|
Details | Diff | Splinter Review |
Waiting for a worker to become active using the state change event sometimes fails
because the ServiceWorker DOM object can get collected when changing states.
new Promise(function(res) {
var worker = registration.installingWorker;
worker.addEventListener(function() {
if (worker.state === "active") res();
})
});
These are two tests that show the issue. In both cases we wait for the worker to become active using the statechange event. The worker takes 10s to activate using
setTimeout.
test_worker_reference_gc.html succeeds because there's a global variable keeping
the ServiceWorker object alive
test_worker_Reference_gc_timeout.html fails because the object gets CC-ed and there
are no listeners when the last state change is finally dispatched.
https://treeherder.mozilla.org/#/jobs?repo=try&revision=113a9b20dad82e894a6a2cfe9ad74e9f1328431a&selectedJob=30982704
I'm not sure why test_workerupdatefoundevent.html also fails, the patch only adds a couple of NS_WARNINGs and the two mochitests.
Assignee | ||
Comment 1•8 years ago
|
||
Assignee | ||
Comment 2•8 years ago
|
||
Ben, could you please look at this and confirm the issue?
Flags: needinfo?(bkelly)
Comment 3•8 years ago
|
||
I could be a consequence of how we do lazy tracking of the DOM ServiceWorker objects on the registration object. While you might expect the ServiceWorkerRegistration simply holds a RefPtr<> to the installing/waiting/active worker attributes, it does not.
We only populate these attributes when a user accesses them. This by itself is not a problem.
However, when the SW changes state we invalidate the cached RefPtr<> values forcing the DOM ServiceWorker to be looked up again the next time. This is done in:
https://dxr.mozilla.org/mozilla-central/source/dom/workers/ServiceWorkerRegistration.cpp#308
So in this case the SW has changed state and wiped its references to the SW DOM object. If script is not holding the SW DOM object alive it could be collected at this point before the next state change event occurs.
Flags: needinfo?(bkelly)
Comment 4•8 years ago
|
||
What I don't understand in this test, however, is why the closure doesn't hold the worker alive:
add_task(function* test_worker_ref_gc() {
let registration = yield navigator.serviceWorker.register(
"lazy_worker.js", { scope: "./lazy_worker_scope_timeout"} )
.then(function(registration) {
SpecialPowers.exactGC();
var worker = registration.installing;
return new Promise(function(resolve) {
worker.addEventListener('statechange', function() {
info("CATALINB: state is " + worker.state + "\n");
SpecialPowers.exactGC();
if (worker.state === 'activated') {
resolve(registration);
}
});
});
});
The promise chain should be held alive by the async function yield. The promise chain should then hold the `var worker` alive.
Till, Boris, do you have any idea how we could be collecting the DOM object referenced by `var worker` in this code?
Flags: needinfo?(till)
Flags: needinfo?(bzbarsky)
Comment 5•8 years ago
|
||
Actually, wpt tests have a 10 second timeout by default. You are probably just racing that. Try adding this to your test html files and regenerating the wpt manifest:
<meta name=timeout content=long>
Flags: needinfo?(till)
Flags: needinfo?(bzbarsky)
Assignee | ||
Comment 6•8 years ago
|
||
(In reply to Ben Kelly [:bkelly] from comment #5)
> Actually, wpt tests have a 10 second timeout by default. You are probably
> just racing that. Try adding this to your test html files and regenerating
> the wpt manifest:
>
> <meta name=timeout content=long>
It's a mochitest.
Comment 7•8 years ago
|
||
Oh, these are not wpt tests. Ignore comment 5. Till, Boris, any thoughts on comment 4? Sorry for the flag spam.
Flags: needinfo?(till)
Flags: needinfo?(bzbarsky)
Comment 8•8 years ago
|
||
Catalin, I would also be happy to see us change comment 3. If someone adds a listener and we don't keep the worker ref in js explicitly the listener should still fire. We probably should explicitly keep the DOM SW referenced by the registration once its set even if it changes state.
Assignee | ||
Comment 9•8 years ago
|
||
(In reply to Ben Kelly [:bkelly] from comment #8)
> Catalin, I would also be happy to see us change comment 3. If someone adds
> a listener and we don't keep the worker ref in js explicitly the listener
> should still fire. We probably should explicitly keep the DOM SW referenced
> by the registration once its set even if it changes state.
Yes, I've been working on a fix today.
Assignee: nobody → catalin.badea392
Comment 10•8 years ago
|
||
The basic setup here is this:
(function() {
var worker = getMeAWorker();
worker.addEventListener("statechange", function() {
// use "worker" here
});
})();
right? So the only thing with a reference to "worker" is the function passed to addEventListener, and the only reference to that function is from the worker itself. Our cycle collector is quite happy to collect such a cycle. Otherwise this would leak:
(function() {
var div = document.createElement("div");
div.addEventListener("a new hope", function() {
div.setAttribute("death star", "exploded");
});
})();
Whatever is going to fire the statechange event should be keeping the worker object alive in this case...
Updated•8 years ago
|
Flags: needinfo?(bzbarsky)
Assignee | ||
Comment 12•8 years ago
|
||
Attachment #8810830 -
Flags: review?(bkelly)
Assignee | ||
Comment 13•8 years ago
|
||
Attachment #8810831 -
Flags: review?(bkelly)
Assignee | ||
Comment 14•8 years ago
|
||
Comment on attachment 8810831 [details] [diff] [review]
ServiceWorkerInfo keeps strong references to SW instances.
This breaks a bunch of devtools tests.
https://treeherder.mozilla.org/#/jobs?repo=try&revision=70c215962d7a95c1eb044a3490946dbc0d4f2ecd&selectedJob=31149025
Attachment #8810831 -
Flags: review?(bkelly)
Comment 15•8 years ago
|
||
Comment on attachment 8810830 [details] [diff] [review]
Test registration.installing getting collected before all statechange events are dispatched.
Review of attachment 8810830 [details] [diff] [review]:
-----------------------------------------------------------------
::: dom/workers/test/serviceworkers/test_worker_reference_gc_timeout.html
@@ +33,5 @@
> + let registration = yield navigator.serviceWorker.register(
> + "lazy_worker.js", { scope: "./lazy_worker_scope_timeout"} )
> + .then(function(registration) {
> + SpecialPowers.exactGC();
> + var worker = registration.installing;
Please add a comment about what is going on here. I would mention that the `worker` ref is only kept alive by the closure held via the SW listener callback. Unless the browser holds the DOM SW object alive this is subject to cycle collection.
Attachment #8810830 -
Flags: review?(bkelly) → review+
Assignee | ||
Comment 16•8 years ago
|
||
I was breaking the QI to nsIServiceWorkerInfo.
Attachment #8810831 -
Attachment is obsolete: true
Attachment #8810982 -
Flags: review?(bkelly)
Comment 17•8 years ago
|
||
Comment on attachment 8810982 [details] [diff] [review]
ServiceWorkerInfo keeps strong references to SW instances.
Review of attachment 8810982 [details] [diff] [review]:
-----------------------------------------------------------------
As we discussed in IRC this would cause us to accumulate DOM ServiceWorker objects until the ServiceWorkerInfo is finally destroyed. For a long running browser that navigates through many intercepted pages this could be significant.
Ideally it would be better to make ServiceWorkerRegistration not drop its refs to ServiceWorker objects when the state changes.
Attachment #8810982 -
Flags: review?(bkelly)
Assignee | ||
Comment 18•8 years ago
|
||
Attachment #8810982 -
Attachment is obsolete: true
Attachment #8811729 -
Flags: review?(bkelly)
Comment 19•8 years ago
|
||
Comment on attachment 8811729 [details] [diff] [review]
Transition ServiceWorker instances when the SW changes states instead of just invalidating them.
Review of attachment 8811729 [details] [diff] [review]:
-----------------------------------------------------------------
This is the right idea, but we must change the ServiceWorker.state attribute synchronously with the ServiceWorkerRegistration attributes.
Also, this is turning into a dupe of bug 1257977. Maybe the discussion in there will help.
Thanks for tackling this!
::: dom/workers/ServiceWorkerManager.cpp
@@ +2530,5 @@
> +ServiceWorkerManager::TransitionServiceWorkerRegistrationWorker(ServiceWorkerRegistrationInfo* aRegistration,
> + WhichServiceWorker aWhichOne)
> +{
> + AssertIsOnMainThread();
> + nsTObserverArray<ServiceWorkerRegistrationListener*>::ForwardIterator it(mServiceWorkerRegistrationListeners);
I'm not asking you to change it here, but I find this listener setup really weird. Maybe in the future we should store the listeners on the ServiceWorkerRegistrationInfo directly. That way we don't have to compare scopes here.
::: dom/workers/ServiceWorkerRegistration.cpp
@@ +311,5 @@
> +ServiceWorkerRegistrationMainThread::TransitionWorker(WhichServiceWorker aWhichOne)
> +{
> + AssertIsOnMainThread();
> +
> + MOZ_ASSERT(!(aWhichOne & WhichServiceWorker::ACTIVE_WORKER));
I think it would be easier to move all your assertions here to the top of the method as:
MOZ_ASSERT(aWhichOne == INSTALLING_WORKER || aWhichOne == WAITING_WORKER);
@@ +314,5 @@
> +
> + MOZ_ASSERT(!(aWhichOne & WhichServiceWorker::ACTIVE_WORKER));
> + if (aWhichOne & WhichServiceWorker::INSTALLING_WORKER) {
> + MOZ_ASSERT(!(aWhichOne & WhichServiceWorker::ACTIVE_WORKER));
> + mWaitingWorker = mInstallingWorker.forget();
So I think we have a bigger problem here. I was going to suggest we do something like:
MOZ_ASSERT(mState == ServiceWorkerState::Installed)
But this won't work because we set the state via ChangeStateUpdater runnable. This will run after this method is run in your current patch.
I think what we need to do instead is trigger this transition from the ChangeStateUpdater itself. This will ensure the worker and registration attributes are changed synchronously.
The devtools callbacks can then be fired from another runnable that is dispatched after the ChangeStateUpdater is dispatched.
Incidentally this is basically bug 1257977. I have some patches that I started there down this road, but didn't have time to finish them.
Attachment #8811729 -
Flags: review?(bkelly) → review-
Assignee | ||
Comment 20•8 years ago
|
||
Attachment #8811729 -
Attachment is obsolete: true
Attachment #8812233 -
Flags: review?(bkelly)
Comment 21•8 years ago
|
||
Comment on attachment 8812233 [details] [diff] [review]
Transition ServiceWorker instances when the SW changes states instead of just invalidating them.
Review of attachment 8812233 [details] [diff] [review]:
-----------------------------------------------------------------
r=me with comments addressed.
::: dom/workers/ServiceWorkerRegistration.cpp
@@ +312,5 @@
> +{
> + AssertIsOnMainThread();
> +
> + if (aWhichOne == WhichServiceWorker::INSTALLING_WORKER) {
> + mWaitingWorker = mInstallingWorker.forget();
Please assert the worker state here. I guess we must assert that it is the old state. Add a comment that the state is updated in a queued runnable.
::: dom/workers/ServiceWorkerRegistrationInfo.cpp
@@ +321,3 @@
>
> +nsresult
> +ServiceWorkerRegistrationInfo::UpdateStatePropertiesRunnable::Run()
If you make this a method on ServiceWorkerRegistrationInfo like AsyncUpdateRegistrationStateProperties(), then you could just do this:
nsCOMPtr<nsIRunnable> r = NewRunnableMethod(
this,
&ServiceWorkerRegistrationInfo::AsyncUpdateRegistrationStateProperties,
aWorker, aTransition);
And you can remove UpdateStatePropertiesRunnable completely.
@@ +342,5 @@
> + RefPtr<Runnable> runnable = new UpdateStatePropertiesRunnable(this, aWorker,
> + aTransition);
> + MOZ_ALWAYS_SUCCEEDS(NS_DispatchToMainThread(runnable.forget()));
> +
> + NotifyChromeRegistrationListeners();
This will fire the chrome listeners before the attributes are updated. We probably want to ensure all attributes are updated before firing these.
So I think you should probably queue another runnable after async setting the registration attributes to fire these chrome listeners. You have to wait until that async step to queue the runnable in order to guarantee it fires after the state change runnable.
Attachment #8812233 -
Flags: review?(bkelly) → review+
Assignee | ||
Comment 22•8 years ago
|
||
Addressed comments except the change about how chrome listeners are notified.
The chrome listeners need to be called in the same task that changed
mSWInfo properties (mInstllingWorker etc). This is checked in test_serviceworkerregistrationinfo.xul.
Attachment #8812233 -
Attachment is obsolete: true
Comment 23•8 years ago
|
||
(In reply to Cătălin Badea (:catalinb) from comment #22)
> Addressed comments except the change about how chrome listeners are notified.
> The chrome listeners need to be called in the same task that changed
> mSWInfo properties (mInstllingWorker etc). This is checked in
> test_serviceworkerregistrationinfo.xul.
Ok. Can you add a comment to that effect?
Comment 24•8 years ago
|
||
Pushed by catalin.badea392@gmail.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/3c1343636012
Test registration.installing getting collected before all statechange events are dispatched. r=bkelly
https://hg.mozilla.org/integration/mozilla-inbound/rev/ac6b94962bf2
Transition ServiceWorker instances when the SW changes states instead of just invalidating them. r=bkelly
Comment 25•8 years ago
|
||
bugherder |
https://hg.mozilla.org/mozilla-central/rev/3c1343636012
https://hg.mozilla.org/mozilla-central/rev/ac6b94962bf2
Status: NEW → RESOLVED
Closed: 8 years ago
status-firefox53:
--- → fixed
Resolution: --- → FIXED
Target Milestone: --- → mozilla53
Comment 26•8 years ago
|
||
For the setTimeout in the test, it seems like the intent of the setTimeout() is for the SW to delay activation long enough for exactGC() to complete. Could lazy_worker.js be re-written to using postMessage() instead?
For example, in lazy_worker.js:
onactivate = function(event) {
var promise = new Promise(function(res) {
self.onmessage = function() {
res();
}
});
event.waitUntil(promise);
}
and the caller then be:
...
// (we see 'installed', 'activating', 'activated')
SpecialPowers.exactGC();
if (worker.state === 'activating') {
worker.postMessage({});
} else if (worker.state === 'activated') {
resolve(registration);
}
Flags: needinfo?(catalin.badea392)
Assignee | ||
Comment 27•8 years ago
|
||
(In reply to Andrew Sutherland [:asuth] from comment #26)
> For the setTimeout in the test, it seems like the intent of the setTimeout()
> is for the SW to delay activation long enough for exactGC() to complete.
> Could lazy_worker.js be re-written to using postMessage() instead?
>
> For example, in lazy_worker.js:
>
> onactivate = function(event) {
> var promise = new Promise(function(res) {
> self.onmessage = function() {
> res();
> }
> });
> event.waitUntil(promise);
> }
>
> and the caller then be:
>
> ...
> // (we see 'installed', 'activating', 'activated')
> SpecialPowers.exactGC();
> if (worker.state === 'activating') {
> worker.postMessage({});
> } else if (worker.state === 'activated') {
> resolve(registration);
> }
Yes, though, looking through the source code, I think we should use the callback that exactGC takes
to wait until the gc is done.
https://dxr.mozilla.org/mozilla-central/source/testing/specialpowers/content/specialpowersAPI.js#1465
Flags: needinfo?(catalin.badea392)
You need to log in
before you can comment on or make changes to this bug.
Description
•