Closed
Bug 1265795
Opened 8 years ago
Closed 8 years ago
Service worker can continue to control document in bfcache
Categories
(Core :: DOM: Service Workers, defect)
Tracking
()
RESOLVED
FIXED
mozilla48
People
(Reporter: bkelly, Assigned: bkelly)
References
(Blocks 2 open bugs)
Details
(Whiteboard: btpp-active)
Attachments
(4 files, 3 obsolete files)
4.66 KB,
patch
|
bkelly
:
review+
ritu
:
approval-mozilla-beta+
|
Details | Diff | Splinter Review |
6.64 KB,
patch
|
bzbarsky
:
review+
ritu
:
approval-mozilla-beta+
|
Details | Diff | Splinter Review |
1.08 KB,
patch
|
bzbarsky
:
review+
ritu
:
approval-mozilla-beta+
|
Details | Diff | Splinter Review |
2.64 KB,
patch
|
bzbarsky
:
review+
ritu
:
approval-mozilla-beta+
|
Details | Diff | Splinter Review |
+++ This bug was initially created as a clone of Bug #1265771 +++ While investigating the fix for bug 1265771 I discovered with Boris's help that nsDocument::RemovedFromDocShell() does not get called when a document goes into the bfcache. This means we don't stop controlling documents on navigation within a window. Instead we need to move the service worker logic to nsDocument::SetScriptGlobalObject() when nullptr is passed in for the new global. In addition, we should only start controlling when a non-nullptr global is passed in. I'm doing this as a separate bug because its important that we uplift this one.
Assignee | ||
Updated•8 years ago
|
status-firefox45:
--- → wontfix
status-firefox46:
--- → wontfix
status-firefox47:
--- → affected
status-firefox-esr45:
--- → disabled
Comment 1•8 years ago
|
||
Tracking since we may want to uplift this to 47 and Ben says it is important and may have a fix coming soon.
tracking-firefox47:
--- → +
tracking-firefox48:
--- → +
Assignee | ||
Comment 2•8 years ago
|
||
Boris, this patch is based on our IRC conversation today. Our intention is to stop controlling a document when it becomes inactive. Previously we were trying to do this with RemovedFromDocShell(), but that is not called if the document goes into bfcache. I'm fairly sure this works as expected for the iframe.remove() case because we have many, many use cases of that throughout our mochitests and wpt tests. They pass with this patch. I'm going to work on a new test case that navigates a window to demonstrate the old document becomes uncontrolled.
Attachment #8742939 -
Flags: review?(bzbarsky)
Assignee | ||
Comment 3•8 years ago
|
||
Comment on attachment 8742939 [details] [diff] [review] P1 Uncontrolled service workers when global is removed from document. r=bz Review of attachment 8742939 [details] [diff] [review]: ----------------------------------------------------------------- ::: dom/base/nsDocument.cpp @@ +4680,5 @@ > + loader->ClearCacheForControlledDocument(this); > + } > + } > + swm->MaybeStopControlling(this); > + } Its difficult to tell in splinter, but this is in this conditional block: if (mScriptGlobalObject && !aScriptGlobalObject) { So it only runs when we are clearing a global from the document.
Comment 4•8 years ago
|
||
Comment on attachment 8742939 [details] [diff] [review] P1 Uncontrolled service workers when global is removed from document. r=bz r=me
Attachment #8742939 -
Flags: review?(bzbarsky) → review+
Assignee | ||
Comment 5•8 years ago
|
||
I wrote a test that navigates a window, but I can't actually reproduce this. I think perhaps RemovedFromDocShell() is getting called through another convoluted path. Something that calls nsDocument::Destroy(). I think we should still land this patch, though, since it makes it much more clear when the document becomes uncontrolled. Its also nice to see the symmetry between becoming controlled when setting the global and becoming uncontrolled when clearing the global. We can see all the logic in a single method. With that said, I don't think this needs to be uplifted any more. Boris, does that sound reasonable to you? I'll get a stack for how RemovedFromDocShell() is getting called tomorrow to make sure we fully understand.
Flags: needinfo?(bzbarsky)
Assignee | ||
Comment 6•8 years ago
|
||
Well, let me do more testing. I think I just saw a case in my browser where a service worker registration was kept alive after a series of navigate/back/forward/back/forward and then an unregister. Let me see if that triggers any corner case in my test tomorrow.
Flags: needinfo?(bzbarsky)
Comment 7•8 years ago
|
||
> I wrote a test that navigates a window, but I can't actually reproduce this.
In your testcase, was the document you were testing against going into bfcache? You can check via the pagehide event.
Assignee | ||
Comment 8•8 years ago
|
||
Assignee | ||
Comment 9•8 years ago
|
||
Ah, the test is passing because Clients.matchAll() is filtering out controlled documents that don't have a window. This is still bad, though, because SWM thinks the documents are controlled and will keep a service worker registration alive because of this. So we do need to fix and uplift this. Unfortunately the test fails with the P1 patch applied. I need to investigate.
Assignee | ||
Comment 10•8 years ago
|
||
Updated to clear some service worker specific state when the document becomes inactive. This lets us actually re-control the document when it comes out of bfcache.
Attachment #8742939 -
Attachment is obsolete: true
Attachment #8743391 -
Flags: review+
Assignee | ||
Comment 11•8 years ago
|
||
Slightly better version of the patch that maintains the same clientID for the document after it comes out of bfcache.
Attachment #8743391 -
Attachment is obsolete: true
Attachment #8743423 -
Flags: review+
Assignee | ||
Comment 12•8 years ago
|
||
This test fails with the assertion I'm about to add in P3. It then succeeds once P1 is applied.
Attachment #8743335 -
Attachment is obsolete: true
Attachment #8743428 -
Flags: review?(bzbarsky)
Assignee | ||
Comment 13•8 years ago
|
||
Assert that we don't have any controlled documents without an outer window. This is exactly the sort of thing we want to avoid in this bug. This change is necessary for the test to fail today since previously we filtered these window clients out.
Attachment #8743429 -
Flags: review?(bzbarsky)
Assignee | ||
Comment 14•8 years ago
|
||
https://treeherder.mozilla.org/#/jobs?repo=try&revision=b5349263af28
Comment 15•8 years ago
|
||
Comment on attachment 8743428 [details] [diff] [review] P2 Add a web-platform-test for the window navigation case. r=bz r=me
Attachment #8743428 -
Flags: review?(bzbarsky) → review+
Comment 16•8 years ago
|
||
Comment on attachment 8743429 [details] [diff] [review] P3 Assert that controlled documents have an outer window. r=bz r=me
Attachment #8743429 -
Flags: review?(bzbarsky) → review+
Assignee | ||
Comment 17•8 years ago
|
||
The try build shows we are leaking windows in browser_printpreview.js. I believe nsDocumentViewer calls nsDocument::Destroy() for the print preview path without first calling SetScriptGlobalObject(nullptr). Elsewhere it always pairs these two methods together. Here is a try push that refactors things so nsDocument::Destroy() always calls SetScriptGlobalObject(nullptr) instead of requiring our external callers to do that: https://treeherder.mozilla.org/#/jobs?repo=try&revision=a9777c75e722
Assignee | ||
Comment 18•8 years ago
|
||
The try run is looking quite good with this patch. To recap, the leak occurred because I moved ServiceWorkerManager::MaybeStopControllingDocument() from nsDocument::RemovedFromDocShell() to SetScriptGlobalObject(nullptr). One of the things MaybeStopControllingDocument() does is remove a reference to the document. Unfortunately, it seems certain paths through nsDocumentViewer do not call SetScriptGlobalObject(nullptr) when destroying a document. So this caused the window to leak because SWM was still holding a reference to it. It seems wrong to me that a document can be destroyed without explicitly having its global cleared and being marked inactive. This path moves the call to SetScriptGlobalObject(nullptr) into nsDocument::Destroy(). This avoids external callers having to remember to call both these methods together. To see that SetScriptGlobalObject(nullptr) can fail to be called currently, just run the browser_printpreview.js browser chrome mochitest. Again, here is the try that looks pretty green with this patch: https://treeherder.mozilla.org/#/jobs?repo=try&revision=a9777c75e722
Attachment #8743956 -
Flags: review?(bzbarsky)
Comment 19•8 years ago
|
||
Comment on attachment 8743956 [details] [diff] [review] P4 Always call nsDocument::SetScriptGlobalObject(nullptr) from nsDocument::Destroy(). r=bz OK. So the issue is that in the print-preview case (and only in the case) nsDocumentViewer::OnDonePrinting doesn't null out mPrintEngine (on purpose). Then when nsDocumentViewer::Close is called we think that non-null mPrintEngine means we're closing while still printing and we don't call mDocument->SetScriptGlobalObject(nullptr). We also don't call mDocument->RemovedFromDocShell(). Then we land in nsDocumentViewer::Destroy and mPrintEngine->CheckBeforeDestroy() returns false. So we press and calls mDocument->Destroy(). This will call RemovedFromDocShell for us, so that part gets handled, at least.... We could add some state tracking for "OnDonePrinting already called for this mPrintEngine" or something, but it's not clear to me that we can't get multiple OnDonePrinting calls for the same mPrintEngine in the print preview case. This printing stuff is not as well documented or cleanly integrated as one would like... Anyway, you're right that the other calls to nsIDocument::Destroy are all in this file, and that they're all preceded by SetScriptGlobalObject(nullptr), albeit indirectly in the ~nsDocumentViewer case. So this looks like the safe thing to do (surprising, I know!) compared to other possible solutions to this print preview craziness. r=me.
Attachment #8743956 -
Flags: review?(bzbarsky) → review+
Comment 20•8 years ago
|
||
https://hg.mozilla.org/integration/mozilla-inbound/rev/51a007dd9060 https://hg.mozilla.org/integration/mozilla-inbound/rev/7ee57ef9f290 https://hg.mozilla.org/integration/mozilla-inbound/rev/f3bc58b4de70 https://hg.mozilla.org/integration/mozilla-inbound/rev/46ac1843cd89
Assignee | ||
Comment 21•8 years ago
|
||
I'm going to wait to request uplift on this until it has baked a bit. I'd still like to get it on FF47 if we can, but the change in P4 is a bit more risky than I thought we would need.
Comment 22•8 years ago
|
||
bugherder |
https://hg.mozilla.org/mozilla-central/rev/51a007dd9060 https://hg.mozilla.org/mozilla-central/rev/7ee57ef9f290 https://hg.mozilla.org/mozilla-central/rev/f3bc58b4de70 https://hg.mozilla.org/mozilla-central/rev/46ac1843cd89
Status: ASSIGNED → RESOLVED
Closed: 8 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla48
Assignee | ||
Comment 23•8 years ago
|
||
Comment on attachment 8743423 [details] [diff] [review] P1 Uncontrolled service workers when global is removed from document. r=bz Approval Request Comment [Feature/regressing bug #]: service workers [User impact if declined]: This bug creates a very confusing situation for web developers where they may not be able to update their service worker because a document in the history is holding it alive. In addition, we need this fix in order to uplift bug 1265771 which exposes these history documents to content in Clients.matchAll(). This is a bad compat issue. [Describe test coverage new/current, TreeHerder]: Automated tests included. [Risks and why]: Moderate. Fixing this bug required changing some code in nsDocument and nsDocumentViewer which are used much more widely than service workers. We believe the changes are correct, pass tests, and have had some bake time on nightly/aurora, but there is some risk. [String/UUID change made/needed]: None.
Attachment #8743423 -
Flags: approval-mozilla-beta?
Assignee | ||
Comment 24•8 years ago
|
||
Comment on attachment 8743428 [details] [diff] [review] P2 Add a web-platform-test for the window navigation case. r=bz See comment 23.
Attachment #8743428 -
Flags: approval-mozilla-beta?
Assignee | ||
Comment 25•8 years ago
|
||
Comment on attachment 8743429 [details] [diff] [review] P3 Assert that controlled documents have an outer window. r=bz See comment 23.
Attachment #8743429 -
Flags: approval-mozilla-beta?
Assignee | ||
Comment 26•8 years ago
|
||
Comment on attachment 8743956 [details] [diff] [review] P4 Always call nsDocument::SetScriptGlobalObject(nullptr) from nsDocument::Destroy(). r=bz See comment 23.
Attachment #8743956 -
Flags: approval-mozilla-beta?
Hello BZ, given the risk assessment, I would like a second opinion on the severity of this issue on end-users and whether this is worth fixing in 47 or let it ride the trains. What do you think? Appreciate your help here.
Flags: needinfo?(bzbarsky)
Comment 28•8 years ago
|
||
The user impact is likely not too great. The impact on web developer goodwill is a lot more significant... I do think the changes are fairly low risk, though; we vetted them pretty carefully.
Flags: needinfo?(bzbarsky)
(In reply to Boris Zbarsky [:bz] from comment #28) > The user impact is likely not too great. The impact on web developer > goodwill is a lot more significant... > > I do think the changes are fairly low risk, though; we vetted them pretty > carefully. Thanks BZ for a prompt reply! I am inclined to taking it in Beta47.
Comment on attachment 8743423 [details] [diff] [review] P1 Uncontrolled service workers when global is removed from document. r=bz While this doesn't have a clear end-user impact atm, I am still inclined to taking it since we are striving to make our code more robust and it is still early in the Beta cycle, Beta47+
Attachment #8743423 -
Flags: approval-mozilla-beta? → approval-mozilla-beta+
Attachment #8743428 -
Flags: approval-mozilla-beta? → approval-mozilla-beta+
Attachment #8743429 -
Flags: approval-mozilla-beta? → approval-mozilla-beta+
Attachment #8743956 -
Flags: approval-mozilla-beta? → approval-mozilla-beta+
Assignee | ||
Comment 31•8 years ago
|
||
This needs some re-basing.
Assignee | ||
Comment 32•8 years ago
|
||
remote: https://hg.mozilla.org/releases/mozilla-beta/rev/1e8087ee0087 remote: https://hg.mozilla.org/releases/mozilla-beta/rev/03c07da147aa remote: https://hg.mozilla.org/releases/mozilla-beta/rev/d7d52a3b03f4 remote: https://hg.mozilla.org/releases/mozilla-beta/rev/31722794d6c2
You need to log in
before you can comment on or make changes to this bug.
Description
•