Clients.matchAll should not return Clients with inactive documents

RESOLVED FIXED in Firefox 47

Status

()

defect
RESOLVED FIXED
3 years ago
3 years ago

People

(Reporter: bkelly, Assigned: bkelly)

Tracking

(Blocks 1 bug)

48 Branch
mozilla48
Points:
---
Dependency tree / graph

Firefox Tracking Flags

(firefox45 wontfix, firefox46 wontfix, firefox47+ fixed, firefox48+ fixed, firefox-esr45 disabled)

Details

(Whiteboard: btpp-active)

Attachments

(3 attachments, 8 obsolete attachments)

5.68 KB, patch
bkelly
: review+
Details | Diff | Splinter Review
6.58 KB, patch
bkelly
: review+
Details | Diff | Splinter Review
4.34 KB, patch
bkelly
: review+
Details | Diff | Splinter Review
+++ This bug was initially created as a clone of Bug #1263204 +++

With the test case in bug 1263204 you can see the matchAll() list continue to grow as you refresh.  This is because we are exposing clients in the bfcache.  We need to only expose clients when they are active.
Depends on: 1265795
This is my attempt at a test, but it doesn't trigger the problem.  I might need a browser chrome test.
This test case does provoke the problem.  It seems its necessary to trigger a refresh from the browser chrome instead of using window.reload().
This is my current fix.  It switches us from using the observer service list to instead using the explicit list of documents in SWM.  It does pre-generate the UUID for each document, so I am running talos to see if there are any major regressions:

  https://treeherder.mozilla.org/#/jobs?repo=try&revision=9aea950b92f3
  https://treeherder.mozilla.org/perf.html#/comparechooser?newProject=try&newRevision=9aea950b92f3
Ehsan convinced me to stick with the current observer list design.  This patch just implements the changes necessary to work with that setup.  It also keeps the lazy evaluation of the document ID.

https://treeherder.mozilla.org/#/jobs?repo=try&revision=b00d83d0c3c0
Attachment #8744431 - Attachment is obsolete: true
Updated to assert that we don't find multiple documents with the same ID.

  https://treeherder.mozilla.org/#/jobs?repo=try&revision=41570b946eff
Attachment #8744464 - Attachment is obsolete: true
This expands the test you reviewed over in bug 1265795 to also verify uncontrolled client windows.

Unfortunately it does not reproduce the problem in this bug.  It seems the issue here is only triggered by a browser chrome refresh and not window.reload().  I think the test is worth keeping, though.
Attachment #8743540 - Attachment is obsolete: true
Attachment #8744473 - Flags: review?(bzbarsky)
This test does reproduce the problem in this bug.  As described in comment 0 we are seeing uncontrolled clients show up in Clients.matchAll() every time the page is refreshed.  The old windows are being picked up in the list incorrectly.

To test this I am expanding and abusing our browser chrome mochitest browser_force_refresh.js.  The original purpose of this test was to verify a hard refresh bypasses service workers.  This patch adds some additional normal refreshes to the test.  On each refresh we verify that we only get the windows we expect to see.
Attachment #8744039 - Attachment is obsolete: true
Attachment #8744474 - Flags: review?(bzbarsky)
Comment on attachment 8744470 [details] [diff] [review]
P1 Only store active documents in the global observer list. r=bz

This patch fixes the problem.

The way Caches.matchAll() works is that each document registers itself with observer service under a particular topic.  The ServiceWorkerManager then enumerates the documents under this topic and returns any with an associated window.

The fix is only to register active documents in the observer service and to remove them when they become inactive.  Apparently when you refresh a page the document can become inactive, but still have a window.

While I'm here I fixes a few extra things:

* Made the observer service topic not be specific to service workers. In theory other code could use this list of active documents.
* Only register documents for principals we can actually observe.  So ignore system and null principal documents.
* Don't unnecessary generate document IDs when the observer topic is fired.
* Verify we only find one matching document when the observer topic is fired.
Attachment #8744470 - Flags: review?(bzbarsky)
Summary: Clients.matchAll should not return windows in bfcache → Clients.matchAll should not return Clients with inactive documents
Blocks: 1266857
Comment on attachment 8744470 [details] [diff] [review]
P1 Only store active documents in the global observer list. r=bz

I'm not a huge fan of using the word "active" to describe a state which may or may not match the spec's "active document" state.  In particular, it's quite non-obvious why null-checking mScriptGlobalObject is equivalent to nsIDocument::IsCurrentActiveDocument(), which aims to implement the spec concept of "active".

I would prefer we just keep service-worker-specific naming, if that's the only thing that cares about this observer type.  Maybe "get-doc-service-workers-care-about" or something?

r=me with that.
Attachment #8744470 - Flags: review?(bzbarsky) → review+
Comment on attachment 8744473 [details] [diff] [review]
P2 Expand navigate-window.https.html wpt test to cover uncontrolled windows. r=bz

>+  win.reload();

win.location.reload()?  Or something else?  There's no reload function on Window.

>+promise_test(function(t) {

Shouldn't this function return the promise?  And not make manual done() calls on t like this one seems to be doing via service_worker_unregister_and_done?

If you're not doing either of those things, why make this a promise_test?

Same for the other test you're adding.
Attachment #8744473 - Flags: review?(bzbarsky) → review-
Comment on attachment 8744474 [details] [diff] [review]
P3 Expand browser_force_refresh.js to verify Clients.matchAll() behavior on refresh. r=bz

Might have been nice to propagate the failure string out.  Say by using CustomEvent instead of Event and putting the string in .detail.  Then you can use it in the ok(false) call...

r=me either way.
Attachment #8744474 - Flags: review?(bzbarsky) → review+
Updated to:

1) Use window.location.reload()
2) Return test promise

I did not remove the explicit done() because it doesn't seem to cause a problem and its convenient to use the unregister_and_done() helper.

With these changes this test *does* trigger the problem in comment 0.  The P1 patch allows the test pass.
Attachment #8744473 - Attachment is obsolete: true
Comment on attachment 8744496 [details] [diff] [review]
P2 Expand navigate-window.https.html wpt test to cover uncontrolled windows. r=bz

See comment 12 for changes.
Attachment #8744496 - Flags: review?(bzbarsky)
Comment on attachment 8744496 [details] [diff] [review]
P2 Expand navigate-window.https.html wpt test to cover uncontrolled windows. r=bz

r=me, I guess.  I'm hoping wpt is smart about multiple done() calls on a test.
Attachment #8744496 - Flags: review?(bzbarsky) → review+
I'll remove the explicit dones before landing.
Updated to pass error message in CustomEvent.detail now that I know thats possible.
Attachment #8744474 - Attachment is obsolete: true
Attachment #8744534 - Flags: review+
Duplicate of this bug: 1263204
[Tracking Requested - why for this release]:
This is a compat issue that someone writing a unit test for their js framework noticed (see bug 1263204).  If we can uplift to FF47 we should.  Ideally we should uplift bug 1265795 first, though.
I was able to reproduce the failure consistently with rr chaos mode on linux.  The issue is I dropped a couple conditional checks before adding the strong ref.  If I restore those checks:

  mDocumentContainer && GetChannel()

Then the leak is fixed.  In this particular case we were adding strong ref's to documents without a channel, which are apparently not cleaned up with a SetScriptGlobalObject(nullptr).

Since these checks were already in the code previously and I'd like to get this landed before the merge, I'm just going to fold the fix in and re-land.
Try looks good.  I did 12 pushes of Win7 debug M(bc6) and didn't reproduce the failure.  Since trees are closed atm, marking checkin-needed.
Keywords: checkin-needed
Hi Ben, do you think this fix is stable enough and well tested now to consider uplifting to Beta47 and Aurora48?
Flags: needinfo?(bkelly)
Comment on attachment 8744627 [details] [diff] [review]
P1 Only store active documents in the global observer list. r=bz

Approval Request Comment
[Feature/regressing bug #]: service workers
[User impact if declined]: We incorrectly show inactive documents from the current window's history in Clients.matchAll().  This is a bad compat issue since these windows are not actually functional.  This could confuse js script running in the wild.
[Describe test coverage new/current, TreeHerder]: Automated tests included.
[Risks and why]: Moderate due to needing bug 1265795 uplifted first.  This code also moves some things around in nsDocument, but is mostly isolated to service worker issues.  We believe the code is correct, passes tests, and has had some bake time on nightly/aurora.
[String/UUID change made/needed]: None.
Flags: needinfo?(bkelly)
Attachment #8744627 - Flags: approval-mozilla-beta?
Comment on attachment 8744534 [details] [diff] [review]
P3 Expand browser_force_refresh.js to verify Clients.matchAll() behavior on refresh. r=bz

See comment 29.
Attachment #8744534 - Flags: approval-mozilla-beta?
Comment on attachment 8744531 [details] [diff] [review]
P2 Expand navigate-window.https.html wpt test to cover uncontrolled windows. r=bz

See comment 29.
Attachment #8744531 - Flags: approval-mozilla-beta?
(In reply to Ritu Kothari (:ritu) from comment #28)
> Hi Ben, do you think this fix is stable enough and well tested now to
> consider uplifting to Beta47 and Aurora48?

We don't need to uplift to aurora48 since it originally landed in 48.
Comment on attachment 8744627 [details] [diff] [review]
P1 Only store active documents in the global observer list. r=bz

Baked in Nightly for a week, improves SW robustness, Beta47+
Attachment #8744627 - Flags: approval-mozilla-beta? → approval-mozilla-beta+
Attachment #8744534 - Flags: approval-mozilla-beta? → approval-mozilla-beta+
Attachment #8744531 - Flags: approval-mozilla-beta? → approval-mozilla-beta+
You need to log in before you can comment on or make changes to this bug.