Closed Bug 1469873 Opened 6 years ago Closed 6 years ago

Crash in mozilla::dom::ClientSource::SetController

Categories

(Core :: DOM: Service Workers, defect)

defect
Not set
critical

Tracking

()

RESOLVED FIXED
mozilla63
Tracking Status
firefox-esr52 --- unaffected
firefox-esr60 --- unaffected
firefox61 --- unaffected
firefox62 --- fixed
firefox63 --- fixed

People

(Reporter: MatsPalmgren_bugz, Assigned: bkelly)

References

Details

(Keywords: crash, regression)

Crash Data

Attachments

(1 file, 4 obsolete files)

This bug was filed from the Socorro interface and is
report bp-5d2e123a-7272-4c3f-b02f-0e9e50180620.
=============================================================

MOZ_CRASH Reason:  MOZ_RELEASE_ASSERT(Info().URL().LowerCaseEqualsLiteral("about:blank") || StringBeginsWith(Info().URL(), static_cast<const nsLiteralCString&>(nsLiteralCString("" "blob:"))) || nsContentUtils::StorageAllowedForWindow(GetInnerWindow()) == nsContentUtils::StorageAccess::eAllow)


Top 10 frames of crashing thread:

0 XUL mozilla::dom::ClientSource::SetController dom/clients/manager/ClientSource.cpp:399
1 XUL mozilla::dom::ClientSource::Control dom/clients/manager/ClientSource.cpp:434
2 XUL void mozilla::dom::ClientSourceOpChild::DoSourceOp<RefPtr<mozilla::MozPromise<mozilla::dom::ClientOpResult, nsresult, false> >  dom/clients/manager/ClientSourceOpChild.cpp:45
3 XUL mozilla::dom::ClientSourceChild::RecvPClientSourceOpConstructor dom/clients/manager/ClientSourceChild.cpp:49
4 XUL mozilla::dom::PClientSourceChild::OnMessageReceived ipc/ipdl/PClientSourceChild.cpp:252
5 XUL mozilla::ipc::MessageChannel::DispatchAsyncMessage ipc/glue/MessageChannel.cpp:2134
6 XUL mozilla::ipc::MessageChannel::DispatchMessage ipc/glue/MessageChannel.cpp:2064
7 XUL mozilla::ipc::MessageChannel::MessageTask::Run ipc/glue/MessageChannel.cpp:1943
8 XUL nsThread::ProcessNextEvent xpcom/threads/nsThread.cpp:1059
9 XUL NS_ProcessPendingEvents xpcom/threads/nsThreadUtils.cpp:461

=============================================================
Flags: needinfo?(bkelly)
Mats, did you run into this?  Do you have any extensions installed or were you adjusting cookie permissions/prefs?
Flags: needinfo?(bkelly) → needinfo?(mats)
No, I just found that there are quite a few of these assertions
reported at https://crash-stats.mozilla.com
Flags: needinfo?(mats)
the signature started taking off on 62.0a1 build 20180605100102 - the changelog to the prior nightly build is: https://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=d8f180&tochange=a35875
two service worker patches that would stick out there are bug 1441932 and bug 1462069.
Keywords: regression
Thanks.  That helps narrow down the issue.  I'll look on Monday.
Flags: needinfo?(bkelly)
I believe the issue is that the ClientSource::Claim() method used to eventually call through this code:

https://searchfox.org/mozilla-central/source/dom/serviceworkers/ServiceWorkerManager.cpp#1586-1601

Which checks for storage allowed.  But now it ends up going through this which does not:

https://searchfox.org/mozilla-central/source/dom/serviceworkers/ServiceWorkerManager.cpp#1603-1615

We can't just add a check into this new method, unfortunately.  It needs to be able to run in the parent process which does not have access to stuff like nsContentUtils::StorageAllowedForDocument().

I'll have to think of the right solution.  I can get a short term fix soon, but the e10s compatible fix may take longer.
Assignee: nobody → bkelly
Status: NEW → ASSIGNED
Note to self, I need to make this get set on the return of operation success instead of eagerly before contacting the ClientSource:

https://searchfox.org/mozilla-central/source/dom/clients/manager/ClientSourceParent.cpp#282-287
Comment on attachment 8987599 [details] [diff] [review]
Make ClientSource::Control() fail if storage access is not available and make claim() respect the result. r=mrbkap

Blake, this is crash and fix touch a somewhat complicated intersection of features.  Let me provides some background:

Currently we disable service workers in any window that is blocked from storing cookies or otherwise accessing storage.  The reasoning is that registering a service worker requires storing state on disk and also once you get into a service worker context you have escaped things like 3rd party iframe status which might effect storage policy.  Its safest to block service worker access if storage is blocked.

As a complication there are a number of ways storage can be blocked:

a. private browsing mode
b. cookie preference values
c. per-site permissions that override preference values

In addition, its possible for things like permissions to change after a page is loaded.  We do our best to handle this, but I imagine we are inconsistent in some places.

Previously we did a bad job with this service worker storage blocking.  To help improve things we added some diagnostic assertions to catch any corner cases.

This crash bug is about one of these storage allowed diagnostic assertions failing.  Specifically, the assertion around setting the client controlled by a service worker when storage permissions are denied.

This is further complicated by the fact that "client is controlled" state is somewhat spread across components currently.  We have:

1. ClientSource mController field.  This must be set for synchronous navigator.serviceWorker.controller access by content script.
2. ServiceWorkerManager's list of controlled ClientHandle objects.  This is nececssary for SWM to know when its safe to promote a waiting worker to active.  In legacy mode this is stored in every process and in our new e10s mode just in the parent.
3. ClientSourceParent also has a mController value that ClientManagerService checks for things like clients.matchAll().  In theory this could be removed in favor of sending messages to the ClientSource or the SWM to check.  We don't have time for that refactoring right now, though.

So with that out of the way... Why is this assertion triggering?

The reason is that a service worker is calling clients.claim() to control all windows/worker that match its URL scope, but there is some window that has storage denied.  Probably a 3rd party iframe.

Previously this situation was handled by always checking for storage access in ClientSource::Claim().  This method is only called in legacy child-process mode, though.  In our new parent-process mode we do the claim completely in the parent process and just use ClientHandle::Control() to tell the ClientSource.  So we can't rely on the old code's approach to checking storage access.  We also don't have the storage access state available in the parent.

This patch fixes the problem a few ways:

1. ClientSource::Control() now checks storage access permissions and rejects its promise if the check fails.
2. ServiceWorkerManager now checks to see if the ClientHandle::Control() promise rejects and updates its state appropriately.
3. ClientSourceOpParent now checks for rejection of a Control operation and clears the ClientSourceParent mController state as appropriate.
4. ClientManagerService::Claim() now calls scopeExit.release() appropriately in the parent-process case to avoid rejecting its operation incorrectly.  This is just a bug that we were mostly ignoring before, but now matters.  Note, this rejection is on a different promise than the above chain, but the timing now matters more.
5. Since we now require explicit control success I also had to make ClientSource::Claim() function fully on worker threads.  This means sending a message to the SWM in that case to perform the control operation.  Previously we were just marking things controlled locally without telling the SWM at all.

Anyway, I hope that isn't too much information.  Let me know if you want me to clarify any of this.
Attachment #8987599 - Flags: review?(mrbkap)
Attachment #8987599 - Flags: review?(mrbkap) → review+
Pushed by bkelly@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/b610acdb4ead
Make ClientSource::Control() fail if storage access is not available and make claim() respect the result. r=mrbkap
Backout by cbrindusan@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/2259143ecb07
Backed out changeset b610acdb4ead for mochitest crashes on xul.dll. CLOSED TREE
Any link to the failures that triggered this backout?
Flags: needinfo?(cbrindusan)
It looks like the windows compiler is doing a return-value move optimization on the ref variable.  This then plays badly with the ScopeExit that tries to use the ref value as well.  I'll just try removing ScopeExit here for now.
Flags: needinfo?(bkelly)
Pushed by bkelly@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/54ab67a48fb2
Make ClientSource::Control() fail if storage access is not available and make claim() respect the result. r=mrbkap
https://hg.mozilla.org/mozilla-central/rev/54ab67a48fb2
Status: ASSIGNED → RESOLVED
Closed: 6 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla63
Lets verify the crashes are resolved in nightly before uplifting to beta.
the issue looks indeed fixed in recent nightly builds: https://crash-stop.herokuapp.com/bug.html?id=1469873
(In reply to Ben Kelly [:bkelly] from comment #21)
> Lets verify the crashes are resolved in nightly before uplifting to beta.

Patch seems to have fixed the issue on nightly, Ben could you request uplifting to Beta please? thanks
Flags: needinfo?(bkelly)
Let me make sure it applies cleanly to beta.  It should since branch was so recent, but I've also been landing a lot of stuff in this area...
Blocks: 1462069
Flags: needinfo?(ben)
Comment on attachment 8988580 [details] [diff] [review]
Make ClientSource::Control() fail if storage access is not available and make claim() respect the result. r=mrbkap

Approval Request Comment
[Feature/Bug causing the regression]: Bug 1462069
[User impact if declined]: Diagnostic assertions in nightly/dev-edition if the user is using cookie blocking permissions.  In beta/release there will be privacy leaks instead of the assertions.
[Is this code covered by automated tests?]: We have tests for service workers and for cookie permissions, but none that cover this exact case.
[Has the fix been verified in Nightly?]: Yes
[Needs manual test from QE? If yes, steps to reproduce]:  None
[List of other uplifts needed for the feature/fix]: None
[Is the change risky?]: Moderate
[Why is the change risky/not risky?]: Its a moderate sized patch in some complicated code.
[String changes made/needed]: None
Attachment #8988580 - Flags: approval-mozilla-beta?
Comment on attachment 8988580 [details] [diff] [review]
Make ClientSource::Control() fail if storage access is not available and make claim() respect the result. r=mrbkap

Crash fix, verified in beta, may be a little risky but let's try it since we're in early beta. Should land for beta 5 or 6.
Attachment #8988580 - Flags: approval-mozilla-beta? → approval-mozilla-beta+
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: