Closed Bug 1495285 Opened 6 years ago Closed 6 years ago

Crash in mozilla::dom::ClientSource::WindowExecutionReady

Categories

(Core :: DOM: Service Workers, defect, P1)

Unspecified
All
defect

Tracking

()

RESOLVED FIXED
mozilla64
Tracking Status
firefox-esr60 --- unaffected
firefox62 --- unaffected
firefox63 + wontfix
firefox64 blocking fixed

People

(Reporter: BenWa, Assigned: baku)

References

Details

(Keywords: crash, topcrash)

Crash Data

Attachments

(3 files, 2 obsolete files)

This bug was filed from the Socorro interface and is
report bp-dfff99f7-e92d-4c0e-8aff-3e06f0180930.
=============================================================

Top 10 frames of crashing thread:

0 xul.dll mozilla::dom::ClientSource::WindowExecutionReady dom/clients/manager/ClientSource.cpp:286
1 xul.dll nsGlobalWindowOuter::SetNewDocument dom/base/nsGlobalWindowOuter.cpp:1950
2 xul.dll nsresult nsDocumentViewer::InitInternal layout/base/nsDocumentViewer.cpp:1032
3 xul.dll nsDocumentViewer::Init layout/base/nsDocumentViewer.cpp:771
4 xul.dll nsresult nsDocShell::Embed docshell/base/nsDocShell.cpp:6706
5 xul.dll nsDSURIContentListener::DoContent docshell/base/nsDSURIContentListener.cpp:196
6 xul.dll bool nsDocumentOpenInfo::TryContentListener uriloader/base/nsURILoader.cpp:759
7 xul.dll nsDocumentOpenInfo::OnStartRequest uriloader/base/nsURILoader.cpp:306
8 xul.dll void mozilla::net::HttpChannelChild::DoOnStartRequest netwerk/protocol/http/HttpChannelChild.cpp:771
9 xul.dll void mozilla::net::HttpChannelChild::OnStartRequest netwerk/protocol/http/HttpChannelChild.cpp:638

=============================================================

This is reproducable by loading this site and waiting 2 seconds:
https://forums.2k.com/showthread.php?86547-Space-Race-launch-location&highlight=spaceship
Blocks: 1441133
Flags: needinfo?(bugmail)
Priority: -- → P1
Andrew, could you take a look at this?  MOZ_DIAGNOSTIC_ASSERT firing is really bad.
Assignee: nobody → bugmail
see also bug 1487227, recently resolved for the same stack (as a duplicate of the fixed bug 1493211)
Just hit it again and can reproduce it on this URL:
https://knowyourmeme.com/memes/press-f-to-pay-respects

Nightly 64.0a1 (2018-09-30) (64-bit)
I ran a mozregression on it and it doesn't reproduce there, even if I reuse my profile. On my main profile it will reproduce even if I restart the browser and go to the URL immediately.
The signature is ranked #1 in content top crashers for nightly
Keywords: topcrash
OS: Windows 10 → All
[Tracking Requested - why for this release]:
We have around 10000 crashes on the last 7 days in nightly.
I think we need a fix or a backout today.
As :gsnedders notes in comment 2, this is basically the same situation as bug 1487227.  Tracking protection induces asymmetries in the storage-allowed checks between channel and window.

This rush of crashes was due to turning on tracking protection by default on nightly in bug 1492563 and has ceased again in build 20180930220130 with the landing of the backout commit https://hg.mozilla.org/mozilla-central/rev/463e670ab9dd in https://bugzilla.mozilla.org/show_bug.cgi?id=1492563#c18

We can probably expect some level of crashes to continue for those who have explicitly enabled.

We can downgrade the diagnostic asserts but they are capturing an important invariant violation:
- The channel check said it's okay to use storage.
- The window check said it's not okay to use storage.

I'll see if there's a clean way for us to change the channel check into a window check.
Status: NEW → ASSIGNED
Component: DOM → DOM: Service Workers
Flags: needinfo?(bugmail)
FWIW: I have repeatedly seen this crash with network.cookie.cookieBehavior;0 (i.e., the default), and I've never changed the pref.
(In reply to Geoffrey Sneddon [:gsnedders] from comment #9)
> FWIW: I have repeatedly seen this crash with network.cookie.cookieBehavior;0
> (i.e., the default), and I've never changed the pref.

Hm, that's more than a little concerning.  Are you sure you weren't impacted by the default being changed on nightly for a few days?  (If your setting was the default of 0, then when the default was changed you would end up using the new default unless you explicitly changed the setting back to 0 when the default value was changed on nightly.)

If you could share some of your crash-ids from "about:crashes" that would be useful.  You can send them to me privately at asuth@mozilla.com.

Related questions might be if you use any WebExtensions that might do anything that would intercept requests with the webRequest API or use the cookies' APIs to automatically set the cookie preferences on a per-site basis.  Thanks!
Flags: needinfo?(geoffers+mozilla)
(In reply to Andrew Sutherland [:asuth] from comment #10)
> (In reply to Geoffrey Sneddon [:gsnedders] from comment #9)
> > FWIW: I have repeatedly seen this crash with network.cookie.cookieBehavior;0
> > (i.e., the default), and I've never changed the pref.
> 
> Hm, that's more than a little concerning.  Are you sure you weren't impacted
> by the default being changed on nightly for a few days?  (If your setting
> was the default of 0, then when the default was changed you would end up
> using the new default unless you explicitly changed the setting back to 0
> when the default value was changed on nightly.)
> 
> If you could share some of your crash-ids from "about:crashes" that would be
> useful.  You can send them to me privately at asuth@mozilla.com.
> 
> Related questions might be if you use any WebExtensions that might do
> anything that would intercept requests with the webRequest API or use the
> cookies' APIs to automatically set the cookie preferences on a per-site
> basis.  Thanks!

So the pref will have got changed when it was the default, but I thought I'd actually had crashes *after* that got backed out. Looking again that doesn't seem to be the case (last crash in the last build before the back out). Hmm. No idea why I thought otherwise now!

Sorry for the noise!
Flags: needinfo?(geoffers+mozilla)
It looks like the change for bug 1492563 got pushed again, and now nightly (on Linux, at least) is crashing again as of build id 20181002220140.
I've attached the browser test that triggers the assertion, and here's the MOZ_LOG=AntiTracking:5 output from the segment of the test that triggers the actual assertion.
Flags: needinfo?(amarchesini)
Attached patch patch (obsolete) — Splinter Review
The idea behind this patch is:

1. When we need to check if a channel should be intercepted or not...
2. if the current prefs/permission require the channel to be marked as tracking resource...
3. let's create a dummy channel (TrackingDummyChannel) on the parent side
4. let's check if that dummy channel is a tracking resource
5. update the original HttpChannelChild
6. evaluate if we should intercept the channel again.

This introduces an extra jump on the parent process, but it happens only if cookieBahavior/permission needs it. All this code will go away when the interception will be done on the parent side only.
Assignee: bugmail → amarchesini
Flags: needinfo?(amarchesini)
Attachment #9014705 - Flags: review?(francois)
Attachment #9014705 - Flags: feedback?(bugmail)
Comment on attachment 9014705 [details] [diff] [review]
patch

Review of attachment 9014705 [details] [diff] [review]:
-----------------------------------------------------------------

::: netwerk/protocol/http/HttpBaseChannel.cpp
@@ -324,5 @@
>    LOG(("HttpBaseChannel::SetIsTrackingResource thirdparty=%d %p",
>         static_cast<int>(aIsThirdParty), this));
>  
>    if (aIsThirdParty) {
> -    MOZ_ASSERT(!mIsFirstPartyTrackingResource);

Why do we need to remove these asserts?

They were added to ensure that we can't mark a channel as both a third-party tracker and a first-party tracker.

::: toolkit/components/antitracking/AntiTrackingCommon.cpp
@@ +1245,5 @@
>    cc->SendStoreUserInteractionAsPermission(IPC::Principal(aPrincipal));
>  }
> +
> +/* static */ bool
> +AntiTrackingCommon::ChannelNeedsToBeAnnotated(nsIHttpChannel* aChannel,

Are the annotations only needed for cookie restrictions?

What about the other users of tracking annotations? Aren't they going to be running into the same problem where intercepted channels are annotated too late?
Attachment #9014705 - Flags: review?(francois)
> > -    MOZ_ASSERT(!mIsFirstPartyTrackingResource);
> 
> Why do we need to remove these asserts?

It was left from an a previous version of the patch where I didn't store the 3rd/1st party information in the dummy channel.

> Are the annotations only needed for cookie restrictions?

For the purpose of this bug, yes, but we can generalize it to mark any channel, when should be intercepted, if the pref privacy.trackingprotection.annotate_channels is set to true.
Attached patch sw_0.patch (obsolete) — Splinter Review
Attachment #9014705 - Attachment is obsolete: true
Attachment #9014705 - Flags: feedback?(bugmail)
Attachment #9014994 - Flags: review?(francois)
Attachment #9014994 - Flags: feedback?(bugmail)
Attached patch sw_0.patchSplinter Review
Better approach in which I use the original nsILoadInfo in the dummy channel.
Attachment #9014994 - Attachment is obsolete: true
Attachment #9014994 - Flags: review?(francois)
Attachment #9014994 - Flags: feedback?(bugmail)
Attachment #9014995 - Flags: review?(francois)
Attachment #9014995 - Flags: feedback?(bugmail)
Comment on attachment 9014995 [details] [diff] [review]
sw_0.patch

Review of attachment 9014995 [details] [diff] [review]:
-----------------------------------------------------------------

This is a good solution to an ugly problem, thank you!  I propose some documentation and additional guards below.

::: netwerk/protocol/http/HttpChannelChild.cpp
@@ +2726,5 @@
> +                                        shouldUpgrade);
> +      intercepted->NotifyController();
> +    };
> +
> +    TrackingDummyChannel::StorageAllowedState state =

I think you should be bypassing the StorageAllowed check for cases for which nsContentUtils::IsNonSubresourceRequest() returns false (which for our current implementation purposes basically means a navigation request).  As you can see at the check for this scenario at https://searchfox.org/mozilla-central/rev/4b6737ba4008b71182cc2a12f507c82adf2b4e70/dom/serviceworkers/ServiceWorkerInterceptController.cpp#31, whether we attempt to fetch a sub-resource exclusively depends on whether or not there's a controller associated with the loadInfo (and conceptually, client).

Now, there is a nuance here.  Just because a client is controlled doesn't mean the channel will be intercepted.  If the SW has no 'fetch' event handler registered or doesn't respondWith() to the event, then the request will go to the parent.  And in that case we end up with the same potential issue again where in the client we didn't have the annotation but in the parent we did.  However, we don't care in this case because it's not actually impacting decision-making.  We only care for navigation requests.

And this is likely to matter for real-world purposes.  I understand via back-channels that the overhead of our no-fetch optimization case is already concerning to some sites.  This would add additional latency to that common scenario where a SW is registered for push notifications at the root scope.  (Which is arguably foolish, but I guess we encouraged sites to do that somehow?)

::: netwerk/protocol/http/PTrackingDummyChannel.ipdl
@@ +9,5 @@
> +
> +namespace mozilla {
> +namespace net {
> +
> +protocol PTrackingDummyChannel

Please add a block comment here or someplace else obvious in the patch that explains why the channel exists, when it can go away, etc.

Here's a possible start:

Provides a mechanism for the "child intercept" mode of ServiceWorker operation to work correctly with Tracking Protection annotations.  ServiceWorkers should not be allowed for third-party iframes which are annotated as tracking origins.

In child intercept mode, the decision to intercept a channel is made in the child process without consulting the parent process.  The decision is based on whether there is a ServiceWorker with a scope covering the URL in question and whether storage is allowed for the origin/URL.  When the "network.cookie.cookieBehavior" preference is set to BEHAVIOR_REJECT_TRACKER, annotated channels are denied storage which means that the ServiceWorker should not intercept the channel.  However, the decision for tracking protection to annotate a channel only occurs in the parent process.  The dummy channel is a hack to allow the intercept decision process to ask the parent process if the channel should be annotated.  Because this round-trip to the parent has overhead, the dummy channel is only created 1) if the ServiceWorker initially determines that the channel should be intercepted and 2) it's a navigation request.

This hack can be removed once Bug 1231208's new "parent intercept" mechanism fully lands, the pref is enabled by default it stays enabled for long enough to be confident we will never need/want to turn it off.  Then as part of bug 1496997 we can remove this implementation.  Bug TBD covers removing this hack in particular.

::: netwerk/protocol/http/TrackingDummyChannel.cpp
@@ +19,5 @@
> +
> +namespace {
> +
> +bool
> +ChannelNeedsToBeAnnotated()

Please either augment the logic here or add a comment here that explains the rationale behind checking only on "privacy.trackingprotection.annotate_channels" and when it would possibly be disabled.  It seems like that preference is going to be on all of the time per https://searchfox.org/mozilla-central/rev/4b6737ba4008b71182cc2a12f507c82adf2b4e70/modules/libpref/init/all.js#1366 setting it and no UI existing to set it to false (other than about:config).

In particular, it seems like the decision for storage being denied due to annotation is a product of:
1. The channel being annotated as a tracker, which is dependent on the "privacy.trackingprotection.annotate_channels" pref.
2. The channel being associated with an iframe which is a context thing.
3. The cookie behavior for the origin being set to BEHAVIOR_REJECT_TRACKER=4 which is dependent on the global "network.cookie.cookieBehavior" pref and the "cookie" permission as tracked by nsIPermissionManager.

Without a comment it's not clear why the 3rd case is not being factored into this decision given that the cookie behavior preference is available globally and the ServiceWorkerManagerService propagates the permissions for the origin to the content process at https://searchfox.org/mozilla-central/rev/4b6737ba4008b71182cc2a12f507c82adf2b4e70/dom/serviceworkers/ServiceWorkerManagerService.cpp#122 in addition to the preload opportunity at https://searchfox.org/mozilla-central/rev/4b6737ba4008b71182cc2a12f507c82adf2b4e70/dom/ipc/ContentParent.cpp#5571 also propagating the permissions.

::: netwerk/protocol/http/TrackingDummyChannelParent.cpp
@@ +25,5 @@
> +                                 nsILoadInfo* aLoadInfo)
> +{
> +  MOZ_ASSERT(mIPCActive);
> +
> +  RefPtr<TrackingDummyChannelParent> self =this;

nit: missing whitespace
Attachment #9014995 - Flags: feedback?(bugmail) → feedback+
Comment on attachment 9014995 [details] [diff] [review]
sw_0.patch

Review of attachment 9014995 [details] [diff] [review]:
-----------------------------------------------------------------

The parts that have to do with the classification look good to me, but please ask a Necko peer to review as well.
Attachment #9014995 - Flags: review?(francois) → review+
Attachment #9014995 - Flags: review?(honzab.moz)
Comment on attachment 9014995 [details] [diff] [review]
sw_0.patch

Review of attachment 9014995 [details] [diff] [review]:
-----------------------------------------------------------------

r+ with comments addressed

please, please, please..  start adding comments to your patches :)  it really is important and makes all our lives (including yours!) better.  thanks.

::: dom/serviceworkers/test/browser_antitracking.js
@@ +77,5 @@
>        return payload;
>      }
>    );
>  
> +  ok(!controlled, "Should not be controlled!");

this is actually demonstrating the fix? :)

::: netwerk/protocol/http/TrackingDummyChannel.h
@@ +19,5 @@
> +
> +namespace mozilla {
> +namespace net {
> +
> +class TrackingDummyChannel final : public nsIChannel

add comments where and how this class is used

::: netwerk/protocol/http/TrackingDummyChannelChild.cpp
@@ +25,5 @@
> +    return false;
> +  }
> +
> +  URIParams uriParams;
> +  SerializeURI(aURI, uriParams);

I heard rumors we can send uris directly now

@@ +78,5 @@
> +
> +  bool storageGranted =
> +    AntiTrackingCommon::IsFirstPartyStorageAccessGrantedFor(httpChannel, mURI,
> +                                                            nullptr);
> +  mCallback(storageGranted);

are there any calling thread requirements?

::: netwerk/protocol/http/TrackingDummyChannelParent.cpp
@@ +42,5 @@
> +    new nsChannelClassifier(channel);
> +
> +  bool willCallback =
> +    NS_SUCCEEDED(channelClassifier->CheckIsTrackerWithLocalTable(
> +      [self, channel]() {

these days you can use self{std::move(self)}

::: toolkit/components/antitracking/AntiTrackingCommon.cpp
@@ -769,5 @@
>  
> -  if (NS_WARN_IF(NS_FAILED(rv) || !channelPrincipal)) {
> -    LOG(("No channel principal, bail out early"));
> -    return false;
> -  }

I don't think this change is correct.  we use the information of channelPrincipal == nullptr to find the toplevelPrincipal.  

if you believe this change is correct, then add a good comment why it's fine to be moved.
Attachment #9014995 - Flags: review?(honzab.moz) → review+
Blocks: 1498259
Comments added. Mainly taken from the asuth's comment #22.

> are there any calling thread requirements?

Assertions Added. Yes, this is main-thread only.

> > -  if (NS_WARN_IF(NS_FAILED(rv) || !channelPrincipal)) {
> 
> I don't think this change is correct.  we use the information of
> channelPrincipal == nullptr to find the toplevelPrincipal.  

reverted. It was a completely unrelated change.
Pushed by amarchesini@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/f6d88f69e849
browser test to check how ServiceWorkers intercept tracking resources, r=baku
https://hg.mozilla.org/integration/mozilla-inbound/rev/75ed7377db61
Introduce TrackingDummyChannel to annotate channels before being intercepted by ServiceWorkers, r=francois, r=mayhemer, f=asuth
https://hg.mozilla.org/mozilla-central/rev/f6d88f69e849
https://hg.mozilla.org/mozilla-central/rev/75ed7377db61
Status: ASSIGNED → RESOLVED
Closed: 6 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla64
See Also: → 1498723
Flags: in-testsuite+
Tracking for 63. The crash volume on 63 beta is low but if we have a dot release in between 63 and 64, it might be a candidate for a ride-along uplift depending on the volume of crashes we get on the release channel.
So it appears we still have crashes on nightly after this patch landed, and a spike in the 10-17 build (116 crashes) - See Bug 1499995 where a user is crashing on theverge.com site. Should I reopen this or track this in Bug 1499995?
Flags: needinfo?(amarchesini)
Let's work on the other bug. It seems that the fix for this bug has reduced the number of crashes significantly. But something is still there. I prefer to work on a new fresh bug.
Flags: needinfo?(amarchesini)
Depends on: 1607518
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: