Closed
Bug 1499995
Opened 6 years ago
Closed 6 years ago
Crash in mozilla::dom::ClientSource::WindowExecutionReady
Categories
(Core :: DOM: Content Processes, defect)
Tracking
()
RESOLVED
FIXED
mozilla65
People
(Reporter: josh.tumath+bugzilla, Assigned: baku)
References
Details
(Keywords: crash, regression, topcrash, Whiteboard: [privacy65])
Crash Data
Attachments
(3 files, 1 obsolete file)
5.36 KB,
patch
|
ehsan.akhgari
:
review+
RyanVM
:
approval-mozilla-beta+
|
Details | Diff | Splinter Review |
19.79 KB,
patch
|
ehsan.akhgari
:
review+
mayhemer
:
review+
|
Details | Diff | Splinter Review |
9.82 KB,
patch
|
ehsan.akhgari
:
review+
|
Details | Diff | Splinter Review |
I am encountering this tab crash every time I visit an article on The Verge. Nothing happens on first load, but when I start scrolling down the page, the tab crashes as soon as I reach a certain point.
e.g.
https://www.theverge.com/circuitbreaker/2018/10/16/17981684/amazon-kindle-paperwhite-update-new-waterproof-thinner-lighter-oasis-voyage
This bug was filed from the Socorro interface and is
report bp-0acb61ee-c4b1-4aaf-b462-0c0100181018.
=============================================================
Top 10 frames of crashing thread:
0 xul.dll mozilla::dom::ClientSource::WindowExecutionReady dom/clients/manager/ClientSource.cpp:286
1 xul.dll nsGlobalWindowOuter::SetNewDocument dom/base/nsGlobalWindowOuter.cpp:1958
2 xul.dll nsresult nsDocumentViewer::InitInternal layout/base/nsDocumentViewer.cpp:1032
3 xul.dll nsDocumentViewer::Init layout/base/nsDocumentViewer.cpp:771
4 xul.dll nsresult nsDocShell::Embed docshell/base/nsDocShell.cpp:6767
5 xul.dll nsDSURIContentListener::DoContent docshell/base/nsDSURIContentListener.cpp:196
6 xul.dll bool nsDocumentOpenInfo::TryContentListener uriloader/base/nsURILoader.cpp:759
7 xul.dll nsDocumentOpenInfo::OnStartRequest uriloader/base/nsURILoader.cpp:306
8 xul.dll void mozilla::net::HttpChannelChild::DoOnStartRequest netwerk/protocol/http/HttpChannelChild.cpp:774
9 xul.dll void mozilla::net::HttpChannelChild::OnStartRequest netwerk/protocol/http/HttpChannelChild.cpp:641
=============================================================
Comment 1•6 years ago
|
||
Thanks for filing Josh - we had this crash in Bug 1495285 - I am waiting to hear back from the developer as to whether we reopen that bug or track the crashes in this bug.
Assignee | ||
Comment 2•6 years ago
|
||
I'm working on this crash.
Assignee | ||
Comment 3•6 years ago
|
||
Andrew, do you have any idea why this crash happens after the landing of TrackingDummyChannel?
I'm not able to reproduce it.
Flags: needinfo?(bugmail)
Can confirm this crash. It happened on https://www.reddit.com/r/DotA2/ and https://www.reddit.com/r/Artifact/ for me.
Can reproduce crash on https://www.reddit.com/r/VintageApple/comments/9pvjtl/apples_education_mac/ Webrender on, MacOS 10.14, MBP A1708. as of writing this, am on latest nightly build 20181020102231
Comment 8•6 years ago
|
||
I think our best option is if those able to reproduce this crash could run with "MOZ_LOG=AntiTracking:5" and record the output. The pages at https://developer.mozilla.org/en-US/docs/Mozilla/Developer_guide/Gecko_Logging#Enabling_Logging and https://developer.mozilla.org/en-US/docs/Mozilla/Debugging/HTTP_logging provide related information. Note that the ",sync" option would likely want to be used if logging to a file because otherwise the crash might eat the log info.
Other than the potential for a race related to what the dummy channel perceives and what the subsequent normal check in the parent is, my best guess for an edge-case would be something related to controller inheritance and about:blank iframes. Per https://w3c.github.io/ServiceWorker/#selection, about:blank and other local schemes[1] will inherit the controller. I don't have any real idea how that could actually trigger a problem directly, it's just the alternate code path for marking a client controlled... The inheritance[2] uses ServiceWorkerAllowedToControlWindow()[3] which is a different check that doesn't involve the dummy channel.
1: https://fetch.spec.whatwg.org/#local-scheme
2: https://searchfox.org/mozilla-central/rev/4fc7bfa931189e0718dd70b4c59885829f1d761e/docshell/base/nsDocShell.cpp#2832
3: https://searchfox.org/mozilla-central/rev/4fc7bfa931189e0718dd70b4c59885829f1d761e/docshell/base/nsDocShell.cpp#13754
Flags: needinfo?(bugmail)
Comment 9•6 years ago
|
||
Uh, and note that the logs could quite potentially include private information from other windows/tabs, so you would want to:
- Only include near the end of the log for the crashing page's site.
- Probably send the log directly to :baku at amarchesini@mozilla.com
Comment 10•6 years ago
|
||
Tracking for 63 in case we have both a working fix and a dot release in the 63 to 64 timeframe.
Assignee: nobody → amarchesini
Status: UNCONFIRMED → NEW
status-firefox63:
--- → affected
tracking-firefox63:
--- → +
Ever confirmed: true
Keywords: crash
See Also: → 1495285
Updated•6 years ago
|
status-firefox65:
--- → affected
Comment 11•6 years ago
|
||
The signature is ranked #1 in content top-crashes for Firefox.
Keywords: regression,
topcrash
Comment 12•6 years ago
|
||
@Moses and @Enzo - I am not able to reproduce the crash on my Mac on those reddit sites - is there something particular you are doing on the page when the crash happens, or does it crash when you load the page? Are you able to try what Andrew suggests in Comment 8 and Comment 9? If so, please let us know. Thanks.
Flags: needinfo?(spidersouris)
Flags: needinfo?(mb8672)
Comment 13•6 years ago
|
||
It seems to crash when loading some content. Sometimes, it crashes on scroll down and other times, it crashes when loading. I've sent an email to :baku with the logs yesterday, hope I did everything fine.
Flags: needinfo?(spidersouris)
Updated•6 years ago
|
Flags: needinfo?(mb8672)
Assignee | ||
Comment 14•6 years ago
|
||
Yes, I confirm. I received the log. I'm trying to understand why the crash happens. Thanks, Enzo!
Comment 15•6 years ago
|
||
With today's nightly I crash 100% of time on https://reddit.com when Content Blocking is enabled. It happens about 2 seconds into load. I'm on Win 10.
Comment 16•6 years ago
|
||
If I remove/add the youtube.com serviceworker the crashes stop/start. Still having trouble reproducing in a clean profile.
Comment 17•6 years ago
|
||
It seems to come down to something in permissions.sqlite that I don't understand. Clearing Site Preferences in settings fixes the crash. Here is the select lines from permissions.sqlite related to youtube/reddit. I can email you the full database if it is really needed.
moz_perms table
> 167|https://www.youtube.com|storageAccessAPI|1|2|1542904681092|1540312681092
> 178|https://www.reddit.com|3rdPartyStorage^https://www.youtube.com|1|2|1542644862782|1540052862782
> 30|https://www.reddit.com|storageAccessAPI|1|2|1542905134669|1540313134669
STR:
- permissions.sqlite contains these entries
- visit youtube.com to register service worker
- visit reddit.com (no login, 'new' UI)
- scrolll down until a youtube embed occurs
- Crash
Assignee | ||
Comment 18•6 years ago
|
||
Great! I can reproduce it!
Comment 19•6 years ago
|
||
Tracking for 64 in case baku finds a fix - maybe this will be good for beta uplift.
tracking-firefox64:
--- → +
Assignee | ||
Updated•6 years ago
|
Attachment #9019582 -
Attachment description: iframe.patch → part 1 - nested iframes + SW
Assignee | ||
Comment 21•6 years ago
|
||
TrackingDummyChannel has a bad bug: it always sets the tracking flag ignoring the real annotation result.
We were passing all the tests because we don't have tests for intercepted 3rd party contexts: all the mochitests run in nested iframes and we use cookieBehavior 0 for them. In the next patches I add some tests.
The fix consists in exposing nsIHttpChannelInternal interface and return the topWindowURL (the only method used by the classifier).
Attachment #9019663 -
Flags: review?(ehsan)
Assignee | ||
Comment 22•6 years ago
|
||
Tests for 3rd party intercepted contexts. We don't want a 3rd party intercepted context to load a sub intercepted iframe: in nightly/beta, opening an nested iframe from a intercepted iframe triggers the crash.
I also fixed browser_antitracking.js because it was using multiple tabs, which were handled by different processes. Sometimes we were not controlling the iframe because the second process didn't know about the service worker registration yet.
Attachment #9019667 -
Flags: review?(ehsan)
Updated•6 years ago
|
Attachment #9019582 -
Flags: review?(ehsan) → review+
Comment 23•6 years ago
|
||
Comment on attachment 9019663 [details] [diff] [review]
part 2 - fix TrackingDummyChannel
Review of attachment 9019663 [details] [diff] [review]:
-----------------------------------------------------------------
::: netwerk/protocol/http/TrackingDummyChannel.h
@@ +41,5 @@
> * 1496997 we can remove this implementation. Bug 1498259 covers removing this
> * hack in particular.
> */
> class TrackingDummyChannel final : public nsIChannel
> + , public nsIHttpChannelInternal
Hmm, a class which implements nsIHttpChannelInternal but not nsIHttpChannel. I wonder what a Necko peer would think about this. Do you mind asking someone like mayhemer to have a brief look at the patch please?
@@ +49,5 @@
>
> NS_DECL_THREADSAFE_ISUPPORTS
> NS_DECL_NSIREQUEST
> NS_DECL_NSICHANNEL
> + NS_DECL_NSIHTTPCHANNELINTERNAL
Do you mind adding a comment here saying only GetTopWindowURI is implemented?
Attachment #9019663 -
Flags: review?(ehsan) → review+
Updated•6 years ago
|
Attachment #9019667 -
Flags: review?(ehsan) → review+
Comment 24•6 years ago
|
||
One thing I don't understand about your fixes is how they will affect 63. The tracking flags on the bug seem to suggest that the crash affects 63, is that right? TrackingDummyChannel however doesn't even exist on that branch.
What needs to be done for the release channel, if anything?
Flags: needinfo?(amarchesini)
Assignee | ||
Comment 25•6 years ago
|
||
Comment on attachment 9019663 [details] [diff] [review]
part 2 - fix TrackingDummyChannel
Mayhemer, do you mind to take a look at this patch?
I need to expose nsIHttpChannelInternal::GetTopWindowURL to make URL-Classifier able to annotate this channel.
Flags: needinfo?(amarchesini)
Attachment #9019663 -
Flags: review?(honzab.moz)
Assignee | ||
Comment 26•6 years ago
|
||
> What needs to be done for the release channel, if anything?
The crash happens just in 64 and 65 because only in these 2 releases we have cookieBehavior==4 by default.
For 63, we could have crashes for users who set cookieBehavior 3 or 4.
Comment 27•6 years ago
|
||
(In reply to Andrea Marchesini [:baku] from comment #26)
> > What needs to be done for the release channel, if anything?
>
> The crash happens just in 64 and 65 because only in these 2 releases we have
> cookieBehavior==4 by default.
> For 63, we could have crashes for users who set cookieBehavior 3 or 4.
So are you planning to write a different set of patches for 63? Will that happen in the same bug?
Flags: needinfo?(amarchesini)
Assignee | ||
Comment 28•6 years ago
|
||
If we want to fix 63, we should uplift bug 1495285 and this bug.
Flags: needinfo?(amarchesini)
Comment 29•6 years ago
|
||
(In reply to Andrea Marchesini [:baku] from comment #28)
> If we want to fix 63, we should uplift bug 1495285 and this bug.
Aha, great, thanks! That's what I was looking after. :-)
Comment 30•6 years ago
|
||
Comment on attachment 9019663 [details] [diff] [review]
part 2 - fix TrackingDummyChannel
Review of attachment 9019663 [details] [diff] [review]:
-----------------------------------------------------------------
::: netwerk/protocol/http/TrackingDummyChannelChild.cpp
@@ +93,5 @@
>
> RefPtr<HttpBaseChannel> httpChannel = do_QueryObject(channel);
> + if (aTrackingResource) {
> + httpChannel->SetIsTrackingResource(mIsThirdParty);
> + }
ups.. I missed this?
Could we do:
if (aTrackingResource && mIsThirdParty) {
httpChannel->SetIsTrackingResource(true);
}
?
Attachment #9019663 -
Flags: review?(honzab.moz) → review+
Updated•6 years ago
|
Assignee | ||
Comment 31•6 years ago
|
||
> Could we do:
>
> if (aTrackingResource && mIsThirdParty) {
> httpChannel->SetIsTrackingResource(true);
> }
If it's not a 3rd party, we should call SetIsTrackingResource(false).
Assignee | ||
Comment 32•6 years ago
|
||
This bit is missing for some existing tests.
Attachment #9019740 -
Flags: review?(ehsan)
Updated•6 years ago
|
Whiteboard: [privacy65]
Comment 33•6 years ago
|
||
Comment on attachment 9019740 [details] [diff] [review]
part 4 - propagation of the return value of GetTopWindowURI
Review of attachment 9019740 [details] [diff] [review]:
-----------------------------------------------------------------
Wow, so even the return value matters here! OK.
Attachment #9019740 -
Flags: review?(ehsan) → review+
Comment 34•6 years ago
|
||
Do you mind filing a bug to get rid of this whole code once the SW team finishes their work on parent interception?
Flags: needinfo?(amarchesini)
Comment 35•6 years ago
|
||
Pushed by amarchesini@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/a81dc13c9089
part 1 - LoadInfo must expose the correct top-level-storage-area-principal for sub documents, r=ehsan
https://hg.mozilla.org/integration/mozilla-inbound/rev/ccd2e78000b8
part 2 - TrackingDummyChannel must expose nsIHttpChannelInternal, r=ehsan, r=mayhemer
https://hg.mozilla.org/integration/mozilla-inbound/rev/2393a7d9f454
part 3 - Tests for nested iframes controlled by ServiceWorkers, r=ehsan
Assignee | ||
Comment 36•6 years ago
|
||
In reply to :Ehsan Akhgari from comment #34)
> Do you mind filing a bug to get rid of this whole code once the SW team
> finishes their work on parent interception?
We already have it: https://bugzilla.mozilla.org/show_bug.cgi?id=1498259
Flags: needinfo?(amarchesini)
Assignee | ||
Comment 37•6 years ago
|
||
Comment on attachment 9019582 [details] [diff] [review]
part 1 - nested iframes + SW
[Beta/Release Uplift Approval Request]
Feature/Bug causing the regression: Bug 1493211
User impact if declined: Nested iframes, controlled by a ServiceWorker, can trigger a crash
Is this code covered by automated tests?: Yes
Has the fix been verified in Nightly?: Yes
Needs manual test from QE?: No
If yes, steps to reproduce:
List of other uplifts needed: Bug 1495285
Risk to taking this patch: Low
Why is the change risky/not risky? (and alternatives if risky): The fix is about exposing nsIHttpChannelInternal in the TrackingDummyChannel. Nothing risky.
String changes made/needed: none
Attachment #9019582 -
Flags: approval-mozilla-beta?
Assignee | ||
Comment 38•6 years ago
|
||
We have some crashes in 63. They are extremely rare because the crash can happen only for users who chose to use cookieBehavior 3. Do we want to uplift this and bug 1495285 to release?
Flags: needinfo?(ryanvm)
Comment 40•6 years ago
|
||
(In reply to Andrea Marchesini [:baku] from comment #38)
> We have some crashes in 63. They are extremely rare because the crash can
> happen only for users who chose to use cookieBehavior 3. Do we want to
> uplift this and bug 1495285 to release?
According to Socorro we don't have any crash for this signature for 63.0 on the release channel since we shipped We only had 2 crashes on the beta channel during the whole beta cycle and all the other crashes where on the aurora channel (Dev Edition), about 25 crashes last week. Given that data, I don't think we need to uplift a patch to the release channel. Thanks.
Flags: needinfo?(pascalc)
Comment 41•6 years ago
|
||
bugherder |
https://hg.mozilla.org/mozilla-central/rev/a81dc13c9089
https://hg.mozilla.org/mozilla-central/rev/ccd2e78000b8
https://hg.mozilla.org/mozilla-central/rev/2393a7d9f454
Status: NEW → RESOLVED
Closed: 6 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla65
Comment 42•6 years ago
|
||
bugherder |
Updated•6 years ago
|
status-firefox-esr60:
--- → unaffected
Comment 43•6 years ago
|
||
Same setup no longer crashes for me on Build 20181025220044. \o/
Comment 44•6 years ago
|
||
We're still seeing crashes with this signature on the 20181025220044 nightly builds. Baku is going to file a follow-up bug for those crashes as they're coming from a different path than this bug.
Flags: in-testsuite+
Comment 45•6 years ago
|
||
Comment on attachment 9019740 [details] [diff] [review]
part 4 - propagation of the return value of GetTopWindowURI
Per baku on IRC, this was merged into part 2.
Attachment #9019740 -
Attachment is obsolete: true
Comment 46•6 years ago
|
||
Comment on attachment 9019582 [details] [diff] [review]
part 1 - nested iframes + SW
Reduces crashes when using nested iframes controlled by a ServiceWorker. Approved for 64.0b5.
Attachment #9019582 -
Flags: approval-mozilla-beta? → approval-mozilla-beta+
Comment 47•6 years ago
|
||
bugherder uplift |
Comment 48•6 years ago
|
||
Baku, was there a follow-up bug filed for the remaining crashes? We're clearly in a better place than we used to be, but the crashes are still there.
Flags: needinfo?(amarchesini)
You need to log in
before you can comment on or make changes to this bug.
Description
•