Closed
Bug 1318759
Opened 8 years ago
Closed 8 years ago
Crash in mozilla::net::HttpChannelParent::OnStartRequest
Categories
(Core :: Networking, defect)
Core
Networking
Tracking
()
VERIFIED
FIXED
mozilla53
People
(Reporter: marcia, Assigned: mayhemer)
References
Details
(Keywords: crash, regression, Whiteboard: [necko-active])
Crash Data
Attachments
(1 file, 1 obsolete file)
11.43 KB,
patch
|
mayhemer
:
review+
jcristau
:
approval-mozilla-aurora+
|
Details | Diff | Splinter Review |
This bug was filed from the Socorro interface and is
report bp-653c5f48-b685-42b9-a491-622112161118.
=============================================================
Seen while looking at trunk crash stats: http://bit.ly/2g4vtkC
Crashes on 53 started with Build 20161114043454. Crash is also present on Mac with a slightly different signature.
Here is the checkins between the 13th and 14th: https://mzl.la/2fNdEn6
![]() |
Assignee | |
Updated•8 years ago
|
Assignee: nobody → honzab.moz
Updated•8 years ago
|
Whiteboard: [necko-active]
![]() |
||
Comment 1•8 years ago
|
||
This is disproportionately common on Mac, with slightly more Mac crashes (11) than Windows crashes (8) over the past 3 days, making it the #1 browser topcrash in that time period.
The correlations report in:
https://crash-stats.mozilla.com/signature/?product=Firefox&release_channel=nightly&signature=mozilla%3A%3Anet%3A%3AHttpChannelParent%3A%3AOnStartRequest&date=%3E%3D2016-11-13T07%3A16%3A00.000Z&date=%3C2016-12-01T07%3A16%3A00.000Z#correlations
gives:
(100.0% in signature vs 05.72% overall) Addon "Ghostery" = true
(100.0% in signature vs 08.47% overall) Addon "NoScript Security Suite" = true
It also appears to be more common on Linux than Windows.
(There's no correlation report for the separate Mac signature.)
![]() |
||
Comment 3•8 years ago
|
||
I suspect it's Ghostery that's relevant more so than NoScript, because bug 1319606 (which appears strongly related to this bug) has a 100% Ghostery correlation but no such NoScript correlation.
![]() |
Assignee | |
Comment 4•8 years ago
|
||
Going to work on this now. I'll use URL from https://bugzilla.mozilla.org/show_bug.cgi?id=1317517#c14 to reproduce.
![]() |
Assignee | |
Comment 5•8 years ago
|
||
To reproduce the crash one needs to install both addons:
> (100.0% in signature vs 03.01% overall) Addon "NoScript Security Suite" = true
> (100.0% in signature vs 03.25% overall) Addon "Ghostery" = true
W/o NoScript Security Suite, I only experience bug 1322346 and bug 1322355.
![]() |
Assignee | |
Comment 6•8 years ago
|
||
The nasty thing is clearly visible here:
2016-12-07 16:28:59.010000 UTC - [Main Thread]: V/nsHttp HttpChannelParent::RecvRedirect2Verify call OnRedirectVerifyCallback [this=184a4e20 result=805303f4, mRedirectCallback=1b996410]
2016-12-07 16:28:59.010000 UTC - [Main Thread]: D/nsHttp nsHttpChannel::OnRedirectVerifyCallback [this=1ceb1800] result=0 stack=1 mWaitingForRedirectCallback=1
HttpChannelParent::RecvRedirect2Verify shows the redirect have been vetoed but nsHttpChannel::OnRedirectVerifyCallback sees the result as NS_OK (i.e. carry on with the redirect).
mRedirectCallback is not the expected nsHttpChannel, but something JS, provided by the extension. I found a code like this in NoScript:
RCBDelegate.prototype = {
QueryInterface: XPCOMUtils.generateQI([Ci.sIAsyncVerifyRedirectCallback]),
onRedirectVerifyCallback: function(result) {
try {
if (result !== 0) ns.log("Overriding failed (" + result + ") redirect callback for " + this.label); // plugin failure is 2147500037
this.delegator.onRedirectVerifyCallback(0);
} catch (e) {
ns.log(e);
}
}
}
and
asyncOnChannelRedirect: function(oldChan, newChan, flags, redirectCallback) {
this.onChannelRedirect(oldChan, newChan, flags);
redirectCallback.onRedirectVerifyCallback(0);
},
If this is added on every channel, then it's actually a security issue because it 1) allows redirects we have vetoed based on any content security policy and check-may-load policy and 2) breaks the internal logic, since when we notify redirect result as a failure, we are in a certain state and count on the redirect to not proceed. I need to check details here first to report this to NoScript, tho.
Still, I will be really happy to move away from xpcom for extensions, specially in this case!!
Status: NEW → ASSIGNED
![]() |
Assignee | |
Comment 7•8 years ago
|
||
(backup, needs even more local testing)
![]() |
Assignee | |
Comment 8•8 years ago
|
||
![]() |
Assignee | |
Comment 10•8 years ago
|
||
One more try, this time not on top of bug 1322355 that should not block here.
https://treeherder.mozilla.org/#/jobs?repo=try&revision=17e926c46bfc7999e78def957f32511e8194ebb6
![]() |
Assignee | |
Comment 11•8 years ago
|
||
Comment on attachment 8817202 [details] [diff] [review]
v1 (Properly query final class in HttpChannelParent::OnStartRequest)
Jason, please try to review or find someone to do it. This is not a oneliner, but something I wanted to do a long ago.
I was also thinking about simply returning an error from onstartrequest (aka aborting the channel) when nsHttpChannel is not found, because we are then sending the data to HttpChannelChild, which seems a bit weird, at least.
I don't remember why we were doing the static_cast on aRequest at the first place and not use mChannel and assert aRequest == mChannel...
Attachment #8817202 -
Attachment description: wip1 → v1 (Properly query final class in HttpChannelParent::OnStartRequest)
Attachment #8817202 -
Flags: review?(jduell.mcbugs)
Comment 12•8 years ago
|
||
Comment on attachment 8817202 [details] [diff] [review]
v1 (Properly query final class in HttpChannelParent::OnStartRequest)
Review of attachment 8817202 [details] [diff] [review]:
-----------------------------------------------------------------
::: netwerk/protocol/http/HttpBaseChannel.h
@@ +238,5 @@
> NS_IMETHOD SetIntegrityMetadata(const nsAString& aIntegrityMetadata) override;
> + virtual mozilla::net::nsHttpChannel * QueryHttpChannelImpl(void) override
> + {
> + return nullptr;
> + }
Same comment as below about inline + virtual here.
::: netwerk/protocol/http/HttpChannelParent.cpp
@@ +1169,5 @@
> + // XXX here we should throw mChannel away since it no longer points to
> + // nsHttpChannel we actually work with. But, it's so heavilly used all over
> + // this class it's simply not save and easy thing to do.
> + // The true solution here is to rethink the HttpChannelParent in relation
> + // to redirects to different schemes.
So... what happens after this? We call OnStart/Stop on the child with a dummy requesthead? Should we mark the channel with an error code?
::: netwerk/protocol/http/nsHttpChannel.h
@@ +149,5 @@
> NS_IMETHOD ForceIntercepted(uint64_t aInterceptionID) override;
> + virtual mozilla::net::nsHttpChannel * QueryHttpChannelImpl(void) override
> + {
> + return this;
> + }
So this is a virtual function and thus can't be inlined I assume? Is this going to generate a copy of this in each .cpp file that includes it? (Sorry if my knowledge of C++ compilers is out of date).
Attachment #8817202 -
Flags: review?(jduell.mcbugs) → review+
![]() |
Assignee | |
Comment 13•8 years ago
|
||
For ref, here's a try run that aborts the stream listener contract (returns NS_ERROR_something from onstart) when aRequest is not nsHttpChannel.
https://treeherder.mozilla.org/#/jobs?repo=try&revision=d6241cbfc5bb118b73cc2a42d2f6e3b4520c0cd5
If it goes well, I might prefer to land that patch. (Still need to check locally it fixes the crash and doesn't introduce any other problem)
![]() |
||
Comment 14•8 years ago
|
||
Honza, do you know why this (and the related signatures) would have started on November 14?
Flags: needinfo?(honzab.moz)
![]() |
||
Updated•8 years ago
|
Crash Signature: [@ mozilla::net::HttpChannelParent::OnStartRequest]
[@ <name omitted> | mozilla::net::HttpChannelParent::OnStartRequest] → [@ mozilla::net::HttpChannelParent::OnStartRequest]
[@ <name omitted> | mozilla::net::HttpChannelParent::OnStartRequest]
[@ mozilla::net::nsHttpChannel::GetCacheTokenExpirationTime]
![]() |
Assignee | |
Comment 15•8 years ago
|
||
differences from v1:
- review comments addressed
- onstart return NS_ERROR_UNEXPECTED when aRequest != nsHttpChannel
Few simplifications to the patch comes with that. Rather one more try push:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=dae0139bd62f0b04e998311342479f0270c6efb7
Attachment #8817202 -
Attachment is obsolete: true
Flags: needinfo?(honzab.moz)
Attachment #8817616 -
Flags: review+
![]() |
Assignee | |
Comment 16•8 years ago
|
||
(In reply to Nicholas Nethercote [:njn] from comment #14)
> Honza, do you know why this (and the related signatures) would have started
> on November 14?
Likely a NoScript update. There was a number of updates to that addon since November 13. Before that the last update was on August 8, 2016.
Ghostery was update on November 28, but that may not have any effect here.
![]() |
Assignee | |
Updated•8 years ago
|
Keywords: checkin-needed
Comment 17•8 years ago
|
||
Pushed by cbook@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/291d703d4964
Properly query final class in HttpChannelParent::OnStartRequest to prevent evil static_cast, r=jduell
Keywords: checkin-needed
![]() |
||
Comment 18•8 years ago
|
||
Need a day or two to confirm the fix is crashed on Nightly. If it is, this should be uplifted to Aurora. (But it doesn't seem to be affecting Beta or Release.)
Comment 19•8 years ago
|
||
bugherder |
Status: ASSIGNED → RESOLVED
Closed: 8 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla53
![]() |
||
Comment 20•8 years ago
|
||
(In reply to Nicholas Nethercote [:njn] from comment #18)
> Need a day or two to confirm the fix is crashed on Nightly.
We don't have full numbers yet for the 20161214030231 Nightly... but I don't see any crashes matching the three signatures for this bug, which is a good sign. We should have a clearer idea in 24 hours.
![]() |
||
Comment 21•8 years ago
|
||
> We don't have full numbers yet for the 20161214030231 Nightly... but I don't
> see any crashes matching the three signatures for this bug, which is a good
> sign. We should have a clearer idea in 24 hours.
Updated numbers look good! No crashes after Nightly 20161213030215.
Honza, please nominate this for Aurora uplift.
Status: RESOLVED → VERIFIED
![]() |
||
Updated•8 years ago
|
Flags: needinfo?(honzab.moz)
![]() |
Assignee | |
Comment 22•8 years ago
|
||
Comment on attachment 8817616 [details] [diff] [review]
v2 (query method for nsHttpChannel, abort from HttpChannelParent when aRequest != nsHttpChannel)
Approval Request Comment
[Feature/Bug causing the regression]: e10s + some addons recent updates
[User impact if declined]: crash during regular browsing with Ghostery and/or NoScript addons
[Is this code covered by automated tests?]: yes (heavily executed http redirect code)
[Has the fix been verified in Nightly?]: yes
[Needs manual test from QE? If yes, steps to reproduce]: no need, except verification of the fix with the mentioned addons
[List of other uplifts needed for the feature/fix]: none
[Is the change risky?]: I think very small, safe enough for aurora
[Why is the change risky/not risky?]: small risk because: the patch changes the way we get some info from the channel and pass down to child process. Before the patch we get it from a previously stored channel (mChannel) but now we get it from the channel passed to OnStartRequest (aRequest). There has been added an assertion that mChannel == aRequest (verified by tests). But in release builds in the wild an inequality caused by some addon may not be caught and I'm not sure what impact that can have. But in defense, the patched code is actually more correct than before!
[String changes made/needed]: none
Flags: needinfo?(honzab.moz)
Attachment #8817616 -
Flags: approval-mozilla-aurora?
Comment 23•8 years ago
|
||
Comment on attachment 8817616 [details] [diff] [review]
v2 (query method for nsHttpChannel, abort from HttpChannelParent when aRequest != nsHttpChannel)
fix top crash in aurora52
Attachment #8817616 -
Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
Comment 24•8 years ago
|
||
bugherder uplift |
status-firefox52:
--- → fixed
![]() |
Assignee | |
Comment 25•8 years ago
|
||
I just recalled nsQueryObject. We should use it here. I'll file a separate bug.
You need to log in
before you can comment on or make changes to this bug.
Description
•