Closed Bug 1373310 Opened 7 years ago Closed 5 years ago

Crash in mozilla::net::HttpChannelParent::ConnectChannel

Categories

(Core :: DOM: Service Workers, defect, P5)

56 Branch
Unspecified
Windows 10
defect

Tracking

()

RESOLVED WORKSFORME
Tracking Status
firefox55 --- unaffected
firefox56 --- affected

People

(Reporter: marcia, Unassigned)

Details

(Keywords: crash, regression)

Crash Data

Attachments

(1 file)

This bug was filed from the Socorro interface and is 
report bp-517c2d56-3165-46e6-9aec-f2b320170615.
=============================================================

Seen while looking at nightly crash data - small volume crash which affects Win 10: http://bit.ly/2t6Ai0n

Possible regression range based on Build ID: https://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=b266a8d8fd595b84a7d6218d7b8c6b7af0b5027c&tochange=035c25bef7b5e4175006e63eff10c61c2eef73f1
The line its crashing on is service worker related, but not changed recently.  Andrew, do you have any thoughts here?
Flags: needinfo?(bugmail)
(In reply to Ben Kelly [reviewing, but slowly][:bkelly] from comment #1)
> The line its crashing on is service worker related, but not changed
> recently.  Andrew, do you have any thoughts here?

I was unable to find a smoking gun.  I used this opportunity to do some other legwork for the parent intercept blog post, resulting in the attached diagram.  I think the primary utility of the diagram is showing that things are just as confusing at 10,000 feet as they are at 0 feet.

The concise summary of what's going on[1] in all the crashes is:
- The parent is actively attempting to perform a redirect.  Specifically:
  - SetupReplacementChannel has already run, creating the new channel and propagating mCallbacks from the old channel to the new channel.
  - The parent has invoked gHttpHandler->AsyncOnChannelRedirect which progressed to HttpChannelParentListener::AsyncOnChannelRedirect which resulted in HttpChannelParent::StartRedirect invoking SendRedirect1Begin which resulted in the HttpChannelChild::ConnectParent in the child process.
- The crash is attempting to fish the nsINetworkInterceptController out of the new channel, not the old channel.  This is notable because the child would have had to "mess up" the old channel prior to the old channel beginning the redirect process to cause ReleaseListeners/DoNotifyListener to clear mCallbacks prior to SetupReplacementChannel being triggered.  (The child can't really mess with the new channel since it's the very act of attempting to establish an actor relationship that's crashing.)

nsHttpChannel just has too many potential control flow paths and too little state exposed via the crash backtraces (or the stacks if we grab the minidumps) to better guess at what's going on.

Probably the best next steps would be:
- Add a MOZ_DIAGNOSTIC_ASSERT(mCallbacks)/MOZ_CRASH/MOZ_CRASH_UNSAFE_PRINTF("%x", mLoadFlags) in HttpBaseChannel::SetupReplacementChannel where the SetNotificationCallbacks call happens.  Assuming it's the case that mCallbacks is null here, this is a much more useful place to crash because the stack will uniquely identify which of the manypotential callers of SetupReplacementChannel/StartRedirectChannelToURI we're in.  (NB: The assert might need to go in nsHttpChannel rather than HttpBaseChannel; I didn't look at how HttpChannelChild might be affected.)
- Add a more explicit MOZ_CRASH_UNSAFE_PRINTF("%x", mLoadFlags) at the current crash call-site to provide additional info in the event the prior crash isn't the one that gets us.

1: A digest of my notes from https://github.com/asutherland/asuth-gecko-notes/blob/93568e667c0e35e6760294d4de17ec285d03eca6/reviews/http-channel-connect-crash.md which are a bit scattered and fragmented in places.
Flags: needinfo?(bugmail)
I'd rather add a MOZ_CRASH to SetNotificationCallbacks when mCallbacks QO to HttpChannelParentListener but the new given callbacks don't.
Component: Networking → DOM: Service Workers
Priority: -- → P5

Resolving as WFM. Only 1 crash in 68.

Status: NEW → RESOLVED
Closed: 5 years ago
Resolution: --- → WORKSFORME

Bugbug thinks this bug is a regression, but please revert this change in case of error.

Keywords: regression
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: