Closed Bug 932046 Opened 6 years ago Closed 6 years ago

crash in mozilla::net::HttpChannelChild::OnRedirectVerifyCallback(unsigned int)

Categories

(Core :: Networking, defect, critical)

ARM
Gonk (Firefox OS)
defect
Not set
critical

Tracking

()

RESOLVED FIXED
mozilla28
blocking-b2g 1.3+
Tracking Status
firefox27 --- wontfix
b2g-v1.2 --- unaffected
b2g-v1.3 --- fixed

People

(Reporter: gbennett, Assigned: mayhemer)

References

()

Details

(Keywords: crash, regression, topcrash-b2g, Whiteboard: [b2g-crash])

Crash Data

Attachments

(3 files, 2 obsolete files)

Attached file TerraVideoCrashLOG.txt
This bug was filed from the Socorro interface and is 
report bp-a5d6b418-7603-400f-aaf5-309d42131028.
=============================================================
Description:
After installing and launching the Terra app from the marketplace, select any video located on the front page and a crash should occur.

Repro Steps:
1) Updated Buri 1.3 master/mc mozRIL to BuildID: 20131028040200
2) Open Marketplace app 
3) Download and launch Terra app
4) Select a video

Actual:
Terra app crashes.

Expected:
App does not crash and user is able to watch the video in some manner.

Environmental Variables:
Device: Buri 1.3 master/mc mozRIL
BuildID: 20131028040200
Gaia: 31c260b5c853fe78d6bf44c6aeec52a651c3f025
Gecko: 59ff3a2a708a
Version: 27.0a1
Firmware Version: 20131015

Repro frequency: 100%, 6/6
See attached: TerraVideoCrashLOG.txt
QA Wanted to see if this reproduces on 1.2.
blocking-b2g: --- → 1.3?
Keywords: qawanted
Also - check to see if this reproduces on 1.1.

Pretty sure this is a regression, so noming defensively.
QA Contact: gbennett
Which video crashed for you? I am trying this on both 1.2 and 1.3 and so far no crash has been generated.
1.2

Instead of crashing an error message appears that says "Terra is having problems | Terra is not loading properly | Close/Try Again"

Environmental Variables:
Device: Buri 1.2 Aurora mozRIL
BuildID: 20131028004002
Gaia: 2ef9bc3c7a6de228b63e6ba3613eb0c0dd639c59
Gecko: 4a94d2ea9d37
Version: 26.0a2
Firmware Version: 20131015


1.1

Instead of crashing an error message appears that says "Terra is having problems | Terra is not loading properly | Close/Try Again"

Environmental Variables:
Device: Leo 1.1 mozRIL
BuildID: 20131028041204
Gaia: 39b0203fa9809052c8c4d4332fef03bbaf0426fc
Gecko: 41c15ddb7216
Version: 18.0
Firmware Version: 20131015

(In reply to Marcia Knous [:marcia] from comment #3)
> Which video crashed for you? I am trying this on both 1.2 and 1.3 and so far
> no crash has been generated.

I tried both videos on 1.1, 1.2, and 1.3 and all videos generate a crash on 1.3, where 1.1/1.2 cause a user generated error written above.
Keywords: qawanted
Keywords: regression
Last "Working" (white screen is shown and still doesn't work but the crash does not occur) Build
Environmental Variables:
Device: Buri 1.3 mozRIL
BuildID: 20131015040202
Gaia: 17e871ae1f82699793e3cd28acda805ba724a8b6
Gecko: febfe3c7732b
Version: 27.0a1
Firmware Version: 20131015

First Broken (crashing) Build
Environmental Variables:
Device: Buri 1.3 mozRIL
BuildID: 20131016040202
Gaia: 3d4f1107e6e91e5f5649edc0f2565ac837111d7d
Gecko: 9f63bbc00527
Version: 27.0a1
Firmware Version: 20131015

I attached a log for the white screen that occurs prior to 20131016 builds, which still blocks viewing the video, but no crash occurs. Should a new bug be created or can it be tracked within this issue?
This is not in my area (WebRTC or Web Audio), but this should block 1.3 IMO
(In reply to Maire Reavy [:mreavy] from comment #6)
> This is not in my area (WebRTC or Web Audio), but this should block 1.3 IMO

Agreed.
Hi Robert,

I wonder if you know who can help on triage this bug or take the ownership? Thank you!
Flags: needinfo?(roc)
Whiteboard: [b2g-crash] → [b2g-crash] burirun3
oops, i did not need to put a whiteboard tag here.
Whiteboard: [b2g-crash] burirun3 → [b2g-crash]
Jason Duell maybe?
Assignee: nobody → jduell.mcbugs
Flags: needinfo?(roc)
This looks to me like we're not actually confident that mRedirectChannelChild is non-null, which would occur if we're redirecting to a protocol that doesn't support universal protocol redirection.
Component: Video/Audio → Networking
Assignee: jduell.mcbugs → honzab.moz
Regression from bug 765934.  Apparently we are only missing a null check.  When we pass this point on the child, HttpChannelParentListener::OnRedirectResult later won't find any registered parent channel and will cancel the new channel since there is no way to deliver the data to the child process - new child channel didn't connect it's parent (not implementing nsIChildChannel).

I was able to reproduce when on the initial screen of the app was a video to click (it's a news app).  When I wrote the simple fix and made sure it's correct and sufficient not to crash by inspecting the code, there is now no more video to check in real :)  I have to wait.
Blocks: 765934
Status: NEW → ASSIGNED
Attached patch v1Splinter Review
Confirmed fix.  Then we behave the same way as in the single process model - in my desktop build (yay!) it shows the redirect page content + invokes external helper to save or choose an application to open the content.

The app itself doesn't crash.
Attachment #8342737 - Flags: review?(jduell.mcbugs)
And to note: we are redirecting to an rtsp:// url.
(In reply to Honza Bambas (:mayhemer) from comment #13)
> external helper to save or choose an application to open the content.

Sorry, not save or choose, just choose.  Normal behavior for an unknown schema.
Hmm, we're doing lots of work to support RTSP in B2G, so I'm wondering if popping up a download/helper dialog is going to be enough here.  Steve, do you know where we're at with RTSP in B2G, and if this fix is going to be ok for the current train at least (so we can followup), or if we need more right now?  (If you don't know, maybe you know who to ask...)
Flags: needinfo?(sworkman)
Comment on attachment 8342737 [details] [diff] [review]
v1

Review of attachment 8342737 [details] [diff] [review]:
-----------------------------------------------------------------

If this is the right approach, code looks good.
Attachment #8342737 - Flags: review?(jduell.mcbugs) → review+
Steve: to be more precise: can we live without http redirects to rtsp:// working for now.

My guess is we'll want to apply this patch as a crash stopgap in any case.
(In reply to Jason Duell (:jduell) from comment #18)
> Steve: to be more precise: can we live without http redirects to rtsp://
> working for now.
> 
> My guess is we'll want to apply this patch as a crash stopgap in any case.

Definitely rtsp is a different story then this bug is.  Note that I'm testing with desktop build.  Could have influence...
Honza: so I'm guessing the fix to make http->RTSP redirects is to have rtsp channels implement the redirect infrastructure (i.e. implement nsIChildChannel and nsIParentChannel), like FTP does. Right?

If that's right it should be a followup--we'll leave this bug for the crash fix.
Flags: needinfo?(honzab.moz)
(In reply to Jason Duell (:jduell) from comment #16)
> Hmm, we're doing lots of work to support RTSP in B2G, so I'm wondering if
> popping up a download/helper dialog is going to be enough here.  Steve, do
> you know where we're at with RTSP in B2G, and if this fix is going to be ok
> for the current train at least (so we can followup), or if we need more
> right now?  (If you don't know, maybe you know who to ask...)

An external handler for RTSP for B2G was added in Bug 884702. The bug mentions this working for <a href="rtsp://...">, and there is a second bug to make it work for the URL bar - Bug 945603.

(In reply to Jason Duell (:jduell) from comment #18)
> Steve: to be more precise: can we live without http redirects to rtsp://
> working for now.
> 
> My guess is we'll want to apply this patch as a crash stopgap in any case.

I think the external handler should work for redirects, right? But I'm not so hot on redirects to other protocols... so maybe comment 21 is right?

(In reply to Honza Bambas (:mayhemer) from comment #20)
> Definitely rtsp is a different story then this bug is.  Note that I'm
> testing with desktop build.  Could have influence...

RTSP is B2G only, so I'd expect the helper to pop up for desktop and mobile.
Flags: needinfo?(sworkman)
(In reply to Jason Duell (:jduell) from comment #21)
> Honza: so I'm guessing the fix to make http->RTSP redirects is to have rtsp
> channels implement the redirect infrastructure (i.e. implement
> nsIChildChannel and nsIParentChannel), like FTP does. Right?
> 
> If that's right it should be a followup--we'll leave this bug for the crash
> fix.

Yep, or we could take in the universal protocol/channel to solve them all once and for all - that patch is tho completely bitrotted :((((
Flags: needinfo?(honzab.moz)
That is bug 590682.
https://hg.mozilla.org/mozilla-central/rev/3d1b55d822b2
Status: ASSIGNED → RESOLVED
Closed: 6 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla28
gbennett@qanalydocs.com:

Can you test to see if the video redirect now works on B2G?  We know it will just popup  a dialog on desktop/android, but we're hoping it works smoothly on B2G (see comment 22).
Flags: needinfo?(gbennett)
Target Milestone: mozilla28 → ---
Target Milestone: --- → mozilla28
(In reply to Jason Duell (:jduell) from comment #26)
> gbennett@qanalydocs.com:
> 
> Can you test to see if the video redirect now works on B2G?  We know it will
> just popup  a dialog on desktop/android, but we're hoping it works smoothly
> on B2G (see comment 22).

Of course. On the latest 1.3 mozRIL I got these two crashes when selecting a video within the Terra app.

https://crash-stats.mozilla.com/report/index/751ecba6-0a93-4ce8-9ba6-6beb52131206

https://crash-stats.mozilla.com/report/index/e6782082-a7e9-49c9-a48b-01de12131206

Environmental Variables:
Device: Buri 1.3 mozRIL
BuildID: 20131206040203
Gaia: 8fca2ca67e8a6022fe6ed8cb576e5d59dfb5237f
Gecko: 1401e4b394ad
Version: 28.0a1
Firmware Version: 20131115

Let me know if you need any additional information. :)
Flags: needinfo?(gbennett)
I guess HTTPChanelParent needs to stop assuming it can always cast to an nsHttpChannel and do a QI to nsIHttpChannelInternal first :(
Thanks!

jduell@46755 430  nsHttpChannel *chan = static_cast<nsHttpChannel *>(aRequest);

http://hg.mozilla.org/mozilla-central/annotate/1401e4b394ad/netwerk/protocol/http/HttpChannelParent.cpp#l430

aReqeust is RtspChannel.  Another place to do proper QI.

Apparently, I think we should separate out the redirect logic from the HttpChannelParentListener to a generic redirect class.  Originally I (and others surprisingly too :)) was persuaded that redirects can only come from http channel and be done only again to http channels.  I wrote the code blindly with this in mind.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Looking into this again, I will go with a simple patch, anything else can be done in a followup.
Status: REOPENED → NEW
1) we need to fix best ALL places where we do static_casts
2) RtspChannel::AsyncOpen must respect the fact the channel has been canceled and return the failure w/o notifying the listener
Status: NEW → ASSIGNED
BTW, that RtspChannel::AsyncOpen is calling OnStartRequest from inside before it returns is I believe a contract violation.
Attached patch patch2, v1 (obsolete) — Splinter Review
- when the channel on the child process we are trying to redirect to (created at HttpChannelChild::Redirect1Begin) doesn't impl nsIChildChannel, we fail the redirect immediately.  Not sure this can break stuff or not, but upper levels will stop us anyway since HttpChannelParentListener::OnRedirectResult won't find any registered parent channel and will cancel the target parent real channel before it's asyncOpened'ed.  RTSP doesn't respect that + breaks the asyncOpen contract: [1], fixing it would fix it probably... I'll have a one more simpler patch specific to this for uplift or even a full fix.

- introduced nsIHttpChannelInternal.SendOnStartRequestToChild(HttpChannelParent *aParent) method that is responsible to call aParent->SendOnStartRequestToChild() with proper arguments ; this removes need to extract everything from the channel in HttpChannelParent::OnStartRequest, where different then nsHttpChannel instance can end up (however it shouldn't - but only checking for nsIHttpChannelInternal is IMO not enough to safely static_cast to nsHttpChannel* - some extensions may implement it too..)

- added a test (+ipc wrapper) that redirects to "data:" ; this checks the patch does what it should do, but doesn't check directly with rtsp (maybe just attempt a redirect to check we don't crash on the code path to rtsp::asyncOpen?)


Generally, w/o the universal protocol patch (that is completely bitrotted) redirecting to different protocols is completely broken.  Hence, asking more only for feedback.



[1] http://hg.mozilla.org/mozilla-central/annotate/12ea03a70243/netwerk/protocol/rtsp/RtspChannel.cpp#l24
Attachment #8346033 - Flags: feedback?(jduell.mcbugs)
Attached patch patch2, v2 (obsolete) — Splinter Review
Simple as hell.  Cannot test.
Attachment #8346051 - Flags: review?(jduell.mcbugs)
Comment on attachment 8346051 [details] [diff] [review]
patch2, v2

Review of attachment 8346051 [details] [diff] [review]:
-----------------------------------------------------------------

We should fork this bug now that FF 28 has forked--only the patch that landed belongs in this bug.

> RtspChannel::AsyncOpen is calling OnStartRequest from inside before it returns is I believe a contract violation.

We should file a bug for that.  Easy enough to dispatch a new event to the main thread?

> RtspChannel::AsyncOpen must respect the fact the channel has been canceled and return the failure w/o notifying the listener

Hmm.  But we deliberately do the opposite in nsHttpChannel (see bug 773475).  I.e. for nsHttpChannel if you do
  
   channel == getChannel
   channel.Cancel(errCode)
   channel.AsyncOpen()

we actually fail asynchronously (i.e. AsyncOpen still returns NS_OK and we fail in OnStopRequest).  We actually took out exactly the lines of code you're adding in your patch.  And we did it to support redirects correctly...

I'm not opposed to this patch if it stops crashes or is somehow the right thing to do in the RTSP case.  Just make sure it makes sense here given bug 773475.  I'm skeptical.

In any case let's move all the patches/work here to a new bug.
Attachment #8346051 - Flags: review?(jduell.mcbugs) → feedback-
(In reply to Jason Duell (:jduell) from comment #35)
> Comment on attachment 8346051 [details] [diff] [review]
> patch2, v2
> 
> Review of attachment 8346051 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> We should fork this bug now that FF 28 has forked--only the patch that
> landed belongs in this bug.
> 
> > RtspChannel::AsyncOpen is calling OnStartRequest from inside before it returns is I believe a contract violation.
> 
> We should file a bug for that.  Easy enough to dispatch a new event to the
> main thread?
> 
> > RtspChannel::AsyncOpen must respect the fact the channel has been canceled and return the failure w/o notifying the listener
> 
> Hmm.  But we deliberately do the opposite in nsHttpChannel (see bug 773475).
> I.e. for nsHttpChannel if you do
>   
>    channel == getChannel
>    channel.Cancel(errCode)
>    channel.AsyncOpen()
> 

Agr... nsHttpChannel::BeginConnect() doesn't return mStatus when canceled but NS_OK and fails asynchronously...

OK, then the simple patch is not possible :(

> we actually fail asynchronously (i.e. AsyncOpen still returns NS_OK and we
> fail in OnStopRequest).  We actually took out exactly the lines of code
> you're adding in your patch.  And we did it to support redirects correctly...
> 
> I'm not opposed to this patch if it stops crashes or is somehow the right
> thing to do in the RTSP case.  Just make sure it makes sense here given bug
> 773475.  I'm skeptical.

Good point, I forgot this.

> 
> In any case let's move all the patches/work here to a new bug.

Yep, not that simple as I thought...


Closing this bug for now.
Status: ASSIGNED → RESOLVED
Closed: 6 years ago6 years ago
Resolution: --- → FIXED
Blocks: 949667
Attachment #8346033 - Attachment is obsolete: true
Attachment #8346033 - Flags: feedback?(jduell.mcbugs)
Attachment #8346051 - Attachment is obsolete: true
blocking+ for crash regression
blocking-b2g: 1.3? → 1.3+
You need to log in before you can comment on or make changes to this bug.