Closed Bug 1199862 Opened 9 years ago Closed 9 years ago

[e10s] crash in nsCOMPtr_base::assign_with_AddRef(nsISupports*zilla::net::HttpChannelParent::RecvDivertOnStopRequest(nsresult const&))

Categories

(Core :: Networking: HTTP, defect)

43 Branch
x86
Windows NT
defect
Not set
critical

Tracking

()

RESOLVED FIXED
mozilla43
Tracking Status
e10s m8+ ---
firefox43 --- fixed

People

(Reporter: epinal99-bugzilla2, Assigned: dragana)

References

Details

(Keywords: crash, reproducible, Whiteboard: [mozfr-community])

Crash Data

Attachments

(1 file, 2 obsolete files)

This bug was filed from the Socorro interface and is 
report bp-0db3adb3-c266-4749-bde5-977732150829.
=============================================================

I hit this crash when testing bug 1199782.

1) Start Nightly in e10s mode
2) Open http://agpl.fsf.org/emailselfdefense.fsf.org/edward/CURRENT/ and click on the file "edward.tgz.sig" to download it
3) Close immediately the error dialog box (yellow triangle) by clicking on the X button

Sometimes it doesn't crash, just spam the X button of the error dialog box to crash, it's a matter of "timing", I guess.

Result: all the browser crashes!

2 CR with similar signatures:

Firefox 43.0a1 Crash Report [@ nsCOMPtr_base::assign_with_AddRef(nsISupports*) | PL_HashTableLookupConst | mozilla::net::HttpChannelParent::RecvDivertOnStopRequest(nsresult const&) ]
https://crash-stats.mozilla.com/report/index/54875785-4431-49ac-b012-91cd02150829

Firefox 43.0a1 Crash Report [@ nsCOMPtr_base::assign_with_AddRef(nsISupports*) | mozilla::net::HttpChannelParentListener::OnStopRequest(nsIRequest*, nsISupports*, nsresult) ]
https://crash-stats.mozilla.com/report/index/8e1c1ea6-ae8d-4c02-877a-73e142150829
Summary: [e10s] crash in nsCOMPtr_base::assign_with_AddRef(nsISupports*zilla::net::HttpChannelParent::RecvDivertOnStopRequest(nsresult const&)) | PL_HashTableLookupConst | mo → [e10s] crash in nsCOMPtr_base::assign_with_AddRef(nsISupports*zilla::net::HttpChannelParent::RecvDivertOnStopRequest(nsresult const&))
Blocks: e10s
The analysis:

HttpChannelParent calls StopRequest of the listener which prompt the error dialog and does not return from the function but calls next function in the queue.
On of the next functions is RecvDivertComplited which deletes the listener. So when the prompt returns it finish the function on already destroyed listener and that crashes.
Assignee: nobody → dd.mozilla
Status: NEW → ASSIGNED
This sounds like the problem with sync XHR/alert that we fixed back in 2010 by creating a queue of IPDL necko events and deferring running them until after we return from the listener calls. What changed?
Oh, this is happening in the parent this time :(
Attached patch bug_1199862_v1.patch (obsolete) — Splinter Review
Attachment #8655367 - Flags: review?(jduell.mcbugs)
Attached patch bug_1199862_v1.patch (obsolete) — Splinter Review
Attachment #8655367 - Attachment is obsolete: true
Attachment #8655367 - Flags: review?(jduell.mcbugs)
Attachment #8655451 - Flags: review?(jduell.mcbugs)
Comment on attachment 8655451 [details] [diff] [review]
bug_1199862_v1.patch

Review of attachment 8655451 [details] [diff] [review]:
-----------------------------------------------------------------

This looks good to me.  I assume you've tested it against the STR in comment 0?

In HttpChannelChild we hook :Suspend() so it also calls Suspend on the ChannelEventQueue.  I was wondering if we need something like that here.  Looking over the existing code (and with this patch) I'm under the impression that our DivertTo logic does not handle the case where we 1) DivertTo() a channel on the parent, and then 2) some external code (download manager) suspends that channel during the IPDL diversion process.  It looks to me like we just call OnStartRequest/OnDataAvailable as the diverted IPDL messages arrive.  I think that's a bug (though one that's unlikely to happen in practice, unless a user tried to suspend a download immediately after they start it--even that might be hard to human fingers to do in time?).  Fortunately enough, adding mEventQueue to the parent should make fixing that easier.  I assume we'd need to check the suspendCount of mChannel each time a diverted message arrives, and if it's >1 (i.e. if the channel has been suspended more than just when we did it as part of SuspendForDiversion()), then we could queue the message.  Except we'd need some notification of when the channel was resumed back down to count==1.  That may be a little messy.  Anyway, if I'm right about this, it's clearly a new bug, and shouldn't block this one.   But think over it, and file a bug if you also agree we've got a bug here.
Attachment #8655451 - Flags: review?(jduell.mcbugs) → review+
Flags: needinfo?(dd.mozilla)
(In reply to Jason Duell [:jduell] (needinfo? me) from comment #7)
> Comment on attachment 8655451 [details] [diff] [review]
> bug_1199862_v1.patch
> 
> Review of attachment 8655451 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> This looks good to me.  I assume you've tested it against the STR in comment
> 0?
> 
> In HttpChannelChild we hook :Suspend() so it also calls Suspend on the
> ChannelEventQueue.  I was wondering if we need something like that here. 
> Looking over the existing code (and with this patch) I'm under the
> impression that our DivertTo logic does not handle the case where we 1)
> DivertTo() a channel on the parent, and then 2) some external code (download
> manager) suspends that channel during the IPDL diversion process.  It looks
> to me like we just call OnStartRequest/OnDataAvailable as the diverted IPDL
> messages arrive.  I think that's a bug (though one that's unlikely to happen
> in practice, unless a user tried to suspend a download immediately after
> they start it--even that might be hard to human fingers to do in time?). 
> Fortunately enough, adding mEventQueue to the parent should make fixing that
> easier.  I assume we'd need to check the suspendCount of mChannel each time
> a diverted message arrives, and if it's >1 (i.e. if the channel has been
> suspended more than just when we did it as part of SuspendForDiversion()),
> then we could queue the message.  Except we'd need some notification of when
> the channel was resumed back down to count==1.  That may be a little messy. 
> Anyway, if I'm right about this, it's clearly a new bug, and shouldn't block
> this one.   But think over it, and file a bug if you also agree we've got a
> bug here.

I will file another bug for this.
Flags: needinfo?(dd.mozilla)
Keywords: checkin-needed
Hi, this patch failed to apply:

patching file netwerk/protocol/http/HttpChannelParent.h
Hunk #1 FAILED at 152
1 out of 2 hunks FAILED -- saving rejects to file netwerk/protocol/http/HttpChannelParent.h.rej

could you take a look, thanks!
Flags: needinfo?(dd.mozilla)
Keywords: checkin-needed
Rebased. 
Thanks!
Attachment #8655451 - Attachment is obsolete: true
Flags: needinfo?(dd.mozilla)
Attachment #8656484 - Flags: review+
Keywords: checkin-needed
https://hg.mozilla.org/mozilla-central/rev/ef818b20a5d8
Status: ASSIGNED → RESOLVED
Closed: 9 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla43
Whiteboard: [mozfr-community]
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: