[e10s] crash in nsCOMPtr_base::assign_with_AddRef(nsISupports*zilla::net::HttpChannelParent::RecvDivertOnStopRequest(nsresult const&))

RESOLVED FIXED in Firefox 43

Status

()

Core
Networking: HTTP
--
critical
RESOLVED FIXED
3 years ago
2 years ago

People

(Reporter: Loic, Assigned: dragana)

Tracking

({crash, reproducible})

43 Branch
mozilla43
x86
Windows NT
crash, reproducible
Points:
---

Firefox Tracking Flags

(e10sm8+, firefox43 fixed)

Details

(Whiteboard: [mozfr-community], crash signature)

Attachments

(1 attachment, 2 obsolete attachments)

(Reporter)

Description

3 years ago
This bug was filed from the Socorro interface and is 
report bp-0db3adb3-c266-4749-bde5-977732150829.
=============================================================

I hit this crash when testing bug 1199782.

1) Start Nightly in e10s mode
2) Open http://agpl.fsf.org/emailselfdefense.fsf.org/edward/CURRENT/ and click on the file "edward.tgz.sig" to download it
3) Close immediately the error dialog box (yellow triangle) by clicking on the X button

Sometimes it doesn't crash, just spam the X button of the error dialog box to crash, it's a matter of "timing", I guess.

Result: all the browser crashes!

2 CR with similar signatures:

Firefox 43.0a1 Crash Report [@ nsCOMPtr_base::assign_with_AddRef(nsISupports*) | PL_HashTableLookupConst | mozilla::net::HttpChannelParent::RecvDivertOnStopRequest(nsresult const&) ]
https://crash-stats.mozilla.com/report/index/54875785-4431-49ac-b012-91cd02150829

Firefox 43.0a1 Crash Report [@ nsCOMPtr_base::assign_with_AddRef(nsISupports*) | mozilla::net::HttpChannelParentListener::OnStopRequest(nsIRequest*, nsISupports*, nsresult) ]
https://crash-stats.mozilla.com/report/index/8e1c1ea6-ae8d-4c02-877a-73e142150829
(Reporter)

Updated

3 years ago
Summary: [e10s] crash in nsCOMPtr_base::assign_with_AddRef(nsISupports*zilla::net::HttpChannelParent::RecvDivertOnStopRequest(nsresult const&)) | PL_HashTableLookupConst | mo → [e10s] crash in nsCOMPtr_base::assign_with_AddRef(nsISupports*zilla::net::HttpChannelParent::RecvDivertOnStopRequest(nsresult const&))
(Reporter)

Updated

3 years ago
Blocks: 516752
(Assignee)

Comment 1

3 years ago
The analysis:

HttpChannelParent calls StopRequest of the listener which prompt the error dialog and does not return from the function but calls next function in the queue.
On of the next functions is RecvDivertComplited which deletes the listener. So when the prompt returns it finish the function on already destroyed listener and that crashes.
(Assignee)

Updated

3 years ago
Assignee: nobody → dd.mozilla
Status: NEW → ASSIGNED
This sounds like the problem with sync XHR/alert that we fixed back in 2010 by creating a queue of IPDL necko events and deferring running them until after we return from the listener calls. What changed?
Oh, this is happening in the parent this time :(
(Assignee)

Comment 4

3 years ago
Created attachment 8655367 [details] [diff] [review]
bug_1199862_v1.patch
Attachment #8655367 - Flags: review?(jduell.mcbugs)
(Assignee)

Comment 5

3 years ago
Created attachment 8655451 [details] [diff] [review]
bug_1199862_v1.patch
Attachment #8655367 - Attachment is obsolete: true
Attachment #8655367 - Flags: review?(jduell.mcbugs)
Attachment #8655451 - Flags: review?(jduell.mcbugs)
tracking-e10s: --- → m8+
Comment on attachment 8655451 [details] [diff] [review]
bug_1199862_v1.patch

Review of attachment 8655451 [details] [diff] [review]:
-----------------------------------------------------------------

This looks good to me.  I assume you've tested it against the STR in comment 0?

In HttpChannelChild we hook :Suspend() so it also calls Suspend on the ChannelEventQueue.  I was wondering if we need something like that here.  Looking over the existing code (and with this patch) I'm under the impression that our DivertTo logic does not handle the case where we 1) DivertTo() a channel on the parent, and then 2) some external code (download manager) suspends that channel during the IPDL diversion process.  It looks to me like we just call OnStartRequest/OnDataAvailable as the diverted IPDL messages arrive.  I think that's a bug (though one that's unlikely to happen in practice, unless a user tried to suspend a download immediately after they start it--even that might be hard to human fingers to do in time?).  Fortunately enough, adding mEventQueue to the parent should make fixing that easier.  I assume we'd need to check the suspendCount of mChannel each time a diverted message arrives, and if it's >1 (i.e. if the channel has been suspended more than just when we did it as part of SuspendForDiversion()), then we could queue the message.  Except we'd need some notification of when the channel was resumed back down to count==1.  That may be a little messy.  Anyway, if I'm right about this, it's clearly a new bug, and shouldn't block this one.   But think over it, and file a bug if you also agree we've got a bug here.
Attachment #8655451 - Flags: review?(jduell.mcbugs) → review+
Flags: needinfo?(dd.mozilla)
(Assignee)

Comment 8

3 years ago
(In reply to Jason Duell [:jduell] (needinfo? me) from comment #7)
> Comment on attachment 8655451 [details] [diff] [review]
> bug_1199862_v1.patch
> 
> Review of attachment 8655451 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> This looks good to me.  I assume you've tested it against the STR in comment
> 0?
> 
> In HttpChannelChild we hook :Suspend() so it also calls Suspend on the
> ChannelEventQueue.  I was wondering if we need something like that here. 
> Looking over the existing code (and with this patch) I'm under the
> impression that our DivertTo logic does not handle the case where we 1)
> DivertTo() a channel on the parent, and then 2) some external code (download
> manager) suspends that channel during the IPDL diversion process.  It looks
> to me like we just call OnStartRequest/OnDataAvailable as the diverted IPDL
> messages arrive.  I think that's a bug (though one that's unlikely to happen
> in practice, unless a user tried to suspend a download immediately after
> they start it--even that might be hard to human fingers to do in time?). 
> Fortunately enough, adding mEventQueue to the parent should make fixing that
> easier.  I assume we'd need to check the suspendCount of mChannel each time
> a diverted message arrives, and if it's >1 (i.e. if the channel has been
> suspended more than just when we did it as part of SuspendForDiversion()),
> then we could queue the message.  Except we'd need some notification of when
> the channel was resumed back down to count==1.  That may be a little messy. 
> Anyway, if I'm right about this, it's clearly a new bug, and shouldn't block
> this one.   But think over it, and file a bug if you also agree we've got a
> bug here.

I will file another bug for this.
Flags: needinfo?(dd.mozilla)
(Assignee)

Updated

3 years ago
Keywords: checkin-needed
Hi, this patch failed to apply:

patching file netwerk/protocol/http/HttpChannelParent.h
Hunk #1 FAILED at 152
1 out of 2 hunks FAILED -- saving rejects to file netwerk/protocol/http/HttpChannelParent.h.rej

could you take a look, thanks!
Flags: needinfo?(dd.mozilla)
Keywords: checkin-needed
(Assignee)

Comment 10

3 years ago
Created attachment 8656484 [details] [diff] [review]
bug_1199862_v1.patch

Rebased. 
Thanks!
Attachment #8655451 - Attachment is obsolete: true
Flags: needinfo?(dd.mozilla)
Attachment #8656484 - Flags: review+
(Assignee)

Updated

3 years ago
Keywords: checkin-needed
https://hg.mozilla.org/mozilla-central/rev/ef818b20a5d8
Status: ASSIGNED → RESOLVED
Last Resolved: 3 years ago
status-firefox43: affected → fixed
Resolution: --- → FIXED
Target Milestone: --- → mozilla43
(Reporter)

Updated

2 years ago
Whiteboard: [mozfr-community]
You need to log in before you can comment on or make changes to this bug.