Crash in mozilla::net::HttpChannelParent::NotifyDiversionFailed

RESOLVED FIXED in Firefox 56

Status

()

defect
--
critical
RESOLVED FIXED
2 years ago
2 years ago

People

(Reporter: marcia, Assigned: xeonchen)

Tracking

({crash})

55 Branch
mozilla56
Unspecified
Windows 10
Points:
---
Dependency tree / graph

Firefox Tracking Flags

(firefox-esr52 unaffected, firefox54 unaffected, firefox55 wontfix, firefox56 fixed)

Details

(Whiteboard: [necko-active], crash signature)

Attachments

(2 attachments)

This bug was filed from the Socorro interface and is 
report bp-85719858-3a04-4a30-8c77-a4bb60170427.
=============================================================

Seen while looking at nightly crash stats - started using 20170426030329 build: http://bit.ly/2oNydmP. Fairly low volume crash so far.

Possible regression range based on Build ID: https://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=a30dc237c3a600a5231f2974fc2b85dfb5513414&tochange=0f5ba06c4c5959030a05cb852656d854065e2226

URLs:

http://www.ceramicrecords.net/yzcr001/
http://www.chip.de/downloads/Steam_35910421.html

Comment 1

2 years ago
Dragana - this is a long shot to be sure, but you're the only one touching necko code I see in the regression range above. Do you think the TFO stuff from bug 1188435 could have anything to do with this? If not, just clear ni? and I'll get this re-triaged. Thanks!
Flags: needinfo?(dd.mozilla)
(In reply to Nicholas Hurley [:nwgh][:hurley] (also hurley@todesschaf.org) from comment #1)
> Dragana - this is a long shot to be sure, but you're the only one touching
> necko code I see in the regression range above. Do you think the TFO stuff
> from bug 1188435 could have anything to do with this? If not, just clear ni?
> and I'll get this re-triaged. Thanks!

That one was backed out already in m.-i. :)
Flags: needinfo?(dd.mozilla)
correct me if I'm wrong
Status: NEW → RESOLVED
Closed: 2 years ago
Resolution: --- → DUPLICATE
Duplicate of bug: 1188435
Blocks: 1188435
Status: RESOLVED → REOPENED
Resolution: DUPLICATE → ---
Assignee: nobody → dd.mozilla
Whiteboard: [necko-active]
This is not TFO. there are crashes even before TFO landed.

I will unassigned my self, I am not sure I have time to take a look.
Long shot but I will ask. sc, have you change something a 25.4.?
Flags: needinfo?(schien)
sc, you were doing stuff around parent-child process.
Here is my analysis.

Taking a one-year-long query: https://crash-stats.mozilla.com/signature/?signature=mozilla%3A%3Anet%3A%3AHttpChannelParent%3A%3ANotifyDiversionFailed&date=%3E%3D2016-06-09T01%3A23%3A00.000Z&date=%3C2017-06-08T01%3A23%3A35.000Z#graphs

1) This crash signature exists long ago but with a relatively low frequency (about 1~2 crash per week).
2) Most of the crashes happens in Version 55 (48.7%)
3) Crash rate ramps up after buildID 20170418004027 on nightly channel (version 54)
4) Crash rate peak between buildID 20170504105526 and 20170507030205 on nightly channel
5) Crash report stops at buildID 20170523030206 on nightly channel

The PBg-Http modification that touches HTTP IPC lands on buildID 20170328095415 and 20170602030204, which have very little correlation with this crash IMO.
Flags: needinfo?(schien)
By looking at the crash symptom, it is complaining about mParentListener is removed before the HTTPFailDiversionEvent runnable get executed.

There are only 3 functions that will remove mParentListener:
1) HttpChannelParent::AsyncOpenFailed
  Not sure if channel diversion can happen before async open complete.

2) HttpChannelParent::DivertComplete
  Not sure if there is any race condition that between DivertComplete and FailDiversion. We should add diagnostic assertion while we detect it.

3) HttpChannelParent::NotifyDiversionFailed
  Will only cause problem if HttpChannelParent::FailDiversion is called multiple times. This is possible because our diversion IPC messsage handler will invoke FailDiversion if anything go wrong and usually there will be more diversion IPC messages waiting for process (e.g. multiple DivertODA). We should be able to ensure there is only one HTTPFailDiversionEvent fired.
(In reply to Shih-Chiang Chien [:schien] (UTC+8) (use ni? plz) from comment #7)
> 3) HttpChannelParent::NotifyDiversionFailed
>   Will only cause problem if HttpChannelParent::FailDiversion is called
> multiple times. This is possible because our diversion IPC messsage handler
> will invoke FailDiversion if anything go wrong and usually there will be
> more diversion IPC messages waiting for process (e.g. multiple DivertODA).
> We should be able to ensure there is only one HTTPFailDiversionEvent fired.

There is also another possiblity for this.  If FailDiversion() is called inside ResumeForDiversion(), it will always be called again after ResumeForDiversion() returns error.

https://searchfox.org/mozilla-central/rev/a798ee4fc323f9387b7576dbed177859d29d09b7/netwerk/protocol/http/HttpChannelParent.cpp#1316
Comment hidden (mozreview-request)
Comment hidden (mozreview-request)
Assignee

Updated

2 years ago
Assignee: dd.mozilla → xeonchen

Comment 11

2 years ago
mozreview-review
Comment on attachment 8877405 [details]
Bug 1360332 - Part 0: remove unused function parameter;

https://reviewboard.mozilla.org/r/148802/#review153894
Attachment #8877405 - Flags: review?(dd.mozilla) → review+

Comment 12

2 years ago
mozreview-review
Comment on attachment 8877406 [details]
Bug 1360332 - Part 1: avoid calling FailDiversion() twice;

https://reviewboard.mozilla.org/r/148804/#review153900
Attachment #8877406 - Flags: review?(dd.mozilla) → review+

Comment 13

2 years ago
Pushed by gachen@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/69564ef15348
Part 0: remove unused function parameter; r=dragana
https://hg.mozilla.org/integration/autoland/rev/2e0028f31611
Part 1: avoid calling FailDiversion() twice; r=dragana
Assignee

Updated

2 years ago
Keywords: leave-open
Assignee

Comment 15

2 years ago
ni? myself to check this several days later
Flags: needinfo?(xeonchen)
Assignee

Comment 16

2 years ago
No recent crash on Nightly, basically crashes are on aurora/beta, so I think we should close this first.
Status: REOPENED → RESOLVED
Closed: 2 years ago2 years ago
Flags: needinfo?(xeonchen)
Resolution: --- → FIXED
Keywords: leave-open
Target Milestone: --- → mozilla56
You need to log in before you can comment on or make changes to this bug.