Crash in mozilla::DataChannel::GetBufferedAmount

RESOLVED FIXED in Firefox 51

Status

()

Core
WebRTC: Networking
P2
normal
Rank:
25
RESOLVED FIXED
2 years ago
a year ago

People

(Reporter: njn, Assigned: jesup)

Tracking

({crash})

Trunk
mozilla52
x86
Windows 8
crash
Points:
---

Firefox Tracking Flags

(firefox49 wontfix, firefox-esr45 affected, firefox50 wontfix, firefox51 fixed, firefox52 fixed)

Details

(crash signature)

Attachments

(1 attachment, 1 obsolete attachment)

(Reporter)

Description

2 years ago
This bug was filed from the Socorro interface and is 
report bp-c2d79dbd-da85-4531-9e0e-d9d1a2160612.
=============================================================

This crash signature first showed up on June 1st. Since then we have 87 occurrences. The crash address is always 0x14. Perhaps |mConnection| is null?

rjesup, any ideas?
Flags: needinfo?(rjesup)
(Assignee)

Comment 1

2 years ago
Yes, almost certainly a null pointer due to bug 1240209.
Assignee: nobody → rjesup
Severity: critical → normal
Rank: 17
Component: Networking → WebRTC: Networking
Depends on: 1240209
Flags: needinfo?(rjesup)
Priority: -- → P1
Crash volume for signature 'mozilla::DataChannel::GetBufferedAmount':
 - nightly (version 50): 75 crashes from 2016-06-06.
 - aurora  (version 49): 326 crashes from 2016-06-07.
 - beta    (version 48): 0 crash from 2016-06-06.
 - release (version 47): 0 crash from 2016-05-31.
 - esr     (version 45): 0 crash from 2016-04-07.

Crash volume on the last weeks:
             Week N-1   Week N-2   Week N-3   Week N-4   Week N-5   Week N-6   Week N-7
 - nightly          1          1          0          3         22         15         32
 - aurora           0          3         35         70         97         85         34
 - beta             0          0          0          0          0          0          0
 - release          0          0          0          0          0          0          0
 - esr              0          0          0          0          0          0          0

Affected platforms: Windows, Linux
status-firefox49: --- → affected
(Assignee)

Comment 3

a year ago
Created attachment 8802235 [details] [diff] [review]
make sure DataChannel isn't closed before trying to get buffered amount

the remaining crashes here are probably a race between the 'if' and locking mConnection.  mConnection can only be null if DestroyLocked was called on it, which sets the state to CLOSED
Attachment #8802235 - Flags: review?(drno)
(Assignee)

Updated

a year ago
Rank: 17 → 25
Priority: P1 → P2
Comment on attachment 8802235 [details] [diff] [review]
make sure DataChannel isn't closed before trying to get buffered amount

Review of attachment 8802235 [details] [diff] [review]:
-----------------------------------------------------------------

::: netwerk/sctp/datachannel/DataChannel.h
@@ +382,5 @@
>  
>    // Amount of data buffered to send
>    uint32_t GetBufferedAmount()
>    {
> +    if (mState != OPEN || !mConnection) {

I'm not 100% convinced that this will actually fix it, but as I don't have a better explanation and it certainly does not harm to have this extra safe guard.

But it makes me wonder if this https://dxr.mozilla.org/mozilla-central/rev/dc89484d4b45abf442162e5ea2dd46f9de40197d/netwerk/sctp/datachannel/DataChannel.h#402 reference to mConncetion should also be guarded the same way.
Attachment #8802235 - Flags: review?(drno) → review+
(Assignee)

Comment 5

a year ago
Created attachment 8802618 [details] [diff] [review]
more complete fix for issues surrounding mConnection clearing

yeah, access to mConnection and mState are and have been problematic, with the clearing in DestroyLocked().  This reworks things to have the clearing always happen via the notification to DOMDataChannel (on MainThread), and have DOMDataChannel block further access to the DataChannel object except for a few white-listed methods.
Attachment #8802618 - Flags: review?(drno)
(Assignee)

Updated

a year ago
Attachment #8802235 - Attachment is obsolete: true
Comment on attachment 8802618 [details] [diff] [review]
more complete fix for issues surrounding mConnection clearing

Review of attachment 8802618 [details] [diff] [review]:
-----------------------------------------------------------------

LGTM
Attachment #8802618 - Flags: review?(drno) → review+

Comment 7

a year ago
Pushed by rjesup@wgate.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/d4532c93e46c
more complete fix for issues surrounding mConnection clearing r=drno
(Assignee)

Comment 8

a year ago
Reminder to ask for uplift
status-firefox49: affected → wontfix
status-firefox51: --- → affected
status-firefox52: --- → affected
status-firefox-esr45: --- → affected
Flags: needinfo?(rjesup)

Comment 9

a year ago
bugherder
https://hg.mozilla.org/mozilla-central/rev/d4532c93e46c
Status: NEW → RESOLVED
Last Resolved: a year ago
status-firefox52: affected → fixed
Resolution: --- → FIXED
Target Milestone: --- → mozilla52
See Also: → bug 1297384
Randel do you want to request uplift to 51 before its too late?
(Assignee)

Comment 11

a year ago
Comment on attachment 8802618 [details] [diff] [review]
more complete fix for issues surrounding mConnection clearing

Approval Request Comment
[Feature/regressing bug #]: bug 1240209

[User impact if declined]: crashes and stability

[Describe test coverage new/current, TreeHerder]: Seen in crashstats only; race condition.  Fix baked.

[Risks and why]: Low risk; mirrors some state (so we can access it without taking a lock) and defers clearing the link to the parent connection to avoid races.

[String/UUID change made/needed]: none
Flags: needinfo?(rjesup)
Attachment #8802618 - Flags: approval-mozilla-aurora?
Comment on attachment 8802618 [details] [diff] [review]
more complete fix for issues surrounding mConnection clearing

Fix a crash. Aurora51+.
Attachment #8802618 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
(Assignee)

Comment 13

a year ago
https://hg.mozilla.org/releases/mozilla-aurora/rev/51b57560770b0d733148460981683fc1aeb0d284
status-firefox50: affected → wontfix
status-firefox51: affected → fixed
You need to log in before you can comment on or make changes to this bug.