DataChannel leak on shutdown if peerconnection instances are not released

RESOLVED FIXED in Firefox 22

Status

()

P1
normal
RESOLVED FIXED
6 years ago
6 years ago

People

(Reporter: whimboo, Assigned: jesup)

Tracking

({memory-leak})

23 Branch
mozilla23
memory-leak
Points:
---

Firefox Tracking Flags

(firefox21 unaffected, firefox22 fixed, firefox23 fixed)

Details

(Whiteboard: [WebRTC][blocking-webrtc+][MemShrink:P3][qa-automation-blocked][qa-])

Attachments

(2 attachments)

I experience a leak in the DataChannel code on shutdown. It happens when createDataChannel() is called before the createOffer() call, and a shutdown of the browser without releasing the created peerconnection instances.

I will attach a simple patch which let you demonstrate the leak with the current testing code. Just run any of our basic peer connection tests.

Leak output:

TEST-INFO | leakcheck | leaked 1 DataChannel (144 bytes)
TEST-INFO | leakcheck | leaked 1 Mutex (24 bytes)
TEST-INFO | leakcheck | leaked 2 nsTArray_base (16 bytes)

TEST-UNEXPECTED-FAIL| leakcheck | 184 bytes leaked (DataChannel, Mutex, nsTArray_base)
Assignee: nobody → rjesup
Priority: -- → P1
Whiteboard: [WebRTC][MemShrink][qa-automation-blocked] → [WebRTC][blocking-webrtc+][MemShrink][qa-automation-blocked]
Target Milestone: --- → mozilla23
status-firefox21: --- → unaffected
status-firefox22: --- → affected
status-firefox23: --- → affected
OS: Linux → All
Hardware: x86_64 → All
Comment on attachment 738030 [details] [diff] [review]
Don't lose reference to queued DataChannels when they open, handle null labels properly

Assigning to mcmanus as this is just a reference-counting issue (and a minor foobar with "" labels where we weren't accounting for the base structure having a not-always-used char[1] trailing).  

ProcessQueuedOpens() was calling OpenFinish(), but ignoring that it returned an already_AddRefed DataChannel and so dropping the ref on the floor (too bad the compiler can't whine about that)

The rest is just comments/LOG changes
Attachment #738030 - Flags: review?(mcmanus)
Comment on attachment 738030 [details] [diff] [review]
Don't lose reference to queued DataChannels when they open, handle null labels properly

Review of attachment 738030 [details] [diff] [review]:
-----------------------------------------------------------------

go ahead and plug the leak with this patch. Add NS_WARN_UNUSED_RESULT ?

::: netwerk/sctp/datachannel/DataChannel.cpp
@@ +1295,5 @@
>  
>    switch (ppid) {
>      case DATA_CHANNEL_PPID_CONTROL:
> +      // structure includes a possibly-unused char label[1] (in a packed structure)
> +      NS_ENSURE_TRUE_VOID(length >= sizeof(*req) - 1);

I don't understand how is this germane
Attachment #738030 - Flags: review?(mcmanus) → review+
(In reply to Patrick McManus [:mcmanus] from comment #3)
> Comment on attachment 738030 [details] [diff] [review]
> Don't lose reference to queued DataChannels when they open, handle null
> labels properly
> 
> Review of attachment 738030 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> go ahead and plug the leak with this patch. Add NS_WARN_UNUSED_RESULT ?
> 
> ::: netwerk/sctp/datachannel/DataChannel.cpp
> @@ +1295,5 @@
> >  
> >    switch (ppid) {
> >      case DATA_CHANNEL_PPID_CONTROL:
> > +      // structure includes a possibly-unused char label[1] (in a packed structure)
> > +      NS_ENSURE_TRUE_VOID(length >= sizeof(*req) - 1);
> 
> I don't understand how is this germane

Because the testcase here uses a label of "" which was failing due to the test (no label bytes were included in the Open packet, so it had 12 bytes, but sizeof(*req) is 13 due to the char label[1].

This is why the patch also had "handle null labels properly" in the title; sorry I didn't explain in the bug (and perhaps the comment in the code wasn't sufficiently clear).

I.e. this testcase invokes two bugs, both important to fix (and simple).
(In reply to Patrick McManus [:mcmanus] from comment #3)
> Comment on attachment 738030 [details] [diff] [review]
> Don't lose reference to queued DataChannels when they open, handle null
> labels properly
> 
> Review of attachment 738030 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> go ahead and plug the leak with this patch. Add NS_WARN_UNUSED_RESULT ?

Perhaps we should add that to all already_AddRefed<>'d-returning functions
> TEST-UNEXPECTED-FAIL| leakcheck | 184 bytes leaked (DataChannel, Mutex,
> nsTArray_base)

MemShrink:P3 because the leak is so small.  khuey will file a bug on the general issue.
Whiteboard: [WebRTC][blocking-webrtc+][MemShrink][qa-automation-blocked] → [WebRTC][blocking-webrtc+][MemShrink:P3][qa-automation-blocked]
https://hg.mozilla.org/integration/mozilla-inbound/rev/23ed57f33db0

This is just the leak portion; I'll break off the null label as a separate issue.
Whiteboard: [WebRTC][blocking-webrtc+][MemShrink:P3][qa-automation-blocked] → [WebRTC][blocking-webrtc+][MemShrink:P3][qa-automation-blocked][webrtc-uplift]

Comment 8

6 years ago
Comment on attachment 738030 [details] [diff] [review]
Don't lose reference to queued DataChannels when they open, handle null labels properly

Review of attachment 738030 [details] [diff] [review]:
-----------------------------------------------------------------

Looks good to me

::: netwerk/sctp/datachannel/DataChannel.cpp
@@ +1300,1 @@
>  

I think it is correct. The -1 compensates for the declaration
of the final field as label[1] instead of label[0].
https://hg.mozilla.org/mozilla-central/rev/23ed57f33db0
Status: NEW → RESOLVED
Last Resolved: 6 years ago
Resolution: --- → FIXED
Whiteboard: [WebRTC][blocking-webrtc+][MemShrink:P3][qa-automation-blocked][webrtc-uplift] → [WebRTC][blocking-webrtc+][MemShrink:P3][qa-automation-blocked][webrtc-uplift][qa-]
Comment on attachment 738030 [details] [diff] [review]
Don't lose reference to queued DataChannels when they open, handle null labels properly

[Approval Request Comment]
Bug caused by (feature/regressing bug #): 855623

User impact if declined: Channels with empty ("") labels won't open.  Leak that blocks landing any DataChannel tests in Aurora.

Testing completed (on m-c, etc.): on m-c, tested with testcase in bug.

Risk to taking this patch (and alternatives if risky): nil - avoids dropping a returned reference on the floor.

String or IDL/UUID changes made by this patch: none
Attachment #738030 - Flags: approval-mozilla-aurora?

Updated

6 years ago
status-firefox23: affected → fixed
Attachment #738030 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
https://hg.mozilla.org/releases/mozilla-aurora/rev/bb3c481d2f70
status-firefox22: affected → fixed
Whiteboard: [WebRTC][blocking-webrtc+][MemShrink:P3][qa-automation-blocked][webrtc-uplift][qa-] → [WebRTC][blocking-webrtc+][MemShrink:P3][qa-automation-blocked][qa-]
You need to log in before you can comment on or make changes to this bug.