Intermittent data channel leaks - "9378 bytes leaked (DataChannel, DataChannelConnection, DtlsIdentity, Mutex, NrIceCtx, ...)"

RESOLVED FIXED in Firefox 22

Status


Core
WebRTC
RESOLVED FIXED
4 years ago
4 years ago

People

(Reporter: whimboo, Assigned: jesup)

Tracking

(Depends on: 1 bug, {intermittent-failure, mlk})

24 Branch
mozilla24
intermittent-failure, mlk

Firefox Tracking Flags

(firefox21 disabled, firefox22+ fixed, firefox23+ fixed, firefox24+ fixed)

Details

(Whiteboard: [WebRTC][blocking-webrtc+][MemShrink][qa-automation-blocked][qa-])

Attachments

(3 attachments, 2 obsolete attachments)

(Reporter)

Description

4 years ago
With attachment 750174 [details] [diff] [review] (bug 796894) applied to a mozilla-central checkout, I intermittently see the following leak. It appears to happen only when I'm connected to the MPT VPN. To be safe, I am setting qa-automation-blocked for now, because the datachannel tests cannot land if this is a general issue.

TEST-INFO | leakcheck | leaked 2 DataChannel (288 bytes)
TEST-INFO | leakcheck | leaked 2 DataChannelConnection (928 bytes)
TEST-INFO | leakcheck | leaked 2 DtlsIdentity (64 bytes)
TEST-INFO | leakcheck | leaked 6 Mutex (144 bytes)
TEST-INFO | leakcheck | leaked 2 NrIceCtx (384 bytes)
TEST-INFO | leakcheck | leaked 6 NrIceMediaStream (1008 bytes)
TEST-INFO | leakcheck | leaked 2 NrIceResolver (80 bytes)
TEST-INFO | leakcheck | leaked 24 NrSocket (4800 bytes)
TEST-INFO | leakcheck | leaked 1 ReentrantMonitor (32 bytes)
TEST-INFO | leakcheck | leaked 2 StringAdopt (2 bytes)
TEST-INFO | leakcheck | leaked 2 TransportFlow (352 bytes)
TEST-INFO | leakcheck | leaked 2 VerificationDigest (176 bytes)
TEST-INFO | leakcheck | leaked 1 nsDNSService (144 bytes)
TEST-INFO | leakcheck | leaked 2 nsDeque (192 bytes)
TEST-INFO | leakcheck | leaked 1 nsIDNService (120 bytes)
TEST-INFO | leakcheck | leaked 1 nsPrefBranch (128 bytes)
TEST-INFO | leakcheck | leaked 1 nsSocketTransportService (216 bytes)
TEST-INFO | leakcheck | leaked 3 nsStringBuffer (24 bytes)
TEST-INFO | leakcheck | leaked 10 nsTArray_base (80 bytes)
TEST-INFO | leakcheck | leaked 2 nsTimerImpl (192 bytes)
TEST-INFO | leakcheck | leaked 1 nsUnicodeNormalizer (24 bytes) 
TEST-UNEXPECTED-FAIL | leakcheck | 9378 bytes leaked (DataChannel, DataChannelConnection, DtlsIdentity, Mutex, NrIceCtx, ...)
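
For anyone comparing leak reports across runs, here is a minimal sketch of tallying the leakcheck lines. It assumes the log format is exactly as pasted above; the names `LEAK_LINE`, `parse_leaks` and `total_leaked_bytes` are hypothetical helpers and not part of the mochitest harness. The per-class byte counts in the report above add up to the 9378 bytes in the TEST-UNEXPECTED-FAIL line; the sample below uses only three of those lines to stay short.

```python
import re

# Matches lines such as:
#   TEST-INFO | leakcheck | leaked 2 DataChannel (288 bytes)
# The pattern is an assumption based on the log excerpt in this bug.
LEAK_LINE = re.compile(
    r"TEST-INFO \| leakcheck \| leaked (\d+) (\S+) \((\d+) bytes\)"
)


def parse_leaks(log_text):
    """Return {class_name: (instance_count, total_bytes)} for each leakcheck line."""
    leaks = {}
    for match in LEAK_LINE.finditer(log_text):
        count, name, size = match.groups()
        leaks[name] = (int(count), int(size))
    return leaks


def total_leaked_bytes(leaks):
    """Sum the byte totals over all leaked classes."""
    return sum(size for _count, size in leaks.values())


# Three lines copied from the report above.
SAMPLE = """\
TEST-INFO | leakcheck | leaked 2 DataChannel (288 bytes)
TEST-INFO | leakcheck | leaked 2 DataChannelConnection (928 bytes)
TEST-INFO | leakcheck | leaked 24 NrSocket (4800 bytes)
"""

leaks = parse_leaks(SAMPLE)
# List the biggest contributors first.
for name, (count, size) in sorted(leaks.items(), key=lambda kv: -kv[1][1]):
    print(f"{name}: {count} instances, {size} bytes")
print(total_leaked_bytes(leaks))  # 6016
```

Sorting by bytes makes it easier to see that NrSocket accounts for most of the leaked memory in this report.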
Did you try removing the patches and verifying that you could no longer reproduce the leaks?
(Reporter)

Comment 2

4 years ago
Removing which patches? The attachment I mentioned is needed for any data channel tests to run at all. We don't have any of those in our mochitest suite yet.
(Reporter)

Comment 3

4 years ago
I now get the leak without being connected to the MPT VPN, so it is indeed blocking us from landing the new data channel tests. I will try to narrow it down and find a simple testcase.
(Reporter)

Updated

4 years ago
Whiteboard: [WebRTC][MemShrink][qa-automation-blocked] → [WebRTC][blocking-webrtc?][MemShrink][qa-automation-blocked]

Updated

4 years ago
Whiteboard: [WebRTC][blocking-webrtc?][MemShrink][qa-automation-blocked] → [WebRTC][blocking-webrtc+][MemShrink][qa-automation-blocked]
(Reporter)

Updated

4 years ago
Blocks: 796894
(Reporter)

Comment 4

4 years ago
The same leak happens with the current version of my patch on bug 796894, and blocks us from getting the data channel tests landed.

This leak does not reproduce consistently, so it's hard to figure out what's going on. I will attach the log from a mochitest run in the hope that you can find something in it.
(Reporter)

Comment 5

4 years ago
Created attachment 750555 [details]
log output

stdout/stderr output to the console from the mochitest run.
(Reporter)

Comment 6

4 years ago
Created attachment 750573 [details]
log output (basic video + multiple channel)

It looks like the test_dataChannel_basicVideo.html test causes this leak most of the time. I will try to get a datachannel log.
(Reporter)

Comment 7

4 years ago
As the following try server run shows, the leak happens across platforms:
https://tbpl.mozilla.org/?tree=Try&rev=fb2864809c6b
OS: Linux → All
Hardware: x86_64 → All
(Reporter)

Comment 8

4 years ago
Created attachment 750622 [details]
NSPR log (datachannel:5,signaling:5)
Attachment #750555 - Attachment is obsolete: true
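
A minimal sketch of how a log with these modules might be requested, assuming the standard NSPR environment variables NSPR_LOG_MODULES and NSPR_LOG_FILE. The helper name `nspr_logging_env` and the file name `nspr.log` are hypothetical, and how the environment is handed to the test harness is left out.

```python
import os


def nspr_logging_env(modules, log_file, base_env=None):
    """Build an environment dict that enables NSPR logging.

    modules:  mapping of NSPR log module name -> verbosity level
    log_file: path NSPR should write the log to (hypothetical name below)
    base_env: environment to extend; defaults to a copy of os.environ
    """
    env = dict(os.environ if base_env is None else base_env)
    env["NSPR_LOG_MODULES"] = ",".join(
        f"{name}:{level}" for name, level in modules.items()
    )
    env["NSPR_LOG_FILE"] = log_file
    return env


# Module names and levels taken from the attachment title above.
env = nspr_logging_env({"datachannel": 5, "signaling": 5}, "nspr.log", base_env={})
print(env["NSPR_LOG_MODULES"])  # datachannel:5,signaling:5
```

The resulting dict could be passed as the `env` argument when launching the browser or test run from a script.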
(Reporter)

Updated

4 years ago
Attachment #750622 - Attachment mime type: text/x-log → text/plain

Updated

4 years ago
Keywords: intermittent-failure
Assignee: nobody → rjesup
(Assignee)

Comment 9

4 years ago
Created attachment 754142 [details] [diff] [review]
process any pending stream resets on incoming resets

WIP patch. No leaks in a heavily retriggered Try run (or locally after 15+ hours of mochitest runs), but there is a Windows (and maybe Linux) crash inside the SCTP library that appears to be an internal race condition between local and remote association shutdown. Michael Tuexen is looking at it, and I'm trying to reproduce it again locally on Linux with a stack backtrace.
Whiteboard: [WebRTC][blocking-webrtc+][MemShrink][qa-automation-blocked] → [WebRTC][blocking-webrtc+][MemShrink][qa-automation-blocked][webrtc-uplift]
status-firefox21: --- → disabled
status-firefox22: ? → affected
status-firefox23: ? → affected
tracking-firefox22: --- → ?
tracking-firefox23: --- → ?
tracking-firefox24: --- → ?
(Assignee)

Updated

4 years ago
Depends on: 876167
tracking-firefox22: ? → +
tracking-firefox23: ? → +
tracking-firefox24: ? → +
(Assignee)

Comment 10

4 years ago
Created attachment 757062 [details] [diff] [review]
process any pending stream resets on incoming resets
(Assignee)

Updated

4 years ago
Attachment #754142 - Attachment is obsolete: true
(Assignee)

Comment 11

4 years ago
Comment on attachment 757062 [details] [diff] [review]
process any pending stream resets on incoming resets

This change triggers the SCTP library bug in bug 876167 (see the workaround there). I'm sending a new try of the two patches, as the last try had one optional item here commented out in a failed attempt to avoid bug 876167.

This cleans up stream close handling, especially at association shutdown time.
Attachment #757062 - Flags: review?(tuexen)

Updated

4 years ago
Attachment #757062 - Flags: review?(tuexen) → review+
(Assignee)

Comment 12

4 years ago
https://hg.mozilla.org/integration/mozilla-inbound/rev/062a6a2269b5
Target Milestone: --- → mozilla24
https://hg.mozilla.org/mozilla-central/rev/062a6a2269b5
Status: NEW → RESOLVED
Last Resolved: 4 years ago
status-firefox24: affected → fixed
Resolution: --- → FIXED

Updated

4 years ago
Whiteboard: [WebRTC][blocking-webrtc+][MemShrink][qa-automation-blocked][webrtc-uplift] → [WebRTC][blocking-webrtc+][MemShrink][qa-automation-blocked][webrtc-uplift][qa-]
(Assignee)

Comment 14

4 years ago
Comment on attachment 757062 [details] [diff] [review]
process any pending stream resets on incoming resets

[Approval Request Comment]
Bug caused by (feature/regressing bug #): N/A

User impact if declined: Intermittent leak when closing PeerConnections when DataChannels are in use.  When the tests land (which they should this week), the intermittent failure will show up quite often on M3. It's unclear how often it would happen to users - probably rare, as it requires both sides to be closing at the same time.

Testing completed (on m-c, etc.): On M-C.  To solve bug 876167, we had to run a zillion Try retriggers and local mochitest runs.

Risk to taking this patch (and alternatives if risky): We need to take bug 876167 if we take this one (though that bug could probably still be hit with the right timing without this patch; this patch just makes it much easier to hit).

String or IDL/UUID changes made by this patch: none
Attachment #757062 - Flags: approval-mozilla-beta?
Attachment #757062 - Flags: approval-mozilla-aurora?
Attachment #757062 - Flags: approval-mozilla-beta? → approval-mozilla-beta+
Attachment #757062 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
https://hg.mozilla.org/releases/mozilla-aurora/rev/9005eaf15285

Doesn't apply cleanly to beta.
status-firefox23: affected → fixed
Keywords: branch-patch-needed
(Assignee)

Comment 16

4 years ago
https://hg.mozilla.org/releases/mozilla-beta/rev/931514a4b3be
status-firefox22: affected → fixed
Keywords: branch-patch-needed
Whiteboard: [WebRTC][blocking-webrtc+][MemShrink][qa-automation-blocked][webrtc-uplift][qa-] → [WebRTC][blocking-webrtc+][MemShrink][qa-automation-blocked][qa-]
Depends on: 889088