Closed Bug 872978 Opened 10 years ago Closed 10 years ago

Intermittent data channel leaks - "9378 bytes leaked (DataChannel, DataChannelConnection, DtlsIdentity, Mutex, NrIceCtx, ...)"

Categories

(Core :: WebRTC, defect)

24 Branch
defect
Not set
normal

Tracking

RESOLVED FIXED
mozilla24
Tracking Status
firefox21 --- disabled
firefox22 + fixed
firefox23 + fixed
firefox24 + fixed

People

(Reporter: whimboo, Assigned: jesup)

Details

(Keywords: intermittent-failure, memory-leak, Whiteboard: [WebRTC][blocking-webrtc+][MemShrink][qa-automation-blocked][qa-])

Attachments

(3 files, 2 obsolete files)

With attachment 750174 (bug 796894) applied to a mozilla-central checkout, I intermittently see the following leak. It appears to happen only when I'm connected to the MPT VPN. To be safe, I'm setting qa-automation-blocked for now, because the datachannel tests cannot land if this is a general issue.

TEST-INFO | leakcheck | leaked 2 DataChannel (288 bytes)
TEST-INFO | leakcheck | leaked 2 DataChannelConnection (928 bytes)
TEST-INFO | leakcheck | leaked 2 DtlsIdentity (64 bytes)
TEST-INFO | leakcheck | leaked 6 Mutex (144 bytes)
TEST-INFO | leakcheck | leaked 2 NrIceCtx (384 bytes)
TEST-INFO | leakcheck | leaked 6 NrIceMediaStream (1008 bytes)
TEST-INFO | leakcheck | leaked 2 NrIceResolver (80 bytes)
TEST-INFO | leakcheck | leaked 24 NrSocket (4800 bytes)
TEST-INFO | leakcheck | leaked 1 ReentrantMonitor (32 bytes)
TEST-INFO | leakcheck | leaked 2 StringAdopt (2 bytes)
TEST-INFO | leakcheck | leaked 2 TransportFlow (352 bytes)
TEST-INFO | leakcheck | leaked 2 VerificationDigest (176 bytes)
TEST-INFO | leakcheck | leaked 1 nsDNSService (144 bytes)
TEST-INFO | leakcheck | leaked 2 nsDeque (192 bytes)
TEST-INFO | leakcheck | leaked 1 nsIDNService (120 bytes)
TEST-INFO | leakcheck | leaked 1 nsPrefBranch (128 bytes)
TEST-INFO | leakcheck | leaked 1 nsSocketTransportService (216 bytes)
TEST-INFO | leakcheck | leaked 3 nsStringBuffer (24 bytes)
TEST-INFO | leakcheck | leaked 10 nsTArray_base (80 bytes)
TEST-INFO | leakcheck | leaked 2 nsTimerImpl (192 bytes)
TEST-INFO | leakcheck | leaked 1 nsUnicodeNormalizer (24 bytes) 
TEST-UNEXPECTED-FAIL | leakcheck | 9378 bytes leaked (DataChannel, DataChannelConnection, DtlsIdentity, Mutex, NrIceCtx, ...)
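For triage, the per-class byte counts in the log above add up to the 9378 bytes reported on the TEST-UNEXPECTED-FAIL line, with NrSocket accounting for the largest share. A minimal sketch of how such a log could be summarized follows. The helper names (`summarize_leaks`, `total_leaked_bytes`) are hypothetical and not part of the mochitest harness, and the regular expression assumes the line format shown above.

```python
import re

# Assumed line format, taken from the log above:
# "TEST-INFO | leakcheck | leaked <count> <class> (<bytes> bytes)"
LEAK_LINE = re.compile(r"leakcheck \| leaked (\d+) (\S+) \((\d+) bytes\)")


def summarize_leaks(log_text):
    """Return {class_name: (instance_count, leaked_bytes)} for each leak line."""
    leaks = {}
    for line in log_text.splitlines():
        match = LEAK_LINE.search(line)
        if match:
            count, name, size = match.groups()
            leaks[name] = (int(count), int(size))
    return leaks


def total_leaked_bytes(leaks):
    """Sum the leaked bytes over all classes."""
    return sum(size for _count, size in leaks.values())


# Two lines copied from the log above, plus the summary line (which the
# pattern deliberately does not match).
SAMPLE = """\
TEST-INFO | leakcheck | leaked 2 DataChannel (288 bytes)
TEST-INFO | leakcheck | leaked 24 NrSocket (4800 bytes)
TEST-UNEXPECTED-FAIL | leakcheck | 9378 bytes leaked (DataChannel, ...)
"""

leaks = summarize_leaks(SAMPLE)
largest = max(leaks, key=lambda name: leaks[name][1])
print(largest, leaks[largest])
print(total_leaked_bytes(leaks))
```

Sorting by leaked bytes rather than instance count is only one possible choice; it is used here because the summary line reports bytes.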
Did you try removing the patches and verifying that you could no longer reproduce the leaks?
Removing which patches? The attachment I have mentioned is necessary so data channel tests get run at all. We don't have any of those yet in our mochitest suite.
I now get the leak without being connected to the MPT VPN, so it is indeed blocking us from landing the new data channel tests. I will try to narrow it down and find a simple testcase.
Whiteboard: [WebRTC][MemShrink][qa-automation-blocked] → [WebRTC][blocking-webrtc?][MemShrink][qa-automation-blocked]
Whiteboard: [WebRTC][blocking-webrtc?][MemShrink][qa-automation-blocked] → [WebRTC][blocking-webrtc+][MemShrink][qa-automation-blocked]
Blocks: 796894
The same leak happens with the current version of my patch on bug 796894, and blocks us from getting the data channel tests landed.

This leak does not reproduce consistently, so it's kinda hard to figure out what's going on. I will attach the log from a mochitest run in the hope that you can find something in there.
Attached file log output (obsolete)
stdout/stderr output to the console from the mochitest run.
It looks like the test_dataChannel_basicVideo.html test is causing this leak most of the time. I will try to get a datachannel log.
As the following try server run shows, the leak is happening across platforms:
https://tbpl.mozilla.org/?tree=Try&rev=fb2864809c6b
OS: Linux → All
Hardware: x86_64 → All
Attachment #750555 - Attachment is obsolete: true
Attachment #750622 - Attachment mime type: text/x-log → text/plain
Assignee: nobody → rjesup
WIP patch. No leaks in a heavily retriggered Try run (or locally after 15+ hours of mochitest runs), but there's a Windows (and maybe Linux) crash inside the SCTP library, which appears to be some sort of internal race condition between local and remote association shutdown. Michael Tuexen is looking at it, and I'm trying to reproduce it again locally on Linux with a stack backtrace.
Whiteboard: [WebRTC][blocking-webrtc+][MemShrink][qa-automation-blocked] → [WebRTC][blocking-webrtc+][MemShrink][qa-automation-blocked][webrtc-uplift]
Depends on: 876167
Attachment #754142 - Attachment is obsolete: true
Comment on attachment 757062
process any pending stream resets on incoming resets

This change triggers the SCTP library bug in bug 876167 (see the workaround there). I'm sending a new Try run of the two patches, as the last run had one optional item here commented out in a failed attempt to avoid bug 876167.

This cleans up stream close handling, especially at association shutdown time.
Attachment #757062 - Flags: review?(tuexen)
Attachment #757062 - Flags: review?(tuexen) → review+
https://hg.mozilla.org/mozilla-central/rev/062a6a2269b5
Status: NEW → RESOLVED
Closed: 10 years ago
Resolution: --- → FIXED
Whiteboard: [WebRTC][blocking-webrtc+][MemShrink][qa-automation-blocked][webrtc-uplift] → [WebRTC][blocking-webrtc+][MemShrink][qa-automation-blocked][webrtc-uplift][qa-]
Comment on attachment 757062
process any pending stream resets on incoming resets

[Approval Request Comment]
Bug caused by (feature/regressing bug #): N/A

User impact if declined: Intermittent leak when closing PeerConnections when DataChannels are in use.  When the tests land (which they should this week), the intermittent failure will show up quite often on M3. It's unclear how often it would happen to users - probably rare, as it requires both sides to be closing at the same time.

Testing completed (on m-c, etc.): On M-C.  To solve bug 876167, we had to run a zillion Try retriggers and local mochitest runs.

Risk to taking this patch (and alternatives if risky): We need to take bug 876167 if we take this one (though that bug could probably still be hit with the right timing without this patch; this patch just makes it much easier to hit).

String or IDL/UUID changes made by this patch: none
Attachment #757062 - Flags: approval-mozilla-beta?
Attachment #757062 - Flags: approval-mozilla-aurora?
Attachment #757062 - Flags: approval-mozilla-beta? → approval-mozilla-beta+
Attachment #757062 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
https://hg.mozilla.org/releases/mozilla-beta/rev/931514a4b3be
Whiteboard: [WebRTC][blocking-webrtc+][MemShrink][qa-automation-blocked][webrtc-uplift][qa-] → [WebRTC][blocking-webrtc+][MemShrink][qa-automation-blocked][qa-]
Depends on: 889088