Bug 980270: ICE leaks many allocations
Opened 12 years ago; closed 11 years ago.
Categories: Core :: WebRTC: Networking, defect, P3
Tracking: RESOLVED FIXED, mozilla33
People
(Reporter: jesup, Assigned: bwc)
References
Details
(Keywords: memory-leak, Whiteboard: [lsan][MemShrink:P2])
Attachments (4 files):
- 1.25 MB, text/plain
- 1.91 KB, patch (drno: review+; Sylvestre: approval-mozilla-beta+)
- 1.32 MB, text/plain
- 10.28 KB, patch
Lots and lots of things get leaked in ICE/etc
Comment 1 (Reporter) • 12 years ago
Comment 2 (Assignee) • 12 years ago
So, I suspect that the following line is causing a leak when ctx->response is already set:
http://dxr.mozilla.org/mozilla-central/source/media/mtransport/third_party/nICEr/src/stun/stun_client_ctx.c#524
Comment 3 (Assignee) • 12 years ago
nr_ice_pre_answer_request_destroy doesn't seem to be freeing the buffer created here:
http://dxr.mozilla.org/mozilla-central/source/media/mtransport/third_party/nICEr/src/ice/ice_component.c?from=nr_ice_pre_answer_request_destroy#60
http://dxr.mozilla.org/mozilla-central/source/media/mtransport/third_party/nICEr/src/ice/ice_component.c?from=nr_ice_pre_answer_request_destroy#85
Updated (Assignee) • 12 years ago
Assignee: nobody → docfaraday
Comment 4 (Assignee) • 12 years ago
The rest of the nICEr stuff in this trace is related to the nr_registry, which is set up once but never torn down. Do we want to put priority on cleaning this up on shutdown? I don't see a whole lot of reason to expend extra effort to clean this up when all NrIceCtx are gone (prior to shutdown), since it looks like tens of K, but if it is easier/cleaner this way it is an option.
Flags: needinfo?(rjesup)
Comment 5 (Assignee) • 12 years ago
First attempt at plugging the recurring leaks.
Comment 6 (Reporter) • 12 years ago
SUMMARY: LeakSanitizer: 198910 byte(s) leaked in 1824 allocation(s).
That's down in the same test from ~360K/2000, and no errors flagged.
Flags: needinfo?(rjesup)
Comment 7 (Reporter) • 12 years ago
Updated (Assignee) • 12 years ago
Attachment #8387137 - Flags: review?(ekr)
Comment 8 (Assignee) • 12 years ago
Do we want to spend the time to clean up the one-time nr_registry leaks?
Flags: needinfo?(rjesup)
Comment 9 (Reporter) • 12 years ago
There seem to be 3x 32K+delta allocs there (perhaps some other smaller ones). Note that due to jemalloc, the 'slop' for a 32K + delta allocation is almost 32K.... so these are poor sizes to begin with.
Given at least 100K is locked in use after the first ref to ICE, and probably closer to 200K, I'd like to see them released when we're not using them (when we shut down peerconnections), even if it's off a timer to avoid churn or some such (I don't think a timer is needed). I'm not so concerned with releasing them in shutdown; if that's all we should just add entries in the suppression list for lsan (see the grandparent bug to this one on lsan for suppression info IIRC).
Also: are these registry sizes appropriate for this use? They seem large for a browser that at most might have a handful of sessions active (doubly so with bundle/rtcpmux).
Flags: needinfo?(rjesup)
Flags: needinfo?(ekr)
Flags: needinfo?(docfaraday)
Comment 10 (Reporter) • 12 years ago
Slop here is more like 3*(1 page - delta), apparently. Still, we're well over 100K total.
Comment 11 (Assignee) • 12 years ago
(In reply to Randell Jesup [:jesup] from comment #9)
> There seem to be 3x 32K+delta allocs there (perhaps some other smaller
> ones). Note that due to jemalloc, the 'slop' for a 32K + delta allocation
> is almost 32K.... so these are poor sizes to begin with.
>
> Given at least 100K is locked in use after the first ref to ICE, and
> probably closer to 200K, I'd like to see them released when we're not using
> them (when we shut down peerconnections), even if it's off a timer to avoid
> churn or some such (I don't think a timer is needed). I'm not so concerned
> with releasing them in shutdown; if that's all we should just add entries in
> the suppression list for lsan (see the grandparent bug to this one on lsan
> for suppression info IIRC).
>
> Also: are these registry sizes appropriate for this use? they seem large
> for a browser that at most might have a handful of sessions active (doubly
> so with bundle/rtcpmux)
This is configuration setup and setting up the logging system/s, so the max number of ICE contexts isn't really a factor here.
Flags: needinfo?(docfaraday)
Comment 12 (Assignee) • 12 years ago
Begin experimenting with cleaning up nr_registry and r_log.
Comment 13 (Assignee) • 12 years ago
Try out patch #2 and see how much it helps.
Flags: needinfo?(rjesup)
Updated • 12 years ago
Whiteboard: [lsan][MemShrink] → [lsan][MemShrink:P2]
Comment 14 • 11 years ago
(In reply to Byron Campen [:bwc] from comment #4)
> The rest of the nICEr stuff in this trace is related to the nr_registry,
> which is set up once but never torn down. Do we want to put priority on
> cleaning this up on shutdown? I don't see a whole lot of reason to expend
> extra effort to clean this up when all NrIceCtx are gone (prior to
> shutdown), since it looks like tens of K, but if it is easier/cleaner this
> way it is an option.
I don't see a lot of value in cleaning up the shutdown stuff.
Flags: needinfo?(ekr)
Updated • 11 years ago
Priority: -- → P3
Target Milestone: --- → mozilla33
Updated (Assignee) • 11 years ago
Attachment #8387137 - Flags: review?(ekr) → review?(drno)
Comment 15 • 11 years ago
Comment on attachment 8387137 [details] [diff] [review]
Part 1: Plug a couple of common leaks in nICEr.
Review of attachment 8387137 [details] [diff] [review]:
-----------------------------------------------------------------
LGTM
With this patch applied a three hour call as a TURN client no longer showed a memory leak.
Attachment #8387137 - Flags: review?(drno) → review+
Updated • 11 years ago
Whiteboard: [lsan][MemShrink:P2] → [lsan][MemShrink:P2][webrtc-uplift]
Updated • 11 years ago
Keywords: checkin-needed
Updated • 11 years ago
status-firefox30: --- → affected
status-firefox31: --- → affected
status-firefox32: --- → affected
status-firefox33: --- → affected
Comment 17 • 11 years ago
I think this is worth considering uplifting as this leak consumes ~1MB per second per stream if the TURN relay of your client gets selected by ICE.
Comment 19 • 11 years ago
Try run for Part 1 patch: https://tbpl.mozilla.org/?tree=Try&rev=52424c947a91
Comment 21 • 11 years ago
Thanks for fixing this! Hopefully we're a little closer to being able to do more targeted LSan whitelisting of WebRTC leaks.
Comment 22 • 11 years ago
Keywords: checkin-needed
Status: NEW → RESOLVED
Closed: 11 years ago
Resolution: --- → FIXED
Comment 24 • 11 years ago
Comment on attachment 8387137 [details] [diff] [review]
Part 1: Plug a couple of common leaks in nICEr.
Approval Request Comment
[Feature/regressing bug #]: TURN support has been available at least since FF23; bug 786235 is the oldest ticket I was able to find regarding TURN support.
[User impact if declined]: Firefox might get killed by the OS in a WebRTC call if the host runs out of memory (the leak consumes around 1MB/s when the scenario applies).
[Describe test coverage new/current, TBPL]: No TBPL coverage, as TBPL allows no proper TURN testing. This was caught in the WebRTC QA lab and can be verified there as well. Additionally, manual verification has been done and can be repeated as needed.
[Risks and why]: Low risk - we are freeing memory which should not be used any more; the same function for freeing the memory is used in other places in the code and has not caused problems so far.
[String/UUID change made/needed]: None.
Attachment #8387137 - Flags: approval-mozilla-beta?
Updated • 11 years ago
Updated • 11 years ago
Attachment #8387137 - Flags: approval-mozilla-beta? → approval-mozilla-beta+
Comment 25 • 11 years ago
Flags: needinfo?(rjesup)
Comment 26 • 11 years ago
I'm not sure how verifiable this really is. This code still leaks, just less.
Comment 27 • 11 years ago
(In reply to Andrew McCreight [:mccr8] from comment #26)
> I'm not sure how verifiable this really is. This code still leaks, just
> less.
Thanks Andrew for the info! Removing keyword for now... add it back if you think the QA team can help with this.
Keywords: verifyme
Comment 28 • 11 years ago
Updated • 11 years ago
status-b2g-v2.0: --- → fixed
status-b2g-v2.1: --- → fixed
Updated (Reporter) • 11 years ago
Whiteboard: [lsan][MemShrink:P2][webrtc-uplift] → [lsan][MemShrink:P2]
Comment 29 • 11 years ago
Could you provide detailed reproduction steps and a video so I can verify this bug? Thanks!
Flags: needinfo?(rjesup)
Comment 30 • 11 years ago
(In reply to Sue from comment #29)
> Could you provide the detailed reproduce steps and video for me to verify
> this bug. Thanks!
There is no specific video. You would need to establish a call between two different networks, so that the call goes through a TURN relay. And then leave the call up for quite some time and watch the memory usage.
Updated (Reporter) • 11 years ago
Flags: needinfo?(rjesup)