Closed Bug 1034327 Opened 10 years ago Closed 10 years ago

Memory leak with TURN relay

Categories

(Core :: WebRTC: Networking, defect)

defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla33
Tracking Status
firefox30 --- wontfix
firefox31 --- verified
firefox32 --- fixed
firefox33 --- fixed
b2g-v1.3 --- wontfix
b2g-v1.3T --- wontfix
b2g-v1.4 --- wontfix
b2g-v2.0 --- fixed
b2g-v2.1 --- fixed

People

(Reporter: drno, Assigned: drno)

Details

(Keywords: memory-leak, Whiteboard: [lsan][MemShrink])

Attachments

(1 file)

If FF Nightly is wrapping its media in TURN packages it consumes big amount of memory.
In our SDE environment FF gets reproducible killed 35min into the call, because the host runs out of memory.
Can you do a 1-way call where the Nightly is the sender and see if this still happens?
Assignee: nobody → drno
Running FF in Valgrind was way to slow for any ICE connection to get established.
Now trying TURN unit tests in valgrind...
This fixes the major leak I see in the turn_unitest.
Attachment #8450658 - Flags: review?(ekr)
Attachment #8450658 - Flags: review?(docfaraday)
Comment on attachment 8450658 [details] [diff] [review]
bug_1034327_fix_turn_memory_leaks.patch

Review of attachment 8450658 [details] [diff] [review]:
-----------------------------------------------------------------

Looks good to me; we definitely call this if nr_stun_build_send_indication fails, so it seems right. Test, of course.
Attachment #8450658 - Flags: review?(docfaraday) → review+
I presume we'll want to uplift this as far as possible?
Whiteboard: [webrtc-uplift]
(In reply to Randell Jesup [:jesup] from comment #6)
> I presume we'll want to uplift this as far as possible?

Indeed.

I want to throw a manual build with the fix onto the sunny day QA lab on Monday for a 3 hour call to verify this is really the only fix needed.
3 hour test run did not fail after 30min like it used to do. But there might be another smaller memory leak.
OS: Linux → All
Hardware: x86_64 → All
I looks like we still have a smaller leak somewhere:

Client 1:
 4123 mozilla   20   0 1050m 178m  46m S   70  4.6  72:51.88 firefox

Client 2 (the one which uses send indication):
6214 mozilla   20   0 1692m 578m  10m S   74 14.8  80:34.06 firefox

Unfortunately this leak does not show up in the turn_unittest, and is probably a lot harder to find. I opened Bug 1035332 to continue with that investigation, but get this fix landed earlier.

As the test run is green I'm requesting checkin.
Keywords: checkin-needed
https://hg.mozilla.org/mozilla-central/rev/6c1f2cc228a2
Status: NEW → RESOLVED
Closed: 10 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla33
Comment on attachment 8450658 [details] [diff] [review]
bug_1034327_fix_turn_memory_leaks.patch

Approval Request Comment
[Feature/regressing bug #]: TURN support is available at least since FF23; 786235 is oldest ticket I was able to find regarding TURN support.

[User impact if declined]: Firefox might get killed by the OS in a WebRTC call if the host runs out of memory.

[Describe test coverage new/current, TBPL]: No TBPL coverage as TBPL allows no proper TURN testing. This was caught in the WebRTC QA lab and can be verified with that as well. Additionally manual verification has been done and can be done as needed.

[Risks and why]: low risk - we are freeing memory which should not be used any more; the same function for freeing the memory is in use in other places in the code and has not caused problems so far.

[String/UUID change made/needed]: None.
Attachment #8450658 - Flags: review?(ekr) → approval-mozilla-aurora?
Comment on attachment 8450658 [details] [diff] [review]
bug_1034327_fix_turn_memory_leaks.patch

Aurora+
Attachment #8450658 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
Whiteboard: [webrtc-uplift] → [webrtc-uplift][lsan][MemShrink]
Keywords: mlk
(In reply to Ryan VanderMeulen [:RyanVM UTC-4] from comment #15)
> Might be worth taking on older branches as well?

Yes, I just wanted to do it one by one to see if I encounter any problems with the approvals in between, and because the guidelines say that the code should sit and rest in Nightly & Aurora for some time before uplifting into Beta.
Flags: needinfo?(drno)
Comment on attachment 8450658 [details] [diff] [review]
bug_1034327_fix_turn_memory_leaks.patch

Approval Request Comment
[Feature/regressing bug #]: TURN support is available at least since FF23; 786235 is oldest ticket I was able to find regarding TURN support.

[User impact if declined]: Firefox might get killed by the OS in a WebRTC call if the host runs out of memory (this eats at least 1MB/s if the scenario applies).

[Describe test coverage new/current, TBPL]: No TBPL coverage as TBPL allows no proper TURN testing. This was caught in the WebRTC QA lab and can be verified with that as well. Additionally manual verification has been done and can be done as needed.

[Risks and why]: low risk - we are freeing memory which should not be used any more; the same function for freeing the memory is in use in other places in the code and has not caused problems so far.

[String/UUID change made/needed]: None.
Attachment #8450658 - Flags: approval-mozilla-beta?
My assumption is that, because of the lack of video support for WebRTC on B2G < 2.0 the usage of WebRTC is so low on B2G (this might be totally wrong), that it is not worth trying to fix it.
Attachment #8450658 - Flags: approval-mozilla-beta? → approval-mozilla-beta+
Nils, can you give me some clearer instructions on how I can verify this manually? Thanks!
Flags: needinfo?(drno)
(Or, if it's something you're set up better for, ie to do memory leak measurements, can you verify this fix? Thanks!)
It is not that easy to reproduce, because the problem occurs only in a special network setup.
I took the 31 rc1 and made a manual call in our QA lab setup. No memory leak visible any more after running the call for 10min (without the patch the memory footprint used to increase every second).
Flags: needinfo?(drno)
Whiteboard: [webrtc-uplift][lsan][MemShrink] → [lsan][MemShrink]
You need to log in before you can comment on or make changes to this bug.