Multiple pre-answer binding requests result in excessive memory usage

RESOLVED FIXED in Firefox 52

Status

()

defect
P1
normal
Rank:
19
RESOLVED FIXED
3 years ago
3 years ago

People

(Reporter: adam, Assigned: drno)

Tracking

50 Branch
mozilla52
All
Unspecified
Points:
---
Dependency tree / graph

Firefox Tracking Flags

(firefox49 wontfix, firefox50 wontfix, firefox51 wontfix, firefox52 fixed)

Details

Attachments

(1 attachment)

Reporter

Description

3 years ago
Steps to reproduce: 

1) Go to https://output.jsbin.com/lekaco in IE 11
2) Download and install the OpenTok plugin for WebRTC
3) Allow access to your camera in IE 11
4) Go to https://output.jsbin.com/lekaco in Firefox 50

Result: Firefox browser locks up and you have to force quit it


This is happening with an older version of the OpenTok SDK (v2.8) which is using an older version of WebRTC in the IE plugin (version 45). It's possible that the same thing could happen with other clients using an older version of WebRTC, eg. Temasys plugin or iOS/Android SDKs.

Updated

3 years ago
Component: Untriaged → WebRTC
Product: Firefox → Core
Rank: 14
Priority: -- → P1
Assignee

Comment 1

3 years ago
Adam do you know if this happens on a specific version of Windows, or universally across all supported Windows versions?

If nobody gets to this earlier, then I can probably take a look at this next week.
Rank: 14 → 19
Assignee

Comment 3

3 years ago
I'm able to repro this. FF 50b10 continues to render the local preview video, but on my Mac I get the beach ball of death and FF is not responding any more. I'll investigate.
Assignee: nobody → drno
Assignee

Comment 4

3 years ago
This is some bizarre ICE issue.
Assignee

Comment 5

3 years ago
So comparing the behavior between FF 49 and 50 I see the following in our logs:

In case of 49:
- PC1 gets created
- PC1.createOffer() & sLD()
- PC1 gathers and emits all its candidates
- PC1 sets answer via sRD()
- PC1 establishes connection and starts sending RTP
- PC2 gets created
- PC2 sets an offer via sRD()
- PC2 creates answer and sets it via sLD()
- ...

In case of FF >= 50 the JS code on that page behaves quite different:
- PC1 gets created
- PC1.createOffer() & sLD()
- PC1 gathers and emits all its candidates
- PC2 gets created
- PC2 sets offer via sRD()
- at this point FF is eating 100% CPU, because the IE Plugin keeps hammering with STUN binding requests onto one of the ports of PC1
  - this results in FF eating up memory and to lock up

I'll be working on a fix for the memory leak in FF. That seems wrong to me what we do there.

But I think this is partially caused by the different behavior of the TB SDK behaving differently with FF >= 50, plus the fact that something is broken in OpenTok's ICE stack (libjingle or what ever it uses for STUN/ICE) which hammers in DoS way onto our STUN port. I'm not convinced that it is worth trying to build some kind of DoS protection into our ICE stack.
Comment hidden (mozreview-request)
Assignee

Comment 7

3 years ago
The attached patch fixes the memory leak in Firefox when the OpenTok IE plugin starts bombarding Fx with binding requests.
But the browser still locks up if it's running without e10s (a.k.a. Multi-Process support).

If you turn on Multi-Process support (users can manually turn that on and off only in Developer Edition and Nightly, but it's available in Beta and Release as well) then there is no lockup what so ever. With Multi-Process support turned on I think you can nicely see that this is actually some problem in the OpenTok SDK or the signaling, because about:webrtc in that case shows one PeerConnection which generated an offer and gathered all candidates, but never gets an answer or remote candidates. The reason Firefox does not lock up with Multi-Process support is that the binding requests from IE never make into our ICE stack (because we never received any remote candidates which would open the internal packet filter for these requests).

Comment 8

3 years ago
mozreview-review
Comment on attachment 8805225 [details]
Bug 1309641: only store a single pre-answer request per 5 tuple.

https://reviewboard.mozilla.org/r/89002/#review88162

::: media/mtransport/third_party/nICEr/src/ice/ice_component.c:1152
(Diff revision 1)
> +    int _status;
> +    nr_ice_pre_answer_request *r1, *r2;
> +    nr_transport_addr dst_addr;
> +
> +    if (r=nr_socket_getaddr(sock, &dst_addr))
> +      ABORT(r);

Seems like this error will get eaten. Maybe we should handle the enqueue in here and name it something else, so an actual error will get noticed.
Attachment #8805225 - Flags: review?(docfaraday) → review-
Assignee

Comment 9

3 years ago
Found out that the ICE DoS attack is not only problem. Stay tuned...
Assignee

Updated

3 years ago
Depends on: 1313527
Assignee

Comment 10

3 years ago
So the real bug which causes the hang is bug 1313527, which got introduced in Fx 50 through bug 1275360.

Thanks Adam for the report before 50 gets into release!

But as usual the bad bugs are accumulation of multiple failures: the root cause here is that the OpenTok SDK actually generates broken SDP. Firefox receives an SDP answer from IE with the following line:

 'a=fmtp:111 minptime=10\n'

Note: it is missing the mandatory '\r\n' at the end and instead only has a '\n'. So would be helpful if you guys could fix that on your end as well.
Comment hidden (mozreview-request)
Assignee

Updated

3 years ago
Summary: Firefox 50 browser locks up when using WebRTC with IE 11 using OpenTok → Multiple pre-answer binding requests result in excessive memory usage

Comment 12

3 years ago
mozreview-review
Comment on attachment 8805225 [details]
Bug 1309641: only store a single pre-answer request per 5 tuple.

https://reviewboard.mozilla.org/r/89002/#review88462
Attachment #8805225 - Flags: review?(docfaraday) → review+

Comment 13

3 years ago
Pushed by drno@ohlmeier.org:
https://hg.mozilla.org/integration/autoland/rev/bcccdc43232b
only store a single pre-answer request per 5 tuple. r=bwc

Comment 14

3 years ago
bugherder
https://hg.mozilla.org/mozilla-central/rev/bcccdc43232b
Status: NEW → RESOLVED
Last Resolved: 3 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla52
Nils, is this something you might want to uplift to 51? Or did the patch for bug 1313527 fix this in 50/51 already? Thanks.
Flags: needinfo?(drno)
Assignee

Comment 16

3 years ago
Thanks for checking Liz.
Bug 1313527 fixed the real issue of an endless loop on main thread. The likeliness of hitting the problem which is solved by this patch here is so low that I think we should let this one just ride the trains.
Reporter

Comment 17

3 years ago
Thank you!
You need to log in before you can comment on or make changes to this bug.