ICE pacing needs to be global

RESOLVED FIXED in Firefox 27, Firefox OS v1.2

Status

()

Core
WebRTC: Networking
RESOLVED FIXED
5 years ago
3 years ago

People

(Reporter: ekr, Assigned: bwc)

Tracking

({csectype-dos, sec-low})

unspecified
mozilla28
x86
Mac OS X
csectype-dos, sec-low
Points:
---

Firefox Tracking Flags

(firefox26 wontfix, firefox27 fixed, firefox28 fixed, firefox-esr24 wontfix, b2g18 unaffected, b2g-v1.1hd unaffected, b2g-v1.2 fixed, b2g-v1.3 fixed, b2g-v1.4 unaffected)

Details

(Whiteboard: [qa-][adv-main27-])

Attachments

(2 attachments, 6 obsolete attachments)

26.76 KB, text/plain
Details
2.67 KB, patch
bwc
: review+
RyanVM
: checkin+
Details | Diff | Splinter Review
(Reporter)

Description

5 years ago
Martin Thomson pointed out to me that the Firefox ICE implementation doesn't do global
pacing. This means that an attacker can cause Firefox to generate a lot of attack traffic to a victim by making a lot of peerconnections with a lot of ICE candidates pointing to the victim.

We need to make ICE pacing Firefox-global, thus limiting the total amount of
traffic Firefox generates in total.
Keywords: sec-moderate
Created attachment 793562 [details]
draft on algorithm changes

I've written a draft on the subject, text version attached.  See also http://martinthomson.github.io/drafts/draft-thomson-mmusic-ice-webrtc.html
(Reporter)

Comment 3

5 years ago
I asked Martin to remove it, since this is a security bug.
(Reporter)

Comment 4

5 years ago
The fix for this issue will make the issue pretty obvious, and to complicate matters, there is a similar issue in Chrome. Who at Mozilla would be the best person to coordinate with Chrome security on coordinated fixing?
Flags: needinfo?(continuation)
(Reporter)

Comment 5

5 years ago
Notes from today's call with Justin and Martin. Instead of doing global pacing, the proposal is to instead put a global "circuit breaker" cap on the maximum amount of STUN traffic you can generate and then (as needed) adjust pacing to the point where legitimate applications should stay under the cap.
Dan probably has a better idea than me.
Flags: needinfo?(continuation)
Flags: needinfo?(dveditz)
-> ekr for now; if implementation needs to go elsewhere we'll reassign
Assignee: nobody → ekr
We'd like to resolve this in the code within the next 2 weeks.  Adding an internal tracking deadline of Sept 6.
(Reporter)

Comment 9

5 years ago
The problem isn't implementing it, it's when we can commit it without implicitly disclosing a problem in both ffox and Chrome.
(In reply to Eric Rescorla (:ekr) from comment #9)
> The problem isn't implementing it, it's when we can commit it without
> implicitly disclosing a problem in both ffox and Chrome.

I understand. I'm just asking everyone (everyone who needs to help make this happen) if we can target resolving this issue within the next 2 weeks.  I realize there are lots of dependencies.
> 
> I understand. I'm just asking everyone (everyone who needs to help make this
> happen) if we can target resolving this issue within the next 2 weeks.  I
> realize there are lots of dependencies.

I also realize that it won't be truly "resolved" until we release updates to the field.  So perhaps the better and more precise question to ask everyone is: can we target having a release plan with Chrome to resolve this issue within 2 weeks?
Flags: needinfo?(dveditz)
Per discussion with dveditz, sec-low
Keywords: sec-moderate → sec-low
(In reply to Eric Rescorla (:ekr) from comment #9)
> The problem isn't implementing it, it's when we can commit it without
> implicitly disclosing a problem in both ffox and Chrome.

When asked about landing this the Google security team said "Please go right ahead. We never get very excited about DOS class issues."
(Assignee)

Comment 14

5 years ago
Created attachment 827067 [details] [diff] [review]
(WIP) Very simple global rate-limiting based on SimpleTokenBucket. Will tolerate a maximum of 8K/sec over 1 sec, and 1.8K/sec over 20 sec.

Strawman patch.
(Assignee)

Updated

5 years ago
Assignee: ekr → docfaraday
Status: NEW → ASSIGNED
(Assignee)

Comment 15

5 years ago
Comment on attachment 827067 [details] [diff] [review]
(WIP) Very simple global rate-limiting based on SimpleTokenBucket. Will tolerate a maximum of 8K/sec over 1 sec, and 1.8K/sec over 20 sec.

Review of attachment 827067 [details] [diff] [review]:
-----------------------------------------------------------------

Is this roughly what we're looking for?
Attachment #827067 - Flags: feedback?(ekr)
(Assignee)

Comment 17

5 years ago
Created attachment 827486 [details] [diff] [review]
(WIP) Very simple global rate-limiting based on SimpleTokenBucket. Will tolerate a maximum of 8K/sec over 1 sec, and 1.8K/sec over 20 sec.

Add a comment that more clearly explains why the rate limiting is needed.
(Assignee)

Updated

5 years ago
Attachment #827067 - Attachment is obsolete: true
Attachment #827067 - Flags: feedback?(ekr)
(Assignee)

Updated

5 years ago
Attachment #827486 - Flags: feedback?(ekr)
(Reporter)

Comment 18

5 years ago
Comment on attachment 827486 [details] [diff] [review]
(WIP) Very simple global rate-limiting based on SimpleTokenBucket. Will tolerate a maximum of 8K/sec over 1 sec, and 1.8K/sec over 20 sec.

Review of attachment 827486 [details] [diff] [review]:
-----------------------------------------------------------------

::: media/mtransport/nr_socket_prsock.cpp
@@ +395,5 @@
> +    // (see http://tools.ietf.org/html/draft-thomson-mmusic-ice-webrtc)
> +    // Tolerate rate of 8k/sec, for one second.
> +    static SimpleTokenBucket burst(8192*1, 8192);
> +    // Tolerate rate of 1.8k/sec over twenty seconds.
> +    static SimpleTokenBucket sustained(1843*20, 1843);

I think Martin may have changed his recommendation here to just be 100kbps overall.

Martin?

@@ +399,5 @@
> +    static SimpleTokenBucket sustained(1843*20, 1843);
> +    if (burst.getTokens(UINT32_MAX) < len ||
> +        sustained.getTokens(UINT32_MAX) < len) {
> +      // TODO(bcampen@mozilla.com): Better error code for this?
> +      ABORT(R_NOT_PERMITTED);

IIRC, R_NOT_PERMITTED causes an error higher up.

I recommend R_WOULDBLOCK or a silent drop.

@@ +403,5 @@
> +      ABORT(R_NOT_PERMITTED);
> +    }
> +
> +    burst.getTokens(len);
> +    sustained.getTokens(len);

You should comment this so it's clear what's going on.

Also, a note that this isn't thread safe would probably be useful.
Attachment #827486 - Flags: feedback?(ekr)
Byron has used the numbers that I recommended in the draft. However, I've talked to some people and a higher limit is probably better for the longer term limit to prevent some valid cases with multiple agents from bumping the cap. Doubling the rate for the slower token bucket would probably be ok.
(Assignee)

Comment 20

5 years ago
(In reply to Eric Rescorla (:ekr) from comment #18)
> Comment on attachment 827486 [details] [diff] [review]
> (WIP) Very simple global rate-limiting based on SimpleTokenBucket. Will
> tolerate a maximum of 8K/sec over 1 sec, and 1.8K/sec over 20 sec.
> 
> Review of attachment 827486 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> @@ +399,5 @@
> > +    static SimpleTokenBucket sustained(1843*20, 1843);
> > +    if (burst.getTokens(UINT32_MAX) < len ||
> > +        sustained.getTokens(UINT32_MAX) < len) {
> > +      // TODO(bcampen@mozilla.com): Better error code for this?
> > +      ABORT(R_NOT_PERMITTED);
> 
> IIRC, R_NOT_PERMITTED causes an error higher up.
> 
> I recommend R_WOULDBLOCK or a silent drop.
> 

   I was worried that at some point R_WOULDBLOCK would prompt a retry. I guess it wouldn't be too awful if that happened.

> @@ +403,5 @@
> > +      ABORT(R_NOT_PERMITTED);
> > +    }
> > +
> > +    burst.getTokens(len);
> > +    sustained.getTokens(len);
> 
> You should comment this so it's clear what's going on.
> 
> Also, a note that this isn't thread safe would probably be useful.

Ok.
(Assignee)

Comment 21

5 years ago
(In reply to Martin Thomson from comment #19)
> Byron has used the numbers that I recommended in the draft. However, I've
> talked to some people and a higher limit is probably better for the longer
> term limit to prevent some valid cases with multiple agents from bumping the
> cap. Doubling the rate for the slower token bucket would probably be ok.

I'll do that.
(Assignee)

Comment 22

5 years ago
Created attachment 831077 [details] [diff] [review]
Very simple global rate-limiting based on SimpleTokenBucket. Will tolerate a maximum of 8K/sec over 1 sec, and 3.6K/sec over 20 sec.

Incorporate feedback.
(Assignee)

Updated

5 years ago
Attachment #827486 - Attachment is obsolete: true
(Assignee)

Updated

5 years ago
Attachment #831077 - Flags: review?(ekr)
(Reporter)

Comment 23

5 years ago
Comment on attachment 831077 [details] [diff] [review]
Very simple global rate-limiting based on SimpleTokenBucket. Will tolerate a maximum of 8K/sec over 1 sec, and 3.6K/sec over 20 sec.

Review of attachment 831077 [details] [diff] [review]:
-----------------------------------------------------------------

lgtm

::: media/mtransport/nr_socket_prsock.cpp
@@ +401,5 @@
> +
> +    // Check number of tokens in each bucket.
> +    if (burst.getTokens(UINT32_MAX) < len ||
> +        sustained.getTokens(UINT32_MAX) < len) {
> +      ABORT(R_WOULDBLOCK);

Suggest adding a high level (ERR) log here and an assert. That way if this happens
in the field we will know earlier.
Attachment #831077 - Flags: review?(ekr) → review+
(Assignee)

Comment 24

5 years ago
(In reply to Eric Rescorla (:ekr) from comment #23)
> Comment on attachment 831077 [details] [diff] [review]
> Very simple global rate-limiting based on SimpleTokenBucket. Will tolerate a
> maximum of 8K/sec over 1 sec, and 3.6K/sec over 20 sec.
> 
> Review of attachment 831077 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> lgtm
> 
> ::: media/mtransport/nr_socket_prsock.cpp
> @@ +401,5 @@
> > +
> > +    // Check number of tokens in each bucket.
> > +    if (burst.getTokens(UINT32_MAX) < len ||
> > +        sustained.getTokens(UINT32_MAX) < len) {
> > +      ABORT(R_WOULDBLOCK);
> 
> Suggest adding a high level (ERR) log here and an assert. That way if this
> happens
> in the field we will know earlier.

Hmm. This would allow someone to crash my browser by spamming candidates; we don't require any user interaction to allow a PeerConnection to spin up (at least, not in the DataChannel-only case).
(Assignee)

Comment 25

5 years ago
Created attachment 832368 [details] [diff] [review]
Very simple global rate-limiting based on SimpleTokenBucket. Will tolerate a maximum of 8K/sec over 1 sec, and 3.6K/sec over 20 sec.

Unbitrot
(Assignee)

Updated

5 years ago
Attachment #831077 - Attachment is obsolete: true
(Assignee)

Comment 26

5 years ago
Created attachment 832384 [details] [diff] [review]
Very simple global rate-limiting based on SimpleTokenBucket. Will tolerate a maximum of 8K/sec over 1 sec, and 3.6K/sec over 20 sec.

Add some logging when we drop STUN requests due to the rate limit.
(Assignee)

Updated

5 years ago
Attachment #832368 - Attachment is obsolete: true
(Assignee)

Comment 27

5 years ago
Created attachment 832497 [details] [diff] [review]
Very simple global rate-limiting based on SimpleTokenBucket. Will tolerate a maximum of 8K/sec over 1 sec, and 3.6K/sec over 20 sec.

Use MOZ_ASSERT, since that won't crash on release builds.
(Assignee)

Updated

5 years ago
Attachment #832384 - Attachment is obsolete: true
(Assignee)

Comment 28

5 years ago
Comment on attachment 832497 [details] [diff] [review]
Very simple global rate-limiting based on SimpleTokenBucket. Will tolerate a maximum of 8K/sec over 1 sec, and 3.6K/sec over 20 sec.

Review of attachment 832497 [details] [diff] [review]:
-----------------------------------------------------------------

This look about right to you?

https://bugzilla.mozilla.org/attachment.cgi?oldid=832368&action=interdiff&newid=832497&headers=1
Attachment #832497 - Flags: review?(ekr)
(Reporter)

Comment 29

5 years ago
Comment on attachment 832497 [details] [diff] [review]
Very simple global rate-limiting based on SimpleTokenBucket. Will tolerate a maximum of 8K/sec over 1 sec, and 3.6K/sec over 20 sec.

Review of attachment 832497 [details] [diff] [review]:
-----------------------------------------------------------------

Does MOZ_ASSERT generate a log message if it's not in debug mode?

Anyway, feel free to land as-is without rereview.
Attachment #832497 - Flags: review?(ekr) → review+
> Does MOZ_ASSERT generate a log message if it's not in debug mode?

No
(Reporter)

Comment 31

5 years ago
Then let's add a separate log to go along with the MOZ_ASSERT.

Byron, please land.
(Assignee)

Comment 32

5 years ago
Created attachment 8333497 [details] [diff] [review]
Very simple global rate-limiting based on SimpleTokenBucket. Will tolerate a maximum of 8K/sec over 1 sec, and 3.6K/sec over 20 sec.

Add a log message.
(Assignee)

Updated

5 years ago
Attachment #832497 - Attachment is obsolete: true
(Assignee)

Updated

5 years ago
Attachment #8333497 - Flags: checkin?(adam)
(Assignee)

Comment 33

5 years ago
Comment on attachment 8333497 [details] [diff] [review]
Very simple global rate-limiting based on SimpleTokenBucket. Will tolerate a maximum of 8K/sec over 1 sec, and 3.6K/sec over 20 sec.

Review of attachment 8333497 [details] [diff] [review]:
-----------------------------------------------------------------

Carry forward r+ ekr.
Attachment #8333497 - Flags: review+
Comment on attachment 8333497 [details] [diff] [review]
Very simple global rate-limiting based on SimpleTokenBucket. Will tolerate a maximum of 8K/sec over 1 sec, and 3.6K/sec over 20 sec.

https://hg.mozilla.org/integration/mozilla-inbound/rev/a906959d7743
Attachment #8333497 - Flags: checkin?(adam) → checkin+
https://hg.mozilla.org/mozilla-central/rev/a906959d7743
Status: ASSIGNED → RESOLVED
Last Resolved: 5 years ago
status-firefox28: --- → fixed
Resolution: --- → FIXED
Target Milestone: --- → mozilla28
Does this need any testing from QA? If so, please advise.
Flags: needinfo?(docfaraday)
(Assignee)

Comment 37

5 years ago
Hmm. Probably not. ekr?
Flags: needinfo?(docfaraday) → needinfo?(ekr)
I can provide a test case that someone should run, but I think that Byron is amply qualified to do so.  I don't know if you want to commit that test case to mochitest though.
How far back does this issue go?
(Reporter)

Comment 40

5 years ago
Back to the original landing for WebRTC.
Flags: needinfo?(ekr)
status-b2g18: --- → unaffected
status-b2g-v1.1hd: --- → unaffected
status-b2g-v1.2: --- → affected
status-firefox26: --- → wontfix
status-firefox27: --- → affected
status-firefox-esr24: --- → wontfix
This applies cleanly to beta and b2g26. Please request approval for uplift :)
status-b2g-v1.3: --- → fixed
status-b2g-v1.4: --- → unaffected
Flags: needinfo?(docfaraday)
(Assignee)

Comment 42

5 years ago
(In reply to Ryan VanderMeulen [:RyanVM UTC-5] from comment #41)
> This applies cleanly to beta and b2g26. Please request approval for uplift :)

What about aurora?
Flags: needinfo?(docfaraday)
(Assignee)

Comment 43

5 years ago
Comment on attachment 8333497 [details] [diff] [review]
Very simple global rate-limiting based on SimpleTokenBucket. Will tolerate a maximum of 8K/sec over 1 sec, and 3.6K/sec over 20 sec.

[Approval Request Comment]
Bug caused by (feature/regressing bug #): 

   Bug 790517

User impact if declined: 
 
   A user's browser could be induced to perform a DoS attack.

Testing completed (on m-c, etc.):

   The usual CI testing, plus some manual testing.

Risk to taking this patch (and alternatives if risky):

   Fairly low.
 
String or IDL/UUID changes made by this patch:

   None.
Attachment #8333497 - Flags: approval-mozilla-beta?
Attachment #8333497 - Flags: approval-mozilla-b2g26?
It landed on trunk when it was version 28, so Aurora already has the fix.

Updated

5 years ago
Attachment #8333497 - Flags: approval-mozilla-beta?
Attachment #8333497 - Flags: approval-mozilla-beta+
Attachment #8333497 - Flags: approval-mozilla-b2g26?
Attachment #8333497 - Flags: approval-mozilla-b2g26+
(In reply to Martin Thomson [:mt] from comment #38)
> I can provide a test case that someone should run, but I think that Byron is
> amply qualified to do so.  I don't know if you want to commit that test case
> to mochitest though.

Eric, please advise.
Flags: needinfo?(ekr)
(Reporter)

Comment 47

5 years ago
I think it's fine to commit it as long as it doesn't do something bad as a side effect.
Flags: needinfo?(ekr)
Whiteboard: [qa-]
Keywords: csectype-dos
Keywords: csectype-dos
Whiteboard: [qa-] → [qa-][adv-main27-]
Keywords: csectype-dos
Group: core-security
You need to log in before you can comment on or make changes to this bug.