Closed Bug 1296514 Opened 8 years ago Closed 8 years ago

DTLS record with a large sequence number gap causes DTLS to spin

Categories

(NSS :: Libraries, defect)

defect
Not set
normal

Tracking

(firefox48 wontfix, firefox49 wontfix, firefox-esr45 wontfix, firefox50 wontfix, firefox51 fixed)

RESOLVED FIXED
Tracking Status
firefox48 --- wontfix
firefox49 --- wontfix
firefox-esr45 --- wontfix
firefox50 --- wontfix
firefox51 --- fixed

People

(Reporter: mt, Assigned: mt)

References

Details

(Keywords: csectype-dos, sec-other)

Attachments

(1 file, 3 obsolete files)

Say that we expect to receive packet number 3.  If we instead receive packet 2^48-1, we will scroll the receive window forward one octet at a time:

http://searchfox.org/nss/rev/462a77115abebd0f3cd9cb56dbc350a25b9be706/lib/ssl/dtlscon.c#1130

I believe that it's possible to trigger this code with an unauthenticated packet.  That might only happen during the handshake, but I first ran into this with encrypted packets.

Without the patch, the test for DTLS 1.0, which uses AES, takes 5s on my machine.  I'm not patient enough to wait for the ChaCha one.
Attached patch gap.patch (obsolete) — Splinter Review
With the patch this time.
Assignee: nobody → martin.thomson
Attachment #8782744 - Flags: review?(ekr)
Comment on attachment 8782744 [details] [diff] [review]
gap.patch

Review of attachment 8782744 [details] [diff] [review]:
-----------------------------------------------------------------

::: external_tests/ssl_gtest/ssl_drop_unittest.cc
@@ +100,5 @@
> +  // Note: this number has to be smaller than the record limit for the
> +  // ciphersuite.
> +  EXPECT_EQ(SECSuccess,
> +            SSLInt_AdvanceWriteSeqNum(client_->ssl_fd(), limit - 10));
> +  SendReceive();

Does this test fail with the previous code?

I think it would also be good to do a test that is precisely at seq + WINDOW and seq + WINDOW + 1.

::: lib/ssl/dtlscon.c
@@ +1127,5 @@
>          new_right = seq | 0x07;
>          new_left = (new_right - DTLS_RECVD_RECORDS_WINDOW) + 1;
>  
> +        if (seq > records->right + DTLS_RECVD_RECORDS_WINDOW) {
> +            PORT_Memset(records->data, 0, sizeof(records->data));

Shouldn't you be using new_right here?
Attachment #8782744 - Flags: review?(ekr)
Does this need to be hidden?
Keywords: csectype-dos
(In reply to Eric Rescorla (:ekr) from comment #3)
> > +  EXPECT_EQ(SECSuccess,
> > +            SSLInt_AdvanceWriteSeqNum(client_->ssl_fd(), limit - 10));
> > +  SendReceive();
> 
> Does this test fail with the previous code?

Well, I imagine that it doesn't fail, though it will hit most timeouts.  I haven't been patient enough to run the test.  I estimate that it would take a couple of years on my machine; for each test iteration, of which we have quite a few.
 
> I think it would also be good to do a test that is precisely at seq + WINDOW
> and seq + WINDOW + 1.

Sure.  Easy.

> > +        if (seq > records->right + DTLS_RECVD_RECORDS_WINDOW) {
> > +            PORT_Memset(records->data, 0, sizeof(records->data));
> 
> Shouldn't you be using new_right here?

Sure.  It's not going to be much different (if seq&7 < 7, we might do an unnecessary scrub over the buffer).

(In reply to Daniel Veditz [:dveditz] from comment #4)
> Does this need to be hidden?

For the moment, I think that it's wise.  This locks up the STS thread in the parent, which renders the browser unusable.  We will probably want to sneak this patch out.  This blocks bug 1268745 and that is actually for a security problem.
(In reply to Martin Thomson [:mt:] from comment #5)
> (In reply to Eric Rescorla (:ekr) from comment #3)
> > > +  EXPECT_EQ(SECSuccess,
> > > +            SSLInt_AdvanceWriteSeqNum(client_->ssl_fd(), limit - 10));
> > > +  SendReceive();
> > 
> > Does this test fail with the previous code?
> 
> Well, I imagine that it doesn't fail, though it will hit most timeouts.  I
> haven't been patient enough to run the test.  I estimate that it would take
> a couple of years on my machine; for each test iteration, of which we have
> quite a few.

Can't we just run the test with conventional timers (e.g., SIGALRM) with some sort of
generous timeout like 1s? That will fail with the old code and succeed with the new
code.


> > I think it would also be good to do a test that is precisely at seq + WINDOW
> > and seq + WINDOW + 1.
> 
> Sure.  Easy.
> 
> > > +        if (seq > records->right + DTLS_RECVD_RECORDS_WINDOW) {
> > > +            PORT_Memset(records->data, 0, sizeof(records->data));
> > 
> > Shouldn't you be using new_right here?
> 
> Sure.  It's not going to be much different (if seq&7 < 7, we might do an
> unnecessary scrub over the buffer).
> 
> (In reply to Daniel Veditz [:dveditz] from comment #4)
> > Does this need to be hidden?
> 
> For the moment, I think that it's wise.  This locks up the STS thread in the
> parent, which renders the browser unusable.  We will probably want to sneak
> this patch out.  This blocks bug 1268745 and that is actually for a security
> problem.
(In reply to Eric Rescorla (:ekr) from comment #6)
> Can't we just run the test with conventional timers (e.g., SIGALRM) with
> some sort of
> generous timeout like 1s? That will fail with the old code and succeed with
> the new
> code.

The problem is splitting the gap for the faster tests.  ChaCha might take 2 years, but AES takes only 5s.  And RC4 takes milliseconds.
Then just test ChaCha. Any of these is >> the size of the window.
Well that was a waste of time.  It's possible to have the code run on another thread, and the test will fail, but it still hangs for years.  gtest seems to join open threads at the end of a test run.
Attached patch bug1296514.patch (obsolete) — Splinter Review
This is based on the other patch, since I had to use the sequence number tweaking stuff.  This version has the fancy C++ stuff in it that doesn't work (if you comment out the fix, this hangs as badly as the patch that I prefer).
Attachment #8782744 - Attachment is obsolete: true
Attachment #8783392 - Flags: review?(ekr)
Attached patch bug1296514.patch (obsolete) — Splinter Review
I could go for the original over this even, since the timeout didn't end up working.  The original has slightly better coverage.
Attachment #8783394 - Flags: review?(ekr)
Attached patch bug1296514.patchSplinter Review
Attachment #8783392 - Attachment is obsolete: true
Attachment #8783394 - Attachment is obsolete: true
Attachment #8783392 - Flags: review?(ekr)
Attachment #8783394 - Flags: review?(ekr)
Attachment #8783758 - Flags: review?(ekr)
Comment on attachment 8783758 [details] [diff] [review]
bug1296514.patch

Review of attachment 8783758 [details] [diff] [review]:
-----------------------------------------------------------------

LGTM
Attachment #8783758 - Flags: review?(ekr) → review+
dveditz,

We expect to receive a public announcement about 3DES being bad on Aug 25.

This, and bug 1268745 (which is actually a public front for bug 1267899) need to land together.  

I would like to just land these and avoid creating a big fuss.  That is, no need to uplift other than as the opportunity presents.  Nightly currently has a Beta of NSS 3.27, which we would update before the current cycle ends (which is plenty away).

Does this sound like a reasonable plan to you?
Flags: needinfo?(dveditz)
(In reply to Martin Thomson [:mt:] from comment #14)
> I would like to just land these and avoid creating a big fuss.[...]
> Does this sound like a reasonable plan to you?

Sounds good.
Flags: needinfo?(dveditz)
Keywords: sec-other
https://hg.mozilla.org/projects/nss/rev/4200b26d8087
Status: NEW → RESOLVED
Closed: 8 years ago
Resolution: --- → FIXED
Target Milestone: --- → 3.27
Group: crypto-core-security → core-security-release
Group: core-security-release
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: