Closed
Bug 1009227
Opened 7 years ago
Closed 7 years ago
DTLS connection establishment failure
Categories
(NSS :: Libraries, defect, P1)
Tracking
(Not tracked)
RESOLVED
FIXED
3.16.2
People
(Reporter: ekr, Assigned: ekr)
Details
Attachments
(4 files, 3 obsolete files)
1009 bytes,
patch
|
Details | Diff | Splinter Review | |
4.48 KB,
patch
|
mt
:
review+
wtc
:
review+
|
Details | Diff | Splinter Review |
928 bytes,
patch
|
wtc
:
review+
lmandel
:
approval-mozilla-aurora+
wtc
:
checked-in+
|
Details | Diff | Splinter Review |
7.80 KB,
patch
|
mt
:
review+
|
Details | Diff | Splinter Review |
If a ChangeCipherSpec is sent in a separate packet from earlier parts of one side's second flight, and then the earlier packet is lost or reordered with respect to the CCS, we generate an error because of being in an unexpected state. CCS does not come with a handshake message sequence number and so it gets processed right away by ssl3_HandleChangeCipherSpecs which detects that we are not yet in state wait_change_cipher and generates an unexpected message alert: if (ws != wait_change_cipher) { (void)SSL3_SendAlert(ss, alert_fatal, unexpected_message); PORT_SetError(SSL_ERROR_RX_UNEXPECTED_CHANGE_CIPHER); return SECFailure; } One fix here is to special-case the logic here for DTLS and ignore any CCS packets that arrive out of order rather than to generate an error. That will not recover immediately because the other side must then retransmit, but it will fix the immediate problem. Outside of this bug, WTC suggested that he had a better fix in mind.
Comment 1•7 years ago
|
||
Eric: I was wrong. We also need your fix. My patch cannot handle the case where CCS is the first message of an out-of-order record.
Attachment #8421344 -
Flags: review?(ekr)
Comment 2•7 years ago
|
||
Comment on attachment 8421344 [details] [diff] [review] Ignore the rest of the messages in the record if it has an out-of-order message Review of attachment 8421344 [details] [diff] [review]: ----------------------------------------------------------------- Looking at the pcap file in https://code.google.com/p/chromium/issues/detail?id=369855 closer, I don't think this patch will even handle that particular sequence of events. All the DTLS records in that pcap file contain a single handshake or CCS message. So this patch is at most an optimization for the case where the peer stuffs multiple handshake messages into a DTLS record. We definitely need your patch. ::: lib/ssl/dtlscon.c @@ -320,5 @@ > } else if (message_seq > ss->ssl3.hs.recvMessageSeq) { > /* Case 2 > * > - * Ignore this message. This means we don't handle out of > - * order complete messages that well, but we're still Note that I removed "complete". I believe we don't know if the message is complete at this point. @@ +325,5 @@ > * compliant and this probably does not happen often > * > * XXX OK for now. Maybe do something smarter at some point? > */ > + break; Should we set rv to SECSuccess explicitly before this break statement?
Assignee | ||
Comment 3•7 years ago
|
||
Wan-Teh, How do you want to proceed here? My sense is that my proposed patch would fix the bug here and that this code is a shortcut that immediately discards data which would be discarded in any case. Do you want to do both patches?
Comment 4•7 years ago
|
||
Eric: We need your proposed patch. You're right that my patch is just a shortcut. We can ignore my patch.
Assignee | ||
Comment 5•7 years ago
|
||
OK. I will try to prepare something in the next week or so (I am at a meeting this week)
Assignee | ||
Comment 6•7 years ago
|
||
Assignee | ||
Updated•7 years ago
|
Assignee: nobody → ekr
Status: NEW → ASSIGNED
Assignee | ||
Comment 7•7 years ago
|
||
Assignee | ||
Comment 8•7 years ago
|
||
Wan-Teh, I have attached a patch along the lines we discussed. It checks for CCS out of order and simply ignores it. I wanted to get your opinion on several points. 1. I don't sanity check that the CCS is correct, since we are ignoring it anyway. I believe this is consistent with our general practice, but I wanted to get your opinion. 2. I decided to return SECSuccess. This is what we do with out of order handshake messages so it should be safe. See: http://dxr.mozilla.org/mozilla-central/source/security/nss/lib/ssl/dtlscon.c#327 which breaks and thus returns the default value of SECSuccess set in: http://dxr.mozilla.org/mozilla-central/source/security/nss/lib/ssl/dtlscon.c#187 Since dtls_HandleHandshake and dtls_HandleChangeCipherSpecs are called from adjacent places: http://dxr.mozilla.org/mozilla-central/source/security/nss/lib/ssl/ssl3con.c#11603 Having them behave similarly on re-ordered packets makes sense. 3. I had to add code to set buf->len = 0 because it's expected that we will consume the entire buffer. Failure to do so causes an assert later. Does this seem OK? 4. I have included a unit test in the C++ WebRTC mtransport unit test suite. It simply injects a bogus CCS into the handshake and makes sure it's ignored. You can verify that the test fails without patch 1 and succeeds with it. I would rather have this in an NSS suite but it's unfortunately not that easy to add there and it's quite easy to add one here. I've f?ed you there but you don't need to give it a complete review since I can have Martin review it.
Assignee | ||
Updated•7 years ago
|
Attachment #8432008 -
Flags: review?(martin.thomson)
Attachment #8432008 -
Flags: feedback?(wtc)
Assignee | ||
Updated•7 years ago
|
Attachment #8432010 -
Flags: review?(wtc)
Assignee | ||
Comment 9•7 years ago
|
||
Comment on attachment 8421344 [details] [diff] [review] Ignore the rest of the messages in the record if it has an out-of-order message Review of attachment 8421344 [details] [diff] [review]: ----------------------------------------------------------------- this patch is overtaken by events, so clearing r?
Attachment #8421344 -
Flags: review?(ekr)
Comment 10•7 years ago
|
||
Comment on attachment 8432008 [details] [diff] [review] part 2: Unit test Review of attachment 8432008 [details] [diff] [review]: ----------------------------------------------------------------- This needs another round. ::: media/mtransport/test/transport_unittests.cpp @@ +88,5 @@ > + // Which starts with a certificate message. > + virtual bool Inspect(TransportLayer* layer, > + const unsigned char *data, size_t len) { > + unsigned char ccs[] = {0x14, // Type > + 0xfe, 0xff, // Version trailing space @@ +131,5 @@ > > + if (inspector_ && !inspector_->Inspect(this, data, len)) { > + MOZ_MTLOG(ML_NOTICE, "Dropping packet because of inspector"); > + ++packet_; > + return len; This is a fair bit of generality given that this inspector doesn't actually cause packets to be dropped ever. So this is dead code. @@ +138,4 @@ > if (loss_mask_ & (1 << (packet_ % 32))) { > MOZ_MTLOG(ML_NOTICE, "Dropping packet"); > ++packet_; > return len; If you want generality, why not go all the way and have the loss mask as a separate inspector? @@ +179,5 @@ > > private: > uint32_t loss_mask_; > uint32_t packet_; > + Inspector* inspector_; Where's the destructor?
Attachment #8432008 -
Flags: review?(martin.thomson)
Assignee | ||
Comment 11•7 years ago
|
||
OK, this is fair comment. I got excited about generality but then didn't backfill. Would you prefer that we remove the generality or do the backfill. (In reply to Martin Thomson [:mt] from comment #10) > Comment on attachment 8432008 [details] [diff] [review] > part 2: Unit test > > Review of attachment 8432008 [details] [diff] [review]: > ----------------------------------------------------------------- > > This needs another round. > > ::: media/mtransport/test/transport_unittests.cpp > @@ +88,5 @@ > > + // Which starts with a certificate message. > > + virtual bool Inspect(TransportLayer* layer, > > + const unsigned char *data, size_t len) { > > + unsigned char ccs[] = {0x14, // Type > > + 0xfe, 0xff, // Version > > trailing space > > @@ +131,5 @@ > > > > + if (inspector_ && !inspector_->Inspect(this, data, len)) { > > + MOZ_MTLOG(ML_NOTICE, "Dropping packet because of inspector"); > > + ++packet_; > > + return len; > > This is a fair bit of generality given that this inspector doesn't actually > cause packets to be dropped ever. So this is dead code. > > @@ +138,4 @@ > > if (loss_mask_ & (1 << (packet_ % 32))) { > > MOZ_MTLOG(ML_NOTICE, "Dropping packet"); > > ++packet_; > > return len; > > If you want generality, why not go all the way and have the loss mask as a > separate inspector? > > @@ +179,5 @@ > > > > private: > > uint32_t loss_mask_; > > uint32_t packet_; > > + Inspector* inspector_; > > Where's the destructor? I think I prefer to just have it owned by the parent, so I'll remove the new in the actual test case.
Comment 12•7 years ago
|
||
(In reply to Eric Rescorla (:ekr) from comment #11) > OK, this is fair comment. I got excited about generality but > then didn't backfill. Would you prefer that we remove the > generality or do the backfill. I'm not all that excited about having two layers here. Maybe just make a new transport layer that parallels TransportLayerLossy with the new capability. public class TransportLayerCCSRepeater : TransportLayer { TransportLayerCCSRepeater() : sent_(false) {} ~TransportLayerCCSRepeater () {} virtual TransportResult SendPacket(const unsigned char *data, size_t len) { MOZ_MTLOG(ML_NOTICE, LAYER_INFO << "SendPacket(" << len << ")"); TriggerSend(); return downward_->SendPacket(data, len); } void StateChange(TransportLayer *layer, State state) { TL_SET_STATE(state); } void PacketReceived(TransportLayer *layer, const unsigned char *data, size_t len) { SignalPacketReceived(this, data, len); } TRANSPORT_LAYER_ID("ccsrepeater") protected: virtual void WasInserted() { downward_->SignalPacketReceived. connect(this, &TransportLayerLossy::PacketReceived); downward_->SignalStateChange. connect(this, &TransportLayerLossy::StateChange); TL_SET_STATE(downward_->state()); } private: bool TriggerSend() { } bool sent_; }; It's a little more code, but I think that works.
Assignee | ||
Comment 13•7 years ago
|
||
Assignee | ||
Updated•7 years ago
|
Attachment #8432812 -
Flags: review?(martin.thomson)
Comment 14•7 years ago
|
||
Comment on attachment 8432812 [details] [diff] [review] part 2: Unit test Review of attachment 8432812 [details] [diff] [review]: ----------------------------------------------------------------- LGTM
Attachment #8432812 -
Flags: review?(martin.thomson) → review+
Assignee | ||
Updated•7 years ago
|
Attachment #8432008 -
Attachment is obsolete: true
Attachment #8432008 -
Flags: feedback?(wtc)
Comment 15•7 years ago
|
||
I changed // to /* */ and fixed the indentation level. Patch checked in: https://hg.mozilla.org/projects/nss/rev/8f026c806587
Attachment #8432010 -
Attachment is obsolete: true
Attachment #8432010 -
Flags: review?(wtc)
Attachment #8433760 -
Flags: review+
Attachment #8433760 -
Flags: checked-in+
Comment 16•7 years ago
|
||
Comment on attachment 8432812 [details] [diff] [review] part 2: Unit test Review of attachment 8432812 [details] [diff] [review]: ----------------------------------------------------------------- r=wtc. ::: media/mtransport/test/transport_unittests.cpp @@ +86,5 @@ > + public: > + InspectorCCSInjector() : injected_(false) {} > + > + // Inject a bogus CCS after you see the client's second handshake message > + // Which starts with a certificate message. Nit: don't capitalize "which". @@ +89,5 @@ > + // Inject a bogus CCS after you see the client's second handshake message > + // Which starts with a certificate message. > + virtual void Inspect(TransportLayer* layer, > + const unsigned char *data, size_t len) { > + unsigned char ccs[] = {0x14, // Type Nit: this array can be static const. @@ +115,5 @@ > + return; > + > + layer->SendPacket(ccs, sizeof(ccs)); > + > + return; Nit: this return statement can be omitted.
Attachment #8432812 -
Flags: review+
Comment 17•7 years ago
|
||
(In reply to Eric Rescorla (:ekr) from comment #8) > > 1. I don't sanity check that the CCS is correct, since we are ignoring it > anyway. I believe this is consistent with our general practice, but I wanted > to get your opinion. This is fine by me. How we handle this corner case doesn't matter much in practice, so I prefer your simpler code. > 2. I decided to return SECSuccess. This is what we do with out of order > handshake messages so it should be safe. I agree with your analysis. > 3. I had to add code to set buf->len = 0 because it's expected that we will > consume the > entire buffer. Failure to do so causes an assert later. Does this seem OK? Yes, this matches what ssl3_HandleChangeCipherSpecs does when it handles a good ChangeCipherSpec, and also what dtls_HandleHandshake does before it returns SECSuccess: SECStatus dtls_HandleHandshake(sslSocket *ss, sslBuffer *origBuf) { ... SECStatus rv = SECSuccess; ... origBuf->len = 0; /* So ssl3_GatherAppDataRecord will keep looping. */ /* XXX OK for now. In future handle rv == SECWouldBlock safely in order * to deal with asynchronous certificate verification */ return rv; }
Updated•7 years ago
|
OS: Mac OS X → All
Priority: -- → P1
Hardware: x86 → All
Target Milestone: --- → 3.16.2
Version: trunk → 3.14
Assignee | ||
Comment 18•7 years ago
|
||
Keeler, Do you know when we plan to upgrade NSS in Firefox? We can't land the unit test above until that happens.
Flags: needinfo?(dkeeler)
Looks like around June 6th. This is what Kai sent to the nss-dev list: Today we discussed to target the 3.16.2 release in about 3 weeks, around June 20. All new feature work for 3.16.2 should ideally be checked in to NSS trunk by June 6, and on that day we should land an updated beta into mozilla-central (prior to the Mozilla June 9 branch date). After June 6 we'll have some more time to get web-crypto related bugfixes done, which Ryan is working on. Also, next week we'll be making some changes to the root CA list, and final testing confirmations might take until after June 9. We should freeze new NSS feature work after June 6 and limit changes to bugfixes, until the final release of 3.16.2, to keep it simple (avoid branching of NSS). (Note I'll be on PTO starting Wednesday, June 18, to Sunday, June 22.) Thanks, Kai __ nss-dev mailing list nss-dev@mozilla.org https://mail.mozilla.org/listinfo/nss-dev
Flags: needinfo?(dkeeler)
Assignee | ||
Comment 20•7 years ago
|
||
Comment on attachment 8433760 [details] [diff] [review] part 1: NSS fix by Eric Rescorla, v2 Review of attachment 8433760 [details] [diff] [review]: ----------------------------------------------------------------- [Approval Request Comment] Bug caused by (feature/regressing bug #): In original DTLS code. User impact if declined: Continued occasional DTLS connectivity failures under packet loss. Testing completed (on m-c, etc.): Hand tested. Unit test in patch #2. Risk to taking this patch (and alternatives if risky): Low. Patch is already in NSS and we just didn't pull the latest NSS into Firefox yet. String or IDL/UUID changes made by this patch: None.
Attachment #8433760 -
Flags: approval-mozilla-aurora?
Comment 21•7 years ago
|
||
Eric: I tried to update NSS to NSS_3_16_2_BETA3 last Thursday, but it was reverted because of WebCrypto mochitest failure. Since I don't know how to run mochitest, I could not debug it quickly. With Richard's help, I finally tracked it down and wrote a patch on Saturday. I will test my fix today and hopefully retry the NSS update later today or tomorrow.
Assignee | ||
Comment 22•7 years ago
|
||
Wan-Teh, Thanks for checking into this. I suspect that if you update NSS as you describe in c21, it would be better to just uplift that to Aurora, given that the branch just happened today. What do you think?
Comment 23•7 years ago
|
||
Eric, I am not familiar with how Firefox branches are created, so I'd like to only deal with mozilla-inbound and let others uplift NSS upgrades to Aurora. Is that OK? But I am wondering why it is fine to just uplift an NSS update to Aurora without pushing it to mozilla-inbound or mozilla-central?
Assignee | ||
Comment 24•7 years ago
|
||
Wan-Teh, I am not too familiar with how NSS gets into Firefox myself, but here is what I was assuming would happen: 1. We would eventually move this version of NSS into m-i and m-c. 2. We would concurrently either: (a) update aurora to use the new version of NSS (I.e., uplift NSS) (b) update aurora to just have this one patch if people felt that (a) was risky. Does that seem sensible? Do you have a preference?
Comment 25•7 years ago
|
||
Eric, You mean just uplifting your NSS patch in the bug to Aurora? Yes, we can apply NSS patches directly when necessary. But the patches need to be documented in the security/patches directory (adding a patch file and describing the patch in the README file). Also, I think it still needs to be pushed to mozilla-inbound first. Would you like to do that?
Assignee | ||
Comment 26•7 years ago
|
||
Wan-Teh, It's my fault for not being clear. My final goal here is to make sure we have this fix for FF 32 (which just went to Aurora). I'm flexible about how that is accomplished. If you think it's possible to land a new version of NSS on m-i soon and then uplift that to Aurora, that's probably best.
Assignee | ||
Comment 27•7 years ago
|
||
Wan-Teh, Do you have any updates on the NSS update for Firefox? Were you and Richard able to fix the build problems?
Flags: needinfo?(wtc)
Comment 28•7 years ago
|
||
Eric: I just pushed NSS_3_16_2_BETA4 to mozilla-inbound: https://hg.mozilla.org/integration/mozilla-inbound/rev/957b31f09f9e
Flags: needinfo?(wtc)
Comment 29•7 years ago
|
||
Comment on attachment 8433760 [details] [diff] [review] part 1: NSS fix by Eric Rescorla, v2 The NSS fix was pushed to mozilla-central: https://hg.mozilla.org/mozilla-central/rev/957b31f09f9e
Assignee | ||
Comment 30•7 years ago
|
||
Assignee | ||
Updated•7 years ago
|
Attachment #8441484 -
Flags: review?(martin.thomson)
Comment 31•7 years ago
|
||
Comment on attachment 8441484 [details] [diff] [review] Additional DTLS tests Review of attachment 8441484 [details] [diff] [review]: ----------------------------------------------------------------- ::: media/mtransport/test/transport_unittests.cpp @@ +217,5 @@ > + peer->fingerprint_, > + peer->fingerprint_len_); > + > + size_t fingerprint_len = peer->fingerprint_len_; > + MOZ_ASSERT(len_offset); MOZ_ASSERT(peer->fingerprint_len_ + len_offset < TransportLayer::kMaxDigestLength); MOZ_ASSERT(peer->fingerprint_len_ + len_offset > 0); @@ +223,5 @@ > + // Avoid adding a negative number to an unsigned. > + if (len_offset > 0) { > + fingerprint_len += len_offset; > + } else { > + fingerprint_len -= (-1 * len_offset); Want to add the static_cast<size_t> stuff we discussed here to avoid the warning? @@ +231,5 @@ > + "sha-1", > + fingerprint_to_set, fingerprint_len); > + > + if (fingerprint_len > TransportLayerDtls::kMaxDigestLength) { > + ASSERT_FALSE(NS_SUCCEEDED(res)); Just looking at SetVerificationDigest, if the fingerprint is too long, it doesn't change whether verification is required, or the old fingerprint. Since I assume that the test runs, what is causing it to fail in TestConnectBadDigestSuperLong then?
Attachment #8441484 -
Flags: review?(martin.thomson) → review+
Assignee | ||
Updated•7 years ago
|
Attachment #8441484 -
Attachment is obsolete: true
Assignee | ||
Comment 32•7 years ago
|
||
--- media/mtransport/test/transport_unittests.cpp | 162 +++++++++++++++++++++++++- 1 file changed, 161 insertions(+), 1 deletion(-)
Assignee | ||
Updated•7 years ago
|
Attachment #8441534 -
Flags: review?(martin.thomson)
Comment 33•7 years ago
|
||
Comment on attachment 8441534 [details] [diff] [review] [PATCH] part 2: Unit test Review of attachment 8441534 [details] [diff] [review]: ----------------------------------------------------------------- ::: media/mtransport/test/transport_unittests.cpp @@ +48,5 @@ > +const uint8_t kTlsHandshakeType = 0x16; > + > +const uint8_t kTlsHandshakeCertificate = 0x0b; > + > +const unsigned char kTlsFakeChangeCipherSpec[] = { Why not be consistent and use uint8_t here too? @@ +122,5 @@ > void SetLoss(uint32_t packet) { > loss_mask_ |= (1 << (packet & 32)); > } > > + void SetInspector(Inspector* inspector) { // transfers ownership of inspector to TransportLayerLossy @@ +167,5 @@ > + public: > + DtlsRecordParser(const unsigned char *data, size_t len) > + : buffer_(data, len), offset_(0) {} > + > + bool NextRecord(uint8_t* ct, RefPtr<DataBuffer>* buffer) { What's the story on pass by reference here? @@ +239,5 @@ > + virtual void OnRecord(TransportLayer* layer, > + uint8_t content_type, > + const unsigned char *data, size_t len) { > + // Only inject once. > + if (injected_) I'll only say this once, hopefully, but I prefer to see braces on all control statements.
Attachment #8441534 -
Flags: review?(martin.thomson) → review+
Assignee | ||
Comment 34•7 years ago
|
||
https://hg.mozilla.org/integration/mozilla-inbound/rev/882807bc3d73
Comment 35•7 years ago
|
||
https://hg.mozilla.org/mozilla-central/rev/882807bc3d73
Status: ASSIGNED → RESOLVED
Closed: 7 years ago
Resolution: --- → FIXED
Comment 36•7 years ago
|
||
Comment on attachment 8433760 [details] [diff] [review] part 1: NSS fix by Eric Rescorla, v2 Aurora approval granted.
Attachment #8433760 -
Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
Assignee | ||
Comment 37•7 years ago
|
||
I'm actually now kind of confused about what the right thing is here, since this is in NSS. Do we want to just update Aurora to the new NSS?
Flags: needinfo?(lmandel)
Assignee | ||
Updated•7 years ago
|
Flags: needinfo?(dougt)
Assignee | ||
Updated•7 years ago
|
Flags: needinfo?(wtc)
Comment 38•7 years ago
|
||
(In reply to Eric Rescorla (:ekr) from comment #37) > I'm actually now kind of confused about what the right thing is here, > since this is in NSS. Do we want to just update Aurora to the new NSS? That was my understanding when approving. We're early enough in Aurora where I'm comfortable taking this change if the update is thought to be safe. I won't want to update again after the midpoint on Aurora (end of next week).
Flags: needinfo?(lmandel)
Comment 39•7 years ago
|
||
I suspect you'd uplift https://hg.mozilla.org/mozilla-central/rev/957b31f09f9e
Flags: needinfo?(dougt)
(In reply to Doug Turner (:dougt) from comment #39) > I suspect you'd uplift > https://hg.mozilla.org/mozilla-central/rev/957b31f09f9e More or less - see bug 1020695.
Comment 41•7 years ago
|
||
Eric: David or I will take care of uplifting a new NSS tag to mozilla-aurora in bug 1020695.
Flags: needinfo?(wtc)
Assignee | ||
Comment 42•7 years ago
|
||
Wan-Teh: Thank you for your help.
You need to log in
before you can comment on or make changes to this bug.
Description
•