Closed Bug 1340854 Opened 3 years ago Closed 3 years ago

Properly report TLS handshake telemetry for 0 length reads

Categories

(Core :: Security: PSM, defect)

defect
Not set

Tracking

()

RESOLVED FIXED
mozilla54
Tracking Status
firefox52 --- fixed
firefox-esr52 --- fixed
firefox53 --- fixed
firefox54 --- fixed

People

(Reporter: ekr, Assigned: ekr)

Details

Attachments

(1 file)

0 length reads cause intolerance fallback, but we weren't reporting it as handshake failure. This patch fixes that.
Attachment #8838912 - Flags: review?(dkeeler)
Comment on attachment 8838912 [details] [diff] [review]
0001-Report-0-length-reads.patch

Not sure who is around, so...
Attachment #8838912 - Flags: review?(martin.thomson)
Here's a precis of checkHandshake


- bytesTransferred is negative (line 1214)
   woudlblock -> returns

- bytesTransfered is 0 && reading (line 1240)
  error is something or other, but we call for intolerance with PR_END_OF_FILE_ERROR

- If we are retrying, then bytesTransfered = -1


The fix is to make reportHandshakeStatus report PR_END_OF_FILE error if bytesTransfered == 0 and wasReading
Comment on attachment 8838912 [details] [diff] [review]
0001-Report-0-length-reads.patch

Review of attachment 8838912 [details] [diff] [review]:
-----------------------------------------------------------------

Seems fine.  One day you are going to get an r- just for uploading patches with insufficient context.
Attachment #8838912 - Flags: review?(martin.thomson) → review+
Hopefully before that happens we will have a review system which isn't nightmarish to use from Git.
Keywords: checkin-needed
Pushed by ekr@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/356449a93a4c
Properly report TLS handshake telemetry for 0 length reads. r=mt
Keywords: checkin-needed
https://hg.mozilla.org/mozilla-central/rev/356449a93a4c
Status: NEW → RESOLVED
Closed: 3 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla54
Comment on attachment 8838912 [details] [diff] [review]
0001-Report-0-length-reads.patch

Review of attachment 8838912 [details] [diff] [review]:
-----------------------------------------------------------------

:mt got to it first, but lgtm in any case.
Attachment #8838912 - Flags: review?(dkeeler)
Comment on attachment 8838912 [details] [diff] [review]
0001-Report-0-length-reads.patch

This is the second part of a probe to measure TLS success/failure,
as discussed in e-mail/IRC.

Approval Request Comment
[Feature/Bug causing the regression]: N/A
[User impact if declined]: Inferior reporting of TLS success/failure
with potential slippage of TLS 1.3 ship.
[Is this code covered by automated tests?]: It runs but no tests
for accuracy.
[Has the fix been verified in Nightly?]: Locally and shipped in Nightly w/o signs of breakage
[Needs manual test from QE? If yes, steps to reproduce]: No
[List of other uplifts needed for the feature/fix]: None
[Is the change risky?]: No
[Why is the change risky/not risky?]: Simple change to a telemetry
probe that doesn't affect user behavior.
[String changes made/needed]: None.
Attachment #8838912 - Flags: approval-mozilla-beta?
Attachment #8838912 - Flags: approval-mozilla-aurora?
Comment on attachment 8838912 [details] [diff] [review]
0001-Report-0-length-reads.patch

fix a telemetry probe, aurora53+, beta52+
Attachment #8838912 - Flags: approval-mozilla-beta?
Attachment #8838912 - Flags: approval-mozilla-beta+
Attachment #8838912 - Flags: approval-mozilla-aurora?
Attachment #8838912 - Flags: approval-mozilla-aurora+
Assignee: nobody → ekr
I know this is an old bug, but I manage to hit this assertion:
1181   MOZ_ASSERT(false);

https://hg.mozilla.org/mozilla-central/annotate/tip/security/manager/ssl/nsNSSIOLayer.cpp#l1181

Are we sure that this will never happen? As a part of TCP FastOpen, I call PR_Write with 0 bytes to write to force draining a buffer in TCPFastOpenLayer (this layer is under nss layers) (this code still have not landed). Are we allowed to call PR_Write with 0 bytes to be written?

TCPFastOpenLayer::Send is called with 0 bytes to write, I tried 2 things:
1) TCPFastOpenLayer::Send calls underlying layer with 0 bytes to write.
2) If TCPFastOpenLayer::Send gets 0bytes to write returns WOULD_BLOCK error

The result is the same in both cases. TCPFastOpenLayer::Send is called with 51 bytes to be written, these 51 bytes is part of a handshake, and it returns 51 bytes written. After that it is not called any more and the assertion is hit. Also ssl_DefSent is not call any more.

I suspect that 
    rv = ssl3_SendApplicationData(ss, buf, len, flags);
returns 0. (This is called from ssl_SecureSend(sslSocket *ss, const unsigned char *buf, int len, int flags)).

I cannot reproduce it locally. It fails on try, and I used fprintf to figure out what is happening.

Necko used to call PR_Write with 0 bytes to drive handshake, but this was changed last summer.

If we are no allowed to call PR_Write with 0 bytes to be written, I will change my patch.
Flags: needinfo?(ekr)
Ah, nice catch. I am not sure, TBH.

My first thought is that I don't think there is any guarantee that PR_Write(0)) will do anything. So, NSS might decide to drive the handshake forward but it might also just decide to return immediately. And it certainly should always return that 0 bytes have been written in that case. I think if you want to drive the handshake forward you probably need to do ForceHandshake()
Flags: needinfo?(ekr)
(In reply to Eric Rescorla (:ekr) from comment #14)
> Ah, nice catch. I am not sure, TBH.
> 
> My first thought is that I don't think there is any guarantee that
> PR_Write(0)) will do anything. So, NSS might decide to drive the handshake
> forward but it might also just decide to return immediately. And it
> certainly should always return that 0 bytes have been written in that case.
> I think if you want to drive the handshake forward you probably need to do
> ForceHandshake()

Thanks.
I will change my patch. I want to write to the network the data that was buffered in a layer under tls. TLS thinks that these data are already send. So we can have a problem if nss decides to return immediately. I will change my patch.
So, I think you probably need to just have a handle to the lower-layer NSPR thing and tell it to flush itself.

That's the kind of thing we do in media/mtransport.
(In reply to Eric Rescorla (:ekr) from comment #16)
> So, I think you probably need to just have a handle to the lower-layer NSPR
> thing and tell it to flush itself.
> 
> That's the kind of thing we do in media/mtransport.

That was my backup plan.
I tried with the first one because that one will make nsSocketTransport immediately and nicely pick up errors, but the backup plan works too.
You need to log in before you can comment on or make changes to this bug.