Closed Bug 713933 Opened 13 years ago Closed 11 years ago

Make SSL False Start work with asynchronous certificate validation (SSL_AuthCertificateComplete)

Categories

(NSS :: Libraries, enhancement, P1)

enhancement

Tracking

(Not tracked)

RESOLVED FIXED
3.15.4

People

(Reporter: briansmith, Assigned: briansmith)

References

Details

Attachments

(6 files, 26 obsolete files)

7.17 KB, patch
wtc
: review+
Details | Diff | Splinter Review
3.24 KB, patch
wtc
: review+
Details | Diff | Splinter Review
2.98 KB, patch
briansmith
: review?
wtc
Details | Diff | Splinter Review
41.68 KB, patch
briansmith
: checked-in+
Details | Diff | Splinter Review
1.19 KB, patch
briansmith
: review+
Details | Diff | Splinter Review
1.57 KB, patch
briansmith
: review+
Details | Diff | Splinter Review
When the auth certificate callback returns SECWouldBlock, False Start is disabled on the connection (as of bug 542832). If we didn't disable False Start, then False Start could cause us to start sending application data before the SSL_AuthCertificateComplete has been called.

In order to allow these two features to work tegether, we need to change the False Start implementation to wait for SSL_AuthCertificateComplete to be called, and we need to remove the code in ssl3_HandleCertificate that disables False Start when the auth certificate hook returns SECWouldBlock.
I have a patch that updates false start in moz-central nss in the following ways

 * fixes the async cert verification issue while still using the original false start path when synchronous verification is being done

 * provides an option to link false start to the presence of the NPN extension in the server hello - in effect using NPN as a signal of an up to date. This option is true by default.

* provides an option to link false start to the presence of a forward secrecy preserving key exchange algorithm. This option is true by default. This will cause a minor merge conflict with the cvsroot due to bug 810582 - but the patch here accommodates both policies and provides an option so I hope it is acceptable to supercede 810582

* provides a new bool in SSL_GetChannelInfo() to determine if all the expected handshake bits are completed even in the presence of false start where the handhake callback has been executed earlier.

The code for creating gecko prefs and mapping them to the NSS options can be handled over in psm bug 658222

https://tbpl.mozilla.org/?tree=Try&rev=605f96e2fe2c
Assignee: nobody → mcmanus
Depends on: 810582
Attached patch patch 0 (obsolete) — Splinter Review
Attachment #681956 - Flags: review?(bsmith)
Comment on attachment 681956 [details] [diff] [review]
patch 0

Review of attachment 681956 [details] [diff] [review]:
-----------------------------------------------------------------

::: security/nss/lib/ssl/ssl.h
@@ +163,5 @@
> +/* Require the NPN extension to be supported by server as a precondition    */
> +/* to using false start. This helps screen out older server implementations */
> +/* which may not be compatible. True by default.                            */
> +
> +#define SSL_FALSE_START_REQUIRES_FORWARD_SECRECY 25

These two options make it very complicated to use SSL False Start.
Ideally NSS should automatically do the right thing -- avoid
incompatibilities and avoid cipher suite downgrade.

Do you really need to flexibility provided by these two options?

@@ +626,5 @@
>  
>  /*
>  ** Set the callback on a particular socket that gets called when we finish
> +** performing a handshake. For clients performing false start this will
> +** occur after Server Hello has been received.

I remember for False Start we call the handshake callback after sending
the client's Finished message to the server. Did this patch change that?
If not, then this comment is not accurate.

::: security/nss/lib/ssl/ssl3con.c
@@ +6085,5 @@
> +             ss->ssl3.hs.kea_def->kea == kea_dhe_rsa ||
> +             ss->ssl3.hs.kea_def->kea == kea_ecdhe_ecdsa ||
> +             ss->ssl3.hs.kea_def->kea == kea_ecdhe_rsa);
> +    }
> +    else {

"} else {" should be on the same line.

::: security/nss/lib/ssl/sslinfo.c
@@ +86,5 @@
>  	        inf.sessionIDLength = sidLen;
>  		memcpy(inf.sessionID, sid->u.ssl3.sessionID, sidLen);
>  	    }
>  	}
> +    inf.fullHandshakeComplete = ss->firstHsDone;

'fullHandshakeComplete' is poorly named. "Full handshake" has a special meaning in TLS.
See Figure 1 of RFC 2246, That's not what ss->firstHsDone means.
Comment on attachment 681956 [details] [diff] [review]
patch 0

Review of attachment 681956 [details] [diff] [review]:
-----------------------------------------------------------------

Always base NSS patches on the NSS CVS HEAD. (If you want to do this within the context of a Gecko source tree, then create a new revision--or patch in a patch queue--that contains the output of "python client.py update_nss HEAD".) Even if we want to experiment with this patch in mozilla-central before integrating it into the shared NSS codebase, we would still probably update NSS to the latest revision in mozilla-central so that a patch against the CVS HEAD would apply cleanly.

I filed a new bug for the additional options you propose: bug 820705. Attach a new version of the patch that fixes this bug without fixing bug 820705.

One of the main concerns I had about False Start, and the primary reason I disabled False Start when async. cert validation is used, is concern about code flow like this:
    
     canFalseStart = ssl_CanFalseStart(ss);
     if (canFalseStart) {
        // do X
     }

     ....

     canFalseStart = ssl_CanFalseStart(ss);
     if (canFalseStart) {
        // do Y
     }

In particular, ssl3_CanFalseStart might return PR_FALSE the first time and then PR_TRUE the second time, e.g. if SSL_AuthCertificateComplete were to be called in between the two calls. I remember thinking that it is necessary to avoid calling ssl3_CanFalseStart more than once for this reason. But, I cannot remember the reason exactly. I need to look into it more.

Check the whitespace and use the same convention, which is currently tabs (with one tab == 8 spaces) for indention. Several parts of the patch are not indented correctly because of this.

::: security/nss/lib/ssl/ssl.h
@@ +626,5 @@
>  
>  /*
>  ** Set the callback on a particular socket that gets called when we finish
> +** performing a handshake. For clients performing false start this will
> +** occur after Server Hello has been received.

I suggest:

"When false start is disabled, the callback will be called after the integrity of the handshake has been confirmed (after the peer's Finished message has been received.) When false start is enabled, the callback will be called *before* the integrity of the handshake has been confirmed (after the peer's ServerHelloDone message is received and before any application data is sent); the contents of messages received from the peer before the callback is called are available to the callback (in particular, the peer's certificate and certificate status messages)."

We should reserve the flexibility in when exactly we call the handshake callback, as long as we call it between the two points noted in the previous paragraph.

::: security/nss/lib/ssl/ssl3con.c
@@ +6065,5 @@
>  ssl3_CanFalseStart(sslSocket *ss) {
>      PRBool rv;
>  
>      PORT_Assert( ss->opt.noLocks || ssl_HaveSSL3HandshakeLock(ss) );
>  

It is clearer to add the check that we're not waiting for a certificate validation result here, than in the place that you are currently doing it. But, I am not sure if there is a reason you added it there instead of here; is there?

@@ +6070,1 @@
>      ssl_GetSpecReadLock(ss);

It seems like the majority of the checks we do can be done before we acquire the spec read lock. It seems like a good idea to move this acquisition of the spec read lock to be below the checks that do not require it to be held. This will allow you to use early return rather than the "rv = rv && ..." pattern.

@@ +6070,3 @@
>      ssl_GetSpecReadLock(ss);
> +    rv = ss->opt.enableFalseStart && !ss->sec.isServer &&
> +        !ss->ssl3.hs.isResuming && ss->ssl3.cwSpec;

Keep each condition on a separate line as was orignally done.

In general, it is good to observe the tradition of avoiding making gratuitous or purely stylistic changes in NSS (and especially libssl) when avoidable.

@@ +6091,5 @@
> +        rv = rv &&
> +            ss->ssl3.cwSpec->cipher_def->secret_key_size >= 10 &&
> +            (ss->ssl3.hs.kea_def->exchKeyType == ssl_kea_rsa ||
> +             ss->ssl3.hs.kea_def->exchKeyType == ssl_kea_dh  ||
> +             ss->ssl3.hs.kea_def->exchKeyType == ssl_kea_ecdh);

This is not correct, but this part of the patch will be moved to bug 820705.

@@ +6239,5 @@
>  
> +    /* Do the handshake callback for sslv3 here, if we can false start and are not
> +       not doing async certificate verification */
> +    if (ss->handshakeCallback != NULL &&
> +        !ss->ssl3.hs.didHSCallback && !ss->ssl3.hs.authCertificatePending &&

s/didHSCallback/calledHandshakeCallback/ to make the name clearer.

It seems like a good idea to remove the !ss->ssl3.hs.authCertificatePending check here, and replace it with a !ss->ssl3.hs.restartTarget check in ssl3_CanFalseStart.

@@ +8679,5 @@
> +        !ss->ssl3.hs.didHSCallback &&
> +        ssl3_CanFalseStart(ss)) {
> +        ss->ssl3.hs.didHSCallback = PR_TRUE;
> +	(ss->handshakeCallback)(ss->fd, ss->handshakeCallbackData);
> +    }

Why is this needed? Above, in the target(ss) line, we will call either ssl3_SendClientSecondRound or ssl3_FinishHandshake, either of which should have already done this, no?

::: security/nss/lib/ssl/sslinfo.c
@@ +86,5 @@
>  	        inf.sessionIDLength = sidLen;
>  		memcpy(inf.sessionID, sid->u.ssl3.sessionID, sidLen);
>  	    }
>  	}
> +    inf.fullHandshakeComplete = ss->firstHsDone;

IMO, it was a mistake in the original implementation of False Start to have the handshake callback called before the handshake was actually (securely) done. Instead, we probably should have added a "BeforeFalseStart" callback that is called. Too late now. But, instead of setting this flag, it would be better to have a new callback, e.g. HandshakeFinishedCallback, that is always called after the handshake has been (securely) finished. I am pretty sure that in Gecko we need to execute code at both points in time. If we were to do that, then we would not need this.

::: security/nss/lib/ssl/sslt.h
@@ +120,5 @@
>      /* compression method info */
>      const char *         compressionMethodName;
>      SSLCompressionMethod compressionMethod;
> +
> +    /* fullHandshakeComplete was added in NSS 3.15 ?? */

It is not a good idea to add more of this kind of comment.
Attachment #681956 - Flags: review?(bsmith) → review-
> ::: security/nss/lib/ssl/sslinfo.c
> @@ +86,5 @@
> >  	        inf.sessionIDLength = sidLen;
> >  		memcpy(inf.sessionID, sid->u.ssl3.sessionID, sidLen);
> >  	    }
> >  	}
> > +    inf.fullHandshakeComplete = ss->firstHsDone;
> 
> IMO, it was a mistake in the original implementation of False Start to have
> the handshake callback called before the handshake was actually (securely)
> done. Instead, we probably should have added a "BeforeFalseStart" callback
> that is called. Too late now. But, instead of setting this flag, it would be
> better to have a new callback, e.g. HandshakeFinishedCallback, that is
> always called after the handshake has been (securely) finished. I am pretty
> sure that in Gecko we need to execute code at both points in time. If we
> were to do that, then we would not need this.
> 

for gecko purposes, the info bit and the callback are equivalent. We are already polling for this information based on I/O events.
(In reply to Brian Smith (:bsmith) from comment #4)
> Comment on attachment 681956 [details] [diff] [review]
> patch 0
> 
> Review of attachment 681956 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> 
> In particular, ssl3_CanFalseStart might return PR_FALSE the first time and
> then PR_TRUE the second time, e.g. if SSL_AuthCertificateComplete were to be
> called in between the two calls. I remember thinking that it is necessary to
> avoid calling ssl3_CanFalseStart more than once for this reason. But, I
> cannot remember the reason exactly. I need to look into it more.
> 

I've spent some time pondering the question, and other than the need to track whether or not the handshake callback actually happened early I don't see any problem with this. But you're clearly the expert - we just need to move forward now.

Actually, I'm not entirely certain how the return value of this function could change within a given handshake based on where the functions are called, other than options being flipped, but maybe I'm missing something obvious.

> 
> ::: security/nss/lib/ssl/ssl3con.c
> @@ +6065,5 @@
> >  ssl3_CanFalseStart(sslSocket *ss) {

> 
> It is clearer to add the check that we're not waiting for a certificate
> validation result here, than in the place that you are currently doing it.
> But, I am not sure if there is a reason you added it there instead of here;
> is there?

primarily this is done to avoid/reduce having ssl3_CanFalseStart() return different values at different points in time based on async cert validation (your previous concern).

> 
> @@ +6070,1 @@
> >      ssl_GetSpecReadLock(ss);
> 
> It seems like the majority of the checks we do can be done before we acquire
> the spec read lock. It seems like a good idea to move this acquisition of
> the spec read lock to be below the checks that do not require it to be held.
> This will allow you to use early return rather than the "rv = rv && ..."
> pattern.

I'm just using the locking pattern already in this function. I'd appreciate it if you made the suggested change in a different bug/patch. I don't feel confident enough to make it.


> @@ +6239,5 @@
>
> It seems like a good idea to remove the !ss->ssl3.hs.authCertificatePending
> check here, and replace it with a !ss->ssl3.hs.restartTarget check in
> ssl3_CanFalseStart.
>

noted above why it isn't in CanFalseStart(). Should this call site be changed to !restartTarget from !authCertificatePending ? What will be caught by making that change? restartTarget seems to be null sometimes when authCertificatePending is true. (see below)
 
> @@ +8679,5 @@
> > +        !ss->ssl3.hs.didHSCallback &&
> > +        ssl3_CanFalseStart(ss)) {
> > +        ss->ssl3.hs.didHSCallback = PR_TRUE;
> > +	(ss->handshakeCallback)(ss->fd, ss->handshakeCallbackData);
> > +    }
> 
> Why is this needed? Above, in the target(ss) line, we will call either
> ssl3_SendClientSecondRound or ssl3_FinishHandshake, either of which should
> have already done this, no?

In the handshakes I have looked at restartTarget is null at that point.. there is no client cert to send and we haven't recevied a finished message yet (and don't want to wait for one). Its effectively where the handshake gets kicked off when we do false start.

> better to have a new callback, e.g. HandshakeFinishedCallback, that is
> always called after the handshake has been (securely) finished. I am pretty
> sure that in Gecko we need to execute code at both points in time. If we
> were to do that, then we would not need this.

fwiw I have made this change in the updates I'm working on.
Attached patch patch 1 (obsolete) — Splinter Review
Attachment #681956 - Attachment is obsolete: true
Attachment #691846 - Flags: review?(bsmith)
Comment on attachment 691846 [details] [diff] [review]
patch 1

Review of attachment 691846 [details] [diff] [review]:
-----------------------------------------------------------------

Patrick, please consider the change below, and if you agree that it it would work, try it out. That change would make determining the correctness of the fix for this bug much more obvious.

::: security/nss/lib/ssl/ssl.def
@@ +157,5 @@
>  ;+*;
>  ;+};
> +;+NSS_3.15 {    # NSS 3.15 release?
> +;+    global:
> +SSL_HandshakeFinishedCallback;

The addition of this should be done in a separate bug.

::: security/nss/lib/ssl/ssl.h
@@ +620,5 @@
> +** When false start is disabled, the callback will be called after the
> +** integrity of the handshake has been confirmed (after the peer's
> +** Finished message has been received.) When false start is enabled, the
> +** callback will be called *before* the integrity of the handshake has
> +** been confirmed (after the peer's ServerHelloDone message is received

s/received and/, after the peer's certificate has been validated, and/

@@ +624,5 @@
> +** been confirmed (after the peer's ServerHelloDone message is received
> +** and before any application data is sent); the contents of messages
> +** received from the peer before the callback is called are available to
> +** the callback (in particular, the peer's certificate and certificate
> +** status messages).

This should be "[...] callback. (In particular, the peer's certificate and certificate status messages are available.)"

::: security/nss/lib/ssl/ssl3con.c
@@ +6070,5 @@
>      rv = ss->opt.enableFalseStart &&
>  	 !ss->sec.isServer &&
>  	 !ss->ssl3.hs.isResuming &&
>  	 ss->ssl3.cwSpec &&
> +	 ss->ssl3.nextProtoState != SSL_NEXT_PROTO_NO_SUPPORT &&

We need to do this in a separate bug. (Based on our previous discussions, I think we probably shouldn't do this at all, at least within libssl.)

@@ +6228,5 @@
> +    /* Do the handshake callback for sslv3 here, if we can false start and are not
> +       not doing async certificate verification */
> +    if (ss->handshakeCallback != NULL &&
> +	!ss->ssl3.hs.calledHandshakeCallback &&
> +	!ss->ssl3.hs.authCertificatePending &&

It is confusing to test calledHandshakeCallback here because we know we've never called the handshake callback before at the point.

I think it makes this code much more clear if we rename calledHandshakeCallback to falseStarted.

Also, I think we're supposed to return SECWouldBlock in the authCertificatePending case, something like this:

if (ssl3_CanFalseStart(ss)) {
    MOZ_ASERT(!ss->ssl3.hs.falseStarted);
    if (ss->ssl3.hs.authCertificatePending) {
        ss->restartTarget = ssl3_FalseStart;
        return SECWouldBlock;
    }

    return ssl3_FalseStart(ss);    
}

given:

static SECStatus
ssl3_FalseStart(sslSocket * ss)
{
    PORT_Assert(ss->opt.noLocks || ssl_HaveRecvBufLock(ss));
    PORT_Assert(ss->opt.noLocks || ssl_HaveSSL3HandshakeLock(ss));

    MOZ_ASSERT(!ss->sssl3.hs.falseStarted);
    /* Must be set before we call handshake callback, because the
     * handshake callback may call functions that depend on
     * the value of falseStarted, like SSL_GetChannelInfo.
     */
    ss->ssl3.hs.falseStarted = PR_TRUE;
    if (ss->handshakeCallback) {
       (ss->handshakeCallback)(ss->fd, ss->handshakeCallbackData);
    }

    return SECSuccess;
}

The main benefit of this would be, AFAICT, that every call to ssl3_CanFalseStart except this one can be replaced with a simple test of ss->ssl3.hs.falseStarted.

Making this kind of change would eliminate the issue that we have of it being difficult to determine which calls to ssl3_CanFalseStart need to be combined with a check of !ss->ssl3.hs.authCertificatePending. Note in particular that it seems like, in the patch currently under review, that the call to ssl3_CanFalseStart in ssl3_GatherCompleteHandshake is missing a check for !ss->ssl3.hs.authCertificatePending.

Besides that benefit, the duplicated logic for calling the handshakeCallback that was added to ssl3_AuthCertificateComplete could be removed.

Incidentally, this would pave the way for making ssl3_CanFalseStart pluggable as discussed on the mailing list.

Note that if we do as I suggest then the condition in ssl3_HandleFinished would have to be changed as follows:

- if (ss->ssl3.hs.restartTarget) {
+ if (ss->ssl3.hs.restartTarget &&
+     ss->ssl3.hs.restartTarget != ssl3_FalseStart) {
Attachment #691846 - Flags: review?(bsmith) → review-
(In reply to Brian Smith (:bsmith) from comment #8)
> if (ssl3_CanFalseStart(ss)) {
>     MOZ_ASERT(!ss->ssl3.hs.falseStarted);
>     if (ss->ssl3.hs.authCertificatePending) {
>         ss->restartTarget = ssl3_FalseStart;
>         return SECWouldBlock;
>     }
> 
>     return ssl3_FalseStart(ss);    
> }

After I commented in bug 589047, I realized that this advise is not ideal. We should not return  early in the ss->ssl3.hs.authCertificatePending case because we still want the handshaking to continue even if we can't false start yet, as explained in that bug (bug 589047). So, something more like this:

if (ssl3_CanFalseStart(ss)) {
  if (ss->ssl3.hs.authCertificatePending) {
    (void) ssl3_FalseStart(ss);    
  } else {
    /* Don't false start until certificate authentication completes, but
     * remember that we can false start so that we do it as soon as
     * certificate verification completes. */
    ss->restartTarget = ssl3_FalseStart;
  }
}

> > +SSL_HandshakeFinishedCallback;
> 
> The addition of this should be done in a separate bug.

This was unclear. I mean that the addition of SSL_HandshakeFinishedCallback should be done in a separate bug, if it is still needed. (We should fix bug 823409 instead.)
Attached patch patch v2 (obsolete) — Splinter Review
Set the NSS False Start Pref to true for Mozilla Necko.

Implement the NSS CanFalseStart callback to require NPN (as a signal of a modern TLS stack) and either a forward-secret KeyExchange or (RSA and historical evidence that the host has completed a full handshake with RSA in the recent past)
Attachment #691846 - Attachment is obsolete: true
Attachment #701933 - Flags: review?(bsmith)
Attached patch patch 2 (obsolete) — Splinter Review
[please disregard comment 10 - It does not apply to this bug (it was typed in the wrong tab)]

This version of the patch makes the CanFalseStart() determination one time, generally during the gather handshake logic. Subsequent callsites that need that information access a cached value of what CanFalseStart() returned.

A CanFalseStart callback function is introduced. If set by the application this callback is invoked instead of using the existing algorithm embedded in NSS. If it isn't set the existing algorithm is used.

The HandshakeCallback now only happens when the handshake is fully complete by its traditional definition. Applications looking for a false start indication should use the CanFalseStart callback but they can now be confident the entire handshake exchange has completed when HandshakeCallback is invoked.

No actual application sends (which are what really define false start) happen while asynchronous certificate validation is outstanding.
Attachment #701933 - Attachment is obsolete: true
Attachment #701933 - Flags: review?(bsmith)
Attachment #701946 - Flags: review?(bsmith)
Comment on attachment 701946 [details] [diff] [review]
patch 2

Review of attachment 701946 [details] [diff] [review]:
-----------------------------------------------------------------

::: security/nss/lib/ssl/ssl.h
@@ +629,5 @@
> +** used. Applications that do not set the callback will use an internal
> +** algorithm to determine the result if the false start option is enabled.
> +** If the callback is set false start will never be used without invoking the
> +** callback function, but some connections (e.g. resumed connections) will
> +** never use false start and therefore not invoke the callback.

We should expose the NSS built-in criteria for false start as a callable function, e.g. SSL_CanFalseStartCallback, and recommend in the documentation (but not require) that applications add logic like this to their implementations to maximize safety:

    SECStatus rv = SSL_CanFalseStartCallback(..., 
                                                    canFalseStart);
    if (rv != SECSuccess || !*canFalseStart)
        return rv;

@@ +634,5 @@
> +**/
> +typedef SECStatus (PR_CALLBACK *SSLCanFalseStartCallback)(PRFileDesc *fd,
> +                                                          void *arg,
> +                                                          PRUint16 secretKeySize,
> +                                                          SSLKEAType kea);

Please modify this to work as described in bug 823409 comment 0. The interface proposed in bug 823409 is consistent with the way we (NSS team) recently agreed to have callbacks functions work (e.g. the certificate validation callback function).

In particular, we should let the application use SSL_GetChannelInfo, SSL_HandshakeNegotiatedExtension, and SSL_PeerCertificate within the callback so that we don't have to guess as to what parameters to pass it.

Also, we decided previously that the pattern of separating the error status (SECStatus) from the boolean result is important, based on our previous experience where we couldn't determine whether SECFailure meant "stop, there is a problem" or just "no".

::: security/nss/lib/ssl/ssl3con.c
@@ +6069,2 @@
>      ssl_GetSpecReadLock(ss);
> +    if (ss->ssl3.hs.cachedFalseStart) {

I suggest naming this "cachedDidFalseStart" to make it's purpose clearer.

@@ +6069,4 @@
>      ssl_GetSpecReadLock(ss);
> +    if (ss->ssl3.hs.cachedFalseStart) {
> +	rv = ss->ssl3.hs.didFalseStart;
> +    }

else on same line as closing bracket

@@ +6075,5 @@
> +	     ss->ssl3.hs.isResuming ||
> +	     !ss->ssl3.cwSpec) {
> +	rv = PR_FALSE;
> +    }
> +    else if (!ss->canFalseStartCallback) {

The code in this "else if" should be put in the default false start callback.

::: security/nss/lib/ssl/sslauth.c
@@ +80,5 @@
>  
>      if (ss->firstHsDone) {
>  	enoughFirstHsDone = PR_TRUE;
>      } else if (ss->version >= SSL_LIBRARY_VERSION_3_0 &&
> +	       ss->ssl3.hs.didFalseStart) {

The false start callback should be allowed to call this function (and the other similar ones). In order to do that, we need to have a variable such as ssl3.hs.enoughFirstHsDone (default false, set to true in ssl3_HandleServerHelloDone) that we check instead.

Note that this would change the semantics of these functions from "fail unless the handshake was completed safe enough" to "fail unless we've received the ServerHelloDone" message. But, that should be OK.

::: security/nss/lib/ssl/sslinfo.c
@@ +46,5 @@
>  
>      if (ss->firstHsDone) {
>  	enoughFirstHsDone = PR_TRUE;
>      } else if (ss->version >= SSL_LIBRARY_VERSION_3_0 &&
> +	       ss->ssl3.hs.didFalseStart) {

ditto.

::: security/nss/lib/ssl/sslreveal.c
@@ +92,5 @@
>    }
>  
>    if (sslsocket->firstHsDone) {
>      enoughFirstHsDone = PR_TRUE;
> +  } else if (sslsocket->ssl3.initialized && sslsocket->ssl3.hs.didFalseStart) {

ditto.
Attachment #701946 - Flags: review?(bsmith) → review-
Attached patch patch 3 (obsolete) — Splinter Review
Attachment #701946 - Attachment is obsolete: true
Attachment #721453 - Flags: review?(bsmith)
Comment on attachment 721453 [details] [diff] [review]
patch 3

Review of attachment 721453 [details] [diff] [review]:
-----------------------------------------------------------------

Good work. I have some questions and suggestions.

::: security/nss/lib/ssl/ssl.def
@@ +164,5 @@
>  ;+*;
>  ;+};
> +;+NSS_3.14.3 {    # NSS 3.14.3 release
> +;+    global:
> +SSL_SetCanFalseStartCallback;

add SSL_DefaultCanFalseStart;

::: security/nss/lib/ssl/ssl.h
@@ +687,5 @@
> +/* A utility function that can be called from a custom CanFalseStartCallback
> +** function to determine what NSS would have done for this connection if the
> +** custom callback was not implemented.
> +**/
> +SSL_IMPORT SECStatus SSL_NSSCanFalseStart(PRFileDesc *fd, PRBool *canFalseStart);

rename to SSL_DefaultCanFalseStart.

::: security/nss/lib/ssl/ssl3con.c
@@ +6183,5 @@
>  }
>  
> +/* caller must hold SpecReadLock */
> +PRBool
> +ssl3_BuiltinAreFalseStartConditionsMet(sslSocket *ss) {

Brace on its own line.

@@ +6190,5 @@
> +    ssl_GetSpecReadLock(ss);
> +    if (!ss->opt.enableFalseStart ||
> +	ss->sec.isServer ||
> +	ss->ssl3.hs.isResuming ||
> +	!ss->ssl3.cwSpec) {

You do not need to check these conditions because they are preconditions to this function being called. A user-supplied CanFalseStart function must be able to rely on these as preconditions too.

@@ +6198,5 @@
> +	 * do False Start in the case that the selected ciphersuite is
> +	 * sufficiently strong that the attack can gain no advantage.
> +	 * Therefore we require an 80-bit cipher and a forward-secret key
> +	 * exchange. */
> +	rv = ss->ssl3.cwSpec->cipher_def->secret_key_size >= 10 &&

A user-defined CanFalseStart callback cannot read from ss->ssl3.cwSpec (it would have no way to get the SpecReadLock, for example). Ideally, the default implementation shouldn't do any check that a custom callback implementation couldn't do. There are two possible solutions to make things more consistent:

1. A custom callback can read (indirectly) from bulk_cipher_defs[ss->ssl3.hs.suite_def->bulk_cipher_alg] by calling SSL_GetChannelInfo/SSL_GetCipherSuiteInfo. So, using the value from bulk_cipher_defs[ss->ssl3.hs.suite_def->bulk_cipher_alg] would be keep this function symmetric and also to avoid needing to acquire and release the spec read lock here.

2. I do not think there is any disagreements amongst implementations that it is OK to require at least an 80 bit cipher for False Start. So, we could simply move the ss->ssl3.cwSpec->cipher_def->secret_key_size >= 10 check back to ssl3_CanFalseStart. That would also make it clearer why we're checking ss->ssl3.cwSpec in ssl3_FalseStart.

My preferred solution is #2.

@@ +6221,2 @@
>  
>      ssl_GetSpecReadLock(ss);

Similarly to the above, you do not

@@ +6224,5 @@
> +	 ss->sec.isServer ||
> +	 ss->ssl3.hs.isResuming ||
> +	 !ss->ssl3.cwSpec) {
> +	 /* These conditions don't allow false start to operate correctly, so
> +	  * never yield to the decision of a callback on them. */

The comment is misleading because "correctly" isn't really the issue in some of these cases. I suggest you just remove the comment.

@@ +6233,5 @@
> +	 return rv;
> +    }
> +
> +     if (!ss->canFalseStartCallback) {
> +	 rv = ssl3_BuiltinAreFalseStartConditionsMet(ss);

Nit: Why not just call SSL_CanFalseStart here? That would make the idea of SSL_CanFalseStart being the default implementation much more clear, at not much cost in terms of efficiency:

    status = SSL_CanFalseStart(ss->fd, &rv);

Then you could also fold ssl3_BuiltinAreFalseStartConditionsMet into SSL_CanFalseStart to make all this logic even clearer.

@@ +6241,5 @@
> +					      ss->canFalseStartCallbackData,
> +					      &rv);
> +	 if (status != SECSuccess) {
> +	    rv = PR_FALSE;
> +	 }

In order for the (SECStatus, PRBool) callback pattern to make sense, ssl3_CanFalseStart must use that pattern too, and then the caller must do a check like I explain below.

@@ +9463,5 @@
>      }
>  
>      ss->ssl3.hs.ws = idle_handshake;
>  
> +    /* Do the handshake callback for sslv3 here */

Nit: it is obvious what the code does even without the comment, so just remove it.

::: security/nss/lib/ssl/sslimpl.h
@@ +835,5 @@
>      DTLSTimerCb           rtTimerCb;       /* The function to call on expiry */
>      PRUint32              rtTimeoutMs;     /* The length of the current timeout
>  					    * used for backoff (in ms) */
>      PRUint32              rtRetries;       /* The retry counter */
> +    PRBool                cachedDidFalseStart;/* validity of didFalseStart */

This comment isn't correct. Often we're testing for didFalseStart without checking cachedDidFalseStart. A name like "calledCanFalseStart" with an appropriately-updated comment would be clearer.

@@ +838,5 @@
>      PRUint32              rtRetries;       /* The retry counter */
> +    PRBool                cachedDidFalseStart;/* validity of didFalseStart */
> +    PRBool                didFalseStart;   /* value of ssl3_CanFalseStart */
> +    PRBool                enoughFirstHsDone; /*enough of the handshake is done
> +                                               for false start */

s/for false start/for callbacks to be able to retrieve channel security parameters from callback functions/

enoughFirstHsDone needs to be reset to PR_FALSE in SSL_ResetHandshake.

@@ +1331,5 @@
>  
>  extern SECStatus ssl_EnableNagleDelay(sslSocket *ss, PRBool enabled);
>  
>  extern PRBool    ssl3_CanFalseStart(sslSocket *ss);
> +extern PRBool    ssl3_BuiltinAreFalseStartConditionsMet(sslSocket *ss);

can be removed if you follow my suggestion to fold it into SSL_CanFalseStart.

::: security/nss/lib/ssl/sslsecur.c
@@ +117,5 @@
> +		  ss->ssl3.hs.authCertificatePending)) {
> +		ss->firstHsDone         = PR_TRUE;
> +		ss->gs.writeOffset = 0;
> +		ss->gs.readOffset  = 0;
> +	    }

Please explain this part with a comment, because it is very unclear. (In fact, I haven't figured it out completely yet.)

@@ +1259,5 @@
>  	    if ((ss->ssl3.hs.ws == wait_change_cipher ||
>  		ss->ssl3.hs.ws == wait_finished ||
>  		ss->ssl3.hs.ws == wait_new_session_ticket) &&
> +		ss->ssl3.hs.didFalseStart) {
> +		if (ss->ssl3.hs.authCertificatePending) {

Why not add the !ss->ssl3.hs.authCertificatePending check to the beginning of ssl3_CanFalseStart? That would make it clear that we'll never false start unless/until we validated the certificate and it would remove the need to test didFalseStart and authCertificatePending together, because you'd never have the condition (didFalseStart && !authCertificatePending). To me, that would be much, much clearer.

Also, if we change it so that we check for  !ss->ssl3.hs.authCertificatePending in ssl3_CanFalseStart, then we can add a new guarantee to the false start callback function: the false start callback function will only be called if certificate validation succeeded, and only after certificate validation succeeded. This could be very useful, because, for example, a web browser might decide to use information about the certificate validation results in its false start callback. (In particular, I am thinking that in Gecko it would be a good idea to check that there was no cert error override in effect.)

::: security/nss/lib/ssl/sslsock.c
@@ +2302,5 @@
>  	return 0;	/* don't poll on this socket */
>      }
>  
> +    didFalseStart = ss->version >= SSL_LIBRARY_VERSION_3_0 &&
> +	ss->ssl3.hs.didFalseStart;

It seems like here you'd also want to test for !ss->ssl3.hs.authCertificatePending with your current patch.

@@ +2308,4 @@
>      if (ss->opt.useSecurity && 
>  	ss->handshaking != sslHandshakingUndetermined &&
> +	!ss->firstHsDone &&
> +	!didFalseStart &&

Do you know why is this needed now, but wasn't needed for Chromium?
Attachment #721453 - Flags: review?(bsmith) → review-
Comment on attachment 721453 [details] [diff] [review]
patch 3

Review of attachment 721453 [details] [diff] [review]:
-----------------------------------------------------------------

::: security/nss/lib/ssl/ssl3con.c
@@ +6221,2 @@
>  
>      ssl_GetSpecReadLock(ss);

Ignore the "Similarly to the above, you do not" comment.

@@ +6233,5 @@
> +	 return rv;
> +    }
> +
> +     if (!ss->canFalseStartCallback) {
> +	 rv = ssl3_BuiltinAreFalseStartConditionsMet(ss);

s/SSL_CanFalseStart/SSL_DefaultCanFalseStart/

(Or, maybe SSL_CanFalseStart is a better name. I just don't think SSL_NSSCanFalseStart makes sense as a name.)

@@ +6241,5 @@
> +					      ss->canFalseStartCallbackData,
> +					      &rv);
> +	 if (status != SECSuccess) {
> +	    rv = PR_FALSE;
> +	 }

Sorry, I expected to visit the code "below" in the review, but actually that is the only part of Google's False Start implementation you didn't change :).

To clarify the comment above (when reading in splinter), the code in ssl3_GatherCompleteHandshake should be changed to something like:

SECStatus srv;
...

srv = SECSuccess;
ssl_GetSSL3HandshakeLock(ss);
if (ss->ssl3.hs.ws == wait_change_cipher ||
    ss->ssl3.hs.ws == wait_new_session_ticket) {
        srv = ssl3_CanFalseStart(ss, &canFalseStart);
}
ssl_ReleaseSSL3HandshakeLock(ss);
if (srv != SECSuccess) {
    return SECFailure;
}
> 
> ::: security/nss/lib/ssl/sslsecur.c
> @@ +117,5 @@
> > +		  ss->ssl3.hs.authCertificatePending)) {
> > +		ss->firstHsDone         = PR_TRUE;
> > +		ss->gs.writeOffset = 0;
> > +		ss->gs.readOffset  = 0;
> > +	    }
> 
> Please explain this part with a comment, because it is very unclear. (In
> fact, I haven't figured it out completely yet.)

The issue here (and I have tried to adjust the comments that will be included in the patch update) is that secureSend() needs to get the handshake locks in order to figure out if it is doing false start with the authCertificate pending (and therefore suppress the writes). In cvs it only gets that lock if firstHsDone is false (so very early in a session) because that's the only time according to the cvs code that false start could be relevant  - then pending val hadn't yet come into play.. after firstHsDone is true it doesn't check (or get the lock) any more. 

by making the change to do1sthandshake the additional locking in secureSend() only gets done until ssl3_finishhandshake (via the restartTarget logic of the cert verification callback) is called and sets firstHsDone to true. 

An alternative would have been to just have secureSend() always get the lock and check (independent of firstHsDone). I was trying to keep the locks out of the routine path where the handshake is fully complete.

> 
> @@ +1259,5 @@
> >  	    if ((ss->ssl3.hs.ws == wait_change_cipher ||
> >  		ss->ssl3.hs.ws == wait_finished ||
> >  		ss->ssl3.hs.ws == wait_new_session_ticket) &&
> > +		ss->ssl3.hs.didFalseStart) {
> > +		if (ss->ssl3.hs.authCertificatePending) {
> 
> Why not add the !ss->ssl3.hs.authCertificatePending check to the beginning
> of ssl3_CanFalseStart?

I have done it this way because of your previous review comment 4:

"In particular, ssl3_CanFalseStart might return PR_FALSE the first time and then PR_TRUE the second time, e.g. if SSL_AuthCertificateComplete were to be called in between the two calls." 

ssl3_CanFalseStart is now meant to be consistent (at least across the period of time that is interesting) and of course *Pending is variable.

> Also, if we change it so that we check for 
> !ss->ssl3.hs.authCertificatePending in ssl3_CanFalseStart, then we can add a
> new guarantee to the false start callback function: the false start callback
> function will only be called if certificate validation succeeded, and only
> after certificate validation succeeded. 

to me, that seems considerably more invasive to the original way this was structured (in cvs) with canFalseStart being called off the GatherCompleteHandshake stack and there is a lot of logic using that point as the definitive signal for the false start logic. It seems a like a significant scope addition and if we can avoid it I think we should at least for this objective.

> 
> ::: security/nss/lib/ssl/sslsock.c

> @@ +2308,4 @@
> >      if (ss->opt.useSecurity && 
> >  	ss->handshaking != sslHandshakingUndetermined &&
> > +	!ss->firstHsDone &&
> > +	!didFalseStart &&
> 
> Do you know why is this needed now, but wasn't needed for Chromium?

It wasn't needed anymore - thanks for the catch. It was a dead end idea dealing with the don't-send-application-data-while-verification-is-incomplete issue. removed.
Attached patch patch 4 (obsolete) — Splinter Review
This took longer than I budgeted to revise - sorry about the delay.
Attachment #721453 - Attachment is obsolete: true
Attachment #725788 - Flags: review?(bsmith)
Attached patch patch 5 (obsolete) — Splinter Review
An out-of-band requirement came to my attention: keep existing use cases of false start (defined as those without custom canfalsestart callbacks) generating the handshake callback from ssl3_SendClientSecondRound() instead of ssl3_FinishHandshake() where I had moved it to. Users nss that set the custom canfalsestart callback can get notifications at two different points in time.

This update also fixes another unrelated assignment bug.
Attachment #725788 - Attachment is obsolete: true
Attachment #725788 - Flags: review?(bsmith)
Attachment #727318 - Flags: review?(bsmith)
Attached patch Patch 5 rebased to NSS hg repo (obsolete) — Splinter Review
Attachment #727318 - Attachment is obsolete: true
Attachment #727318 - Flags: review?(bsmith)
Attachment #732120 - Flags: review?(bsmith)
(In reply to Patrick McManus [:mcmanus] from comment #16)
> > 
> > ::: security/nss/lib/ssl/sslsock.c
> 
> > @@ +2308,4 @@
> > >      if (ss->opt.useSecurity && 
> > >  	ss->handshaking != sslHandshakingUndetermined &&
> > > +	!ss->firstHsDone &&
> > > +	!didFalseStart &&
> > 
> > Do you know why is this needed now, but wasn't needed for Chromium?
> 
> It wasn't needed anymore - thanks for the catch. It was a dead end idea
> dealing with the
> don't-send-application-data-while-verification-is-incomplete issue. removed.

I'm going to walk this explanation back. Code is needed here in the poll() implementation to help with false start - though that code was indeed wrong because it induced a spin for poll(write) while the auth certificate callback was outstanding. It can be done better.

The next version of this patch will have code back in sslsock to deal with it better. The problem is that while the handshake is not complete and last write is blocked nss suppresses PR_POLL_WRITE while it waits for incoming data to complete the handshake. (search for "don't select on write"). That logic can be tweaked to not suppress PR_POLL_WRITE if false start has been achieved and we aren't waiting for a cert verification callback.

Without research, I can only speculate on why Chromium doesn't need a similar adjustment. One possibility is that their http data send is event triggered off the handshake being complete, while ours is level triggered from poll(). We also drive the handshake through send(0) which might leave lastWriteBlocked in different state.

Regardless, I think that the ssl 'socket' should poll writable at this point due to the syllogism that a send(1) would succeed.
Attached patch patch 6 (obsolete) — Splinter Review
1: based on email feedback from bsmith: instead of explicitly blocking false started writes while an async certificate auth was outstanding do not begin the false start state until the auth completes.

2: take brians whitespace and comment improvements

3: change ssl3_gathercompletehandshake false start logic. Previously it tested false start at the bottom of the loop and broke out of that upon FS. We still do that for blocking sockets, but for non blocking sockets only test FS when we are returning WOULD_BLOCK (i.e. out of data without gathering the handshake). This has a couple of benefits: we stay out of false start logic and callbacks when the rest of the handshake happens to be sitting in the recv buffers, and it allows an iteration of this function that doesn't process any incoming data to still advance the false start state - this happens after a async cert validation is completed.

4: update the poll() logic so that the ssl socket poll()s as writable when false start is activated even though !firstHsDone (which is kinda the essence of false start).
Attachment #732120 - Attachment is obsolete: true
Attachment #732120 - Flags: review?(bsmith)
Attachment #736347 - Flags: review?(bsmith)
(In reply to Patrick McManus [:mcmanus] from comment #20)
> (In reply to Patrick McManus [:mcmanus] from comment #16)
> > > 
> > > ::: security/nss/lib/ssl/sslsock.c
> > 
> > > @@ +2308,4 @@
> > > >      if (ss->opt.useSecurity && 
> > > >  	ss->handshaking != sslHandshakingUndetermined &&
> > > > +	!ss->firstHsDone &&
> > > > +	!didFalseStart &&
> > > 
> > > Do you know why is this needed now, but wasn't needed for Chromium?
> > 
> > It wasn't needed anymore - thanks for the catch. It was a dead end idea
> > dealing with the
> > don't-send-application-data-while-verification-is-incomplete issue. removed.
> 
> I'm going to walk this explanation back. Code is needed here in the poll()
> implementation to help with false start - though that code was indeed wrong
> because it induced a spin for poll(write) while the auth certificate
> callback was outstanding. It can be done better.
> 
> The next version of this patch will have code back in sslsock to deal with
> it better. The problem is that while the handshake is not complete and last
> write is blocked nss suppresses PR_POLL_WRITE while it waits for incoming
> data to complete the handshake. (search for "don't select on write"). That
> logic can be tweaked to not suppress PR_POLL_WRITE if false start has been
> achieved and we aren't waiting for a cert verification callback.
> 
> Without research, I can only speculate on why Chromium doesn't need a
> similar adjustment. One possibility is that their http data send is event
> triggered off the handshake being complete, while ours is level triggered
> from poll(). We also drive the handshake through send(0) which might leave
> lastWriteBlocked in different state.

Correct. Chromium "lies" and says the socket is connected and ready. We don't use poll() triggering for this - our sockets directly invoke their callers' callbacks. The polling doesn't come into play until the underlying socket says its blocked on reading/writing...

> 
> Regardless, I think that the ssl 'socket' should poll writable at this point
> due to the syllogism that a send(1) would succeed.
Sleevi asked me to poke this thread and mention that we're disabling RC4 with False Start in https://codereview.chromium.org/13985023/.

This does mean that we lose False Start to Google properties, which is sad. Hopefully we can get the TLS 1.2 and AES-GCM patches reviewed and live soon in order to get that back.
See Also: → 681839
Comment on attachment 736347 [details] [diff] [review]
patch 6

Review of attachment 736347 [details] [diff] [review]:
-----------------------------------------------------------------

::: security/nss/lib/ssl/ssl3gthr.c
@@ +357,5 @@
> +	      * second round then consider whether false start can be used
> +	      * to send data while waiting
> +	      */ 
> +	     if (PORT_GetError() == PR_WOULD_BLOCK_ERROR) {
> +		 ssl_TestFalseStart(ss, &canFalseStart);

Here, there is no check of the return value of ssl_TestFalseStart. And...

@@ +398,5 @@
> +     */ 
> +    if (!ssl_SocketIsBlocking(ss) ||
> +	 (ssl_TestFalseStart(ss, &canFalseStart) != SECSuccess)) {
> +	 canFalseStart = PR_FALSE;
> +    }

I am having some trouble following this. I think what is happening is that, for a non-blocking socket, you want to prevent breaking out of the loop early here because you know that we will go through the loop again and break out of the loop in the place where ssl_TestFalseStart is called above. It is not clear why this is so important to do. I would think it would be sufficient to just make sure we go through this loop at least once per call. Could you explain this change?

I attacked a variant of your patch that does the false start decision making in a different place--withing ssl3_SendClientSecondRound. To me, that is a much more logical place to put it, and it also facilitates fixing bug 589047 in a straightforward way. But, I am not sure if I am overlooking something important that warrants this unusual control flow here.

::: security/nss/lib/ssl/sslimpl.h
@@ +837,5 @@
>  					    * used for backoff (in ms) */
>      PRUint32              rtRetries;       /* The retry counter */
> +    PRBool                calledFalseStart; /* true if ssl3_CanFalseStart has
> +					    * been called and set didFalseStart */
> +    PRBool                didFalseStart;   /* value of ssl3_CanFalseStart */

Nit: These variables are a little confusing in the naming. I think part of this confusion lies in the attempt to try to keep the false start logic similar to the way it has been done up to now, by putting the decision making in the ssl3gthr.c code. I attacked a modified version of your patch that tries to make sure that we only enter the decision making code a single time, so that we don't need calledFalseStart. But, otherwise, "canFalseStart" seems like a better name than "didFalseStart" because in some cases this variable can be set even when we don't false start (e.g. when the server's Finished message wins the race with certificate validation, for example).

@@ +842,5 @@
> +    PRBool                enoughFirstHsDone; /*enough of the handshake is done
> +					    * for callbacks to be able to
> +					    * retrieve channel security
> +					    * parameters from callback functions */
> +    PRBool                writableForFalseStart; /* true if the socket should

I am not sure why there needs to be two variables, writableForFalseStart and didFalseStart. Don't they always have the same value? If so, one variable should be sufficient. Otherwise, "pollWriteForFalseStart" would be a less confusing name; "writable" could mean that PR_Write/PR_Send should succeed, but that is what didFalseStart means (AFAICT).

::: security/nss/lib/ssl/sslinfo.c
@@ +46,5 @@
>  
>      if (ss->firstHsDone) {
>  	enoughFirstHsDone = PR_TRUE;
>      } else if (ss->version >= SSL_LIBRARY_VERSION_3_0 &&
> +	       ss->ssl3.hs.enoughFirstHsDone) {

We won't have the SSL3 handshake lock at this time. In fact, we won't have any locks. It would be clearer if enoughFirstHsDone was next to firstHsDone so that we can see that we haven't created any new locking issues and/or made any existing locking issues worse.

::: security/nss/lib/ssl/sslsock.c
@@ +2344,4 @@
>  		    */
> +
> +	     if(!(ss->version >= SSL_LIBRARY_VERSION_3_0 &&
> +		  ss->ssl3.hs.writableForFalseStart)) {

I don't think this issue affects Firefox, since we do all of our SSL networking on a single thread, but it might affect other applications:

ssl_Poll doesn't acquire any locks. In particular, it never acquires the SSL3 handshake lock. Yet, even before your patch, this function is accessing some variables that seem to be needing protection from various locks (most obviously: ss->ssl3.hs.restartTarget; less obviously: ss->lastWriteBlocked, ss->pendingBuf.len, etc.). But, ssl_Poll gets called very often (via PR_Poll on the socket) so it doesn't seem great to add a bunch more lock acquisitions. And, it seems like we've gotten away without the lock acquisitions so far.

I need to look at this locking aspect more carefully.
Attachment #736347 - Flags: review?(bsmith)
(In reply to Brian Smith (:bsmith) from comment #25)

> 
> Here, there is no check of the return value of ssl_TestFalseStart. And...
> 

It doesn't check the return value of ssl_TestFalseStart() because this path doesn't rely on its out parameter in any way (just side effects which it relies on not being set if it fails). So there is literally nothing to do differently based on the return value.

> I am having some trouble following this. I think what is happening is that,
> for a non-blocking socket, you want to prevent breaking out of the loop
> early here because you know that we will go through the loop again and break
> out of the loop in the place where ssl_TestFalseStart is called above. It is
> not clear why this is so important to do. I would think it would be
> sufficient to just make sure we go through this loop at least once per call.
> Could you explain this change?
> 

I think you've got the gist of it. The blocking vs non-blocking dichotomy required adding 2 ssl_testfalsestart() call sites and so it made sense to me to essentially label and use them that way. You're right that the second one could be used by both cases (though it wouldn't obviate non-blocking's need for the first).. not a big deal.
(In reply to Brian Smith (:bsmith) from comment #25)

> I am not sure why there needs to be two variables, writableForFalseStart and
> didFalseStart. 

This is a fine catch. They are left over from a previous iteration of the patch where didFalseStart could be true while the async cert validation was still outstanding. We've changed it so that is no longer possible and now these things are redundant.
(In reply to Patrick McManus [:mcmanus] from comment #27)
> (In reply to Brian Smith (:bsmith) from comment #25)
> 
> > I am not sure why there needs to be two variables, writableForFalseStart and
> > didFalseStart. 
> 
> This is a fine catch. 

I was wrong, its not redundant. I blame cobwebs of not dealing with this patch for a while for the wrong reply.

writableForFalseStart is used to make poll() drive the handshake through send(0) after the async cert verification is done. It might be possible at this time to do false start, but we haven't checked the full can false start callback yet because of the outstanding verification.

An alternative to the extra state variable that feels a little better is to test canFalseStart() off the successful cert verification callback directly - rather than setting the isWritable flag.

I'll make that update.
(In reply to Patrick McManus [:mcmanus] from comment #26)
> (In reply to Brian Smith (:bsmith) from comment #25)
> 
> > 
> > Here, there is no check of the return value of ssl_TestFalseStart. And...
> > 
> 
> It doesn't check the return value of ssl_TestFalseStart() because this path
> doesn't rely on its out parameter in any way (just side effects which it
> relies on not being set if it fails). So there is literally nothing to do
> differently based on the return value.

But, if the applications false start callback returned SECFailure, we must terminate the handshake, as that is what the application is expecting.
See Also: → 589047
(In reply to Brian Smith (:bsmith) from comment #25)
>
> ::: security/nss/lib/ssl/sslinfo.c
> @@ +46,5 @@
> >  
> >      if (ss->firstHsDone) {
> >  	enoughFirstHsDone = PR_TRUE;
> >      } else if (ss->version >= SSL_LIBRARY_VERSION_3_0 &&
> > +	       ss->ssl3.hs.enoughFirstHsDone) {
> 
> We won't have the SSL3 handshake lock at this time. In fact, we won't have
> any locks.

but writing it as:
ssl_getssl3handshakelock(ss)
enoughFirstHsDone = ss->ssl3.hs.enoughFirstHsDone;
ssl_releasessl3handshakelock(ss)

doesn't make any sense. you're protecting a variable that's an atomic assignment anyhow. I can add the lock if you like.

> It would be clearer if enoughFirstHsDone was next to firstHsDone
> so that we can see that we haven't created any new locking issues and/or
> made any existing locking issues worse.


I can't figure out what you are asking for here.
(In reply to Patrick McManus [:mcmanus] from comment #30)
> > It would be clearer if enoughFirstHsDone was next to firstHsDone
> > so that we can see that we haven't created any new locking issues and/or
> > made any existing locking issues worse.
> 
> 
> I can't figure out what you are asking for here.

I mean, put enoughFirstHsDone in the same structure as firstHsDone.
(In reply to Patrick McManus [:mcmanus] from comment #32)
> Created attachment 756196 [details] [diff] [review]
> TLS False Start updates for NSS

I haven't considered the proposed control changes in brian's attachment yet... other than those I think this brings the patch up to date with the review comments. call it WIP while I look at the attachment
Attachment #736347 - Attachment is obsolete: true
Attachment #756196 - Attachment is obsolete: true
Comment on attachment 754968 [details] [diff] [review]
WIP: Modified version of Patrick's patch that clarifies some things and attempts to address bug 589047

Review of attachment 754968 [details] [diff] [review]:
-----------------------------------------------------------------

This is nice and I've incorporated the basics of it into my patch. Thanks.

It won't resolve 589047, at least in a firefox sense, without some more serious necko changes.. I think necko would need to provide the first application bytes synchronously to the false start callback which it doesn't do right now.. but at least it could do that without more nss changes.
Attachment #754968 - Flags: feedback?(mcmanus) → feedback+
Comment on attachment 756605 [details] [diff] [review]
TLS False Start updates for NSS

This contains the two todos your example patch for driving the logic from sendclientsecondround introduced. I need your help in addressing those in short order.

one is about dtls. I've got no idea on that.

the other is about accessing pendingq.len with only the read lock held.. would simply obtaining the write lock in this case be dangerous (deadlockable?)?
Attachment #756605 - Flags: review?(bsmith)
(In reply to Patrick McManus [:mcmanus] from comment #36)
> Comment on attachment 756605 [details] [diff] [review]
> TLS False Start updates for NSS
> 
> This contains the two todos your example patch for driving the logic from
> sendclientsecondround introduced. I need your help in addressing those in
> short order.
> 
> one is about dtls. I've got no idea on that.
> 
> the other is about accessing pendingq.len with only the read lock held..
> would simply obtaining the write lock in this case be dangerous
> (deadlockable?)?

Please just remove the part of the patch that deals with bug 589047; i.e. just always pass 0 to ssl3_SendFinished. Yesterday, I figured out the correct thing to do, which addresses both of those TODOs. (See the patch I am attaching). But, we can do that in bug 589047, after we land this bug.
Attachment #754968 - Attachment is obsolete: true
re comment 37: I'm down with that.
Attached patch TLS False Start updates for NSS (obsolete) — Splinter Review
Attachment #756642 - Flags: review?(bsmith)
Attachment #756605 - Attachment is obsolete: true
Attachment #756605 - Flags: review?(bsmith)
Comment on attachment 756642 [details] [diff] [review]
TLS False Start updates for NSS

Review of attachment 756642 [details] [diff] [review]:
-----------------------------------------------------------------

::: security/nss/lib/ssl/ssl.h
@@ +128,5 @@
> + * cipher isn't broken this is safe. The advantage of False Start is that
> + * it saves a round trip for client-speaks-first protocols when performing a
> + * full handshake.
> + *
> + * See SSL_DefaultCanFalseStart for default criteria that NSS uses to determine

s/for/for the/

@@ +130,5 @@
> + * full handshake.
> + *
> + * See SSL_DefaultCanFalseStart for default criteria that NSS uses to determine
> + * whether to false start or not, and see SSL_SetCanFalseStartCallback to see
> + * how to change that criteria. In addition to those criteria, False start will

s/False start/false start/.

@@ +134,5 @@
> + * how to change that criteria. In addition to those criteria, False start will
> + * only be done when the server selects a cipher suite with an effective key
> + * length of 80 bits or more (including RC4-128). Also, see
> + * SSL_HandshakeCallback for a decsription on how false start affects when the
> + * handshake callback gets called. 

Trialing whitespace.

@@ +660,5 @@
>   */
>  SSL_IMPORT SECStatus SSL_InheritMPServerSIDCache(const char * envString);
>  
>  /*
> +** Set the callback on a that normally gets called when the TLS handshake

s/on a//

@@ +670,5 @@
> +** callback set, then the handshake callback gets called after the peer's
> +** Finished message has been verified, which may be after application data is
> +** sent.
> +**
> +** if false start is enabled and there is not a custom CanFalseStartCallback

s/if/If/

::: security/nss/lib/ssl/ssl3con.c
@@ +6185,5 @@
> +static SECStatus
> +ssl3_CheckFalseStart(sslSocket *ss)
> +{
> +    SECStatus status;
> +    PRBool rv = PR_TRUE;

"rv" should be used for return values. Choose another name like "maybeFalseStart." "rv" would be a better name for the what is now called "status."

@@ +6205,5 @@
> +    if (!ss->opt.enableFalseStart ||
> +	ss->sec.isServer ||
> +	ss->ssl3.hs.isResuming ||
> +	!ss->ssl3.cwSpec ||
> +	ss->ssl3.cwSpec->cipher_def->secret_key_size < 10) {

All of these conditions, except ss->ssl3.cwSpec->cipher_def->secret_key_size < 10, are redundant and can be removed. (ss->opt.enableFalseStart and !ss->sec.isServer are checked in ssl3_SendClientSecondRound, !ss->ssl3.hs.isResuming is implied by the fact that we're sending the second round, and ss->ssl3.cwSpec is implied because we're calling this function after we called ssl3_SendChangeCipherSpec.)

@@ +6206,5 @@
> +	ss->sec.isServer ||
> +	ss->ssl3.hs.isResuming ||
> +	!ss->ssl3.cwSpec ||
> +	ss->ssl3.cwSpec->cipher_def->secret_key_size < 10) {
> +	rv = PR_FALSE;

ss->ssl3.hs.canFalseStart = PR_FALSE;

@@ +6383,4 @@
>      }
>  
>      rv = ssl3_SendFinished(ss, 0);
> +

Remove extra blank line.

@@ +9491,5 @@
> +     * for the case where false start was done without a canFalseStartCallback.
> +     */
> +    if (ss->handshakeCallback != NULL &&
> +	!(ss->ssl3.hs.canFalseStart && !ss->canFalseStartCallback)) {
> +	(ss->handshakeCallback)(ss->fd, ss->handshakeCallbackData);        

trailing whitespace here.

::: security/nss/lib/ssl/ssl3gthr.c
@@ +369,5 @@
>  	    return ss->recvdCloseNotify ? 0 : rv;
>  	}
>  
>  	/* If we kicked off a false start in ssl3_HandleServerHelloDone, break
> +	out of this loop early without finishing the handshake.

Seems like an unintended change.

::: security/nss/lib/ssl/sslauth.c
@@ +78,5 @@
>  	*op = SSL_SECURITY_STATUS_OFF;
>      }
>  
> +    if (ss->firstHsDone ||
> +	ss->enoughFirstHsDone) {

This can be written on one line, more simply:

enoughFirstHsDone = ss->enoughFirstHsDone || ss->firstHsDone;

It should be sufficient to check ss->enoughFirstHsDone, by the definition given in the comment of that structure member. But, that would mean that we must ensure that ss->enoughFirstHsDone is set whenever we set ss->firstHsDone. In this case, you can dispense with the enoughFirstHsDone local variable and test ss->enoughFirstHsDone directly below.

So, we should either ensure that and then check ss->enoughFirstHsDone, or improve the comment for enoughFirstHsDone (perhaps we would just need to add "SSL3 only" to that comment?)

The same applies to the similar changes elsewhere in the patch.

::: security/nss/lib/ssl/sslimpl.h
@@ +832,5 @@
>      DTLSTimerCb           rtTimerCb;       /* The function to call on expiry */
>      PRUint32              rtTimeoutMs;     /* The length of the current timeout
>  					    * used for backoff (in ms) */
>      PRUint32              rtRetries;       /* The retry counter */
> +    PRBool                canFalseStart;   /* value of ssl3_CanFalseStart */

The function got renamed to ssl3_CheckFalseStart. Perhaps the comment should be, simply "Can/did we False Start?" or omitted.

@@ +1082,5 @@
>      /* State flags */
>      unsigned long    clientAuthRequested;
>      unsigned long    delayDisabled;       /* Nagle delay disabled */
>      unsigned long    firstHsDone;         /* first handshake is complete. */
> +    unsigned long    enoughFirstHsDone; /*enough of the handshake is done

space after the "*".

::: security/nss/lib/ssl/sslsecur.c
@@ +222,5 @@
>      ssl_ReleaseRecvBufLock(ss);
>  
>      ssl_GetSSL3HandshakeLock(ss);
> +    if (ss->version >= SSL_LIBRARY_VERSION_3_0) {
> +	ss->ssl3.hs.canFalseStart    =      PR_FALSE;

Just one space around the "=" please.

I do not think it makes sense to test ss->version here. Usually SSL_ResetHandshake is called before the first handshake even happens.

Also, notice that ss->ssl3.hs.* is initialized in ssl3_Init and destroyed in ssl3_DestroySSL3Info. I'm not sure why we're not calling ssl3_DestroySSL3Info here (which, among other things, sets ss->ssl3.initialized, which seems pretty important). I asked on the nss-dev mailing list about this, but it seems to me like the right thing to do is call ssl3_DestroySSl3Info here, and reset ss->ssl3.hs.canFalseStart and ss->ssl3.hs.restartTarget in ssl3_InitState.

@@ +356,5 @@
> +	return SECFailure;
> +    }
> +
> +    ssl_Get1stHandshakeLock(ss);
> +    ssl_GetSSL3HandshakeLock(ss);

I do not think the check of useSecurity or any of the locking is necessary here based on seeing SSL_AuthCertificateHook, SSL_GetClientAuthData, but I see you just copied it from SSL_HandshakeCallback so no need to change it.

@@ +376,5 @@
> +SSL_DefaultCanFalseStart(PRFileDesc *fd, PRBool *canFalseStart)
> +{
> +    sslSocket *ss;
> +    ss = ssl_FindSocket(fd);
> +    *canFalseStart = PR_FALSE;

Invert the order of the above two lines, so that the check of the return value of ssl_FindSocket comes immediately after the call, so we can see more clearly that we don't need to call PORT_SetError() in that block.

@@ +384,5 @@
> +	return SECFailure;
> +    }
> +
> +    if (ss->version <= SSL_LIBRARY_VERSION_3_0)
> +	return SECFailure;

I think it is worth checking also ss->ssl3.initialized, and you must call PORT_SetError(SEC_ERROR_NOT_INITIALIZED). (Something always needs to call PORT_SetError when returning SECFailure.)

@@ +387,5 @@
> +    if (ss->version <= SSL_LIBRARY_VERSION_3_0)
> +	return SECFailure;
> +
> +    /* Require a forward-secret key exchange. */
> +    *canFalseStart = (ss->ssl3.hs.kea_def->kea == kea_dhe_dss ||

Nits: Parentheses are unnecessary, and the "ss->ssl3.hs" choices should be aligned.

@@ +1261,5 @@
>  	if (ss->version >= SSL_LIBRARY_VERSION_3_0) {
>  	    ssl_GetSSL3HandshakeLock(ss);
>  	    if ((ss->ssl3.hs.ws == wait_change_cipher ||
>  		ss->ssl3.hs.ws == wait_finished ||
>  		ss->ssl3.hs.ws == wait_new_session_ticket) &&

Why are these checks of ss->ssl3.hs.ws are different than the similar checks in other parts of the code, which only check  for wait_change_cipher or wait_new_session_ticket?

It might be simpler to set ss->ssl3.hs.canFalseStart = PR_FALSE in ssl3_HandleFinished so that all these checks can be reduced to ss->ssl3.hs.canFalseStart without needing to check ss->ssl3.ws at all. Then we don't need to worry about any consistency.

::: security/nss/lib/ssl/sslsock.c
@@ +2339,3 @@
>  		    */
> +
> +		    if(!(ss->version >= SSL_LIBRARY_VERSION_3_0 &&

space between if and parenthesis.

@@ +2339,4 @@
>  		    */
> +
> +		    if(!(ss->version >= SSL_LIBRARY_VERSION_3_0 &&
> +			ss->ssl3.hs.canFalseStart)) {

Should we be checking ss->ssl3.hs.ws here? (See my previous comment.)
Attachment #756642 - Flags: review?(bsmith) → review-
(In reply to Brian Smith (:bsmith) from comment #40)

> ::: security/nss/lib/ssl/sslsecur.c
> @@ +222,5 @@
> >      ssl_ReleaseRecvBufLock(ss);
> >  
> >      ssl_GetSSL3HandshakeLock(ss);
> > +    if (ss->version >= SSL_LIBRARY_VERSION_3_0) {
> > +	ss->ssl3.hs.canFalseStart    =      PR_FALSE;
>

> I'm not sure why we're not calling
> ssl3_DestroySSL3Info here (which, among other things, sets
> ss->ssl3.initialized, which seems pretty important). I asked on the nss-dev
> mailing list about this, but it seems to me like the right thing to do is
> call ssl3_DestroySSl3Info here, and reset ss->ssl3.hs.canFalseStart and
> ss->ssl3.hs.restartTarget in ssl3_InitState.


I'm going to ask that you make the change to use destroyinfo/initstate in a different patch.

I did drop the version check and add a restartTarget = NULL to this point.
Attached patch TLS False Start updates for NSS (obsolete) — Splinter Review
Attachment #757547 - Flags: review?(bsmith)
Attachment #756642 - Attachment is obsolete: true
Comment on attachment 757547 [details] [diff] [review]
TLS False Start updates for NSS

Review of attachment 757547 [details] [diff] [review]:
-----------------------------------------------------------------

::: security/nss/lib/ssl/ssl.h
@@ +129,5 @@
> + * it saves a round trip for client-speaks-first protocols when performing a
> + * full handshake.
> + *
> + * See SSL_DefaultCanFalseStart for the default criteria that NSS uses to
> + * determine whether to false start or not. See SSL_SetCanFalseStartCallback 

trailing whitespace.

@@ +688,5 @@
> +** start can be used. If the callback does not return SECSuccess then the
> +** handshake will be canceled.
> +**
> +** Applications that do not set the callback will use an internal set of
> +** criteria to determine the result if the connection should false start. If

s/the result//

@@ +691,5 @@
> +** Applications that do not set the callback will use an internal set of
> +** criteria to determine the result if the connection should false start. If
> +** the callback is set false start will never be used without invoking the
> +** callback function, but some connections (e.g. resumed connections) will
> +** never use false start and therefore not invoke the callback.

s/not/will not/

::: security/nss/lib/ssl/ssl3con.c
@@ +6193,5 @@
> +
> +    if (ss->ssl3.hs.ws != wait_change_cipher &&
> +	ss->ssl3.hs.ws != wait_new_session_ticket) {
> +	return SECSuccess;
> +    }

These checks should be removed too. This state is assumed when we're in ssl3_SendClientSecondRound. (In fact, maybe this condition always false, since ssl3_SendClientSecondRound sets these states after calling ssl3_CheckFalseStart.)

@@ +9153,5 @@
>      if (rv != SECSuccess) {
>  	goto fail;	/* err code was set by ssl3_ComputeHandshakeHashes */
>      }
>  
> +    ss->ssl3.hs.canFalseStart = PR_FALSE; /* False Start phase is complete */

This should be done in ssl3_HandleFinished (near the place where restartTarget gets reset) or ssl3_FinishHandshake (near where ss->ssl3.hs.ws gets set to idle_handshake), not ssl3_SendFinished. In particular, ssl3_SendClientSecondRound will call ssl3_SendFinished right after we decided to false start. So, AFAICT, having this here would cancel false start before it even started.

@@ +9476,5 @@
>  
>      /* The first handshake is now completed. */
>      ss->handshake           = NULL;
>      ss->firstHsDone         = PR_TRUE;
> +    ss->enoughFirstHsDone    = PR_TRUE;

Alignment is off by one space.

::: security/nss/lib/ssl/sslsecur.c
@@ +400,5 @@
> +                     ss->ssl3.hs.kea_def->kea == kea_dhe_rsa ||
> +                     ss->ssl3.hs.kea_def->kea == kea_ecdhe_ecdsa ||
> +                     ss->ssl3.hs.kea_def->kea == kea_ecdhe_rsa;
> +
> +	return SECSuccess;

This line is indented too much.
Attachment #757547 - Flags: review?(bsmith) → review-
(In reply to Brian Smith (:bsmith) from comment #43)
> ::: security/nss/lib/ssl/ssl3con.c
> @@ +6193,5 @@
> > +
> > +    if (ss->ssl3.hs.ws != wait_change_cipher &&
> > +	ss->ssl3.hs.ws != wait_new_session_ticket) {
> > +	return SECSuccess;
> > +    }
> 
> These checks should be removed too. This state is assumed when we're in
> ssl3_SendClientSecondRound. 

> (In fact, maybe this condition always false,

thanks. defeinitely an oversight - it wasn't actually ever false when using async cert validation because checkfalsestart start effectively gets done after sendclientsecondround has completed. And actually that's why I had left that check in there - so that if it gets called after the handshake is complete it doesn't go setting hs.canFalseStart to true. Is that a concern?


> ::: security/nss/lib/ssl/sslsecur.c
> @@ +400,5 @@
> > +                     ss->ssl3.hs.kea_def->kea == kea_dhe_rsa ||
> > +                     ss->ssl3.hs.kea_def->kea == kea_ecdhe_ecdsa ||
> > +                     ss->ssl3.hs.kea_def->kea == kea_ecdhe_rsa;
> > +
> > +	return SECSuccess;
> 
> This line is indented too much.

this remains a mystery to me. if the new incantation (in the next patch) isn't to your liking just be precise about how it should look and you shall have it.
Attached patch TLS False Start updates for NSS (obsolete) — Splinter Review
Attachment #758635 - Flags: review?(bsmith)
Attachment #757547 - Attachment is obsolete: true
(In reply to Patrick McManus [:mcmanus] from comment #44)
> it wasn't actually ever false when using
> async cert validation because checkfalsestart start effectively gets done
> after sendclientsecondround has completed. And actually that's why I had
> left that check in there - so that if it gets called after the handshake is
> complete it doesn't go setting hs.canFalseStart to true. Is that a concern?

Good catch.

When the handshake is completed, ssl3_HandleFinished will be called to process the server's Finished message. At that time, we no longer need to false start, so we need to reset restartTarget so that ssl3_CheckFalseStart does not get called at all. (In particular, it doesn't make sense to call the can false start callback after the handshake has completed.) See the comment about this that was added at the place that restartTarget is set to ssl3_CheckFalseStart in ssl3_SendClientSecondRound.

I suggest this (in ssl3_HandleFinished):

+   /* Cancel false start check since we already completed the handshake */
+   if (ss->ssl3.hs.restartTarget == ssl3_CheckFalseStart) {
+       ss->ssl3.hs.restartTarget = NULL;
+   }

    if (ss->ssl3.hs.authCertificatePending) {
	if (ss->ssl3.hs.restartTarget) {
	    PR_NOT_REACHED("ssl3_HandleFinished: unexpected restartTarget");
	    PORT_SetError(SEC_ERROR_LIBRARY_FAILURE);
	    return SECFailure;
	}

	ss->ssl3.hs.restartTarget = ssl3_FinishHandshake;
	return SECWouldBlock;
    }

    rv = ssl3_FinishHandshake(ss);
    return rv;

Note that ssl3_FinishHandshake has an assertion that restartTarget == NULL, which is why I suggest this less-than-obvious change.

> > > +	return SECSuccess;
> > 
> > This line is indented too much.
> 
> this remains a mystery to me. if the new incantation (in the next patch)
> isn't to your liking just be precise about how it should look and you shall
> have it.

I just meant that the "return SECSuccess;" line should be indented with four spaces instead of a tab.
Attachment #758635 - Attachment is obsolete: true
Attachment #758635 - Flags: review?(bsmith)
Attached patch TLS False Start updates for NSS (obsolete) — Splinter Review
Attachment #758708 - Flags: review?(bsmith)
Comment on attachment 758708 [details] [diff] [review]
TLS False Start updates for NSS

Review of attachment 758708 [details] [diff] [review]:
-----------------------------------------------------------------

This patch technically works but the racing logic between the certificate authentication and the finished message doesn't work as expected. Basically, the client system may receive the change_cipher_spec/Finished messages from the peer, but we will never recv() them because restartTarget != NULL in ssl_Poll and in the gather logic causes PR_Poll to avoid doing any networking. Thus, with this patch we will false start 100% of the time, instead of only in the times where the certificate authentication won the race with the Finished message. However, this is already how the old false start logic worked so we are no worse off.

Also, I became concerned about the fact that in the case of applications like Chromium that don't use the new false start API this patch adds, we will call the HandshakeCallback before sending/flushing the client's finished message, whereas before that call was made *after* flushing the handshake messages. I know, in particular, that Chromium does unusual things with its NSPR I/O layers and the TLS state machine and callbacks. So, before we check this change into the NSS trunk, we should change the patch to make sure the HandshakeCallback gets called in the same place as before. This will result in a little bit of code duplication but it is worthwhile.

It is important that we start testing the Necko- and PSM- level logic for false start so that we can get started in verifying the compatibility issues. So, I am going to check in Patrick's patch to mozilla-central with the change I note below. Then, I will write a patch on top of Patrick's patch to improve the auth/Finished racing logic and to make the patch more suitable for use in Chromium and other products, for checkin to the NSS trunk.

::: security/nss/lib/ssl/ssl3con.c
@@ +9482,5 @@
>  	ss->ssl3.hs.cacheSID = PR_FALSE;
>      }
>  
>      ss->ssl3.hs.ws = idle_handshake;
> +    ss->ssl3.hs.canFalseStart = PR_FALSE; /* False Start phase is complete */

If we set canFalseStart = PR_FALSE here, the condition !(ss->ssl3.hs.canFalseStart && !ss->canFalseStartCallback) below becomes !(FALSE && !ss->canFalseStartCallback) == !(FALSE) == TRUE so we will call the handshake callback twice for applications that enable false start and don't use the new CanFalseStart API. I am going to fix this issue before I check the code into mozilla-central.
I landed this on mozilla-inbound for testing:
https://hg.mozilla.org/integration/mozilla-inbound/rev/f6bc026a0c36. I fixed the issue with canFalseStart being reset right before it is tested (the last issue I commented on in comment 48). See https://hg.mozilla.org/integration/mozilla-inbound/diff/f6bc026a0c36/security/nss/lib/ssl/ssl3con.c#l1.215

I will work on the fixes (probably) needed for Chromium compatibility soon.
https://hg.mozilla.org/mozilla-central/rev/f6bc026a0c36
Status: NEW → RESOLVED
Closed: 11 years ago
Resolution: --- → FIXED
You incorrectly marked this bug as resolved.

Nothing has been fixed in NSS, and the patch for mozilla-central was a testing patch, only...

In the future, please use the appropriate status whiteboard text to prevent people looking at inbound from incorrectly closing bugs.
Blocks: 898431
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Whiteboard: [leave open]
Target Milestone: --- → 3.15.2
In https://hg.mozilla.org/mozilla-central/rev/f6bc026a0c36,
Brian added the PSM patch (for bug 658222) to security/patches
by mistake. We should add the NSS patch (for this bug) instead.
Attachment #784493 - Flags: superreview?(brian)
Attachment #784493 - Flags: review?(mcmanus)
Comment on attachment 784493 [details] [diff] [review]
Add the NSS patch (rather than the PSM patch) to security/patches

Pushed to mozilla-inbound:
https://hg.mozilla.org/integration/mozilla-inbound/rev/eccb3b98dadb
What is the plan for this patch after the merge to aurora? It currently breaks building against a pristine NSS.
Attachment #784493 - Flags: superreview?(brian) → superreview+
Patrick: Please try out this version of the patch. You can do so by first doing a 
"python client.py update_nss default" and then applying the patch.

There are two main differences in this version:
1. The false start callback will only get called if we're actually going to false start; if the server's finished message comes back before the certificate authentication finishes, then the false start callback will not be called. This means that the number of times the false start callback is called / number of connections is now an accurate count of how many connections we false start on.

2. The patch is more conservative in how it calls the handshake callback for applications that enable false start but do not use the new CanFalseStart callback. I suspect that this version should not have any problems with the unusual things that Chromium does with its IO layers underneath libssl.

Wan-Teh: I believe that it should be very simple to move the logic from Chromium's AuthCertificate hook to a CanFalseStart callback, which should remove the need (IIRC) for one of your private patches in the Chromium source code.

Mike: I believe that this version of the patch is the one that needs to land in NSS 3.15.2. As soon as Patrick tests it out, I will ask Wan-Teh to review it. I will try to get it landed in NSS upstream ASAP and also I will pull in the NSS upstream as soon as it has this patch.
Attachment #785796 - Flags: feedback?(mcmanus)
I propose to backout this code from aurora 24 today, because later today the code will transition into beta 24. Beta 24 will probably get lots of attention, because it will be the new ESR release.

It seems this new code isn't required by current aurora 24.

This bug blocks bug 658222 which targets ff25. It seems the PSM code from bug 658222 hasn't landed into ff25.
I don't see any new tests in this patch for the NSS test suite.

The last time we had added new TLS extensions to NSS (OCSP stapling) we were very strict about requiring full tests, including having ability to automatically test the server side of the new TLS protocol extensions.

Should we be similarly strict for TLS false start, and require that automated NSS tests get added to NSS at the time this enhancement gets landed into NSS?
(In reply to Kai Engert (:kaie) from comment #57)
> I propose to backout this code from aurora 24 today, because later today the
> code will transition into beta 24. Beta 24 will probably get lots of
> attention, because it will be the new ESR release.

Sorry, not necessary, aurora 24 is not yet using this code, I misunderstood Mike's comment 55.

The earliest version carrying this NSS change is 25, which will become aurora tomorrow.
(In reply to Kai Engert (:kaie) from comment #58)
> I don't see any new tests in this patch for the NSS test suite.
> 
> The last time we had added new TLS extensions to NSS (OCSP stapling) we were
> very strict about requiring full tests, including having ability to
> automatically test the server side of the new TLS protocol extensions.
> 
> Should we be similarly strict for TLS false start, and require that
> automated NSS tests get added to NSS at the time this enhancement gets
> landed into NSS?

False Start tests were added a while ago to sslstress.txt. However, subsequent changes have (unintentionally) made those tests useless.

When I added the async cert validation mechanism (SSL_AuthCertificateComplete), we decided to stress-test the async mechanism by using it by default in tstclnt. Because the async cert validation code, before the patch in this bug, disabled false start even if the application asked for it, it effectively caused us to stop testing false start. So, this patch here actually helps get us to the point where the tests are working.

However, after that we changed the behavior of libssl so that libssl requires ephemeral key exchange for false start. However, all the tests for false start in sslstress.txt are using non-ephemeral key exchange. So, we should change those tests so that they use ECDHE and DHE.

For completeness, we could also add an option to tstclnt to make it use a custom CanFalseStart callback. Then we could test the non-ephemeral key exchange ciphersuites with false start again, and this would give us the ability to test both code paths that this bug implemented (with a CanFalseStart callback, and without). I am not sure this is strictly necessary. I will get to it if I have time.
(In reply to Brian Smith (:briansmith), was bsmith@mozilla.com (:bsmith) from comment #56)
> Patrick: Please try out this version of the patch. You can do so by first
> doing a "python client.py update_nss default" and then applying the patch.

Also, FYI, you can set SSLTRACE=3 (and SSLDEBUGFILE=log.txt) to see the logging. Search for "race" in the log; you will see "certificate authentication lost the race with peer's finished message" a lot for https://paypal.com and "certificate authentication won the race with peer's finished message" a lot for https://google.com. (Lots of OCSP requests vs. zero OCSP requests.)
FYI it (In reply to Brian Smith (:briansmith), was bsmith@mozilla.com (:bsmith) from comment #56)
> Created attachment 785796 [details] [diff] [review]
> Patrick's patch modified to be a more conservative change
> 
> Patrick: Please try out this version of the patch. You can do so by first
> doing a 
> "python client.py update_nss default" and then applying the patch.


OK but FYI it will be next week before I can test

> 
> There are two main differences in this version:
> 1. The false start callback will only get called if we're actually going to
> false start; if the server's finished message comes back before the
> certificate authentication finishes, then the false start callback will not
> be called. This means that the number of times the false start callback is
> called / number of connections is now an accurate count of how many
> connections we false start on.
> 
> 2. The patch is more conservative in how it calls the handshake callback for
> applications that enable false start but do not use the new CanFalseStart
> callback. I suspect that this version should not have any problems with the
> unusual things that Chromium does with its IO layers underneath libssl.
> 
> Wan-Teh: I believe that it should be very simple to move the logic from
> Chromium's AuthCertificate hook to a CanFalseStart callback, which should
> remove the need (IIRC) for one of your private patches in the Chromium
> source code.
> 
> Mike: I believe that this version of the patch is the one that needs to land
> in NSS 3.15.2. As soon as Patrick tests it out, I will ask Wan-Teh to review
> it. I will try to get it landed in NSS upstream ASAP and also I will pull in
> the NSS upstream as soon as it has this patch.
Attachment #784493 - Flags: review?(mcmanus) → review+
Attachment #758708 - Flags: review?(brian)
Comment on attachment 785796 [details] [diff] [review]
Patrick's patch modified to be a more conservative change

Review of attachment 785796 [details] [diff] [review]:
-----------------------------------------------------------------

I've tested this out and it appears to behave the same in my use cases as the previous ones.. I'm ok with it.
Attachment #785796 - Flags: feedback?(mcmanus) → feedback+
Attached patch false-start-async.patch (obsolete) — Splinter Review
This is the final version of the patch. I believe this version should work just fine in Chromium though I haven't built/run Chromium to find out.
Attachment #756637 - Attachment is obsolete: true
Attachment #758708 - Attachment is obsolete: true
Attachment #785796 - Attachment is obsolete: true
Attachment #803518 - Flags: review?(wtc)
Comment on attachment 803518 [details] [diff] [review]
false-start-async.patch

Review of attachment 803518 [details] [diff] [review]:
-----------------------------------------------------------------

Brian:

When applying the patch to the Chromium source tree, I got two merge conflicts,
which caused me to notice two problems.

::: lib/ssl/ssl3con.c
@@ +7047,5 @@
> +		SSL_TRC(3, ("%d: SSL3[%p]: defering false start check because"
> +			    " certificate authentication is still pending.",
> +			    SSL_GETPID(), ss->fd));
> +	    }
> +	}

Did you mean to put this block of code (lines 7026-7051) inside the
"if (!ss->firstHsDone)" block?

This means this block of code will be skipped in a renegotiation.
We need to skip the ssl3_SendNextProto() call in a renegotiation,
but it seems that this block of code should not care whether the
handshake is the initial handshake or a renegotiation.

::: lib/ssl/sslreveal.c
@@ +95,2 @@
>    /* according to public API SSL_GetChannelInfo, this doesn't need a lock */
> +  if (sslsocket->opt.useSecurity && sslsocket->enoughFirstHsDone) {

Here I want to go further and remove "&& sslsocket->enoughFirstHsDone",
so that SSL_HandshakeNegotiatedExtension can be called at any time
during the handshake. See bug 681839.

I need this to support TLS Channel ID (formerly called origin-bound
certificates).
(In reply to Wan-Teh Chang from comment #65)
> ::: lib/ssl/ssl3con.c
> @@ +7047,5 @@
> > +		SSL_TRC(3, ("%d: SSL3[%p]: defering false start check because"
> > +			    " certificate authentication is still pending.",
> > +			    SSL_GETPID(), ss->fd));
> > +	    }
> > +	}
> 
> Did you mean to put this block of code (lines 7026-7051) inside the
> "if (!ss->firstHsDone)" block?
> 
> This means this block of code will be skipped in a renegotiation.
> We need to skip the ssl3_SendNextProto() call in a renegotiation,
> but it seems that this block of code should not care whether the
> handshake is the initial handshake or a renegotiation.

During a renegotiation, we know we will not false start, so we shouldn't call the false start callback. This helps because the application can count the number of times it returned "yes, go ahead and false start" from its false start callback to accurately count how many times false start happens: the number of "yes" results from the false start callback divided by the number of calls to the handshake callback == the number of successful false starts.

Also, note that the comment says:

		/* The certificate authentication and the server's Finished
		 * message are racing each other. If the certificate
		 * authentication wins, then we will try to false start in
		 * ssl3_AuthCertificateFinished.
		 */

The racing doesn't happen (because it can't happen safely) on renegotiations. The case of async cert verification during renegotiation is dealt with in this code near the top of the function:

    if (ss->ssl3.hs.authCertificatePending &&
	(sendClientCert || ss->ssl3.sendEmptyCert || ss->firstHsDone)) {
	ss->ssl3.hs.restartTarget = ssl3_SendClientSecondRound;
	return SECWouldBlock;
    }


> ::: lib/ssl/sslreveal.c
> @@ +95,2 @@
> >    /* according to public API SSL_GetChannelInfo, this doesn't need a lock */
> > +  if (sslsocket->opt.useSecurity && sslsocket->enoughFirstHsDone) {
> 
> Here I want to go further and remove "&& sslsocket->enoughFirstHsDone",
> so that SSL_HandshakeNegotiatedExtension can be called at any time
> during the handshake. See bug 681839.
> 
> I need this to support TLS Channel ID (formerly called origin-bound
> certificates).

OK. I will make this change.
Comment on attachment 803518 [details] [diff] [review]
false-start-async.patch

I converted this patch to a Chromium code review:
https://codereview.chromium.org/23621040/

I then tested this patch with Chromium.

The backward-compatible mode (no custom CanFalseStartCallback) works with
Chromium.

When I changed Chromium to install a custom CanFalseStartCallback that
requires the NPN or ALPN extension for False Start, I got a libSSL locking
order assertion failure in SSL_HandshakeNegotiatedExtension. I can fix that
by releasing XmitBufLock before calling ssl3_CheckFalseStart. Another fix
would be to call ssl3_CheckFalseStart after ssl3_SendFinished, but I don't
know if that's desirable.

Here is my fix, with two other typo fixes:
https://codereview.chromium.org/23621040/diff2/7001:16001/net/third_party/nss/ssl/ssl3con.c
On second thought, it seems wrong to release XmitBufLock
in the middle of several ssl3_SendXXX calls. Brian, is it
OK to call ssl3_CheckFalseStart after ssl3_SendFinished?

On the other hand, if ssl3_SendNextProto leaves the XmitBuf
in a consistent state, then it should be OK to release
XmitBufLock.
No update on this bug during the previous 6 days, and mozilla 25 is already in the final testing phase (beta).

May I kindly ask that you please work towards a quick a pragmatic solution, so that 3.15.2 can be released quickly and that mozilla 25 won't ship with a forked NSS?
Comment on attachment 803518 [details] [diff] [review]
false-start-async.patch

Review of attachment 803518 [details] [diff] [review]:
-----------------------------------------------------------------

Brian: I've read this patch carefully. In general it looks good. I suggest
the following changes.

1. ssl3con.c
https://codereview.chromium.org/23621040/diff2/7001:36001/net/third_party/nss/ssl/ssl3con.c?column_width=80

2. sslimpl.h
https://codereview.chromium.org/23621040/diff2/7001:36001/net/third_party/nss/ssl/sslimpl.h?column_width=80

3. sslsecur.c
https://codereview.chromium.org/23621040/diff2/7001:36001/net/third_party/nss/ssl/sslsecur.c?column_width=80

4. sslsock.c
https://codereview.chromium.org/23621040/diff2/7001:36001/net/third_party/nss/ssl/sslsock.c?column_width=80

Let me know if you have questions about any suggested changes.

Also, in either this patch or in a future patch, Adam and Ryan think
it is OK to make False Start require a custom canFalseStartCallback.
This will allow you to remove the backward compatible mode.

This means to get False Start, one will not only need to enable the
SSL_ENABLE_FALSE_START option but also register a canFalseStartCallback
(which can be SSL_DefaultCanFalseStartCallback). Existing code will
silently fail to False Start, and therefore won't notice whether
handshakeCallback for False Start would be called at False Start time
or true handshake completion time. So, please think about this. Again,
this can be done in a future patch.
Attachment #803518 - Flags: review?(wtc) → review-
I am making the changes wtc asked for now. I expect to have a patch by EOD.
Assignee: mcmanus → brian
I ran out of time for testing this patch before I had to leave for the day.

Wan-Teh: In these patches, there is a reason for this change:

-   PRINT_BUF(3, (ss, "Send record (plain text)", pIn, nIn));
+   PRINT_BUF(50, (ss, "Send record (plain text)", pIn, nIn));

If you run with SSLTRACE=3 without this change, then these PRINT_BUF calls obscure the output. With this change, it is much easier to see the output of the SSL_TRC calls I added.

I believe that it is actually a bug that PRINT_BUF calls use 3 as the argument in the first place. IIRC, when I originally made this change, I compared this call to other similar calls in libssl to determine that 50 is the correct number.

Anyway, over the weekend I will complete the testing of this change.
Attachment #803518 - Attachment is obsolete: true
(In reply to Brian Smith (:briansmith, was :bsmith@mozilla.com) from comment #72)
>
> Wan-Teh: In these patches, there is a reason for this change:
> 
> -   PRINT_BUF(3, (ss, "Send record (plain text)", pIn, nIn));
> +   PRINT_BUF(50, (ss, "Send record (plain text)", pIn, nIn));

Thank you for the explanation. You are right, the other three PRINT_BUF calls in
the same function (for printing encrypted record data) use 50 as the argument.
Comment on attachment 808061 [details] [diff] [review]
WIP: false-start-async with Googlers' review comment addressed

Review of attachment 808061 [details] [diff] [review]:
-----------------------------------------------------------------

::: lib/ssl/ssl.h
@@ +674,5 @@
> +** connection should use false start or not. SECSuccess indicates that the
> +** callback completed successfully, and if so *canFalseStart indicates if false
> +** start can be used. If the callback does not return SECSuccess then the
> +** handshake will be canceled. NSS's recommended criteria can be evaluated by
> +** calling SSL_RecommendedCanFalseStart from the custom callback.

I am afraid that we will need to document the recommended criteria
so that an application can decide what additional or alternative
criteria they need to use.

Note: AGL also suggested just dropping support for apps that expect
the handshakeCallback to be called at False Start time because he
thinks Chromium is the only such app. If that will allow you to
simplify the code further, you can consider that.

::: lib/ssl/sslimpl.h
@@ +1131,5 @@
>      unsigned long    clientAuthRequested;
>      unsigned long    delayDisabled;       /* Nagle delay disabled */
>      unsigned long    firstHsDone;         /* first handshake is complete. */
> +    unsigned long    enoughFirstHsDone;   /* enough of the first handshake is
> +					   * donefor callbacks to be able to

donefor => done for
Attached patch false-start-async.patch (obsolete) — Splinter Review
This patch makes the changes that you suggested with the following exceptions:

1. I left the PRINT_BUF(50) change since we both agree that it is correct and because it helps with inspecting the false start logic by reducing noise.
2. Instead of setting enoughFirstHsDone in ssl3_SendChangeCipherSpec, I set it in ssl3_SendClientSecondRound after the ssl3_SendChangeCipherSpec call. This way we avoid making any changes for the case where we are the server. I think it is a good idea to minimize changes to the server role.

3. I did not document the criteria used for SSL_RecommendedCanFalseStart. I think we should have a discussion about what the recommended criteria should actually be, before we document them. Also, I think we should reserve some flexibility for changing the criteria. If an application wants to use fixed criteria than they can do so by hard-coding that criteria into their false start callback.
Attachment #808061 - Attachment is obsolete: true
Attachment #808721 - Flags: review?(wtc)
Since I made the change to make a false start callback mandatory, tstclnt and strsclnt need to be updated to have a callback.

Also, I added some testing to ensure that issues like the xmit lock ordering issue are detected during tests. Note that I made the SECU_CheckSSLInfoFunctions generic so that it could be called within auth certificate hooks, bad cert hooks, and client auth data hooks too. However, I did not modify tstclnt and strsclnt to do that because it would require redoing how those programs keep track of state to keep track of whether the current handshake is a renegotiation or not.
Attachment #808725 - Flags: review?(wtc)
> @@ -103,20 +103,22 @@ ssl_Do1stHandshake(sslSocket *ss)
>  
>  	    SSL_TRC(3, ("%d: SSL[%d]: handshake is completed",
>  			SSL_GETPID(), ss->fd));
>              /* call handshake callback for ssl v2 */
>  	    /* for v3 this is done in ssl3_HandleFinished() */
>  	    if ((ss->handshakeCallback != NULL) && /* has callback */
>  		(!ss->firstHsDone) &&              /* only first time */
>  		(ss->version < SSL_LIBRARY_VERSION_3_0)) {  /* not ssl3 */
> -		ss->firstHsDone     = PR_TRUE;
> +		ss->firstHsDone = PR_TRUE;
> +		ss->enoughFirstHsDone = PR_TRUE;
>  		(ss->handshakeCallback)(ss->fd, ss->handshakeCallbackData);
>  	    }
> -	    ss->firstHsDone         = PR_TRUE;
> +	    ss->firstHsDone = PR_TRUE;
> +	    ss->enoughFirstHsDone = PR_TRUE;
>  	    ss->gs.writeOffset = 0;
>  	    ss->gs.readOffset  = 0;

:(. This is wrong.

First, there is the minor issue that we shouldn't be outputing "handshake is completed" in the case we false started.

However, there is a more significant problem: It is wrong to set ss->firstHsDone and ss->enoughFirstHsDone here in the false start case. ss->firstHsDone must only be set by ssl3_FinishHandshake.

I discovered this badness when updating the sslstress.txt tests and by adding some more SSL_TRC statements.
Based on discussions in other bugs, I assume the new target release of this bug is NSS 3.15.3?
(In reply to Kai Engert (:kaie) from comment #78)
> Based on discussions in other bugs, I assume the new target release of this
> bug is NSS 3.15.3?

Yes.
Target Milestone: 3.15.2 → 3.15.3
Priority: -- → P1
Is there an ETA on a NSS release with this landed?
Attached patch false-start-async.patch (obsolete) — Splinter Review
Attachment #808721 - Attachment is obsolete: true
Attachment #808721 - Flags: review?(wtc)
Attachment #814807 - Flags: review?(wtc)
Attached patch remove-ssl3_AlwaysBlock.patch (obsolete) — Splinter Review
This patch removes ssl3_AlwaysBlock. I believe that this is effectively dead code, and it makes the usage of ss->handshake in the SSL3+ case clearer if it is removed.
Attachment #814809 - Flags: review?(wtc)
The stress tests currently aren't actually test any false start code paths anymore because the default false start criteria require ephemeral key exchange, and non of the stress tests have false start enabled and ECC cipher suites enabled at the same time. These cases only test the synchronous path. I am planning to write some GTests to cover the asynchronous paths, but it will be a while before I am able to contribute the GTest integration into NSPR/NSS.
Attachment #815732 - Flags: review?(wtc)
Comment on attachment 814809 [details] [diff] [review]
remove-ssl3_AlwaysBlock.patch

Review of attachment 814809 [details] [diff] [review]:
-----------------------------------------------------------------

::: lib/ssl/ssl3con.c
@@ +6801,5 @@
>  	rv = SECFailure; /* force it to send a no_certificate alert */
>      }
>      switch (rv) {
> +    case SECWouldBlock:
> +	errCode = PR_NOT_IMPLEMENTED_ERROR;

I agree that ss->getClientAuthData cannot return SECWouldBlock right
now because the SSL_RestartHandshakeAfterCertReq doesn't work. But I
have a patch that makes SSL_RestartHandshakeAfterCertReq work. I used
the patch to make origin-bound certificates work in Chromium.

Do you think that we can support a non-blocking ss->getClientAuthData
without using ssl3_SetAlwaysBlock? You have more experience in this
area.
(In reply to Wan-Teh Chang from comment #84)
> ::: lib/ssl/ssl3con.c
> @@ +6801,5 @@
> >  	rv = SECFailure; /* force it to send a no_certificate alert */
> >      }
> >      switch (rv) {
> > +    case SECWouldBlock:
> > +	errCode = PR_NOT_IMPLEMENTED_ERROR;
> 
> I agree that ss->getClientAuthData cannot return SECWouldBlock right
> now because the SSL_RestartHandshakeAfterCertReq doesn't work. But I
> have a patch that makes SSL_RestartHandshakeAfterCertReq work. I used
> the patch to make origin-bound certificates work in Chromium.
> 
> Do you think that we can support a non-blocking ss->getClientAuthData
> without using ssl3_SetAlwaysBlock? You have more experience in this
> area.

I made this modification because the code changes ss->handshake to ssl3_AlwaysBlock and I wanted to make it as clear as possible that setting ss->handshake = 0 in the false start case would not cause a problem, especially since SSL_RestartHandshakeAfterCertReq doesn't work anyway.

If I were to try to make SSL_RestartHandshakeAfterCertReq work, I would follow the model of the asynchronous peer cert verification logic:

  * Add PRBool peerCertDataPending next to PRBool authCertificatePending;
  * Set peerCertDataPending when the callback returns SECWouldBlock
  * in ssl3_SendClientSecondRound, change:

    if (ss->ssl3.hs.authCertificatePending &&
	(sendClientCert || ss->ssl3.sendEmptyCert || ss->firstHsDone)) {

    to:

    if (ss->ssl3.hs.clientAuthDataPending ||
        (ss->ssl3.hs.authCertificatePending &&
	 (sendClientCert || ss->ssl3.sendEmptyCert || ss->firstHsDone))) {

  * Make SSL_RestartHandshakeAfterCertReq like
    SSL_AuthCertificateComplete.

Note that this is simpler because other parts of libssl that need to know about pending asynchronous operations are testing ss->ssl3.hs.restartTarget.

Maybe you implemented the fix for SSL_RestartHandshakeAfterCertReq in a different way that still relies on setting ss->handshake to ssl3_AlwaysBlock. I believe that way probably works as long as the application isn't doing the asynchronous peer cert verification. However, I am not sure that it works when the application is doing the asynchronous peer cert verification.

I wish I could remember more clearly why I didn't use the ss->handshake = ssl3_AlwaysBlock technique for the SSL_AuthCertificateComplete patch. I believe one reason is that I wanted the server's second round to race against the asynchronous cert verification, and ss->handshake = ssl3_AlwaysBlock did not allow that. I am not sure if there were other reasons.
Comment on attachment 814807 [details] [diff] [review]
false-start-async.patch

Review of attachment 814807 [details] [diff] [review]:
-----------------------------------------------------------------

r=wtc.

Brian: I reviewed this patch at https://codereview.chromium.org/27254004/.
I have some questions and suggested changes. Please let me know if you want
me to transfer my review comments here.
Attachment #814807 - Flags: review?(wtc) → review+
Can this get checked in now?

We are approaching the next round of Firefox merges (Tuesday) and we should work on creating the NSS 3.15.3 release and uplifting NSS 3.15.3 to mozilla-aurora, prior to Tuesday.
I have been waiting for a new patch from Brian.
Attached patch false-start-async.patch (obsolete) — Splinter Review
I will comment in the chromium code review about the changes I made.
Attachment #821955 - Flags: review?(wtc)
Attachment #814807 - Attachment is obsolete: true
Attachment #814809 - Attachment is obsolete: true
Attachment #814809 - Flags: review?(wtc)
Attachment #821957 - Flags: review?(wtc)
No longer blocks: 898431
Attached patch false-start-async.patch (obsolete) — Splinter Review
This version of the patch addresses the review comments from the Chromium code review.
Attachment #784493 - Attachment is obsolete: true
Attachment #821955 - Attachment is obsolete: true
Attachment #821955 - Flags: review?(wtc)
Attachment #824137 - Flags: review?(wtc)
Wan-Teh, will you be able to review the testing-related patches in this bug soon too? Currently, we don't have any test coverage in the NSS test suite for False Start because all the existing tests are for non-ephemeral key exchange and because tstclnt and strsclnt don't register a false start callback.
Flags: needinfo?(wtc)
Comment on attachment 824137 [details] [diff] [review]
false-start-async.patch

Review of attachment 824137 [details] [diff] [review]:
-----------------------------------------------------------------

r=wtc. I uploaded this patch as patch set 4 at
https://codereview.chromium.org/27254004/ and reviewed it there.
Attachment #824137 - Flags: review?(wtc) → review+
This is the version of the patch that I checked in. It addresses the latest review comments in the Chromium code review.
Attachment #824137 - Attachment is obsolete: true
Attachment #825813 - Flags: checked-in+
Is this bug fixed?

Can the old attachments marked as obsolete (and r? dropped) ?
Could this work have caused the assertion failures reported in bug 934378?

Assertion failure: ss->ssl3.hs.ws == wait_new_session_ticket || ss->ssl3.hs.ws == wait_change_cipher || ss->ssl3.hs.ws == wait_finished, at ssl3con.c:9967
(In reply to Kai Engert (:kaie) from comment #96)
> Could this work have caused the assertion failures reported in bug 934378?

Indeed, your code added that assertion. Adding dependency.
Depends on: 934378
(In reply to Kai Engert (:kaie) from comment #95)
> Is this bug fixed?
> 
> Can the old attachments marked as obsolete (and r? dropped) ?

All of those patches should be reviewed and checked in. They are for coverage testing this code.

(In reply to Kai Engert (:kaie) from comment #97)
> (In reply to Kai Engert (:kaie) from comment #96)
> > Could this work have caused the assertion failures reported in bug 934378?
> 
> Indeed, your code added that assertion. Adding dependency.

Indeed, apparently we need even more coverage testing. Thanks for letting me know about this bug.
changing target milestone to 3.15.4
Target Milestone: 3.15.3 → 3.15.4
Brian: please feel free to reject this patch.

I suggested this change during code review. I am not
sure if you prefer to declare all variables at the
beginning of a function.
Attachment #833259 - Flags: review?(brian)
Flags: needinfo?(wtc)
Attachment #833259 - Flags: review?(brian) → review+
Not sure if this fixes any bug, but this matches the
original code better.
Attachment #8338217 - Flags: review?(brian)
Comment on attachment 8338217 [details] [diff] [review]
Handle both ssl3_HandleRecord calls in ssl3_GatherCompleteHandshake

Review of attachment 8338217 [details] [diff] [review]:
-----------------------------------------------------------------

::: lib/ssl/ssl3gthr.c
@@ +373,5 @@
>  	}
>  	if (rv < 0) {
>  	    return ss->recvdCloseNotify ? 0 : rv;
>  	}
> +	if (ss->gs.buf.len > 0) {

Note that I omitted the rv == (int) SECSuccess test.
I assume it is not necessary here because we know 
rv < 0 is false.
Comment on attachment 833259 [details] [diff] [review]
Declare the |falseStart| local variable in the smallest lexical scope

Review of attachment 833259 [details] [diff] [review]:
-----------------------------------------------------------------

https://hg.mozilla.org/projects/nss/rev/774c7dec7565
Attachment #833259 - Flags: checked-in+
The original checkin for the fix for this bug was:
http://hg.mozilla.org/projects/nss/rev/1b9c43d28713

I forgot to add the comment to the bug when I made that checkin.
Attachment #8338217 - Flags: review?(brian) → review+
Status: REOPENED → RESOLVED
Closed: 11 years ago11 years ago
Resolution: --- → FIXED
Comment on attachment 8338217 [details] [diff] [review]
Handle both ssl3_HandleRecord calls in ssl3_GatherCompleteHandshake

https://hg.mozilla.org/projects/nss/rev/f28426e944ae
Attachment #8338217 - Flags: checked-in+
Red Hat QE has this question: https://bugzilla.redhat.com/show_bug.cgi?id=1053725#c14
Comment 77 does suggests at first that there are some test but I am afraid there is none. The test related patches submitted here were never r+'d and applied and looking at the hg logs for the files in nss/tests/ssh/ I don't see any commits for this bug. Where any new tests added to our suite for this?
Flags: needinfo?(wtc)
Flags: needinfo?(brian)
Comment on attachment 815732 [details] [diff] [review]
Add cases to sslstress for false start with ECC

Review of attachment 815732 [details] [diff] [review]:
-----------------------------------------------------------------

Elio: I haven't reviewed Brian's testing-related patches, so they haven't been checked in yet. Sorry.

This patch is easy to review. r=wtc.
Attachment #815732 - Flags: review?(wtc) → review+
Comment on attachment 808725 [details] [diff] [review]
Fix false start for tstclnt and strsclnt and improve code coverage of SSL socket info functions during test runs

Review of attachment 808725 [details] [diff] [review]:
-----------------------------------------------------------------

r=wtc.

::: cmd/lib/secutil.h
@@ +405,5 @@
>  SECU_ParseSSLVersionRangeString(const char *input,
>                                  const SSLVersionRange defaultVersionRange,
>                                  const PRBool defaultEnableSSL2,
>                                  SSLVersionRange *vrange,
>                                  PRBool *enableSSL2);

Maybe we should also move this function to the new sslutil.c file?

@@ +411,5 @@
> +SECStatus
> +SECU_EnableFalseStart(PRFileDesc * fd, PRBool enableFalseStart);
> +
> +SECStatus
> +SECU_CheckSSLInfoFunctions(PRFileDesc *fd, PRBool enoughFirstHsDone);

Please document these two functions.

::: cmd/lib/sslutil.c
@@ +9,5 @@
> +canFalseStartCallback(PRFileDesc *fd, void *arg, PRBool *canFalseStart)
> +{
> +    SECStatus rv;
> +
> +    rv = SECU_CheckSSLInfoFunctions(fd, PR_TRUE);

Should we pass PR_FALSE here?

We are passing PR_TRUE to all the SECU_CheckSSLInfoFunctions calls and the !enoughFirstHsDone case isn't being tested.

@@ +14,5 @@
> +    if (rv != SECSuccess) {
> +        return rv;
> +    }
> +
> +    if (!fd || !canFalseStart) {

The arguments check should be done first, before the SECU_CheckSSLInfoFunctions call.

@@ +32,5 @@
> +    rv = SSL_OptionSet(fd, SSL_ENABLE_FALSE_START, enableFalseStart);
> +    if (rv != SECSuccess) {
> +        return rv;
> +    }
> +    rv = SSL_SetCanFalseStartCallback(fd, canFalseStartCallback, NULL);

Should we skip the SSL_SetCanFalseStartCallback call if enableFalseStart is false?

@@ +70,5 @@
> +                       " result");
> +        return SECFailure;
> +    }
> +
> +    /* Functions that only succeed even when enoughFirstHsDone */

Remove "even".

@@ +87,5 @@
> +        PR_NOT_REACHED("SSL_SecurityStatus returned unexpected result");
> +        if (rv != SECFailure) {
> +            PORT_SetError(SEC_ERROR_LIBRARY_FAILURE);
> +        }
> +        return rv;

This should be "return SECFailure", just like line 82.

::: cmd/strsclnt/strsclnt.c
@@ +711,5 @@
>  void
>  myHandshakeCallback(PRFileDesc *socket, void *arg) 
>  {
> +    SECStatus rv;
> +    

Nit: remove the spaces.

::: cmd/tstclnt/tstclnt.c
@@ +171,5 @@
> +    if (rv != SECSuccess) {
> +        /* We can't do anything else because this is is a void function, but
> +         * this also should never happen so it doesn't really matter.
> +         */
> +        PR_Abort(); 

Please make the same changes to strsclnt.c.

1. We should print an error message with file name and line number before aborting the process.

2. Nit: remove the space at the end of line.
Attachment #808725 - Flags: review?(wtc) → review+
Comment on attachment 821957 [details] [diff] [review]
remove-ssl3_AlwaysBlock.patch [v2]

Review of attachment 821957 [details] [diff] [review]:
-----------------------------------------------------------------

::: lib/ssl/ssl3con.c
@@ +6801,5 @@
>  	rv = SECFailure; /* force it to send a no_certificate alert */
>      }
>      switch (rv) {
> +    case SECWouldBlock:
> +	errCode = PR_NOT_IMPLEMENTED_ERROR;

Brian: I have a patch that makes async getClientAuthData work. Do you think we still won't need the ssl3_SetAlwaysBlock function when we support async getClientAuthData?
Flags: needinfo?(wtc)
(In reply to Wan-Teh Chang from comment #109)
> Brian: I have a patch that makes async getClientAuthData work. Do you think
> we still won't need the ssl3_SetAlwaysBlock function when we support async
> getClientAuthData?

I guess it depends on your implementation of that logic. See also comment 85. If we were to implement the async client cert logic the way I had been intending to, then we wouldn't need it, AFAICT.
Flags: needinfo?(brian)
(In reply to Elio Maldonado from comment #106)
> Red Hat QE has this question:
> https://bugzilla.redhat.com/show_bug.cgi?id=1053725#c14
> Comment 77 does suggests at first that there are some test but I am afraid
> there is none. The test related patches submitted here were never r+'d and
> applied and looking at the hg logs for the files in nss/tests/ssh/ I don't
> see any commits for this bug. Where any new tests added to our suite for
> this?

I will try to get to this this week, but I am doubtful I will. Hopefully next week.
Flags: needinfo?(brian)
Flags: needinfo?(brian)
Whiteboard: [leave open] → [leave open][briansmith ETA: 2014-02-28]
Flags: needinfo?(brian)
Hi Brian,

Can we expect the new test case code to be available in the next few days?
(In reply to Hubert Kario from comment #112)
> Can we expect the new test case code to be available in the next few days?

Hi, I won't have time to work on the tests. Sorry.
Flags: needinfo?(brian)
Whiteboard: [leave open][briansmith ETA: 2014-02-28]
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: