Closed Bug 909162 Opened 11 years ago Closed 10 years ago

Add a way for the server side of libssl to simulate TLS intolerance for testing

Categories

(NSS :: Libraries, enhancement, P1)

enhancement

Tracking

(Not tracked)

RESOLVED INVALID
3.15.4

People

(Reporter: briansmith, Assigned: briansmith)

References

Details

Attachments

(2 files, 1 obsolete file)

We'd like a way to test our TLS intolerance logic in Gecko without needing to import a whole new TLS server implementation. This patch simulates a few types of TLS intolerance--ones that result in an alert being returned instead of a server_hello message. Sometime in the future I hope to expand this to allow it to simulate other types of intolerance.
Attached patch nss-simulate-intolerance.patch (obsolete) — Splinter Review
This patch does the most basic intolerance simulation: sending an alert instead of a server hello. I believe the API will allow us to simulate other, tricker, types of intolerance in the future, but I'm punting on that for now.
Attachment #799968 - Flags: review?(wtc)
Severity: normal → enhancement
Status: NEW → ASSIGNED
Priority: -- → P1
Target Milestone: 3.15.2 → 3.15.3
Since you changed the target milestone, you might also need to update the symbol version.

I can't see anything else wrong with the patch expect maybe some white space issues.
Comment on attachment 799968 [details] [diff] [review]
nss-simulate-intolerance.patch

Review of attachment 799968 [details] [diff] [review]:
-----------------------------------------------------------------

Ryan, wtc mentioned you might be able to get to this before him. If so, that would be great, because this is used for testing our TLS fallback logic, which is one of the things that's blocking us from enabling TLS 1.2
Attachment #799968 - Flags: review?(ryan.sleevi)
Comment on attachment 799968 [details] [diff] [review]
nss-simulate-intolerance.patch

Review of attachment 799968 [details] [diff] [review]:
-----------------------------------------------------------------

I would prefer that such testing code not be committed to mainline NSS.

At the risk of proposing #ifdefs (which I've argued against in the past on dev.tech.crypto / nss-dev), if we do need to start exposing unit test shims, I'd prefer a (single) define to cover all sorts of testing needs. This has the implication of potentially requiring NSS to be compiled twice, which is understandably ugly, so may not be the best approach.

Still, adding a new error code and new state to the SSL Handshake State strike me as very undesirable.
Attachment #799968 - Flags: review?(ryan.sleevi) → review-
changing target milestone to 3.15.4
Target Milestone: 3.15.3 → 3.15.4
No longer blocks: 839310
Ryan, I addressed your review comments in this patch:

1. I made it so that somebody with a custom build system for NSS (i.e. Chromium) can build libssl without this functionality.
2. I declare the test-only functionality in ssltest.h so that normal users of libssl will not be bothered by it.
3. As much as possible, I isolated the test-only functionality in ssltest.c so it is out of the way.
4. I named the functions SSL_TEST_* so as to make it more clear that they are test-only.

Unfortunately, I think we don't have good alternatives for this type of testing, in general. I looked into using TLS Lite, as you guys do. However, TLS Lite is too lightweight to meet all of our current or future testing needs. For example, it doesn't implement session tickets at all, so it can't support session ticket coverage testing. Generally, we're going to need to be able to test exceptional behavior that requires a fully-functioning TLS stack in the peer, and I think that the approach I am taking here is a good compromise. It is better to do something slightly ugly to have more test coverage than to forgo test coverage in the name of cleanliness.

Note that this patch applies on top of the patches for bug 930874 and bug 935847.
Attachment #799968 - Attachment is obsolete: true
Attachment #799968 - Flags: review?(wtc)
Attachment #830013 - Flags: review?(ryan.sleevi)
This second patch builds on top of the first one to add an additional case that we use in the test for Gecko bug 901718.
Attachment #831303 - Flags: review?(ryan.sleevi)
Comment on attachment 830013 [details] [diff] [review]
nss-simulate-intolerance.patch

Review of attachment 830013 [details] [diff] [review]:
-----------------------------------------------------------------

::: lib/ssl/ssl.def
@@ +167,5 @@
>  ;+    global:
>  SSL_PeerCertificateChain;
>  SSL_RecommendedCanFalseStart;
>  SSL_SetCanFalseStartCallback;
>  SSL_TEST_OptionSet;

Am I missing something? In my local tree, I do not see this function at all, nor do I see it in mxr. Dependent patch that is missing?

@@ +168,5 @@
>  SSL_PeerCertificateChain;
>  SSL_RecommendedCanFalseStart;
>  SSL_SetCanFalseStartCallback;
>  SSL_TEST_OptionSet;
> +SSL_TEST_SimulateIntolerance;

BUG? Is it legal to export an undefined function, in the event NSS_SSL_NO_TEST_ONLY_FUNCTIONALITY is defined?

::: lib/ssl/ssl.h
@@ +1012,5 @@
>   */
>  SSL_IMPORT SECStatus SSL_AuthCertificateComplete(PRFileDesc *fd,
>  						 PRErrorCode error);
> +
> +#ifndef NSS_SSL_NO_TEST_ONLY_FUNCTIONALITY

While I realize you may disagree, I think test-only code should be "opt-in" rather than "opt-out", so that all existing downstream consumers are not opted-in to particularly dangerous functionality.

::: lib/ssl/ssl3con.c
@@ +8274,5 @@
> +    /* This must be done after the SNI hook has been called because the SNI
> +     * hook may call SSL_SimulateIntolerance. This must be done after
> +     * ss->clientHelloVersion has been set because ssl3_SimulateIntolerance
> +     * depends on it.
> +     */

I don't understand why this has to be done after the SNI hook, for any other reason than you want to do it in the SNI hook (a point which I don't fully understand).

Am I missing something?

::: lib/ssl/sslimpl.h
@@ +324,5 @@
>      unsigned int serverRefreshSessionTicketDuringResume : 1;
>      unsigned int serverFlushAndSleepAfterCertificate : 1;
> +
> +    SSLIntoleranceTypeFlags serverSimulatedIntoleranceTypes;
> +    SSLIntoleranceResponseType serverSimulatedIntoleranceResponseType;  

unnecessary trailing whitespace

::: lib/ssl/sslt.h
@@ +197,5 @@
> +#ifndef NSS_SSL_NO_TEST_ONLY_FUNCTIONALITY
> +
> +#define SSL_INTOLERANCE_VERSION_1_0 (1 << 0)
> +#define SSL_INTOLERANCE_VERSION_1_1 (1 << 1)
> +#define SSL_INTOLERANCE_VERSION_1_2 (1 << 2)

If you're going to call the type "TypeFlags" (line 203), it seems like you should be declaring a mask for the version, so that people do not do equality checks.

For example, I can potentially see you adding unit tests for other types of intolerance (eg: large ClientHello, record version mismatches, etc), and those would not be versions.

@@ +202,5 @@
> +
> +typedef PRUint32 SSLIntoleranceTypeFlags;
> +
> +typedef enum {
> +    SSL_INTOLERANCE_NONE = 0, /* cancel intolerance simulation; must be zero */

I don't understand why "must be zero" is here, or what it exactly means.

@@ +206,5 @@
> +    SSL_INTOLERANCE_NONE = 0, /* cancel intolerance simulation; must be zero */
> +    SSL_INTOLERANCE_HANDSHAKE_FAILURE_ALERT = 1,
> +    SSL_INTOLERANCE_UNEXPECTED_MESSAGE_ALERT = 2,
> +    SSL_INTOLERANCE_ILLEGAL_PARAMETER_ALERT = 3,
> +    SSL_INTOLERANCE_PROTOCOL_VERSION_ALERT = 4,

No intolerance via just hanging the conn?

::: lib/ssl/ssltest.c
@@ +46,5 @@
>  }
>  
> +SECStatus
> +SSL_TEST_SimulateIntolerance(PRFileDesc *fd,
> +                             SSLIntoleranceTypeFlags intoleranceTypes,

s/intoleranceTypes/intoleranceFlags/

@@ +50,5 @@
> +                             SSLIntoleranceTypeFlags intoleranceTypes,
> +                             SSLIntoleranceResponseType responseType)
> +{
> +    sslSocket *ss;
> +    

unnecessary whitespace.

@@ +61,5 @@
> +
> +    /* We don't grab any locks because SSL_SimulateIntolerance may be called
> +     * from within the SNI callback, in which case we would already be holding
> +     * the locks.
> +     */

So should you instead be doing:

PORT_Assert( ss->opt.noLocks || ssl_HaveSSL3HandshakeLock(ss)) ?

This is consistent with line 79
Attachment #830013 - Flags: review?(ryan.sleevi) → review-
Comment on attachment 831303 [details] [diff] [review]
add-intolerance-close-connection-without-alert.patch

Review of attachment 831303 [details] [diff] [review]:
-----------------------------------------------------------------

r+ to this patch (which addresses a comment in the previous patch before I saw this), but the other issues in the previous patch still need to be addressed first.
Attachment #831303 - Flags: review?(ryan.sleevi) → review+
(In reply to Ryan Sleevi from comment #8)
> >  SSL_TEST_OptionSet;
> 
> Am I missing something? In my local tree, I do not see this function at all,
> nor do I see it in mxr. Dependent patch that is missing?

Yes, the dependent patch is in bug 938001, which this bug is marked as depending on.

> > +SSL_TEST_SimulateIntolerance;
> 
> BUG? Is it legal to export an undefined function, in the event
> NSS_SSL_NO_TEST_ONLY_FUNCTIONALITY is defined?

Earlier, you asked me to make this code optional so that it can be completely avoided in Chromium. Chromium doesn't use the NSS build system and it doesn't use the NSS *.def files (AFAICT). Consequently, the patch requires a custom build system and/or custom *.def files (like Chromium has, IIUC) to build with NSS_SSL_NO_TEST_ONLY_FUNCTIONALITY defined. Making the flag work with the NSS build system seems to be non-trivial work, because we'd have to make a way to conditionally export this function based on the flag, and I think that work is unnecessary.

> While I realize you may disagree, I think test-only code should be "opt-in"
> rather than "opt-out", so that all existing downstream consumers are not
> opted-in to particularly dangerous functionality.

I expect that the only type of build that will disable the test functionality is Chromium's build with its own custom build system. The reason is that normal NSS has an ABI guarantee and, for example, the Gecko tests that make use of this functionality need to be able to run against Red Hat's system NSS builds, which means Linux-distro-packaged NSS will need to enable the test functionality. Therefore, I think opt-out is reasonable for this.


> ::: lib/ssl/ssl3con.c
> @@ +8274,5 @@
> > +    /* This must be done after the SNI hook has been called because the SNI
> > +     * hook may call SSL_SimulateIntolerance. This must be done after
> > +     * ss->clientHelloVersion has been set because ssl3_SimulateIntolerance
> > +     * depends on it.
> > +     */
> 
> I don't understand why this has to be done after the SNI hook, for any other
> reason than you want to do it in the SNI hook (a point which I don't fully
> understand).
> 
> Am I missing something?

In Gecko we have a test server that uses the SNI on each connection to determine which test is being run: test-ocsp-stapling-unknown.example.com, test-intolerance-handshake_failure.example.com, etc. This makes tests much faster than, say, staring up a server and shutting it down for each test. In order for this to work, the SNI hook needs to be able to call the function.

> If you're going to call the type "TypeFlags" (line 203), it seems like you
> should be declaring a mask for the version, so that people do not do
> equality checks.
> 
> For example, I can potentially see you adding unit tests for other types of
> intolerance (eg: large ClientHello, record version mismatches, etc), and
> those would not be versions.

OK, I will consider this.

> > +typedef enum {
> > +    SSL_INTOLERANCE_NONE = 0, /* cancel intolerance simulation; must be zero */
> 
> I don't understand why "must be zero" is here, or what it exactly means.

The implementation of ssl3_SimulateIntolerance only works correctly if SSL_INTOLERANCE_NONE == 0.

> No intolerance via just hanging the conn?

We can add more types of intolerance to test, including in particular a failed MAC check in the Finished message and other such things. First, I just want to get these basic checks reviewed.

> @@ +61,5 @@
> > +
> > +    /* We don't grab any locks because SSL_SimulateIntolerance may be called
> > +     * from within the SNI callback, in which case we would already be holding
> > +     * the locks.
> > +     */
> 
> So should you instead be doing:
> 
> PORT_Assert( ss->opt.noLocks || ssl_HaveSSL3HandshakeLock(ss)) ?
> 
> This is consistent with line 79

The function may also be called from outside any callback function, in which case no locks would be being held. AFAICT, this is consistent with the other strange handling of locks in sslauth.c and other places. Basically, the rule seems to be that some of these functions must either be called before multiple threads start accessing the socket concurrently, or from within a callback.
(In reply to Brian Smith (:briansmith, was :bsmith; Please NEEDINFO? me if you want a response) from comment #10)
> (In reply to Ryan Sleevi from comment #8)
> > >  SSL_TEST_OptionSet;
> > 
> > Am I missing something? In my local tree, I do not see this function at all,
> > nor do I see it in mxr. Dependent patch that is missing?
> 
> Yes, the dependent patch is in bug 938001, which this bug is marked as
> depending on.
> 

Right, which I don't have access to :)

> > > +SSL_TEST_SimulateIntolerance;
> > 
> > BUG? Is it legal to export an undefined function, in the event
> > NSS_SSL_NO_TEST_ONLY_FUNCTIONALITY is defined?
> 
> Earlier, you asked me to make this code optional so that it can be
> completely avoided in Chromium. Chromium doesn't use the NSS build system
> and it doesn't use the NSS *.def files (AFAICT). Consequently, the patch
> requires a custom build system and/or custom *.def files (like Chromium has,
> IIUC) to build with NSS_SSL_NO_TEST_ONLY_FUNCTIONALITY defined. Making the
> flag work with the NSS build system seems to be non-trivial work, because
> we'd have to make a way to conditionally export this function based on the
> flag, and I think that work is unnecessary.
> 
> > While I realize you may disagree, I think test-only code should be "opt-in"
> > rather than "opt-out", so that all existing downstream consumers are not
> > opted-in to particularly dangerous functionality.
> 
> I expect that the only type of build that will disable the test
> functionality is Chromium's build with its own custom build system. The
> reason is that normal NSS has an ABI guarantee and, for example, the Gecko
> tests that make use of this functionality need to be able to run against Red
> Hat's system NSS builds, which means Linux-distro-packaged NSS will need to
> enable the test functionality. Therefore, I think opt-out is reasonable for
> this.

I still strongly dislike adding "test-only" code to NSS production binaries. That seems to run counter to the purpose of "release quality", when testing code is left in an actively enabled, particularly within security-sensitive libraries.

Put differently, is there a different way to accomplish this *without* functionality that is test only? Can it be accomplished with existing callbacks?

> 
> 
> > ::: lib/ssl/ssl3con.c
> > @@ +8274,5 @@
> > > +    /* This must be done after the SNI hook has been called because the SNI
> > > +     * hook may call SSL_SimulateIntolerance. This must be done after
> > > +     * ss->clientHelloVersion has been set because ssl3_SimulateIntolerance
> > > +     * depends on it.
> > > +     */
> > 
> > I don't understand why this has to be done after the SNI hook, for any other
> > reason than you want to do it in the SNI hook (a point which I don't fully
> > understand).
> > 
> > Am I missing something?
> 
> In Gecko we have a test server that uses the SNI on each connection to
> determine which test is being run: test-ocsp-stapling-unknown.example.com,
> test-intolerance-handshake_failure.example.com, etc. This makes tests much
> faster than, say, staring up a server and shutting it down for each test. In
> order for this to work, the SNI hook needs to be able to call the function.

This seems to be foisting an implementation detail of Gecko within NSS?

To be clear, I don't have any particular objection to doing it "before" versus "after" the SNI callback, just that the existing patch fails to document such guarantees or lifetimes/timing.

I think just stronger documentation in the public API portion can help resolve this.

> 
> > If you're going to call the type "TypeFlags" (line 203), it seems like you
> > should be declaring a mask for the version, so that people do not do
> > equality checks.
> > 
> > For example, I can potentially see you adding unit tests for other types of
> > intolerance (eg: large ClientHello, record version mismatches, etc), and
> > those would not be versions.
> 
> OK, I will consider this.
> 
> > > +typedef enum {
> > > +    SSL_INTOLERANCE_NONE = 0, /* cancel intolerance simulation; must be zero */
> > 
> > I don't understand why "must be zero" is here, or what it exactly means.
> 
> The implementation of ssl3_SimulateIntolerance only works correctly if
> SSL_INTOLERANCE_NONE == 0.

Then you're coupling implementation details to a description in the header, which is not exactly good documentation practice.

That said, your comment doesn't really seem to align with the implementation of ssl3_SimulateIntolerance, since it's treating ss->opt.serverSimulatedIntoleranceResponseType as a literal, and not a mask (like the TypeFlags).

I would just drop the "must be zero", since it could be anything and the code would still work correctly.

> 
> > No intolerance via just hanging the conn?
> 
> We can add more types of intolerance to test, including in particular a
> failed MAC check in the Finished message and other such things. First, I
> just want to get these basic checks reviewed.

Sure, and I see you did just that in the follow-up patch (by having it close the conn w/ no alert), which is fine.

> 
> > @@ +61,5 @@
> > > +
> > > +    /* We don't grab any locks because SSL_SimulateIntolerance may be called
> > > +     * from within the SNI callback, in which case we would already be holding
> > > +     * the locks.
> > > +     */
> > 
> > So should you instead be doing:
> > 
> > PORT_Assert( ss->opt.noLocks || ssl_HaveSSL3HandshakeLock(ss)) ?
> > 
> > This is consistent with line 79
> 
> The function may also be called from outside any callback function, in which
> case no locks would be being held. AFAICT, this is consistent with the other
> strange handling of locks in sslauth.c and other places. Basically, the rule
> seems to be that some of these functions must either be called before
> multiple threads start accessing the socket concurrently, or from within a
> callback.

I thought we agreed that was an anti-pattern, since every time someone looks at this code, it's a question about locks, and there's extensive comments of

/* NEED LOCKS IN HERE */ or /* XXX: NEED LOCKS? */.

I think this could be remedied with a public stated contract (eg: in the header file) about the appropriate times this can be called, which resolves the ambiguity between "Did he mean to add locks?" and "This is fine, because this is the API contract".
I now agree with rsleevi that we shouldn't do this.
Status: ASSIGNED → RESOLVED
Closed: 10 years ago
Resolution: --- → INVALID
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: