Closed Bug 1049740 Opened 5 years ago Closed 5 years ago

gather telemetry to inform changes in minimum RSA key size

Categories

(Core :: Security: PSM, defect)

defect
Not set

Tracking

()

RESOLVED FIXED
mozilla39
Tracking Status
firefox37 --- wontfix
firefox38 --- fixed
firefox39 --- fixed

People

(Reporter: kwilson, Assigned: keeler)

References

(Blocks 1 open bug)

Details

Attachments

(1 file, 4 obsolete files)

According to Mozilla Policy and the CA/Browser Forum Baseline Requirements, certificates should now have RSA key sizes of RSA 2048 bits are stronger.

https://www.mozilla.org/en-US/about/governance/policies/security-group/certs/policy/maintenance/
"8. We consider the following algorithms and key sizes to be acceptable and supported in Mozilla products: 
... RSA 2048 bits or higher; and
RSA 1024 bits (only until December 31, 2013)."
and
"9. We expect CAs to maintain current best practices to prevent algorithm attacks against certificates. As such, the following steps will be taken: 
... all end-entity certificates with RSA key sizes smaller than 2048 bits must expire by December 31, 2013;
after December 31, 2013, Mozilla will disable or remove all root certificates with RSA key sizes smaller than 2048 bits"

CA/Browser Forum Baseline Requirements, Appendix A:
"Subordinate CA Certificates - Validity period beginning after 31 Dec 2010 or ending after 31 Dec 2013 -
Minimum RSA modulus size (bits) - 2048"
and
"Subscriber Certificates - Validity period ending after 31 Dec 2013 -
Minimum RSA modulus size (bits) - 2048"

So we should start showing the Untrusted Connection error when we encounter certificates in the chain that use less than RSA 2048 signatures.
If we aren't doing so already, please start collecting telemetry about the certificates that are encountered with RSA key sizes smaller than 2048 bits. Please have the telemetry distinguish between end-entity certs, intermediate certs, and root certs.
Assignee: nobody → rlb
Attached patch bug-1049740.patch (obsolete) — Splinter Review
Here's an initial hack at adding telemetry for this decision.

mozilla::pkix::TrustDomain already has a CheckPublicKey method, so this patch extends that method to collect telemetry on whether increasing the minimum key size to 2048 bits would result in connection failure.

This patch assumes that we might want to set different thresholds for CA certs vs. EE, so it has to update TrustDomain::CheckPublicKey to take an EndEntityOrCA argument.  This results in a ton of small changes.  If we're willing to outlaw 1024-bit EE certs as well, then we don't have to make this change.
Attachment #8469361 - Flags: feedback?(kwilson)
Attachment #8469361 - Flags: feedback?(dkeeler)
Comment on attachment 8469361 [details] [diff] [review]
bug-1049740.patch

Review of attachment 8469361 [details] [diff] [review]:
-----------------------------------------------------------------

Thanks for working on this, Richard. I think we should go in a slightly different direction, though.

One thing to mention is we already have key size telemetry for end-entities:
http://telemetry.mozilla.org/#filter=nightly%2F34%2FSSL_AUTH_RSA_KEY_SIZE_FULL&aggregates=multiselect-all!Submissions&evoOver=Builds&locked=true&sanitize=true&renderhistogram=Table
http://telemetry.mozilla.org/#filter=nightly%2F34%2FSSL_AUTH_DSA_KEY_SIZE_FULL&aggregates=multiselect-all!Submissions&evoOver=Builds&locked=true&sanitize=true&renderhistogram=Table
http://telemetry.mozilla.org/#filter=nightly%2F34%2FSSL_AUTH_ECDSA_CURVE_FULL&aggregates=multiselect-all!Submissions&evoOver=Builds&locked=true&sanitize=true&renderhistogram=Table
(The mapping from telemetry histogram slot number to key size is here: http://hg.mozilla.org/mozilla-central/annotate/aa1617678a90/security/manager/ssl/src/nsNSSCallbacks.cpp#l1063 )

We collect this telemetry during TLS handshake callbacks rather than during verification, which helps prevent over-counting due to verifications that are unrelated to TLS connections (e.g. various places in the UI). I think it would be better to gather the telemetry for intermediates/CAs after the call to VerifySSLServerCert here: https://hg.mozilla.org/releases/mozilla-beta/annotate/22589028e561/security/manager/ssl/src/SSLServerCertVerification.cpp#l969 (you'll have to pass in a reference to some sort of scoped cert chain to get the result out). If the verification succeeds, check the intermediates/root for keys that are too small. Unfortunately, this won't give us an exact idea of when verification would fail if we restrict key sizes, because it may be that there's an equally valid path with all good key sizes that the library simply didn't explore. We could get closer to the truth by having mozilla::pkix prioritize exploring paths with larger keys, though.
Also, I would re-use the functions used to sort key sizes into telemetry histogram slots since that gives us a bit more data to work with (as opposed to just key good vs. key too small).
Attachment #8469361 - Flags: feedback?(dkeeler) → feedback-
(In reply to David Keeler (:keeler) [use needinfo?] from comment #3)
> Comment on attachment 8469361 [details] [diff] [review]
> bug-1049740.patch
> 
> Review of attachment 8469361 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> Thanks for working on this, Richard. I think we should go in a slightly
> different direction, though.
> 
> One thing to mention is we already have key size telemetry for end-entities:

Good point.  No need to duplicate that.


> We collect this telemetry during TLS handshake callbacks rather than during
> verification, which helps prevent over-counting due to verifications that
> are unrelated to TLS connections (e.g. various places in the UI). I think it
> would be better to gather the telemetry for intermediates/CAs after the call
> to VerifySSLServerCert here:
> https://hg.mozilla.org/releases/mozilla-beta/annotate/22589028e561/security/
> manager/ssl/src/SSLServerCertVerification.cpp#l969 (you'll have to pass in a
> reference to some sort of scoped cert chain to get the result out). If the
> verification succeeds, check the intermediates/root for keys that are too
> small. Unfortunately, this won't give us an exact idea of when verification
> would fail if we restrict key sizes, because it may be that there's an
> equally valid path with all good key sizes that the library simply didn't
> explore. We could get closer to the truth by having mozilla::pkix prioritize
> exploring paths with larger keys, though.

My approach here was to try to apply the telemetry at the same place we would apply the policy control.  So the question is really: If we were going to do this, where would we put the check?  Which depends mostly on what scope we want the check to have.

1. SSL only, regardless of trust domain 
   => AuthCertificate()
2. All users of NSSCertDBTrustDomain 
   => NSSCertDBTrustDomain::CheckPublicKey()
3. All users of NSSCertDBTrustDomain and AppTrustDomain
   => ::mozilla::pkix::CheckPublicKeyLength()

Given that this enforcement is related to our policy requirements, I'm inclined to put it in with other policy enforcement, i.e., in NSSCertDBTrustDomain (option 2).  But I don't really see a reason that apps should get lower assurances in this regard, so it seems sensible to apply this constraint to AppTrustDomain as well (Option 3).


> Also, I would re-use the functions used to sort key sizes into telemetry
> histogram slots since that gives us a bit more data to work with (as opposed
> to just key good vs. key too small).

That should work trivially if we end up doing this in AuthCertificate(), otherwise we'll need to reproduce the mapping function.
Comment on attachment 8469361 [details] [diff] [review]
bug-1049740.patch

Review of attachment 8469361 [details] [diff] [review]:
-----------------------------------------------------------------

Just resolving my feedback request so I don't keep getting notified of an overdue request. I'm sure Richard and Keeler will find an optimal solution.
Attachment #8469361 - Flags: feedback?(kwilson) → feedback+
Duplicate of this bug: 134735
Are we also, like Chrome, going to show degraded security UI for short keys before we start rejecting them outright? If so would that be part of this bug or a separate one?
Comment on attachment 8469361 [details] [diff] [review]
bug-1049740.patch

Review of attachment 8469361 [details] [diff] [review]:
-----------------------------------------------------------------

What we're interested in is how frequently we would fail to build a cert chain that consists of only 2048+-bit certs (or ECC certs).

The way to correctly calculate this telemetry is:
1. Create an NSSCertDBTrustDomain that enforces a 2048-bit minimum and attempt to build a cert chain. If it succeeds, then count this as "good" and stop.
2. If you are not able to build a 2048+-bit cert chain, then try again with the minimum set to 1024 bits. If it succeeds, then count this as "compatibility risk" and stop.
3. If you're not able to build a cert chain when the minimum is set to 1024 bits, then count it as "already bad."

This procedure takes into account the fact that there are multiple possible cert chains, some of which may have 1024-bit certs in them, and some of which may have only 2048-bit certs.

When deciding where/how to actually increment the telemetry, you need to closely follow the pattern established by the patch for bug 1107666 to account for the fact that a single TLS handshake may result in multiple validations.

I do think it is good to get this telemetry done soon.

::: security/pkix/lib/pkixnss.cpp
@@ +79,5 @@
> +        if (length < MINIMUM_NON_ECC_BITS_CA) {
> +          Telemetry::Accumulate(Telemetry::CERT_VALIDATION_KEY_LENGTH_CHECK, 2);
> +        } else {
> +          Telemetry::Accumulate(Telemetry::CERT_VALIDATION_KEY_LENGTH_CHECK, 3);
> +        }

If you count this telemetry here, your counts will be WAY off; see bug 1107666 for an example.
Attached patch patch (obsolete) — Splinter Review
Brian, I believe this implements what you outlined in comment 8. Would you mind having a look? Thanks.
Assignee: rlb → dkeeler
Attachment #8469361 - Attachment is obsolete: true
Status: NEW → ASSIGNED
Attachment #8568866 - Flags: review?(brian)
Comment on attachment 8568866 [details] [diff] [review]
patch

Review of attachment 8568866 [details] [diff] [review]:
-----------------------------------------------------------------

::: security/certverifier/CertVerifier.cpp
@@ +171,5 @@
>  {
>    PR_LOG(gCertVerifierLog, PR_LOG_DEBUG, ("Top of VerifyCert\n"));
>  
>    PR_ASSERT(cert);
>    PR_ASSERT(usage == certificateUsageSSLServer || !(flags & FLAG_MUST_BE_EV));

PR_ASSERT(usage == certificateUsageSSLServer || !keySizeStatus);

@@ +279,5 @@
>          rv = Result::ERROR_POLICY_VALIDATION_FAILED;
>          break;
>        }
>  
> +      // Now try non-EV, but with the same minimum key size.

I suggest you rename the two constants as follows:

MINIMUM_NON_ECC_BITS_EV -> MINIMUM_NON_ECC_BITS
MINIMUM_NON_ECC_BITS_DV -> MINIMUM_NON_ECC_BITS_WEAK.

And then remove ", but with the same minimum key size" from the above comment.

@@ +295,5 @@
>                                          ocspStaplingStatus);
> +      if (rv == Success) {
> +        if (keySizeStatus) {
> +          *keySizeStatus = KeySizeLargeMinimumSucceeded;
> +        }

break;

@@ +296,5 @@
> +      if (rv == Success) {
> +        if (keySizeStatus) {
> +          *keySizeStatus = KeySizeLargeMinimumSucceeded;
> +        }
> +      } else {

Then you don't need this else.

::: security/certverifier/CertVerifier.h
@@ +33,5 @@
>      OCSP_STAPLING_INVALID = 4,
>    };
>  
> +  // These values correspond to the CERT_CHAIN_KEY_SIZE_STATUS telemetry.
> +  enum KeySizeStatus {

enum class?

::: security/manager/ssl/src/SSLServerCertVerification.cpp
@@ +1126,5 @@
>      Telemetry::Accumulate(Telemetry::SSL_OCSP_STAPLING, ocspStaplingStatus);
>    }
> +  if (keySizeStatus != CertVerifier::KeySizeNeverChecked) {
> +    Telemetry::Accumulate(Telemetry::CERT_CHAIN_KEY_SIZE_STATUS, keySizeStatus);
> +  }

For the OCSP telemetry, it is important for us to collect the telemetry even in the error cases because we're trying to figure out how often we were *already* failing due to error.

But, for most of the certificate validation telemetry, we only collect the telemetry in the Success case, in GatherSuccessfulValidationTelemetry. This makes me wonder if I'm doing the wrong thing in bug 1133562. It seems like this patch and my patches in bug 1133562 should be consistent in this respect. What do you think?
Attachment #8568866 - Flags: review?(brian) → review+
Comment on attachment 8568866 [details] [diff] [review]
patch

Review of attachment 8568866 [details] [diff] [review]:
-----------------------------------------------------------------

Would it make sense to add the sort of telemetry checking done at
  https://hg.mozilla.org/mozilla-central/annotate/13b2859e5c15/security/manager/ssl/tests/unit/test_cert_overrides.js#l49
to test_keysize.js (assuming it's possible)?
Thanks for the review. I'm working on addressing your comments.

(In reply to Brian Smith (:briansmith, :bsmith, use NEEDINFO?) from comment #10)
> For the OCSP telemetry, it is important for us to collect the telemetry even
> in the error cases because we're trying to figure out how often we were
> *already* failing due to error.
> 
> But, for most of the certificate validation telemetry, we only collect the
> telemetry in the Success case, in GatherSuccessfulValidationTelemetry. This
> makes me wonder if I'm doing the wrong thing in bug 1133562. It seems like
> this patch and my patches in bug 1133562 should be consistent in this
> respect. What do you think?

I think part of the motivation for doing it this way is if the number of "compatibility risk" cases are on the same order of or smaller than the number of "this was already bad" cases, it's a good indication we can make the change without too much breakage. In the past decisions like this based on telemetry only gathered in the Success cases have gone more like "we think x% of connections will break if we do this, and since x is so small, it's ok". I think both are probably valid.

(In reply to Cykesiopka from comment #11)
> Would it make sense to add the sort of telemetry checking done at
>  
> https://hg.mozilla.org/mozilla-central/annotate/13b2859e5c15/security/
> manager/ssl/tests/unit/test_cert_overrides.js#l49
> to test_keysize.js (assuming it's possible)?

Looks like test_keysize.js doesn't go through AuthCertificate, which is where this telemetry is (currently) gathered, so it wouldn't work there without some changes. I might add something to test_cert_overrides.js, though.
This bug ended up morphing a bit, so I'm going to open a new one for the actual changes if/when the telemetry gives us the go-ahead.
Summary: Show Untrusted Connection Error when cert in chain uses less than RSA 2048 signatures → gather telemetry to inform changes in minimum RSA key size
https://hg.mozilla.org/mozilla-central/rev/c6f3b60f6f8a
Status: ASSIGNED → RESOLVED
Closed: 5 years ago
Flags: in-testsuite+
Resolution: --- → FIXED
Target Milestone: --- → mozilla39
Comment on attachment 8570168 [details] [diff] [review]
patch as landed

Approval Request Comment
[Feature/regressing bug #]: gathering key size telemetry
[User impact if declined]: we want to protect users better by default by raising the minimum RSA key size. Uplifting this telemetry will help us make the decision of whether or not to go ahead with this sooner.
[Describe test coverage new/current, TreeHerder]: has tests
[Risks and why]: low - seems to be doing fine in Nightly. Could be slightly less performant in some cases.
[String/UUID change made/needed]: none
Attachment #8570168 - Flags: approval-mozilla-aurora?
Attached patch patch for beta (obsolete) — Splinter Review
Approval Request Comment
see comment 17
Attachment #8574963 - Flags: review+
Attachment #8574963 - Flags: approval-mozilla-beta?
Attached patch patch for beta (obsolete) — Splinter Review
Whoops - had to update the test.

Approval Request Comment
see comment 17
Attachment #8574963 - Attachment is obsolete: true
Attachment #8574963 - Flags: approval-mozilla-beta?
Attachment #8574969 - Flags: review+
Attachment #8574969 - Flags: approval-mozilla-beta?
Attachment #8570168 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
Comment on attachment 8574969 [details] [diff] [review]
patch for beta

This change is much more involved than the typical Telemetry probe addition. Given that we're getting close to the end of the 37 Beta cycle, I think we should wait on this probe until 38. At this point that means 3 more weeks before the probe will be on Beta. Beta-
Attachment #8574969 - Flags: approval-mozilla-beta? → approval-mozilla-beta-
Attachment #8574969 - Attachment is obsolete: true
You need to log in before you can comment on or make changes to this bug.