1364159 - add optimization to potentially avoid calling CERT_CreateSubjectCertList in NSSCertDBTrustDomain::FindIssuer

Dana Keeler (she/her) (use needinfo) [:keeler] (on leave)

Assignee

Description

•

7 years ago

CERT_CreateSubjectCertList is not an inexpensive function call, since it enumerates the certificate database (i.e. reads from disk a lot). If we're verifying for a TLS handshake, however, we should already have in memory a certificate chain sent by the peer (there are some cases where we won't, such as session resumption (see bug 731478)). If we can, we should use those certificates before falling back to calling CERT_CreateSubjectCertList.

Comment hidden (mozreview-request)

CERT_CreateSubjectCertList is not an inexpensive function call, since it
enumerates the certificate database (i.e. reads from disk a lot). If we're
verifying for a TLS handshake, however, we should already have in memory a
certificate chain sent by the peer (there are some cases where we won't, such as
session resumption (see bug 731478)). If we can, we should use those
certificates before falling back to calling CERT_CreateSubjectCertList.

Review commit: https://reviewboard.mozilla.org/r/139846/diff/#index_header
See other reviews: https://reviewboard.mozilla.org/r/139846/

:Cykesiopka

Comment 2

•

7 years ago

mozreview-review

Comment on attachment 8868262 [details]
bug 1364159 - potentially avoid calling CERT_CreateSubjectCertList in NSSCertDBTrustDomain::FindIssuer

https://reviewboard.mozilla.org/r/139846/#review143546

Looks reasonable to me, but I have some questions:
1. Do you have an estimate on how much faster this would make verifications?
2. Could we partially fix Bug 1063276 (pass a cert chain instead of just a cert), then use that cert chain to perform this optimisation?

::: security/certverifier/NSSCertDBTrustDomain.cpp:149
(Diff revision 1)
>  
> +// Remove from newCandidates any CERTCertificates in alreadyTried.
> +// alreadyTried is likely to be small or empty.
> +static void
> +RemoveCandidatesAlreadyTried(UniqueCERTCertList& newCandidates,
> +                             UniqueCERTCertList& alreadyTried)

const?

::: security/certverifier/NSSCertDBTrustDomain.cpp:169
(Diff revision 1)
> +
> +// Add to matchingCandidates any CERTCertificates from candidatesIn that have a
> +// DER-encoded subject name equal to the given subject name.
> +static Result
> +AddMatchingCandidates(UniqueCERTCertList& matchingCandidates,
> +                      UniqueCERTCertList& candidatesIn,

const?

::: security/certverifier/NSSCertDBTrustDomain.cpp:191
(Diff revision 1)
> +                                             certDuplicate.get());
> +      if (srv != SECSuccess) {
> +        return MapPRErrorCodeToResult(PR_GetError());
> +      }
> +      // matchingCandidates now owns certDuplicate
> +      certDuplicate.release();

We should prepend `Unused << ` here for clarity, and because not doing so might cause static analysis failures (I don't remember if the checks apply to both mozilla::UniquePtr and std::unique_ptr though).

::: security/manager/ssl/nsNSSCallbacks.cpp:1090
(Diff revision 1)
>    if (!cert) {
>      return;
>    }
>  
> +  UniqueCERTCertList peerCertChain(SSL_PeerCertificateChain(fd));
> +  MOZ_ASSERT(cert, "SSL_PeerCertChain failed in TLS handshake callback?");

Looks like `cert` should be `peerCertChain` instead.
`SSL_PeerCertChain` should probably also be `SSL_PeerCertificateChain` instead, but this doesn't really matter.

Attachment #8868262 - Flags: review?(cykesiopka.bmo) → review+

J.C. Jones [:jcj] (he/him)

Comment 3

•

7 years ago

mozreview-review

Comment on attachment 8868262 [details]
bug 1364159 - potentially avoid calling CERT_CreateSubjectCertList in NSSCertDBTrustDomain::FindIssuer

https://reviewboard.mozilla.org/r/139846/#review144216

Looks good.

::: security/certverifier/NSSCertDBTrustDomain.cpp:183
(Diff revision 1)
> +      continue; // probably just too big - continue processing other candidates
> +    }
> +    if (InputsAreEqual(candidateSubjectName, subjectName)) {
> +      UniqueCERTCertificate certDuplicate(CERT_DupCertificate(node->cert));
> +      if (!certDuplicate) {
> +        return Result::FATAL_ERROR_NO_MEMORY;

Seems weird to error here when we skip the out-of-memory condition above without bailing out.

Functions like FindIssuerInner keep going unless they hit a library failure.

Attachment #8868262 - Flags: review?(jjones) → review+

Dana Keeler (she/her) (use needinfo) [:keeler] (on leave)

Assignee

Comment 4

•

7 years ago

mozreview-review-reply

Comment on attachment 8868262 [details]
bug 1364159 - potentially avoid calling CERT_CreateSubjectCertList in NSSCertDBTrustDomain::FindIssuer

https://reviewboard.mozilla.org/r/139846/#review143546

Thanks!

Unfortunately this probably won't make a huge difference on modern machines and with the current default cert db format. On older hardware and when we switch to the sql-based db, it might be noticeable (again, though, this is really only a slight optimization).

As for bug 1063276, I think as long as we have some places where we aren't guaranteed to have a certificate chain, we should probably keep things as-is for now.

> const?

Good call.

> const?

Fixed.

> We should prepend `Unused << ` here for clarity, and because not doing so might cause static analysis failures (I don't remember if the checks apply to both mozilla::UniquePtr and std::unique_ptr though).

Sounds good.

> Looks like `cert` should be `peerCertChain` instead.
> `SSL_PeerCertChain` should probably also be `SSL_PeerCertificateChain` instead, but this doesn't really matter.

Good catch.

Dana Keeler (she/her) (use needinfo) [:keeler] (on leave)

Assignee

Comment 5

•

7 years ago

mozreview-review-reply

Comment on attachment 8868262 [details]
bug 1364159 - potentially avoid calling CERT_CreateSubjectCertList in NSSCertDBTrustDomain::FindIssuer

https://reviewboard.mozilla.org/r/139846/#review144216

Thanks!

> Seems weird to error here when we skip the out-of-memory condition above without bailing out.
> 
> Functions like FindIssuerInner keep going unless they hit a library failure.

What's being skipped on lines 175-178 is that mozilla::pkix doesn't handle inputs greater than 16k. In those situations we generally just pretend such certificates don't exist rather than returning a catastrophic error. (See e.g. https://dxr.mozilla.org/mozilla-central/source/security/certverifier/NSSCertDBTrustDomain.cpp#103 )

Comment hidden (mozreview-request)

Comment on attachment 8868262 [details]
bug 1364159 - potentially avoid calling CERT_CreateSubjectCertList in NSSCertDBTrustDomain::FindIssuer

Review request updated; see interdiff: https://reviewboard.mozilla.org/r/139846/diff/1-2/

Dana Keeler (she/her) (use needinfo) [:keeler] (on leave)

Assignee

Comment 7

•

7 years ago

https://treeherder.mozilla.org/#/jobs?repo=try&revision=623468a6c31c

Pulsebot

Comment 8

•

7 years ago

Pushed by dkeeler@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/600b709c2634
potentially avoid calling CERT_CreateSubjectCertList in NSSCertDBTrustDomain::FindIssuer r=Cykesiopka,jcj

Ryan VanderMeulen [:RyanVM]

Comment 9

•

7 years ago

bugherder

https://hg.mozilla.org/mozilla-central/rev/600b709c2634

Status: NEW → RESOLVED

Closed: 7 years ago

status-firefox55: --- → fixed

Flags: in-testsuite+

Resolution: --- → FIXED

Target Milestone: --- → mozilla55

J.C. Jones [:jcj] (he/him)

Updated

•

7 years ago

Comment 10

•

7 years ago

Hello. This bug may have contributed to a sudden change in the Telemetry probe CERT_VALIDATION_SUCCESS_BY_CA[1] which seems to have occurred in Nightly 20170524[2][3].

Quite a few buckets went to 0 and quite a few others had a significant increase. I guess this means we're seeing radically different certs being successfully validated?

Is this a good thing? A bad thing?

Did this happen on purpose? Is this expected?

Is this probe measuring something useful?

[1]: http://alerts.telemetry.mozilla.org/index.html#/detectors/1/metrics/1059/alerts/?from=2017-05-24&to=2017-05-24
[2]: https://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=5bc1c758ab57c1885dceab4e7837e58af27b998c&tochange=291a11111bdd05c5cd55dd552da4b1285ceba9b2
[3]: https://mzl.la/2rqaWMR

Flags: needinfo?(dkeeler)

Dana Keeler (she/her) (use needinfo) [:keeler] (on leave)

Assignee

Comment 11

•

7 years ago

I'm pretty sure this isn't a bad thing. Before this patch, Firefox would use whatever issuer certificates it could find in its certificate cache and built-in list to verify a server certificate. As of this patch, Firefox first essentially attempts to first build a path that uses the certificates presented by the server (and then it falls back to the old behavior). Due to prolific cross-signing by CAs, there can often be many paths from a server certificate to a trust anchor in our program. I think the telemetry changes indicate that for many sites, Firefox would previously find a trust path that used some cross-signed intermediate not actually sent by the server. Now, though, Firefox should more often find a trust path that looks like what the server sent, hence the change. While I wasn't explicitly expecting this, it does make sense.

As for if the probe is useful, I think it's good to have an idea of the relative usage of CAs in our program (if a root consistently has 0 hits, it might be a good candidate for removal).

Flags: needinfo?(dkeeler)

Chris H-C :chutten

Comment 12

•

7 years ago

There are an awful lot of entries in the histogram that have 0 hits, but I don't know how many of them have corresponding roots in the codebase. Maybe a bug would be a good first step in auditing this data.

Speaking of auditing, this probe presently lacks an "alert_emails" field, so only I (well, and some other telemetry folks) get notified when there's sudden changes like this. Would this be a good component to file a bug in to get this fixed?

Dana Keeler (she/her) (use needinfo) [:keeler] (on leave)

Assignee

Comment 13

•

7 years ago

Some roots may have already been removed. The mapping of bucket number to root CA is here: https://dxr.mozilla.org/mozilla-central/source/security/manager/tools/KnownRootHashes.json (anything over 188 is not yet assigned, so it makes sense that those would get 0 hits).

Adding an alert_emails field would be great. Core :: Security: PSM would be the right component. I imagine seceng-telemetry@mozilla.com would be a good start.

Chris H-C :chutten

Updated

•

7 years ago

Comment 14

•

7 years ago

I think that this bug introduced an undesirable behavior by accident.  It seems that Firefox is now reporting (and maybe using) the longest possible chain the server sends, rather than the shortest possible chain.

For example, if the server sends a chain like:

(server name) -> Online CA -> Root G3 -> Root G2 -> Root G1

And all three roots are in the trust store, it seems that G1 is getting the credit now.  This is why many of the buckets went to zero.

The reason this is undesirable is because it means that we no longer have data on what percentage are solely relying on G1.  If we want to retire older roots, then it is critical log the shortest chain.

Flags: needinfo?(dkeeler)

Dana Keeler (she/her) (use needinfo) [:keeler] (on leave)

Assignee

Updated

•

7 years ago

Depends on: 1400913

Dana Keeler (she/her) (use needinfo) [:keeler] (on leave)

Assignee

Comment 15

•

7 years ago

Thanks, Peter - I filed bug 1400913.

Flags: needinfo?(dkeeler)