Closed Bug 1293231 Opened 3 years ago Closed 3 years ago

Certificate Transparency - basic telemetry reports

Categories

(Core :: Security: PSM, defect, P1)

defect

Tracking

()

RESOLVED FIXED
mozilla52
Tracking Status
firefox52 --- fixed

People

(Reporter: sergei.cv, Assigned: sergei.cv)

References

(Blocks 2 open bugs, )

Details

(Whiteboard: [psm-assigned])

Attachments

(2 files)

User Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/52.0.2743.82 Safari/537.36
Priority: -- → P2
Whiteboard: [psm-backlog]
Hi David,

Some notes regarding the review:

1. In CertVerifier, my understanding is that errors do not stop the certificate verification attempts (it just goes on with another set of parameters). Consequently, SCT verification errors are currently ignored.

2. There are some open questions regarding Histograms.json (see the TODOs there). The main issue is the maximum buckets limitation which we will need to ask to override for the per-CA probe.

3. I'll be on a summer leave starting tomorrow, and will be back on August 28.

Thanks!
Comment on attachment 8780225 [details]
Bug 1293231 - Certificate Transparency - basic telemetry reports;

https://reviewboard.mozilla.org/r/70956/#review69970

Awesome - this looks really good right from the start. I noted a few issues to address, though, so I'll have a look at the interdiff. In terms of errors during certificate verification, if we are confident that the only error codes returned by ct verification are fatal, we should check for those and return them if they're ever encountered. For the telemetry stuff, I think you'll have to ask a telemetry/data privacy peer. In any case, hope your vacation is going well.

::: netwerk/base/security-prefs.js:14
(Diff revision 1)
>  pref("security.tls.unrestricted_rc4_fallback", false);
>  
>  pref("security.ssl.treat_unsafe_negotiation_as_broken", false);
>  pref("security.ssl.require_safe_negotiation",  false);
>  pref("security.ssl.enable_ocsp_stapling", true);
> +pref("security.ssl.enable_certificate_transparency", true);

We should probably make this an integer so we can do things like set it to 0 to disable it entirely, 1 to just collect telemetry, 2 to require it for EV, etc.

::: security/certverifier/CertVerifier.h:150
(Diff revision 1)
>                 PinningMode pinningMode, SHA1Mode sha1Mode,
>                 BRNameMatchingPolicy::Mode nameMatchingMode,
> -               NetscapeStepUpPolicy netscapeStepUpPolicy);
> +               NetscapeStepUpPolicy netscapeStepUpPolicy,
> +               // Handling of Signed Certificate Timestamps (SCTs).
> +               // See RFC 6962 (Certificate Transparency).
> +               bool sctsEnabled);

Similarly, this should be some sort of Mode enum that informs the CertVerifier what to do with any encountered scts (just disable/collect telemetry for now).

::: security/certverifier/CertVerifier.cpp:162
(Diff revision 1)
>  }
>  
> +void
> +CertVerifier::LoadKnownCTLogs()
> +{
> +  for (const CTLogInfo& log : kCTLogList) {

I'm not seeing CTLogInfo or kCTLogList - are these in another patch?

::: security/certverifier/CertVerifier.cpp:165
(Diff revision 1)
> +CertVerifier::LoadKnownCTLogs()
> +{
> +  for (const CTLogInfo& log : kCTLogList) {
> +    Input publicKey;
> +    if (publicKey.Init(reinterpret_cast<const uint8_t*>(log.logKey),
> +                       log.logKeyLength) == Success) {

Let's flip these - check for != Success and assert in those cases.

::: security/certverifier/CertVerifier.cpp:381
(Diff revision 1)
>  
> +  Input sctsFromTLSInput;
> +  if (sctsFromTLSSECItem) {
> +    rv = sctsFromTLSInput.Init(sctsFromTLSSECItem->data,
> +                               sctsFromTLSSECItem->len);
> +    // Silently discard the error of the extension being too big,

Maybe we should first check the length of sctsFromTLSSECItem and not even bother calling Init if we know it will fail.

::: security/certverifier/CertVerifier.cpp:509
(Diff revision 1)
>            }
>            if (sha1ModeResult) {
>              *sha1ModeResult = sha1ModeResults[i];
>            }
> +          // SCT related errors do not affect the certificate verification.
> +          Unused << VerifySignedCertificateTimestamps(trustDomain, builtChain,

We should only be returning fatal errors from VerifySignedCertificateTimestamps, right? So I think we would want to check for and return errors here.

::: security/certverifier/CertVerifier.cpp:603
(Diff revision 1)
>              }
>              if (sha1ModeResult) {
>                *sha1ModeResult = sha1ModeResults[j];
>              }
> +            // SCT related errors do not affect the certificate verification.
> +            Unused << VerifySignedCertificateTimestamps(trustDomain, builtChain,

Same here.

::: security/certverifier/MultiLogCTVerifier.cpp:205
(Diff revision 1)
>      return rv;
>    }
>  
> +  // |sct.timestamp| is measured in milliseconds since the epoch,
> +  // ignoring leap seconds. When converting it to a second-level precision
> +  // pkix::Time, we round it up to the closest second for safety.

What's the concern here? (As in, why are we rounding?)

::: security/certverifier/NSSCertDBTrustDomain.h:11
(Diff revision 1)
>  
>  #ifndef NSSCertDBTrustDomain_h
>  #define NSSCertDBTrustDomain_h
>  
>  #include "CertVerifier.h"
> +#include "OCSPCache.h"

What's this for?

::: security/certverifier/NSSCertDBTrustDomain.cpp:416
(Diff revision 1)
>        return stapledOCSPResponseResult;
>      }
>    } else if (endEntityOrCA == EndEntityOrCA::MustBeEndEntity) {
>      // no stapled OCSP response
>      mOCSPStaplingStatus = CertVerifier::OCSP_STAPLING_NONE;
> +    mSCTListFromOCSPStapling.reset();

I don't understand why it's necessary to ever reset mSCTListFromOCSPStapling if we do it in ResetOCSPStaplingStatus as well.

::: security/certverifier/NSSCertDBTrustDomain.cpp:827
(Diff revision 1)
>    mBuiltChain = Move(certList);
>  
> +  // If CT extensions are available and chain building succeeds,
> +  // mSCTListFromCertificate will be set later by NoteAuxiliaryExtension
> +  // callback.
> +  mSCTListFromCertificate.reset();

I think we should move this to ResetOCSPStaplingStatus and rename it to ResetAccumulatedState or something.

::: security/certverifier/NSSCertDBTrustDomain.cpp:992
(Diff revision 1)
> +  }
> +  return result;
> +}
> +
> +Input
> +NSSCertDBTrustDomain::GetSCTListFromCertificate() const

I think it would be better to have these two functions return references to UniqueSECItems so the ownership of the data is more clear.

::: security/certverifier/NSSCertDBTrustDomain.cpp:1014
(Diff revision 1)
> +    case AuxiliaryExtension::EmbeddedSCTList:
> +      out = &mSCTListFromCertificate;
> +      break;
> +    case AuxiliaryExtension::SCTListFromOCSPResponse:
> +      out = &mSCTListFromOCSPStapling;
> +      break;

We should add a default case with MOZ_ASSERT_UNREACHABLE in it.

::: security/certverifier/moz.build:10
(Diff revision 1)
>  # file, You can obtain one at http://mozilla.org/MPL/2.0/.
>  
>  EXPORTS += [
>      'BRNameMatchingPolicy.h',
>      'CertVerifier.h',
> +    'CTLogVerifier.h',

It would be nice to limit the number of header files we're exposing here, but it's not a big deal if that involves moving lots of code around or adding cumbersome indirection.

::: security/manager/ssl/SSLServerCertVerification.cpp:1254
(Diff revision 1)
> +    GatherTelemetryForSingleSCT(sct, 1);
> +  }
> +  for (auto& sct : info.verifyResult.invalidScts) {
> +    GatherTelemetryForSingleSCT(sct, 2);
> +  }
> +  int32_t rootCABinId = mozilla::psm::RootCABinNumber(&cert->derCert);

Isn't this the end-entity certificate? We want the root here (which would be CERT_LIST_TAIL, I believe).

::: security/manager/ssl/SSLServerCertVerification.cpp:1650
(Diff revision 1)
>    // we currently only support single stapled responses
>    if (csa && csa->len == 1) {
>      stapledOCSPResponse = &csa->items[0];
>    }
>  
> +  const SECItem *sctsFromTLSExtension = SSL_PeerSignedCertTimestamps(fd);

nit: 'const SECItem* scts...'

::: toolkit/components/telemetry/Histograms.json:7717
(Diff revision 1)
> +    "alert_emails": ["seceng-telemetry@mozilla.com"],
> +    "expires_in_version": "never",
> +    "kind": "enumerated",
> +    "n_values": 99,
> +    "bug_numbers": [1293231],
> +    "description": "***TODO: |n_values| should be 256 as for other _BY_CA histograms, but this is not allowed by default*** Number of valid Signed Certificate Timestamps by CA (see RootHashes.inc for names of CAs)"

I guess you have to ask a telemetry/privacy review peer for permission here?

::: toolkit/components/telemetry/Histograms.json:7725
(Diff revision 1)
> +    "alert_emails": ["seceng-telemetry@mozilla.com"],
> +    "expires_in_version": "never",
> +    "kind": "enumerated",
> +    "n_values": 50,
> +    "bug_numbers": [1293231],
> +    "description": "*** TODO: finalize |n_values| *** Histogram of valid Signed Certificate Timestamps per connection, from all sources (embedded / OCSP Stapling / TLS handshake). Bucket 0 is kept for the cases where SCTs were received but none were valid."

Well, we could do a different kind of measure here - e.g. exponential? Or, are we really ever expecting more than a handful of SCTs per connection? We could probably get away with a maximum of 32 or something.
Attachment #8780225 - Flags: review?(dkeeler) → review-
Comment on attachment 8780225 [details]
Bug 1293231 - Certificate Transparency - basic telemetry reports;

https://reviewboard.mozilla.org/r/70956/#review70146

Forgive the drive by review - I was skimming the patch and noticed some issues.

::: netwerk/base/security-prefs.js:14
(Diff revision 1)
>  pref("security.tls.unrestricted_rc4_fallback", false);
>  
>  pref("security.ssl.treat_unsafe_negotiation_as_broken", false);
>  pref("security.ssl.require_safe_negotiation",  false);
>  pref("security.ssl.enable_ocsp_stapling", true);
> +pref("security.ssl.enable_certificate_transparency", true);

https://wiki.mozilla.org/PKI:CT says CT will be "off by default". Has this changed?

::: security/certverifier/CertVerifier.cpp:164
(Diff revision 1)
> +void
> +CertVerifier::LoadKnownCTLogs()
> +{
> +  for (const CTLogInfo& log : kCTLogList) {
> +    Input publicKey;
> +    if (publicKey.Init(reinterpret_cast<const uint8_t*>(log.logKey),

Please use ```mozilla::BitwiseCast<const uint8_t*, [expected original type]>(log.logKey)``` here instead.
It's slightly safer, and makes it more visually obvious that the cast is reasonable.
Comment on attachment 8780225 [details]
Bug 1293231 - Certificate Transparency - basic telemetry reports;

https://reviewboard.mozilla.org/r/70956/#review69970

> I'm not seeing CTLogInfo or kCTLogList - are these in another patch?

Added the missing CTKnownLogs.h file.

> Maybe we should first check the length of sctsFromTLSSECItem and not even bother calling Init if we know it will fail.

Agreed, it would be better. But the problem is that the maximum length an Input can accept is not explicitly available (see https://dxr.mozilla.org/mozilla-central/source/security/pkix/include/pkix/Input.h#93).

> We should only be returning fatal errors from VerifySignedCertificateTimestamps, right? So I think we would want to check for and return errors here.

Fixed, it is now handled similarly to IsCertChainRootBuiltInRoot which can also return fatal errors.

> What's the concern here? (As in, why are we rounding?)

Updated the comment in the code. We convert the time from millisecond precision to second, so we need to round either up or down. Theoretically, rounding up in this case is "more secure" since it can't pass invalid (future) SCTs, only block valid ones. Practically it probably does not matter how we round. Since any rounding looks suspicious in the code, I wanted to add a comment explaining why it's done, and so had to do it the "right" way.

> I think it would be better to have these two functions return references to UniqueSECItems so the ownership of the data is more clear.

I've added some docs on the data ownership for these functions. Let's keep it as Inputs? It is more convenient for class clients to have the data returned as Inputs. Also, the fact that the data is held in SECItems is actually an implementation detail (the original data is returned as Inputs by pkix and kept in SECItems internally).

> I guess you have to ask a telemetry/privacy review peer for permission here?

Removed this probe for now.

> Well, we could do a different kind of measure here - e.g. exponential? Or, are we really ever expecting more than a handful of SCTs per connection? We could probably get away with a maximum of 32 or something.

The maximum is now 10.
Comment on attachment 8780225 [details]
Bug 1293231 - Certificate Transparency - basic telemetry reports;

https://reviewboard.mozilla.org/r/70956/#review70146

Thanks for caring! The issue is fixed, also see the clarification below.

> https://wiki.mozilla.org/PKI:CT says CT will be "off by default". Has this changed?

No, that hasn't changed. The setting was badly named. We currently do not enforce any CT-related policies on certificates, only collect telemetry information. There is no CT qualification of certificates, and the new code does not affect certificate validation. We do turn on a flag on the TLS handshake which asks servers to send SCTs via TLS if available, but besides that there are no visible changes.

I've renamed the setting and changed the type to enum as David suggested. It currently only has two options, "telemetry" and "disabled". The default is the "telemetry" mode. The user can change it to the "disabled" mode which turns off all the CT related functionality (which is the TLS flag and the processing of incoming SCTs to generate telemetry).
Comment on attachment 8780225 [details]
Bug 1293231 - Certificate Transparency - basic telemetry reports;

https://reviewboard.mozilla.org/r/70956/#review74180

Awesome - this is looking good. These are just my comments on the interdiff - I may have more comments on the overall patch. I'll let you know. Also, we should probably have another reviewer, so I'll ask Cykesiopka to do a more formal review.

::: netwerk/base/security-prefs.js:22
(Diff revisions 1 - 2)
>  pref("security.ssl.enable_alpn", true);
>  
> +// Configures Certificate Transparency support mode:
> +// 0: Fully disabled.
> +// 1: Only collect telemetry. CT qualification checks are not performed.
> +pref("security.ssl.certificate_transparency.mode", 1);

Actually, let's put this under "security.pki" instead of "security.ssl".

::: security/certverifier/CertVerifier.h:150
(Diff revisions 1 - 2)
>      ocspEVOnly = 2
>    };
>    enum OcspStrictConfig { ocspRelaxed = 0, ocspStrict };
>    enum OcspGetConfig { ocspGetDisabled = 0, ocspGetEnabled = 1 };
>  
> +  enum CertificateTransparencyMode {

Might as well make this an enum class, right?

::: security/certverifier/CertVerifier.cpp:162
(Diff revisions 1 - 2)
>  }
>  
>  void
>  CertVerifier::LoadKnownCTLogs()
>  {
> +  mCTVerifier.reset(new mozilla::ct::MultiLogCTVerifier());

Why have this dynamically allocated?

::: security/certverifier/MultiLogCTVerifier.cpp:31
(Diff revisions 1 - 2)
> +    case SignedCertificateTimestamp::VerificationStatus::UnknownLog:
>        target = &result.unknownLogsScts;
>        break;
> -    case SCTVerifyStatus::Invalid:
> +    case SignedCertificateTimestamp::VerificationStatus::InvalidSignature:
> +    case SignedCertificateTimestamp::VerificationStatus::InvalidTimestamp:
>        target = &result.invalidScts;

I think it might be simpler to keep these in separate SCTLists (which would make it easier to tell them apart later, when we're accumulating telemetry).

::: security/certverifier/CTKnownLogs.h:8
(Diff revision 2)
> +/* This Source Code Form is subject to the terms of the Mozilla Public
> + * License, v. 2.0. If a copy of the MPL was not distributed with this
> + * file, You can obtain one at http://mozilla.org/MPL/2.0/. */
> +
> +/* This file is generated by print_log_list.py from
> + * https://github.com/google/certificate-transparency/ */

I think we should include the generator for this file (as well as its inputs). It can be based on the original (depending on its license, I guess), but we might want to fix the formatting (see below).

::: security/certverifier/CTKnownLogs.h:10
(Diff revision 2)
> + * file, You can obtain one at http://mozilla.org/MPL/2.0/. */
> +
> +/* This file is generated by print_log_list.py from
> + * https://github.com/google/certificate-transparency/ */
> +
> +

style nit: let's not have an extra blank line here

::: security/certverifier/CTKnownLogs.h:24
(Diff revision 2)
> +  const char* const logName;
> +  const char* const logUrl;
> +};
> +
> +const CTLogInfo kCTLogList[] = {
> +    {"\x30\x59\x30\x13\x06\x07\x2a\x86\x48\xce\x3d\x02\x01\x06\x08\x2a\x86"

formatting nit: I would think this should be formatted more like so:

const CTLogInfo kCTLogList[] = {
  { "\x30..."
    "\x48..."
    ...
    91,
    "Google 'Pilot' log",
    "https://ct.googleapis.com/pilot/" },

(so 2 spaces of indentation for each opening { and a space after/before {/})

::: security/certverifier/CTKnownLogs.h:137
(Diff revision 2)
> +     "\x6f\xdf\x3c\x2c\x43\x57\xa1\x47\x0c\x91\x04\xf4\x75\x4d\xda\x89\x81"
> +     "\xa4\x14\x06\x34\xb9\x98\xc3\xda\xf1\xfd\xed\x33\x36\xd3\x16\x2d\x35"
> +     "\x02\x03\x01\x00\x01",
> +     294,
> +     "CNNIC CT log",
> +     "https://ctserver.cnnic.cn/"}};

nit: the final }; should be on its own line, and let's have a blank line after that

::: security/certverifier/CTKnownLogs.h:146
(Diff revision 2)
> +    "\x51\x5d\x67\x93\xd4\x44\xd1\x0a\x67\xac\xbb\x4f\x4f\xfb\xc4",
> +    "\xa4\xb9\x09\x90\xb4\x18\x58\x14\x87\xbb\x13\xa2\xcc\x67\x70\x0a\x3c"
> +    "\x35\x98\x04\xf9\x1b\xdf\xb8\xe3\x77\xcd\x0e\xc8\x0d\xdc\x10",
> +    "\xee\x4b\xbd\xb7\x75\xce\x60\xba\xe1\x42\x69\x1f\xab\xe1\x9e\x66\xa3"
> +    "\x0f\x7e\x5f\xb0\x72\xd8\x83\x00\xc4\x7b\x89\x7a\xa8\xfd\xcb"};
> +

nit: unnecessary blank line

::: security/certverifier/CTSerialization.cpp:512
(Diff revision 2)
>    if (rv != Success) {
>      return rv;
>    }
>    result.timestamp = timestamp;
>  
> +  result.origin = SignedCertificateTimestamp::Origin::Unknown;

Should we just do this in SignedCertificateTimestamp's constructor?

::: security/certverifier/SignedCertificateTimestamp.h:113
(Diff revision 2)
> +    // Such SCT are considered invalid (see RFC 6962, Section 5.2).
> +    InvalidTimestamp
> +  };
> +
>    Origin origin;
> +  VerificationStatus verificationStatus;

It seems unnecessary to have both this verification status and to separate out the SCTs into different lists when verifying them. We should do one or the other (as I mentioned above, I think it would be simpler to just keep one more kind of SCTList).
Attachment #8780225 - Flags: review?(dkeeler) → review-
Comment on attachment 8780225 [details]
Bug 1293231 - Certificate Transparency - basic telemetry reports;

https://reviewboard.mozilla.org/r/70956/#review75138

Sorry for the delay.

I only have a few minor comments, but since keeler has asked for things like the script for generating CTKnownLogs.h, I'll just clear the review flag for now.

::: security/certverifier/CertVerifier.h:152
(Diff revision 2)
>    enum OcspStrictConfig { ocspRelaxed = 0, ocspStrict };
>    enum OcspGetConfig { ocspGetDisabled = 0, ocspGetEnabled = 1 };
>  
> +  enum CertificateTransparencyMode {
> +    Disabled = 0,
> +    TelemetryOnly = 1

Nit: Add a trailing comma here.

::: security/certverifier/CertVerifier.h:182
(Diff revision 2)
> +  UniquePtr<mozilla::ct::MultiLogCTVerifier> mCTVerifier;
> +
> +  void LoadKnownCTLogs();
> +  mozilla::pkix::Result VerifySignedCertificateTimestamps(
> +                     NSSCertDBTrustDomain& trustDomain,
> +                     UniqueCERTCertList& builtChain,

This should probably be ```const UniqueCERTCertList& chain``` or something similar instead.
In particular, the lack of ```const``` and the usually reserved for outparams ```builtChain``` name makes it seem like the method modifies ```builtChain```, when it doesn't at all.

::: security/certverifier/CertVerifier.cpp:162
(Diff revision 2)
>  }
>  
> +void
> +CertVerifier::LoadKnownCTLogs()
> +{
> +  mCTVerifier.reset(new mozilla::ct::MultiLogCTVerifier());

Nit: If for some reason ```mCTVerifier``` does need to stay dynamically allocated, this line would look nicer as ```mCTVerifier = MakeUnique<mozilla::ct::MultiLogCTVerifier>();```.

::: security/certverifier/CertVerifier.cpp:168
(Diff revision 2)
> +  for (const CTLogInfo& log : kCTLogList) {
> +    Input publicKey;
> +    Result rv = publicKey.Init(
> +      BitwiseCast<const uint8_t*, const char*>(log.logKey), log.logKeyLength);
> +    if (rv != Success) {
> +      MOZ_ASSERT_UNREACHABLE("Failed reading a log key for a known CT Log");

This should have a corresponding ```#include "mozilla/Assertions.h"```.
Same everywhere else new uses of this are being added.

::: security/certverifier/NSSCertDBTrustDomain.cpp:967
(Diff revision 2)
>  void
> -NSSCertDBTrustDomain::NoteAuxiliaryExtension(AuxiliaryExtension /*extension*/,
> +NSSCertDBTrustDomain::ResetAccumulatedState()
> -                                             Input /*extensionData*/)
>  {
> +  mOCSPStaplingStatus = CertVerifier::OCSP_STAPLING_NEVER_CHECKED;
> +  mSCTListFromOCSPStapling.reset();

We tend to use ```= nullptr;``` more, but this is fine as well.

::: security/manager/ssl/SSLServerCertVerification.cpp:768
(Diff revision 2)
>    const uint32_t mProviderFlags;
>    const Time mTime;
>    const PRTime mPRTime;
>    const TimeStamp mJobStartTime;
>    const UniqueSECItem mStapledOCSPResponse;
> +  const UniqueSECItem mSctsFromTLSExtension;

Nit: s/mSct/mSCT/.

::: security/manager/ssl/SSLServerCertVerification.cpp:1253
(Diff revision 2)
> +    return;
> +  }
> +
> +  // The integer argument to GatherTelemetryForSingleSCT below is the
> +  // verification status. See SSL_SCTS_VERIFICATION_STATUS in Histograms.json.
> +  for (auto& sct : info.verifyResult.verifiedScts) {

Nit: I would argue that ```auto``` should not be used here - it's not obvious from context what the type of ```sct``` is.
Same below.

::: security/manager/ssl/nsNSSComponent.cpp:1505
(Diff revision 2)
>    bool ocspMustStapleEnabled = Preferences::GetBool("security.ssl.enable_ocsp_must_staple",
>                                                      true);
>    PublicSSLState()->SetOCSPMustStapleEnabled(ocspMustStapleEnabled);
>    PrivateSSLState()->SetOCSPMustStapleEnabled(ocspMustStapleEnabled);
>  
> +  CertVerifier::CertificateTransparencyMode ctMode =

We probably want to ensure this value is actually valid like done for ```sha1Mode```. IIUC, there's nothing to stop a user from setting the value of the pref to something completely random.

::: security/manager/ssl/nsNSSComponent.cpp:1508
(Diff revision 2)
>    PrivateSSLState()->SetOCSPMustStapleEnabled(ocspMustStapleEnabled);
>  
> +  CertVerifier::CertificateTransparencyMode ctMode =
> +    static_cast<CertVerifier::CertificateTransparencyMode>(
> +      Preferences::GetInt("security.ssl.certificate_transparency.mode",
> +                          CT_MODE_DEFAULT));

We can probably do something similar to ```static_cast<int32_t>(CertVerifier::SHA1Mode::Allowed)``` below and get rid of this constant.

::: security/manager/ssl/nsNSSIOLayer.cpp:2438
(Diff revision 2)
> +  if (SECSuccess != SSL_OptionSet(fd, SSL_ENABLE_SIGNED_CERT_TIMESTAMPS,
> +      sctsEnabled)) {

Nit: I realise this is just following the convention in the surrounding code, but we usually don't do yoda comparisons like this in new code.
Attachment #8780225 - Flags: review?(cykesiopka.bmo)
Assignee: nobody → sergei.cv
Priority: P2 → P1
Whiteboard: [psm-backlog] → [psm-assigned]
Comment on attachment 8780225 [details]
Bug 1293231 - Certificate Transparency - basic telemetry reports;

https://reviewboard.mozilla.org/r/70956/#review74180

> Why have this dynamically allocated?

This is needed so we can drop the "#include"s of MultiLogCTVerifier.h and CTLogVerifier.h from CertVerifier.h. Since CertVerifier now only stores a pointer to MultiLogCTVerifier, there is no need to include its definition.

> I think it might be simpler to keep these in separate SCTLists (which would make it easier to tell them apart later, when we're accumulating telemetry).

Since we have 4 statuses already, I think we can just have a single list for all and use the |verificationStatus| field of each SCT.
I've changed it to be a single list, take a look?

> I think we should include the generator for this file (as well as its inputs). It can be based on the original (depending on its license, I guess), but we might want to fix the formatting (see below).

I undersand that Eran's team will take care of keeping the logs list in Firefox up to date. I've added a custom formatter for Firefox to the generator script, so we can tweak the output as needed. After we finalize the output format, I will send the patch to the CT team at Google.
The generator itself is a Python script which takes a json file downloaded from https://www.certificate-transparency.org/known-logs, validates signatures etc, and then generates the logs list file (CTKnownLogs.h in our case). I think it would be best to always use the latest version from Github (with the Firefox formatter), as maintained by the CT team.
Comment on attachment 8780225 [details]
Bug 1293231 - Certificate Transparency - basic telemetry reports;

https://reviewboard.mozilla.org/r/70956/#review75138

> Nit: I realise this is just following the convention in the surrounding code, but we usually don't do yoda comparisons like this in new code.

I think in this case consistency is important. "if (SECSuccess != NSS_Function())" pattern is used throughout the whole file, doing it differently for this specific case will unnecessarily stand out.
(In reply to Sergei Chernov from comment #12)
> Comment on attachment 8780225 [details]
> Bug 1293231 - Certificate Transparency - basic telemetry reports;
> 
> https://reviewboard.mozilla.org/r/70956/#review74180
> 
> > Why have this dynamically allocated?
> 
> This is needed so we can drop the "#include"s of MultiLogCTVerifier.h and
> CTLogVerifier.h from CertVerifier.h. Since CertVerifier now only stores a
> pointer to MultiLogCTVerifier, there is no need to include its definition.

Ok, sounds good.

> > I think it might be simpler to keep these in separate SCTLists (which would make it easier to tell them apart later, when we're accumulating telemetry).
> 
> Since we have 4 statuses already, I think we can just have a single list for
> all and use the |verificationStatus| field of each SCT.
> I've changed it to be a single list, take a look?

Yep - this looks good.

> > I think we should include the generator for this file (as well as its inputs). It can be based on the original (depending on its license, I guess), but we might want to fix the formatting (see below).
> 
> I undersand that Eran's team will take care of keeping the logs list in
> Firefox up to date. I've added a custom formatter for Firefox to the
> generator script, so we can tweak the output as needed. After we finalize
> the output format, I will send the patch to the CT team at Google.
> The generator itself is a Python script which takes a json file downloaded
> from https://www.certificate-transparency.org/known-logs, validates
> signatures etc, and then generates the logs list file (CTKnownLogs.h in our
> case). I think it would be best to always use the latest version from Github
> (with the Firefox formatter), as maintained by the CT team.

For any generated file we check in to the tree, I think it's best to also check in the file that generated it (and perhaps even the data). That way, we have a historical record of how the data was created for each version of the generated file. This is how we import e.g. the HSTS and HPKP preload lists.
Some notes regarding the latest patch (following discussion with David and Eran):

1. The "SCTs per connection" telemetry histogram now counts the total number of SCTs available per connection (not just the validated ones).

2. The generator script for the Known Logs header file will be included later; a separate bug will be filed.
Telemetry Review
================

We'd like to add 3 new telemetry probes as part of implementing Certificate Transparency (CT) support in Firefox (see https://wiki.mozilla.org/PKI:CT).

The probes will be documented in Histograms.json - see the diff at https://reviewboard.mozilla.org/r/70956/diff/4#23

The new probes will allow us to evaluate the adoption of CT in the wild and will help to decide on the future plans for CT support in Firefox. At this stage, we are interested in the following data regarding Signed Certificate Timestamps (SCTs) sent as part of SSL connections:
1. How many SSL connections include SCTs.
2. How many SCTs a typical SSL connection includes.
3. Popularity of various channels for supplying SCTs (certificate embedding / TLS handshake / OCSP stapling).

Thanks!
Flags: needinfo?(benjamin)
Comment on attachment 8780225 [details]
Bug 1293231 - Certificate Transparency - basic telemetry reports;

https://reviewboard.mozilla.org/r/70956/#review77756

Looks good!

::: security/certverifier/CTKnownLogs.h:59
(Diff revision 4)
> +  { "Certly.IO log",
> +    "https://log.certly.io/",
> +    "\x30\x59\x30\x13\x06\x07\x2a\x86\x48\xce\x3d\x02\x01\x06\x08\x2a\x86\x48"
> +    "\xce\x3d\x03\x01\x07\x03\x42\x00\x04\x0b\x23\xcb\x85\x62\x98\x61\x48\x04"
> +    "\x73\xeb\x54\x5d\xf3\xd0\x07\x8c\x2d\x19\x2d\x8c\x36\xf5\xeb\x8f\x01\x42"
> +    "\x0a\x7c\x98\x26\x27\xc1\xb5\xdd\x92\x93\xb0\xae\xf8\x9b\x3d\x0c\xd8\x4c"
> +    "\x4e\x1d\xf9\x15\xfb\x47\x68\x7b\xba\x66\xb7\x25\x9c\xd0\x4a\xc2\x66\xdb"
> +    "\x48",
> +    91 },
> +  { "Izenpe log",
> +    "https://ct.izenpe.com/",
> +    "\x30\x59\x30\x13\x06\x07\x2a\x86\x48\xce\x3d\x02\x01\x06\x08\x2a\x86\x48"
> +    "\xce\x3d\x03\x01\x07\x03\x42\x00\x04\x27\x64\x39\x0c\x2d\xdc\x50\x18\xf8"
> +    "\x21\x00\xa2\x0e\xed\x2c\xea\x3e\x75\xba\x9f\x93\x64\x09\x00\x11\xc4\x11"
> +    "\x17\xab\x5c\xcf\x0f\x74\xac\xb5\x97\x90\x93\x00\x5b\xb8\xeb\xf7\x27\x3d"
> +    "\xd9\xb2\x0a\x81\x5f\x2f\x0d\x75\x38\x94\x37\x99\x1e\xf6\x07\x76\xe0\xee"
> +    "\xbe",
> +    91 },

According to https://www.chromium.org/Home/chromium-security/certificate-transparency , these logs are no longer qualified in Chromium.
Do we care about this distinction on the Mozilla side?

::: security/certverifier/CertVerifier.h:179
(Diff revision 4)
>    const NetscapeStepUpPolicy mNetscapeStepUpPolicy;
> +  const CertificateTransparencyMode mCTMode;
>  
>  private:
>    OCSPCache mOCSPCache;
> +  UniquePtr<mozilla::ct::MultiLogCTVerifier> mCTVerifier;

Consider putting in a comment what you wrote on why this is dynamically allocated.
Attachment #8780225 - Flags: review?(cykesiopka.bmo) → review+
Comment on attachment 8780225 [details]
Bug 1293231 - Certificate Transparency - basic telemetry reports;

https://reviewboard.mozilla.org/r/70956/#review78232

Great - r=me with comment addressed.

::: security/certverifier/CertVerifier.cpp:289
(Diff revisions 2 - 4)
> +          MOZ_ASSERT_UNREACHABLE("Unexpected SCT verificationStatus");
> +      }
> +    }
> -  MOZ_LOG(gCertVerifierLog, LogLevel::Debug,
> +    MOZ_LOG(gCertVerifierLog, LogLevel::Debug,
> -          ("SCT verification result: "
> +            ("SCT verification result: "
> -           "verified=%zu invalid=%zu unknownLog=%zu decodingErrors=%zu\n",
> +             "verified=%zu unknownLog=%zu invalid=%zu decodingErrors=%zu\n",

It would be good if these counts corresponded to the telemetry we're gathering (i.e. separate out invalid signature from invalid timestamp).
Attachment #8780225 - Flags: review?(dkeeler) → review+
Comment on attachment 8780225 [details]
Bug 1293231 - Certificate Transparency - basic telemetry reports;

https://reviewboard.mozilla.org/r/70956/#review77756

> According to https://www.chromium.org/Home/chromium-security/certificate-transparency , these logs are no longer qualified in Chromium.
> Do we care about this distinction on the Mozilla side?

Excellent note! Currently the logs list is only used for telemetry, so it's OK to include the disqualified logs. To implement actual CT policies on certificates, we will need to add some special treatment for these logs. A disqualified log can only be used for SCTs issued before the disqualification time (which we will need to add to the header file).
I've fixed the MOZ_LOG in CertVerifier.cpp and added a comment in CertVerifier.h. Also rebased the patch before pushing - sorry for the mess it made in Review Board.
Comment on attachment 8780225 [details]
Bug 1293231 - Certificate Transparency - basic telemetry reports;

Only question here is about the "never" expiring. It seems pretty clear that you want to explore this data for a while, so it's pretty automatic to collect it for 6 or even 12 months. But if you're collecting it forever, you need to have an explicit plan for how that will provide user value and what kinds of dashboards and alerting you're going to have to keep providing that value.

So data-review=me with expires: 56. Otherwise please re-request feedback providing details about your long-term plan.
Flags: needinfo?(benjamin)
Attachment #8780225 - Flags: feedback+
We are currently planning to implement several time limited telemetry probes in the near future, but the probes in this patch need to be present as long as Certificate Transparency is supported by Firefox (even at the basic level). The data collected measures the CT ecosystem, and needs to be available and up-to-date. It will serve as a baseline for defining new CT enforcement policies (such as "require at least 2 SCTs on EV certificates"). There is no deadline for the "final" CT policy, but it will take years as CT ecosystem is evolving, and we need the baseline data to be available in the meantime.

To be more specific:
1. SSL_SCTS_ORIGIN - some SCTs distribution channels (OCSP stapling and TLS extension) require special server support and configuration, and we'd like to measure the current state of that support.
2. SSL_SCTS_PER_CONNECTION - how many SCTs one can expect from a typical SSL connection.
3. SSL_SCTS_VERIFICATION_STATUS - a spike in the number of SCTs from unknown log servers, SCTs that fail to validate or SCT decoding errors - all these will indicate that we should spend some effort to investigate. It's probably good to have an alert for these, but it is hard to define it at this stage.

Let me know if this makes sense.
Flags: needinfo?(benjamin)
As for dashboards, we'll build something similar to https://people.mozilla.org/~dkeeler/pinning-dashboard/ to track this data.
As proposed, this is only going to collect against the prerelease population (and whatever tiny fraction of release opts in). ISTM that in order to get the confidence you need, this should be based on the entire release population (or a random sampling of everyone). That's important because we know prerelease is biased in some important ways: very few enterprise users, generally more technical, and generally on newer hardware.

That admittedly requires a little more work for dashbaording because t.m.o doesn't aggregate release data by default due to cost. Thoughts?
Flags: needinfo?(benjamin) → needinfo?(dkeeler)
Do you have any data on what fraction of the release population opts into reporting telemetry?

Even if it's a small population, we'd still get telemetry on CT adoption, as all users are expected to encounter sites that serve SCTs (Google's sites do; almost all sites that have EV certs do - including bugzilla.mozilla.org itself).
~1.5% - but this is unified teleemtry, so meausring this on the entire population is a simple matter on the client of adding releaseChannelCollection: opt-out. The pain there is then doing the analysis.
Attachment #8780225 - Flags: review?(eranm)
(In reply to Benjamin Smedberg [:bsmedberg] from comment #28)
> As proposed, this is only going to collect against the prerelease population
> (and whatever tiny fraction of release opts in). ISTM that in order to get
> the confidence you need, this should be based on the entire release
> population (or a random sampling of everyone).

Good point. How does the random sampling work? It seems like we could get what we want with that (I don't think we need to collect this telemetry on 100% of users).
Flags: needinfo?(dkeeler)
We don't have random sampling right now. It's in the backlog but likely won't be ready in Q4. So we have the following options here:

* do opt-in/prerelease only and live with skew
* do 100% of release (acceptable from a data perspective, though perhaps bloaty)
* Get assistance to implement telemetry sampling
* implement custom sampling for this specific case (yuck)

I'm ok with any of these except perhaps the last.
While it would be nice to have custom dashboards, it is good enough to be able to look at aggregated metrics for specific date ranges (e.g. the last 30 days) to evaluate CT adoption. Something like https://mzl.la/2cUIRV9 . This looks even better: https://mzl.la/2cUJjmh

We can view our data like this if collected with "releaseChannelCollection: opt-out", right? Benjamin, if it's OK by you, I think that would be our best option.
No you can't, because telemetry.mozilla.org aggregates data only from opt-in users (currently).

That is on the docket for Q4 improvements though!
Sounds great! So let's go with the "100% of release" option and wait for the telemetry improvements to be able to view the full data? Our target release is somewhere in Q1 2017 anyway.
1. Added "releaseChannelCollection: opt-out" to our 3 probes, see https://reviewboard.mozilla.org/r/70956/diff/5-6#1

2. Made a cosmetic change to SSLServerCertVerification.cpp (moved a "switch" around).

Seems like we are ready to land. Thanks everyone!
Keywords: checkin-needed
has problems to apply:

applying b8c687b8de15
patching file config/external/nss/nss.symbols
Hunk #1 FAILED at 672
1 out of 1 hunks FAILED -- saving rejects to file config/external/nss/nss.symbols.rej
patching file security/certverifier/CertVerifier.cpp
Hunk #3 succeeded at 125 with fuzz 2 (offset 0 lines).
patching file security/certverifier/CertVerifier.h
Hunk #4 FAILED at 146
1 out of 4 hunks FAILED -- saving rejects to file security/certverifier/CertVerifier.h.rej
patching file security/manager/ssl/nsNSSIOLayer.cpp
Hunk #2 succeeded at 2513 with fuzz 1 (offset 15 lines).
patch failed to apply
abort: fix up the working directory and run hg transplant --continue
Flags: needinfo?(sergei.cv)
Keywords: checkin-needed
Rebased onto rev 315323	(66a77b9bfe5d) and pushed ("hg push review"). Attaching a full patch file here just in case.
Flags: needinfo?(sergei.cv)
Keywords: checkin-needed
Pushed by ryanvm@gmail.com:
https://hg.mozilla.org/integration/autoland/rev/50143dbdcb47
Certificate Transparency - basic telemetry reports; r=Cykesiopka,keeler
Keywords: checkin-needed
https://hg.mozilla.org/mozilla-central/rev/50143dbdcb47
Status: NEW → RESOLVED
Closed: 3 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla52
Depends on: 1353216
Flags: needinfo?(dkeeler)
this change causes performance regression(bug 1353216)
We'll address this in bug 1353216.
Flags: needinfo?(dkeeler)
Duplicate of this bug: 1356099
You need to log in before you can comment on or make changes to this bug.