Closed Bug 1651132 Opened 4 years ago Closed 4 years ago

T-Systems / DFN-PKI: 42 certificates with RSA modulus size in bits not divisible by 8

Categories

(CA Program :: CA Certificate Compliance, task)

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: brauckmann, Assigned: brauckmann)

Details

(Whiteboard: [ca-compliance] [ov-misissuance])

Steps to reproduce:

As of 2020-07-06, DFN-PKI had 42 valid certificates with RSA modulus sizes (in bits) that are not divisible by 8. This is a violation of section 5.1 of the Mozilla Root Store Policy.

The affected certificates will be revoked within the 5-day deadline, by 2020-07-11 11:00 CEST.

  1. How your CA first became aware of the problem (e.g. via a problem report submitted to your Problem Reporting Mechanism, a discussion in mozilla.dev.security.policy, a Bugzilla bug, or internal self-audit), and the time and date.

On Saturday 2020-07-04 at 12:16 CEST, we were notified by e-mail to a non-emergency mail address about zlint ERRORs on 3 certificates showing up in crt.sh. We took notice of the mail on Monday 2020-07-06 at approx. 11:00 CEST. As the notification was sent to a non-emergency address, we take the latter date as the basis for the revocation deadline.

  2. A timeline of the actions your CA took in response. A timeline is a date-and-time-stamped sequence of all relevant events. This may include events before the incident was reported, such as when a particular requirement became applicable, or a document changed, or a bug was introduced, or an audit was done.

2020-07-06 approx 11:00 CEST: Non-emergency mail was read
2020-07-06 11:13 CEST: Problem was escalated, confirmed violation of Mozilla policy
2020-07-06 12:41 CEST: Found 12 more certificates via crt.sh linter going back to beginning of 2018, started reaching out to customers to revoke.
2020-07-06 approx 15:00 CEST: Started to develop method to search own database.

2020-07-07 11:53 CEST: Found further certificates in own database, started reaching out to those customers.
2020-07-07 15:00 CEST: Software fix installed

  3. Whether your CA has stopped, or has not yet stopped, issuing certificates with the problem. A statement that you have will be considered a pledge to the community; a statement that you have not requires an explanation.

A software fix was installed on 2020-07-07 at 15:00 CEST, one day after we took notice of the problem.

  4. In a case involving certificates, a summary of the problematic certificates. For each problem: the number of certificates, and the date the first and last certificates with that problem were issued. In other incidents that do not involve enumerating the affected certificates (e.g. OCSP failures, audit findings, delayed responses, etc.), please provide other similar statistics, aggregates, and a summary for each type of problem identified. This will help us measure the severity of each problem.

42 certificates.

First date 2017-05-15, last date 2020-07-02

(The "first date" lists the first certificate that was unrevoked and unexpired as of 2020-07-06)
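The selection criteria above (modulus size not divisible by 8, unrevoked and unexpired as of the cut-off date) can be sketched as a simple filter. This is only an illustration: the `CertRecord` type, its field names, and the sample records are hypothetical and do not describe the actual DFN-PKI database.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass
class CertRecord:
    """Hypothetical row of a CA's certificate database."""
    serial: str
    modulus_bits: int                 # RSA modulus size in bits
    not_after: date                   # expiry date
    revoked_on: Optional[date] = None


def find_affected(records: List[CertRecord], as_of: date) -> List[CertRecord]:
    """Return certificates that were valid on `as_of` and whose
    modulus size in bits is not divisible by 8."""
    return [
        r for r in records
        if r.modulus_bits % 8 != 0
        and r.not_after >= as_of
        and (r.revoked_on is None or r.revoked_on > as_of)
    ]


records = [
    CertRecord("01", 2048, date(2021, 1, 1)),                    # compliant
    CertRecord("02", 2047, date(2021, 1, 1)),                    # affected
    CertRecord("03", 4095, date(2020, 1, 1)),                    # already expired
    CertRecord("04", 2047, date(2021, 1, 1), date(2020, 6, 1)),  # already revoked
]
result = find_affected(records, as_of=date(2020, 7, 6))
print([r.serial for r in result])  # ['02']
```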

  5. In a case involving certificates, the complete certificate data for the problematic certificates. The recommended way to provide this is to ensure each certificate is logged to CT and then list the fingerprints or crt.sh IDs, either in the report or as an attached spreadsheet, with one list per distinct problem. In other cases not involving a review of affected certificates, please provide other similar, relevant specifics, if any.

Affected server certificates (41):

https://crt.sh/?id=2512694351&opt=zlint
https://crt.sh/?id=3014942337&opt=zlint
https://crt.sh/?id=2147495171&opt=zlint
https://crt.sh/?id=1153611094&opt=zlint
https://crt.sh/?id=1153416396&opt=zlint
https://crt.sh/?id=1153481516&opt=zlint
https://crt.sh/?id=1113730841&opt=zlint
https://crt.sh/?id=1165480779&opt=zlint
https://crt.sh/?id=1609059405&opt=zlint
https://crt.sh/?id=1609042461&opt=zlint
https://crt.sh/?id=1609042463&opt=zlint
https://crt.sh/?id=1540254449&opt=zlint
https://crt.sh/?id=2988727149&opt=zlint
https://crt.sh/?id=138259114&opt=zlint
https://crt.sh/?id=140554771&opt=zlint
https://crt.sh/?id=2988727149&opt=zlint
https://crt.sh/?id=917937881&opt=zlint
https://crt.sh/?id=961531457&opt=zlint
https://crt.sh/?id=1375513950&opt=zlint
https://crt.sh/?id=291973249&opt=zlint
https://crt.sh/?id=1576514183&opt=zlint
https://crt.sh/?id=1597912791&opt=zlint
https://crt.sh/?id=308799292&opt=zlint
https://crt.sh/?id=859161026&opt=zlint
https://crt.sh/?id=770845579&opt=zlint
https://crt.sh/?id=301899504&opt=zlint
https://crt.sh/?id=1435941257&opt=zlint
https://crt.sh/?id=1392514898&opt=zlint
https://crt.sh/?id=3028729702&opt=zlint
https://crt.sh/?id=3028723093&opt=zlint
https://crt.sh/?id=2706962952&opt=zlint
https://crt.sh/?id=1707813510&opt=zlint
https://crt.sh/?id=574121979&opt=zlint
https://crt.sh/?id=574121977&opt=zlint
https://crt.sh/?id=606471495&opt=zlint
https://crt.sh/?id=696228361&opt=zlint
https://crt.sh/?id=1640510197&opt=zlint
https://crt.sh/?id=1445638919&opt=zlint
https://crt.sh/?id=2080518408&opt=zlint
https://crt.sh/?id=621410092&opt=zlint
https://crt.sh/?id=1275993810&opt=zlint

One user certificate, not listed for privacy reasons, with
SHA256 Fingerprint=BB:E6:C6:D0:9B:CC:89:70:38:14:14:3E:19:47:4E:4C:43:8B:90:FC:97:F3:DE:17:ED:A6:A7:D7:34:E4:76:53
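For readers who want to match a certificate they hold against a fingerprint in this format, the following is a minimal sketch. It assumes the certificate is available as DER-encoded bytes; the function name is illustrative only.

```python
import hashlib


def sha256_fingerprint(der_bytes: bytes) -> str:
    """Return the SHA-256 digest of DER-encoded certificate bytes as
    colon-separated upper-case hex pairs (the format used above)."""
    digest = hashlib.sha256(der_bytes).hexdigest().upper()
    return ":".join(digest[i:i + 2] for i in range(0, len(digest), 2))
```

Comparing the returned string with the fingerprint quoted in the report identifies the certificate without the certificate itself having to be published.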

  6. Explanation about how and why the mistakes were made or bugs introduced, and how they avoided detection until now.

We have missed the requirement "RSA keys whose modulus size in bits is divisible by 8" from section 5.1 of the Mozilla Root Store Policy ever since it was introduced. Changes to the Mozilla policy in recent years did not affect this line, so we did not detect the omission when reviewing policy updates.

Contributing factor (not meant as an excuse, just additional information that you might find useful): for pre-issuance tests we use cablint, which does not catch this issue. zlint would have detected it.

  7. List of steps your CA is taking to resolve the situation and ensure that such situation or incident will not be repeated in the future, accompanied with a binding timeline of when your CA expects to accomplish each of these remediation steps.

Affected certificates will be revoked by 2020-07-11 11:00 CEST.

A check for "RSA modulus size divisible by 8" is now implemented in our issuance software.

Transition to zlint for pre-issuance tests is on the development roadmap, but will not be implemented in 2020 due to development resource constraints.

A recheck of the Mozilla Root Store Policy for further missed requirements will be started and is expected to be finished by 2020-08-14.

Assignee: bwilson → brauckmann
Status: UNCONFIRMED → ASSIGNED
Type: defect → task
Ever confirmed: true
Whiteboard: [ca-compliance]

(In reply to Jürgen Brauckmann from comment #0)

  1. How your CA first became aware of the problem (e.g. via a problem report submitted to your Problem Reporting Mechanism, a discussion in mozilla.dev.security.policy, a Bugzilla bug, or internal self-audit), and the time and date.

On Saturday 2020-07-04 at 12:16 CEST, we were notified by e-mail to a non-emergency mail address about zlint ERRORs on 3 certificates showing up in crt.sh. We took notice of the mail on Monday 2020-07-06 at approx. 11:00 CEST. As the notification was sent to a non-emergency address, we take the latter date as the basis for the revocation deadline.

Could you be more precise here? This raises some concerns/red-flags, which is something that being more specific can help address, if there truly is nothing to worry about.

What address was it received to? What is the emergency address? How does this relate to your CP/CPS; that is, where is this address documented (that was used), and if it's the wrong one to use, where is the right address documented? This has material impact on the expected revocation date, because if this was a valid report, then the timer starts when the report was received.
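To make the impact on the revocation date concrete, here is a small worked calculation. It uses only the timestamps and the 5-day deadline stated in the report; treating CEST as a fixed UTC+2 offset is an assumption of this sketch.

```python
from datetime import datetime, timedelta, timezone

# CEST modelled as a fixed UTC+2 offset (assumption of this sketch).
CEST = timezone(timedelta(hours=2), "CEST")
REVOCATION_WINDOW = timedelta(days=5)

report_received = datetime(2020, 7, 4, 12, 16, tzinfo=CEST)  # mail arrived
report_read = datetime(2020, 7, 6, 11, 0, tzinfo=CEST)       # mail was read

deadline_from_receipt = report_received + REVOCATION_WINDOW
deadline_from_read = report_read + REVOCATION_WINDOW

print(deadline_from_receipt.isoformat())           # 2020-07-09T12:16:00+02:00
print(deadline_from_read.isoformat())              # 2020-07-11T11:00:00+02:00
print(deadline_from_read - deadline_from_receipt)  # 1 day, 22:44:00
```

Depending on which event starts the timer, the deadline therefore differs by almost two days.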

  6. Explanation about how and why the mistakes were made or bugs introduced, and how they avoided detection until now.

We have missed the requirement "RSA keys whose modulus size in bits is divisible by 8" from section 5.1 of the Mozilla Root Store Policy ever since it was introduced. Changes to the Mozilla policy in recent years did not affect this line, so we did not detect the omission when reviewing policy updates.

It's important that you acknowledge it was missed, but I don't see anything as part of the remediation plan that seeks to understand how or why it was missed, or how this will be resolved going forward. I understand you're doing a further review that is expected to take 5 weeks (!), but that only seems to attempt to see if other problems were missed.

Going forward, what plans or controls exist to ensure compliance?

  7. List of steps your CA is taking to resolve the situation and ensure that such situation or incident will not be repeated in the future, accompanied with a binding timeline of when your CA expects to accomplish each of these remediation steps.

I don't see any details as to what a "software fix" was, or why that's sufficient, or how that relates to root cause. The goal is to provide sufficient detail that another CA could reasonably implement the same check, in order to prevent the same problem. Please be very precise about what checks or changes you made, so that it's clear.

Flags: needinfo?(brauckmann)

Could you be more precise here?

The notification was sent to two personal mail addresses (to my own and to a colleague).

The correct mail address cert-problems@dfn.de is documented in our CP 1.3 and 4.9 https://www.pki.dfn.de/fileadmin/PKI/DFN-PKI_CP-EN.pdf

Going forward, what plans or controls exist to ensure compliance?

We are still investigating the root cause. Of course we do have controls in place that monitor changes in Mozilla's requirements. This specific requirement was added to Mozilla's root store policy 2.4 in February 2017. We are currently assessing our internal communication from that time to understand how this addition to Mozilla's policy could be overlooked, and we will adjust our process for monitoring changes in Mozilla's root store policy accordingly. Furthermore, we will assess whether similar problems exist in our controls for other sources of requirements.

To ensure that there are not any more requirements from Mozilla that we currently do not implement, we will re-check our compliance with every requirement. If any systemic problems in our current controls are identified, we will adjust these controls accordingly.

These actions will be finished and/or necessary changes implemented no later than 2020-08-14.

I don't see any details as to what a "software fix" was,

Our software did not check that incoming RSA public keys had a modulus whose size in bits is divisible by 8. Such a check was implemented, so that certificate requests with other modulus sizes can no longer be approved.
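A check of this kind can be sketched in a few lines. This is a minimal illustration, not the actual DFN-PKI implementation: it assumes the RSA modulus has already been extracted from the certificate request as a Python integer, and the function name is hypothetical.

```python
def modulus_size_ok(modulus: int) -> bool:
    """Return True if the RSA modulus size in bits is divisible by 8."""
    if modulus <= 0:
        raise ValueError("RSA modulus must be a positive integer")
    # int.bit_length() is the number of bits needed to represent the
    # value without leading zeros, i.e. the modulus size in bits.
    return modulus.bit_length() % 8 == 0


ok_modulus = (1 << 2047) | 1    # 2048 bits
bad_modulus = (1 << 2046) | 1   # 2047 bits
print(modulus_size_ok(ok_modulus))   # True
print(modulus_size_ok(bad_modulus))  # False
```

Measuring the integer value rather than the length of its encoding matters here: a 2047-bit modulus still occupies 256 bytes, so a check based on byte length alone would not catch it.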

or why that's sufficient, or how that relates to root cause.

Incomplete coverage of a requirements document led to incomplete checks in the issuance software.

Flags: needinfo?(brauckmann)

Thanks. I've set the Next Update to 2020-07-23 to make sure we check in on how progress is being made, because 5 weeks does sound like a long time to investigate. I'm hoping and expecting that with that much time, it will be an opportunity to provide a thorough and well-researched understanding about what went wrong, or how the processes can be and will be improved.

That said, it sounds like this is progressing in the right direction. Please provide an update on the 23rd to share more about the steps that have been taken, that things are on track, and any preliminary thoughts you may have.

Whiteboard: [ca-compliance] → [ca-compliance] - Next Update - 23-July 2020

Revocation of the affected certificates is now complete.

Steps that have been taken, as mentioned above:

  • affected certificates have been revoked
  • issuance software now checks for "modulus size in bits is divisible by 8"
  • investigation of the root cause and of measures to improve is in progress
  • recheck of compliance is also in progress

Preliminary thoughts:

We are still investigating the root cause. Of course we do have controls in place that monitor changes in Mozilla's requirements. This specific requirement was added to Mozilla's root store policy 2.4 in February 2017. We are currently assessing our internal communication from that time to understand how this addition to Mozilla's policy could be overlooked

Regarding this concrete case, we found that:

  • As we follow m.d.s.p, we took notice of the discussion around version 2.4 of the policy in 2016/beginning of 2017
  • There was a brief discussion of "modulus size divisible by 8" on m.d.s.p in a thread of more than 20 messages titled "Appropriate role for lists of algorithms and key sizes". We missed that brief discussion.
  • When reviewing the differences (Gerv specifically added https://github.com/mozilla/pkipolicy/compare/2.3...2.4 to the announcement), the reviewer missed "whose modulus size in bits is divisible by 8," from https://github.com/mozilla/pkipolicy/commit/3d18836f5f424049b32b2d2e2e2382cf58884b24
  • All changes from 2.3 to 2.4 were categorized as "we are doing this already" by the reviewer, except the need for an English version of all CP/CPS documents. All internal communication concentrated on this point.

Generally:

  • Review of changes to requirements documents is currently done by one person, who then concludes "all is fine" or "something needs to be done". This seems to be the crucial point in the process.

and we will adjust our process for monitoring changes in Mozilla's root store policy accordingly.

We are thinking about having such reviews done by two people in the future, or implementing a two-step process. I will update on this by 2020-08-14 at the latest.

Furthermore, we will assess if similar problems exist in our controls for other sources of requirements.

Yes, it's the same for all compliance requirements.

To ensure that there are not any more requirements from Mozilla that we currently do not implement, we will re-check our compliance with every requirement.

We are still in the process of checking the complete current policy 2.7. Update on this by 2020-08-14 at the latest.

Summary: DFN-PKI: 42 certificates with RSA modulus size in bits not divisible by 8 → T-Systems / DFN-PKI: 42 certificates with RSA modulus size in bits not divisible by 8

Thanks. This continues to be a valuable examination of the "how" things went wrong, and a good model for other CAs in how to think about things, as well as how to examine past relevant facts and data.

I'm updating the Next-Update date accordingly.

Whiteboard: [ca-compliance] - Next Update - 23-July 2020 → [ca-compliance] - Next Update - 14-August 2020

and we will adjust our process for monitoring changes in Mozilla's root store policy accordingly.
We are thinking about having such reviews done by two people in the future, or implementing a two-step process. I will update on this by 2020-08-14 at the latest.

We have modified our compliance plan to

  • require the person who's doing a review of changed requirements to
    document every single assessment about the relevance of changes for us
  • require another person to recheck those assessments.

The modified compliance plan is now in force.

We are still in the process of checking the complete current policy 2.7. Update on this by 2020-08-14 at the latest.

The check is completed. We identified two further points that needed improvement:

  • The requirement from 5.1.1 regarding the enforcement of SPKI
    structures with OID "rsaEncryption" was covered by our issuance
    software, but not by the acceptance tests. This was fixed. So, the
    requirement was fulfilled before and there was no compliance
    violation/misissuance, but it was an opportunity to detect gaps in test
    coverage.

  • Our CP is structured according to RFC 3647 with one exception: The sequence of sections 8.1 to 8.4 is mixed up and "8.4 Topics
    covered by assessment" from the RFC is currently called "Audited sectors".

We will update our CP to correct this issue by 2020-09-30 at the latest.
(Explanation for this rather long timeframe: next week we are meeting
again with our ETSI auditors, and we would like to handle any changes
they may request together with this issue in one go.)
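As an illustration of the kind of acceptance test mentioned above for the "rsaEncryption" SPKI requirement, here is a minimal sketch. It assumes that the policy requires the DER-encoded AlgorithmIdentifier of an RSA key to consist of the rsaEncryption OID (1.2.840.113549.1.1.1) followed by explicit NULL parameters; the hex constant should be verified against the current policy text before use. The function name is hypothetical, and extracting the AlgorithmIdentifier bytes from the SPKI is out of scope here.

```python
# DER: SEQUENCE (13 bytes) { OID 1.2.840.113549.1.1.1, NULL }
# Assumed expected encoding; verify against the policy text before use.
RSA_ENCRYPTION_ALGORITHM_ID = bytes.fromhex("300d06092a864886f70d0101010500")


def is_rsa_encryption_algorithm_id(encoded: bytes) -> bool:
    """Return True only for a byte-for-byte match with the expected encoding."""
    return encoded == RSA_ENCRYPTION_ALGORITHM_ID


# Same OID but with the NULL parameters omitted: must be rejected.
missing_null = bytes.fromhex("300b06092a864886f70d010101")
print(is_rsa_encryption_algorithm_id(RSA_ENCRYPTION_ALGORITHM_ID))  # True
print(is_rsa_encryption_algorithm_id(missing_null))                 # False
```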

CP update is in progress and on track.

Whiteboard: [ca-compliance] - Next Update - 14-August 2020 → [ca-compliance] - Next Update - 2-Oct-2020

As of today, the new policy is in force: https://www.pki.dfn.de/fileadmin/PKI/DFN-PKI_CP-EN.pdf

I think there's nothing left we need to do for this ticket?

Flags: needinfo?(bwilson)

I'll schedule to close this on 30-October-2020 unless additional discussion is needed.

Status: ASSIGNED → RESOLVED
Closed: 4 years ago
Flags: needinfo?(bwilson)
Resolution: --- → FIXED
Product: NSS → CA Program
Whiteboard: [ca-compliance] - Next Update - 2-Oct-2020 → [ca-compliance] [ov-misissuance]