Closed Bug 1534580 Opened 8 months ago Closed 4 months ago

DFN-PKI: 40 OV certificates with wrong ST

Categories

(NSS :: CA Certificate Compliance, task)

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: brauckmann, Assigned: brauckmann)

Details

(Whiteboard: [ca-compliance])

User Agent: Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/72.0.3626.121 Safari/537.36

Steps to reproduce:

DFN-PKI issued 40 certificates with wrong value for ST

Actual results:

wrong value for ST was included into certificates

Expected results:

correct value should have been included

DFN-PKI issued 40 OV certificates with inconsistent L= and ST= fields. Those certificates contain L=Bremerhaven,ST=Niedersachsen,C=DE.

The correct value would have been L=Bremerhaven,ST=Bremen,C=DE.

  1. How your CA first became aware of the problem (e.g. via a problem report submitted to your Problem Reporting Mechanism, a discussion in mozilla.dev.security.policy, a Bugzilla bug, or internal self-audit), and the time and date.

2019-03-06, approx 15:00 CET: Customer called and complained about wrong ST field.

  2. A timeline of the actions your CA took in response. A timeline is a date-and-time-stamped sequence of all relevant events. This may include events before the incident was reported, such as when a particular requirement became applicable, or a document changed, or a bug was introduced, or an audit was done.

2018-10-08: Client was set up with pre-approved value "L=Bremerhaven,ST=Niedersachsen". The customer issued certificates without noticing the wrong value himself.
2019-03-06, approx 15:00 CET: Customer called and complained about wrong ST field.
2019-03-11, 10:50 CET: Revocation started together with customer.

  3. Whether your CA has stopped, or has not yet stopped, issuing certificates with the problem. A statement that you have will be considered a pledge to the community; a statement that you have not requires an explanation.

Configuration was changed to the correct value "L=Bremerhaven,ST=Bremen".

  4. A summary of the problematic certificates. For each problem: number of certs, and the date the first and last certs with that problem were issued.

5 user certificates and 35 server certificates. First issuance 2018-10-17, last issuance 2019-03-06.

As of now (2019-03-12 12:35 CET), all affected server certificates have been revoked. 2 user certificates are still to be revoked.

  5. The complete certificate data for the problematic certificates. The recommended way to provide this is to ensure each certificate is logged to CT and then list the fingerprints or crt.sh IDs, either in the report or as an attached spreadsheet, with one list per distinct problem.

server certificates:

https://crt.sh/?id=1257896190
https://crt.sh/?id=1228538239
https://crt.sh/?id=1193758507
https://crt.sh/?id=1098491899
https://crt.sh/?id=984056965
https://crt.sh/?id=984041996
https://crt.sh/?id=977573410
https://crt.sh/?id=948974386
https://crt.sh/?id=948908471
https://crt.sh/?id=948905600
https://crt.sh/?id=948844013
https://crt.sh/?id=948830677
https://crt.sh/?id=948513141
https://crt.sh/?id=948409327
https://crt.sh/?id=945823496
https://crt.sh/?id=945526837
https://crt.sh/?id=943545187
https://crt.sh/?id=941060836
https://crt.sh/?id=941042046
https://crt.sh/?id=941013371
https://crt.sh/?id=940993986
https://crt.sh/?id=940978620
https://crt.sh/?id=940966438
https://crt.sh/?id=940960735
https://crt.sh/?id=940944542
https://crt.sh/?id=940153054
https://crt.sh/?id=932770746
https://crt.sh/?id=932068595
https://crt.sh/?id=932024158
https://crt.sh/?id=932008279
https://crt.sh/?id=931981156
https://crt.sh/?id=931970818
https://crt.sh/?id=931956434
https://crt.sh/?id=884703188
https://crt.sh/?id=884684388

user certificates not listed for privacy reasons.

  6. Explanation about how and why the mistakes were made or bugs introduced, and how they avoided detection until now.

Wrong: L=Bremerhaven,ST=Niedersachsen,C=DE.
Correct: L=Bremerhaven,ST=Bremen,C=DE.

Short detour into geography: The city of Bremerhaven is an entity in the
German state Bremen, but is not located inside Bremen's main state
boundaries. Instead, it is situated as an exclave about 60km north of
Bremen's main territory surrounded by the state of Niedersachsen (Lower Saxony).

When setting up a PKI client located in the German town Bremerhaven, all OV validation info was gathered and verified, including the correct address with locality and state of the client. The validation agent then did not copy the correct value for "state" into the configuration templates. Instead he typed the value which he thought was correct from memory. As Bremerhaven is indeed surrounded by Niedersachsen territory, he was certain he was doing the right thing.

The mandated review by a colleague, which is done before setup, did not catch the mistake.

Also, our daily review of all issued certificates did not reveal the problem.
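The failure mode described here (a configuration template that differs from the validated data) can be illustrated with a small comparison. This is a minimal sketch only: the function names and the simple "K=V,K=V" subject format are assumptions made for illustration and do not describe DFN-PKI's actual tooling.

```python
# Hypothetical sketch: compare a configuration template's subject fields
# against the values recorded in the validation documents.
# All names here are illustrative assumptions, not DFN-PKI tooling.

VALIDATED = {"L": "Bremerhaven", "ST": "Bremen", "C": "DE"}


def parse_subject(dn):
    """Split a simple 'K=V,K=V' string into a dict (no escaping support)."""
    pairs = (part.strip().split("=", 1) for part in dn.split(",") if part.strip())
    return {key.strip(): value.strip() for key, value in pairs}


def mismatches(template_dn, validated):
    """Return {field: (template value, validated value)} for every difference."""
    template = parse_subject(template_dn)
    return {
        field: (template.get(field), expected)
        for field, expected in validated.items()
        if template.get(field) != expected
    }


print(mismatches("L=Bremerhaven,ST=Niedersachsen,C=DE", VALIDATED))
# {'ST': ('Niedersachsen', 'Bremen')}
```

Such a comparison only helps if the validated values are captured independently of the template, since a check fed from the template itself would repeat the same error.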

  7. List of steps your CA is taking to resolve the situation and ensure such issuance will not be repeated in the future, accompanied with a timeline of when your CA expects to accomplish these things.
  • We checked for further occurrences of this mistake, and did not find any.

  • Staff retrained to properly transfer values from validation doc to configuration template

The remaining certificates (2 user certificates) were revoked yesterday.

Assignee: wthayer → brauckmann
Status: UNCONFIRMED → ASSIGNED
Ever confirmed: true
Whiteboard: [ca-compliance]

Staff retrained to properly transfer values from validation doc to configuration template

Can you please provide further details about what the 'proper' way is? I think understanding this flow is important to understanding whether or not any systemic issues could have been caught.

In light of the ongoing issues surrounding "Some-State" and "Some-City", have technical controls been evaluated with respect to these fields? If so, can you provide an update on that evaluation and what changes or determination has been made? I realize those have emerged after this issue was created - but did any of those issues cause a re-evaluation about ways to detect or mitigate such issues?

Flags: needinfo?(brauckmann)

(In reply to Ryan Sleevi from comment #3)

Staff retrained to properly transfer values from validation doc to configuration template

Can you please provide further details about what the 'proper' way is? I think understanding this flow is important to understanding whether or not any systemic issues could have been caught.

"Proper way" is to use exactly the validated information. No other information whatsoever including presumed common knowledge must be used. This is what has been re-trained as stated above.

In light of the ongoing issues surrounding "Some-State" and "Some-City", have technical controls been evaluated with respect to these fields? If so, can you provide an update on that evaluation and what changes or determination has been made? I realize those have emerged after this issue was created - but did any of those issues cause a re-evaluation about ways to detect or mitigate such issues?

"some-state" etc. could not have been validated in our process as that would never have passed the 4-eyes-based validation process. Any human would immediately recognize this as invalid (in contrast to placing a L in a wrong ST). As we have a very limited number of ST/L/ORG combinations, we can have a thorough manual process and don't need automatic controls for scalability reasons.
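For readers wondering what a technical control on these fields could look like, the following is a minimal sketch, not something DFN-PKI has stated it uses. The placeholder set, the allowlist and the function name are all assumptions for illustration. Note that an allowlist built from a wrong pre-approved value would not have caught the Bremerhaven case; it only helps if it is populated from the validated documents.

```python
# Hypothetical sketch of an automated subject check. The placeholder set,
# the allowlist and the function name are illustrative assumptions.

PLACEHOLDERS = {"some-state", "some-city"}

# Pre-approved (C, ST, L) combinations, assumed to come from validated data.
APPROVED = {("DE", "Bremen", "Bremerhaven")}


def check_subject(c, st, l):
    """Return a list of human-readable problems found in C/ST/L."""
    problems = []
    for name, value in (("ST", st), ("L", l)):
        if value.strip().lower() in PLACEHOLDERS:
            problems.append(f"{name} contains placeholder value {value!r}")
    if (c, st, l) not in APPROVED:
        problems.append(f"C={c}, ST={st}, L={l} is not a pre-approved combination")
    return problems


for problem in check_subject("DE", "Niedersachsen", "Bremerhaven"):
    print(problem)
```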

Flags: needinfo?(brauckmann)
Flags: needinfo?(wthayer)

I have no further questions related to the invalid data or remediation. However, it does not appear that all of these certificates were revoked within 5 days of the customer reporting the issue as required by the BRs (e.g. https://crt.sh/?id=948513141). If DFN concurs, please explain why this requirement was not met and what is being done to ensure that it will be met in the future. If DFN believes that the revocation deadline was met, please explain.

Flags: needinfo?(wthayer) → needinfo?(brauckmann)

The customer did not file the complaint explicitly as urgent, but as a general service request on Thursday. Unfortunately, also due to the weekend 2019-03-09/2019-03-10, it took until Monday to actually recognize that the service request contained an incident. We immediately then started reaching out to the customer and were finished with revocations within 24 hours.

We are updating our training documentation for 1st level support to improve on recognizing and classifying such service requests as incidents in a timely manner.
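The timing can be checked with simple date arithmetic. The sketch below assumes the five-day window starts at the time of the customer's call (approx. 2019-03-06 15:00 CET per the timeline) and treats CET as a fixed UTC+1 offset. The exact revocation time of each certificate is not given in this report, so the last check uses the 2019-03-12 12:35 CET status timestamp from the report purely as an example input.

```python
from datetime import datetime, timedelta, timezone

# Assumption: CET as a fixed UTC+1 offset (no DST applied in early March).
CET = timezone(timedelta(hours=1), "CET")

# Times taken from the timeline in the report; the report time is approximate.
reported = datetime(2019, 3, 6, 15, 0, tzinfo=CET)
revocation_started = datetime(2019, 3, 11, 10, 50, tzinfo=CET)

# Assumption: the five-day window is counted from the time of the report.
deadline = reported + timedelta(days=5)


def within_deadline(revoked_at):
    """True if a revocation at the given time falls inside the window."""
    return revoked_at <= deadline


print(deadline.isoformat())                 # 2019-03-11T15:00:00+01:00
print(deadline - revocation_started)        # 4:10:00
print(within_deadline(datetime(2019, 3, 12, 12, 35, tzinfo=CET)))  # False
```

Under these assumptions, revocation began about four hours before the window closed, so any certificate revoked later than 15:00 CET on 2019-03-11 would fall outside it.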

Flags: needinfo?(brauckmann)

(In reply to Jürgen Brauckmann from comment #6)

We are updating our training documentation for 1st level support to improve on recognizing and classifying such service requests as incidents in a timely manner.

Please comment in this bug when the updated training documentation is in use.

Flags: needinfo?(brauckmann)

(In reply to Wayne Thayer [:wayne] from comment #7)

(In reply to Jürgen Brauckmann from comment #6)

We are updating our training documentation for 1st level support to improve on recognizing and classifying such service requests as incidents in a timely manner.

Please comment in this bug when the updated training documentation is in use.

(Answering for Jürgen who is currently travelling)

Re-training of all affected personnel was conducted immediately after the incident. We will update this thread again when the final updates have been added to the training documentation.

Ralf/Jürgen: Thanks for the commitment to update in Comment #8. However, do you have a concrete timeframe for when this will be completed?

Flags: needinfo?(groeper)

Again requesting an update.

Timeframe: End of next week at the latest (2019-08-02)

(sorry for the delay, we somehow misinterpreted the question)

Flags: needinfo?(brauckmann)
Whiteboard: [ca-compliance] → [ca-compliance] - Next Update - 03-August 2019

The final updates have been added to the training documentation and it is in use.

It appears that all questions have been answered and remediation is complete.

Status: ASSIGNED → RESOLVED
Closed: 4 months ago
Flags: needinfo?(groeper)
Resolution: --- → FIXED
Whiteboard: [ca-compliance] - Next Update - 03-August 2019 → [ca-compliance]