Closed Bug 1706950 Opened 4 years ago Closed 3 years ago

PKIoverheid: KPN issued Invalid organizationalUnitName

Categories

(CA Program :: CA Certificate Compliance, task)

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: michel, Assigned: jorik.vant.hof)

Details

(Whiteboard: [ca-compliance] [ov-misissuance])

Hello,
I noticed that the certificate https://crt.sh/?id=4410349748&opt=zlint,ocsp issued very recently has an invalid organizationalUnitName.

Assignee: bwilson → jorik.vant.hof
Summary: Staat der Nederlanden: Invalid organizationalUnitName → PKIoverheid: KPN issued Invalid organizationalUnitName
Whiteboard: [ca-compliance]
Status: NEW → ASSIGNED
  1. How your CA first became aware of the problem (e.g. via a problem report submitted to your Problem Reporting Mechanism, a discussion in mozilla.dev.security.policy, a Bugzilla bug, or internal self-audit), and the time and date.
  • We received a Bugzilla notification email about the bug on 04/23/2021 at 07:02 (CEST).
  2. A timeline of the actions your CA took in response. A timeline is a date-and-time-stamped sequence of all relevant events. This may include events before the incident was reported, such as when a particular requirement became applicable, or a document changed, or a bug was introduced, or an audit was done.

21-04-21 16:12GMT issuance of certificate with invalid OU field
22-04-21 21:27GMT mis-issuance reported on bug post 1706950
23-04-21 07:12GMT KPN informed by Logius of bug post 1706950
23-04-21 07:15GMT started internal investigation of the cause and checked if more certificates are affected
23-04-21 07:45GMT informed end customer
23-04-21 09:22GMT confirmation no other certificates with mis-issued OU field
23-04-21 09:22GMT cause was KPN employee mistyping backslash instead of backspace.
23-04-21 10:45GMT new certificate issued to end customer, https://crt.sh/?id=4419443744
23-04-21 11:00GMT customer installed the new certificate
23-04-21 11:03GMT affected certificate revoked
23-04-21 11:57GMT Regex.match input validation implemented on OU field of application form.

  3. Whether your CA has stopped, or has not yet stopped, issuing certificates with the problem. A statement that you have will be considered a pledge to the community; a statement that you have not requires an explanation.
  • After checking the total population against zlint, the affected certificate turned out to be the only affected certificate.
  4. A summary of the problematic certificates. For each problem: number of certs, and the date the first and last certs with that problem were issued.
  5. The complete certificate data for the problematic certificates. The recommended way to provide this is to ensure each certificate is logged to CT and then list the fingerprints or crt.sh IDs, either in the report or as an attached spreadsheet, with one list per distinct problem.
  • https://crt.sh/?id=4410349748
    dfbc211bcfbc1e236c0e669e3c9f925d7553433e536077569cde2f1ef337d6a6
    Date issued: 04/21/2021 – Date revoked 04/23/2021
  6. Explanation about how and why the mistakes were made or bugs introduced, and how they avoided detection until now.
  • The cause of the mis-issuance was a human error in the validation process caused by mistyping the backslash key two times instead of the backspace key when erasing an invalid input from the end customer in the OU field.
  7. List of steps your CA is taking to resolve the situation and ensure such issuance will not be repeated in the future, accompanied with a timeline of when your CA expects to accomplish these things.
  • In the coming weeks, KPN will take action so that the OU field is no longer a free-format field. Extra field validation checks have been introduced today to prevent recurrence of this type of human error.

  • Furthermore KPN will implement linting tools to reduce the chance of mis-issuance of certificates.


  • PKIoverheid developed its own post-issuance linting solution. Because we have to switch to another platform and wanted to add validation and analysis capabilities, we had to redesign and rebuild the tool. We plan to host the redesigned data management and analytics solution in our new private cloud environment in May. Until then, we will perform a manual post-issuance check once a day.

  • Furthermore, our new post-issuance validation platform will also make it possible to validate certificate attributes against publicly available registries. Because this is inherently impossible for the OU field, we are in the process of submitting a change to our CP banning the use of this field in public certificates. Next month this change will be formalized.

Could you provide more information on certificate linting at PKIoverheid? Linting has been a best practice for a number of years now, and yet it sounds as if KPN, and possibly PKIoverheid, has not implemented any sort of linting until now? Will the PKIoverheid post-issuance linting solution apply to KPN certificates? If not, what is the timeline for KPN to implement automated linting of every issued certificate?

I am most concerned by the implication that PKIoverheid considers post-issuance linting to be a final solution rather than a stopgap measure, because post-issuance linting doesn't prevent misissuance. Are there plans and timelines for implementing pre-issuance linting at KPN and/or PKIoverheid?

Flags: needinfo?(david.weissenberg)

We would like to emphasize that we do not consider post-issuance linting a final solution to the problem at hand. Pre-linting is certainly seen and promoted by us as a best-practice within PKIoverheid and has also been implemented by a number of TSPs. However, KPN has not yet taken this step. We and KPN feel it is important to fix this as soon as feasible. KPN has indicated that it will implement linting tools to prevent the possibility of a reoccurrence. Even though KPN has already tested a pre-linting solution in its test environment, there are still a number of dependencies within the TWS context that need to be figured out. An elaboration of the plan to implement pre-issuance linting, including timeline will be ready in the second half of May.

PKIoverheid had a post-linting tool running. This application has recently been redesigned and will be hosted in our new private cloud environment as a data management and analytics solution in May. The new platform has the capacity to meet additional certificate control needs and desires. This solution will apply to all PKIoverheid SSL/TLS certificates. In the meantime we will perform a manual post-issuance check once a day as a stopgap measure. As an additional measure we are considering a proposal to our PoR in which we make pre-linting mandatory for PKIoverheid TSPs.
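To illustrate the distinction drawn in this exchange: a pre-issuance lint runs before signing and blocks the certificate, whereas a post-issuance lint can only report a certificate that already exists. The following is a minimal sketch of the pre-issuance gate, not KPN's or PKIoverheid's implementation. `CertificateRequest`, `lint_ou_no_backslash`, `LintError`, and `issue` are hypothetical names, and the single check shown is a stand-in for a real linter such as zlint, which would normally run against the to-be-signed certificate.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class CertificateRequest:
    """Hypothetical, heavily simplified view of a validated request."""
    common_name: str
    organizational_unit: str


# A lint takes a request and returns a list of findings (empty means pass).
Lint = Callable[[CertificateRequest], List[str]]


def lint_ou_no_backslash(req: CertificateRequest) -> List[str]:
    """Stand-in lint (assumption): flag a backslash in the OU value."""
    if "\\" in req.organizational_unit:
        return ["organizationalUnitName contains a backslash"]
    return []


class LintError(Exception):
    """Raised when a request fails pre-issuance linting."""


def issue(req: CertificateRequest, lints: List[Lint],
          sign: Callable[[CertificateRequest], str]) -> str:
    """Run every lint first; call sign() only if there are no findings."""
    findings = [msg for lint in lints for msg in lint(req)]
    if findings:
        # Nothing has been signed yet, so no misissuance has occurred.
        raise LintError("; ".join(findings))
    return sign(req)
```

In this arrangement the signing function is never reached for a failing request, which is the property that post-issuance checks cannot provide.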

Flags: needinfo?(david.weissenberg)

Reminder from https://wiki.mozilla.org/CA/Responding_To_An_Incident#Keeping_Us_Informed

You should also provide updates at least every week giving your progress, and confirm when the remediation steps have been completed - unless Mozilla representatives agree to a different schedule by setting a “Next Update” date in the “Whiteboard” field of the bug. Such updates should be posted to the MDSP mailing list, if there is one, and the Bugzilla bug. The bug will be closed when remediation is completed.

Question 7 in Comment #1 failed to give an actionable timeline, as also covered on that page:

accompanied with a binding timeline of when your CA expects to accomplish each of these remediation steps.

In Comment #3, we got the following update:

Even though KPN has already tested a pre-linting solution in its test environment, there are still a number of dependencies within the TWS context that need to be figured out. An elaboration of the plan to implement pre-issuance linting, including timeline will be ready in the second half of May.

We're now in June.

Flags: needinfo?(david.weissenberg)

Our apologies for the late response regarding this issue.

The implementation of (Pre)-Linting tools will be taken up by KPN through their controlled OTAP process. This must be done carefully because it must be integrated into their internal processes. Therefore, this change will be combined with the replacement of the CA infrastructure.

Based on that schedule, KPN expects to have the acceptance environment ready on September 1 and to go live on October 1.

KPN has already implemented the programmed checks in their issuance software, as stated in comment 1, to mitigate the error until the pre-linting tool is rolled out.

Flags: needinfo?(david.weissenberg)

I'm not fully sure I understand comment #5, and I'm hoping you can provide more detail here. That is, the controlled OTAP process, or the discussion of "replacement of the CA infrastructure". Could you explain more what is happening here, and the timeframes? I note that, like Comment #3, these end up using acronyms, but to be honest, I'm not sure of their context in the case of PKI here or if they're just... enterprise speak :P

Comment #1 stated the use of the OU would be forbidden end of May, but I also don't see an update here on that.

I think it'd be useful to provide a bit of background of what you consider relevant architectural detail here (and if this has been provided on past bugs or other online documents, you should feel free to refer to them), to help understand what steps have been done, what remains to be done, and what the risks are to those remaining tasks being completed on time / why they can't be completed sooner, in order to help better understand the risks being balanced here.

Flags: needinfo?(david.weissenberg)

It wasn't until we read your comment that we realized "OTAP" is in fact a Dutch term; in English DTAP is used. KPN is testing integrated linting functionality on their Testing and/or Acceptance environment, and then will use their change process to push to Production.

Indeed it was our intent to push the change in the CP for our TSPs to forbid OUs at the end of May. However, this faced an unforeseen delay, which was resolved yesterday. This means we can now officially process the proposed changes (of which the banning of OU is just one). At the end of next week everything should officially be formalized. In the proposal the start date of the OU prohibition is set to 1 August 2021.

KPN has pre-issuance regular expression validation checks for most of the certificate request fields. Until 23 April 2021 KPN had no input validation for the OU field, because it is an optional text field without strict syntax. Following the bug report, on 23 April 2021 KPN implemented regular expression input validation on the OU field of the application form.
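As a rough illustration of this kind of input validation, the sketch below checks an OU value against an allow-list pattern. The expression KPN actually deployed is not given in this bug, so `OU_PATTERN` and `is_valid_ou` are assumptions made for the example. The 64-character limit reflects the upper bound for organizationalUnitName in RFC 5280; the ASCII-only character set is a deliberate simplification and would reject legitimate names containing accented letters.

```python
import re

# Hypothetical allow-list (assumption, not KPN's actual expression):
# starts with a letter or digit, followed by up to 63 letters, digits,
# spaces, or a small set of punctuation marks. Backslash is not allowed.
OU_PATTERN = re.compile(r"[A-Za-z0-9][A-Za-z0-9 .,'()&/-]{0,63}")


def is_valid_ou(value: str) -> bool:
    """Return True only if the whole value matches the allow-list."""
    return OU_PATTERN.fullmatch(value) is not None
```

With this pattern, a value ending in two stray backslashes, such as `is_valid_ou("IT Services\\\\")`, is rejected, as is an empty string. Using `fullmatch` rather than `match` matters here: `match` only anchors at the start and would accept a valid prefix followed by invalid characters.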

In comment #5 KPN mentions the replacement of the CA infrastructure. This is a scheduled infrastructure life-cycle management action in which implementing pre-issuance linting tools was also to be a component. However, considering the risk of similar mis-issuance, KPN decided to implement linting tools on the current infrastructure, removing the dependency on the CA infrastructure replacement project. The implementation will still be done through a controlled DTAP process and will be finished before mid-August.

Robert: Can you provide an update now that we're in the next week?

Right now, it feels like Comment #3 / Comment #5 don't really give a good insight into the timelines and dependencies here, other than "unforeseen delay" (Comment #7; unclear "why"), what the "proposed changes" are (Comment #7; unclear what else is on the list), or what the "number of dependencies" (Comment #3; asked for more detail in Comment #4) are.

It's unclear if the proposed timeline is "PKIoverheid will have pre-issuance linting for all of its CAs by 2021-10-01" (Comment #5), or if that's changed ("will be finished before mid-August", Comment #7).

Fundamentally, the goal is to have a clear understanding about what the timeline for action is, and what the factors are with that timeline. It's not that something taking three months is inherently bad, but wanting to make sure there is a clear understanding about why something would take three months, what the interim steps are to doing that, and how we can be confident there won't be "unforeseen delays".

Flags: needinfo?(david.weissenberg) → needinfo?(robert.leyting)

Hello Ryan, we would like to provide you with an update and extension of the timeline provided so far.

Right now, it feels like Comment #3 / Comment #5 don't really give a good insight into the timelines and dependencies here, other than "unforeseen delay" (Comment #7; unclear "why"), what the "proposed changes" are (Comment #7; unclear what else is on the list), or what the "number of dependencies" (Comment #3; asked for more detail in Comment #4) are.

  • The unforeseen delay mentioned in Comment 7 does not concern KPN and as such does not relate to either comment 3 or 5. The delay only referred to Logius updating the PKIoverheid CP for its TSPs concerning the OU field. The website we need to publish this on now adheres to stricter publishing guidelines (WCAG amongst others), so we first had to fix some formalities before the actual document could be published.
  • The change concerning the OU-field was one of a larger batch of changes. These changes are proposed to improve the PoR in general and are not specific to this OU issue. Examples of proposed changes are banning certain fields or standardizing specific field entries (not necessarily concerning TLS certificates).
  • The dependencies concerning the implementation of the pre-issuance linting tool are mentioned in the timeline provided below.

It's unclear if the proposed timeline is "PKIoverheid will have pre-issuance linting for all of its CAs by 2021-10-01" (Comment #5), or if that's changed ("will be finished before mid-August", Comment #7).

  • When KPN concludes its pre-issuance linting implementation, all PKIoverheid TSPs issuing publicly trusted TLS certificates will have pre-issuance linting solutions in place.

Fundamentally, the goal is to have a clear understanding about what the timeline for action is, and what the factors are with that timeline. It's not that something taking three months is inherently bad, but wanting to make sure there is a clear understanding about why something would take three months, what the interim steps are to doing that, and how we can be confident there won't be "unforeseen delays".

  • Before the initial bug post KPN had planned a CA infrastructure replacement including implementing pre-issuance linting tools in the new infrastructure. This was initially scheduled to be finished before October. Due to the urgency of this bug KPN decided to implement pre-issuance linting tools on the existing CA infrastructure, removing the dependency on the CA infrastructure replacement project.

The current implementation plan for pre-issuance linting at KPN is:

  1. Installing the linting tool on the acceptance environment.
    Ready.
  2. Testing of the linting tool, including integration with the validation workflow software. As mentioned in comment 3, testing with the CA software has been completed successfully.
    Ongoing. Expected completion: July 9
  3. Evaluating the test results.
    Expected completion: July 12
  4. Installing the linting tool on production (including disaster recovery) and setting up the management process.
    Expected completion: July 16

This is the happy flow; depending on the outcome of the testing and evaluations, this timeline may be extended. We will give a weekly update on the progress of the implementation.

Thanks David. This is a much clearer action plan and set of dependencies, and weekly updates sounds fine.

Flags: needinfo?(robert.leyting)

Ryan, KPN has informed us that as of today the zlint pre-issuance linting tool has been rolled out on their PKIoverheid production environment.

Ben: Based on Comment #9, I believe Comment #11 closes out the open remediations.

Flags: needinfo?(bwilson)

Hi David,
Can you please confirm the completion of 4. Installing linting tool on production (including disaster recovery) and setting up the management process? If so, then I will schedule this bug to be closed on or about Friday, 16-July-2021.
Thanks,
Ben

Flags: needinfo?(david.weissenberg)

Hello Ben,
Hereby we confirm that KPN finished point 4 from comment 9:

the installation of the linting tool on production (including disaster recovery) and setting up the management process.

Flags: needinfo?(david.weissenberg)

I will close this on Friday, 30-July-2021.

Status: ASSIGNED → RESOLVED
Closed: 3 years ago
Flags: needinfo?(bwilson)
Resolution: --- → FIXED
Product: NSS → CA Program
Whiteboard: [ca-compliance] → [ca-compliance] [ov-misissuance]