Entrust: Certificate issued with validity greater than 825-days

People

(Reporter: bruce.morton, Assigned: bruce.morton)

Details

(Whiteboard: [ca-compliance])

User Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/75.0.3770.100 Safari/537.36

Steps to reproduce:

EV SSL certificate issued for 3 years.

Expected results:

EV SSL certificate shall have a validity period of 825-days or less

  1. How your CA first became aware of the problem (e.g. via a problem report submitted to your Problem Reporting Mechanism, a discussion in mozilla.dev.security.policy, a Bugzilla bug, or internal self-audit), and the time and date.

Entrust Datacard discovered the mis-issuance through testing when renewing certificates for the root embedding test sites. The problem was discovered at approximately 18:30 UTC on 21 June 2019.

  2. A timeline of the actions your CA took in response. A timeline is a date-and-time-stamped sequence of all relevant events. This may include events before the incident was reported, such as when a particular requirement became applicable, or a document changed, or a bug was introduced, or an audit was done.

June 21, 16:00 UTC - Issued two certificates with a validity period greater than 825-days
June 21, 18:30 UTC - Issue was discovered
June 21, 19:46 UTC - Both certificates were revoked by this time.

  3. Whether your CA has stopped, or has not yet stopped, issuing certificates with the problem. A statement that you have will be considered a pledge to the community; a statement that you have not requires an explanation.

The CA has stopped issuing certificates with a validity period greater than 825-days.

  4. A summary of the problematic certificates. For each problem: number of certs, and the date the first and last certs with that problem were issued.

Two certificates with a validity period greater than 825-days were issued on 21 June 2019.

  5. The complete certificate data for the problematic certificates.

https://crt.sh/?id=1599780572
https://crt.sh/?id=1599781352

  6. Explanation about how and why the mistakes were made or bugs introduced, and how they avoided detection until now.

The issuing CA is not set up for production as there are no third party certificates currently being issued. There is no automated issuance nor is there any pre-issuance linting set for a non-production CA. The certificates to support the BR required test sites were issued manually. The issuing CA was set with a default validity period set at 36 months. This period was not changed to issue the test site certificates.

  7. List of steps your CA is taking to resolve the situation and ensure such issuance will not be repeated in the future, accompanied with a timeline of when your CA expects to accomplish these things.

Both certificates were revoked on 21 June 2019, when the issue was discovered.

In order to avoid this issue in the future, a manual process has been established to ensure that the correct certificate profile elements are in the certificates. This process will require an approved certificate profile be followed to create the certificate and that the profile also be used to verify that the certificate meets the requirements. A certificate that does not meet the requirements will not be released, but be revoked.

Status: UNCONFIRMED → ASSIGNED
Ever confirmed: true

Bruce,

I view this as a rather significant event, precisely because this involved manual controls and review for issuing the required sites. I do not believe the root cause analysis or remediation steps go into the necessary detail. Further, the suggestion of not releasing, but revoking, does not demonstrate the necessary awareness or seriousness of this issue.

Will you provide more details about the surrounding events? The timeline does not provide any insight into the process involved here, and as such, can lead to a conclusion that Entrust did not have any procedures in place to ensure oversight and review of such issuance. If that is not the case, and I hope it is not, because it would be rather detrimental to trust in the CA, I am hoping you can provide a more thorough explanation about the events here, what controls existed, and why they failed. The current incident report makes it seem that the CA is negligent on multiple fronts; this is a concern that a more detailed incident report can address and sufficiently mitigate.

Flags: needinfo?(bruce.morton)
Assignee: wthayer → bruce.morton
Type: defect → task
Whiteboard: [ca-compliance]

Further investigation indicates that there were 5 mis-issued certificates from the same incident with the same root cause. I have updated the incident report items below to cover the full incident.

  1. How your CA first became aware of the problem (e.g. via a problem report submitted to your Problem Reporting Mechanism, a discussion in mozilla.dev.security.policy, a Bugzilla bug, or internal self-audit), and the time and date.

On June 21, 2019 at approximately 18:30 UTC, Entrust Datacard discovered a mis-issuance of 2 certificates with a validity period greater than 825-days through online testing of certificates for the root embedding test sites.

On June 25, 2019 at approximately 14:30 UTC, Entrust Datacard discovered the mis-issuance of 5 EV SSL certificates with OV subject data. These certificates include the 2 certificates which were issued for more than 825-days. This was discovered by reviewing zlint data, which showed that the certificates were missing the business category and the serial number from the subject. Further review also indicated that the registration jurisdiction information was also missing from the subject of all certificates.
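
As an illustration of the kind of check described above (a minimal sketch, not zlint and not Entrust's tooling), the Python below flags a certificate that asserts an EV policy but lacks the subject attributes named in this report. It assumes the certificate has already been parsed into a list of policy OIDs and a dict of subject attributes keyed by OID; that data model and the function name `missing_ev_subject_fields` are hypothetical. It also assumes EV is signalled by the CA/Browser Forum EV policy OID, although a CA may use its own EV policy OID as well.

```python
# CA/Browser Forum EV policy identifier (assumed here to be the EV marker).
EV_POLICY_OID = "2.23.140.1.1"

# Subject attributes this report says were missing from the EV certificates.
REQUIRED_EV_SUBJECT_OIDS = {
    "2.5.4.15": "businessCategory",
    "2.5.4.5": "serialNumber",
    "1.3.6.1.4.1.311.60.2.1.3": "jurisdictionCountryName",
}


def missing_ev_subject_fields(policy_oids, subject):
    """Return names of required EV subject fields that are absent.

    policy_oids: iterable of policy OID strings from the certificate.
    subject: dict mapping attribute OID -> value (hypothetical parsed form).
    A certificate without the EV policy OID is not checked and yields [].
    """
    if EV_POLICY_OID not in policy_oids:
        return []
    return [
        name
        for oid, name in REQUIRED_EV_SUBJECT_OIDS.items()
        if not subject.get(oid)
    ]


# OV-style subject (organization, locality, country, common name only)
# combined with the EV policy OID, as in the incident described above.
ov_subject = {
    "2.5.4.10": "Example Org",
    "2.5.4.7": "Example City",
    "2.5.4.6": "CA",
    "2.5.4.3": "test.example.com",
}
print(missing_ev_subject_fields([EV_POLICY_OID], ov_subject))
```

A non-empty result would be treated as a reason not to release the certificate.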

  2. A timeline of the actions your CA took in response. A timeline is a date-and-time-stamped sequence of all relevant events. This may include events before the incident was reported, such as when a particular requirement became applicable, or a document changed, or a bug was introduced, or an audit was done.

June 21, 16:00 UTC - Issued three certificates to support valid, expired and revoked test sites
June 21, 18:30 UTC - Issue was discovered that 2 certificates were issued for 3 years which exceeds the maximum allowed of 825-days
June 21, 19:46 UTC - Both certificates were revoked and replaced by this time.
June 25, 14:30 UTC - zlint data indicated that all 5 certificates issued on 21 June 2019 were issued with OV validated data, but included the EV certificate policy OID. zlint reported errors because the serial number and business category were missing from the subject name.
June 25, 20:59 UTC - One certificate was not expired or revoked, so was revoked at this time.

  3. Whether your CA has stopped, or has not yet stopped, issuing certificates with the problem. A statement that you have will be considered a pledge to the community; a statement that you have not requires an explanation.

The CA has stopped issuing EV SSL certificates with incorrect data in the subject and with an incorrect validity period.

  4. A summary of the problematic certificates. For each problem: number of certs, and the date the first and last certs with that problem were issued.

Two EV SSL certificates were issued with a validity period greater than 825-days on 21 June 2019.
Five EV SSL certificates with incorrect data in the subject were issued on 21 June 2019.

  5. The complete certificate data for the problematic certificates.

https://crt.sh/?id=1600217400
https://crt.sh/?id=1600215957
https://crt.sh/?id=1599780572
https://crt.sh/?id=1599781352
https://crt.sh/?id=1599692465

  6. Explanation about how and why the mistakes were made or bugs introduced, and how they avoided detection until now.

The issuing CA is not set up for production as there are no third party certificates currently being issued. There is no automated issuance nor is there any pre-issuance linting set for a non-production CA. The certificates from the issuing CA are issued manually. There is a process for manually issuing certificates. This process defines the certificate type, the validity period, and the subject name. It also provides the CSR. The process requires verification to confirm the subject name information has been verified along with checking CAA.

In this incident the trusted role which issued the certificates made errors where 1) 2 certificates were issued using the default validity of 36 months instead of the requested 26 months and 2) all 5 certificates were issued as EV SSL certificate type, but included only OV validated information.
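
The validity error can be made concrete with a short calculation. The sketch below is illustrative only: the dates are hypothetical stand-ins for a 36-month and a 26-month certificate issued at the time given in the timeline, not values read from the actual certificates, and validity is measured as the simple difference between notAfter and notBefore.

```python
from datetime import datetime, timezone

MAX_VALIDITY_DAYS = 825  # limit cited in this report


def validity_days(not_before: datetime, not_after: datetime) -> float:
    """Length of the validity period in days (notAfter minus notBefore)."""
    return (not_after - not_before).total_seconds() / 86400


def exceeds_limit(not_before: datetime, not_after: datetime,
                  limit_days: int = MAX_VALIDITY_DAYS) -> bool:
    """True if the validity period is longer than the allowed limit."""
    return validity_days(not_before, not_after) > limit_days


# Hypothetical dates: issuance time from the timeline, plus 36 and 26 months.
issued = datetime(2019, 6, 21, 16, 0, tzinfo=timezone.utc)
three_years = datetime(2022, 6, 21, 16, 0, tzinfo=timezone.utc)
twenty_six_months = datetime(2021, 8, 21, 16, 0, tzinfo=timezone.utc)

print(validity_days(issued, three_years), exceeds_limit(issued, three_years))
print(validity_days(issued, twenty_six_months),
      exceeds_limit(issued, twenty_six_months))
```

Under these assumed dates the 36-month default comes to 1096 days, well over the limit, while the requested 26 months comes to 792 days and passes.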

  7. List of steps your CA is taking to resolve the situation and ensure such issuance will not be repeated in the future, accompanied with a timeline of when your CA expects to accomplish these things.

The certificates were revoked or expired as follows:

  • 2 certificates were issued, then revoked on June 21, 2019, as they were issued to support a revoked test site.
  • 1 certificate was issued for a 1 day validity, as it was issued to support an expired test site. This certificate expired on June 22, 2019.
  • 1 certificate was revoked on June 21, 2019 as it was discovered that the validity period exceeded 825-days.
  • 1 certificate was revoked on June 25, 2019, after the EV data mis-issuance was discovered.

In order to avoid this issue in the future, it will be addressed in 2 phases.

Phase 1 - The existing manual process will be updated to include the following:

  • Manual certificate issuance must be justified and approved
  • Manual certificate issuance request must include certificate type, validity period, subject name, SANs and certificate profile
  • CSR must be verified to meet policy requirements which includes key size and no use of weak keys
  • Verification team must ensure that the subject information has been verified and is correct to use with the certificate request
  • CAA must be checked
  • Certificate issuance must be checked against the approved certificate profile, the approved subject and also checked against a publicly available linting tool

Phase 2 - The process will be moved to a managed account with automated checks:

  • All subject data in the account will be pre-verified
  • All certificate requests will 1) validate the quality of the CSR, 2) move through our "policy engine" to ensure BR and EV requirements are met, and 3) check CAA
  • All certificate requests will have pre-issuance linting performed
  • All EV pre-certificates will be published to CT logs

We think that Phase 2 will eliminate the errors from manual certificate issuance.
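
One way to picture the automated checks listed for Phase 2 is as a gate that runs every check and blocks issuance if any of them reports a problem. The following is a sketch under stated assumptions, not a description of Entrust's "policy engine": the request dict, its field names, and all function names are hypothetical, the CAA and linting results are assumed to be supplied by other components, and the 2048-bit RSA minimum and 825-day limit stand in for the fuller BR and EV rule set.

```python
from typing import Any, Callable, Dict, List

Request = Dict[str, Any]                 # hypothetical request representation
Check = Callable[[Request], List[str]]   # each check returns a list of problems

MIN_RSA_BITS = 2048
MAX_VALIDITY_DAYS = 825


def check_csr_key(req: Request) -> List[str]:
    """Reject RSA keys below the minimum size."""
    if req.get("key_type") == "RSA" and int(req.get("key_bits", 0)) < MIN_RSA_BITS:
        return ["RSA key smaller than 2048 bits"]
    return []


def check_validity(req: Request) -> List[str]:
    """Reject validity periods longer than the limit."""
    if int(req.get("validity_days", 0)) > MAX_VALIDITY_DAYS:
        return ["validity period exceeds 825 days"]
    return []


def check_caa(req: Request) -> List[str]:
    """Require a recorded CAA result that permits issuance."""
    if not req.get("caa_permits_issuance", False):
        return ["CAA not checked or does not permit issuance"]
    return []


def check_lint(req: Request) -> List[str]:
    """Pass through errors from a pre-issuance linter run elsewhere."""
    return [f"lint: {err}" for err in req.get("lint_errors", [])]


CHECKS: List[Check] = [check_csr_key, check_validity, check_caa, check_lint]


def pre_issuance_gate(req: Request, checks: List[Check] = CHECKS) -> List[str]:
    """Run every check; an empty result means issuance may proceed."""
    problems: List[str] = []
    for check in checks:
        problems.extend(check(req))
    return problems


request = {
    "key_type": "RSA",
    "key_bits": 2048,
    "validity_days": 1096,   # a 36-month default
    "caa_permits_issuance": True,
    "lint_errors": [],
}
print(pre_issuance_gate(request))
```

Running all checks rather than stopping at the first failure gives the reviewer the complete list of problems in one pass.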

Flags: needinfo?(bruce.morton)

Bruce: can you provide estimates for when phase 1 and 2 remediation will be completed?

Flags: needinfo?(bruce.morton)

Phase 1 will be completed immediately and before any other certificate is issued in this manner. Phase 2 will require some technical changes, which I do not have a schedule for at this time. I will update this incident report when I get a time frame.

Flags: needinfo?(bruce.morton)

Do you have a timeframe yet?

Flags: needinfo?(bruce.morton)

Bruce: Still waiting for an update based on Comment #6. We're now 2.5 weeks after Comment #5, and weekly updates are expected.

(In reply to Ryan Sleevi from comment #7)

Bruce: Still waiting for an update based on Comment #6. We're now 2.5 weeks after Comment #5, and weekly updates are expected.

Phase 1 is effective immediately.
Phase 2 may take 6 months or so. We are working on a project to migrate our CA infrastructure and so phase 2 will be delayed until that project is completed. Can I follow up in about 3 months with the status?

Thanks, Bruce.

Flags: needinfo?(bruce.morton)

Thanks. I was just looking to get a clear timeframe about next steps.

If I understand correctly: Phase 2 is described in Comment #3. It is gated on a migration of CA infrastructure. Phase 2 will not begin until that migration is complete. Including that migration, Phase 2 is expected to take approximately 6 months - or roughly 2020-01-15. An update will be provided midway, 2019-10-15, about the status of the CA infrastructure migration and the work for Phase 2.

If so, I think that addresses my questions, and I'm punting to Wayne to see if he has further questions, or wants to set the Next-Update to 2019-10-15. I think an important part of that is, in the spirit of https://wiki.mozilla.org/CA/Responding_To_An_Incident#Keeping_Us_Informed , to ensure progress updates on the CA infrastructure migration are made, particularly if they jeopardize the deployment of Phase 2. It's not good to find out that things are not on track the day before they're supposed to be deployed.

Flags: needinfo?(wthayer)

The Phase 1 procedure development has been finalized. The 3 test-site certificates have been re-issued using the Phase 1 procedure:
https://crt.sh/?id=1672737154
https://crt.sh/?id=1672737431
https://crt.sh/?id=1672737923

We will have to use the Phase 1 procedure again to renew certificates that will expire in September 2019; after that, we plan to use the Phase 2 procedure.

While finalizing and testing the Phase 1 procedure, 4 certificates were mis-issued. These certificates were not used or CT logged. These certificates have been revoked or have expired. Here are the serial numbers:
00b9aab05a97686192000000004c223b05
00d89f275d75ca43ef000000004c223b0a
00d38099edd7bdcdec000000004c223b0b
2d18ca96e860b9a9000000004c223b09

(In reply to Ryan Sleevi from comment #9)

Thanks. I was just looking to get a clear timeframe about next steps.

If I understand correctly: Phase 2 is described in Comment #3. It is gated on a migration of CA infrastructure. Phase 2 will not begin until that migration is complete. Including that migration, Phase 2 is expected to take approximately 6 months - or roughly 2020-01-15. An update will be provided midway, 2019-10-15, about the status of the CA infrastructure migration and the work for Phase 2.

If so, I think that addresses my questions, and I'm punting to Wayne to see if he has further questions, or wants to set the Next-Update to 2019-10-15. I think an important part of that is, in the spirit of https://wiki.mozilla.org/CA/Responding_To_An_Incident#Keeping_Us_Informed , to ensure progress updates on the CA infrastructure migration are made, particularly if they jeopardize the deployment of Phase 2. It's not good to find out that things are not on track the day before they're supposed to be deployed.

This incident report will be tracked by our Policy Authority to ensure data is provided through closure.

(In reply to Bruce Morton from comment #10)

While finalizing and testing the Phase 1 procedure, 4 certificates were mis-issued. These certificates were not used or CT logged. These certificates have been revoked or have expired. Here are the serial numbers:
00b9aab05a97686192000000004c223b05
00d89f275d75ca43ef000000004c223b0a
00d38099edd7bdcdec000000004c223b0b
2d18ca96e860b9a9000000004c223b09

So these are 4 new, previously unreported misissuances? If so:

  • Please log them (https://crt.sh/gen-add-chain helps with this)
  • Please update this incident report or create a new one to explain what happened and how it will be prevented in the future.

I'm going to hold off on setting the Next Update to 15-October pending a response to this comment.

Flags: needinfo?(wthayer) → needinfo?(bruce.morton)

(In reply to Wayne Thayer [:wayne] from comment #12)

So these are 4 new, previously unreported misissuances? If so:

  • Please log them (https://crt.sh/gen-add-chain helps with this)
  • Please update this incident report or create a new one to explain what happened and how it will be prevented in the future.

In finalizing the Phase 1 procedure, 4 certificates were mis-issued due to manual error.

The following certificate was issued with the incorrect profile; as a result, the certificate was signed using SHA-1 and has no subjectAltName.
https://crt.sh/?id=1681033012

The following 3 certificates were issued to the correct profile, but there was a spelling error in the URL for the OCSP response.
https://crt.sh/?id=1642629915
https://crt.sh/?id=1642630141
https://crt.sh/?id=1642630140

All 4 certificates have been revoked or have expired.
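
Errors of this kind (wrong signature algorithm, missing subjectAltName, misspelled OCSP URL) are the sort of thing a mechanical comparison against the approved profile can catch. The sketch below is hypothetical throughout: the dict layout, the field names, and the `ocsp.example.test` URLs are invented for illustration and do not reflect the actual certificates or Entrust's profile format.

```python
from typing import Any, Dict, List


def profile_deviations(cert: Dict[str, Any], profile: Dict[str, Any]) -> List[str]:
    """Compare a parsed certificate (hypothetical dict form) with a profile."""
    problems: List[str] = []
    if cert.get("signature_algorithm") not in profile["allowed_signature_algorithms"]:
        problems.append(
            f"signature algorithm {cert.get('signature_algorithm')!r} not allowed"
        )
    if profile["require_san"] and not cert.get("subject_alt_names"):
        problems.append("subjectAltName missing")
    # Exact string comparison, so a one-character spelling error is caught.
    if cert.get("ocsp_url") != profile["ocsp_url"]:
        problems.append(f"OCSP URL {cert.get('ocsp_url')!r} does not match profile")
    return problems


# Hypothetical approved profile.
approved_profile = {
    "allowed_signature_algorithms": {"sha256WithRSAEncryption"},
    "require_san": True,
    "ocsp_url": "http://ocsp.example.test",
}

# Hypothetical certificates mirroring the two error types described above.
wrong_profile_cert = {
    "signature_algorithm": "sha1WithRSAEncryption",
    "subject_alt_names": [],
    "ocsp_url": "http://ocsp.example.test",
}
misspelled_ocsp_cert = {
    "signature_algorithm": "sha256WithRSAEncryption",
    "subject_alt_names": ["test.example.com"],
    "ocsp_url": "http://ocps.example.test",
}

print(profile_deviations(wrong_profile_cert, approved_profile))
print(profile_deviations(misspelled_ocsp_cert, approved_profile))
```

Running such a comparison on the to-be-signed data, before issuance, is what keeps a deviation from becoming a certificate that must be revoked.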

The Phase 1 process and documentation were updated and the final 3 certificates were correctly issued per comment 10.

The Phase 2 process will eliminate the manual errors.

Flags: needinfo?(bruce.morton)

Bruce: it's not acceptable to brush off these additional misissuances as manual errors. When remediating a prior misissuance, I would expect the Entrust team to follow documented and peer-reviewed procedures and to double- and triple-check configurations before triggering production issuance. This sounds more like testing in production, with publicly-trusted certificates. I have created bug #1567659 to track this new issue.
