Closed Bug 1539358 Opened 5 years ago Closed 5 years ago

SECOM: Insufficient Serial Number Entropy

Categories

(CA Program :: CA Certificate Compliance, task)

task
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: h-kamo, Assigned: h-kamo)

Details

(Whiteboard: [ca-compliance] [uncategorized])

Wayne-san,

Here is our incident report regarding Insufficient Serial Number Entropy.

  1. How your CA first became aware of the problem (e.g. via a problem report submitted to your Problem Reporting Mechanism, a discussion in mozilla.dev.security.policy, a Bugzilla bug, or internal self-audit), and the time and date.

We became aware of this issue through the discussion on the Mozilla mailing list on March 7, 2019.

  1. A timeline of the actions your CA took in response. A timeline is a date-and-time-stamped sequence of all relevant events. This may include events before the incident was reported, such as when a particular requirement became applicable, or a document changed, or a bug was introduced, or an audit was done.

We became aware of this issue and began investigating our CAs on March 8, 2019. We checked the EJBCA configuration and found the problem.
We are now seriously investigating this issue and how to resolve it.
The current target for resolving this issue is the end of April.

  1. Whether your CA has stopped, or has not yet stopped, issuing certificates with the problem. A statement that you have will be considered a pledge to the community; a statement that you have not requires an explanation.

We have not yet stopped issuing EE certificates, but we would like to deploy a fix as soon as possible and start issuing certificates whose serial numbers have 64 bits of entropy.

  1. A summary of the problematic certificates. For each problem: number of certs, and the date the first and last certs with that problem were issued.

The affected certificates are the EE certificates issued from the following intermediate CAs.
Details are still under investigation.

https://crt.sh/?caid=76275
https://crt.sh/?caid=80513
https://crt.sh/?caid=30351
https://crt.sh/?caid=80509
https://crt.sh/?caid=77098
https://crt.sh/?caid=79635
https://crt.sh/?caid=2200
https://crt.sh/?caid=20168
https://crt.sh/?caid=77532
https://crt.sh/?caid=1440
https://crt.sh/?caid=101833
https://crt.sh/?caid=1706
https://crt.sh/?caid=97708
https://crt.sh/?caid=101834
https://crt.sh/?caid=77568
https://crt.sh/?caid=1713
https://crt.sh/?caid=76274
https://crt.sh/?caid=1689
https://crt.sh/?caid=80515
https://crt.sh/?caid=17796
https://crt.sh/?caid=97709
https://crt.sh/?caid=80516
https://crt.sh/?caid=101832
https://crt.sh/?caid=6926
https://crt.sh/?caid=1895
https://crt.sh/?caid=5064
https://crt.sh/?caid=1846
https://crt.sh/?caid=12241
https://crt.sh/?caid=80519
https://crt.sh/?caid=93963
https://crt.sh/?caid=10346
https://crt.sh/?caid=12387
https://crt.sh/?caid=80518
https://crt.sh/?caid=12341
https://crt.sh/?caid=52131
https://crt.sh/?caid=76
https://crt.sh/?caid=1430
https://crt.sh/?caid=307

  1. The complete certificate data for the problematic certificates. The recommended way to provide this is to ensure each certificate is logged to CT and then list the fingerprints or crt.sh IDs, either in the report or as an attached spreadsheet, with one list per distinct problem.

Details are under investigation now.

  1. Explanation about how and why the mistakes were made or bugs introduced, and how they avoided detection until now.

We did not detect the issue with the linting tools.

  1. List of steps your CA is taking to resolve the situation and ensure such issuance will not be repeated in the future, accompanied with a timeline of when your CA expects to accomplish these things.

We would like to deploy a fix as soon as possible and start issuing certificates whose serial numbers have 64 bits of entropy.

Thank you for your consideration.

Best regards,
Hisashi Kamo

Assignee: wthayer → h-kamo
Status: UNCONFIRMED → ASSIGNED
Ever confirmed: true
Whiteboard: [ca-compliance]

(In reply to Hisashi Kamo from comment #0)

> 1. How your CA first became aware of the problem (e.g. via a problem report submitted to your Problem Reporting Mechanism, a discussion in mozilla.dev.security.policy, a Bugzilla bug, or internal self-audit), and the time and date.
>
> We became aware of this issue through the discussion on the Mozilla mailing list on March 7, 2019.

Please explain why 20 days passed between the first discovery and the incident report.

Further, considering that one of the first incident reports [1] appeared on March 1, 2019, why did it take 7 days for SECOM to become aware of the issue?

[1] https://groups.google.com/d/msg/mozilla.dev.security.policy/-RB8ovYgOHE/HeciPTGGAQAJ

> 1. A timeline of the actions your CA took in response. A timeline is a date-and-time-stamped sequence of all relevant events. This may include events before the incident was reported, such as when a particular requirement became applicable, or a document changed, or a bug was introduced, or an audit was done.
>
> We became aware of this issue and began investigating our CAs on March 8, 2019. We checked the EJBCA configuration and found the problem.
> We are now seriously investigating this issue and how to resolve it.
> The current target for resolving this issue is the end of April.

It is unclear if this is proposing that the EJBCA configuration will not be changed until the end of April. This is a significant and substantial delay, considering that other CAs have been able to change the configuration within hours, or promptly stopped all issuance until they could.

Is this a correct understanding of the proposed timeline? If so, please explain what makes SECOM unique among all CAs reporting this and why so significant a delay. If this is not a correct understanding, please provide further explanation.

> 1. Whether your CA has stopped, or has not yet stopped, issuing certificates with the problem. A statement that you have will be considered a pledge to the community; a statement that you have not requires an explanation.
>
> We have not yet stopped issuing EE certificates, but we would like to deploy a fix as soon as possible and start issuing certificates whose serial numbers have 64 bits of entropy.

As noted in the question, please provide a meaningful explanation about why issuance has not stopped, if the matter has not been resolved.

Flags: needinfo?(h-kamo)

Ryan-san,

Thank you for your comments.
Let us provide our answers as below.

> Please explain why 20 days passed between the first discovery and the incident report.
>
> Further, considering that one of the first incident reports [1] appeared on March 1, 2019, why did it take 7 days for SECOM to become aware of the issue?
>
> [1] https://groups.google.com/d/msg/mozilla.dev.security.policy/-RB8ovYgOHE/HeciPTGGAQAJ

We sincerely apologize for the delay in posting our incident report.
From now on, we will make sure to post the first report as quickly as we can.

The reasons for the delayed posting of the incident report are as follows.

  • We carried out a check of all of our CA systems.
  • As a result, some CAs were found not to be affected by the serial number problem, but several problematic CAs were found.
  • In addition to requiring migration of the CA products, the problematic CAs were found to affect our related systems (for example, the web service systems that are subsystems for issuing certificates) and customer-side systems (for example, systems that link to and capture certificates after issuance), so it took time to analyze how to resolve the issue.

> It is unclear if this is proposing that the EJBCA configuration will not be changed until the end of April. This is a significant and substantial delay, considering that other CAs have been able to change the configuration within hours, or promptly stopped all issuance until they could.
>
> Is this a correct understanding of the proposed timeline? If so, please explain what makes SECOM unique among all CAs reporting this and why so significant a delay. If this is not a correct understanding, please provide further explanation.

In addition to the migration of the CA products, other related systems (our own related systems and the customer-side systems described in #1 above) are also involved, so it takes time to analyze the impact on many of our customers, apply the resolution, and verify it.
Although we had planned to resolve this issue all at once by the end of April, we are now keenly aware of the situation and will make arrangements with our customers on an ongoing basis.
For some CAs we aim to complete the work sooner, with a target date of 4/9/2019.

> As noted in the question, please provide a meaningful explanation about why issuance has not stopped, if the matter has not been resolved.

As described in #2, in order to make arrangements with our customers, we have prioritized establishing an early schedule for migrating the CA systems and related systems.
We would like to ask for your understanding, as the impact on customers is so great.
We will make our best efforts to resolve this situation as soon as possible.

Thank you for your consideration.

Best regards,
Hisashi Kamo

Flags: needinfo?(h-kamo)

Thank you for your reply.

Unfortunately, the response still has me quite confused. There's significant discussion about CAs, but it appears to be a discussion about subscriber certificates.

I think it's useful to separate out the two discussions, namely:

  1. SECOM's path to resolving new certificates being issued, and ensuring all CAs in its hierarchy are capable of issuing new certificates with the appropriate serial number entropy and doing so (that is, getting all new certificates compliant)

  2. Resolving any existing certificates that may have been issued without the necessary entropy.

Your reply leads me to believe your discussion and concern is about #2, however, my concern is about #1. I'm hoping you can provide greater clarity as to the nature of the issue and hopefully separate out the question of revocation from the question of system configuration and compliance.

Flags: needinfo?(h-kamo)

Ryan-san,

Thank you for your comments.
We are now keenly aware of this issue and are working on it very seriously.

As Ryan-san commented, we are going to resolve #1: making all new certificates compliant and ensuring that all CAs in our hierarchy are capable of issuing new certificates with the appropriate serial number entropy.
For some CAs we aim to complete the work sooner, with a target date of 4/9/2019, but the overall work takes considerable time because we need to make arrangements with our customers.
Therefore, we are now talking with our customers about arrangements that include stopping the issuance of new certificates until #1 is complete, even though this has a business impact on us.

The factors that make #1 take time are as follows.

  • We must work not only on the EJBCA configuration but also on a careful migration of our certificate management systems, because they provide functions that manage server and client certificates by serial number.

  • Because of our certificate management systems, the impact on non-public private CAs also needs to be considered, and analyzing that impact took time.

  • We must verify in advance all processing that handles serial numbers (certificate issuance, renewal, revocation, and inquiry) on our certificate management systems and on customer-side systems.

  • We need to make arrangements with our customers, because our certificate management systems must be suspended when we release the fixed programs for the relevant CAs.

  • Because our customers include government and well-known financial institutions, we must announce the suspension of our certificate management systems a reasonable time in advance.
Release work needs to be done at night on holidays to limit the impact on end users. (At this moment we are considering doing the work on the earliest possible weekend night.)

We will make every effort to resolve this issue.
We would like to ask for your understanding and continued support regarding the release schedule.

Best regards,
Hisashi Kamo

Flags: needinfo?(h-kamo)

Thank you for the additional details - it's now much clearer why there are challenges in just changing the serial number to a greater length. Is it a correct understanding that this is because a number of systems refer to certificates specifically by their serial number, and as such, all such systems need to be updated to account for a longer serial number / truly random sequence?

If it is, can you provide additional detail as to how you're proposing the new system to be compliant? For example, given the challenges presented, it would seem appropriate to ensure that all systems are capable of handling at least 20 bytes of serial number (the recommended size from RFC 5280 as to the safe maximum), rather than merely increasing to, say, 9 bytes.

This part of the incident response, by sharing root causes and planned remediation, helps the community provide feedback and/or concerns with the mitigations in an early fashion for CAs, helping them minimize unnecessary or duplicative work.
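The 20-byte ceiling mentioned above can be illustrated with a short check. This is a minimal sketch, assuming a downstream system receives serial numbers as integers; the names `der_integer_octets`, `serial_is_acceptable`, and `to_fixed_hex` are hypothetical and do not come from EJBCA or from SECOM's systems. It relies only on the RFC 5280 rule that a serial number is a positive INTEGER of at most 20 octets.

```python
MAX_SERIAL_OCTETS = 20  # RFC 5280 upper bound on serialNumber length


def der_integer_octets(serial: int) -> int:
    """Number of content octets needed to DER-encode a positive INTEGER.

    A leading 0x00 octet is required whenever the top bit of the first
    octet would otherwise be set, so the size is bit_length // 8 + 1.
    """
    if serial <= 0:
        raise ValueError("serial numbers must be positive")
    return serial.bit_length() // 8 + 1


def serial_is_acceptable(serial: int) -> bool:
    """True if the serial is positive and fits in 20 DER content octets."""
    return serial > 0 and der_integer_octets(serial) <= MAX_SERIAL_OCTETS


def to_fixed_hex(serial: int) -> str:
    """Hypothetical storage key: zero-padded hex wide enough for 20 octets."""
    if not serial_is_acceptable(serial):
        raise ValueError("serial does not fit in 20 octets")
    return format(serial, "040x")
```

Sizing any storage key for the full 20 octets, as `to_fixed_hex` does here, means a later increase in serial length would not require another change to the systems that index certificates by serial number.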

Flags: needinfo?(h-kamo)

Ryan-san,

Thank you for your comments.

Yes, it is a correct understanding.

We are planning to change the EJBCA configuration to use 16-byte serial numbers.
For each of our certificate management systems, we are planning a migration that allows them to handle serial numbers of up to 20 bytes.
As a result, the serial numbers will have 64 bits of randomness, and if we need to extend beyond 16 bytes in the future, changes to the related systems can be minimized.

Thank you for your consideration.

Best regards,
Hisashi Kamo

Flags: needinfo?(h-kamo)

Ryan-san,

Please let us report the intermediate CAs that were resolved in the early morning of April 8th, as follows.

https://crt.sh/?caid=76275
https://crt.sh/?caid=80513
https://crt.sh/?caid=30351
https://crt.sh/?caid=80509
https://crt.sh/?caid=77098
https://crt.sh/?caid=79635
https://crt.sh/?caid=2200
https://crt.sh/?caid=20168
https://crt.sh/?caid=1440
https://crt.sh/?caid=101833
https://crt.sh/?caid=97708
https://crt.sh/?caid=101834
https://crt.sh/?caid=77568
https://crt.sh/?caid=1713
https://crt.sh/?caid=76274
https://crt.sh/?caid=1689
https://crt.sh/?caid=80515
https://crt.sh/?caid=17796
https://crt.sh/?caid=97709
https://crt.sh/?caid=101832
https://crt.sh/?caid=6926
https://crt.sh/?caid=1895
https://crt.sh/?caid=5064
https://crt.sh/?caid=1846
https://crt.sh/?caid=93963
https://crt.sh/?caid=12341
https://crt.sh/?caid=76
https://crt.sh/?caid=1430
https://crt.sh/?caid=307

For the remaining 9 CAs, we are arranging the schedule with our customers on an ongoing basis.
We will make our best efforts to resolve the remaining 9 CAs as soon as possible.

As part of our last comment #6 was incorrect, let us correct it as below.

> serial number have a randomness of 64bit

For the CAs that have been resolved, our serial numbers are truly random sequences with 127 bits of entropy.
Thereby, the serial numbers have more than 64 bits of randomness.
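The 127-bit figure is consistent with a 16-byte serial whose most significant bit is held at zero so that the DER INTEGER stays positive. The following is a minimal sketch under that assumption, drawing the value from a CSPRNG; `max_random_bits` and `random_serial` are hypothetical names for illustration, and this is not EJBCA's actual generator.

```python
import secrets


def max_random_bits(num_octets: int) -> int:
    """Upper bound on random bits in a positive serial of num_octets octets.

    Assumes the top bit is fixed at zero to keep the INTEGER positive.
    """
    return num_octets * 8 - 1


def random_serial(num_octets: int = 16) -> int:
    """Draw a positive serial that fits in num_octets octets from a CSPRNG."""
    if not 1 <= num_octets <= 20:
        raise ValueError("serial length must be between 1 and 20 octets")
    while True:
        serial = secrets.randbits(max_random_bits(num_octets))
        if serial != 0:  # zero is not a valid (positive) serial number
            return serial


print(max_random_bits(16))  # 127
print(max_random_bits(8))   # 63
```

Under the same assumption, an 8-byte serial carries at most 63 random bits, which falls short of a 64-bit requirement, while 16 bytes leaves a wide margin.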

Thank you for your consideration.

Best regards,
Hisashi Kamo

Ryan-san,

Please let us update our schedule for the remaining 9 CAs.

> For the remaining 9 CAs, we are arranging the schedule with our customers on an ongoing basis.
> We will make our best efforts to resolve the remaining 9 CAs as soon as possible.

Based on our arrangements with the customers, we now plan to resolve the remaining 9 CAs in the early morning of May 13.

Thank you for your consideration.

We would also like to inform you that most Japanese business entities, including us, will have a special long national holiday from April 27th to May 6th because of the new emperor’s enthronement, which you may already know about from the news; this happens only once in our history.
For that reason, we would really appreciate your understanding that we will be able to contact you again after May 7th.

Best regards,
Hisashi Kamo

Whiteboard: [ca-compliance] → [ca-compliance] - Next Update - 14-May 2019

Ryan-san,

Please let us inform you that the resolution for the remaining 9 CAs was completed in the early morning of May 13.

> Based on our arrangements with the customers, we now plan to resolve the remaining 9 CAs in the early morning of May 13.

Thank you for your consideration.

Best regards,
Hisashi Kamo

It appears that remediation has been completed.

Status: ASSIGNED → RESOLVED
Closed: 5 years ago
Resolution: --- → FIXED
Whiteboard: [ca-compliance] - Next Update - 14-May 2019 → [ca-compliance]
Product: NSS → CA Program
Whiteboard: [ca-compliance] → [ca-compliance] [uncategorized]