Closed Bug 1417777 Opened 7 years ago Closed 7 years ago

DigiCert: Insufficient entropy in serial numbers

Categories

(CA Program :: CA Certificate Compliance, task)

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: jeremy.rowley, Assigned: jeremy.rowley)

Details

(Whiteboard: [ca-compliance] [ov-misissuance] [ev-misissuance])

User Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/62.0.3202.94 Safari/537.36

Steps to reproduce:

We discovered that DigiCert's v1 validation system used random numbers with insufficient entropy when performing WHOIS-based email validations.  Instead of 112 bits of entropy, we were unfortunately only using 77 bits. 

The issue was discovered on Nov 1 by our partner, CTJ.  They reported it to a marketing representative, who forwarded the issue to me.  Unfortunately, because the report was sent the same day as closing, the issue was lost in my inbox for a while.  

Late on Nov 2, we started investigating.  Although the random value was 128 bits long, not all of those bits were random.  On Nov 3, we patched the low-entropy system and started investigating all other systems using random values.  All other systems had random values with sufficient entropy.  When we did a root cause analysis over the next week, we found that during implementation of the blessed 10 methods, one developer confused 128 bits of entropy with 128 bits of length.  

The number of certs impacted is pretty much all certs issued under the old system (which is probably close to half of the certs out there) since the requirement went into effect in March of this year. Note that the random value requirements were added prior to the July deadline for only using the blessed 10.  The new validation engine is being used for all Symantec certs on Dec 1 and any partners who order certs through DigiCert. 

I'm hoping this is a minor enough infraction, although a stupid one, that we don't have to replace all of the certificates issued, as it's a substantial number.
In the more traditional format:

Summary: DigiCert's random values used for email-based validation contained only 77 bits of entropy instead of 112 per the BRs. The effective date of the BR requirement was in March or July, depending on how you count.   

1.	How your CA first became aware of the problem (e.g. via a problem report submitted to your Problem Reporting Mechanism, a discussion in mozilla.dev.security.policy, a Bugzilla bug, or internal self-audit), and the time and date.
The issue was reported to DigiCert on Nov 1, 2017 by CTJ, a DigiCert partner. They noticed that the random value seemed less random than required.

2.	A timeline of the actions your CA took in response. A timeline is a date-and-time-stamped sequence of all relevant events. This may include events before the incident was reported, such as when a particular requirement became applicable, or a document changed, or a bug was introduced, or an audit was done.
a. Nov 1. Issue reported by CTJ to DigiCert.
b. Nov 2. Investigation into issue began. 
c. Nov 2. Insufficient entropy confirmed. 
d. Nov 3. Rolled a patch to the system to increase entropy.
e. Nov 3. Started investigation into all random value use.
f. Nov 6. Started investigation as to root cause.
g. Nov 9. Concluded no other random values had the same issue and no random values in validation V2 had the issue.
h. Nov 10. Finished root cause investigation. Determined it was a single developer who did not understand the difference between 112 bits and 112 bits of entropy. Provided training to the team on what 112 bits of entropy means.

3.	Whether your CA has stopped, or has not yet stopped, issuing certificates with the problem. A statement that you have will be considered a pledge to the community; a statement that you have not requires an explanation.
We fixed the issue on Nov 3.  All random values use at least 112 bits of entropy.

4.	A summary of the problematic certificates. For each problem: number of certs, and the date the first and last certs with that problem were issued.
Still figuring this out. Thousands at a minimum. I expect it's roughly 80% of the certificates issued through that system, which is pretty much still in use for retail.  The Symantec integration does not tie into validation v1, using v2 instead. The last cert impacted was issued on Nov 3, just before the patch was rolled out.

5.	The complete certificate data for the problematic certificates. The recommended way to provide this is to ensure each certificate is logged to CT and then list the fingerprints or crt.sh IDs, either in the report or as an attached spreadsheet, with one list per distinct problem.
Do we need to provide this? It's going to basically be a database dump. 

6.	Explanation about how and why the mistakes were made or bugs introduced, and how they avoided detection until now.
A developer failed to understand the difference between 112 bits and 112 bits of entropy. While implementing the BR requirement, we insisted on 128 bits of entropy as the minimum. The confusion caused him to think the process was done once the value was 128 bits long. The random value looked right during QA because it was of sufficient length, just not random enough.  

7.	List of steps your CA is taking to resolve the situation and ensure such issuance will not be repeated in the future, accompanied with a timeline of when your CA expects to accomplish these things.
We fixed the system and are consolidating operations to validation v2 as part of the Symantec migration. We've also segmented developers a bit better so that only one team can touch the CA issuance and a separate team touches validation. The developers comprising the teams are only experienced developers who understand the BRs.
Hi Jeremy,

Are you able to provide code or pseudocode which proves that definitely at least 77 bits of entropy were used?

The 112 bit requirement does have a safety margin in it, so if you can prove everything was at least 77, I'm not going to ask you to provide a dump of every certificate you've ever issued, or revoke them all.

Gerv
Sure thing - here's the pseudo-code:

getRandomValue(length)
{   
    Use the max of the passed length and 16
    Use the kernel random number generator to generate random binary data
    Using the character set [0123456789bcdfghjlmnpqrstvwxz]:
    While the length of the random value is less than the requested length:
        Loop through each byte of the random binary data:
            get the integer value for the byte
            if the integer value is < (29 * 8): // to avoid bias
                determine the index to use by integer value modulus 29 (the length of the set)
                concatenate the random value with the new character reached by that index
    return the random value;
}

Here's what was being passed:
getRandomValue(16);


Thus the random element was log2(29)*16 = 77.727695922 bits of entropy.
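
For readers who want to check the arithmetic, the pseudo-code above can be sketched in Python roughly as follows. This is an illustrative reconstruction, not DigiCert's actual code: the names get_random_value and entropy_bits are hypothetical, and Python's secrets module stands in for the kernel random number generator.

```python
import math
import secrets

# The 29-character set from the pseudo-code (digits plus selected consonants).
ALPHABET = "0123456789bcdfghjlmnpqrstvwxz"


def get_random_value(length: int) -> str:
    """Return a random string of max(length, 16) characters from ALPHABET."""
    length = max(length, 16)
    # Largest multiple of 29 that fits in a byte: 29 * 8 = 232.
    # Bytes at or above this limit are discarded to avoid modulo bias.
    limit = len(ALPHABET) * (256 // len(ALPHABET))
    chars = []
    while len(chars) < length:
        for byte in secrets.token_bytes(length):
            if byte < limit and len(chars) < length:
                chars.append(ALPHABET[byte % len(ALPHABET)])
    return "".join(chars)


def entropy_bits(alphabet_size: int, length: int) -> float:
    """Entropy of a uniformly random string: length * log2(alphabet size)."""
    return length * math.log2(alphabet_size)


if __name__ == "__main__":
    print(get_random_value(16))
    print(entropy_bits(len(ALPHABET), 16))
```

Each character drawn uniformly from 29 symbols carries log2(29) bits, which is a little under 5 bits, so 16 characters fall well short of 112 bits even though the string occupies 128 bits as ASCII.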
OK, sure. I agree that getting DigiCert to stick 80% of their certs in CT and providing a long list of them is overkill. As would be revoking and replacing them all.

You write:
> The random value looked right during QA because it was sufficient length, just not random enough.

Given that the lack of sufficient randomness was due to character set choice, I would have thought some test code could perhaps have detected this problem. Can you look at implementing randomness tests for all serial number generation code you currently use which are not already deprecated?
https://en.wikipedia.org/wiki/Randomness_tests

Gerv
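
As a rough illustration of the kind of test that could catch a too-small character set, here is a minimal Python sketch. It is not a full statistical randomness test: it only computes an upper bound on entropy from the characters actually observed in sample values. The function names and the REQUIRED_BITS constant are hypothetical, the 112-bit threshold is the figure cited in this bug, and the sample list is assumed to be non-empty.

```python
import math

REQUIRED_BITS = 112  # threshold cited in this bug


def max_entropy_bits(samples):
    """Upper bound on entropy per value, from the observed alphabet and length.

    Even a perfect generator cannot exceed length * log2(alphabet size),
    so a result below REQUIRED_BITS indicates a problem.
    """
    alphabet = set("".join(samples))
    length = min(len(s) for s in samples)
    return length * math.log2(len(alphabet))


def min_length(alphabet_size, required_bits=REQUIRED_BITS):
    """Smallest number of characters that reaches required_bits of entropy."""
    return math.ceil(required_bits / math.log2(alphabet_size))


if __name__ == "__main__":
    # With a 29-character set, 24 characters are needed to reach 112 bits.
    print(min_length(29))
```

A check like this passes only if the bound is at least REQUIRED_BITS. It would have flagged 16 characters over a 29-symbol set, but passing it does not prove the values are actually random.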
Thanks Gerv - we will add a test (although it will probably deploy post Dec 1)
Summary: Entropy → DigiCert: Insufficient entropy in serial numbers
Assignee: kwilson → jeremy.rowley
Whiteboard: [ca-compliance]
Status: UNCONFIRMED → RESOLVED
Closed: 7 years ago
Resolution: --- → FIXED
Product: NSS → CA Program
Whiteboard: [ca-compliance] → [ca-compliance] [ov-misissuance] [ev-misissuance]