Closed Bug 854346 Opened 11 years ago Closed 8 years ago

Treat expired certs with no revocation information as revoked, and do not allow an override

Categories

(Core :: Security: PSM, defect)

RESOLVED WONTFIX

People

(Reporter: gerv, Unassigned)

Details

At the moment, we allow a "continue anyway" option for expired certificates. We do not allow one for certificates we know to be revoked.

Some CAs and sites want to experiment with short-lived certificates, where the revocation method is "the cert expires". (The cert would have the same lifetime as an OCSP response, so the risk profile is the same.)

If any certificate, short-lived or otherwise, has no revocation pointers (CRL or OCSP) then the only way it can ever stop working is for it to expire. 

Given all that, we should treat an expired cert with no revocation information as the same as revoked, in that we should not allow an override. (The error message should remain the same, however. The only change this bug requests is the removal of the override option.)
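
To make the requested change concrete, here is a minimal sketch of the override decision, not actual PSM code. The `Cert` record and the `override_allowed` function are hypothetical names invented for illustration; the only behaviour taken from this bug is "known-revoked certs get no override" (current) and "expired certs with no CRL or OCSP pointers get no override" (proposed).

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import List, Optional


@dataclass
class Cert:
    """Hypothetical, simplified certificate record (not a real library type)."""
    not_after: datetime
    crl_urls: List[str] = field(default_factory=list)
    ocsp_urls: List[str] = field(default_factory=list)
    known_revoked: bool = False


def override_allowed(cert: Cert, now: Optional[datetime] = None) -> bool:
    """Return True if the "continue anyway" option would still be offered.

    Only the expiry and revocation cases from this bug are modelled.
    """
    now = now or datetime.now(timezone.utc)
    if cert.known_revoked:
        return False  # current behaviour: no override for known-revoked certs
    expired = now > cert.not_after
    has_revocation_pointers = bool(cert.crl_urls or cert.ocsp_urls)
    if expired and not has_revocation_pointers:
        return False  # proposed behaviour: expiry was the only "revocation"
    return True
```

Under this sketch an expired cert that does carry an OCSP or CRL pointer keeps its override, which is the distinction the rest of the discussion questions.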

Gerv
"Do not allow an override", or "make the override more difficult than a simple click"?
If my server configuration can be updated only from a web interface, protected by a certificate, and my certificate becomes expired, how will I change it?
The way to avoid this problem is not to use a cert with no revocation information on that site.

Gerv
(In reply to Gervase Markham [:gerv] from comment #0)
> Some CAs and sites want to experiment with short-lived certificates, where
> the revocation method is "the cert expires". (The cert would have the same
> lifetime as an OCSP response, so the risk profile is the same.)
> 
> If any certificate, short-lived or otherwise, has no revocation pointers
> (CRL or OCSP) then the only way it can ever stop working is for it to
> expire. 
> 
> Given all that, we should treat an expired cert with no revocation
> information as the same as revoked, in that we should not allow an override.
> (The error message should remain the same, however. The only change this bug
> requests is the removal of the override option.)

If the certificate is expired and contains a CRL DP and the browser is configured to not request CRLs (i.e. the default behavior of Firefox), then should the cert be considered revoked? If the certificate has an OCSP responder URI but the browser is configured to not send OCSP requests, should it be considered revoked? If all the revocation checking in the browser has been disabled, should we still consider expired certificates with no revocation pointers to be revoked?

The OCSP spec says (about "good"):

  "At a minimum, this positive response indicates that the certificate
   is not revoked, but does not necessarily mean that the certificate
   was ever issued or that the time at which the response was produced
   is within the certificate's validity interval. Response extensions
   may be used to convey additional information on assertions made by
   the responder regarding the status of the certificate such as
   positive statement about issuance, validity, etc."

I take this to mean that the OCSP responder may return a "good" status for a certificate that was revoked if the certificate is expired.

The current rationale for treating known-revoked certificates more harshly than expired certificates is that we've received *explicit* notification from the CA that the certificate is bad. What the bug reporter proposes is not in line with the current rationale. (On the other hand, the implementation of this in Firefox --and Chromium?-- is bad because once a certificate expires we don't check it for revocation at all.)

In particular, why would a network attacker ever go through the effort of stealing the private key of a good but expired certificate to use in an attack? They could get the same UX (just with different warning text) by using a self-signed certificate.

It seems like now if we want to infer extra badness for some subset of expired certificates, we might as well go all the way and infer that all expired certificates are revoked (regardless of whether OCSP and/or CRL DP URLs in the cert exist). I think this is a sensible position if we state the rationale as being that it is too expensive/difficult for CAs and websites to deal with protecting their legitimate certificates in perpetuity (protecting private keys forever and never throwing away any revocation information for any revoked certificate).

Accordingly, if we're going down this road, we should first attempt to simply stop allowing cert error overrides for expired certificates in general. Undoubtedly, if we do so we're going to get bug reports saying "I've been using cert overrides to access my company's webmail server (or IMAP server, in the case of Thunderbird) and now I can't check my email." My question is whether we consider accessibility of sites with expired certificates to be a feature worth supporting, or whether we'd just mark those bugs RESOLVED INVALID, or whether it is too early to say. I'm tempted to gather some telemetry on this, but first I'd like to hear what the cutoff number would be. That is, if the telemetry says that X% of users are browsing sites with expired certificates, what value of X would be high enough to make it unreasonable for us to treat all expired certificates as revoked?
One more thing: Should the same policy apply to intermediate certificates too? And, if so, how exactly? Consider these three chains:

     -> Expired Intermediate with no revo info -> Trusted Root
    /
EE +--> Revoked Intermediate -> Trusted Root
    \
     -> Intermediate -> Untrusted Root

Should we allow the cert override (i.e. prefer the bottom chain) or not (i.e. prefer one of the top two chains)? Note that this may seem like a contrived example, but actually it is as reasonable as anything PKIX-related is. Consider the case where the CA replaced the top intermediate with the middle intermediate because they wanted to "add revocation checking," and then they revoked the middle intermediate after they transitioned to the new root, which the browser doesn't trust yet. It seems wrong to let an untrusted issuer influence how we deal with certificates issued by trusted issuers. But, in this case, that would be the most reasonable thing to do. I am sure we can find counterexamples where that is the worst thing to do.
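
The ambiguity in the three-chain example can be sketched as two equally plausible aggregation rules that give opposite answers. This is only an illustration under assumptions: `PathError`, `HARD_FAIL`, and both function names are made up here, and real path builders (libpkix included) do not expose per-path results in this form.

```python
from enum import Enum
from typing import Iterable, Set


class PathError(Enum):
    """Hypothetical per-path outcome for the three candidate chains."""
    EXPIRED_NO_REVOCATION_INFO = 1
    REVOKED = 2
    UNTRUSTED_ROOT = 3


# Errors that would block the override if this bug's proposal were adopted.
HARD_FAIL: Set[PathError] = {
    PathError.REVOKED,
    PathError.EXPIRED_NO_REVOCATION_INFO,
}


def override_if_any_path_soft(paths: Iterable[PathError]) -> bool:
    """Allow the override when at least one path fails only 'softly'."""
    return any(err not in HARD_FAIL for err in paths)


def override_only_if_all_paths_soft(paths: Iterable[PathError]) -> bool:
    """Allow the override only when no path hits a hard-fail error."""
    return all(err not in HARD_FAIL for err in paths)


# The three chains from the diagram, top to bottom.
candidate_paths = [
    PathError.EXPIRED_NO_REVOCATION_INFO,
    PathError.REVOKED,
    PathError.UNTRUSTED_ROOT,
]
```

With these inputs the first rule allows the override (it effectively prefers the bottom chain) and the second forbids it (it effectively prefers one of the top two), so the outcome depends entirely on which rule the library or application happens to implement.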

I suspect that many certificate path building/verification libraries do not give the application the ability to reliably make the decision one way or another. For example, I am pretty sure that libpkix will return "no trusted issuer" for at least two of the three paths in my example above. And, if the library is configured to return trees of errors for all paths it returns, then certificate path building becomes incredibly inefficient in many common scenarios (AFAICT). And, many (all?) certificate path libraries suffer from the problem that they don't check for revocation of expired certificates (end-entity or intermediate), so making a distinction between presence or absence of revocation pointers in an expired cert is somewhat silly given that that information would be ignored regardless.

In general, when we consider full PKIX path building, it becomes difficult to define reasonable behavior that allows one to treat specific kinds of errors (except for explicit notice of revocation) worse than others.
I don't love this proposal and I'm not positive, but I think it commits a base rate fallacy. It's true that in the case of a revoked-through-short-life cert, revocation is the right conclusion to draw. But on the current internet it seems more likely to me, by a significant margin, that the site is just presenting a crappy (self-signed, auto-generated, no revocation information, long-forgotten) cert, not an actual revocation target. This isn't a great cert regardless, that's why we'll error them out, but taking away the override feels like solving an edge case of an edge case in a frustrating way.

Put the other way around: what's the threat model here? An attacker obtains a fraudulent cert from someone who doesn't provide revocation information (!) (or, at least, doesn't provide it after expiry, which I understand is more common), waits for expiry, and then attempts to trick people into clicking through our overrides in order to perpetrate the attack? That's a pretty low probability approach, but maybe for spear phishing it would be worth the attempt. Couldn't that person be victimized in more mundane ways, though? (homograph site, downloadable software?)

I feel like the barrier is already so high on SSL errors that, combined with comments 3 and 4 above, this is arguably wontfix. The problem we're solving seems improbable, and the solution has unwanted side effects (albeit also minor).

A variation of this bug, where we proposed hard-fail for expired certs *with* revocation info when we can't get an affirmative response, would avoid breaking the "benign, crappy cert" case, and so might be more palatable. But I don't know if that gives the CAs anything they want, and I still wouldn't consider it a priority unless we knew that there was a real chance of seeing this attack perpetrated, which seems unlikely. I like us being creative around this stuff, but it feels like a solution looking for a problem.
(In reply to Brian Smith (:bsmith) from comment #3)
> If the certificate is expired and contains a CRL DP and the browser is
> configured to not request CRLs (i.e. the default behavior of Firefox), then
> should the cert be considered revoked? If the certificate has an OCSP
> responder URI but the browser is configured to not fetch OCSP requests,
> should it be considered revoked? If all the revocation checking in the
> browser has been disabled, should we still consider expired certificates
> with no revocation pointers to be revoked?

Good question. What do you think? If someone disables all revocation checking, maybe they don't care if certs are revoked or not...

> The OCSP spec says (about "good"):
> 
>   "At a minimum, this positive response indicates that the certificate
>    is not revoked, but does not necessarily mean that the certificate
>    was ever issued or that the time at which the response was produced
>    is within the certificate's validity interval. Response extensions
>    may be used to convey additional information on assertions made by
>    the responder regarding the status of the certificate such as
>    positive statement about issuance, validity, etc."
> 
> I take this to mean that the OCSP responder may return a "good" status for a
> certificate that was revoked if the certificate is expired.

The OCSP spec is all kinds of broken in terms of the meanings of its responses, primarily because it was designed to allow implementations backed by CRLs. I believe Mozilla is moving towards requiring that CAs only return "good" if they have issued the certificate and it's known-good.

> The current rationale for treating known-revoked certificates more harshly
> than expired certificates is that we've received *explicit* notification
> from the CA that the certificate is bad. What the bug reporter is not in
> line with the current rationale.

A fair point.

> (On the the other hand, the implementation
> of this in Firefox --and Chromium?--is bad because once a certificate
> expires we don't check it for revocation at all.)

This is because, at least for CRLs, there is no requirement that CAs maintain revocation information for expired certificates.

> In particular, why would a network attacker ever go through the effort of
> stealing the private key of a good but expired certificate to use in an
> attack? They could get the same UX (just with different warning text) by
> using a self-signed certificate.

Fair point.

> It seems like now if we want to infer extra badness for some subset of
> expired certificates, we might as well go all the way and infer that all
> expired certificates are revoked (regardless of whether OCSP and/or CRL DP
> URLs in the cert exist).

That would have a much wider impact; certs without any revocation information in them should be a small minority.

(In reply to Johnathan Nightingale [:johnath] from comment #5)
> Put the other way around: what's the threat model here? 

The primary threat model in my mind is someone losing control of the private key of their short-life cert.

The motivation for me filing this bug was a discussion of short-life certs. Someone said "for a short-life cert, expiry _is_ revocation." And then someone else said "but browsers allow you to override expiry, but not revocation." So it seemed to me that we should treat expired short-life certs as revoked. And then I thought: "well, why not _any_ cert where expiry is the only means of revocation?"

We could keep it to short-life ones, and that would avoid catching the certs of the type johnath mentions.

Gerv
(In reply to Gervase Markham [:gerv] from comment #6)
> The OCSP spec is all kinds of broken in terms of the meanings of its
> responses, primarily because it was designed to allow implementations backed
> by CRLs. I believe Mozilla is moving towards requiring that CAs only return
> "good" if they have issued the certificate and it's known-good.

That is a great idea, for CAs in our program. But, we still have to consider that there are many CAs not in our program (e.g. enterprise CAs) that are likely not going to attain that standard. (Probably, in general, we will have to start distinguishing CAs in our program from CAs that are not in our program in more ways.)

> > (On the the other hand, the implementation
> > of this in Firefox --and Chromium?--is bad because once a certificate
> > expires we don't check it for revocation at all.)
> 
> This is because, at least for CRLs, there is no requirement that CAs
> maintain revocation information for expired certificates.

Right. However, I think that this is actually unintended behavior in our cert override mechanism, because it doesn't just apply to expired certs or untrusted issuers but also to domain name mismatches.

> > It seems like now if we want to infer extra badness for some subset of
> > expired certificates, we might as well go all the way and infer that all
> > expired certificates are revoked (regardless of whether OCSP and/or CRL DP
> > URLs in the cert exist).
> 
> That would have a much wider impact; certs without any revocation
> information in them should be a small minority.

I agree with "should" meaning it is bad for there to be a lot of them. However, I disagree with "should" meaning that we think that is true. Especially if we count certs that only have CRLDP and no OCSP (so, no revocation checking in Firefox), there are a lot of such sites--especially CDN-hosted sites.

> The primary threat model in my mind is someone losing control of the private
> key of their short-life cert.

I expect that short-lived certificates will get renewed automatically with the *same* key, so loss of control of the private key is very similar to the long-lived cert case.

In general, why should we require site owners to protect their private keys beyond the notAfter period of the corresponding certificate? IMO, this is an undue burden for site owners. They've explicitly told us (through the notAfter date) when we should stop trusting the cert and we're not respecting that. (OTOH, we actually do honor their request if they are HSTS, so one argument is that without HSTS this is a "do what I mean, not what I say" situation.)

> The motivation for me filing this bug was a discussion of short-life certs.
> Someone said "for a short-life cert, expiry _is_ revocation." And then
> someone else said "but browsers allow you to override expiry, but not
> revocation." So it seemed to me that we should treat expired short-life
> certs as revoked. And then I thought: "well, why not _any_ cert where expiry
> is the only means of revocation?"

"browser allow you to override expiry, but not revocation" is not true, as I explained above. Even in a browser that checks revocation of intermediates and tries to prioritize revocations over other errors, there are various gray areas that may cause the browser to allow an override to a site that chains to 99 revoked certificates if there is a network error on OCSP check for the 100th candidate certificate, and/or if AIA certificate fetching fails. And, also, once the certificate has actually expired, then we don't honor the revocation anymore.

> We could keep it to short-life ones, and that would avoid catching the certs
> of the type johnath mentions.

In one of the original proposals, there was the idea that short-lived certificates would be given a special policy OID to identify them as certificates that shouldn't allow cert overrides for expiration. That makes some sense to me.

The thing that makes less sense is saying that a cert where notAfter < notBefore + X does not allow cert overrides but a cert where notAfter > notBefore + X does.
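
A rough sketch of the two tagging methods, to show where the lifetime rule gets awkward. Everything named here is an assumption for illustration: the OID string is a placeholder (no OID is given in this bug), and the 4-day value for X is arbitrary.

```python
from datetime import datetime, timedelta, timezone
from typing import Iterable

# Placeholder policy OID, invented for this example only.
SHORT_LIVED_POLICY_OID = "2.999.1"


def no_override_by_oid(policy_oids: Iterable[str]) -> bool:
    """Explicit tagging: the CA opts the cert in via a policy OID."""
    return SHORT_LIVED_POLICY_OID in set(policy_oids)


def no_override_by_lifetime(not_before: datetime,
                            not_after: datetime,
                            x: timedelta) -> bool:
    """Implicit tagging: notAfter < notBefore + X."""
    return not_after < not_before + x


issued = datetime(2013, 3, 25, tzinfo=timezone.utc)
x = timedelta(days=4)  # arbitrary threshold, not from the source
just_under = issued + x - timedelta(seconds=1)
just_over = issued + x + timedelta(seconds=1)
```

Two certs whose lifetimes differ by two seconds (`just_under` and `just_over`) land on opposite sides of the lifetime rule, whereas the OID rule depends only on what the CA chose to assert.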

(In reply to Johnathan Nightingale [:johnath] from comment #5)
> Put the other way around: what's the threat model here?

The threat model is that a site stops caring about whether a private key of a certificate that they used in the 1990's can be used to attack users of their site, because that cert expired 15 years ago. The site cannot request revocation that works in any case because (a) browsers won't check that cert for revocation in the first place, and (b) they likely don't have a working relationship with the CA that issued them the cert 15 years ago. In general, honoring the notAfter date makes system administration easier and better for sites that are doing the right things at the expense of sites that aren't doing the right things, whereas allowing these overrides punishes every site for the benefit of sites that are doing the wrong thing.

Further, the sites that train their users to click through the expired cert errors for their own site end up training our users to click through cert errors in general. 

> Couldn't that person be victimized in more mundane ways,
> though? (homograph site, downloadable software?)

I think our goal should be to do smart things for cert errors so that we can eventually get rid of the "whatever" button. I think it is a valid argument to say that we should do all those smart things first before we start getting stricter about enforcing any particular constraint like notAfter. However, once we've fixed the cases where *we're* being stupid, then we're left (hopefully) with the cases where the worst websites are creating problems for normal websites.

As an example of being smarter in this case: if a cert has been valid for a long time and is about to expire, maybe we should have some kind of unobtrusive but notable message in the UI pointing out that the site is about to break in the next 7 or fewer days. Although that sounds like a suboptimal UX, I bet it would be pretty effective in getting admins to keep their certs from expiring in the first place.
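
A minimal sketch of that check, assuming a 7-day warning window as suggested above. The function name and the `min_age` parameter (a stand-in for "has been valid for a long time", set to 30 days purely as a guess) are hypothetical.

```python
from datetime import datetime, timedelta, timezone


def should_warn_about_expiry(not_before: datetime,
                             not_after: datetime,
                             now: datetime,
                             warn_window: timedelta = timedelta(days=7),
                             min_age: timedelta = timedelta(days=30)) -> bool:
    """Return True if an unobtrusive "expires soon" notice would be shown."""
    if now >= not_after:
        return False  # already expired; the normal cert error applies instead
    valid_long_enough = now - not_before >= min_age
    expiring_soon = not_after - now <= warn_window
    return valid_long_enough and expiring_soon
```

The `min_age` condition also keeps the notice from firing permanently on short-lived certs, which would otherwise always be within a week of expiry.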

> I feel like the barrier is already so high on SSL errors that, combined with
> comment 3 and 4 above, this is arguably wontfix. The problem we're solving
> seems improbable, and the solution has unwanted side effects (albeit also
> minor).

I agree that this shouldn't be a high priority. However, it is important that the new short-lived certificate proposal succeeds, so I think it is important for us to continue to watch and participate in the discussion about this.

> I still wouldn't consider it a priority unless we knew that there was a
> real chance of seeing this attack perpetrated, which seems unlikely. I
> like us being creative around this stuff, but it feels like a solution
> looking for a problem.

At least, there are more important things to do before this becomes the top priority.
> > > (On the the other hand, the implementation
> > > of this in Firefox --and Chromium?--is bad because once a certificate
> > > expires we don't check it for revocation at all.)
> > 
> > This is because, at least for CRLs, there is no requirement that CAs
> > maintain revocation information for expired certificates.
> 
> Right. However, I think that this is actually unintended behavior in our
> cert override mechanism, because it doesn't just apply to expired certs or
> untrusted issuers but also to domain name mismatches.

I'm not sure I follow - RFC 3280/5280 are pretty clear on this: 

(From Section 5)
   A complete CRL lists all unexpired certificates, within its scope,
   that have been revoked for one of the revocation reasons covered by
   the CRL scope.  A full and complete CRL lists all unexpired
   certificates issued by a CA that have been revoked for any reason.


It's not just that site operators stop caring about their key after expiration - their CA does as well. You can equally see this within the CP/CPSes of most CAs participating within the Mozilla root program.

This is because the alternative - including all expired certificates in CRLs - would *vastly* balloon the already large CRLs that exist (some upwards of 10-50 MB, and that's JUST with unexpired certificates).
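
As a rough illustration of the "complete CRL lists all unexpired certificates" rule quoted above, here is a sketch of how a revoked cert drops off the CRL once it expires. The `build_complete_crl` function and the serial-to-notAfter mapping are invented for the example and do not correspond to any CA's real tooling.

```python
from datetime import datetime, timezone
from typing import Dict, List


def build_complete_crl(revoked: Dict[int, datetime], now: datetime) -> List[int]:
    """Serial numbers of revoked certs that are still unexpired at `now`.

    `revoked` maps a revoked cert's serial number to its notAfter date.
    """
    return sorted(serial for serial, not_after in revoked.items()
                  if not_after > now)


revoked = {
    1001: datetime(2013, 1, 1, tzinfo=timezone.utc),  # revoked, since expired
    1002: datetime(2014, 1, 1, tzinfo=timezone.utc),  # revoked, still unexpired
}
crl_serials = build_complete_crl(revoked, datetime(2013, 6, 1, tzinfo=timezone.utc))
```

In this sketch serial 1001 no longer appears after its notAfter date, so a client consulting the CRL cannot tell a revoked-then-expired cert from one that was never revoked.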

OCSP, which was originally designed to be fed from a CRL (hence all of the many, many discussions on "good" vs "unknown" and whether or not it's possible), equally shares this, although RFC 2560 provides for a model that has been implemented by no public CA ever (to my knowledge) - the Section 4.4.4 archive cutoff.

> > The primary threat model in my mind is someone losing control of the private
> > key of their short-life cert.
> 
> I expect that short-lived certificates will get renewed automatically with
> the *same* key, so loss of control of the private key is very similar to the
> long-lived cert case.
> 
> In general, why should we require site owners to protect their private keys
> beyond the notAfter period of the corresponding certificate? IMO, this is an
> undue burden for site owners. They've explicitly told us (through the
> notAfter date) when we should stop trusting the cert and we're not
> respecting that. (OTOH, we actually do honor their request if they are HSTS,
> so one argument is that without HSTS this is a "do what I mean, not what I
> say" situation.)

I'm not really sure I follow this argument. When a certificate is expired, the user is minimally warned of untrustworthiness - the same as if a MITM were to sit on their network and send an untrusted cert.

What exactly is the burden being created here? That if users were to look at the cert, it would have the name of some CA that users *might* recognize, and thus "must" be trustworthy? What's to prevent an attacker from just spinning up an entire fake chain that "looks real"? In Chrome for Windows, we see malware doing this all the time - trying to look as 'legit' as possible, modulo the key, so I don't see how this is different from expiration.

> 
> > The motivation for me filing this bug was a discussion of short-life certs.
> > Someone said "for a short-life cert, expiry _is_ revocation." And then
> > someone else said "but browsers allow you to override expiry, but not
> > revocation." So it seemed to me that we should treat expired short-life
> > certs as revoked. And then I thought: "well, why not _any_ cert where expiry
> > is the only means of revocation?"
> 
> "browser allow you to override expiry, but not revocation" is not true, as I
> explained above. Even in a browser that checks revocation of intermediates
> and tries to prioritize revocations over other errors, there are various
> gray areas that may cause the browser to allow an override to a site that
> chains to 99 revoked certificates if there is a network error on OCSP check
> for the 100th candidate certificate, and/or if AIA certificate fetching
> fails. And, also, once the certificate has actually expired, then we don't
> honor the revocation anymore.
> 
> > We could keep it to short-life ones, and that would avoid catching the certs
> > of the type johnath mentions.
> 
> In one of the original proposals, there was the idea that short-lived
> certificates would be given a special policy OID to identify them as
> certificates that shouldn't allow cert overrides for expiration. That makes
> some sense to me.

+1

> 
> The thing that makes less sense is saying that a cert where notAfter <
> notBefore + X does not allow cert overrides but a cert where notAfter >
> notBefore + X does.

++1

> 
> (In reply to Johnathan Nightingale [:johnath] from comment #5)
> > Put the other way around: what's the threat model here?
> 
> The threat model is that a site stops caring about whether a private key of
> a certificate that they used in the 1990's can be used to attack users of
> their site, because that cert expired 15 years ago. The site cannot request
> revocation that works in any case because (a) browsers won't check that cert
> for revocation in the first place, and (b) they likely don't have a working
> relationship with the CA that issued them the cert 15 years ago. In general,
> honoring the notAfter date makes system administration easier and better for
> sites that are doing the right things at the expense of sites that aren't
> doing the right things, whereas allowing these overrides punishes every site
> for the benefit of sites that are doing the wrong thing.

Could you define "attack"?

Is this attack at all different from a 'simple' MITM with an untrusted cert? If so, how?
(In reply to Brian Smith (:bsmith) from comment #7)
> > We could keep it to short-life ones, and that would avoid catching the certs
> > of the type johnath mentions.
> 
> In one of the original proposals, there was the idea that short-lived
> certificates would be given a special policy OID to identify them as
> certificates that shouldn't allow cert overrides for expiration. That makes
> some sense to me.

Ryan ++ed this; I'm interested to understand why one method of tagging such certs is bad but another is good. I agree the "check the lifetime" way is a bit side-channel, but it avoids making the cert bigger, and would have the same result 99.9% of the time.

> As an example of being smarter in this case: if a cert has been valid for a
> long time and is about to expire, maybe we should have some kind of
> unobtrusive but notable message in the UI pointing out that the site is
> about to break in the next 7 or less days. Although that sounds like a
> suboptimal UX, I bet it would be pretty effective in getting admins to avoid
> their certs from expiring in the first place.

In other words, you want to ship Expiry Canary? :-)
https://addons.mozilla.org/en-US/firefox/addon/expiry-canary/

That would be cool, particularly as admins could then push out the warning time and/or change the domain warning list for their own personal browsers.

Gerv
This isn't something we're going to implement for the time being.
Status: NEW → RESOLVED
Closed: 8 years ago
Resolution: --- → WONTFIX