tryLater OCSP response causes hard failure when stapled

Status: NEW
Assignee: Unassigned
Product: Core
Component: Security: PSM
Priority: P3
Severity: normal
Opened: a year ago
Updated: 7 months ago
Reporter: mt
Firefox Tracking Flags: Not tracked
Whiteboard: [psm-backlog]
Attachments: 1 attachment

(Reporter)

Description

a year ago
Created attachment 8818185 [details]
PCAP of handshake with www.isc.org

This is probably a WONTFIX, but I want to get this on the record.

www.isc.org is currently stapling an OCSP response whose status is tryLater.  That's insane, but...
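
For context, here is a minimal sketch of what such a staple looks like on the wire. Per RFC 6960, an OCSPResponse is a SEQUENCE whose first element is an ENUMERATED responseStatus, and error responses such as tryLater carry no responseBytes and are unsigned. The helper names below are hypothetical and this is not how PSM or moz::pkix parses the response; it only reads the status byte.

```python
from typing import Tuple

# Status codes from RFC 6960, section 4.2.1 (value 4 is not used).
OCSP_RESPONSE_STATUS = {
    0: "successful",
    1: "malformedRequest",
    2: "internalError",
    3: "tryLater",
    5: "sigRequired",
    6: "unauthorized",
}


def read_length(der: bytes, pos: int) -> Tuple[int, int]:
    """Return (length, position after the length octets) for a DER length."""
    first = der[pos]
    if first < 0x80:  # short form: the byte is the length itself
        return first, pos + 1
    count = first & 0x7F  # long form: low bits give the number of length bytes
    if count == 0:
        raise ValueError("indefinite length is not valid DER")
    length = int.from_bytes(der[pos + 1 : pos + 1 + count], "big")
    return length, pos + 1 + count


def ocsp_response_status(der: bytes) -> str:
    """Extract responseStatus from a DER-encoded OCSPResponse (hypothetical helper)."""
    if len(der) < 5 or der[0] != 0x30:  # outer SEQUENCE tag
        raise ValueError("not a DER SEQUENCE")
    _, pos = read_length(der, 1)
    if pos + 3 > len(der):
        raise ValueError("truncated response")
    if der[pos] != 0x0A or der[pos + 1] != 0x01:  # ENUMERATED, length 1
        raise ValueError("expected a one-byte ENUMERATED responseStatus")
    return OCSP_RESPONSE_STATUS.get(der[pos + 2], "unknown")


# A complete tryLater response: SEQUENCE { ENUMERATED 3 }, five bytes in total.
TRY_LATER = bytes.fromhex("30030a0103")
print(ocsp_response_status(TRY_LATER))
```

Because the whole response is five unsigned bytes, a client cannot tell whether the tryLater originated at the CA's responder or somewhere in between.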

Chrome connects happily.  However, Firefox aborts the load.

All the OCSP prefs are at their default.  Critically, I don't have security.OCSP.require enabled.  It's only when I set security.ssl.enable_ocsp_stapling to false that it loads.

As I type this in, the problem has corrected itself, so while I did test Edge (it loaded), I don't know if the problem was already fixed or not.

Comment 1

Ryan, do you have any comments on this? I can't find much discussion of this particular issue: what to do when a stapled OCSP response is invalid for whatever reason. The answer may, of course, depend on what that reason is.
Flags: needinfo?(ryan.sleevi)

Comment 2

11 months ago
re: Comment #1: I suspect that's a broader conversation the Mozilla security team would want to have, and y'all do you, but a few thoughts:

1) From the Chrome side, we don't (currently) validate the OCSP stapled response for anything. We're mostly just grabbing it for SCTs, but we also push it to 'the OS' (aka either slap it to NSS or pass it to Windows).
2) When I say "slap it to NSS", we're still using the libpkix path in Chrome, and so it may simply be a behaviour divergence with moz::pkix
3) There's a philosophical debate about what to do if we get stapled junk
  a) One view says that unless we're in a must-staple mode, then stapling junk is no different than receiving junk from the CA, and we should treat it the same (e.g. "tryLater" is thus ignored, unless a positive response is required - such as the case for EV historically)
  b) Another view says that Postel's Law is more of a suggestion, and to help the ecosystem, we should abort the load. Servers should know not to staple junk, because as long as servers are stapling junk sometimes, then must-staple is impractical to deploy (since using must-staple with a junk-stapling server == broken users)
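
To make the a-vs-b split concrete, here is a rough sketch of the two views as decision functions. All names (StapledStatus, Decision, lenient_policy, strict_policy) are hypothetical, and "junk" lumps together tryLater, expired, malformed and similar responses. It illustrates the argument only; it is not what Chrome, NSS or moz::pkix actually implement.

```python
from enum import Enum


class StapledStatus(Enum):
    GOOD = "good"        # valid response, certificate not revoked
    REVOKED = "revoked"  # valid response, certificate revoked
    JUNK = "junk"        # tryLater, expired, malformed, and similar


class Decision(Enum):
    PROCEED = "proceed"
    IGNORE_STAPLE = "treat as if nothing was stapled"
    ABORT = "abort the load"


def lenient_policy(status, must_staple=False, positive_required=False):
    """View (a): junk from the server is handled like junk from the CA."""
    if status is StapledStatus.REVOKED:
        return Decision.ABORT
    if status is StapledStatus.GOOD:
        return Decision.PROCEED
    # Junk only matters when a positive response is mandatory,
    # e.g. must-staple, or EV historically.
    if must_staple or positive_required:
        return Decision.ABORT
    return Decision.IGNORE_STAPLE


def strict_policy(status, must_staple=False, positive_required=False):
    """View (b): anything other than a good staple aborts the load."""
    if status is StapledStatus.GOOD:
        return Decision.PROCEED
    return Decision.ABORT


# The case in this bug: a stapled tryLater with default settings.
print(lenient_policy(StapledStatus.JUNK))
print(strict_policy(StapledStatus.JUNK))
```

The two functions differ only in the junk case without must-staple, which is exactly the situation reported here.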

We (Chrome) haven't decided on a-vs-b, but I am very much a believer in the b-camp. However, whether or not b is practical depends on the failure rate/stapled junk rate, and Emily Stark was working to explore and gather metrics there in conjunction with her expect-staple explorations. I'm not sure of the current status of that, though.

I think Firefox is doing the right thing, instinctively, but I don't believe that was a data-driven or experimentally-supported change, so you might find it difficult to justify that if you find it breaks users. Best I can do for now is to cheer.

Does that help?
Flags: needinfo?(ryan.sleevi)
(Reporter)

Comment 3

11 months ago
Cheering does help.

This must be rare, but obviously not rare enough.  If we could expand our telemetry to cover the error code, we might be able to make an informed decision.

One very important piece of context... Check out the default for Apache, which is - as I said - bananas: https://httpd.apache.org/docs/2.4/mod/mod_ssl.html#sslstaplingreturnrespondererrors
Priority: -- → P3
Whiteboard: [psm-backlog]