Closed Bug 1339329 Opened 7 years ago Closed 7 years ago

Incorrect messaging and user instructions for expired SSL/TLS certificates

Categories

(Firefox :: Security, defect)

51 Branch
defect
Not set
normal

Tracking

()

RESOLVED FIXED
Firefox 58
Tracking Status
firefox58 --- fixed

People

(Reporter: webmastersguide2000, Assigned: Gijs)

References

Details

Attachments

(3 files)

Attached file invision_cert.pem
User Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.12; rv:51.0) Gecko/20100101 Firefox/51.0
Build ID: 20170125094131

Steps to reproduce:

Visit a site with a recently expired TLS (SSL) certificate. (I have attached the certificate I used for testing).


Actual results:

Your computer thinks it is 2/13/2017, when it should be 2/9/2017. To fix this problem, change your date and time settings to match the correct time.

Not Firefox said "it should be 2/9/2017", when it was in fact 2/13/2017", and the certificate truly was expired.




Expected results:

The certificate is actually expired, and should be treated as expired, with an appropriate error message.
I missed typing a few words on the initial report.
Firefox gives and incorrect error message for the expired certificate, and suggests an incorrect course of action.

The site did actually have an expired certificate. Firefox misreported the problem, showing this error message instead:
--
Your computer thinks it is 2/13/2017, when it should be 2/9/2017. To fix this problem, change your date and time settings to match the correct time.
--

That message is incorrect. I have not (yet) analyzed the related code. Because detection of expired certificates is incorrect, in that even with an innocent user and innocent server, it incorrectly diagnoses the problem as incorrect local time, it's quite likely the incorrect code can be manipulated by attacker with worse results.
Mark or Johann, any idea how this would happen? Is there a way for Ray to provide us with debug data about his kinto ping?

Dan, does this need to stay closed as a security issue? I don't think so, but I'd like a second opinion.
Flags: needinfo?(mgoodwin)
Flags: needinfo?(jhofmann)
Flags: needinfo?(dveditz)
Ray, can you please post the value of your "services.blocklist.clock_skew_seconds" and "services.blocklist.last_update_seconds" preferences from about:config?

This is very likely bug 1299403 (which is public). The question is why it's happening again. Maybe your blocklist just wasn't updated in a long time.

> it's quite likely the incorrect code can be manipulated by attacker with worse results.

The time difference is taken from the response to a successful HTTPS ping to one of our blocklist servers, significantly raising the bar for an attack since that pref or the response would have to be faked first. (Dan, please correct me if I'm missing something here).

Thanks for reporting this!
Flags: needinfo?(jhofmann) → needinfo?(webmastersguide2000)
Component: Untriaged → Security
services.blocklist.clock_skew_seconds;466036
services.blocklist.last_update_seconds;1486584117

The actual time difference between the dates shown by Firefox is about 260,000 seconds - much larger than clock_skew_seconds.

I note that these two entries are in bold. I'm guessing that means they've been changed from the default, for some definition of "default" (do different platforms actually have different defaults?) This is surprising, because I don't think I changed them. I don't normally use Firefox, only a few times per year, so I would have no reason to change unusual settings.

PS - I happen to be an information security professional, writing tools to evaluate TLS security settings in another window, so I'm familiar with the issues. I haven't done any Firefox development in many years, however - my last contribution my have been to Seamonkey before Firefox 1.0, so I'm not familiar with the details of Firefox's implementation.
This doesn't need to be hidden: it's bad messaging to the user but not an attack. An attack would be if someone could force a user's clock _backwards_ so that an expired cert appears un-expired, but then they wouldn't get this message.

(In reply to Ray Morris from comment #4)
> these two entries are in bold. I'm guessing that means they've been changed from
> the default [....] This is surprising, because I don't think I changed them.

Bold does mean they are non-default values, but the "user set" heading for that column is misleading. Gecko and Firefox use the preference system as a settings store for all kinds of internal values that are "user set" only in the loosest sense that they were set while the browser was being used.

> I don't normally use Firefox, only a
> few times per year, so I would have no reason to change unusual settings.

In that case this error message might mean that you visited the expired site before the blocklist update had kicked off and the browser was using a stale value from the last time you used the browser. I don't know if the last_update_seconds is the clock on your machine or the clock
Group: firefox-core-security
Flags: needinfo?(webmastersguide2000)
Flags: needinfo?(dveditz)
Summary: Incorrect handling of expired SSL/TLS certificates → Incorrect messaging and user instructions for expired SSL/TLS certificates
Do we check if the blocklist/kinto ping is out of date? If we don't already that seems like something that might be worth doing.
Flags: needinfo?(jhofmann)
> The actual time difference between the dates shown by Firefox is about 260,000 seconds - much larger than clock_skew_seconds.

No. 466036 > 260000

Assuming you meant _smaller_, it's indeed weird that we display a difference of 3-4 days when the value is about 5 days, but setting clock_skew_seconds to 260000 locally shows me the correct result of 3 days and I can't replay the circumstances that led to this slight difference on your computer, so I'd say this is not worth looking into.

> Do we check if the blocklist/kinto ping is out of date? If we don't already that seems like something that might be worth doing.

I don't think there's a way to check if it's out of date, because we have to assume that the system clock is wrong.

Based on the comments so far I see no indication that there's something wrong with the frontend/application code.
For some reason, when the last blocklist ping arrived at your computer it had a timestamp difference of ~5 days. Either your clock was indeed wrong on that day or there's another issue server-side. I'll leave this for mgoodwin to decide.
Flags: needinfo?(jhofmann)
(In reply to Johann Hofmann [:johannh] from comment #7)
> > The actual time difference between the dates shown by Firefox is about 260,000 seconds - much larger than clock_skew_seconds.
> 
> No. 466036 > 260000
> 
> Assuming you meant _smaller_, it's indeed weird that we display a difference
> of 3-4 days when the value is about 5 days, but setting clock_skew_seconds
> to 260000 locally shows me the correct result of 3 days and I can't replay
> the circumstances that led to this slight difference on your computer, so
> I'd say this is not worth looking into.
> 
> > Do we check if the blocklist/kinto ping is out of date? If we don't already that seems like something that might be worth doing.
> 
> I don't think there's a way to check if it's out of date, because we have to
> assume that the system clock is wrong.

Sure, but we can check if the time (according to our system clock) we fetched blocklist/kinto data is significantly behind the current system clock time (iow if the ping is out of date). If so, don't suggest the user updates their clock until we have a more recent blocklist ping (which we can implement by just not showing the custom 'update your system clock' message in this case - the blocklist will refresh by itself and we'll start giving the user that message if the more recent ping still indicates a clock skew).
(In reply to :Gijs from comment #8) 
> Sure, but we can check if the time (according to our system clock) we
> fetched blocklist/kinto data is significantly behind the current system
> clock time (iow if the ping is out of date). If so, don't suggest the user
> updates their clock until we have a more recent blocklist ping (which we can
> implement by just not showing the custom 'update your system clock' message
> in this case - the blocklist will refresh by itself and we'll start giving
> the user that message if the more recent ping still indicates a clock skew).

Right, good point, if it strongly deviates from the current system time it's likely also wrong. This would solve the case where someone had a wrong system time when they received the ping but has a different time now.

Still, if there is a server-side issue you could get a wrong ping anytime.

Also, if we set the threshold to e.g. one week (which I think is a sensible value) this bug would still have occurred.
> In that case this error message might mean that you visited the expired site
> before the blocklist update had kicked off and the browser was using a stale
> value from the last time you used the browser. 

According to History, I had last used Firefox on December 26th. So that would not account for Firefox thinking it's February 9.


> it's bad messaging to the user but not an attack.

Offhand, I don't immediately see any way to exploit this exact symptom.
I do want to mention something from my viewpoint as an outside security professional, who doesn't know the Firefox code well but has an objective view. When I'm looking for something to exploit, I look for unnecessary complexity in security-related code. That's what this entire feature seems to be - unnecessary complexity in security-related code.   All code is subject to bugs; unnecessary code around security means, for me, unnecessary weaknesses I can exploit. The fact that the Firefox team *thought* they had already fixed this, and still can't quite understand what's going on even though you've looked at it before, tells me that nobody quite understands how this code *actually* works under certain conditions (and doesn't work). 

Any time there's security-related code that's not well understood by it's developers, that's making developers scratch their head, I'm pretty confident there's an exploit somewhere in the neighborhood. The next two steps me,if I weren't in the middle of my ASV PCI audit today, would be to a) look around the general vicinity of this code to find where the security exposure is and b) give the exposure a cool name so I can get some press.
(In reply to Ray Morris from comment #10)
> > In that case this error message might mean that you visited the expired site
> > before the blocklist update had kicked off and the browser was using a stale
> > value from the last time you used the browser. 
> 
> According to History, I had last used Firefox on December 26th. So that
> would not account for Firefox thinking it's February 9.

We determine a clock *skew* interval based on a remote server time, not the remote server time. If, when we fetched the remote time, your computer clock was behind (for instance, because you were testing something requiring a different system time, or because you're using a VM that might not have its system time updated immediately when restored, or...?) we will have registered the skew interval as indicating your system time was behind/ahead. If the skew interval is then big enough to explain a cert being invalid, we will tell the user this.

> > it's bad messaging to the user but not an attack.
> 
> Offhand, I don't immediately see any way to exploit this exact symptom.
> I do want to mention something from my viewpoint as an outside security
> professional, who doesn't know the Firefox code well but has an objective
> view. When I'm looking for something to exploit, I look for unnecessary
> complexity in security-related code. That's what this entire feature seems
> to be - unnecessary complexity in security-related code.

Wrong system clocks are a really frequent reason for people seeing these errors. Especially in older hardware where the relevant batteries die and the clock just doesn't work. For those people not to be able to browse the internet even when certs are in fact valid is dumb, so we offer more helpful advice than just "this cert isn't valid so now you can't use the website". So I would disagree with "unnecessary".

> The fact that the Firefox team *thought* they had
> already fixed this, and still can't quite understand what's going on even
> though you've looked at it before, tells me that nobody quite understands
> how this code *actually* works under certain conditions (and doesn't work). 

The insinuation we're incompetent is pretty insulting. "What's going on" is pretty straightforward:

1) at some point in the past, we determined your system clock was wrong. We stored the interval (ie diff. between actual time according to our server, and time on your machine).
2) yesterday, you access a website whose cert is not valid in your system time, but given the interval we stored, would be valid if you corrected your system clock by the interval we stored previously.
3) firefox 'helpfully' suggests correcting your clock.

Now, it's sad that we show this message and we're incorrect, but without getting removing all the "hey, let's try to help the user" bits, it's not immediately obvious how to fix (there's a few options, anyway).

> Any time there's security-related code that's not well understood by it's
> developers, that's making developers scratch their head, I'm pretty
> confident there's an exploit somewhere in the neighborhood. The next two
> steps me,if I weren't in the middle of my ASV PCI audit today, would be to
> a) look around the general vicinity of this code to find where the security
> exposure is and b) give the exposure a cool name so I can get some press.

It's not so much that we don't understand the problem, it's that we're trying to figure out the best way to avoid the false positive you encountered, without throwing the proverbial baby out with the bathwater.
Mark, do we have data on how often users report system clocks that are significantly *ahead*? Perhaps we could limit this to only the notBefore of the cert and "out of date clocks", which seem like they're likely more prevalent than clocks that are 'ahead'.
(In reply to :Gijs from comment #12)
> Mark, do we have data on how often users report system clocks that are
> significantly *ahead*? Perhaps we could limit this to only the notBefore of
> the cert and "out of date clocks", which seem like they're likely more
> prevalent than clocks that are 'ahead'.

Since we check out of date clocks against the build date anyway right now (as a fallback if we don't have kinto pings) we could also just turn off the blocklist skew comparison if it causes too much trouble. (Or pref it?)

We could also consider adding a telemetry probe measuring the clock skew on these pages to get more information on this.
> The insinuation we're incompetent is pretty insulting.

I didn't say anyone is incompetent. I said there appears to be code that's not well understood. Scrolling up just a bit, I see a "this looks like a bug we thought we had fixed". Having a bug isn't incompetent, basically everything that has code has bugs. Pretending you'll be able to avoid having any bugs would be incompetence, imho.

I understand the motivation for the feature. It's surely useful, and just as surely not *necessary*.


> system clock by the interval we stored previously.
> without getting removing all the "hey, let's try to help the user" bits, it's not
> immediately obvious how to fix (there's a few options, anyway).

One possibility would be to ping for the correct time *now*. Yes that would add latency, as opposed to using a previously calculated value.
I mentioned I use Firefox only very occasionally. 

Is it possible that my clock skew was last set BEFORE the caching was fixed? I used Firefox briefly in December, for one site I believe. If FF doesn't necessarily set clock skew every time it's launched, that would explain things.  Due to the caching issue, in November FF sets a very wrong clock skew. I don't use FF for while.  If FF doesn't reset the skew during either my December visit to one site or my February visit, it'll still be using a very incorrect skew in February.

Solution: if last_updated < 1_week_ago; update_skew
(In reply to :Gijs from comment #12)
> Mark, do we have data on how often users report system clocks that are
> significantly *ahead*? Perhaps we could limit this to only the notBefore of
> the cert and "out of date clocks", which seem like they're likely more
> prevalent than clocks that are 'ahead'.

We don't have data to hand - but I might be able to infer it from TLS error report data. Should I have a go at this?
Flags: needinfo?(mgoodwin) → needinfo?(gijskruitbosch+bugs)
(In reply to Mark Goodwin [:mgoodwin] from comment #16)
> (In reply to :Gijs from comment #12)
> > Mark, do we have data on how often users report system clocks that are
> > significantly *ahead*? Perhaps we could limit this to only the notBefore of
> > the cert and "out of date clocks", which seem like they're likely more
> > prevalent than clocks that are 'ahead'.
> 
> We don't have data to hand - but I might be able to infer it from TLS error
> report data. Should I have a go at this?

Well, there's a number of possible solutions here:

- remove the kinto-based logic from the frontend and only use the build date
- don't show the custom error page if the kinto data is old. Aside:
(In reply to Ray Morris from comment #15)
> Solution: if last_updated < 1_week_ago; update_skew

Unfortunately this is not so easy because we can't rely on being able to update the clock skew when we're showing the error page (for instance, if you're on a wifi network with a captive portal) - so we can't block the load of the error page on this. Worse, I don't think we currently have a reasonable technical means of 'delaying' the load of the error page arbitrarily (due to the abstractions in the code), and updating the error page after it's been loaded seems like it causes other issues.

So if we go this route, I think we would need to just not show the custom time-based error stuff and give a 'regular' "This cert is not valid" error page for any such errors, until we (independently) get a more up-to-date kinto ping.


- don't show the custom error page if it's the notAfter rather than notBefore that's causing date/time issues (ie if the clock is ahead, in Firefox's math, rather than behind).



Code-wise, the last option is the simplest (that'll be like 1 extra if statement), though it's possible it ends up making users' life a little worse by no longer showing the 'helpful' error page in cases where we should do. That seems pretty unlikely though, and I figured we might have data to confirm my hypothesis there.

Future-code-complexity-wise, the first one is simplest (removing code is easy, right?). We might want to do the middle thing anyway to reduce the risk of outdated information being used here. 

Whether it's worth the data analysis to get data to confirm not many users end up with clocks that area ahead to such an extent they break TLS for those users depends how hard the data analysis is, which I don't know. I'd say, if it's an hour or 2 to get us a bit more confidence here, do it, if it's 2 weeks of work, it's not worth it.

I'll leave picking one of the options (or something else I haven't thought of yet) to you + Johann. :-)
Flags: needinfo?(gijskruitbosch+bugs) → needinfo?(mgoodwin)
Doing some tests recently, I had the following message:
"
  Nightly did not connect to expired.badssl.com because your computer’s clock appears to show the wrong time and this is preventing a secure connection.
  Your computer is set to 3/17/2017, when it should be 3/15/2017. To fix this problem, change your date and time settings to match the correct time.

  [Advanced]
  expired.badssl.com uses an invalid security certificate.
  The certificate expired on April 13, 2015 at 1:59 AM. The current time is March 17, 2017 at 9:03 PM.
  Error code: SEC_ERROR_EXPIRED_CERTIFICATE
"

May I suggest as a small mitigation not to show the bad computer's clock message if both the actual and guessed date/time are beyond the certificate expiration date. In the case above, changing the computer's date as proposed would have not fixed the problem at all.
Attached patch Patch v0.1Splinter Review
Mark, this works for me locally but unfortunately our "expired" test cert in the unit tests has equal notBefore and notAfter values, making it impossible to properly test that this works correctly in automation (and thereby breaking our existing test).

How hard would it be to have a more useful cert incorporated in automation? If it's very hard, I'm tempted to remove some of the test coverage (which would be unfortunate but, IME, hard to avoid if we actually want to fix this bug).
Assignee: nobody → gijskruitbosch+bugs
Status: UNCONFIRMED → ASSIGNED
Ever confirmed: true
Attachment #8851974 - Flags: feedback?(mgoodwin)
(clearing needinfo as both needinfo + feedback requests don't seem particularly useful)
Flags: needinfo?(mgoodwin)
Chat today in irc://irc.mozilla.org/#firefox (logged at <http://logs.glob.uno/?c=firefox#c409548>): 

* offered a screenshot of Nightly

* identified two more probable duplicates, bug 1361123 and bug 1358904 – both have been commented accordingly.
Comment on attachment 8851974 [details] [diff] [review]
Patch v0.1

Review of attachment 8851974 [details] [diff] [review]:
-----------------------------------------------------------------

This looks good.

It's hard but not very hard to have a more useful cert incorporated; the certificates can be found in build/pgo/certs - the 'special' ones (those referenced via "cert=" in server-locations.txt) are in the cert8.db file in that directory; you should be able to replace the expired cert in there with one with a different no-before value without too much trouble (you can see example usage of certutil in genpgocert.py).

Let me know if you need some help with this.
Attachment #8851974 - Flags: feedback?(mgoodwin) → feedback+
Flags: needinfo?(gijskruitbosch+bugs)
I just hit this today, running Firefox Nightly 56.0a1 (2017-07-31). The error message says "Your computer is set to 8/4/2017, when it should be 8/2/2017.", but 8/4/2017 is indeed correct. My services.blocklist.last_update_seconds is 1501623898 (which is 8/1/2017) and services.blocklist.clock_skew_seconds is 198562.

The site in question is https://www.starfighters.io/ and the cert expired back on November 12, 2016.
Mark kindly agreed he could take on pushing this over the finish line.
Flags: needinfo?(gijskruitbosch+bugs) → needinfo?(mgoodwin)
See Also: → 1412686
See Also: → 1414269
Per discussion with Johann, going to just go with this patch and skip the relevant test, and file a followup to fix the test.
Flags: needinfo?(mgoodwin)
Blocks: 1414804
Comment on attachment 8925496 [details]
Bug 1339329 - cert error time warnings should only appear for recent blocklists and if clock skew is sufficient to make cert valid,

https://reviewboard.mozilla.org/r/196636/#review202204

Thank you!

::: browser/base/content/content.js:270
(Diff revision 1)
>  
> +  _getCertValidityRange() {
> +    let {securityInfo} = docShell.failedChannel;
> +    securityInfo.QueryInterface(Ci.nsITransportSecurityInfo);
> +    let certs = securityInfo.failedCertChain.getEnumerator();
> +    let notBefore = 0, notAfter = Number.MAX_SAFE_INTEGER;

Nit: I thought it was convention to have one declaration per line, but correct me if I'm wrong.
Attachment #8925496 - Flags: review?(jhofmann) → review+
Comment on attachment 8925496 [details]
Bug 1339329 - cert error time warnings should only appear for recent blocklists and if clock skew is sufficient to make cert valid,

https://reviewboard.mozilla.org/r/196636/#review202204

> Nit: I thought it was convention to have one declaration per line, but correct me if I'm wrong.

You're not wrong, I just don't like the convention sometimes. Fixed, though - my preferences shouldn't override conventions. :-)
Pushed by gijskruitbosch@gmail.com:
https://hg.mozilla.org/integration/autoland/rev/3681cc015ec7
cert error time warnings should only appear for recent blocklists and if clock skew is sufficient to make cert valid, r=johannh
https://hg.mozilla.org/mozilla-central/rev/3681cc015ec7
Status: ASSIGNED → RESOLVED
Closed: 7 years ago
Resolution: --- → FIXED
Target Milestone: --- → Firefox 58
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: