Closed Bug 1412686 Opened 7 years ago Closed 7 years ago

SEC_ERROR_EXPIRED_CERTIFICATE incorrectly suggests changing clock

Categories

(Firefox :: Security, defect)

58 Branch
defect
Not set
normal

Tracking

()

RESOLVED DUPLICATE of bug 1339329
Tracking Status
firefox58 --- affected

People

(Reporter: rillian, Unassigned)

References

Details

I got a TLS insecure connection page today. The message says:

> Nightly did not connect to status.lineageos.org because your computer’s clock
> appears to show the wrong time and this is preventing a secure connection.
> 
> Your computer is set to 2017-10-29, when it should be 2017-10-27. To fix this
> problem, change your date and time settings to match the correct time.

This message is incorrect, the current date is, in fact October 29. We shouldn't assume the user is wrong without checking.
We get the 'correct' time from the blocklist ping, and fall back to comparing against the build id if it's not available. Since this is nightly, seeing the cert expiry was prior to the date in the build id would have helped. However, the actual fallback logic is just based on |PREF_BLOCKLIST_CLOCK_SKEW_SECONDS| being less than a day, so the code assumes the blocklist is always updated at least once in the last 24 hours.

For whatever reason, this wasn't the case for me. about:config says `services.blocklist.clock_skew_seconds;132466`.

I suggest we always do the buildid check, since it's more reliable than a network service.
> curl --head https://firefox.settings.services.mozilla.com/v1/buckets/monitor/collections/changes/records
> [...]
> Date: Thu, 19 Oct 2017 18:04:06 GMT

So it looks like the problem is triggered by the blocklist server returning the wrong date. François, do you know who to talk to about fixing that?
Flags: needinfo?(francois)
(In reply to Ralph Giles (:rillian) | needinfo me from comment #2)
> So it looks like the problem is triggered by the blocklist server returning
> the wrong date. François, do you know who to talk to about fixing that?

Jason Thomas is the ops person on that project.
Flags: needinfo?(francois)
See Also: → 1299403
Mark most likely knows about the client code that's using the server response.
(In reply to Ralph Giles (:rillian) | needinfo me from comment #1)
> We get the 'correct' time from the blocklist ping, and fall back to
> comparing against the build id if it's not available. 

What is the actual BLP request being made to the service? Are we using services.settings.server or extensions.blocklist.url?
To summarize bug 1299403:

1. This same issue happened about a year ago.

2. Cloudfront preserves the Date header from the original server response. Cache behaviour differs here, but it's a reasonable thing to do since the header varies for *every* query to the origin server.

3. Our cloudfront config was aggressively caching the results, returning the same result for multiple days, in particular when compression was enabled. That triggered the same issue I saw. Vaguely-specified changes were made to the cloudfront config to make it less aggressive, resolving the issue.

So, to resolve this bug: maybe our cloudfront config broke again? Checking unconditionally against the buildid would have helped in this bug again Nightly, but won't help users of Firefox release much.

This morning the cloudfront cache has refreshed and I get the correct certificate error page:

> $ curl -s --head https://firefox.settings.services.mozilla.com/v1/buckets/monitor/collections/changes/records | grep '^Date:'
> Date: Mon, 30 Oct 2017 17:35:30 GMT

Adding --compressed gets me a different response Date, but it's still within the last day.

Mark, what do you think?
Blocks: 712612
Flags: needinfo?(mgoodwin)
(In reply to Ralph Giles (:rillian) | needinfo me from comment #8) 
> So, to resolve this bug: maybe our cloudfront config broke again? Checking
> unconditionally against the buildid would have helped in this bug again
> Nightly, but won't help users of Firefox release much.

We haven't made any cloudfront config changes in several months.
(In reply to Ralph Giles (:rillian) | needinfo me from comment #1)
> We get the 'correct' time from the blocklist ping, and fall back to
> comparing against the build id if it's not available. Since this is nightly,
> seeing the cert expiry was prior to the date in the build id would have
> helped. However, the actual fallback logic is just based on
> |PREF_BLOCKLIST_CLOCK_SKEW_SECONDS| being less than a day, so the code
> assumes the blocklist is always updated at least once in the last 24 hours.

Not really - it checks that the *difference at time of fetching* of the blocklist date and your local machine's date is > 1 day. That is, if according to the blocklist ping, your machine's clock is off by more than 1 hour. It's unrelated to how recent the last blocklist fetch was, because we store the delta at the time of the fetch and basically just assume your clock is still off by approximately that much.

> I suggest we always do the buildid check, since it's more reliable than a
> network service.

Unfortunately we don't have data about how often this would produce a helpful response vs. the cloud service response, though it's clearly going to be a strict subset. It would obviously reduce false positives given that we're realizing the cloud service response is not as reliable as one might want... When the code was written, the assumption was that the cloud response would be reliable. If that is going to be hard to maintain in future, perhaps we need to revisit that, but I don't know who can make that call. Perhaps Ops could do monitoring for this kind of issue so it doesn't take 10 days for someone to realize the ping dates are stale...

In case it's helpful, there's a half-finished patch on bug 1339329 that:
- only uses the blocklist skew if we last checked the blocklist less than 5 days ago (ie the skew is not stale)
- only uses the skew if it would actually make the cert valid (so it doesn't just tell you "hey, update your clock" whenever you encounter any old outdated cert).

Unfortunately it ran into test issues. Mark was going to take a look at that after I ran into some certutil issues trying to finish it off, but I expect he's been busy. :-)
See Also: → 1397061, 1339329
I don't really see why this isn't a dupe of bug 1339329. :)

I'm also going to spin off a bug for disabling the kinto based warning until we resolve this.
Status: NEW → RESOLVED
Closed: 7 years ago
Component: Site Identity and Permission Panels → Security
Flags: needinfo?(mgoodwin)
Resolution: --- → DUPLICATE
See Also: → 1414269
You need to log in before you can comment on or make changes to this bug.