Closed Bug 1299403 Opened 8 years ago Closed 7 years ago

incorrect "Your computer thinks it is (date), when it should be (date)" warning

Categories

(Cloud Services :: Server: Remote Settings, defect, P1)

defect

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: glob, Assigned: dmaher)

References

Details

a discussion in #security lead to freddyb reporting that firefox is telling him that his computer's clock is incorrectly set (http://logs.glob.uno/?c=mozilla%23security&s=31+Aug+2016&e=31+Aug+2016#c92236).

> Your computer thinks it is 31.8.2016, when it should be 10.8.2016

i was curious so i did some digging.

it looks like the clock skew detection logic is based on the "date" header returned by firefox.settings.services.mozilla.com.

i see invalid date headers when there's an x-cache:miss from cloudfront, which may trigger invalid clock skew warnings:

here's a forced-refresh response from https://firefox.settings.services.mozilla.com/v1/buckets/monitor/collections/changes/records :

Access-Control-Allow-Origin: *
Age: 2042
Cache-Control: max-age=3600
Connection: keep-alive
Content-Length: 183
Content-Type: application/json; charset=UTF-8
Date: Mon, 08 Aug 2016 22:33:46 GMT
Etag: "1470695074322"
Expires: Mon, 08 Aug 2016 23:33:46 GMT
Last-Modified: Mon, 08 Aug 2016 22:24:34 GMT
Total-Records: 1
Via: 1.1 b08d3fd1ea7c0f4b62f5adbb976ab099.cloudfront.net (CloudFront)
X-Amz-Cf-Id: 5G85_pjWusrVlkwaDhsROoa6mzZf5fHehzyAPitGi4l7Ime6Ka9j9A==
X-Cache: Hit from cloudfront
access-control-expose-headers: Content-Length, Expires, Alert, Retry-After, Last-Modified, Total-Records, ETag, Pragma, Cache-Control, Backoff, Next-Page

then a normal refresh, which results in a 304 not modified:

Connection: keep-alive
Date: Wed, 31 Aug 2016 08:04:12 GMT
Etag: "1470695074322"
Last-Modified: Mon, 08 Aug 2016 22:24:34 GMT
Via: 1.1 b08d3fd1ea7c0f4b62f5adbb976ab099.cloudfront.net (CloudFront)
X-Amz-Cf-Id: egX8MubSmRir2vk9nCh-2k9yDdX5rTjoV22atCM4tmSNqmuFKP0V6A==
X-Cache: Miss from cloudfront
access-control-expose-headers: Content-Length, Expires, Alert, Retry-After, Last-Modified, Total-Records, ETag, Pragma, Cache-Control, Backoff, Next-Page

note the "date" header is incorrect in the first response, and correct in the second.
(In reply to Byron Jones ‹:glob› from comment #0)
> i see invalid date headers when there's an x-cache:miss from cloudfront,
> which may trigger invalid clock skew warnings

oops, that should read "i see invalid date headers when there's an x-cache:hit from cloudfront".
Assignee: nobody → tarek
Component: Security: UI → Server: Kinto
Product: Core → Cloud Services
I checked at the Kinto server level and Date header have got the expected behavior so it seems that CloudFront is caching the Date header for us.
My reading of rfc2616 is that a cache giving the Date from when the "message was originated" is probably correct behavior.

Given that, he question is: How do we get an (accurate) server time to the client when a CDN is involved?
(In reply to Mark Goodwin [:mgoodwin] from comment #3)
> My reading of rfc2616 is that a cache giving the Date from when the "message
> was originated" is probably correct behavior.

The RFC in question allows for varying behaviour with regards to end-to-end headers depending on whether the intermediary is considered transparent or non-transparent. This explains why some CDNs (et al.) behave in a different way. IMHO, Cloudfront's behaviour is correct within a particular interpretation of the RFC (which is, of course, correct *enough*). ;)

> Given that, he question is: How do we get an (accurate) server time to the
> client when a CDN is involved?

It does *not* appear possible to modify the behaviour of Cloudfront with regards to the Date field; we'll need to look elsewhere for a solution.
(In reply to Daniel Maher [:phrawzty] from comment #4)
> It does *not* appear possible to modify the behaviour of Cloudfront with
> regards to the Date field; we'll need to look elsewhere for a solution.

OK; I think we're looking at one of two solutions here:
1) Something like a periodic no-op update of the changes collection to force the date to update every now and then.
2) Get the server time from the Date header on retrieving the blocklist.xml somehow and pass that into the blocklist-updater.

I'm currently playing with 2. Can someone (:natim?) look at the possibility of option one?
Flags: needinfo?(rhubscher)
> 1) Something like a periodic no-op update of the changes collection to force the date to update every now and then.

I though this was already what we had. The TTL on the CDN is 24 hours (as far as I know) we could make it lower but in anycase it explains a difference of 21 days.

I am wondering if the CDN checks the ETag every 24 hours and refresh the cache only if it changed. which would explain that the Date header is not updated every 24 hours.

I don't know enough of how CloudFront works to answer you if it is possible to force an update every 24 hours for instance.

mgoodwin: Do you know if 24 hours would be a great time window fit?
phrawtzy: Do you know if we can force an update every 24 hours in the CDN?
Flags: needinfo?(rhubscher) → needinfo?(dmaher)
The default behaviour of Cloudfront is relatively straightforward and is generally well documented[0]. However, we are not not using CF's default settings, and the behaviour we're seeing is at least partially indicative of such. In particular, I'm thinking about the "whitelisted" headers[1], which Cloudfront takes into account when deciding when and how to cache/expire objects. I mention this because the Stage instance *didn't* whitelist[2] these headers and the behaviour was much more sane (i.e. regular hourly expirations, etc). 

It may be that we're shooting ourselves in the foot with the custom settings...


[0] https://docs.aws.amazon.com/AmazonCloudFront/latest/DeveloperGuide/WorkingWithObjects.html
[1] If-Match, If-Modified-Since, If-None-Match, and If-Unmodified Since
[2] I brought the config in line with Prod yesterday, which is as it should be.
Flags: needinfo?(dmaher)
> I don't know enough of how CloudFront works to answer you if it is possible to force an 
> update every 24 hours for instance.

I think what Mark is thinking about is a ping/poke on Kinto so its Date header changes even if the data has not changed, and that's propagated to the CDN so Firefox can pick it up. Can you think about something we can do there ?


In any case, we should try hard to fix the Date header so we don't have to uplift a change in Firefox
Flags: needinfo?(rhubscher)
(In reply to Tarek Ziadé (:tarek) from comment #8)
> I think what Mark is thinking about is a ping/poke on Kinto so its Date
> header changes even if the data has not changed, and that's propagated to
> the CDN so Firefox can pick it up. Can you think about something we can do
> there ?

Kinto (the service itself) always returns a "live" date field when queried directly. This is normal and respects RFC 2616 concerning message generation. Since this is normal, Cloudfront doesn't consider this header at all as part of the caching algorithm - that wouldn't make any sense, since this field is *always* updated.

Changing a different header would make more sense, such as ETag (mentioned by Rémy in comment #6).

(In reply to Daniel Maher [:phrawzty] from comment #7)
> It may be that we're shooting ourselves in the foot with the custom
> settings...

(In reply to Tarek Ziadé (:tarek) from comment #8)
> In any case, we should try hard to fix the Date header so we don't have to
> uplift a change in Firefox

We can't change how Cloudfront generates the content of the Date header, so the only way to ensure that said field is modified is to work within the confines of the caching mechanism itself.

IMHO, we are erring on the side of cache retention (to our detriment) beyond what is really necessary. Thus, beyond modifying Kinto's responses (see above), I would further submit that we should be more aggressive with cache invalidation (and less aggressive with cache retention) on the whole.  I think that these two approaches together will get us the behaviour we want.

Comments?
We could add a job to update the change collection ETag on time to time.
But I wish we could configure that at the CDN level.
Flags: needinfo?(rhubscher)
Are we any closer to working out why Cloudfront behaved the way it did? From testing this again today, I can't get a response with a Date header older than a few minutes old. Have there been any changes in the configuration?
(In reply to Mark Goodwin [:mgoodwin] from comment #11)
> Are we any closer to working out why Cloudfront behaved the way it did? From
> testing this again today, I can't get a response with a Date header older
> than a few minutes old. Have there been any changes in the configuration?

Glob, what do you see when you try to reproduce this now?
Flags: needinfo?(glob)
$ curl -i https://firefox.settings.services.mozilla.com/v1/buckets/monitor/collections/changes/records
HTTP/1.1 200 OK
Content-Type: application/json; charset=UTF-8
Content-Length: 183
Connection: keep-alive
Access-Control-Expose-Headers: Content-Length, Expires, Alert, Retry-After, Last-Modified, Total-Records, ETag, Pragma, Cache-Control, Backoff, Next-Page
Cache-Control: max-age=3600
Date: Tue, 06 Sep 2016 14:08:22 GMT
ETag: "1470695074322"
Expires: Tue, 06 Sep 2016 15:08:22 GMT
Last-Modified: Mon, 08 Aug 2016 22:24:34 GMT
Total-Records: 1
Age: 35
X-Cache: Hit from cloudfront
Via: 1.1 ee75b9feabd5286c4876cd5841a1ad18.cloudfront.net (CloudFront)
X-Amz-Cf-Id: Vq_e2HkjpHn8QDxWhOho2I5_L0-zhqSELu7Mmf2VdRlh7rAJLyf-5Q==

{"data":[{"host":"firefox.settings.services.mozilla.com","last_modified":1470695074322,"bucket":"blocklists","id":"0e543556-43bf-3139-1fda-2a0068116c6d","collection":"certificates"}]}
using your command line i wasn't able to reproduce it, so i tried it again with firefox and saw the same bad date response:

~$ curl 'https://firefox.settings.services.mozilla.com/v1/buckets/monitor/collections/changes/records' -H 'Host: firefox.settings.services.mozilla.com' -H 'User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.11; rv:50.0) Gecko/20100101 Firefox/50.0' -H 'Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8' -H 'Accept-Language: en-US,en;q=0.5' --compressed -H 'Referer: https://bugzilla.mozilla.org/' -H 'DNT: 1' -H 'Connection: keep-alive' -H 'Upgrade-Insecure-Requests: 1' -H 'Pragma: no-cache' -H 'Cache-Control: no-cache' -i
HTTP/1.1 200 OK
Content-Type: application/json; charset=UTF-8
Content-Length: 183
Connection: keep-alive
Access-Control-Allow-Origin: *
Access-Control-Expose-Headers: Content-Length, Expires, Alert, Retry-After, Last-Modified, Total-Records, ETag, Pragma, Cache-Control, Backoff, Next-Page
Cache-Control: max-age=3600
Date: Tue, 09 Aug 2016 00:24:24 GMT
ETag: "1470695074322"
Expires: Tue, 09 Aug 2016 01:24:24 GMT
Last-Modified: Mon, 08 Aug 2016 22:24:34 GMT
Total-Records: 1
Age: 2610
X-Cache: Hit from cloudfront
Via: 1.1 1ae22fd852bd5b36a4391c54f02b5439.cloudfront.net (CloudFront)
X-Amz-Cf-Id: mTdPbHTEglpq4U29T63zSku1qGCfvAsUD9t8ajDsWOwAPJCU0evRQw==

removing the params one by one points to compression:

~$ curl -i --compressed https://firefox.settings.services.mozilla.com/v1/buckets/monitor/collections/changes/records
HTTP/1.1 200 OK
Content-Type: application/json; charset=UTF-8
Content-Length: 183
Connection: keep-alive
Access-Control-Allow-Origin: *
Access-Control-Expose-Headers: Content-Length, Expires, Alert, Retry-After, Last-Modified, Total-Records, ETag, Pragma, Cache-Control, Backoff, Next-Page
Cache-Control: max-age=3600
Date: Tue, 09 Aug 2016 00:24:24 GMT
ETag: "1470695074322"
Expires: Tue, 09 Aug 2016 01:24:24 GMT
Last-Modified: Mon, 08 Aug 2016 22:24:34 GMT
Total-Records: 1
Age: 2750
X-Cache: Hit from cloudfront
Via: 1.1 470917b83663a136083f105e2fd03290.cloudfront.net (CloudFront)
X-Amz-Cf-Id: KxSFvDI7in11-pQFKofNPdy07B4Ke9bYPMJlfyWlajCxk60ik3FPtA==

~$ curl -i https://firefox.settings.services.mozilla.com/v1/buckets/monitor/collections/changes/records
HTTP/1.1 200 OK
Content-Type: application/json; charset=UTF-8
Content-Length: 183
Connection: keep-alive
Access-Control-Expose-Headers: Content-Length, Expires, Alert, Retry-After, Last-Modified, Total-Records, ETag, Pragma, Cache-Control, Backoff, Next-Page
Cache-Control: max-age=3600
Date: Tue, 06 Sep 2016 14:44:01 GMT
ETag: "1470695074322"
Expires: Tue, 06 Sep 2016 15:44:01 GMT
Last-Modified: Mon, 08 Aug 2016 22:24:34 GMT
Total-Records: 1
X-Cache: Miss from cloudfront
Via: 1.1 939ea9d62ec616846d41b16cea0dc65b.cloudfront.net (CloudFront)
X-Amz-Cf-Id: 4mC3j4rJ5gc4w8x40Wesvd7AWviNeIOcCkUoUaGeDjW9ih-kle-DbQ==
Flags: needinfo?(glob)
Following some discussion with phrawzty - I'd like to do some testing against stage. Can we make the cache expiry in stage more aggressive, please?
Flags: needinfo?(dmaher)
(In reply to Mark Goodwin [:mgoodwin] from comment #15)
> Following some discussion with phrawzty - I'd like to do some testing
> against stage. Can we make the cache expiry in stage more aggressive, please?

I've basically done a factory-reset on the settings for the Cloudfront distribution in Stage. In other words, we're using the CF defaults - no cookie forwarding, no header whitelisting, etc. This is *not optimal*, but it's a really basic starting point for diagnosing and figuring out exactly what we want to do.

As an aside, we could also consider being *really* aggressive with cache invalidation in order to make the testing take less time (set max TTL to 60 second or something?). Lemme know.
Flags: needinfo?(dmaher)
Assignee: tarek → dmaher
Priority: -- → P1
How's the testing going, Mark? Anything I can help with?
Flags: needinfo?(mgoodwin)
(In reply to Daniel Maher [:phrawzty] from comment #17)
> How's the testing going, Mark? Anything I can help with?

So far, stage looks good for me. It would be helpful if others in different locations could also test.

From the sleuthing glob did on this, the output from the following should be sufficient:

curl -is https://kinto.stage.mozaws.net/v1/buckets/monitor/collections/changes/records
curl -is --compress https://kinto.stage.mozaws.net/v1/buckets/monitor/collections/changes/records
Flags: needinfo?(mgoodwin)
Mark the URL you gave are not the one behind the CDN there, reason why should works perfectly here...

You need to try with:

curl -is https://settings.stage.mozaws.net/v1/buckets/monitor/collections/changes/records
curl -is --compress https://settings.stage.mozaws.net/v1/buckets/monitor/collections/changes/records
(In reply to Rémy Hubscher (:natim) from comment #19)
> Mark the URL you gave are not the one behind the CDN there, reason why
> should works perfectly here...
> 
> You need to try with:

The good news is it still looks OK; the problem we have with this test currently is that the data is recently modified. Can we hold of making changes to the collections in stage for a bit?
(In reply to Mark Goodwin [:mgoodwin] from comment #20)
> The good news is it still looks OK

... though obviously, we need to keep on doing this over a period of time for this to be meaningful
We need to have different caching based on the querystring parameters in the GET of the url.

https://settings.stage.mozaws.net/v1/buckets/monitor/collections/changes/records?_since=1458748761419
https://settings.stage.mozaws.net/v1/buckets/monitor/collections/changes/records?_since=1458748761420
https://settings.stage.mozaws.net/v1/buckets/monitor/collections/changes/records

Are three different caching.

Also if you want to also cache with different headers, you only need `If-Match` and `If-None-Match` but I don't know how CloudFront handle 412 and 304 HTTP codes. It might be that it just ignore them and give back the previous version of it rather than caching the 304/412 responses that are actually part of our protocol.
As per our meeting of a few moments ago, I have enabled Query String caching in Stage; we'll leave the Header caching disabled for now.
Blocks: 712612
So what's the latest here?

We could always just implement ROUGHTIME :-)
https://roughtime.googlesource.com/roughtime/

Gerv
(In reply to Gervase Markham [:gerv] from comment #24)
> So what's the latest here?

Phrawzty and friends did some things that made the CDN caching a bit less aggressive. Maybe glob could verify that things look better from where he is now?

> We could always just implement ROUGHTIME :-)
> https://roughtime.googlesource.com/roughtime/

I spoke to a few people about this (and we discussed it in crypto eng. meetings a couple of times). I like it but think the general consensus is along the lines of "let's watch it for a bit".
Flags: needinfo?(glob)
it's much better:

~$ date
Thu  3 Nov 2016 21:14:17 AWST
~$ curl -s -i https://firefox.settings.services.mozilla.com/v1/buckets/monitor/collections/changes/records  | grep '^Date: '
Date: Thu, 03 Nov 2016 13:12:34 GMT
~$ curl -s -i --compressed https://firefox.settings.services.mozilla.com/v1/buckets/monitor/collections/changes/records | grep '^Date: '
Date: Thu, 03 Nov 2016 12:55:07 GMT

being 17.5 minutes off is much better than the 28 days i was seeing before :)
Flags: needinfo?(glob)
It sounds like the caching changes have been applied to the production servers. Is there anything else that remains to be done in this bug?
Flags: needinfo?(glob)
(In reply to Panos Astithas [:past] from comment #27)
> It sounds like the caching changes have been applied to the production
> servers. Is there anything else that remains to be done in this bug?

that's more of a question for mgoodwin
Flags: needinfo?(glob) → needinfo?(mgoodwin)
Not that I'm aware of.
Flags: needinfo?(mgoodwin)
Thanks, resolving then.
Status: NEW → RESOLVED
Closed: 7 years ago
Resolution: --- → FIXED
See Also: → 1412686
See Also: → 1414269
You need to log in before you can comment on or make changes to this bug.