Closed Bug 1399864 Opened 7 years ago Closed 6 years ago

Leverage server ETag/IMS support when polling for blocklist.xml changes

Categories

(Toolkit :: Blocklist Implementation, enhancement, P3)

enhancement

Tracking

()

RESOLVED FIXED
mozilla60
Tracking Status
firefox60 --- fixed

People

(Reporter: leplatrem, Assigned: Gijs)

References

Details

Attachments

(2 files)

Now that the server is able to respond `304 Not Modified` when a `If-Modified-Since` request header is specified, we could prevent downloading the blocklist each time we poll for changes.

Note: before landing this, we need to make sure with the OPS team that their CDN/Nginx micro-cache would take the request header into account, to prevent caching 304 responses for everyone — or the opposite, ie. always miss cache.
Assignee: nobody → mathieu
Does this really need tracking for Firefox 57?
Flags: needinfo?(mathieu)
Priority: -- → P3
> Does this really need tracking for Firefox 57?

No it does not.
Flags: needinfo?(mathieu)
See Also: → 1425602
This patch has bitrotted a little bit (notably I think you'll want to call Services.prefs.getCharPref(..., "") directly because the `getPref` helper is gone) but I'd be happy to review it if it got updated? Alternatively, I can try to update it if you're busy, and get review from another toolkit peer.
Flags: needinfo?(mathieu)
> Alternatively, I can try to update it if you're busy, and get review from another toolkit peer

Yes please do, if you can. I think there was some test failing with the patch. I'm available if you have any doubt though!
Flags: needinfo?(mathieu)
(In reply to Mathieu Leplatre (:leplatrem) from comment #5)
> > Alternatively, I can try to update it if you're busy, and get review from another toolkit peer
> 
> Yes please do, if you can. I think there was some test failing with the
> patch. I'm available if you have any doubt though!

Alright, I'll have a look. Do you know who to contact on the ops team wrt comment #0 ?

(In reply to Mathieu Leplatre (:leplatrem) from comment #0)
> Note: before landing this, we need to make sure with the OPS team that their
> CDN/Nginx micro-cache would take the request header into account, to prevent
> caching 304 responses for everyone — or the opposite, ie. always miss cache.
Assignee: mathieu → gijskruitbosch+bugs
Status: NEW → ASSIGNED
Flags: needinfo?(mathieu)
Version: 57 Branch → Trunk
Our main OPS contact for the blocklist stack is Wei (:wezhou)

Apparently the Cloudfront frontend works as expected with If-Modified-Since/Last-Modified or If-None-Match/ETag:

Example with an ETag (Miss):

    http HEAD https://firefox.settings.services.mozilla.com/v1/blocklist/3/%7Bec8030f7-c20a-464f-9b0e-13a3a9e97384%7D/46.0/ 'If-None-Match:"1500496563565"'
    HTTP/1.1 304 Not Modified
    Access-Control-Expose-Headers: Alert, Retry-After, Backoff, Content-Length
    Connection: keep-alive
    Date: Wed, 31 Jan 2018 09:54:28 GMT
    ETag: "1500496563565"
    Last-Modified: Wed, 19 Jul 2017 20:36:03 GMT
    Via: 1.1 f7b43bebaeae45bb7756c57683994054.cloudfront.net (CloudFront)
    X-Amz-Cf-Id: zmlNeiP0TJZbfon8vQXbiLOJMDyU85oQ7dpzMWBSedxOx0eaCf7SqA==
    X-Cache: Miss from cloudfront
    X-Content-Type-Options: nosniff

With the same one (Hit):

    http HEAD https://firefox.settings.services.mozilla.com/v1/blocklist/3/%7Bec8030f7-c20a-464f-9b0e-13a3a9e97384%7D/46.0/ 'If-None-Match:"1500496563565"'
    HTTP/1.1 304 Not Modified
    Access-Control-Expose-Headers: Alert, Retry-After, Backoff, Content-Length
    Age: 20
    Connection: keep-alive
    Date: Wed, 31 Jan 2018 09:54:48 GMT
    ETag: "1500496563565"
    Via: 1.1 a62788af052971abc00f8142b8edd67e.cloudfront.net (CloudFront)
    X-Amz-Cf-Id: MPxgJMrsWYzgeJb16kaPJc-vqLTAtDB45kINJuP47ynk90yl_dhAiA==
    X-Cache: Hit from cloudfront
    X-Content-Type-Options: nosniff

With a different one (Miss):

   http HEAD https://firefox.settings.services.mozilla.com/v1/blocklist/3/%7Bec8030f7-c20a-464f-9b0e-13a3a9e97384%7D/46.0/ 'If-None-Match:"abc"'          
    HTTP/1.1 200 OK
    Access-Control-Expose-Headers: Alert, Retry-After, Backoff, Content-Length
    Connection: keep-alive
    Content-Length: 299717
    Content-Type: application/xml;charset=UTF-8
    Date: Wed, 31 Jan 2018 09:56:34 GMT
    ETag: "1500496563565"
    Last-Modified: Wed, 19 Jul 2017 20:36:03 GMT
    Via: 1.1 37e0f9eb3dbed8073073c3f045191d7c.cloudfront.net (CloudFront)
    X-Amz-Cf-Id: hDEc_pE9CfNRjuHu0zOQdrpKMEYjY_TSvAg12BsPr1VrUq6UVkDzpg==
    X-Cache: Miss from cloudfront
    X-Content-Type-Options: nosniff

And without header (200 OK):

    http HEAD https://firefox.settings.services.mozilla.com/v1/blocklist/3/%7Bec8030f7-c20a-464f-9b0e-13a3a9e97384%7D/46.0/        
    HTTP/1.1 200 OK
    Access-Control-Expose-Headers: Alert, Backoff, Retry-After, Content-Length
    Age: 35
    Connection: keep-alive
    Content-Length: 299717
    Content-Type: application/xml;charset=UTF-8
    Date: Wed, 31 Jan 2018 09:54:02 GMT
    ETag: "1500496563565"
    Last-Modified: Wed, 19 Jul 2017 20:36:03 GMT
    Via: 1.1 666fb5cb1d5bef6558ed44d80e5cc6fc.cloudfront.net (CloudFront)
    X-Amz-Cf-Id: QzQqjNtzUOxoa_4gGY2HQsl9h4RvLsDwCHrYBraVtttCR49Kd5wGIg==
    X-Cache: Hit from cloudfront
    X-Content-Type-Options: nosniff
Flags: needinfo?(mathieu)
So, for reference, you can test the patch by running something like this in the browser console:

Services.blocklist.wrappedJSObject.notify();

which will essentially run the blocklist update at that point. You can then also trivially modify the lastModified blocklist pref to check if you get the correct result (200 instead of 304) if sending with an "old" if-modified-since value.
You may also want to have a look at https://github.com/leplatrem/aboutremotesettings to be able to trigger it with an interface.
Comment on attachment 8947050 [details]
Bug 1399864 - Send If-Modified-Since header when polling for blocklist.xml changes,

https://reviewboard.mozilla.org/r/216854/#review222740

Thanks! This should significantly reduce the this._loadBlocklistFromString call count :-).
Attachment #8947050 - Flags: review?(florian) → review+
(In reply to Mathieu Leplatre (:leplatrem) from comment #7)
> Our main OPS contact for the blocklist stack is Wei (:wezhou)
> 
> Apparently the Cloudfront frontend works as expected with
> If-Modified-Since/Last-Modified or If-None-Match/ETag:
> <snip>

Wei, can you confirm we can make this change (using Last-Modified information in an If-Modified-Since header for subsequent requests and no-oping for 304s) on the client side without ill effect, as suggested in comment #7 ? Just trying to make sure this change (first going into Nightly) won't take ops by surprise, or planned changes from ops won't accidentally bust things up after we ship this... :-)
Flags: needinfo?(wezhou)
Thanks :Gijs for the heads-up.

Our nginx doesn't cache 304. Our CDN (AWS cloudfront) caches 304 for the duration specified by the header fields that the origin (nginx) returns along with an object, which is expected.
Flags: needinfo?(wezhou)
Pushed by gijskruitbosch@gmail.com:
https://hg.mozilla.org/integration/autoland/rev/8f9dcb9c62de
Send If-Modified-Since header when polling for blocklist.xml changes, r=florian
(In reply to :wezhou from comment #13)
> Thanks :Gijs for the heads-up.
> 
> Our nginx doesn't cache 304. Our CDN (AWS cloudfront) caches 304 for the
> duration specified by the header fields that the origin (nginx) returns
> along with an object, which is expected.

OK. This sounds like we can at least ship this to Nightly, so I will do that.

What cache duration does nginx currently specify? I don't see one returned through the headers that cloudfront produces (though I could be missing something). When the blocklist updates, obviously we will want to avoid a situation where people will keep getting 304s until a week later. Generally speaking, I would expect that we'd want to have the updated list live in a matter of minutes / hours.
Flags: needinfo?(wezhou)
https://hg.mozilla.org/mozilla-central/rev/8f9dcb9c62de
Status: ASSIGNED → RESOLVED
Closed: 6 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla60
(In reply to :Gijs from comment #15)

> What cache duration does nginx currently specify?

Nginx just passes whatever the application tells it to return. It is currently not configured to add its own cache related headers to the response.
Flags: needinfo?(wezhou)
(In reply to :wezhou from comment #17)
> (In reply to :Gijs from comment #15)
> 
> > What cache duration does nginx currently specify?
> 
> Nginx just passes whatever the application tells it to return. It is
> currently not configured to add its own cache related headers to the
> response.

OK. I assume that the kinto server returns a similar thing for the 304 that it used to return for the 200 response with the actual data, so in any case we won't be any worse off because previously the potentially-obsolete 200 response would have been cached, including the identical out-of-date response content (the 304 contents of course will be empty). Mathieu, does that sound plausible? :-)
Flags: needinfo?(mathieu)
Essentially, what I'm trying to ensure is that we don't end up in a situation where we keep saying "304 not modified" forever even if the underlying data changed, or in a situation where we overload the nginx server because there's no caching. I'm really not in the best position to make assertions about that given how little I know about kinto and how we serve content on it, but it *sounds* like we're good, assuming my reasoning in comment #18 is sound.
We should test against https://blocklists.settings.services.mozilla.com/ (extensions.blocklist.url) instead of the url in comment 7. See also bug 1288197 and bug 1341332 for references. 

We should also check with IT Data team to make sure that log processing is not affected since the http status code is changing. :ccing mpressman
> Essentially, what I'm trying to ensure is that we don't end up in a situation where we keep saying "304 not modified" forever even if the underlying data changed

If I'm not mistaken, the configured max age is really small (~15sec), so we should be safe. You can observe it by looking at the `X-Cache` response header:

    ➜  http HEAD https://blocklists.settings.services.mozilla.com/v1/blocklist/3/%7Bec8030f7-c20a-464f-9b0e-13a3a9e97384%7D/46.0/ 'If-None-Match:"1500496563565"'
    HTTP/1.1 304 Not Modified
    Age: 9
    Date: Mon, 05 Feb 2018 11:11:00 GMT
    ETag: "1500496563565"
    Last-Modified: Wed, 19 Jul 2017 20:36:03 GMT
    X-Cache: Hit from cloudfront

After a few seconds:

    ➜  http HEAD https://blocklists.settings.services.mozilla.com/v1/blocklist/3/%7Bec8030f7-c20a-464f-9b0e-13a3a9e97384%7D/46.0/ 'If-None-Match:"1500496563565"'
    HTTP/1.1 304 Not Modified
    Date: Mon, 05 Feb 2018 11:11:05 GMT
    ETag: "1500496563565"
    Last-Modified: Wed, 19 Jul 2017 20:36:03 GMT
    X-Cache: Miss from cloudfront
Flags: needinfo?(mathieu)
See Also: → 1440637
Component: Blocklist Policy Requests → Blocklist Implementation
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: