Closed Bug 1406932 Opened 7 years ago Closed 7 years ago

service worker updates don't always write to http cache for some reason

Categories

(Core :: DOM: Service Workers, defect)

defect
Not set
normal

Tracking

()

RESOLVED WORKSFORME

People

(Reporter: d.huigens, Unassigned)

References

(Blocks 1 open bug)

Details

Attachments

(1 file)

2.63 KB, application/x-javascript
Details
Attached file sw-cache-test.js
User Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:58.0) Gecko/20100101 Firefox/58.0
Build ID: 20171008220130

Steps to reproduce:

1. Register a Service Worker
2. Let Firefox request the Service Worker file from the server
3. Fetch the Service Worker file in Javascript


Actual results:

Firefox requests the Service Worker file again, even if you use {updateViaCache: 'all'} and it had appropriate cache headers (or you use {cache: 'force-cache'}).


Expected results:

It should be possible to read the Service Worker file that Firefox got from the server.

A short while back, I wrote a blog post [1] about using Service Workers to improve the security & privacy guarantees of web apps that distrust the server (e.g., those that use client-side crypto). Basically, whenever the web app gets updated, the (previously installed) Service Worker checks the new code (against a signature with a public key, or against the version on Github, or...), analogous to Binary Transparency.

Now, to make that secure, you also need to check the new Service Worker when it updates, and warn the user when it doesn't match its signature. To do that, we need to be able to read the source of the new Service Worker. Chrome allows you to do that with fetch('/serviceworker.js'), although Chrome doesn't support {cache: 'only-if-cached'} yet.


I've attached a node.js test case which basically responds with a new Service Worker file every time you request it, with a incremental + random id, and then checks that fetch()ing the SW returns the SW with the same id. Run the testcase with `node sw-cache-test.js` and navigate to http://localhost:8888.


[1]: http://blog.airbornos.com/post/2017/08/03/Transparent-Web-Apps-using-Service-Worker
What do you mean by "Chrome allows you to do that with fetch('/serviceworker.js')"?  Do you mean its just getting a normal http cache hit?

Even that, though, will not be reliable.  Entries can be purged from http cache at any time.

Also, can you make your example into a glitch.com project?  I am on windows and don't have node installed.
Flags: needinfo?(d.huigens)
> What do you mean by "Chrome allows you to do that with fetch('/serviceworker.js')"?  Do you mean its just getting a normal http cache hit?

Yeah.

> Even that, though, will not be reliable.  Entries can be purged from http cache at any time.

Once Chrome supports fetch(..., {cache: 'only-if-cached'}), it will be possible to make sure that we're getting the file from cache. Nevertheless, you raise a good point that it's not guaranteed to still be in cache, but since we're fetching the file in the updatefound event, immediately after the browser itself requested it, that sounds really unlikely. If it happens anyway, we show a warning (in the web app) which is probably a false positive.

An alternative, dedicated, API to get the new Service Worker's code is conceivable (registration.installing.sourceCode or whatever), less prone to edge cases like the one you bring up, and also fine by me. It sounds like more work overall though, and IMO fetch()ing it is "good enough" and can be made secure with some effort.

> Also, can you make your example into a glitch.com project?  I am on windows and don't have node installed.

https://sw-cache-test.glitch.me/

Thanks, I didn't know about gitch.com.
Flags: needinfo?(d.huigens)
Thanks, its easier to see what is happening now.

It appears you are always running your fetch('serviceworker.js') call from the `updatefound` event handler.  If this event has been fired then we have already performed a network request for the next version of the service worker.  It seems to me in this case the next version of the service worker should definitely be in the http cache at this point.

It appears to me that there is a bug in chrome that its not updating its http cache on update.

Is there a reason you feel that the browser should not populate the http cache when it fetches the SW script during update?

I think you probably need a spec change to expose the currently installed service worker script in some way.  This is something we considered through the Cache API originally, but we decided not to do it to begin with.  Perhaps you have a compelling use case for exposing it, though.
> It appears to me that there is a bug in chrome that its not updating its http cache on update.
> 
> Is there a reason you feel that the browser should not populate the http cache when it fetches the SW script during update?

It sounds like you're seeing the opposite of what I'm seeing. I do want the behavior you're describing. In Chrome, I see console output such as:

[sw 6] run
[sw 4] updatefound
[body] updatefound
[sw 6] updatefound
[body] real sw_id: 6.8945942689007875
[body] response: var SW_ID = 6.8945942689007875; …

What this means is that the new SW (ID 6) gets run, the old SW (4) and webpage get notified of the update, the new SW's real sw_id is 6.something, and the sw_id of the SW that the webpage fetch()ed is also 6.something. In other words, Chrome put the new SW in cache, and fetch() returned that new SW.

In Firefox, I see output such as:

[sw 11] run
[body] updatefound
[sw 5] updatefound
[sw 11] updatefound
[body] real sw_id: 11.255133817698356
[body] response: var SW_ID = 2.5680106414768225; …

The new sw_id is 11.something, and the fetch()ed SW is 2.something. The fetched SW is, in fact, coming from http cache (it is not a new request), it's just not the same cache as Firefox stores its version of the Service Worker in (which is also cached, correctly.)

So Firefox seems to be keeping two cache entries, while Chrome is keeping only one. If this is not what you're seeing, could you post the console output?
Oh, I see that better on the reload of the page.

And I think I might see the bug too:

http://searchfox.org/mozilla-central/source/dom/workers/ServiceWorkerRegistrationInfo.cpp#356

We bypass the http cache completely if we think its been more than a day since the last update check.  We also happen to be doing the timestamp conversion wrong so we always think its more than a day.

Of course, that doesn't say why LOAD_BYPASS_CACHE is effecting later accesses to the cache.
There is definitely something strange going on.  Our update channels aren't writing to the http cache for some reason.
Summary: Allow fetch('/serviceworker.js') to get Service Worker file from cache → service worker updates don't always write to http cache for some reason
Oh, I think I figured out the problem.

By default fetch() sets credentials to "omit".  Service worker updates use a credentials value of "same-origin".  So the service worker script request is sending cookies while your fetch() call is not.

We separate http caches depending on if they send cookies or not to avoid leaking privacy information.

Does your test pass if you change your fetch() calls to use `credentials: 'same-origin'`?  When I do this from the console it seems to work as expected.
Flags: needinfo?(d.huigens)
If this is the problem, then I'm curious why it works in chrome.  I thought they had the same http cache splitting as we do.
I guess its a known difference that chrome doesn't do this.  Spec issue:

https://github.com/whatwg/fetch/issues/307
> Does your test pass if you change your fetch() calls to use `credentials: 'same-origin'`?  When I do this from the console it seems to work as expected.

Yes! It does, thanks a lot. That fixes my use-case, I think. Do you want me to update or make a new glitch.com project?

> By default fetch() sets credentials to "omit".  Service worker updates use a credentials value of "same-origin".  So the service worker script request is sending cookies while your fetch() call is not.

I see, thanks. I tried a few things to make the two requests more alike (headers and such), but not this :)
Flags: needinfo?(d.huigens)
> We bypass the http cache completely if we think its been more than a day since the last update check.  We also happen to be doing the timestamp conversion wrong so we always think its more than a day.

I think that's not what I'm seeing -- If I refresh the page within 5 seconds, which is the max-age I'm setting, no updatefound event is firing.
(In reply to d.huigens from comment #10)
> > Does your test pass if you change your fetch() calls to use `credentials: 'same-origin'`?  When I do this from the console it seems to work as expected.
> 
> Yes! It does, thanks a lot. That fixes my use-case, I think. Do you want me
> to update or make a new glitch.com project?

If you don't mind, that might be best.  Just in case someone else tries running the glitch in the future.

(In reply to d.huigens from comment #11)
> > We bypass the http cache completely if we think its been more than a day since the last update check.  We also happen to be doing the timestamp conversion wrong so we always think its more than a day.
> 
> I think that's not what I'm seeing -- If I refresh the page within 5
> seconds, which is the max-age I'm setting, no updatefound event is firing.

Well, we're off by a factor of a thousand.  So I guess we will use the cache for ~86 seconds instead of the desired 86,400 seconds in a day.

I'll file a separate bug for the time check issue.  Marking this WORKSFORME.
Status: UNCONFIRMED → RESOLVED
Closed: 7 years ago
Resolution: --- → WORKSFORME
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: