If you think a bug might affect users in the 57 release, please set the correct tracking and status flags for Release Management.

service worker controlled pages sometimes fail to load and you just see a white screen

RESOLVED FIXED in Firefox 46

Status

()

Core
DOM: Service Workers
RESOLVED FIXED
2 years ago
2 years ago

People

(Reporter: bkelly, Assigned: bkelly)

Tracking

({regression})

32 Branch
mozilla47
regression
Points:
---
Dependency tree / graph

Firefox Tracking Flags

(firefox44 unaffected, firefox45 unaffected, firefox46 fixed, firefox47 fixed)

Details

Attachments

(2 attachments, 1 obsolete attachment)

(Assignee)

Description

2 years ago
I have two different reports of an intermittent load failure with service worker sites using fetch event and skipWaiting():

1) I saw this on my own site [1] with nightly 47.  Clearing storage fixed the issue for me.
2) Alex Gibson reports [2] he has seen this on SVGOMG [3] with dev edition 46.  He had to "manually unregister" the service worker to fix it.

I initially thought the problem on my own blog was due to me having a messed up profile from local testing.  The report from Alex, though, makes me think there is something else going on.

We don't have steps to reproduce yet, but filing this bug to track the issue.

If you think you see this, please capture a zip of your profile!

[1]: https://blog.wanderview.com
[2]: https://twitter.com/alex_gibson/status/697737149697945600
[3]: https://jakearchibald.github.io/svgomg/
(Assignee)

Comment 1

2 years ago
Some weak theories as to what is going on:

1) Updates are somehow still firing in the middle of the navigation and skipWaiting() is killing the active worker in the process of handling the fetch event.  This has been reported on FF44. [1]  This really shouldn't happen on FF45+, though, since we now delay updates.  It would also not be a permanent failure and things would work when opened in a new tab.

2) The Cache API is trying to do a schema update and is failing for some reason.  This *should* result in network requests falling back to normal operation without interception, though.  This would be consistent with needing to wipe the storage, though.  (Alex notes he previously had it open on same dev edition version, although maybe we did a silent update on browser restart since this report is so close to the branch merge.)

[1]: https://twitter.com/ericf/status/697479909598490624
For the record, I hit this bug using Firefox Dev Edition 46.0a2 (2016-02-10). I had previously visited the svgomg site around 6/7 days prior using the same profile. I also use a separate profile for Dev Edition than I do for Release/Beta/Nightly.
(Assignee)

Comment 3

2 years ago
I'm going to test the cache schema change theory in comment 0 by hacking a failure into the code.  That should show if it correctly falls back to network.
(Assignee)

Comment 4

2 years ago
I simulated failures in cache schema upgrade, opening the sqlite db, match query, and put query.  None of these seemed to cause the problem in this bug.  If Cache API is busted the site just falls back to network as designed.

Let me see if problems with the ServiceWorkerRegistrar could cause this.
I experienced a similar issue while reproducing bug 1241045.
I used that test case, then removed the service worker. I then tested some new code on the same URL, and the old content (from the first, failing, test case) was loaded.
Despite clearing the cache and everything, the sqlite files in storage/ still referenced the old content.

Unfortunately I removed the directory since then and I haven't kept a copy.
(Assignee)

Updated

2 years ago
Depends on: 1247623
(Assignee)

Comment 6

2 years ago
Another theory:

The ssl cert associated with the site is expiring between when its cached by the service worker and returned in the failing case.

Combined with bug 1247623 this could cause the problem we are seeing here.  Also, SVGOMG's current cert was issued on Jan 19, 2016.  If there was some overlap with last cert then perhaps that fits the time window.
(Assignee)

Comment 7

2 years ago
Another possible report:

  https://twitter.com/dangoor/status/697826836676485120
(Assignee)

Comment 8

2 years ago
I got some profile data from @dangoor.  I think I might know whats going on.  Also, see bug 1246319.
See Also: → bug 1246319
(Assignee)

Comment 9

2 years ago
So, thanks to the fixes in bug 1247623 and bug 1246319 I may have reproduced this locally again.

It seems we fail to deserialize the resource's ssl cert starting in FF46.  This is because the UUID changed in bug 1240173.

I will hack in a work around for now and file a follow-up to come up with a more reliable serialization strategy for the security info.
Blocks: 1240173
(Assignee)

Comment 10

2 years ago
Created attachment 8719803 [details] [diff] [review]
Allow old nsIX509Cert serialized objects to be read off disk. r=bz
Attachment #8719803 - Flags: review?(bzbarsky)
(Assignee)

Comment 11

2 years ago
I will look into writing a gtest this afternoon.
(Assignee)

Updated

2 years ago
status-firefox44: --- → unaffected
status-firefox45: --- → unaffected
status-firefox46: --- → affected
status-firefox47: --- → affected
(Assignee)

Updated

2 years ago
See Also: → bug 1248628
Comment on attachment 8719803 [details] [diff] [review]
Allow old nsIX509Cert serialized objects to be read off disk. r=bz

>+ *       serialization than more complex changes will be needed.

s/than/then/

>+  //        workers around this issue.

s/workers/works/

r=me.  Make sure to uplift to 46!
Attachment #8719803 - Flags: review?(bzbarsky) → review+
(Assignee)

Comment 13

2 years ago
Created attachment 8719987 [details] [diff] [review]
P1 Allow old nsIX509Cert serialized objects to be read off disk. r=bz
Attachment #8719803 - Attachment is obsolete: true
Attachment #8719987 - Flags: review+
(Assignee)

Comment 14

2 years ago
Created attachment 8719990 [details] [diff] [review]
P2 Add gtest to ensure we can continue to deserialize old security info strings. r=bz

This fails before the P1 and succeeds with it applied.

https://treeherder.mozilla.org/#/jobs?repo=try&revision=1a7559b316cd
Attachment #8719990 - Flags: review?(bzbarsky)
Comment on attachment 8719990 [details] [diff] [review]
P2 Add gtest to ensure we can continue to deserialize old security info strings. r=bz

r=me.  I assume you've verified that the test fails without the patch?
Attachment #8719990 - Flags: review?(bzbarsky) → review+

Comment 16

2 years ago
https://hg.mozilla.org/integration/mozilla-inbound/rev/26ef2d27d757
https://hg.mozilla.org/integration/mozilla-inbound/rev/e9a8fbf9af4d
(Assignee)

Comment 17

2 years ago
Comment on attachment 8719987 [details] [diff] [review]
P1 Allow old nsIX509Cert serialized objects to be read off disk. r=bz

Approval Request Comment
[Feature/regressing bug #]: Service workers interaction with bug 1247623. 
[User impact if declined]: If a user visited a service worker enabled site prior to bug 1247623 and then visit after that bug, then the user will get a base experience.  Depending on what the SW does, they may see a white screen and the site will never load.
[Describe test coverage new/current, TreeHerder]: P2 includes a gtest.
[Risks and why]: Minimal.  We are simply adding a compat workaround to load security information with an old UUID.
[String/UUID change made/needed]: None.
Attachment #8719987 - Flags: approval-mozilla-aurora?
(Assignee)

Updated

2 years ago
Keywords: regression
(Assignee)

Comment 18

2 years ago
Comment on attachment 8719990 [details] [diff] [review]
P2 Add gtest to ensure we can continue to deserialize old security info strings. r=bz

See comment 17.
Attachment #8719990 - Flags: approval-mozilla-aurora?
(Assignee)

Comment 19

2 years ago
Note, if the idl commit hook complains about us not bumping the UUID when uplifting, please do not change the UUID!  Instead we should use the IGNORE commit syntax since we only added a comment to the idl.  Bumping the UUID would create further bustage here.

Comment 20

2 years ago
bugherder
https://hg.mozilla.org/mozilla-central/rev/26ef2d27d757
https://hg.mozilla.org/mozilla-central/rev/e9a8fbf9af4d
Status: ASSIGNED → RESOLVED
Last Resolved: 2 years ago
status-firefox47: affected → fixed
Resolution: --- → FIXED
Target Milestone: --- → mozilla47
Comment on attachment 8719987 [details] [diff] [review]
P1 Allow old nsIX509Cert serialized objects to be read off disk. r=bz

Fixes recent SW regression, includes a new test, please uplift to aurora.
Attachment #8719987 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
Comment on attachment 8719990 [details] [diff] [review]
P2 Add gtest to ensure we can continue to deserialize old security info strings. r=bz

Test changes, please uplift to aurora.
Attachment #8719990 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+

Comment 23

2 years ago
bugherderuplift
https://hg.mozilla.org/releases/mozilla-aurora/rev/fd4fdadb313c
https://hg.mozilla.org/releases/mozilla-aurora/rev/4c972ab674f5
status-firefox46: affected → fixed
You need to log in before you can comment on or make changes to this bug.