Intermittent e10s TEST-UNEXPECTED-TIMEOUT | /_mozilla/service-workers/service-worker/update-after-oneday.https.html | Update should be triggered after a functional event when last update time is over 24 hours - Test timed out

RESOLVED FIXED in Firefox 46

Status

RESOLVED FIXED
3 years ago
3 years ago

People

(Reporter: philor, Assigned: bkelly)

Tracking

({intermittent-failure})

Trunk
mozilla46
intermittent-failure
Points:
---

Firefox Tracking Flags

(firefox45 affected, firefox46 fixed)

Details

Attachments

(1 attachment)

Comment hidden (Intermittent Failures Robot)
NI myself to investigate this after I get back from PTO in January.
Flags: needinfo?(bkelly)
Comment hidden (Intermittent Failures Robot)
Comment hidden (Intermittent Failures Robot)
(Assignee)

Updated

3 years ago
Assignee: nobody → bkelly
Status: NEW → ASSIGNED
Flags: needinfo?(bkelly)
Note sure if its the problem here, but it seems we are attempting to schedule two timers for each FetchEvent.  The first for the functional event firing and the second after the document load completes.  They should be coalesced, but maybe if the server is running particularly slow we can get an extra update in there.

https://treeherder.mozilla.org/#/jobs?repo=try&revision=c357dcbed721
It looks like we're not getting a fetch event at all for the sub-resource load.  In this case its an image.  I wonder if image cache is screwing things up somehow.

Here is a try that does cache busting on the image:

  https://treeherder.mozilla.org/#/jobs?repo=try&revision=75bdae5641b5
Cache busting did not help.

More debugging shows, however, that ServiceWorkerPrivate::TerminateWorker() can be called even if we have a KeepAliveToken supposedly holding the worker alive.  It seems the FetchEventRunnable is getting created and the work is being killed before it can be executed on the worker.

Lets try aborting TerminateWorker if mTokenCount is > 0:

https://treeherder.mozilla.org/#/jobs?repo=try&revision=20b545139379
That was a bad patch in comment 9.  We need to be able to force terminate a worker even if it has an mTokenCount.

In this case, though, its happening because the ServiceWorkerInfo object is being destroyed.  This is quite odd because it should be held alive as the reg.active worker.  More try debug...
So what is happening is this:

1) First navigation update triggers a PropagateRegistration()
2) This async calls back to the child process with LoadRegistration()
3) LoadRegistration() checks to see if it needs to start a new active worker or not
4) Logic error in (3) causes us to replace our current active worker with a new active worker using the same script spec.
5) Replacing the current active worker kills the WorkerPrivate and its thread.
6) It just happens to do this between creating the FetchEventRunnable and executing it on the worker thread.

Here is a try build that attempts to fix the logic error LoadRegistration():

https://treeherder.mozilla.org/#/jobs?repo=try&revision=82eca92c300c
Created attachment 8705742 [details] [diff] [review]
Don't replace active worker unnecessarly after saving registration in e10s mode. r=ehsan
Comment on attachment 8705742 [details] [diff] [review]
Don't replace active worker unnecessarly after saving registration in e10s mode. r=ehsan

20 green try triggers.  I think this fixes the problem.
Attachment #8705742 - Flags: review?(ehsan)

Comment 14

3 years ago
Interesting.  I wonder if bug 1232555 is somehow related to this...
See Also: → bug 1232555

Updated

3 years ago
Attachment #8705742 - Flags: review?(ehsan) → review+
(Assignee)

Updated

3 years ago
See Also: bug 1232555
(Reporter)

Updated

3 years ago
Summary: Intermittent TEST-UNEXPECTED-TIMEOUT | /_mozilla/service-workers/service-worker/update-after-oneday.https.html | Update should be triggered after a functional event when last update time is over 24 hours - Test timed out → Intermittent e10s TEST-UNEXPECTED-TIMEOUT | /_mozilla/service-workers/service-worker/update-after-oneday.https.html | Update should be triggered after a functional event when last update time is over 24 hours - Test timed out
(Reporter)

Comment 16

3 years ago
bugherder
https://hg.mozilla.org/mozilla-central/rev/b82dc4ab197b
Status: ASSIGNED → RESOLVED
Last Resolved: 3 years ago
status-firefox46: --- → fixed
Resolution: --- → FIXED
Target Milestone: --- → mozilla46
Comment hidden (Intermittent Failures Robot)
You need to log in before you can comment on or make changes to this bug.