Closed Bug 1230030 Opened 9 years ago Closed 9 years ago

Intermittent e10s TEST-UNEXPECTED-TIMEOUT | /_mozilla/service-workers/service-worker/update-after-oneday.https.html | Update should be triggered after a functional event when last update time is over 24 hours - Test timed out

Tracking

(firefox45 affected, firefox46 fixed)

Status:

RESOLVED FIXED

Milestone:

mozilla46

Tracking Flags:

Tracking

Status

firefox45

---

affected

firefox46

---

fixed

People

(Reporter: philor, Assigned: bkelly)

Details

(Keywords: intermittent-failure)

Attachments

(1 file)

Don't replace active worker unnecessarly after saving registration in e10s mode. r=ehsan 9 years ago Ben Kelly [:bkelly, not reviewing] 1.31 KB, patch	ehsan.akhgari : review+	Details \| Diff \| Splinter Review

Phil Ringnalda (:philor)

Reporter

Description

•

9 years ago

https://treeherder.mozilla.org/logviewer.html#?job_id=6021059&repo=fx-team

Comment hidden (Intermittent Failures Robot)

Ben Kelly [:bkelly, not reviewing]

Assignee

Comment 2

•

9 years ago

NI myself to investigate this after I get back from PTO in January.

Flags: needinfo?(bkelly)

Comment hidden (Intermittent Failures Robot)

Ben Kelly [:bkelly, not reviewing]

Assignee

Updated

•

9 years ago

Assignee: nobody → bkelly

Status: NEW → ASSIGNED

Flags: needinfo?(bkelly)

Ben Kelly [:bkelly, not reviewing]

Assignee

Comment 5

•

9 years ago

https://treeherder.mozilla.org/#/jobs?repo=try&revision=350337f0ae6a

Ben Kelly [:bkelly, not reviewing]

Assignee

Comment 6

•

9 years ago

Note sure if its the problem here, but it seems we are attempting to schedule two timers for each FetchEvent. The first for the functional event firing and the second after the document load completes. They should be coalesced, but maybe if the server is running particularly slow we can get an extra update in there. https://treeherder.mozilla.org/#/jobs?repo=try&revision=c357dcbed721

Ben Kelly [:bkelly, not reviewing]

Assignee

Comment 7

•

9 years ago

https://treeherder.mozilla.org/#/jobs?repo=try&revision=52b869f033ea

Ben Kelly [:bkelly, not reviewing]

Assignee

Comment 8

•

9 years ago

It looks like we're not getting a fetch event at all for the sub-resource load. In this case its an image. I wonder if image cache is screwing things up somehow. Here is a try that does cache busting on the image: https://treeherder.mozilla.org/#/jobs?repo=try&revision=75bdae5641b5

Ben Kelly [:bkelly, not reviewing]

Assignee

Comment 9

•

9 years ago

Cache busting did not help. More debugging shows, however, that ServiceWorkerPrivate::TerminateWorker() can be called even if we have a KeepAliveToken supposedly holding the worker alive. It seems the FetchEventRunnable is getting created and the work is being killed before it can be executed on the worker. Lets try aborting TerminateWorker if mTokenCount is > 0: https://treeherder.mozilla.org/#/jobs?repo=try&revision=20b545139379

Ben Kelly [:bkelly, not reviewing]

Assignee

Comment 10

•

9 years ago

That was a bad patch in comment 9. We need to be able to force terminate a worker even if it has an mTokenCount. In this case, though, its happening because the ServiceWorkerInfo object is being destroyed. This is quite odd because it should be held alive as the reg.active worker. More try debug...

Ben Kelly [:bkelly, not reviewing]

Assignee

Comment 11

•

9 years ago

So what is happening is this: 1) First navigation update triggers a PropagateRegistration() 2) This async calls back to the child process with LoadRegistration() 3) LoadRegistration() checks to see if it needs to start a new active worker or not 4) Logic error in (3) causes us to replace our current active worker with a new active worker using the same script spec. 5) Replacing the current active worker kills the WorkerPrivate and its thread. 6) It just happens to do this between creating the FetchEventRunnable and executing it on the worker thread. Here is a try build that attempts to fix the logic error LoadRegistration(): https://treeherder.mozilla.org/#/jobs?repo=try&revision=82eca92c300c

Ben Kelly [:bkelly, not reviewing]

Assignee

Comment 12

•

9 years ago

Attached patch Don't replace active worker unnecessarly after saving registration in e10s mode. r=ehsan — Details — Splinter Review

Ben Kelly [:bkelly, not reviewing]

Assignee

Comment 13

•

9 years ago

Comment on attachment 8705742 [details] [diff] [review] Don't replace active worker unnecessarly after saving registration in e10s mode. r=ehsan 20 green try triggers. I think this fixes the problem.

Attachment #8705742 - Flags: review?(ehsan)

(no longer active)

Comment 14

•

9 years ago

Interesting. I wonder if bug 1232555 is somehow related to this...

Updated

•

9 years ago

Attachment #8705742 - Flags: review?(ehsan) → review+

Pulsebot

Comment 15

•

9 years ago

https://hg.mozilla.org/integration/mozilla-inbound/rev/b82dc4ab197b

Ben Kelly [:bkelly, not reviewing]

Assignee

Updated

•

9 years ago

Updated

•

9 years ago

Summary: Intermittent TEST-UNEXPECTED-TIMEOUT | /_mozilla/service-workers/service-worker/update-after-oneday.https.html | Update should be triggered after a functional event when last update time is over 24 hours - Test timed out → Intermittent e10s TEST-UNEXPECTED-TIMEOUT | /_mozilla/service-workers/service-worker/update-after-oneday.https.html | Update should be triggered after a functional event when last update time is over 24 hours - Test timed out

Phil Ringnalda (:philor)

Reporter

Comment 16

•

9 years ago

bugherder

https://hg.mozilla.org/mozilla-central/rev/b82dc4ab197b

Status: ASSIGNED → RESOLVED

Closed: 9 years ago

status-firefox46: --- → fixed

Resolution: --- → FIXED

Target Milestone: --- → mozilla46

Comment hidden (Intermittent Failures Robot)

You need to log in before you can comment on or make changes to this bug.