Closed Bug 1232558 Opened 10 years ago Closed 10 years ago

Intermittent e10s leakcheck | tab process: 12537 bytes leaked (ChannelEventQueue, ChildDNSService, CompareCache, CompareManager, CompareNetwork, ...)

Status:

RESOLVED FIXED

Tracking Flags:

Tracking

Status

e10s

---

People

(Reporter: philor, Assigned: bkelly)

References

(Blocks 1 open bug)

Details

(Keywords: intermittent-failure)

Attachments

(3 files, 5 obsolete files)

Don't queue an update job if we are shutting down the ServiceWorkerManager. r=ehsan (10 years ago, Ben Kelly [:bkelly, not reviewing], 940 bytes, patch)
Don't queue an update job if we are shutting down the ServiceWorkerManager. r=ehsan (10 years ago, Ben Kelly [:bkelly, not reviewing], 1.78 KB, patch)
Try to ensure service worker jobs do not run during shutdown. r=ehsan (10 years ago, Ben Kelly [:bkelly, not reviewing], 2.84 KB, patch)
P2 Abort the ServiceWorkerScriptCache CompareManager at xpcom-shutdown. r=ehsan (10 years ago, Ben Kelly [:bkelly, not reviewing], 6.37 KB, patch)
P1 Try to ensure service worker jobs do not run during shutdown. r=ehsan (10 years ago, Ben Kelly [:bkelly, not reviewing], 2.84 KB, patch)
P1 Try to ensure service worker jobs do not run during shutdown. r=ehsan (10 years ago, Ben Kelly [:bkelly, not reviewing], 2.84 KB, patch)
P2 Abort the ServiceWorkerScriptCache CompareManager at xpcom-shutdown. r=ehsan (10 years ago, Ben Kelly [:bkelly, not reviewing], 6.48 KB, patch)
P3 Block shutdown until canceled service worker jobs gracefully exit. r=ehsan (10 years ago, Ben Kelly [:bkelly, not reviewing], 6.22 KB, patch)

Phil Ringnalda (:philor)

Reporter

Description

•

10 years ago

https://treeherder.mozilla.org/logviewer.html#?job_id=18569567&repo=mozilla-inbound

Ben Kelly [:bkelly, not reviewing]

Assignee

Updated

•

10 years ago

Component: Networking: DNS → DOM: Service Workers

Jim Mathies [:jimm]

Updated

•

10 years ago

Flags: needinfo?(ehsan)

Jim Mathies [:jimm]

Updated

•

10 years ago

Blocks: e10s-tests

tracking-e10s: --- → +

Ben Kelly [:bkelly, not reviewing]

Assignee

Comment 1

•

10 years ago

As far as I can tell, this has triggered 4 times so far: https://brasstacks.mozilla.com/orangefactor/?display=Bug&bugid=1232558

It's most likely fallout from bug 1226443 starting update activity late during shutdown. I'd like to get bug 1231974 landed before looking at this.

Blocks: 1226443

Comment hidden (Intermittent Failures Robot)

(no longer active)

Updated

•

10 years ago

Flags: needinfo?(ehsan)

Comment hidden (Intermittent Failures Robot)

Ben Kelly [:bkelly, not reviewing]

Assignee

Updated

•

10 years ago

Assignee: nobody → bkelly

Status: NEW → ASSIGNED

Ben Kelly [:bkelly, not reviewing]

Assignee

Comment 5

•

10 years ago

Attached patch: Don't queue an update job if we are shutting down the ServiceWorkerManager. r=ehsan (obsolete)

Theory:

1) A delayed update is scheduled during a test.
2) The tests complete and the framework starts cleaning up.
3) The delayed update fires, calling PropagateSoftUpdate(), which sends a message to the parent.
4) SWM gets xpcom-shutdown and starts shutting down. No update job is present to be canceled.
5) The parent calls back from PropagateSoftUpdate() with NotifySoftUpdate().
6) The update job is queued and runs during shutdown. This job is not canceled, because it's scheduled after xpcom-shutdown.

This patch makes us just short-circuit in SoftUpdate if we get a NotifySoftUpdate() after shutdown. Let's see if it helps: https://treeherder.mozilla.org/#/jobs?repo=try&revision=c299e3d828df
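The short-circuit described in this comment can be illustrated with a minimal standalone sketch. This is not the attached patch and not Gecko code: SketchManager, BeginShutdown(), and PendingJobs() are hypothetical names invented for illustration, and only the idea of refusing to queue an update once shutdown has begun is taken from the comment.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <string>

// Hypothetical stand-in for the ServiceWorkerManager; not the Gecko class.
class SketchManager {
 public:
  // Models receiving the xpcom-shutdown notification. It only sets the flag;
  // canceling already-queued jobs is a separate concern.
  void BeginShutdown() { mShuttingDown = true; }

  // Models the late NotifySoftUpdate() callback from the parent process.
  // Returns true if an update job was queued.
  bool SoftUpdate(const std::string& aScope) {
    if (mShuttingDown) {
      // Short-circuit: never queue new work once shutdown has started.
      return false;
    }
    mJobs.push_back(aScope);
    return true;
  }

  std::size_t PendingJobs() const { return mJobs.size(); }

 private:
  bool mShuttingDown = false;
  std::deque<std::string> mJobs;  // queued update jobs, keyed by scope
};
```

With this guard, an update request that arrives after step 4 of the theory is dropped instead of being queued in step 6.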

Ben Kelly [:bkelly, not reviewing]

Assignee

Comment 6

•

10 years ago

Attached patch: Don't queue an update job if we are shutting down the ServiceWorkerManager. r=ehsan (obsolete)

Still got a leak with that last patch. I realized we are not canceling queued jobs at shutdown, though. Let's see if that helps as well. https://treeherder.mozilla.org/#/jobs?repo=try&revision=e7e543fa561b

Attachment #8701334 - Attachment is obsolete: true

Ben Kelly [:bkelly, not reviewing]

Assignee

Comment 7

•

10 years ago

Attached patch: Try to ensure service worker jobs do not run during shutdown. r=ehsan (obsolete)

A few more shutdown checks added to this patch.

Attachment #8701485 - Attachment is obsolete: true

Ben Kelly [:bkelly, not reviewing]

Assignee

Comment 8

•

10 years ago

Attached patch: P2 Abort the ServiceWorkerScriptCache CompareManager at xpcom-shutdown. r=ehsan (obsolete)

And then pipe the cancel/abort down into the ServiceWorkerScriptCache CompareManager. Without this, if we are already running the comparison, the cancelation at shutdown does nothing. https://treeherder.mozilla.org/#/jobs?repo=try&revision=c3c9719380b9
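A rough standalone sketch of what "piping the cancel down" means, under the assumption that an update job owns an in-flight comparison object. SketchCompare and SketchUpdateJob are hypothetical names for illustration only; they do not reflect the real ServiceWorkerScriptCache interfaces or the attached patch.

```cpp
#include <cassert>
#include <memory>

// Hypothetical stand-in for the in-flight script comparison.
class SketchCompare {
 public:
  void Abort() { mAborted = true; }

  // Performs one unit of comparison work. Returns false once aborted,
  // so the caller knows to stop driving the comparison.
  bool Step() {
    if (mAborted) {
      return false;
    }
    ++mStepsDone;
    return true;
  }

  int StepsDone() const { return mStepsDone; }

 private:
  bool mAborted = false;
  int mStepsDone = 0;
};

// Hypothetical update job that owns a comparison while it is running.
class SketchUpdateJob {
 public:
  void Start() { mCompare.reset(new SketchCompare()); }

  // Marking the job canceled is not enough on its own: the abort is also
  // forwarded to the comparison that is already running.
  void Cancel() {
    mCanceled = true;
    if (mCompare) {
      mCompare->Abort();
    }
  }

  bool Step() { return mCompare && mCompare->Step(); }
  int StepsDone() const { return mCompare ? mCompare->StepsDone() : 0; }
  bool Canceled() const { return mCanceled; }

 private:
  bool mCanceled = false;
  std::unique_ptr<SketchCompare> mCompare;
};
```

If Cancel() only set mCanceled, a comparison that had already started would keep doing work, which is the situation this comment describes.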

Ben Kelly [:bkelly, not reviewing]

Assignee

Comment 10

•

10 years ago

Attached patch: P1 Try to ensure service worker jobs do not run during shutdown. r=ehsan (obsolete)

Attachment #8701599 - Attachment is obsolete: true

Ben Kelly [:bkelly, not reviewing]

Assignee

Comment 11

•

10 years ago

Still getting leaks here. I'm starting to wonder if we're falling victim to the same necko leak in bug 1218297. I see an EventTokenBucket in the leak list.

Comment 12

•

10 years ago

Try build with just the necko leak fix: https://treeherder.mozilla.org/#/jobs?repo=try&revision=214fc0e2b765

Try build with the necko leak fix and the patches in this bug: https://treeherder.mozilla.org/#/jobs?repo=try&revision=11867b4133c6

Let's see if the leak shows up in either of these.

Ben Kelly [:bkelly, not reviewing]

Assignee

Comment 13

•

10 years ago

Comment on attachment 8701629: P1 Try to ensure service worker jobs do not run during shutdown. r=ehsan

The try runs show that the leak still happens, but suggest these patches reduce the frequency, so I'd like to proceed with them for now. I'm doing more triggers, but they seem to reduce the frequency from 15% to 5% on Linux debug m-e10s(1).

Attachment #8701629 - Flags: review?(ehsan)

Ben Kelly [:bkelly, not reviewing]

Assignee

Updated

•

10 years ago

Attachment #8701600 - Flags: review?(ehsan)

Ben Kelly [:bkelly, not reviewing]

Assignee

Comment 14

•

10 years ago

Comment on attachment 8701629: P1 Try to ensure service worker jobs do not run during shutdown. r=ehsan

Further triggers show the frequency has not actually dropped.

Attachment #8701629 - Flags: review?(ehsan)

Ben Kelly [:bkelly, not reviewing]

Assignee

Comment 15

•

10 years ago

Comment on attachment 8701600: P2 Abort the ServiceWorkerScriptCache CompareManager at xpcom-shutdown. r=ehsan

This patch is probably a good start, though, since it consistently changes the number of objects leaked.

Attachment #8701600 - Flags: review?(ehsan)

Ben Kelly [:bkelly, not reviewing]

Assignee

Comment 16

•

10 years ago

I'll have to resume work on this in January.

Comment hidden (Intermittent Failures Robot)

Ben Kelly [:bkelly, not reviewing]

Assignee

Updated

•

10 years ago

Comment 19

•

10 years ago

Attached patch: P1 Try to ensure service worker jobs do not run during shutdown. r=ehsan

Attachment #8701629 - Attachment is obsolete: true

Ben Kelly [:bkelly, not reviewing]

Assignee

Comment 20

•

10 years ago

Attached patch: P2 Abort the ServiceWorkerScriptCache CompareManager at xpcom-shutdown. r=ehsan

Attachment #8701600 - Attachment is obsolete: true

Ben Kelly [:bkelly, not reviewing]

Assignee

Comment 21

•

10 years ago

Attached patch: P3 Block shutdown until canceled service worker jobs gracefully exit. r=ehsan

I think we must spin the event loop here in order to gracefully let our network objects cleanup after being canceled. https://treeherder.mozilla.org/#/jobs?repo=try&revision=7f38bab11917
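The idea of spinning the event loop until canceled jobs exit can be sketched in isolation as follows. This is a toy single-threaded loop, not Gecko's thread or event APIs: SketchEventLoop, ProcessNextEvent(), and SpinUntilJobsExit() are hypothetical names, and the pending-job counter is an assumption made for illustration. Unlike a real blocking loop, this sketch returns false when the queue drains while jobs are still pending, so that a missing completion callback shows up as a result rather than as a hang.

```cpp
#include <cassert>
#include <functional>
#include <queue>
#include <utility>

// Hypothetical single-threaded event loop used only for this sketch.
class SketchEventLoop {
 public:
  void Dispatch(std::function<void()> aTask) { mTasks.push(std::move(aTask)); }

  // Runs one queued task. Returns false if there was nothing to run.
  bool ProcessNextEvent() {
    if (mTasks.empty()) {
      return false;
    }
    std::function<void()> task = std::move(mTasks.front());
    mTasks.pop();
    task();
    return true;
  }

 private:
  std::queue<std::function<void()>> mTasks;
};

// Spins the loop until the pending-job counter reaches zero.
// Returns true if all jobs exited, or false if the queue ran dry first
// (a real blocking loop would wait forever in that case).
bool SpinUntilJobsExit(SketchEventLoop& aLoop, const int& aPendingJobs) {
  while (aPendingJobs > 0) {
    if (!aLoop.ProcessNextEvent()) {
      return false;
    }
  }
  return true;
}
```

Each canceled job is expected to dispatch a completion task that decrements the counter. If that task is never dispatched, the spin cannot finish.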

Patrick McManus [:mcmanus]

Updated

•

10 years ago

Blocks: 1233774

Ben Kelly [:bkelly, not reviewing]

Assignee

Comment 22

•

10 years ago

https://treeherder.mozilla.org/#/jobs?repo=try&revision=12acf9adfaa6

Ben Kelly [:bkelly, not reviewing]

Assignee

Comment 23

•

10 years ago

https://treeherder.mozilla.org/#/jobs?repo=try&revision=4291774e2053

Ben Kelly [:bkelly, not reviewing]

Assignee

Comment 24

•

10 years ago

https://treeherder.mozilla.org/#/jobs?repo=try&revision=155da677dcdb

Ben Kelly [:bkelly, not reviewing]

Assignee

Comment 25

•

10 years ago

https://treeherder.mozilla.org/#/jobs?repo=try&revision=bfe821ee2e8d

Ben Kelly [:bkelly, not reviewing]

Assignee

Comment 26

•

10 years ago

https://treeherder.mozilla.org/#/jobs?repo=try&revision=b94462c4c67a

Comment hidden (Intermittent Failures Robot)

Ben Kelly [:bkelly, not reviewing]

Assignee

Updated

•

10 years ago

Depends on: 1237158

Ben Kelly [:bkelly, not reviewing]

Assignee

Comment 28

•

10 years ago

So I believe this leak is greatly exacerbated by excessive updating in the dom/canvas/test mochitests. I filed bug 1237158 to unregister the service worker there and avoid these updates. That should reduce the frequency of this bug. I'd still like to make SWM shut down cleanly here, though.

Ben Kelly [:bkelly, not reviewing]

Assignee

Comment 29

•

10 years ago

I think my shutdown hang issues with the "block shutdown until update job exits out" patch are due to e10s channels not firing OnStopRequest if the actor is torn down.

Ben Kelly [:bkelly, not reviewing]

Assignee

Comment 30

•

10 years ago

https://treeherder.mozilla.org/#/jobs?repo=try&revision=c3c6e1279d9e

Ben Kelly [:bkelly, not reviewing]

Assignee

Comment 31

•

10 years ago

Jason, do you have any idea why I would not get an OnStopRequest callback from an e10s http channel when I call Cancel() during xpcom-shutdown? I am spinning the event loop waiting for the callback, but it never seems to come. I'm having real trouble gracefully closing network connections during shutdown in e10s mode.

Flags: needinfo?(jduell.mcbugs)

(no longer active)

Comment 32

•

10 years ago

Is this caused by mIPCClosed being true? <https://dxr.mozilla.org/mozilla-central/source/netwerk/protocol/http/HttpChannelParent.cpp#1125>

Ben Kelly [:bkelly, not reviewing]

Assignee

Updated

•

10 years ago

Comment 33

•

10 years ago

We have decided to simply avoid service workers running during shutdown via bug 1237363. This has not triggered since bug 1237158 landed. I'm going to close for now.

Status: ASSIGNED → RESOLVED

Closed: 10 years ago

Resolution: --- → FIXED

Ben Kelly [:bkelly, not reviewing]

Assignee

Updated

•

10 years ago

No longer blocks: 1233774

Comment hidden (Intermittent Failures Robot)

Jason Duell

Comment 35

•

10 years ago

> do you have any idea why I would not get an OnStopRequest callback from an e10s
> http channel when I call Cancel() during xpcom-shutdown?

Not really. We're having major issues with TCP/UDP sockets hanging on Windows during shutdown (close() never returns), but that generally shouldn't affect HTTP OnStopRequest, which happens when all the bytes for the HTTP request have been received, not when the socket is closed.

Flags: needinfo?(jduell.mcbugs)
