Closed Bug 1171127 Opened 9 years ago Closed 9 years ago

Ghost windows on washingtonpost.com with e10s enabled

Categories

(Core :: Networking, defect)

Priority: Not set
Severity: major

Tracking

Status: RESOLVED FIXED
Target Milestone: mozilla41

Tracking Status:
e10s: m8+
firefox40: affected
firefox41: fixed

People

(Reporter: mccr8, Assigned: valentin)

References

(Blocks 1 open bug)

Details

(Keywords: memory-leak, perf, Whiteboard: [MemShrink:P1])

Attachments

(1 file)

With e10s, if you go to a story on washingtonpost.com, scroll all the way to the bottom so various comment junk starts loading, then navigate to a new page while the loading is in progress, the page is leaked forever. I can't reproduce this without e10s enabled.

I'm still investigating, but the problem seems to be an orphan object element being held alive by a networking channel. (The problem could be in how the DOM handles the channels, of course.)
tracking-e10s: --- → ?
For at least one of the orphan object elements, it seems like what is holding it alive is an ObjectInterfaceRequestorShim. The only thing referring to that is something allocated in mozilla::net::nsHttpHandler::NewProxiedChannel2(), which I think is an HttpChannelChild.

As for the channel, it looks like it is being held alive by an nsDocument (http://js.washingtonpost.com/pb/resources/js/echo2/sdk/v3/gui.pack.css), by something allocated under HttpChannelChild() (maybe a ChannelEventQueue?), and by an nsPerformance object. The nsPerformance object is being referred to directly only by its wrapper, which seems weird to me. I don't know about the ChannelEventQueue, but the nsDocument and the nsPerformance object are entrained (via a very long path) by the orphan object elements, so there's your cycle.

I'm not really sure what is going wrong here. Maybe something should have cleared the channel-to-shim edge?

My information here is coming from conservative heap scanning and allocation stacks, along with the cycle collector log. I should be able to rerun the analysis while the browser is running, which would let me poke around with a debugger.
jdm, do you have any ideas about what might be going wrong here? I'm guessing something akin to bug 1157468 where cleanup isn't being done in some condition.
Flags: needinfo?(josh)
See Also: → 1158648
So ObjectInterfaceRequestorShim is used as the notification callbacks on the channel that nsObjectLoadingContent::OpenChannel opens.  It's also used as the channel's stream listener.

Normally necko is responsible for nulling out both the stream listener and the callbacks after OnStopRequest.  In particular, HttpBaseChannel::ReleaseListeners nulls them both out and should be called after OnStopRequest.  It also nulls out mProgressSink, which is another pointer to the same object.

And if the channel is being held by something other than the stack, it would seem like AsyncOpen succeeded, so we should have gotten OnStopRequest.

I guess we don't know which member of the channel is pointing to the shim?
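(For reference, a minimal sketch of the cleanup described above, using the member names mentioned in this comment; this is an approximation, not the exact tree code.)

// Sketch only: ReleaseListeners() drops every strong reference the channel
// holds to the shim once the load is finished.
void HttpBaseChannel::ReleaseListeners()
{
  mListener = nullptr;      // the shim, acting as the nsIStreamListener
  mCallbacks = nullptr;     // the shim again, as the notification callbacks
  mProgressSink = nullptr;  // another pointer to the same object
}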
HttpBaseChannel::ReleaseListeners() is not used in the child process.  One thing the channel doesn't throw away is the load group reference.  There are several places where we remove the channel (self) from the load group, never re-add it (AFAIK), and never release the load group after that.
OK, I think we simply don't clean up mProgressSink correctly on the child.
The loadgroup should not be an issue.

But if we don't clean up mProgressSink/mCallbacks/mListener on the child, that should cause all sorts of leaks...
Sorry, I don't think I'll be able to provide any useful information here before I leave on vacation.
Flags: needinfo?(josh)
Jason, this sounds kinda bad. The leaks cause really bad cycle collector performance, which can lead to frequent, long pauses.

Is this something that could be fixed easily? Could you find someone to work on it?
Flags: needinfo?(jduell.mcbugs)
Assignee: nobody → valentin.gosu
Flags: needinfo?(jduell.mcbugs)
Whiteboard: [MemShrink] → [MemShrink:P1]
Severity: normal → major
Keywords: mlk, perf
Releasing the listeners in OnStopRequest seems to ameliorate the issue: the entries for washingtonpost.com disappear from about:memory after a GC, CC, and minimize memory usage.
That's assuming I'm testing this correctly. Can someone check with the try build?

https://treeherder.mozilla.org/#/jobs?repo=try&revision=7681f9e4f7ab
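(Roughly, the change amounts to doing on the child what HttpBaseChannel::ReleaseListeners() already does on the parent side once OnStopRequest has been delivered. A sketch follows, with approximate method and parameter names; see the attached patch for the real change.)

// Sketch of the shape of the fix, not the actual patch.
void HttpChannelChild::OnStopRequest(const nsresult& aChannelStatus)
{
  if (mListener) {
    mListener->OnStopRequest(this, mListenerContext, aChannelStatus);
  }
  // Clear mListener/mCallbacks/mProgressSink on the child too, so the channel
  // no longer entrains the ObjectInterfaceRequestorShim (and, through it, the
  // leaked page).
  ReleaseListeners();
}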
I'm unable to reproduce the leak with the attached patch. Thanks!
I wonder if this should be fixed on b2g also.
Attachment #8622127 - Flags: review?(jduell.mcbugs) → review+
https://hg.mozilla.org/mozilla-central/rev/02eb7fe4cdd1
Status: NEW → RESOLVED
Closed: 9 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla41
Valentin, could you get approval for landing this on Aurora? People are running e10s there and it would be nice to get it fixed.
Flags: needinfo?(valentin.gosu)
Comment on attachment 8622127 [details] [diff] [review]
Listeners are not released in OnStopRequest in e10s mode

Approval Request Comment
[Feature/regressing bug #]:
e10s/HttpChannelChild
[User impact if declined]:
Rather large and long lasting memory leaks in the content process
[Describe test coverage new/current, TreeHerder]:
All green on TreeHerder. Has been on m-c for a few days.
[Risks and why]:
Low risk. Just clears some refPtrs in OnStopRequest.
[String/UUID change made/needed]:
None.
Flags: needinfo?(valentin.gosu)
Attachment #8622127 - Flags: approval-mozilla-aurora?
Comment on attachment 8622127 [details] [diff] [review]
Listeners are not released in OnStopRequest in e10s mode

We are going to merge aurora into beta next Monday. e10s is going to be disabled in beta, so we're not taking this, as it could have potential regressions.
Attachment #8622127 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora-
Blocks: GhostWindows