Last Comment Bug 841993 - Leaking ipc::Transport from 'opens' protocols
: Leaking ipc::Transport from 'opens' protocols
Status: RESOLVED FIXED
[MemShrink:P1]
:
Product: Core
Classification: Components
Component: Graphics: Layers (show other bugs)
: unspecified
: x86_64 Linux
: -- critical (vote)
: mozilla22
Assigned To: Chris Jones [:cjones] inactive; ni?/f?/r? if you need me
:
Mentors:
Depends on:
Blocks: 837187 838869 841976
  Show dependency treegraph
 
Reported: 2013-02-15 20:47 PST by Chris Jones [:cjones] inactive; ni?/f?/r? if you need me
Modified: 2013-02-26 16:48 PST (History)
12 users (show)
See Also:
Crash Signature:
(edit)
QA Whiteboard:
Iteration: ---
Points: ---
Has Regression Range: ---
Has STR: ---
tef+
wontfix
wontfix
fixed
fixed
wontfix
fixed


Attachments
Smoking gun (5.50 KB, patch)
2013-02-15 20:47 PST, Chris Jones [:cjones] inactive; ni?/f?/r? if you need me
no flags Details | Diff | Review
Part 1: Destroy the CrossProcessCompositorParent channel (2.78 KB, patch)
2013-02-15 23:06 PST, Chris Jones [:cjones] inactive; ni?/f?/r? if you need me
bent.mozilla: review+
Details | Diff | Review
Part 2: Clean up ImageBridge (7.17 KB, patch)
2013-02-15 23:10 PST, Chris Jones [:cjones] inactive; ni?/f?/r? if you need me
kchen: review+
nical.bugzilla: review+
Details | Diff | Review

Description Chris Jones [:cjones] inactive; ni?/f?/r? if you need me 2013-02-15 20:47:12 PST
Created attachment 714740 [details] [diff] [review]
Smoking gun

CrossProcessCompositorParent and ImageBridgeParent are given pointers to ipc::Transport when allocated.  They hand these down into AsyncChannel on Open().

AsyncChannel doesn't own the Transport, it only borrows it.  The Transport is owned by the thread or process host.

CrossProcessCompositorParent and ImageBridgeParent don't free the Transports so we accumulate a leak proportional to the number of apps opened and closed/lmk'd.  The amount of memory held by Transport* can be fairly substantial because of buffers and queues, into the 10-100KB range.  This can cause substantial b2g-process memory usage and severely destabilize the system.

We need to fix this to hit stability requirements.  Luckily the patch is small and safe.
Comment 1 Chris Jones [:cjones] inactive; ni?/f?/r? if you need me 2013-02-15 20:49:09 PST
I reproduced the symptom in bug 838869 with this patch.
Comment 2 Michael Vines [:m1] [:evilmachines] 2013-02-15 20:54:13 PST
(If possible landing a fix for this before we start v1.0.1 stability test on Sunday night would be rad.)
Comment 3 Chris Jones [:cjones] inactive; ni?/f?/r? if you need me 2013-02-15 22:06:40 PST
I was totally going to suggest that you take the upcoming fix whether it's landed by then or not ;).
Comment 4 Chris Jones [:cjones] inactive; ni?/f?/r? if you need me 2013-02-15 23:06:08 PST
Created attachment 714755 [details] [diff] [review]
Part 1: Destroy the CrossProcessCompositorParent channel

Sorry for the review spam; this is a straightforward cleanup that doesn't require any knowledge of PCompositor, just a bit of general IPC.
Comment 5 Chris Jones [:cjones] inactive; ni?/f?/r? if you need me 2013-02-15 23:10:22 PST
Created attachment 714756 [details] [diff] [review]
Part 2: Clean up ImageBridge

This one is trickier than I'd hoped, because we were previously just completely leaking ImageBridgeParent :/.

The changes here are
 - use the same memory management scheme for ImageBridgeParent as CrossProcessCompositorChild: it keeps itself alive until ActorDestroy()
 - make the same change as in Part 1 to clean up the Transport*, if there is one (there isn't one for cross-thread ImageBridge)
 - stop freeing the image map from ~ImageBridgeParent, since that's a static and there will be many ImageBridgeParents

Apologies again for review spam.
Comment 6 Chris Jones [:cjones] inactive; ni?/f?/r? if you need me 2013-02-15 23:10:41 PST
Comment on attachment 714756 [details] [diff] [review]
Part 2: Clean up ImageBridge

Damn you bz.
Comment 7 Chris Jones [:cjones] inactive; ni?/f?/r? if you need me 2013-02-15 23:13:50 PST
The test case here is to apply the smoking gun patch and then launch and kill the camera over and over.  b2g process memory usage grows without bound.
Comment 8 Ben Turner (not reading bugmail, use the needinfo flag!) 2013-02-15 23:16:12 PST
Comment on attachment 714755 [details] [diff] [review]
Part 1: Destroy the CrossProcessCompositorParent channel

Looks fine to me. We should look into adding leak instrumentation to IPC::Channel and any other classes we can think of...
Comment 9 Chris Jones [:cjones] inactive; ni?/f?/r? if you need me 2013-02-15 23:19:48 PST
New world's record! :)  Thanks much.

Yes, we should.  The problem is all that is in ipc/chromium ...  I'll file a bug though.
Comment 10 Chris Jones [:cjones] inactive; ni?/f?/r? if you need me 2013-02-15 23:20:06 PST
https://tbpl.mozilla.org/?tree=Try&rev=ca044c974dd9
Comment 11 Ben Turner (not reading bugmail, use the needinfo flag!) 2013-02-15 23:21:42 PST
(In reply to Chris Jones [:cjones] [:warhammer] from comment #9)
> Yes, we should.  The problem is all that is in ipc/chromium ...

Well, instead of doing:

  typedef IPC::Channel Transport;

We could make our own class that inherits it and adds logging.
Comment 12 Chris Jones [:cjones] inactive; ni?/f?/r? if you need me 2013-02-15 23:24:39 PST
That's an option.  I filed bug 842003 btw.
Comment 13 Robert O'Callahan (:roc) (Exited; email my personal email if necessary) 2013-02-16 01:15:26 PST
Comment on attachment 714756 [details] [diff] [review]
Part 2: Clean up ImageBridge

Review of attachment 714756 [details] [diff] [review]:
-----------------------------------------------------------------

::: gfx/layers/ipc/ImageBridgeParent.h
@@ +51,5 @@
> +  MessageLoop* mMessageLoop;
> +  Transport* mTransport;
> +  // This keeps us alive until ActorDestroy(), at which point we do a
> +  // deferred destruction of ourselves.
> +  nsRefPtr<ImageBridgeParent> mSelfRef;

Why do we need this field? Can't we just NS_ADDREF(this)?
Comment 14 Chris Jones [:cjones] inactive; ni?/f?/r? if you need me 2013-02-16 01:21:42 PST
Definitely.  But
 - that's what CrossProcessCompositorParent does
 - NS_ADDREF(this) creates a nonlocal invariant, so it's not obvious that the object holds a reference to itself

There's only one of these per process, so I don't think the sizeof(void*) memory difference is worth the programmer overhead.

I could certainly be convinced otherwise, though.
Comment 15 Nicholas Nethercote [:njn] (on vacation until July 11) 2013-02-16 12:42:12 PST
> New world's record! :)  Thanks much.

Nah;  bz once r+'d a patch of mine less than 4 minutes after I posted it :)
Comment 16 Ben Turner (not reading bugmail, use the needinfo flag!) 2013-02-17 08:50:04 PST
(In reply to Nicholas Nethercote [:njn] from comment #15)
> Nah;  bz once r+'d a patch of mine less than 4 minutes after I posted it :)

/me buries the medal in an undisclosed location, cries himself to sleep.
Comment 17 Nicholas Nethercote [:njn] (on vacation until July 11) 2013-02-17 13:38:35 PST
Actually I lie;  it was 4 minutes 31 seconds:  bug 826521 comment 25 :)
Comment 18 Chris Jones [:cjones] inactive; ni?/f?/r? if you need me 2013-02-19 11:23:02 PST
(In reply to Chris Jones [:cjones] [:warhammer] from comment #10)
> https://tbpl.mozilla.org/?tree=Try&rev=ca044c974dd9

Infra decided to ignore this for some reason, so going to push on a wing and a prayer.
Comment 19 Chris Jones [:cjones] inactive; ni?/f?/r? if you need me 2013-02-19 11:28:12 PST
https://hg.mozilla.org/integration/mozilla-inbound/rev/a865bbcdc06d
Comment 20 Ryan VanderMeulen [:RyanVM] 2013-02-20 04:25:45 PST
https://hg.mozilla.org/mozilla-central/rev/a865bbcdc06d

Note You need to log in before you can comment on or make changes to this bug.