Fix LayerTransactionChild, LayerTransactionParent leak

RESOLVED FIXED in Firefox 32

Status

()

defect
RESOLVED FIXED
5 years ago
5 years ago

People

(Reporter: sotaro, Assigned: sotaro)

Tracking

unspecified
mozilla32
ARM
Gonk (Firefox OS)
Points:
---
Dependency tree / graph

Firefox Tracking Flags

(blocking-b2g:1.4+, firefox30 wontfix, firefox31 wontfix, firefox32 fixed, b2g-v1.4 fixed, b2g-v2.0 fixed)

Details

Attachments

(2 attachments, 14 obsolete attachments)

5.11 KB, patch
nical
: review+
Details | Diff | Splinter Review
3.84 KB, patch
sotaro
: review+
Details | Diff | Splinter Review
+++ This bug was initially created as a clone of Bug #1000525 +++

This bug is created based on Bug 1000525 Comment 34.

During testing the following STR. LayerTransactionChild, LayerTransactionParent and ShadowLayerForwarder are created for each window.open(). But they are not destructed until the application is killed.

STR:
* Run the test app at bug 964386 attachment 8366455 [details] on master b2g device.
It blocks 1.4+ bug.
blocking-b2g: --- → 1.4?
Assignee: nobody → sotaro.ikeda.g
blocking-b2g: 1.4? → 1.4+
I confirmed that the window's TabParent::Destroy() and TabChild::Destroy() are called in the STR.
(In reply to Sotaro Ikeda [:sotaro] from comment #2)
> I confirmed that the window's TabParent::Destroy() and TabChild::Destroy()
> are called in the STR.

Correction:
TabParent::Destroy() and TabChild::RecvDestroy()
I did misunderstanding. ShadowLayerForwarder was correctly destroyed.
Summary: Fix LayerTransactionChild, LayerTransactionParent and ShadowLayerForwarder leak → Fix LayerTransactionChild, LayerTransactionParent leak
I understand why the leak happens.

LayerTransactionChild is reference counted IPC object. It is explicitly add-refed by CompositorChild. Therefore to release it, it need to be explicitly released. The relationship between LayerTransactionChild and ShadowLayerForwarder is very similar to TextureChild and TextureClient.
By applying the patch, I confirmed that LayerTransactionChild is destroyed on master nexus-5.
Attachment #8415890 - Flags: review?(nical.bugzilla)
Attachment #8415890 - Flags: review?(bjacob)
Attachment #8415890 - Flags: review?(bjacob) → review+
(In reply to Sotaro Ikeda [:sotaro] from comment #7)
> https://tbpl.mozilla.org/?tree=Try&rev=46532d199746

PLayerTransaction shutdown seems to have a problem. I am going to try PLayerTransaction shutdown change in Bug 1000525.
Apply LayerTransaction shutdown fix in Bug 1000525.
Attachment #8415890 - Attachment is obsolete: true
Attachment #8415890 - Flags: review?(nical.bugzilla)
Fix nits.
Attachment #8416089 - Attachment is obsolete: true
Attachment #8416098 - Attachment is obsolete: true
Add more IPC Open checks.
Attachment #8416271 - Attachment is obsolete: true
Add more IPC Open check.
Attachment #8416295 - Attachment is obsolete: true
No longer blocks: 1000525
This bug no longer blocks bug 1000525. This bug seems difficult than I thought at first. It seem better to handle separately from bug 1000525.
This bug is present since b2g v1.2. It seems to be added as part of new gfx layers. And Memory leak of LayerTransactionChild and LayerTransactionParent is small. We seem not have much necessity to fix this bug in b2g v1.4.
Set "1.4?" based on Comment 19.
blocking-b2g: 1.4+ → 1.4?
blocking-b2g: 1.4? → backlog
Attachment #8416309 - Attachment is obsolete: true
Attachment #8416798 - Attachment is obsolete: true
Attachment #8416799 - Attachment is obsolete: true
(In reply to Sotaro Ikeda [:sotaro] from comment #25)
> https://tbpl.mozilla.org/?tree=Try&rev=22888ff08911

There are some test failures. It happens when LayerTransactionChild is destroyed via CompositorChild::Destroy(). CompositorChild does not reference count LayerTransactionChild. It directly uses raw pointer. During CompositorChild calling LayerTransactionChild::Destroy(), LayerTransactionChild is destroyed by the following sequence.

nsBaseWidget::DestroyCompositor()
->CompositorChild::Destroy()
->LayerTransactionChild::Destroy()
->PLayerTransactionChild::Send__delete__()
->"destroy TextureChilds"
   This removes all reference to ShadowLayerForwarder.
->ShadowLayerForwarder::~ShadowLayerForwarder()
  ->remove reference to PLayerTransactionChild
->PLayerTransactionChild's ref count becomes 0
Fix how to call LayerTransactionChild::Destroy() from ShadowLayerForwarder::~ShadowLayerForwarder().
Attachment #8416830 - Attachment is obsolete: true
I also found another problem during destroying compositor. Before PTextureParent::Send__delete__() arrive to TexuteChild, TextureChild is already deleted.

TextureChild/TextureParent is destroyed like the following sequence.

TextureChild::SendRemoveTexture()
->TextureParent::RecvRemoveTextureSync()
->PTextureParent::Send__delete__()

During the destroying compositor, TextureChild::SendRemoveTexture()s are called. Then the parent sides send PTextureParent::Send__delete__(). But before the message arrival to the TextureChilds, they are already deleted. Then the following errors happen.

> ###!!! [Child][DispatchAsyncMessage] Error: Route error: message sent to unknown actor ID
(In reply to Sotaro Ikeda [:sotaro] from comment #27)
> Created attachment 8416911 [details] [diff] [review]
> patch v10 - Destroy LayerTransactionChild(r=bjacob)
> 
> Fix how to call LayerTransactionChild::Destroy() from
> ShadowLayerForwarder::~ShadowLayerForwarder().

For fixing, Task is used. But it hits the pending task check during shutdown.
> nsBaseWidget::DestroyCompositor()
> ->CompositorChild::Destroy()
> ->LayerTransactionChild::Destroy()
> ->PLayerTransactionChild::Send__delete__()
> ->"destroy TextureChilds"
>    This removes all reference to ShadowLayerForwarder.
> ->ShadowLayerForwarder::~ShadowLayerForwarder()
>   ->remove reference to PLayerTransactionChild
> ->PLayerTransactionChild's ref count becomes 0

The following corrects call sequence in Comment 26. 

CompositorChild::Destroy()
  ->LayerTransactionChild::Destroy()
    ->PLayerTransactionChild::Send__delete__()
      ->PLayerTransactionChild::DestroySubtree();
        ->PLayerTransactionChild::ActorDestroy()
      ->PLayerTransactionChild::DeallocSubtree();
        ->ShadowLayerForwarder::~ShadowLayerForwarder()
          ->LayerTransactionChild::Destroy()
            ->if (IPCOpen == true)
              ->PLayerTransactionChild::Send__delete__()
                -> ###!!! ABORT: actor has been |delete|d: !!!!!!!!!!!!!!!!!!!
      ->PCompositorChild::RemoveManagee()
        ->CompositorChild::DeallocPLayerTransactionChild()
          ->LayerTransactionChild::ReleaseIPDLReference()
            ->LayerTransactionChild::mIPCOpen = false;
                        // Set IPCOpen flag to false here !!!!!!!!!!!!!!!!!!
            ->LayerTransactionChild::Release();

Second LayerTransactionChild::Destroy() is called withing the first LayerTransactionChild::Destroy() call. When the second LayerTransactionChild::Destroy() is called, The "IPCOpen flag" is still true.
Remove Task usage for calling LayerTransactionChild::Destroy().
Attachment #8416911 - Attachment is obsolete: true
Add mDestroyed flag to LayerTransactionChild.
Attachment #8416937 - Attachment is obsolete: true
(In reply to Sotaro Ikeda [:sotaro] from comment #34)
> https://tbpl.mozilla.org/?tree=Try&rev=f512dc914d65

tryserver results seems good. attachment 8416958 [details] [diff] [review] still has a problem of Comment 28. But it seems not affect to the test results.
Attachment #8416958 - Flags: feedback?(nical.bugzilla)
Status: NEW → ASSIGNED
Comment on attachment 8416958 [details] [diff] [review]
patch v12 - Destroy LayerTransactionChild(r=bjacob)

Review of attachment 8416958 [details] [diff] [review]:
-----------------------------------------------------------------

Looks good.
Attachment #8416958 - Flags: feedback?(nical.bugzilla) → feedback+
(In reply to Sotaro Ikeda [:sotaro] from comment #28)
> I also found another problem during destroying compositor. Before
> PTextureParent::Send__delete__() arrive to TexuteChild, TextureChild is
> already deleted.
> 
> TextureChild/TextureParent is destroyed like the following sequence.
> 
> TextureChild::SendRemoveTexture()
> ->TextureParent::RecvRemoveTextureSync()
> ->PTextureParent::Send__delete__()
> 
> During the destroying compositor, TextureChild::SendRemoveTexture()s are
> called. Then the parent sides send PTextureParent::Send__delete__(). But
> before the message arrival to the TextureChilds, they are already deleted.
> Then the following errors happen.
> 
> > ###!!! [Child][DispatchAsyncMessage] Error: Route error: message sent to unknown actor ID

This looks like bug 966284.
> > 
> > During the destroying compositor, TextureChild::SendRemoveTexture()s are
> > called. Then the parent sides send PTextureParent::Send__delete__(). But
> > before the message arrival to the TextureChilds, they are already deleted.
> > Then the following errors happen.
> > 
> > > ###!!! [Child][DispatchAsyncMessage] Error: Route error: message sent to unknown actor ID
> 
> This looks like bug 966284.

I think that it is different. The above error caused by shutdown sequence difference between PCompositor/PLayerTransaction and PTexture. PCompositor and PLayerTransaction actually close the protocols from child side. But PTexture actually close from the parent side, though TextureChild triggers the protocol close from the child side by calling PTextureChild::SendRemoveTexture().

bug 966284 is about the problem of TextureParent::RecvRemoveTextureSync(). ipdl protocol seems not to expect to delete the protocol during receiving sync message.
Bug 966284 is about making sure that a manager protocol doesn't "interrupt" it's managees when it gets destroyed. I write "interrupt" as in force the managees to be destroyed synchronously while it's still in the middle of an asynchronous conversation or a shutdown handshake.
Un-bitrot.
Attachment #8416958 - Attachment is obsolete: true
Fix nits.
Attachment #8418122 - Attachment is obsolete: true
Update a comment.
Attachment #8418124 - Attachment is obsolete: true
Attachment #8418130 - Flags: review?(nical.bugzilla)
Attachment #8418130 - Flags: review?(nical.bugzilla) → review+
Sotaro, this still blocks bug 998504?
blocking-b2g: backlog → 1.4?
Flags: needinfo?(sotaro.ikeda.g)
(In reply to Milan Sreckovic [:milan] from comment #45)
> Sotaro, this still blocks bug 998504?

This bug does not block bug 998504.
Flags: needinfo?(sotaro.ikeda.g)
(In reply to Sotaro Ikeda [:sotaro] from comment #46)
> (In reply to Milan Sreckovic [:milan] from comment #45)
> > Sotaro, this still blocks bug 998504?
> 
> This bug does not block bug 998504.

It is commented in Comment 19.
(In reply to Sotaro Ikeda [:sotaro] from comment #46)
> (In reply to Milan Sreckovic [:milan] from comment #45)
> > Sotaro, this still blocks bug 998504?
> 
> This bug does not block bug 998504.

Thanks, just double checking; I removed the dependency.
So 1.4? -> backlog?
Flags: needinfo?(milan)
blocking-b2g: 1.4? → backlog
Flags: needinfo?(milan)
https://hg.mozilla.org/mozilla-central/rev/b196df92475e
Status: ASSIGNED → RESOLVED
Closed: 5 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla32
This problem could cause file descriptors leak with bugs like Bug 988135, Bug 1006957. It seems better to be uplifted to b2g v1.4. But this bug's patch have a dependency to Bug 933082. It is a relatively big change :-(
Depends on: RefcntAllocator, 985302
Nominate to b2g v1.4 based on Comment 51.
blocking-b2g: backlog → 1.4?
(In reply to Sotaro Ikeda [:sotaro] from comment #51)
> But this bug's
> patch have a dependency to Bug 933082. It is a relatively big change :-(

The above part is my misunderstanding. It is already in b2g v1.4:-)
Patch for b2g v1.4. Carry r=nical,bjacob.
Attachment #8430837 - Flags: review+
blocking-b2g: 1.4? → 1.4+
You need to log in before you can comment on or make changes to this bug.