Closed Bug 959089 Opened 10 years ago Closed 10 years ago

Separate GrallocBuffer allocations from Compositor thread

Categories

(Core :: Graphics: Layers, defect)

x86_64
Linux
defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla32

People

(Reporter: chiajung, Assigned: chiajung)

References

Details

Attachments

(2 files, 38 obsolete files)

32.50 KB, patch
chiajung
: review+
Details | Diff | Splinter Review
64.10 KB, patch
chiajung
: review+
Details | Diff | Splinter Review
Attached patch WIP (obsolete) — Splinter Review
Currently, we handle all ImageBridge IPC on Compositor thread, however, Compositor thread may busy on other task. We want to create a dedicated thread to handle the buffer allocation.

To separate them we can just Open the ImageBridge protocol on another thread, but there are several problem here:
(1) If we use only ImaegBridge to create GrallocBuffer, we can not pass it on LayerTransaction in current protocol design.
(2) To use the GrallocBuffer with more freedom, GrallocBuffer should only be used as IPC message not IPC actor(In fact, it is a weird actor that has no action between parent and child, and the actor does not show the buffer life cycle, since the underlying buffer has its own ref count)

Current implementation can not hold the sp in RecvCreateGrallocBuffer and cause some problem(the GraphicBuffer will released on return). Originally the GrallocBufferActor will hold it until actor and all reference die(actor is hold by top level protocol), so it is safe to send the GrallocBuffer by IPC, To solve my problem, I have 2 ideas:
(1) hold the GraphicBuffer in ImageBridgeParent, and Send another IPC message from Child to release it.
(2) follow Android implementation: return the buffer on another API call.(dequeueBuffer + requestBuffer in android::BufferQueue)
Attached patch working WIP need major clean up (obsolete) — Splinter Review
Attachment #8359112 - Attachment is obsolete: true
Component: General → Graphics: Layers
(In reply to Chiajung Hung [:chiajung] from comment #0)
> (1) If we use only ImaegBridge to create GrallocBuffer, we can not pass it
> on LayerTransaction in current protocol design.

I feel it should not the intent of the bug. If it is the problem. The solution should be buffer type independent. Not only for gralloc buffer.
In near futre, gfx layer is going to move to MozSurface. The MozSurface is a platform independent rendering buffer abstraction. It is similar to TexutreClient/TextureHost. Architectural consistency to MozSurface becomes necessary.
https://wiki.mozilla.org/Platform/GFX/Surfaces
I agree with sotaro, It's important to design things in a cross-platform way, otherwise a lot more effort needs to be carried afterwards to fit things in the other platforms.

The story of GrallocBufferActor as I understand it is that it's main purpose was to make sure we don't pass the gralloc buffer's file descriptor every time we refer to it in IPDL messages because its serialization/deserialization is expensive (or something along these lines). So the buffer is wrapped in an actor and we can refer to a given texture in IPDL messages through the actor.
PTexture does the same thing but for every texture types instead of just gralloc. This means that GrallocBufferActor is now redundant, but we haven't yet got around removing it.

I don't know for sure what you mean in (2) by using the buffer as IPC message instead of using an actor, but if you meant remove wrapping textures by  actors (such as PGrallocBufferActor and PTexture), this will cause problems if the IPDL protocols are chatty about the texture (if you reuse/update the texture) which is typically the case with content layers, or if the same video stream is being shown in several layers.

So now that we have actors around textures, the main limitation is that you can't use actors created for a certain IPDL channel with another channel.
Therefore, textures that will be used with ImageBirdge MUST be created by something related to ImageBridge (or at least its channel) while textures that will be used in main thread transactions MUST be created by something related to PLayerTransaction (or its PIDL channel). And you can't mix the two.
I don't think there is any easy way to get around this, and I don't think there is a real need to get around it anyway.

What would be nice is to not have to wait for the compositor thread to finish compositing to get an answer for the gralloc allocation request.
Sotaro, could you post a clear plan here? We should try to make progress here.
I agree to make the plan cross-platform. My real goal for this bug is to allocate buffer and not to wait for composition. I think the threading model is not discussed in the wiki page after read it briefly. 

My origin patch is only create a thread and make all allocation goes there. The reason the patch becomes big and complex is due to the actor binding problem :nical mentioned. Once we have a clever way to fix that(MozSurface?), my patch should be simple and clean.

It seems the MozSurface does not handle the actor-tree binding problem I met in my patch. To meet my origin goal, I think we will have to handle the actor-tree binding problem in the end. And I believe we may need some change to handle the buffer mapping(to avoid GraphiaBuffer serialization and not depend on IPDL management) that similar to what I had done in the patch. If that is true, I feel this patch is just like a half way solution toward MozSuface.
Can we do this incrementally and land this and then morph it into MozSurface?
Blocks: b2g-tiling
Status: NEW → RESOLVED
Closed: 10 years ago
Resolution: --- → DUPLICATE
(In reply to Andreas Gal :gal from comment #8)
> 
> *** This bug has been marked as a duplicate of bug 939348 ***

I read the patch in bug 939348, that patch handles the actor-tree binding problem by create another actor to wrap the original one, but still call "new GraphicBuffer" on CompositorThread. If that patch landed, I can fix the threading part in this bug later with a small patch.
Where do we call new GraphicBuffer on the compositor thread with that patch?
(In reply to Andreas Gal :gal from comment #10)
> Where do we call new GraphicBuffer on the compositor thread with that patch?

Since ImageBridgeParent runs on CompositorThread now. And we found some case thatpatch may hard to handle:

CompositorThread: |<########Frame Time 1#########>|----|<########Frame Time 2#########>|
ImageParentThread:|<------------------------------|<##RecvAllocPGrallocBuffer##>---|
ImageBridgeChild: |<##SendAllocPGrallocBuffer##>|--------------------------------------|
CompositorChild:  |<---------------------------------------------------------------|<Create Wrapper actor>|

The Compositor may not respond the wrapper actor creation request until some frames later. And the ISurfaceAllocator request still wait a long time before return. 

I think the only solution for such problem is to not treat a buffer as a actor(remove PGrallocBuffer protocol, and try to use GraphicBuffer another way, which is the ugly part in my WIP patch).
We don't need PGrallocBuffer anymore (since PTexture landed). The reason PGrallocBuffer still exists is that it is still used all over gecko but we shoud fix that.
Ideally, one would create a cralloc buffer, put it in a GrallocTextureClientOGL only hold references to the TextureClient instead of holding references to the gralloc buffer or a PGrallocBuffer Actor. Then, the PTexture actor will be created lazily and asynchronously when we send the buffer to the compositor.
We really have to clean up the insane *Deprecated* mess in our code base. We keep adding more paths but we don't seem to be removing any.
(In reply to Nicolas Silva [:nical] from comment #12)
> We don't need PGrallocBuffer anymore (since PTexture landed). The reason
> PGrallocBuffer still exists is that it is still used all over gecko but we
> shoud fix that.
> Ideally, one would create a cralloc buffer, put it in a
> GrallocTextureClientOGL only hold references to the TextureClient instead of
> holding references to the gralloc buffer or a PGrallocBuffer Actor. Then,
We hold the reference in GrallocTextureClientOGL::mBuffer, right? 
> the PTexture actor will be created lazily and asynchronously when we send
> the buffer to the compositor.

PTexture state that it is a glue for TextureHost/TextureClient. The memory allocation of TextureClient is rely on  ISurfaceAllocator, so ideally, since PTexture use SurfaceDescriptor, we have to replace PGrallocBuffer to MaybeMagicGrallocBufferHandle in SurfaceDescriptorGralloc. (like my WIP)

Say the function named ImageBridgeParent::CreateGrallocBuffer, while implement the function, we may write
{
  sp<GraphicBuffer> result = new GraphicBuffer(); // refCount =1
  return true;
} // refCount = 0, buffer die.

To handle the problem, there are 4 possible solutions:
(1) PGrallocBuffer(current version, BAD)
(2) let PTexture allocate GraphicBuffer and hold the reference on Parent side(which is awkward, since PTexture should not hold any buffer, by design)
(3) hold all buffer created on ImageBridge(like my WIP)
(4) Android solution: create a temp binder object to hold the reference and drop it when IPC done(How to do that in our IPC, maybe use PGrallocBuffer, and kill the actor as soon as child side receive the buffer?):
https://android.googlesource.com/platform/frameworks/native/+/jb-dev/libs/gui/IGraphicBufferAlloc.cpp, Line 107 and Line 58
I am planning to create another top-level protocol, called PGrallocBufferManager, to handle buffer allocation/deallocation tracing memory usage and provide the ability to check buffer ownership while using GrallocBuffer cross process. And I will remove PGrallocBuffer protocol as well.

Basically my WIP patch is almost there, I will clean it up and make it done here.
A couple notes from our meeting:

Each content process will instantiate its own PGrallocBufferManager in the parent process, running on a dedicated thread per child (so children don't contend over allocation).

Allocate() will return a magic gralloc buffer handle, which it also registers in the PGrallocBufferManager, along with an index ("ref") of the gralloc buffer. The client will only send that ref to the parent when talking about this buffer.

The child gets the magic gralloc handle and keeps that via sp<> alive in any PTexture referencing that buffer. The actual PTexture protocol uses the "ref" (integer index) when talking about the buffer.

On the child side the "ref" will be wrapped into a reference counted object to make sure we probably notify the parent in the destructor to drop the buffer (Dealloc(ref)).

If the child dies, the dedicated PGrallocBufferManager is destroyed and in its destructor it frees all currently pending gralloc buffers that the parent holds for the child.

Each PGrallocBufferManager holds the list of buffers for that client and the "ref" is relative to this list. This serves as security check. Children can't refer with a "ref" to any other gralloc buffer than what they allocated themselves.
The smart object wrapper around "ref" in the client is what should hold the sp<> pointer as well. That will guarantee sane lifetime management. Maybe "SurfaceDescriptorGralloc" is the right thing for this.
Sounds like a great plan.
Please keep me in the loop, for reviews in particular.

Most of the deprecated mess removal is now blocked on getting ThebesLayers to play well with the new Textures without regressing perfs, which is getting very close now. For non-thebes stuff, we are always using the new textures (I know that video/GonkNativeWindow has its share of gralloc-specific complications, so no need to consider deprecated textures in video code anymore).
Some note here if anyone want to try my WIP: 
Camera/Video(MP4) will crash system since ImageBridgeParent Thread trying to use GLContext on other thread than CompositorThread when RecvUpdate. Since GLContext is kind of thread local. The root cause is mentioned by :nical in comment 21 of Bug 939348. A common error message of this is:

01-28 09:34:11.117   494   524 E Adreno200-EGL: <qeglDrvAPI_eglMakeCurrent:2693>: EGL_BAD_ACCESS
01-28 09:34:11.117   494   524 E libEGL  : eglMakeCurrent:678 error 3002 (EGL_BAD_ACCESS)

I am cleaning up the WIP, and next WIP will not contain threading change or make it thread-safe to make Camera happy.
Status: RESOLVED → REOPENED
Resolution: DUPLICATE → ---
Attached patch working WIP, fixed Camera app (obsolete) — Splinter Review
Remove threading changes, which fix the camera crash on B2G process. Since the allocation is still on CompositorThread, this is just a temporary patch to demonstrate the basic idea of how PGrallocBufferManager works.
As described in the bug comments this creates a per-content process thread in the parent to allocate gralloc buffers. The parent returns a pair (reference, buffer). The reference should be sent in all future communication to the parent. The parent maintains a mapping reference->buffer-in-parent so it can look up which buffer was meant. The child has to explicitly deallocate the buffer. For this we introduce a new abstraction GrallocBuffer which should be helpd alive in the child for as long as the buffer is needed. Its ref counted and will send a message to the parent to deallocate the buffer there once the last refernece is dropped. We have to thread GrallocBuffer through the client side to make sure its kept alive properly. GrallocBuffer should be converted to a android::GraphicBuffer only right at use. We shouldn't hold references to that GraphicBuffer on the heap.
Attachment #8361574 - Attachment is obsolete: true
Attachment #8366470 - Attachment is obsolete: true
Attached patch My WIP version (obsolete) — Splinter Review
This WIP should work for most case except camera. And it is still not complete.

However, I want to document some detail about why the patch so complex here:

The goal of this patch is to make GraphicBuffer a pure IPC message rather than actor, but there are some problem during implement:

(1) IPC limits fd to 4 for each Message, and each GraphicBuffer has at least 2 fd inside: The LayerTransactions may carry more than possible and crashes IPC. To change that, I will have to redesign the whole layer system. To overcome the problem, the Android design comes to my mind: serialize once and use index to indicating which buffer to use. 

(2)Android IPC is a single way IPC, it always goes from client(Bp) to server(Bn) but we may try to Send__delete from Host side[1], which is problematic in this patch, since the buffer does not associate to the actor in any form, there is no way to send such message[2] to child side. I will fix this in next patch. I will try to support it rather than fix our buffer management, which seems much more complex than this patch. 

[1]http://dxr.mozilla.org/mozilla-central/source/gfx/layers/composite/ContentHost.h#440 and http://dxr.mozilla.org/mozilla-central/source/gfx/layers/composite/TextureHost.cpp#297
[2]http://dxr.mozilla.org/mozilla-central/source/gfx/layers/ipc/ShadowLayerUtilsGralloc.cpp#371

And since I didn't notice :gal's patch above, I will fix my version first, and if his version is preferred, I will complete this work based on his patch.
Flags: needinfo?(nical.bugzilla)
Flags: needinfo?(gal)
Wow, this is indeed quite a patch! I didn't have the courage to thoroughly read it, so my following comments may be somewhat off.

(In reply to Chiajung Hung [:chiajung] from comment #22)
> The goal of this patch is to make GraphicBuffer a pure IPC message rather
> than actor, but there are some problem during implement:
> 
> (1) IPC limits fd to 4 for each Message, and each GraphicBuffer has at least
> 2 fd inside: The LayerTransactions may carry more than possible and crashes
> IPC. To change that, I will have to redesign the whole layer system. To
> overcome the problem, the Android design comes to my mind: serialize once
> and use index to indicating which buffer to use. 

Indeed, passing a gralloc FD pair back and forth between processes in each transaction is not a good idea. We do have IPDL actors to that effect, though. Sotaro is about to enable the new ContentClient for B2G so we will not be using the deprecated textures on b2g very soon. This means that you can use TextureClient/TextureHost which use the PTexture protocol.

If I remember Andreas' plan correctly, the idea was to create the GraphicBuffer on the compositor process (for security reasons), store it in a map, send a message to the client side that contains the serialized gralloc buffer FDs along with the ID of the buffer in the map.

At this point on the parent side we have a GraphicBuffer and an ID, and on the child side we have also a GraphicBuffer and an ID. this mean that any message between the child and the host, that needs to refer to that GraphicBuffer, can just use the ID instead and not suffer from the overhead and limits around FDs. Andreas's plan is more detailed in this area, but the important part I am underlining here is that we only ever need to serialize the FD once.

Now with the non-deprecated TextureClient/Host, we have the PTexture protocol that has a similar role to this gralloc ID. When a GrallocTextureClientOGL wants to be shared shared with the compositor side, it creates an PTextureChild/Parent pair, passing a SurfaceDescriptor in the constructor message. In this message you then only need to pass the ID since the GraphicBuffer already exists on both sides. Then when the rest of layers code want to refer to that texture across IPC, it will use the PTextre actor the same way it does for all other texture types (shmem, D3D11 textures, etc.).

Where I am getting at, is that PGrallocBufferActor is an obsolete abstraction. PTexture fills the same role, except that it does it uniformly on all texture types and all platforms which greatly simplifies cross platform layers code. So if you are massively rewriting gralloc buffer management, I would suggest that you remove PGrallocBufferActor in the process because it is now pretty useless. It will be much easier to remove it once Sotaro lands the ContentClient patch (that got R+'ed this morning). If removing PGrallocBuffer actor is adding a lot of work, then my suggestion is to rely on it as little as possible so that we can later remove it.

> 
> (2)Android IPC is a single way IPC, it always goes from client(Bp) to
> server(Bn) but we may try to Send__delete from Host side[1], which is
> problematic in this patch, since the buffer does not associate to the actor
> in any form, there is no way to send such message[2] to child side.

Once the GraphicBuffer is wrapped in a TextureClient/Host pair you do have an actor (PTexture) which has all the __delete__ ActorDestroy and friends, that is needed to support managing the lifetime of the buffer, and recover from child side crashes. Also the PGralocBufferManager protocol can manage the Child crash scenario in its own ActorDestroy method (which is better because a child process crash could happen before the PTexture pair is setup).


> I will
> fix this in next patch. I will try to support it rather than fix our buffer
> management, which seems much more complex than this patch. 
> 
> [1]http://dxr.mozilla.org/mozilla-central/source/gfx/layers/composite/
> ContentHost.h#440 and
> http://dxr.mozilla.org/mozilla-central/source/gfx/layers/composite/
> TextureHost.cpp#297
> [2]http://dxr.mozilla.org/mozilla-central/source/gfx/layers/ipc/
> ShadowLayerUtilsGralloc.cpp#371
> 
> And since I didn't notice :gal's patch above, I will fix my version first,
> and if his version is preferred, I will complete this work based on his
> patch.
Flags: needinfo?(nical.bugzilla)
I feel that removing PGrallocBufferActor should be outside the scope of this bug. At first, this problem could be fixed by PTexture protocol and PGrallocBufferActor. If we want to change the way of architecture, it should be step by step. The patch try to do a lot of thing at once. Removing PGrallocBufferActor should be handled by other bug as a context of MozSurface.
@nical,

Yes, in my current patch, SurfaceDescriptor allocated from GrallocBufferManager does not contain the real buffer but an integer(index). The problem here is that many class try to store the underlying buffer or the PGrallocActor in origin design by GrallocBufferActor::GetFrom. For example, GonkNativeWindow and GrallocTextureClient/Host. And I just want leave some detail about it to make any other not familiar with this part easy to read. I think I can file another bug for "Not store sp<GraphicBuffer> and PGrallocBufferActor except the buffer owner(TextureClient/TextureHost)" and make it blocks this one.

And as the buffer management design seems prefer release buffer from parent side, Send__delete from parent side should be supported. To do that, in my next patch, all buffer reference(index) runs on IPC channel will have a origin/owner field to make it possible to find which ManagerParent should Send__delete, and this make check the LayerTransaction/ImageBridge buffer usage very simple: If the BufferManagerParent get a buffer but can not find it in his buffer list, than bail out.

@sotaro
I agree kill the PGrallocBufferActor should outside the scope of this bug, but if we remove it and no good displacement for it, B2G will just not work or runs slowly. Remove PGrallocBufferActor and find a displacement is quiet complex and I think my previous patch maybe almost minimal to make it. The only problem of that patch is we still Send__delete from Parent side by DeprecatedTextureHost, which should hopefully fixed by your patch in another bug.

Correct me if I am wrong: I think PTexture is created from Child side to associate TextureHost/Client pair. And once we switch to new TextureClient, TextureHost will be created only when TextureParent::RecvInit which carries the buffer to be shared. And PTexture must be deleted from parent side from the IPDL, but I can not find any point really call to it by search DXR. Assuming that I missed something, someone calls PTextureParent::Send__delete, then how can we delete the buffer from the right BufferManager(GrallocBridge in gal's version) pairs who really hold the buffer?
(In reply to Chiajung Hung [:chiajung] from comment #25)
> @nical,
> 
> Yes, in my current patch, SurfaceDescriptor allocated from
> GrallocBufferManager does not contain the real buffer but an integer(index).
> The problem here is that many class try to store the underlying buffer or
> the PGrallocActor in origin design by GrallocBufferActor::GetFrom. For
> example, GonkNativeWindow and GrallocTextureClient/Host.

GrallocTextureClientOGL only store the GrallocBufferActor to not leak it, but it effectively only use it once to get the android::GraphicBuffer and then it will use the GraphicBuffer directly.
GonkNativeWindow is probably trickier, though.

> And I just want
> leave some detail about it to make any other not familiar with this part
> easy to read. I think I can file another bug for "Not store
> sp<GraphicBuffer> and PGrallocBufferActor except the buffer
> owner(TextureClient/TextureHost)" and make it blocks this one.

good idea

> 
> And as the buffer management design seems prefer release buffer from parent
> side, Send__delete from parent side should be supported. To do that, in my
> next patch, all buffer reference(index) runs on IPC channel will have a
> origin/owner field to make it possible to find which ManagerParent should
> Send__delete, and this make check the LayerTransaction/ImageBridge buffer
> usage very simple: If the BufferManagerParent get a buffer but can not find
> it in his buffer list, than bail out.

sounds good

> 
> @sotaro
> I agree kill the PGrallocBufferActor should outside the scope of this bug,
> but if we remove it and no good displacement for it, B2G will just not work
> or runs slowly. Remove PGrallocBufferActor and find a displacement is quiet
> complex and I think my previous patch maybe almost minimal to make it. The
> only problem of that patch is we still Send__delete from Parent side by
> DeprecatedTextureHost, which should hopefully fixed by your patch in another
> bug.

Now that Sotaro landed bug 947620, I think we don't use GrallocDeprecatedTextureClient anymore. I think you don't have to worry about the deprecated stuff anymore.

> 
> Correct me if I am wrong: I think PTexture is created from Child side to
> associate TextureHost/Client pair.

Yes

> And once we switch to new TextureClient,
> TextureHost will be created only when TextureParent::RecvInit which carries
> the buffer to be shared. And PTexture must be deleted from parent side from
> the IPDL

Yes

The child side sends the PTexture::RemoveTexture message and the host side sends __delete__ when it receives it.
Depends on: 972772
(In reply to Nicolas Silva [:nical] from comment #26)
> (In reply to Chiajung Hung [:chiajung] from comment #25)
> > @nical,
> > 
> > Yes, in my current patch, SurfaceDescriptor allocated from
> > GrallocBufferManager does not contain the real buffer but an integer(index).
> > The problem here is that many class try to store the underlying buffer or
> > the PGrallocActor in origin design by GrallocBufferActor::GetFrom. For
> > example, GonkNativeWindow and GrallocTextureClient/Host.
> 
> GrallocTextureClientOGL only store the GrallocBufferActor to not leak it,
> but it effectively only use it once to get the android::GraphicBuffer and
> then it will use the GraphicBuffer directly.
> GonkNativeWindow is probably trickier, though.
> 
> > And I just want
> > leave some detail about it to make any other not familiar with this part
> > easy to read. I think I can file another bug for "Not store
> > sp<GraphicBuffer> and PGrallocBufferActor except the buffer
> > owner(TextureClient/TextureHost)" and make it blocks this one.
> 
> good idea
it is 972772 
> 
> > 
> > And as the buffer management design seems prefer release buffer from parent
> > side, Send__delete from parent side should be supported. To do that, in my
> > next patch, all buffer reference(index) runs on IPC channel will have a
> > origin/owner field to make it possible to find which ManagerParent should
> > Send__delete, and this make check the LayerTransaction/ImageBridge buffer
> > usage very simple: If the BufferManagerParent get a buffer but can not find
> > it in his buffer list, than bail out.
> 
> sounds good
> 
> > 
> > @sotaro
> > I agree kill the PGrallocBufferActor should outside the scope of this bug,
> > but if we remove it and no good displacement for it, B2G will just not work
> > or runs slowly. Remove PGrallocBufferActor and find a displacement is quiet
> > complex and I think my previous patch maybe almost minimal to make it. The
> > only problem of that patch is we still Send__delete from Parent side by
> > DeprecatedTextureHost, which should hopefully fixed by your patch in another
> > bug.
> 
> Now that Sotaro landed bug 947620, I think we don't use
> GrallocDeprecatedTextureClient anymore. I think you don't have to worry
> about the deprecated stuff anymore.

So, do you think it is a good idea to merge my previous patch to remove PGrallocBuffer protocol in another bug, and concentrate on implementation of new protocol here?
> 
> > 
> > Correct me if I am wrong: I think PTexture is created from Child side to
> > associate TextureHost/Client pair.
> 
> Yes
> 
> > And once we switch to new TextureClient,
> > TextureHost will be created only when TextureParent::RecvInit which carries
> > the buffer to be shared. And PTexture must be deleted from parent side from
> > the IPDL
> 
> Yes
> 
> The child side sends the PTexture::RemoveTexture message and the host side
> sends __delete__ when it receives it.

I read it and found PTexture is quiet handy. The PTexture is a "link" between TextureClient/Host, and once we do not need the link, just drop the link and keep the buffer until TextureClient go out of reference.
No longer depends on: 972772
Depends on: 972772
(In reply to Chiajung Hung [:chiajung] from comment #27)
> So, do you think it is a good idea to merge my previous patch to remove
> PGrallocBuffer protocol in another bug, and concentrate on implementation of
> new protocol here?

I think you should do it the way that makes sense / that is practical for you. Sorry I didn't mean to be too invasive and dictate how you should proceed. I originally mentioned the PGrallocBufferActor business because it is something that we want to get rid of at some point so I wanted to make sure you knew there was the more recent and better system (PTexture and friends) and that you didn't base your work on PGrallocBuffer too much.

It's indeed a good idea to treat the removal of PGrallocBufferActor and the addition of GrallocBufferManager as separate bugs if they don't completely overlap.
My only concern is just about to create huge module only for b2g. If there is a enough reason for it, it is OK. But gfx layer is gecko's system, therefore we have to use gecko's platform independent framework as much as possible. In current architecture, PTexture is a correct tool to solve this problem. And need to try to not make huge change at once. By divide the problem to small problems, we can discuss how we can fix the problem simpler.
Use my patch if you want. I think its a bit cleaner in a few areas and address the problem of holding the gralloc buffer on the compositor side until we took a ref on it in the client. I recommend merging the parts that make sense. Otherwise the rule is: if you pick up the work, you decide what you do. Feel free to just obsolete my patch if that seems easiest. Thanks for doing this work!
Flags: needinfo?(gal)
(In reply to Sotaro Ikeda [:sotaro] from comment #29)
> My only concern is just about to create huge module only for b2g. If there
> is a enough reason for it, it is OK. But gfx layer is gecko's system,
> therefore we have to use gecko's platform independent framework as much as
> possible. In current architecture, PTexture is a correct tool to solve this
> problem. And need to try to not make huge change at once. By divide the
> problem to small problems, we can discuss how we can fix the problem simpler.

I will split this patch into several parts before review:
(1) protocol implementation
(2) use new protocol instead PCompositor/PImageBridge
(3) remove gralloc allocation ability from PCompositor/PImageBridge
(4) remove PGrallocBuffer and GrallocBufferActor

The problem here is not B2G-specific, but only B2G suffer from it currently. I think once OOP is enabled on other platform, the Shmem object others use may have similar problem then. Then we can make this protocol a centralized ISurfaceAllocator for all platform. (Though I currently have no idea about how to share Shmem object cross protocol like Gralloc)

PTexture is currently bind to its managers like PGrallocBuffer now. So I think it can not solve the allocation problem but OK for buffer sharing problem. 

Another idea to solve the problem would be proxy the PCompositor's message thread, and make Composition thread dedicate for composition. And let the MessageProxy thread dispatch tasks to different handler thread besed on the message type, which need handle some sync problem.
Comment #31 looks good. This is a strict improvement and we can make it more general later.
(In reply to Chiajung Hung [:chiajung] from comment #31)
> (In reply to Sotaro Ikeda [:sotaro] from comment #29)
> > My only concern is just about to create huge module only for b2g. If there
> > is a enough reason for it, it is OK. But gfx layer is gecko's system,
> > therefore we have to use gecko's platform independent framework as much as
> > possible. In current architecture, PTexture is a correct tool to solve this
> > problem. And need to try to not make huge change at once. By divide the
> > problem to small problems, we can discuss how we can fix the problem simpler.
> 
> I will split this patch into several parts before review:
> (1) protocol implementation
> (2) use new protocol instead PCompositor/PImageBridge
> (3) remove gralloc allocation ability from PCompositor/PImageBridge
> (4) remove PGrallocBuffer and GrallocBufferActor

souds good!

> 
> The problem here is not B2G-specific, but only B2G suffer from it currently.
> I think once OOP is enabled on other platform, the Shmem object others use
> may have similar problem then. Then we can make this protocol a centralized
> ISurfaceAllocator for all platform. (Though I currently have no idea about
> how to share Shmem object cross protocol like Gralloc)

I don't think we can do that with our current shmem abstraction (it's sorta baked in IPDL) but we'll fix that problem if and when it turns out to be a problem :). You could maybe rename PGrallocBufferManager into something more generic such as PBufferAllocationManager or whatever sounds right to you, if the intent is to make the protocol not b2g-specific.

> 
> PTexture is currently bind to its managers like PGrallocBuffer now. So I
> think it can not solve the allocation problem but OK for buffer sharing
> problem. 

Yes, PTexture is meant to help with managing the lifetime of shared textures, it doesn't help with allocating on a separate thread.

> 
> Another idea to solve the problem would be proxy the PCompositor's message
> thread, and make Composition thread dedicate for composition. And let the
> MessageProxy thread dispatch tasks to different handler thread besed on the
> message type, which need handle some sync problem.

That would be very hard to do without breaking a lot of code. The whole layer transaction model is based on the idea that there is no composition happening when transactions are applied. So we'd need to do some synchronization between the compositor and the proxy that would probably void some of the advantages of proxying in the first place.
Attached patch Full WIP (obsolete) — Splinter Review
This patch implement the full function:
(1) Implementation for new protocol
(2) Replace the old path with new protocol
(3) Remove old path
(4) Remove old protocol

It needs major clean up and rebase and split up. The difference from previous 1 is: 
(1) Add owner id into GrallocBufferRef messages (which is useful to check buffer ownership, and make it easy to track the gralloc buffer usage for each process)
(2) Make delete "both:" compatible

Next version may rebase to newer code base, and/or hook to memory report as :gal does.
Attachment #8375330 - Attachment is obsolete: true
It might be better that GrallocBridgeChild/GrallocBridgeParent are platform independent name.
Attached patch Full WIP--Rebased (obsolete) — Splinter Review
Rebase to newer codebase, next version will be a cleaned up version and or splitted up and protocol name changing. And some work may goes to preliminary bug.
I'm working on another patch where I'm perf-bound to graphic buffer allocation.  I applied the WIP patch here, and it seems to work well, except that something is still holding on to GraphicBuffers even after a call to Parent::DropGraphicBuffer.  I added some refcount logging, and often see:

E/GrallocManagerParent( 1944): Dropping index: 244 pre ref 3
E/GrallocManagerParent( 1944): Dropping index: 244 post ref 2, and sending
E/GrallocManagerChild( 2295): Dropping bufferIndex: 244 pre ref 2
E/GrallocManagerChild( 2295): Dropping bufferIndex: 244 post ref 1
...
E/GrallocManagerParent( 1944): Dropping index: 280 pre ref 4
E/GrallocManagerParent( 1944): Dropping index: 280 post ref 3, and sending
E/GrallocManagerChild( 1944): Dropping bufferIndex: 280 pre ref 4
E/GrallocManagerChild( 1944): Dropping bufferIndex: 280 post ref 3

"post ref" is the refcount after it was removed from the list; it should be 1, which is the sp<GraphicBuffer> that I added local to that function so that I can look at the ref.  So the child will delete the GraphicBuffer right away, the parent will not because something still has a ref.

Note that I am creating and deleting *lots* of GraphicBuffers in quick succession here.  So it's going to exercise all the edge cases of this.  The good news is that all of this is very fast, much faster than going through the compositor.  My testcase on a Nexus 4 easily hits 60fps for the 3-4 seconds before it crashes, compared to ~35fps before this patch.
Should GrallocBufferManagerChild::DropGrallocBuffer remove from mBuffers?  It doesn't currently, seems like that would cause the buffer to always be kept on the child if DropGrallocBuffer is called on the child (since all it does is send the message to remove it on the parent).
(In reply to Vladimir Vukicevic [:vlad] [:vladv] from comment #38)
> Should GrallocBufferManagerChild::DropGrallocBuffer remove from mBuffers? 
> It doesn't currently, seems like that would cause the buffer to always be
> kept on the child if DropGrallocBuffer is called on the child (since all it
> does is send the message to remove it on the parent).

Ooops, looks like I forget something there :)

Thank for your great report, I will fix it in next WIP version.
Next WIP version will contain the fix and/or hook to memory report (maybe count by process).
Attached patch Full WIP2 (obsolete) — Splinter Review
Fix the leak :vlad mentioned above, and hook to memory report.

As I have no idea about how to make memory report more precise now, I leave it same with before(count globally).
Attachment #8369215 - Attachment is obsolete: true
Attachment #8377394 - Attachment is obsolete: true
Attachment #8379614 - Attachment is obsolete: true
Blocks: 950112
blocking-b2g: --- → 1.5?
Great, thanks Chiajung!  One more bug report.  With your patch applied, I'm seeing a crash when a buffer that I should have ends up being null in the list.  With tiling on, I'm also seeing an occasional memory corruption crash, but let's ignore that one for now (though it could be related).

https://pastebin.mozilla.org/4391718 has a log.  Basically, the client tries to get buffer 223, which it already knows about, but it returns NULL.  It received it before and added it to its list, but for some reason it's null by the time it tries to get it.
Oh I should say -- steps to reproduce, build nexus-4-kk, open settings app, scroll up/down in a long list (e.g. Developer settings).
Also, in the log above -- 3440 is the b2g/compositor, 3762 is the client app.  So the unexpected NULL is on the compositor side.
More detail on that issue:

E/GrallocManagerParent( 7239): allocated buffer id 202 (768x2595 fmt 1 usage 307), client: 7573 - buffer 0xaed15680 - number of buffers for this client: 10
I/GrallocBufferRef serializer( 7239): Serialize for buffer:(7573, 198)
I/GrallocBufferManagerChild( 7573): Add buffer 202 to list (ptr 0xb0b33700)
I/Gralloc Utils( 7573): Try get buffer: 202 -> 0xb0b33700
I/GrallocBufferRef serializer( 7573): Serialize for buffer:(7573, 202)
I/GrallocBufferRef deserializer( 7573): Deserialize for buffer:(7573, 198)
E/GrallocManagerChild( 7573): Dropping bufferIndex: 198 pre ref 2 [parent fully done 1]
E/GrallocManagerChild( 7573): Dropping bufferIndex: 198 post ref 1
E/GrallocManagerParent( 7239): Dropping index: 197 refcnt 2
I/GrallocBufferRef serializer( 7239): Serialize for buffer:(7573, 197)
I/GrallocBufferRef deserializer( 7573): Deserialize for buffer:(7573, 197)
E/GrallocManagerChild( 7573): Dropping bufferIndex: 197 pre ref 2 [parent fully done 0]
E/GrallocManagerChild( 7573): Dropping bufferIndex: 197 post ref 1
I/GrallocBufferRef deserializer( 7239): Deserialize for buffer:(7573, 201)
E/GrallocBufferManagerParent( 7239): GetGraphicBuffer 201, mOwningBuffer size 7
I/Gralloc Utils( 7239): Try get buffer: 201 -> 0xaed15380
I/GrallocBufferRef deserializer( 7239): Deserialize for buffer:(7573, 202)
E/GrallocBufferManagerParent( 7239): GetGraphicBuffer 202, mOwningBuffer size 7
I/Gralloc Utils( 7239): Try get buffer: 202 -> 0x0

Note that when buffer 202 is allocated, the number of buffers for the client is 10.  That is, mOwningBuffer.size() == 10.  But later when we call GetGraphicBuffer(202), mOwningBuffer.size() == 7!  Indeed, if I inspect the mOwningBuffer vector, I find the missing entry for 202 a few items later in the vector.  But I can't tell what might have deleted things along the way!

Also -- GrallocBufferManagerParent::GetGraphicBuffer in the cpp file, it takes a int, not ProcessId.  (Should really be uint32_t as index everywhere, make 0 mean "no index".)
Mystery solved.  logcat -v threadtime tells all.

GraphicBuffer allocation only happens on the GrallocManagerParent thread.  However, any thread (e.g. compositor) can call DropGrallocBuffer!  If it does, more than one thread is trying to modify the mOwningBuffer array, and the world gets out of sync.  The easy fix is to have a mutex around mOwningBuffer access.  I'll attach a patch shortly that has all of my hacks on top of the current patch.
Attached patch Fixups, on top of WIP1 (obsolete) — Splinter Review
Here's my set of fixes on top of WIP #1.  This has:

- KitKat unflatten fixes
- Mutexes for Child/Parent when accessing the vector

- start of a "fully done" indicator; this can be done as a followup, it just got included in this diff.  (The idea of this is that if one side knows its refcnt is 1, it can tell the other side "hey I'm fully done with this"; the other side can then do smart things to perhaps reuse/cache the GraphicBuffer for a future allocation request.  Either side can indicate this, so the caches should be on both sides [or explicitly chosen to just be on one side, maybe via one of the sides "giving back" a buffer if it's fully done])

The patch looks good overall.  Code can use a lot cleanup, but you're planning on doing that anyway. :)  Some specific comments I'd like to see addressed in a cleanup...

- "mOwningBuffer" should just be renamed to "mBuffers".
- the usage of "index" throughout isn't really correct; it's a key, or a token, or something.  It's *not* an array index (specifically KeyedVector can obtain an index for a key).  We should not use the term "index" anywhere unless it's an explicit array index.  For this usage, I would use "id"/"key"/"token" everywhere.
- using android::sp and android::GraphicBuffer, so that you can type sp<GraphicBuffer> instead of needing to have android::sp<android::GraphicBuffer>
- I love the logging, but please add some nice #defines at the top so that you can write LOG("Something happened %d", someValue); instead of __android_log_print(LOG_ERROR, "GraphicBufferManagerParent", "Something hap.... every time.
- there's a good amount of convoluted code flow with lots of inner returns; try to avoid "else after return", e.g. prefer "if (foo) { return x; } doSomething();" instead of "if (foo) { return x; } else { doSomething(); }"

I'd be happy to review this, at least the manager pieces.
(In reply to Vladimir Vukicevic [:vlad] [:vladv] from comment #45)
> Mystery solved.  logcat -v threadtime tells all.
> 
> GraphicBuffer allocation only happens on the GrallocManagerParent thread. 
> However, any thread (e.g. compositor) can call DropGrallocBuffer!  If it
> does, more than one thread is trying to modify the mOwningBuffer array, and
> the world gets out of sync.  The easy fix is to have a mutex around
> mOwningBuffer access.  I'll attach a patch shortly that has all of my hacks
> on top of the current patch.
I think we wrap the DropGrallocBuffer calls to post the real work to Manager thread, since it will incur a IPC call later, which should bound to its own thread. This is another mistake in my patch, I just replace GrallocBufferActor::Send__delete to its new couterpart, but forget to handle the threading problem.

For index concern, I initially want it as a buffer index, but later I want to check whether the buffer passed on the IPC channel really belongs to that actor pair and the key for buffer must be globally counted otherwise buffer key 1 of process 123 can not be distinguished from key 1 of process 456. A better but cubersome way maybe store both key and array index in GrallocBufferRef. Then we can direct access the buffer by index and check ownership by key.

(In reply to Vladimir Vukicevic [:vlad] [:vladv] from comment #46)
> Created attachment 8381639 [details] [diff] [review]
> Fixups, on top of WIP1
> 
> Here's my set of fixes on top of WIP #1.  This has:
> 
> - KitKat unflatten fixes
> - Mutexes for Child/Parent when accessing the vector
> 
> - start of a "fully done" indicator; this can be done as a followup, it just
> got included in this diff.  (The idea of this is that if one side knows its
> refcnt is 1, it can tell the other side "hey I'm fully done with this"; the
> other side can then do smart things to perhaps reuse/cache the GraphicBuffer
> for a future allocation request.  Either side can indicate this, so the
> caches should be on both sides [or explicitly chosen to just be on one side,
> maybe via one of the sides "giving back" a buffer if it's fully done])
Then just create a follow up once this reviewed. I have a idea about a buffer recycler bin, too. Since we tend to allocate/release buffer rapidly with same configuration(size, format), it should be good to have a global recycler bin to shorten allocation time. However, I heard that gralloc driver should always did something like that, I want to investigate that part in a follow up bug.
> 
> The patch looks good overall.  Code can use a lot cleanup, but you're
> planning on doing that anyway. :)  Some specific comments I'd like to see
> addressed in a cleanup...
> 
> - "mOwningBuffer" should just be renamed to "mBuffers".
If you check my very first patch in this bug, you will find there is a sOutstandingBuffer, which store all sp<GraphicBuffer> statically. As I removed sOutstandingBuffer, I will change it back to mBuffers later.
> - the usage of "index" throughout isn't really correct; it's a key, or a
> token, or something.  It's *not* an array index (specifically KeyedVector
> can obtain an index for a key).  We should not use the term "index" anywhere
> unless it's an explicit array index.  For this usage, I would use
> "id"/"key"/"token" everywhere.
I will use nsTArray or std::map later, as this class will be change name to PSharedBufferManager, and maybe included in some other platform later, it should be better to not use android classes. And I will change the index word to key.

> - using android::sp and android::GraphicBuffer, so that you can type
> sp<GraphicBuffer> instead of needing to have
> android::sp<android::GraphicBuffer>
> - I love the logging, but please add some nice #defines at the top so that
> you can write LOG("Something happened %d", someValue); instead of
> __android_log_print(LOG_ERROR, "GraphicBufferManagerParent", "Something
> hap.... every time.

> - there's a good amount of convoluted code flow with lots of inner returns;
> try to avoid "else after return", e.g. prefer "if (foo) { return x; }
> doSomething();" instead of "if (foo) { return x; } else { doSomething(); }"
> 
Most such log things and convoluted code flow are being NS_ASSERTION later, since return false or some bad default value just delay the abort and make it hard to debug. That's why I need such many logs everywhere to check which part goes wrong. (The most annoying case I met in writing this patch is PLayer usually deserialized fail and make PLayerTransaction fail and incur ActorDestroy, which seems nothing to do with my change. And the cause is in fact something wrong in my serializer/deserializer changes.

> I'd be happy to review this, at least the manager pieces.
I am trying my best to split the patch, however, it is very cubersome for following reason:
I will change PGrallocBuffer to MaybeMagicGrallocBufferHandle in SurfaceDescriptorGralloc and NewSurfaceDescriptorGralloc, which make the piece to change this tightly bounded with old protocol removal, otherwise I have to create a piece that change "PGrallocBuffer buffer" to "PGrallocBuffer deprecatedBuffer" and fix corresponding parts, and remove all of them in next piece. <-- This is what I am going to do now (patch split, and major cleanup are at the same time).

For your KitKat fix, I will review what the difference between yours and my rebased and maybe take yours into my patch set later.
Attached patch WIP3, part1 (obsolete) — Splinter Review
I fixed following thread issues:
1. call SendXXX from non-worker thread
2. access non-threadsafe container without Mutex protection

And I changed the thread name to reflect which process it serves in Parent side.

This part contains only the protocol implementation. The only things missing I can tell now is that while Nuwa enabled, GrallocBufferManagerParent constructor will called twice each child process spawn, and I should use the same thread for them other than create 2 different thread (and it looks like 1 of them just idle).
Attachment #8381120 - Attachment is obsolete: true
Attached patch WIP3, part2 (obsolete) — Splinter Review
Remaining part. In some thread fix belongs to this part, and the Kitkat deserializer fix are taken from :vlad's patch.

This part need further splitting and clean up.
Attached patch V1,part1 (obsolete) — Splinter Review
Major change from WIP3:
(1) Change GrallocBufferManager to SharedBufferManager
(2) Avoid redundant thread creation
(3) Fix a child side deadlock
Attachment #8381992 - Attachment is obsolete: true
Attachment #8382742 - Flags: review?(vladimir)
Attached patch WIP4, part2 (obsolete) — Splinter Review
This part need clean up and split up, upload here for who want to test this work.
Attachment #8381994 - Attachment is obsolete: true
nical,

Do you think I need further split part2? To split it into a clearer form will need some extra work, and those work will be removed in later part. I will have to make SurfaceDescriptorGralloc contains a PGrallocBuffer deprecatedBuffer, and fix related part, which is time consuming and useless. If you feel that still need, I will continue to do that. Otherwise I will clean up current part2 for review.
Flags: needinfo?(nical.bugzilla)
(In reply to Chiajung Hung [:chiajung] from comment #52)
> nical,
> 
> Do you think I need further split part2? 

As far as I am concerned, you don't need to split part 2.
This is some exciting work, keep it up!
Flags: needinfo?(nical.bugzilla)
Comment on attachment 8382742 [details] [diff] [review]
V1,part1

Review of attachment 8382742 [details] [diff] [review]:
-----------------------------------------------------------------

This looks great overall!  There's a lot of comments around style and some nits, largely because this is great quality and you might as well polish it up.  The concrete comments are roughly:

- use Mutex, not Monitor
- lock the mutex only in the very specific chunks its needed (most important in the Manager allocation case)
- make Dealloc/Drop aysnc for callers from any thread
- consider making Alloc/Dealloc...Now() functions protected or private, no reason to tempt someone to call them

Only marking r- due to the above and that I'd like to see them fixed in a new version, but otherwise this part looks great. (And the updated version solved many of the problems I was having -- I did not catch the race conditions about doing allocations from another thread on the child!)

::: gfx/layers/ipc/SharedBufferManagerChild.cpp
@@ +32,5 @@
> +{
> +
> +}
> +
> +static bool InSharedBufferManagerChildThread() {

nit: newline after the return type, here and elsewhere in this file.

@@ +65,5 @@
> +
> +SharedBufferManagerChild* SharedBufferManagerChild::GetSingleton()
> +{
> +  return sSharedBufferManagerChildSingleton;
> +}

Any reason to not just put these two (GetThread/GetSingleton) in the header, and declare the statics as static members?  Might give the compiler more room to optimize.

@@ +137,5 @@
> +bool SharedBufferManagerChild::StartUpOnThread(Thread* aThread)
> +{
> +  NS_ABORT_IF_FALSE(aThread, "SharedBufferManager needs a thread.");
> +    if (sSharedBufferManagerChildSingleton == nullptr) {
> +    sSharedBufferManagerChildThread = aThread;

incorrect indentation in this function

@@ +150,5 @@
> +    sSharedBufferManagerChildSingleton->ConnectAsync(sSharedBufferManagerParentSingleton);
> +    return true;
> +  } else {
> +    return false;
> +  }

No "return after else" please.  But a better way to write this function and avoid unneeded indentation:

{
  if (aSharedBufferManagerChildSingleton != nullptr) {
    // we can only exist once, what's going on? should really assert or something
    return false;
  }

  ... other work ...
  return true;
}

@@ +281,5 @@
> +  MonitorAutoLock autoMon(barrier);
> +  bool done = false;
> +
> +  GetMessageLoop()->PostTask(FROM_HERE, NewRunnableFunction(&DeallocSurfaceDescriptorGrallocSync,
> +                                                            aBuffer, &barrier, &done));

Why does Dealloc need to be sync?  Can't we just post this runnable to the thread and move on?

@@ +298,5 @@
> +}
> +
> +bool SharedBufferManagerChild::AllocateGrallocBuffer(const IntSize& aSize, const uint32_t& aFormat, const uint32_t& aUsage, mozilla::layers::MaybeMagicGrallocBufferHandle* aHandle)
> +{
> +  __android_log_print(ANDROID_LOG_INFO, "BufMgrChild", "Try allocate buffer");

Helper #define for __android_log_print(ANDROID_LOG_INFO, "SharedBufferManagerChild", ...) please

@@ +307,5 @@
> +  MagicGrallocBufferHandle realHandle = handle.get_MagicGrallocBufferHandle();
> +  *aHandle = realHandle.mRef;
> +
> +  MonitorAutoLock lock(mBufferMonitor);
> +  mBuffers[realHandle.mRef.mKey] = realHandle.mGraphicBuffer;

For explicitness, put this in its own block, e.g.:

  {
    MonitorAutoLock lock(mBufferMonitor);
    mBuffers[...] = ...;
  }

  return true;

@@ +321,5 @@
> +
> +  MaybeMagicGrallocBufferHandle bufferHandle;
> +  if (aDesc.type() == SurfaceDescriptor::TSurfaceDescriptorGralloc)
> +    bufferHandle = aDesc.get_SurfaceDescriptorGralloc().buffer();
> +  else if (aDesc.type() ==SurfaceDescriptor::TNewSurfaceDescriptorGralloc)

missing space after ==

@@ +337,5 @@
> +  int bufferKey = handle.get_GrallocBufferRef().mKey;
> +
> +  MonitorAutoLock lock(mBufferMonitor);
> +  NS_ASSERTION(mBuffers.indexOfKey(bufferKey) >= 0, "Not my buffer");
> +  mBuffers.erase(bufferKey);

Same thing, { } around the chunk that is actually tracked by the lock

::: gfx/layers/ipc/SharedBufferManagerChild.h
@@ +35,5 @@
> +   */
> +  static bool StartUpOnThread(base::Thread* aThread);
> +
> +  /**
> +   * Destroys The SharedBufferManager protcol.

typo: "protocol"

@@ +84,5 @@
> +  bool
> +  AllocSurfaceDescriptorGrallocNow(const gfx::IntSize& aSize,
> +                                   const uint32_t& aFormat,
> +                                   const uint32_t& aUsage,
> +                                   SurfaceDescriptor* aBuffer);

I would make this protected, same with DeallocSurfaceDescriptorNow.  I don't see the manager thread doing allocations/deallocations that it cares about, and callers shouldn't be tempted to use the Now functions.  If someone is on the same thread somehow, they can just use the regular version which does the check to see if it's on the right thread already.

@@ +98,5 @@
> +  /**
> +   * Deallocate a remotely allocated gralloc buffer.
> +   */
> +  bool
> +  DeallocSurfaceDescriptorGralloc(const SurfaceDescriptor& aBuffer);

This can just return void, and be purely async (as mentioned in the cpp file comment).

@@ +117,5 @@
> +   * 
> +   * Must be called from the SharedBufferManagerChild thread.
> +   */
> +  bool 
> +  DropGrallocBuffer(const SurfaceDescriptor& aDesc);

Needs more documentation -- when are you supposed to call DeallocSurfaceDescriptorGralloc, and when are you supposed to call DropGrallocBuffer?   Also both take a SurfaceDescriptor, so the names really should match.

@@ +122,5 @@
> +  
> +  virtual bool RecvDropGrallocBuffer(const mozilla::layers::MaybeMagicGrallocBufferHandle& handle);
> +
> +  android::sp<android::GraphicBuffer>
> +  GetGraphicBuffer(int index);

nit: unlike the header file, this should all be one one line :)

@@ +133,5 @@
> +
> +protected:
> +  std::map<int, android::sp<android::GraphicBuffer> > mBuffers;
> +
> +  Monitor mBufferMonitor;

Why do you need a Monitor, and not just a Mutex?  (Monitor is a Mutex + condvar, but I don't ever see you signal the monitor.)  You should be able to just use a Mutex here.

::: gfx/layers/ipc/SharedBufferManagerParent.cpp
@@ +140,5 @@
> +}
> +
> +bool SharedBufferManagerParent::RecvAllocateGrallocBuffer(const IntSize& aSize, const uint32_t& aFormat, const uint32_t& aUsage, mozilla::layers::MaybeMagicGrallocBufferHandle* aHandle)
> +{
> +  MonitorAutoLock lock(mBuffersMonitor);

You don't need (shouldn't have) this lock around this entire function.  You only need it to allocate the sBufferKey and to assign to mBuffers[sBufferKey].  In particular, you want "new GraphicBuffer(...)" to happen outside of this lock so that they can run in parallel.

So you really want something like:

outgoingBuffer = new GraphicBuffer(...);

int key;
{
  MutexAutoLock lock(mBuffersMutex);
  key = ++sBufferKey;
  mBuffers[sBufferKey] = outgoingBuffer;
}

*aHandle = MagicGrallocBufferHandle(outgoingBuffer, GrallocBufferRef(mOwner, key));

@@ +141,5 @@
> +
> +bool SharedBufferManagerParent::RecvAllocateGrallocBuffer(const IntSize& aSize, const uint32_t& aFormat, const uint32_t& aUsage, mozilla::layers::MaybeMagicGrallocBufferHandle* aHandle)
> +{
> +  MonitorAutoLock lock(mBuffersMonitor);
> +  sp<GraphicBuffer> outgoingBuffer = new GraphicBuffer(aSize.width, aSize.height, aFormat, aUsage);

for sanity, put some blank lines around this line -- it's the most important call here, so we should be highlighting it

@@ +182,5 @@
> +
> +  if (PlatformThread::CurrentId() == mThread->thread_id())
> +    DropGrallocBufferImpl(aDesc);
> +  else
> +  {

style nit: use {} on both clauses, with "} else {" instead of on separate lines

::: gfx/layers/ipc/SharedBufferManagerParent.h
@@ +37,5 @@
> +
> +  virtual bool RecvAllocateGrallocBuffer(const IntSize&, const uint32_t&, const uint32_t&, mozilla::layers::MaybeMagicGrallocBufferHandle*);
> +  virtual bool RecvDropGrallocBuffer(const mozilla::layers::MaybeMagicGrallocBufferHandle& handle);
> +
> +  void DropGrallocBuffer(mozilla::layers::SurfaceDescriptor aDesc);

Did you lose interest in comments in this file? :)

@@ +38,5 @@
> +  virtual bool RecvAllocateGrallocBuffer(const IntSize&, const uint32_t&, const uint32_t&, mozilla::layers::MaybeMagicGrallocBufferHandle*);
> +  virtual bool RecvDropGrallocBuffer(const mozilla::layers::MaybeMagicGrallocBufferHandle& handle);
> +
> +  void DropGrallocBuffer(mozilla::layers::SurfaceDescriptor aDesc);
> +  MessageLoop* GetMessageLoop();

Shouldn't this be protected/private?

@@ +40,5 @@
> +
> +  void DropGrallocBuffer(mozilla::layers::SurfaceDescriptor aDesc);
> +  MessageLoop* GetMessageLoop();
> +  static SharedBufferManagerParent* GetInstance(ProcessId id);
> +  android::sp<android::GraphicBuffer> GetGraphicBuffer(int key);

add comments.. also a static GetGraphicBuffer(ProcessId id, int key) function would be convenient.

@@ +57,5 @@
> +   * Buffers owned by this SharedBufferManager pair
> +   */
> +  std::map<int, android::sp<android::GraphicBuffer> > mBuffers;
> +  
> +  static std::map<base::ProcessId, SharedBufferManagerParent*> sManagers;

add comments.. I would actually move all the static members to the top of the class, in their own separate block to show that there's a special class of functions for obtaining the right SharedBufferManagerParent

@@ +64,5 @@
> +  base::ProcessId mOwner;
> +  base::Thread* mThread;
> +  static int sBufferKey;
> +  Monitor mBuffersMonitor;
> +  static Monitor sManagerMonitor;

Mutex, not monitor (on both).
Attachment #8382742 - Flags: review?(vladimir) → review+
Blocks: 978631
Comment on attachment 8382742 [details] [diff] [review]
V1,part1

Review of attachment 8382742 [details] [diff] [review]:
-----------------------------------------------------------------

I will upload a updated version based on the review later.

Thanks for the great review!!

::: gfx/layers/ipc/SharedBufferManagerChild.cpp
@@ +32,5 @@
> +{
> +
> +}
> +
> +static bool InSharedBufferManagerChildThread() {

Done.

@@ +65,5 @@
> +
> +SharedBufferManagerChild* SharedBufferManagerChild::GetSingleton()
> +{
> +  return sSharedBufferManagerChildSingleton;
> +}

Most code in this class follows ImageBridge, and I think it is just for simplicity: if they are public static members, when you want to access them from those dispatched function, you will have to add class prefix in it, if they are protected/private static members, you will need make dispatched functions become member or create getters for them.

I choose to make both the statics and dispatched function protected member here.

@@ +137,5 @@
> +bool SharedBufferManagerChild::StartUpOnThread(Thread* aThread)
> +{
> +  NS_ABORT_IF_FALSE(aThread, "SharedBufferManager needs a thread.");
> +    if (sSharedBufferManagerChildSingleton == nullptr) {
> +    sSharedBufferManagerChildThread = aThread;

Done.
Follow the style mentioned below.

@@ +281,5 @@
> +  MonitorAutoLock autoMon(barrier);
> +  bool done = false;
> +
> +  GetMessageLoop()->PostTask(FROM_HERE, NewRunnableFunction(&DeallocSurfaceDescriptorGrallocSync,
> +                                                            aBuffer, &barrier, &done));

Just follow ImageBridge before, since the drop message of the protocol is async anyway, I changed this to be async.

@@ +298,5 @@
> +}
> +
> +bool SharedBufferManagerChild::AllocateGrallocBuffer(const IntSize& aSize, const uint32_t& aFormat, const uint32_t& aUsage, mozilla::layers::MaybeMagicGrallocBufferHandle* aHandle)
> +{
> +  __android_log_print(ANDROID_LOG_INFO, "BufMgrChild", "Try allocate buffer");

Helper added, and log removed.

@@ +307,5 @@
> +  MagicGrallocBufferHandle realHandle = handle.get_MagicGrallocBufferHandle();
> +  *aHandle = realHandle.mRef;
> +
> +  MonitorAutoLock lock(mBufferMonitor);
> +  mBuffers[realHandle.mRef.mKey] = realHandle.mGraphicBuffer;

Done.

@@ +321,5 @@
> +
> +  MaybeMagicGrallocBufferHandle bufferHandle;
> +  if (aDesc.type() == SurfaceDescriptor::TSurfaceDescriptorGralloc)
> +    bufferHandle = aDesc.get_SurfaceDescriptorGralloc().buffer();
> +  else if (aDesc.type() ==SurfaceDescriptor::TNewSurfaceDescriptorGralloc)

Done.

@@ +337,5 @@
> +  int bufferKey = handle.get_GrallocBufferRef().mKey;
> +
> +  MonitorAutoLock lock(mBufferMonitor);
> +  NS_ASSERTION(mBuffers.indexOfKey(bufferKey) >= 0, "Not my buffer");
> +  mBuffers.erase(bufferKey);

Done.

::: gfx/layers/ipc/SharedBufferManagerChild.h
@@ +35,5 @@
> +   */
> +  static bool StartUpOnThread(base::Thread* aThread);
> +
> +  /**
> +   * Destroys The SharedBufferManager protcol.

This typo must present in ImageBridgeChild. :p
Fixed here.

@@ +84,5 @@
> +  bool
> +  AllocSurfaceDescriptorGrallocNow(const gfx::IntSize& aSize,
> +                                   const uint32_t& aFormat,
> +                                   const uint32_t& aUsage,
> +                                   SurfaceDescriptor* aBuffer);

Done.

@@ +98,5 @@
> +  /**
> +   * Deallocate a remotely allocated gralloc buffer.
> +   */
> +  bool
> +  DeallocSurfaceDescriptorGralloc(const SurfaceDescriptor& aBuffer);

Done.

@@ +117,5 @@
> +   * 
> +   * Must be called from the SharedBufferManagerChild thread.
> +   */
> +  bool 
> +  DropGrallocBuffer(const SurfaceDescriptor& aDesc);

I make the DeallocSurfaceDescriptorGrallocNow replace the DropGrallocBuffer here.

@@ +122,5 @@
> +  
> +  virtual bool RecvDropGrallocBuffer(const mozilla::layers::MaybeMagicGrallocBufferHandle& handle);
> +
> +  android::sp<android::GraphicBuffer>
> +  GetGraphicBuffer(int index);

Done.

@@ +133,5 @@
> +
> +protected:
> +  std::map<int, android::sp<android::GraphicBuffer> > mBuffers;
> +
> +  Monitor mBufferMonitor;

Done.

::: gfx/layers/ipc/SharedBufferManagerParent.cpp
@@ +140,5 @@
> +}
> +
> +bool SharedBufferManagerParent::RecvAllocateGrallocBuffer(const IntSize& aSize, const uint32_t& aFormat, const uint32_t& aUsage, mozilla::layers::MaybeMagicGrallocBufferHandle* aHandle)
> +{
> +  MonitorAutoLock lock(mBuffersMonitor);

sBufferKey do not need any protection since no any other place access it. Fixed the style here.

@@ +141,5 @@
> +
> +bool SharedBufferManagerParent::RecvAllocateGrallocBuffer(const IntSize& aSize, const uint32_t& aFormat, const uint32_t& aUsage, mozilla::layers::MaybeMagicGrallocBufferHandle* aHandle)
> +{
> +  MonitorAutoLock lock(mBuffersMonitor);
> +  sp<GraphicBuffer> outgoingBuffer = new GraphicBuffer(aSize.width, aSize.height, aFormat, aUsage);

1-line under should be enough?

@@ +182,5 @@
> +
> +  if (PlatformThread::CurrentId() == mThread->thread_id())
> +    DropGrallocBufferImpl(aDesc);
> +  else
> +  {

Done.

::: gfx/layers/ipc/SharedBufferManagerParent.h
@@ +37,5 @@
> +
> +  virtual bool RecvAllocateGrallocBuffer(const IntSize&, const uint32_t&, const uint32_t&, mozilla::layers::MaybeMagicGrallocBufferHandle*);
> +  virtual bool RecvDropGrallocBuffer(const mozilla::layers::MaybeMagicGrallocBufferHandle& handle);
> +
> +  void DropGrallocBuffer(mozilla::layers::SurfaceDescriptor aDesc);

The comments are following ImageBridge, as ImageBridge does not comment them, I just leave it blank here.
Some comments added.

@@ +38,5 @@
> +  virtual bool RecvAllocateGrallocBuffer(const IntSize&, const uint32_t&, const uint32_t&, mozilla::layers::MaybeMagicGrallocBufferHandle*);
> +  virtual bool RecvDropGrallocBuffer(const mozilla::layers::MaybeMagicGrallocBufferHandle& handle);
> +
> +  void DropGrallocBuffer(mozilla::layers::SurfaceDescriptor aDesc);
> +  MessageLoop* GetMessageLoop();

No, it must be public, see Child's ConnectSharedBufferManager.

@@ +40,5 @@
> +
> +  void DropGrallocBuffer(mozilla::layers::SurfaceDescriptor aDesc);
> +  MessageLoop* GetMessageLoop();
> +  static SharedBufferManagerParent* GetInstance(ProcessId id);
> +  android::sp<android::GraphicBuffer> GetGraphicBuffer(int key);

Done.

@@ +57,5 @@
> +   * Buffers owned by this SharedBufferManager pair
> +   */
> +  std::map<int, android::sp<android::GraphicBuffer> > mBuffers;
> +  
> +  static std::map<base::ProcessId, SharedBufferManagerParent*> sManagers;

Done.

@@ +64,5 @@
> +  base::ProcessId mOwner;
> +  base::Thread* mThread;
> +  static int sBufferKey;
> +  Monitor mBuffersMonitor;
> +  static Monitor sManagerMonitor;

Done.
Attached patch V2, part1 (obsolete) — Splinter Review
V2, based on the review above.
Attachment #8382742 - Attachment is obsolete: true
Attachment #8384397 - Flags: review?(vladimir)
Attached patch V1, part2 (obsolete) — Splinter Review
Cleaned up. Maybe need rebase, I will start a rebase later.
Attachment #8382743 - Attachment is obsolete: true
Attachment #8384432 - Flags: review?(nical.bugzilla)
Attached patch V1,part2,rebased (obsolete) — Splinter Review
Rebased to newest version yesterday.
Attachment #8384432 - Attachment is obsolete: true
Attachment #8384432 - Flags: review?(nical.bugzilla)
Attachment #8384992 - Flags: review?(nical.bugzilla)
Comment on attachment 8384397 [details] [diff] [review]
V2, part1

Review of attachment 8384397 [details] [diff] [review]:
-----------------------------------------------------------------

This is great work. Thanks!

::: gfx/layers/ipc/SharedBufferManagerChild.cpp
@@ +1,4 @@
> +/*
> + * SharedBufferManagerChild.cpp
> + *
> + *  Created on: Feb 5, 2014

Please use the uniform copyright header above. We don't add filenames and author names any more. The version control system tracks that.

@@ +25,5 @@
> +using namespace mozilla::gfx;
> +
> +namespace mozilla {
> +namespace layers {
> +// Singleton

New line before the //

@@ +58,5 @@
> +}
> +
> +// dispatched function
> +static void
> +ConnectSharedBufferManager(SharedBufferManagerChild * child, SharedBufferManagerParent * parent)

no space between * and child and * parent

@@ +193,5 @@
> +
> +MessageLoop *
> +SharedBufferManagerChild::GetMessageLoop() const
> +{
> +  return sSharedBufferManagerChildThread->message_loop();

can this be null here before we spun up the thread?

@@ +195,5 @@
> +SharedBufferManagerChild::GetMessageLoop() const
> +{
> +  return sSharedBufferManagerChildThread->message_loop();
> +}
> +

no double newline here

@@ +288,5 @@
> +  if (aDesc.type() == SurfaceDescriptor::TSurfaceDescriptorGralloc)
> +    bufferHandle = aDesc.get_SurfaceDescriptorGralloc().buffer();
> +  else if (aDesc.type() == SurfaceDescriptor::TNewSurfaceDescriptorGralloc)
> +    bufferHandle = aDesc.get_NewSurfaceDescriptorGralloc().buffer();
> +  NS_ASSERTION(bufferHandle.type() == MaybeMagicGrallocBufferHandle::TMagicGrallocBufferHandle, "shouldn't go this way");

this needs a better assertion message

@@ +304,5 @@
> +  MaybeMagicGrallocBufferHandle handle;
> +  SendAllocateGrallocBuffer(aSize, aFormat, aUsage, &handle);
> +  if (handle.type() != MaybeMagicGrallocBufferHandle::TMagicGrallocBufferHandle)
> +    return false;
> +  MagicGrallocBufferHandle realHandle = handle.get_MagicGrallocBufferHandle();

why do you need realHandle? you could just use handle.get_MagicGrallocBufferHandle() below

::: gfx/layers/ipc/SharedBufferManagerChild.h
@@ +114,5 @@
> +
> +  virtual bool RecvDropGrallocBuffer(const mozilla::layers::MaybeMagicGrallocBufferHandle& handle);
> +
> +  android::sp<android::GraphicBuffer> GetGraphicBuffer(int index);
> +  

whitespace

::: gfx/layers/ipc/SharedBufferManagerParent.h
@@ +86,5 @@
> +  /**
> +   * Buffers owned by this SharedBufferManager pair
> +   */
> +  std::map<int, android::sp<android::GraphicBuffer> > mBuffers;
> +  

whitespace
It may take me a few days to get to this review -- BenWa did some work on caching/recycling the textureclients needed for tiling, so that removed some of the pressure for needing this patch for 1.4.  Would love to see it land every early in the next cycle though, especially in coordination with Sotaro's work in bug 950112 integrated... so that this interface here can be used to quickly allocate things either on JB+ (using 950112's approach), or via Compositor process on earlier, with no change needed for code.
Depends on: 979138
No longer depends on: 972772
(In reply to Vladimir Vukicevic [:vlad] [:vladv] from comment #60)
> It may take me a few days to get to this review -- BenWa did some work on
> caching/recycling the textureclients needed for tiling, so that removed some
> of the pressure for needing this patch for 1.4.  Would love to see it land
> every early in the next cycle though, especially in coordination with
> Sotaro's work in bug 950112 integrated... so that this interface here can be
> used to quickly allocate things either on JB+ (using 950112's approach), or
> via Compositor process on earlier, with no change needed for code.

Unfortunately, I found a bad blocker of this.
You can check Bug 979138 for detail.
Comment on attachment 8384397 [details] [diff] [review]
V2, part1

Review of attachment 8384397 [details] [diff] [review]:
-----------------------------------------------------------------

All done, uploading a newer version now.

::: gfx/layers/ipc/SharedBufferManagerChild.cpp
@@ +1,4 @@
> +/*
> + * SharedBufferManagerChild.cpp
> + *
> + *  Created on: Feb 5, 2014

Done.

@@ +25,5 @@
> +using namespace mozilla::gfx;
> +
> +namespace mozilla {
> +namespace layers {
> +// Singleton

Done.

@@ +58,5 @@
> +}
> +
> +// dispatched function
> +static void
> +ConnectSharedBufferManager(SharedBufferManagerChild * child, SharedBufferManagerParent * parent)

Done.

@@ +193,5 @@
> +
> +MessageLoop *
> +SharedBufferManagerChild::GetMessageLoop() const
> +{
> +  return sSharedBufferManagerChildThread->message_loop();

Done.

@@ +195,5 @@
> +SharedBufferManagerChild::GetMessageLoop() const
> +{
> +  return sSharedBufferManagerChildThread->message_loop();
> +}
> +

Done

@@ +288,5 @@
> +  if (aDesc.type() == SurfaceDescriptor::TSurfaceDescriptorGralloc)
> +    bufferHandle = aDesc.get_SurfaceDescriptorGralloc().buffer();
> +  else if (aDesc.type() == SurfaceDescriptor::TNewSurfaceDescriptorGralloc)
> +    bufferHandle = aDesc.get_NewSurfaceDescriptorGralloc().buffer();
> +  NS_ASSERTION(bufferHandle.type() == MaybeMagicGrallocBufferHandle::TMagicGrallocBufferHandle, "shouldn't go this way");

Done.

@@ +304,5 @@
> +  MaybeMagicGrallocBufferHandle handle;
> +  SendAllocateGrallocBuffer(aSize, aFormat, aUsage, &handle);
> +  if (handle.type() != MaybeMagicGrallocBufferHandle::TMagicGrallocBufferHandle)
> +    return false;
> +  MagicGrallocBufferHandle realHandle = handle.get_MagicGrallocBufferHandle();

Done.

::: gfx/layers/ipc/SharedBufferManagerChild.h
@@ +114,5 @@
> +
> +  virtual bool RecvDropGrallocBuffer(const mozilla::layers::MaybeMagicGrallocBufferHandle& handle);
> +
> +  android::sp<android::GraphicBuffer> GetGraphicBuffer(int index);
> +  

Done.
Attached patch V3, part1 (obsolete) — Splinter Review
Updated base on :gal's review
Attachment #8384397 - Attachment is obsolete: true
Attachment #8384397 - Flags: review?(vladimir)
Attachment #8385126 - Flags: review?(vladimir)
It seems better to compare with Bug 950112 before thinking about commit it. It did almost same thing, but seems like to get more performance.
Comment on attachment 8385126 [details] [diff] [review]
V3, part1

I am going to steal this review. I wrote essentially the same patch a few weeks ago, so I think I understand this code pretty well.
Attachment #8385126 - Flags: review?(vladimir) → review+
Comment on attachment 8384992 [details] [diff] [review]
V1,part2,rebased

Review of attachment 8384992 [details] [diff] [review]:
-----------------------------------------------------------------

::: gfx/layers/GrallocImages.cpp
@@ +101,4 @@
>    }
>  
>    sp<GraphicBuffer> graphicBuffer =
> +    GetGraphicBufferFromDesc(

Fits into one line.

::: gfx/layers/GrallocImages.h
@@ +149,5 @@
>        return nullptr;
>      }
>    }
>  
> +  virtual bool IsValid() { return GetSurfaceDescriptor().type() == SurfaceDescriptor::TSurfaceDescriptorGralloc || GetSurfaceDescriptor().type() == SurfaceDescriptor::TNewSurfaceDescriptorGralloc; }

split this in multiple lines please

::: gfx/layers/ipc/ISurfaceAllocator.h
@@ +117,4 @@
>    virtual void DestroySharedSurface(SurfaceDescriptor* aSurface);
>  
>    // method that does the actual allocation work
> +  virtual bool AllocGrallocBuffer(const gfx::IntSize& aSize,

why do we still need this?

::: gfx/layers/ipc/ImageBridgeParent.cpp
@@ +11,4 @@
>  #include "base/process_util.h"          // for OpenProcessHandle
>  #include "base/task.h"                  // for CancelableTask, DeleteTask, etc
>  #include "base/tracked.h"               // for FROM_HERE
> +#include "mozilla/gfx/Point.h"                   // for IntSize

indentation

::: gfx/layers/ipc/LayerTransactionChild.cpp
@@ +28,4 @@
>    // WARNING: |this| has gone to the great heap in the sky
>  }
>  
> +/*PGrallocBufferChild*

?

::: gfx/layers/ipc/LayersSurfaces.ipdlh
@@ +23,4 @@
>  
>  union MaybeMagicGrallocBufferHandle {
>    MagicGrallocBufferHandle;
> +  GrallocBufferRef;

This seems wrong. Why are we co-mingeling new buffers and buffer references into one path? When we allocate a new buffer, we should only get back MagicGrallocBufferHandle plus a ref. We register that handle locally and then only refer to the buffer with the ref.

@@ +82,4 @@
>  };
>  
>  struct NewSurfaceDescriptorGralloc {
> +  MaybeMagicGrallocBufferHandle buffer;

This should be GrallocBufferRef, only the allocation function itself should return a MagicGrallocBufferHandle.

@@ +93,4 @@
>  
>  // XXX - soon to be removed
>  struct SurfaceDescriptorGralloc {
> +  MaybeMagicGrallocBufferHandle buffer;

Dito here.

::: gfx/layers/ipc/ShadowLayerUtilsGralloc.cpp
@@ +41,4 @@
>  namespace IPC {
>  
>  void
> +ParamTraits<GrallocBufferRef>::Write(Message* aMsg,

We can do this in the protocol directly so we don't need traits for it.

@@ +365,4 @@
>    }
>  
>    if (aCaps & USING_GL_RENDERING_ONLY) {
> +    allocResult = AllocGrallocBuffer(aSize,

You can merge these two calls I think and just supply different flags and defaultRBSwap state.

@@ +401,4 @@
>  // Both processes
>  
>  /*static*/ sp<GraphicBuffer>
> +GetGraphicBufferFrom(MaybeMagicGrallocBufferHandle aHandle)

We should not overload this. In each call site I think we know exactly what kind we have: handle or ref. If we split this at the protocol level this function will go away.

::: gfx/layers/opengl/GrallocTextureClient.cpp
@@ +59,5 @@
>    }
>  
>    virtual void DeallocateSharedData(ISurfaceAllocator* allocator) MOZ_OVERRIDE
>    {
> +    SurfaceDescriptor temp = SurfaceDescriptor(mGrallocHandle);

sd

@@ +123,5 @@
>      if (!mBufferLocked) {
>        // We just need to wrap the actor in a SurfaceDescriptor because that's what
>        // ISurfaceAllocator uses as input, we don't care about the other parameters.
> +      null_t nullBuffer;
> +      SurfaceDescriptor sd = SurfaceDescriptorGralloc(nullBuffer,

you can just say null_t() here

::: gfx/layers/opengl/GrallocTextureHost.cpp
@@ +309,5 @@
> +    else
> +      owner = handle.get_MagicGrallocBufferHandle().mRef.mOwner;
> +
> +    SharedBufferManagerParent::GetInstance(owner)->DropGrallocBuffer(mGrallocHandle);
> +    //PGrallocBufferParent::Send__delete__(mGrallocActor);

?

::: gfx/layers/opengl/TextureHostOGL.cpp
@@ +1037,4 @@
>  static void
>  AddDeprecatedTextureHostToGrallocBufferActor(DeprecatedTextureHost* aDeprecatedTextureHost, const SurfaceDescriptor* aSurfaceDescriptor)
>  {
> +  /*if (aSurfaceDescriptor && IsSurfaceDescriptorValid(*aSurfaceDescriptor)) {

?

@@ +1045,5 @@
>  
>  static void
>  RemoveDeprecatedTextureHostFromGrallocBufferActor(DeprecatedTextureHost* aDeprecatedTextureHost, const SurfaceDescriptor* aSurfaceDescriptor)
>  {
> +  /*if (aSurfaceDescriptor && IsSurfaceDescriptorValid(*aSurfaceDescriptor)) {

?

::: gfx/thebes/gfxPlatform.cpp
@@ +506,4 @@
>      // deleted.
>      ImageBridgeChild::ShutDown();
>  
> +    SharedBufferManagerChild::ShutDown();

When I wrote this patch I had shutdown crashes because buffers were in flight. What exactly does the shutdown sequence look like that you aren't running into that? Basically buffer deallocations can be in flight and we yank the thread out under the IPC layer.
Sotaro, why would bug 950112 still get more performance? We are using one buffer manager in the parent per child, so this is essentially multi-threaded now. There is still some minimal contention in the client between the image bridge thread and the content thread, but thats a pretty rare event.
Flags: needinfo?(sotaro.ikeda.g)
As in Bug 950112 Comment 9, gecko ipc uses a lot of thread context switch. Android binder ipc need lesser context switch.
- in same process: no context switch
- across process: 2 times thread context switch.

gecko ipc seems to need 7 times thread context switch.

And gralloc buffer allocations could happen concurrently by multi threads in same app. Even in this case, binder ipc part is not blocked(ion driver might block). gecko ipc is blocked in this use case.
Flags: needinfo?(sotaro.ikeda.g)
a lot of thread context switches could introduce delay by the scheduler. Change the thread to another thread by re-scheduling.
The context switching is a bug in IPC we should fix. Lets file a bug for that. I will grab you on IRC to understand whats exactly wrong here.

When do allocations happen concurrently in the content process on separate threads? That seems very rare.
(In reply to Andreas Gal :gal from comment #70)
> The context switching is a bug in IPC we should fix. Lets file a bug for
> that. I will grab you on IRC to understand whats exactly wrong here.
> 
> When do allocations happen concurrently in the content process on separate
> threads? That seems very rare.

Right now, camera preview start/end/siwtch is a typical use case. And after tiling is enabled. tile cache allocation can be done by worker thread.
(In reply to Sotaro Ikeda [:sotaro] from comment #71)
> (In reply to Andreas Gal :gal from comment #70)
> 
> Right now, camera preview start/end/siwtch is a typical use case. And after
> tiling is enabled. tile cache allocation can be done by worker thread.

media framework is going to allocate gralloc buffer from non main thread.

And WebGL's buffer allocation also going to non-main thread in near future. One candidate is Bug 767484.
I think I know where the additional context switches are coming from. IPC talks thread to thread, so we have a local thread in the child that talks to the parent and we context switch to that thread when we need a buffer and then wait for the parent to respond and then send the buffer back to the actual thread that needed a buffer.
(In reply to Sotaro Ikeda [:sotaro] from comment #72)
> And WebGL's buffer allocation also going to non-main thread in near future.
> One candidate is Bug 767484.

Sorry, the above seems my misunderstanding.
(In reply to Chiajung Hung [:chiajung] from comment #61)
> Unfortunately, I found a bad blocker of this.
> You can check Bug 979138 for detail.

That's not a blocker, that's just a number that needs to be changed ;)  However, PImageBridge should probably go away or be subsumed into this protocol.. is there any reason to keep it separate?
(In reply to Vladimir Vukicevic [:vlad] [:vladv] from comment #75)
> However, PImageBridge should probably go away or be subsumed into this
> protocol.. is there any reason to keep it separate?

They are different things. ImageBridge needs to be on the same event loop as the compositor on the parent side, while this protocol needs to not be on the same event loop.
(In reply to Nicolas Silva [:nical] from comment #76)
> (In reply to Vladimir Vukicevic [:vlad] [:vladv] from comment #75)
> > However, PImageBridge should probably go away or be subsumed into this
> > protocol.. is there any reason to keep it separate?
> 
> They are different things. ImageBridge needs to be on the same event loop as
> the compositor on the parent side, while this protocol needs to not be on
> the same event loop.

Interesting.. then I guess I don't understand the purpose of the separate ImageBridge protocol, if it's living in the same event loop as the Compositor (I assume the Compositor protocol runs on the compositor thread?).  Why can't it just be merged in with PCompositor?  Though this is a little off topic though, I guess :)
(In reply to Andreas Gal :gal from comment #67)
> Sotaro, why would bug 950112 still get more performance? We are using one
> buffer manager in the parent per child, so this is essentially
> multi-threaded now.

I did not see bug 950112 as IPC, but just a gralloc buffer allocator. It could be totally independent from gecko ipc usage, I think.
(In reply to Sotaro Ikeda [:sotaro] from comment #64)
> It seems better to compare with Bug 950112 before thinking about commit it.
> It did almost same thing, but seems like to get more performance.

It is just a general comment. Can be ignored. I do not have a plan to commit Bug 950112 if this bug could get enough performance.
(In reply to Sotaro Ikeda [:sotaro] from comment #68)
> As in Bug 950112 Comment 9, gecko ipc uses a lot of thread context switch.
> Android binder ipc need lesser context switch.
> - in same process: no context switch
This can be fixed later by sandbox the call in chrome(which can avoid serialize/deserialize and dup of fd), see PHal's implementation.
> - across process: 2 times thread context switch.
> 
> gecko ipc seems to need 7 times thread context switch.
However, Android binder need switch from user space to kernel space, and vice versa.
(Binder IPC involve a binder driver ioctl)
> 
> And gralloc buffer allocations could happen concurrently by multi threads in
> same app. Even in this case, binder ipc part is not blocked(ion driver might
> block). gecko ipc is blocked in this use case.
(In reply to Vladimir Vukicevic [:vlad] [:vladv] from comment #77)
> (In reply to Nicolas Silva [:nical] from comment #76)
> > (In reply to Vladimir Vukicevic [:vlad] [:vladv] from comment #75)
> > > However, PImageBridge should probably go away or be subsumed into this
> > > protocol.. is there any reason to keep it separate?
> > 
> > They are different things. ImageBridge needs to be on the same event loop as
> > the compositor on the parent side, while this protocol needs to not be on
> > the same event loop.
> 
> Interesting.. then I guess I don't understand the purpose of the separate
> ImageBridge protocol, if it's living in the same event loop as the
> Compositor (I assume the Compositor protocol runs on the compositor
> thread?).  Why can't it just be merged in with PCompositor?  Though this is
> a little off topic though, I guess :)
ImageBridge is for async image update which PLayerTransaction lacks(from content process's view of point). A good example of async image update is video playback: we do not need wait for main thread reflow, and update the decoded frame as soon as possible.
Chiajung sums it up well, the idea nehind ImageBridge is to be able to have transactions with the compositor thread that are independent of the main thread. So even though ImageBridge needs to be on the compositor thread's event loop on the parent side, the idea is that it is not on the content thread's event loop on the child side. You can think of ImageBridge as a slimmed down PCompositor that is not blocked by the content thread (useful for video playback, and at some point webgl/canvas workers, and anything that doesn't need to be part of layout-driven transactions).
(In reply to Chiajung Hung [:chiajung] from comment #80)
> (In reply to Sotaro Ikeda [:sotaro] from comment #68)
> > As in Bug 950112 Comment 9, gecko ipc uses a lot of thread context switch.
> > Android binder ipc need lesser context switch.
> > - in same process: no context switch
> This can be fixed later by sandbox the call in chrome(which can avoid
> serialize/deserialize and dup of fd), see PHal's implementation.
> > - across process: 2 times thread context switch.

In this use case, we do not need sandboxing like hal. The solution could be simpler than that.
Attached patch V3, part1 (obsolete) — Splinter Review
Fix a very small mistake from previous review, carry r+.
Attachment #8385126 - Attachment is obsolete: true
Attachment #8386565 - Flags: review+
Attached patch V1,part2 (obsolete) — Splinter Review
Fix camera crash, and switch to NewSurfaceDescriptorGralloc.
(just 2-line change)
Attachment #8384992 - Attachment is obsolete: true
Attachment #8384992 - Flags: review?(nical.bugzilla)
Attachment #8386600 - Flags: review?(nical.bugzilla)
Attached patch V1, part2 (obsolete) — Splinter Review
Previous upload mistakely include some other works. Re-upload the correct patch.
Attachment #8386600 - Attachment is obsolete: true
Attachment #8386600 - Flags: review?(nical.bugzilla)
Attachment #8386602 - Flags: review?(nical.bugzilla)
Comment on attachment 8386602 [details] [diff] [review]
V1, part2

Review of attachment 8386602 [details] [diff] [review]:
-----------------------------------------------------------------

Beware, there are .orig files that should not be part of the diff.

This is quite a big changeset so I will need to spend some more time looking at it. This will require some careful testing because at this point it's really hard for a reviewer to be sure about the correctness just by reading the code.

::: gfx/layers/GrallocImages.h
@@ +149,5 @@
>        return nullptr;
>      }
>    }
>  
> +  virtual bool IsValid() { return GetSurfaceDescriptor().type() == SurfaceDescriptor::TSurfaceDescriptorGralloc || GetSurfaceDescriptor().type() == SurfaceDescriptor::TNewSurfaceDescriptorGralloc; }

nit: these long lines should be split. If the function is not a one-liner, please put the opening and closing braces on new lines.

::: gfx/layers/ipc/LayerTransactionChild.cpp
@@ +17,4 @@
>  namespace mozilla {
>  namespace layers {
>  
> +//class PGrallocBufferChild;

Please remove the commented out code.

@@ +52,4 @@
>    NS_RUNTIMEABORT("Um, how did we get here?");
>    return false;
>  #endif
> +}*/

Same here, commented out code should be removed

::: gfx/layers/ipc/ShadowLayerUtilsGralloc.h
@@ +47,4 @@
>    typedef android::GraphicBuffer GraphicBuffer;
>  
>    MagicGrallocBufferHandle()
> +  { memset((void*)&mRef, 0, sizeof(GrallocBufferRef)); }

Please initialize members explicitly or add a default constructor in GrallocBufferRef instead of using memset.

nit: put the opening and closing braces on new lines

::: gfx/layers/opengl/GrallocTextureClient.cpp
@@ +65,4 @@
>    }
>  
>  private:
> +  SurfaceDescriptor mGrallocHandle;

Since you know this is always be a gralloc descriptor you should store the appropriate gralloc descriptor instead of the generic one (it makes it easier for readers of the code to understand what's going on).

::: gfx/layers/opengl/GrallocTextureClient.h
@@ +34,4 @@
>  class GrallocTextureClientOGL : public BufferTextureClient
>  {
>  public:
> +  GrallocTextureClientOGL(SurfaceDescriptor buffer,

Please use  the most specialized descriptor type here as well.

@@ +112,4 @@
>    /**
>     * Unfortunately, until bug 879681 is fixed we need to use a GrallocBufferActor.
>     */
> +  SurfaceDescriptor mGrallocHandle;

You should store directly the most specialized descriptor type here since we know it is always a gralloc descriptor.

::: gfx/layers/opengl/GrallocTextureHost.cpp
@@ +251,2 @@
>  
> +  android::GraphicBuffer* graphicBuffer = GetGraphicBufferFrom(mGrallocHandle.buffer()).get();

There are code paths in GetGraphicBufferFrom that return nullptr. You should check for null pointer here and handle this case.

::: gfx/layers/opengl/TextureHostOGL.cpp
@@ +1041,3 @@
>      GrallocBufferActor* actor = static_cast<GrallocBufferActor*>(aSurfaceDescriptor->get_SurfaceDescriptorGralloc().bufferParent());
>      actor->AddDeprecatedTextureHost(aDeprecatedTextureHost);
> +  }*/

Please remove commented out code.

@@ +1050,3 @@
>      GrallocBufferActor* actor = static_cast<GrallocBufferActor*>(aSurfaceDescriptor->get_SurfaceDescriptorGralloc().bufferParent());
>      actor->RemoveDeprecatedTextureHost(aDeprecatedTextureHost);
> +  }*/

More commented out code to remove
Chiajung, the code reviews may slow down next week because we're trying to land stuff for 1.4 - just so that you know people are not ignoring the requests, I'm asking them to slow down any non-1.4 work in order to hit the deadlines.
(In reply to Milan Sreckovic [:milan] on PTO 3/7-3/18, NI gets 24 hour response from comment #88)
> Chiajung, the code reviews may slow down next week because we're trying to
> land stuff for 1.4 - just so that you know people are not ignoring the
> requests, I'm asking them to slow down any non-1.4 work in order to hit the
> deadlines.

It's OK for me. 
Since it will take quiet much time to review it, and it takes me hours to fix based on the review.

And since I think current implementation will build break other platforms, I will start to fix that, too.
Comment on attachment 8386602 [details] [diff] [review]
V1, part2

Review of attachment 8386602 [details] [diff] [review]:
-----------------------------------------------------------------

I add those .orig files accidently. I'll remove them in next patch,

::: gfx/layers/GrallocImages.h
@@ +149,5 @@
>        return nullptr;
>      }
>    }
>  
> +  virtual bool IsValid() { return GetSurfaceDescriptor().type() == SurfaceDescriptor::TSurfaceDescriptorGralloc || GetSurfaceDescriptor().type() == SurfaceDescriptor::TNewSurfaceDescriptorGralloc; }

Done.

::: gfx/layers/ipc/LayerTransactionChild.cpp
@@ +17,4 @@
>  namespace mozilla {
>  namespace layers {
>  
> +//class PGrallocBufferChild;

Done.

@@ +52,4 @@
>    NS_RUNTIMEABORT("Um, how did we get here?");
>    return false;
>  #endif
> +}*/

Done.

::: gfx/layers/ipc/ShadowLayerUtilsGralloc.h
@@ +47,4 @@
>    typedef android::GraphicBuffer GraphicBuffer;
>  
>    MagicGrallocBufferHandle()
> +  { memset((void*)&mRef, 0, sizeof(GrallocBufferRef)); }

Done.

::: gfx/layers/opengl/GrallocTextureClient.cpp
@@ +65,4 @@
>    }
>  
>  private:
> +  SurfaceDescriptor mGrallocHandle;

I revert to use SurfaceDescriptor since it can be SurfaceDescriptorGralloc or NewSurfaceDescriptorGralloc currently, use the general one keeps some flexibility here.

::: gfx/layers/opengl/GrallocTextureClient.h
@@ +34,4 @@
>  class GrallocTextureClientOGL : public BufferTextureClient
>  {
>  public:
> +  GrallocTextureClientOGL(SurfaceDescriptor buffer,

The were SurfaceDescriptorGralloc in my WIP, but some stuff use NewSurfaceDescriptorGralloc as well, which bugs me and make me switch to general SurfaceDescriptor and check which it is on lower level code.

@@ +112,4 @@
>    /**
>     * Unfortunately, until bug 879681 is fixed we need to use a GrallocBufferActor.
>     */
> +  SurfaceDescriptor mGrallocHandle;

Same as before, once we merge SurfaceDescriptorGralloc and NewSurfaceDescriptorGralloc, I agree it will make things clear, however, it is not clear for me which to use at this time of point.

::: gfx/layers/opengl/GrallocTextureHost.cpp
@@ +251,2 @@
>  
> +  android::GraphicBuffer* graphicBuffer = GetGraphicBufferFrom(mGrallocHandle.buffer()).get();

Let's RUNTIMEABORT here.

::: gfx/layers/opengl/TextureHostOGL.cpp
@@ +1041,3 @@
>      GrallocBufferActor* actor = static_cast<GrallocBufferActor*>(aSurfaceDescriptor->get_SurfaceDescriptorGralloc().bufferParent());
>      actor->AddDeprecatedTextureHost(aDeprecatedTextureHost);
> +  }*/

Done. Do you think I can remove the while function as well? Or do you think I should implement equivalent code?

@@ +1050,3 @@
>      GrallocBufferActor* actor = static_cast<GrallocBufferActor*>(aSurfaceDescriptor->get_SurfaceDescriptorGralloc().bufferParent());
>      actor->RemoveDeprecatedTextureHost(aDeprecatedTextureHost);
> +  }*/

Same above.
Attached patch part 1 (obsolete) — Splinter Review
Remove all android headers from header file, and use macro to protect GraphicBuffer and sp, carry r+.
Attachment #8386565 - Attachment is obsolete: true
Attachment #8388969 - Flags: review+
Attached patch V2, part2 (obsolete) — Splinter Review
Remove all .orig files from patch, add slight change according to review above.
Attachment #8386602 - Attachment is obsolete: true
Attachment #8386602 - Flags: review?(nical.bugzilla)
Attachment #8388970 - Flags: review?(nical.bugzilla)
Attached patch part1 (obsolete) — Splinter Review
Fix silly tiny macro protection fail.
Attachment #8388969 - Attachment is obsolete: true
Attachment #8389041 - Flags: review+
Attached patch V2, part2 (obsolete) — Splinter Review
Fix KitKat.
Attachment #8388970 - Attachment is obsolete: true
Attachment #8388970 - Flags: review?(nical.bugzilla)
Attachment #8389042 - Flags: review?(nical.bugzilla)
Attached patch part1 (obsolete) — Splinter Review
Fix yet another possible compilation problem on other platform, carry r+.
Attachment #8389041 - Attachment is obsolete: true
Attachment #8389063 - Flags: review+
Attached patch V2, part2 (obsolete) — Splinter Review
Fix a Browser app crash after rebase, today.
Attachment #8389042 - Attachment is obsolete: true
Attachment #8389042 - Flags: review?(nical.bugzilla)
Attachment #8389064 - Flags: review?(nical.bugzilla)
I found some question on kk, I will fix all before send next review.
Comment on attachment 8389064 [details] [diff] [review]
V2, part2

Review of attachment 8389064 [details] [diff] [review]:
-----------------------------------------------------------------

I am quite happy with how this simplifies ImageBridgeChild, and even happier about the removal of GrallocBufferActor. I have a few nits below but the overall patch looks good (scary and good!).

::: gfx/layers/GrallocImages.cpp
@@ +87,4 @@
>    if (!mGraphicBufferLocked.get()) {
>  
>      SurfaceDescriptor desc;
> +    SharedBufferManagerChild *ibc = SharedBufferManagerChild::GetSingleton();

nit: ibc stood for "Image Bridge Child", you should rename the variable into something like "bufferManager" or something else that makes sense.

::: gfx/layers/ipc/ShadowLayerUtilsGralloc.cpp
@@ +421,3 @@
>  {
> +  MaybeMagicGrallocBufferHandle handle;
> +  if (aDesc.type() == SurfaceDescriptor::TSurfaceDescriptorGralloc)

Please add {} braces to the if statements

::: gfx/layers/opengl/GrallocTextureClient.cpp
@@ +83,5 @@
>  }
>  
> +GrallocTextureClientOGL::GrallocTextureClientOGL(SurfaceDescriptor buffer,
> +                        gfx::IntSize aSize,
> +                        TextureFlags aFlags)

nit: somehow the indentation got messed up here

::: gfx/layers/opengl/GrallocTextureHost.cpp
@@ +339,5 @@
>    }
> +  if (mGrallocHandle.buffer().type() != SurfaceDescriptor::Tnull_t) {
> +    MaybeMagicGrallocBufferHandle handle = mGrallocHandle.buffer();
> +    base::ProcessId owner;
> +    if (handle.type() == MaybeMagicGrallocBufferHandle::TGrallocBufferRef)

Pleas add braces to the if blocks

::: gfx/thebes/gfxPlatform.cpp
@@ +502,4 @@
>      // deleted.
>      ImageBridgeChild::ShutDown();
>  
> +    SharedBufferManagerChild::ShutDown();

The StartUp method is #ifdefed MOZ_WIDGET_GONK while the ShutDown is unconditionally compiled. It looks like you want to add #ifdef here as well (or remove it from around StartUp)
Attachment #8389064 - Flags: review?(nical.bugzilla) → review+
Comment on attachment 8389064 [details] [diff] [review]
V2, part2

Review of attachment 8389064 [details] [diff] [review]:
-----------------------------------------------------------------

I fixed all the nits based on the review.

Thanks for the great review and your great patient to review such big change!

I will upload the updated version later, and I will test more case to make sure nothing break this time!

::: gfx/layers/GrallocImages.cpp
@@ +87,4 @@
>    if (!mGraphicBufferLocked.get()) {
>  
>      SurfaceDescriptor desc;
> +    SharedBufferManagerChild *ibc = SharedBufferManagerChild::GetSingleton();

Done.

::: gfx/layers/ipc/ShadowLayerUtilsGralloc.cpp
@@ +421,3 @@
>  {
> +  MaybeMagicGrallocBufferHandle handle;
> +  if (aDesc.type() == SurfaceDescriptor::TSurfaceDescriptorGralloc)

Done.

::: gfx/layers/opengl/GrallocTextureClient.cpp
@@ +83,5 @@
>  }
>  
> +GrallocTextureClientOGL::GrallocTextureClientOGL(SurfaceDescriptor buffer,
> +                        gfx::IntSize aSize,
> +                        TextureFlags aFlags)

Done.

::: gfx/layers/opengl/GrallocTextureHost.cpp
@@ +339,5 @@
>    }
> +  if (mGrallocHandle.buffer().type() != SurfaceDescriptor::Tnull_t) {
> +    MaybeMagicGrallocBufferHandle handle = mGrallocHandle.buffer();
> +    base::ProcessId owner;
> +    if (handle.type() == MaybeMagicGrallocBufferHandle::TGrallocBufferRef)

Done.

::: gfx/thebes/gfxPlatform.cpp
@@ +502,4 @@
>      // deleted.
>      ImageBridgeChild::ShutDown();
>  
> +    SharedBufferManagerChild::ShutDown();

Added. In fact, I followed ImageBridgeChild this case, where ImageBridgeChild has conditional startup but not shutdown :p
Attached patch part2 (obsolete) — Splinter Review
Fix some nits based on review above, and fix Camera Preview for KitKat, carry r+.

This must be land after IPC fixes land(910010).

Anyone want to test this can change 
http://dxr.mozilla.org/mozilla-central/source/ipc/chromium/src/chrome/common/file_descriptor_set_posix.h#33

to any larger value.
Attachment #8389064 - Attachment is obsolete: true
Attachment #8389561 - Flags: review+
Depends on: 910010
No longer depends on: 979138
(In reply to Manish Goregaokar [:manishearth] from comment #101)
> The try build seems to have busted:
> 
> https://tbpl.mozilla.org/?tree=Try&rev=33b1386ba701

I am fixing them all, and note that the blocker 910010 is a must have before landing this, otherwise it crashes all Content process on init.
Build pass on TryServer:

https://tbpl.mozilla.org/?tree=Try&rev=b0a9b0015e90

I will upload a updated version later.
Attached patch part1 (obsolete) — Splinter Review
Build fix on all platform on try server, carry r+
Attachment #8389063 - Attachment is obsolete: true
Attachment #8392023 - Flags: review+
Attached patch part2 (obsolete) — Splinter Review
Build fix for all other platform, carry r+.
Attachment #8389561 - Attachment is obsolete: true
Attachment #8392024 - Flags: review+
Attached patch part2 (obsolete) — Splinter Review
Re-fix camera on unagi(ICS).
Attachment #8392024 - Attachment is obsolete: true
Attachment #8392080 - Flags: review+
Depends on: 988251
Comment on attachment 8392023 [details] [diff] [review]
part1

Review of attachment 8392023 [details] [diff] [review]:
-----------------------------------------------------------------

::: gfx/layers/ipc/SharedBufferManagerParent.cpp
@@ +99,5 @@
> +    XRE_GetIOMessageLoop()->PostTask(FROM_HERE,
> +                                     new DeleteTask<Transport>(mTransport));
> +  }
> +  sManagers.erase(mOwner);
> +  delete mThread;

if (mThread)
{
delete mThread;
mThread = nullptr or deadbeef
}
(In reply to C.J. Ku[:CJKu] from comment #107)

> if (mThread)
> {
> delete mThread;
> mThread = nullptr or deadbeef
> }

I think it is safe to delete a null pointer. The "if" check might not needed.
Blocks: 986694
Can we get this landed?
The blocker 988251 is a must needed for this one. In fact, 988251 is for all other new top level protocol since Nuwa's enabling.

If 988251 are not landed first, all content process will just crash on init. And we can only find some "Send/Recv Error" with some message complains about "Protocol set" deserialize problem.

The underlying problem is since google chromium IPC code hard limit the fd amount per ipc message to 4, and Nuwa must forward the top-level ipc channel fds to content process(current 4:PCompositor, PImageBridge,PContent,PBackground). The most simple solution is to uplift the limit to 5 or larger, however, we do not want to touch chromium's code in our code base if possible. (There is a issue in google project, too: https://codereview.chromium.org/7010015)

As a result, bug 910010 introduce PFileDescriptorSet to send fd 1-by-1, and fix some ipc code exclude Nuwa, that's why I report bug 988251 and let it block this one.
Chiajung, can you confirm that just changing MAX_DESCRIPTORS_PER_MESSAGE to 5 actually makes everything else work? From what I can tell, that limit has actually been bumped to 7 in chromium, so I'm not sure about the "we don't want to touch chromium code" part...
Flags: needinfo?(chung)
This really needs to be landed. Can we just bump it to 7 and call it a day?
If bump it to 7 is an option, then I definitely agree to land this as soon as possible, since it will be very time consuming to rebase.

So let me rebase it and push to try again, then land it!

I will include the change to bump the value, otherwise it will cause regression and backed out soon.
Flags: needinfo?(chung)
Attached patch part1 - final (obsolete) — Splinter Review
Rebased to newer codebase, carry r+
Attachment #8392023 - Attachment is obsolete: true
Attachment #8412455 - Flags: review+
Attached patch part2 - final (obsolete) — Splinter Review
Rebase new version, carry r+.
Attachment #8392080 - Attachment is obsolete: true
Attachment #8412457 - Flags: review+
Keywords: checkin-needed
Attachment #8381639 - Attachment is obsolete: true
And leaks on all platforms:
https://tbpl.mozilla.org/php/getParsedLog.php?id=38503202&tree=Mozilla-Inbound

I think these patches should get a Try |-b do -p all -u all| run before re-requesting checkin :)
try push with leak/debug build fix:
https://tbpl.mozilla.org/?tree=Try&rev=15d3348284a4

I'll upload the patch later
Use StaticAutoPtr to avoid memleak and fix debug build. carry r+
Attachment #8412455 - Attachment is obsolete: true
Attachment #8413539 - Flags: review+
carry r+
Attachment #8412457 - Attachment is obsolete: true
Attachment #8413540 - Flags: review+
remove redundant spaces.
Attachment #8413540 - Attachment is obsolete: true
Attachment #8413542 - Flags: review+
Keywords: checkin-needed
Part 2 doesn't apply cleanly. Please rebase.
Keywords: checkin-needed
rebase for landing, carry r+.
Attachment #8413539 - Attachment is obsolete: true
Attachment #8414192 - Flags: review+
rebase for landing, carry r+.
Attachment #8413542 - Attachment is obsolete: true
Attachment #8414193 - Flags: review+
Keywords: checkin-needed
Fix various silly inversed asserttion.
Attachment #8414193 - Attachment is obsolete: true
Keywords: checkin-needed
Try push:
https://tbpl.mozilla.org/?tree=Try&rev=d95300716042

All B2G debug pass now. Since this patch should only affect B2G for now, other platform fail should not related.
Please can you reattach the patches with valid commit messages? :-)
(Part 1 is malformed & part 2 is missing a commit message)
Keywords: checkin-needed
Attached patch part1,FinalSplinter Review
Regenerate the patch file by git, carry r+.
Attachment #8414192 - Attachment is obsolete: true
Attachment #8415759 - Flags: review+
Attached patch part2,FinalSplinter Review
Regenerate the patch file from git, carry r+. I didn't notice that hg qref overwrite the commit log before. Sorry for inconvient.
Attachment #8415164 - Attachment is obsolete: true
Attachment #8415760 - Flags: review+
Keywords: checkin-needed
https://hg.mozilla.org/mozilla-central/rev/27ad774a92a5
https://hg.mozilla.org/mozilla-central/rev/e8afdde798c3
Status: REOPENED → RESOLVED
Closed: 10 years ago10 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla32
Blocks: b2g-nexuss
Kevin, try your homescreen with this patch landed. This might resolve the laggy-ness we saw today.
(In reply to Andreas Gal :gal from comment #138)
> Kevin, try your homescreen with this patch landed. This might resolve the
> laggy-ness we saw today.

Thanks! I filed bug 1004768, and bug 998215 might be related to that issue as well.
Flags: needinfo?(kgrandon)
I'm still seeing some laggy scrolling unfortunately, and will follow-up in the other bugs.
Flags: needinfo?(kgrandon)
blocking-b2g: 2.0? → ---
(In reply to Kevin Grandon :kgrandon from comment #141)
> I'm still seeing some laggy scrolling unfortunately, and will follow-up in
> the other bugs.
Only if an app try to allocate graphic buffers, in rendering process, can take benefit from this patch. Home screen scrolling may not be the case.
Blocks: 985772
No longer blocks: 985772
You need to log in before you can comment on or make changes to this bug.