Closed Bug 907745 Opened 6 years ago Closed 6 years ago

Land the new gralloc textures

Categories

(Core :: Graphics: Layers, defect)

ARM
Gonk (Firefox OS)
defect
Not set

Tracking


RESOLVED FIXED
mozilla26
blocking-b2g koi+
Tracking Status
firefox26 --- fixed

People

(Reporter: nical, Assigned: nical)

References

(Blocks 2 open bugs)

Details

Attachments

(6 files, 3 obsolete files)

Bug 858914 has too many items so I am dedicating this bug to landing the gralloc part of the new textures, so that we can track this work more efficiently (there are other items in 858914 that will land after the gralloc work but are not blocking all the B2G stuff).

I plan to land this preffed off first, to make sure that the deprecated texture path is not broken and make backing out the new gralloc texture path easier (just flip back the pref).
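The pref-gating plan can be sketched as follows (a minimal, self-contained mock of a pref store; only the pref name "layers.use-deprecated-textures", mentioned later in this bug, is real — the rest is hypothetical illustration, not Gecko code):

```cpp
#include <map>
#include <string>

// Hypothetical stand-in for the real preference service.
// The new path lands preffed off, i.e. deprecated textures stay the default.
static std::map<std::string, bool> gPrefs = {
    {"layers.use-deprecated-textures", true}
};

enum class TexturePath { Deprecated, NewGralloc };

// Selecting the texture path from the pref: backing out the new gralloc
// texture path is just a matter of flipping the pref back.
TexturePath SelectTexturePath() {
    return gPrefs["layers.use-deprecated-textures"]
               ? TexturePath::Deprecated
               : TexturePath::NewGralloc;
}
```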

All of this should happen soon-ish (I need to do some more testing).
blocking-b2g: --- → koi?
I believe it blocks koi.
nical, about when the patch is going to land?
Blocks: 908196
(In reply to Sotaro Ikeda [:sotaro] from comment #2)
> nical, about when the patch is going to land?

Trunk seems to be broken in many places with unagi right now, so I am testing different things on two unagi phones, one with and one without the new gralloc textures. I was also testing with the emulator, but the latter stopped working (at all) for me even without my patches, so right now I am stuck with unagi.

I am trying to test manually:
 * [OK] Going into the gallery, playing a video
 * [OK] Using the camera (emulator only)
 * [untested] Some canvas 2d app
 * [untested] Some webgl + video app (webgl not working on unagi)

I pushed to try and it doesn't look very good: https://tbpl.mozilla.org/?tree=Try&rev=3144b0ff314f

but I fixed one of the issues locally
blocking-b2g: koi? → ---
(In reply to Sotaro Ikeda [:sotaro] from comment #2)
> nical, about when the patch is going to land?

Trunk seems to be broken in many places with unagi right now, so I am testing different things with two unagi phones, one with and one without the new gralloc textures. I was also testing with the emulator, but the latter stopped working (at all) for me even without my patches. I have spent some time trying to fix it today without success, so for now I am stuck with unagi.

I am trying to test manually:
 * [OK] Going into the gallery, playing a video
 * [OK] Using the camera (emulator only)
 * [broken] using everything.me
 * [untested] Some canvas 2d app
 * [untested] Some webgl + video app (webgl not working on unagi)

I pushed to try and it doesn't look very good: https://tbpl.mozilla.org/?tree=Try&rev=3144b0ff314f

I fixed one of the issues locally; I could push to try again and see.
I have a crash in everything.me that I am working on right now.

If some people really really need this patch asap, they can either apply the current version locally and work on their stuff (the camera is working, for instance) or help me get things green on try.
Attached patch Part 1 - New gralloc textures (obsolete) — Splinter Review
To enable new textures, pref off "layers.use-deprecated-textures". We use deprecated textures by default.
With the version of the patch I uploaded, everything.me works and canvas2D apps in general work as well (tested manually on unagi).

waiting for another round of try results: https://tbpl.mozilla.org/?tree=Try&rev=3df6a2a8d959
Comment on attachment 794072 [details] [diff] [review]
Part 1 - New gralloc textures

Review of attachment 794072 [details] [diff] [review]:
-----------------------------------------------------------------

I think I will still have some reftest failures to fix, so I can't ask for a final review just yet, but for the most part I expect the patch not to change, so my lucky reviewers can start looking at the code and let me know if there is outstanding badness or things they would like to work differently. Please ignore the printfs.

This patch implements a GrallocTextureClientOGL and a GrallocTextureHostOGL to be used with the new textures architecture. New textures are only used by ImageClient and CanvasClient2D (and, on the compositor side, ImageHost) when deprecated textures are disabled. So only Image layers and Canvas layers should be affected (any other side effect should be considered a bug in my patch).

On the Image API side (this will mostly interest Peter) I only sort of glued the gralloc texture client on as an optional member of GrallocImage. So basically, if we ask the gralloc image for a texture client, it will create the TextureClient and initialize it with the data. This is not satisfying in the long run because I want the TextureClient to fully own the gralloc buffer (which it does not, because right now the Image does). Instead I would like the image to just hold a reference to a GrallocTextureClientOGL which would fully own the buffer. Getting there will probably take a lot of work, so in the meantime it will be important to keep both the image and the TextureClient alive to manage the lifetime of the gralloc buffer. I think we can still benefit from this setup, which is a net improvement over deprecated textures.
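The interim shared-lifetime arrangement described above can be sketched with mock types (everything here is a hypothetical stand-in — `std::shared_ptr` models Mozilla's refcounting, and the mock classes model GrallocImage, GrallocTextureClientOGL, and the gralloc buffer):

```cpp
#include <memory>

// Mock gralloc buffer; liveCount lets us observe its lifetime in the sketch.
struct MockGrallocBuffer {
    static int liveCount;
    MockGrallocBuffer()  { ++liveCount; }
    ~MockGrallocBuffer() { --liveCount; }
};
int MockGrallocBuffer::liveCount = 0;

// Interim setup: the texture client references the buffer but does not
// solely own it (full ownership transfer is planned as a followup).
struct MockTextureClient {
    std::shared_ptr<MockGrallocBuffer> buffer;
};

// The image currently owns the buffer and lazily hands out a texture client
// initialized with it; the buffer stays valid as long as either is alive.
struct MockImage {
    std::shared_ptr<MockGrallocBuffer> buffer = std::make_shared<MockGrallocBuffer>();
    std::shared_ptr<MockTextureClient> client;

    std::shared_ptr<MockTextureClient> GetTextureClient() {
        if (!client) {
            client = std::make_shared<MockTextureClient>();
            client->buffer = buffer;  // shared, not transferred
        }
        return client;
    }
};
```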

Most of the interesting stuff is in the two new files that contain the new gralloc texture client and host, and in GrallocImages for how this relates to the video pipeline. The rest should mostly be glue code and small fixes to make things work. If the patch is too big I can move some of the glue code into another patch if asked.
Attachment #794072 - Flags: feedback?(pchang)
Attachment #794072 - Flags: feedback?(bjacob)
Comment on attachment 794072 [details] [diff] [review]
Part 1 - New gralloc textures

Review of attachment 794072 [details] [diff] [review]:
-----------------------------------------------------------------

Chatted with Nical about that, sounds sane, so feedback+. We'll see about the review ;-)
Attachment #794072 - Flags: feedback?(bjacob) → feedback+
The gralloc textures support locking, so we need to modify the code that assumes no texture backend supports locking yet.
Attachment #794679 - Flags: review?(matt.woodrow)
Attached patch Part 1 - New gralloc textures (obsolete) — Splinter Review
Attachment #794072 - Attachment is obsolete: true
Attachment #794072 - Flags: feedback?(pchang)
Attachment #794681 - Flags: feedback?(pchang)
Attachment #794682 - Flags: review?(matt.woodrow)
Somehow koi? was cleared. Set flag again.
blocking-b2g: --- → koi?
Comment on attachment 794681 [details] [diff] [review]
Part 1 - New gralloc textures

Actually, things are looking good; time to review.
Attachment #794681 - Flags: review?(pchang)
Attachment #794681 - Flags: review?(bjacob)
Attachment #794681 - Flags: feedback?(pchang)
The patches seems not include change to CompositorOGL::GetTemporaryTexture().
(In reply to Sotaro Ikeda [:sotaro] from comment #17)
> The patches seems not include change to CompositorOGL::GetTemporaryTexture().

Sorry, it is Bug 906671.
(In reply to Sotaro Ikeda [:sotaro] from comment #18)
> (In reply to Sotaro Ikeda [:sotaro] from comment #17)
> > The patches seems not include change to CompositorOGL::GetTemporaryTexture().
> 
> Sorry, it is Bug 906671.

Oh, right, my bad.
GrallocTextureSourceOGL in attachment 794681 [details] [diff] [review] is leaking EGLImages. During video playback, the following log was output and playback almost stopped. I temporarily added code to remove the EGLImage just after binding it to the gralloc buffer, like b2g19 does, and that fixed the leak.

---------------------------

08-24 14:05:24.279   144   269 W Adreno200-GSL: <gsl_ldd_control:225>: ioctl code 0xc01c0915 (IOCTL_KGSL_MAP_USER_MEM) failed: errno 12 Out of memory
08-24 14:05:24.279   144   269 W Adreno200-EGLSUB: <CreateImage:1236>: fail gsl_memory_map.
08-24 14:05:24.279   144   269 W Adreno200-EGL: <qeglDrvAPI_eglCreateImageKHR:4504>: EGL_BAD_PARAMETER


n.b. I confirmed this by disabling HwComposer. If HwComposer is enabled, video playback is not smooth. It seems related to Bug 908276.
Comment on attachment 794681 [details] [diff] [review]
Part 1 - New gralloc textures

Review of attachment 794681 [details] [diff] [review]:
-----------------------------------------------------------------

::: gfx/layers/GrallocImages.cpp
@@ +281,5 @@
> +{
> +  if (!mTextureClient) {
> +    if (!mGraphicBuffer) {
> +      return nullptr;
> +    }

GetSurfaceDescriptor() below already checked the mGraphicBuffer, you can skip here.

@@ +283,5 @@
> +    if (!mGraphicBuffer) {
> +      return nullptr;
> +    }
> +    const SurfaceDescriptorGralloc& desc =
> +      GetSurfaceDescriptor().get_SurfaceDescriptorGralloc();

Need to use get_NewSurfaceDescriptorGralloc here?

@@ +292,5 @@
> +    }
> +
> +    mTextureClient = new GrallocTextureClientOGL(flags);
> +    mTextureClient->InitWith(static_cast<GrallocBufferActor*>(desc.bufferChild()),
> +                             gfx::ToIntSize(mSize));

How about merge InitWith into GrallocTextureClientOGL constructor? And then we could make sure grallocTextureClient always has valid grallocbufferactor during initialization.

::: gfx/layers/ipc/LayersSurfaces.ipdlh
@@ +86,5 @@
> +   * that GraphicBuffer's size and actual video's size are different.
> +   * Extra size member is necessary. See Bug 850566.
> +   */
> +  IntSize size;
> +};

How to handle the RBSwap info with new Gralloc?
Because we are using BRGA/BRGX(non-GL rendering) in b2g but Android didn't have the BRGX format.
Therefore we add the RBSwap flag inside SurfaceDescriptorGralloc.

::: gfx/layers/opengl/GrallocTextureClient.cpp
@@ +128,5 @@
> +GrallocTextureClientOGL::AllocateForYCbCr(gfx::IntSize aYSize, gfx::IntSize aCbCrSize)
> +{
> +  return AllocateGralloc(aYSize,
> +                         HAL_PIXEL_FORMAT_YV12,
> +                         android::GraphicBuffer::USAGE_SW_READ_OFTEN);

I prefer not to add allocate API for YCbCr, we can just reuse AllocateForSurface above.
And aCbCrSize is not used here.

::: gfx/layers/opengl/GrallocTextureHost.cpp
@@ +250,5 @@
> +
> +LayerRenderState
> +GrallocTextureHostOGL::GetRenderState()
> +{
> +  if (IsValid()) {

I think we still need the RBSwap info here to pass the renderstate to other composer module.
if (mIsRBSwapped) {
      flags |= LAYER_RENDER_STATE_FORMAT_RB_SWAP;
    }
The patches seem to have one problem: they can still return a GraphicBuffer to GonkNativeWindow too early. GrallocImage's destructor calls GraphicBufferLocked::Unlock(), and that call returns the buffer to GonkNativeWindow. ImageClientBuffered holds only TextureClients (GrallocTextureClientOGL), and the TextureClient holds a reference to neither the Image nor the GraphicBufferLocked. So the lifetime of the GraphicBufferLocked is not controlled by GrallocTextureClientOGL.
I locally fixed comment #20 and comment #22 as a temporary change. With the change, gralloc buffers are correctly handled by the gfx layers, but Bug 880780 is still there. Bug 880780 seems unrelated to gralloc buffer handling and the genlock problem; it seems more related to OpenGL.
Attachment #794679 - Flags: review?(matt.woodrow) → review+
Attachment #794682 - Flags: review?(matt.woodrow) → review+
(In reply to peter chang[:pchang] from comment #21)
> Comment on attachment 794681 [details] [diff] [review]
> Part 1 - New gralloc textures
> 
> Review of attachment 794681 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> @@ +283,5 @@
> > +    if (!mGraphicBuffer) {
> > +      return nullptr;
> > +    }
> > +    const SurfaceDescriptorGralloc& desc =
> > +      GetSurfaceDescriptor().get_SurfaceDescriptorGralloc();
> 
> Need to use get_NewSurfaceDescriptorGralloc here?

Not here: NewSurfaceDescriptorGralloc is only used to do the IPC glue between GrallocTextureClientOGL and GrallocTextureHostOGL. The code here uses the mechanisms already in place to create gralloc buffers, which rely on the (not new) SurfaceDescriptorGralloc.

In the long run we'll clean up the way we allocate gralloc buffers so that it does not rely on GrallocBufferActor and the (old) SurfaceDescriptorGralloc.

> 
> @@ +292,5 @@
> > +    }
> > +
> > +    mTextureClient = new GrallocTextureClientOGL(flags);
> > +    mTextureClient->InitWith(static_cast<GrallocBufferActor*>(desc.bufferChild()),
> > +                             gfx::ToIntSize(mSize));
> 
> How about merge InitWith into GrallocTextureClientOGL constructor? 

Modified the patch accordingly (I'll upload a new version once I have fixed the rest of the comments and the EGLImage leak).

> And then
> we could make sure grallocTextureClient always has valid grallocbufferactor
> during initialization.

There are other users of the gralloc texture client that will need to create the texture client uninitialized and allocate it later, so I can't guarantee that the texture client always has allocated data after construction when using the other constructor.

> 
> ::: gfx/layers/ipc/LayersSurfaces.ipdlh
> @@ +86,5 @@
> > +   * that GraphicBuffer's size and actual video's size are different.
> > +   * Extra size member is necessary. See Bug 850566.
> > +   */
> > +  IntSize size;
> > +};
> 
> How to handle the RBSwap info with new Gralloc?
> Because we are using BRGA/BRGX(non-GL rendering) in b2g but Android didn't
> have the BRGX format.
> Therefore we add the RBSwap flag inside SurfaceDescriptorGralloc.

RBSwap is now in TextureFlag that is not carried by SurfaceDescriptor but by the OpAddTexture message instead. This way RB swap logic is not specific to gralloc.
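A sketch of carrying the swap bit in the texture flags rather than in the gralloc descriptor, as described above, might look like this (flag names and values here are hypothetical illustrations; the real TextureFlags and render-state flags live in the layers code):

```cpp
#include <cstdint>

// Hypothetical flag definitions for the sketch.
using TextureFlags = uint32_t;
constexpr TextureFlags TEXTURE_RB_SWAPPED = 1 << 0;
constexpr uint32_t LAYER_RENDER_STATE_FORMAT_RB_SWAP = 1 << 0;

// The swap bit travels with the OpAddTexture flags rather than with the
// gralloc SurfaceDescriptor, so the RB-swap logic is not gralloc-specific;
// the host then forwards it into the LayerRenderState, as in pchang's
// snippet above.
uint32_t RenderStateFlagsFromTextureFlags(TextureFlags aFlags) {
    uint32_t flags = 0;
    if (aFlags & TEXTURE_RB_SWAPPED) {
        flags |= LAYER_RENDER_STATE_FORMAT_RB_SWAP;
    }
    return flags;
}
```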

> 
> ::: gfx/layers/opengl/GrallocTextureClient.cpp
> @@ +128,5 @@
> > +GrallocTextureClientOGL::AllocateForYCbCr(gfx::IntSize aYSize, gfx::IntSize aCbCrSize)
> > +{
> > +  return AllocateGralloc(aYSize,
> > +                         HAL_PIXEL_FORMAT_YV12,
> > +                         android::GraphicBuffer::USAGE_SW_READ_OFTEN);
> 
> I prefer not to add allocate API for YCbCr, we can just reuse
> AllocateForSurface above.
> And aCbCrSize is not used here.

It will be used in other code paths though. GrallocTextureClientOGL is an implementation of BufferTextureClient, which abstracts out shmem vs. memory vs. gralloc. As such it has to implement allocation for surfaces (RGB) and for YCbCr. I disabled the code paths that use the YCbCr allocation at the last minute because I first need Bug 908196 to be fixed. So right now this code is not used, but it will soon be (see the comment on the Part 3 patch).

> 
> ::: gfx/layers/opengl/GrallocTextureHost.cpp
> @@ +250,5 @@
> > +
> > +LayerRenderState
> > +GrallocTextureHostOGL::GetRenderState()
> > +{
> > +  if (IsValid()) {
> 
> I think we still need the RBSwap info here to pass the renderstate to other
> composer module.
> if (mIsRBSwapped) {
>       flags |= LAYER_RENDER_STATE_FORMAT_RB_SWAP;
>     }

Ah, good catch, thanks.
(In reply to Sotaro Ikeda [:sotaro] from comment #22)
> The patches seems to have one problem, they still could return GraphicBuffer
> to GonkNativeWindow too early.

Yes, just like the deprecated gralloc code paths. I am not trying to solve that problem in this bug; it should be a followup.

> GrallocImage's destructor calls
> GraphicBufferLocked::Unlock(). And the call made the buffer return to
> GonkNativeWindow. ImageClientBuffered holds only
> TextureClients(GrallocTextureClientOGL). The TextureClient does not hold
> reference to Image nor TextureClient. So, the lifetime of
> GraphicBufferLocked is not controlled by GrallocTextureClientOGL.

To solve this problem we need to completely transfer ownership of the gralloc buffer to the texture client (I mentioned that in comment 8) and let the Image only hold a reference to the texture client.

But that is a lot of work, and there is value in landing the new gralloc textures as they are and continuing to improve the situation in followups.
(In reply to Sotaro Ikeda [:sotaro] from comment #23)
> I locally fixed the comment #20 and comment #22 as a temporary change. By
> the change gralloc buffers are correctly handled by gfx layers, but Bug
> 880780 is still there. Bug 880780 seem not related to gralloc buffers
> handling nor the genlock problem, seems more related to OpenGL.

Do you have the fix for comment 20 somewhere online so that I can look at it and integrate it to the patch? Or would you prefer to fix it in a separate bug?
Flags: needinfo?(sotaro.ikeda.g)
(In reply to Nicolas Silva [:nical] from comment #25)
> (In reply to Sotaro Ikeda [:sotaro] from comment #22)
> > The patches seems to have one problem, they still could return GraphicBuffer
> > to GonkNativeWindow too early.
> 
> Yes, Just like the deprecated gralloc code paths. I am not trying to solve
> that problem in this bug, this should be a followup.
> 
> > GrallocImage's destructor calls
> > GraphicBufferLocked::Unlock(). And the call made the buffer return to
> > GonkNativeWindow. ImageClientBuffered holds only
> > TextureClients(GrallocTextureClientOGL). The TextureClient does not hold
> > reference to Image nor TextureClient. So, the lifetime of
> > GraphicBufferLocked is not controlled by GrallocTextureClientOGL.
> 
> To solve this problem we need to transfer completely the ownership of the
> gralloc buffer to the texture client (I mentioned that in comment 8) and let
> the Image only hold a reference to the texture client.

It can easily be fixed as a temporary change: just add a reference from GrallocTextureClientOGL to GraphicBufferLocked during GrallocTextureClientOGL creation. I am very concerned about the release-gralloc-buffer-too-early problem remaining unfixed on master for a very long time.
Flags: needinfo?(sotaro.ikeda.g)
(In reply to Nicolas Silva [:nical] from comment #26)
> Do you have the fix for comment 20 somewhere online so that I can look at it
> and integrate it to the patch? Or would you prefer to fix it in a separate
> bug?

If it takes a long time to fix it correctly, just adding a temporary fix in this bug seems reasonable. I fixed comment 20 just locally; I did the same thing as GLContextEGL::BindExternalBuffer() in b2g18. So I just added a call to GrallocTextureSourceOGL::DeallocateDeviceData() in GrallocTextureSourceOGL::BindTexture().

http://mxr.mozilla.org/mozilla-b2g18/source/gfx/gl/GLContextProviderEGL.cpp#428
(In reply to Sotaro Ikeda [:sotaro] from comment #29)
> (In reply to Nicolas Silva [:nical] from comment #26)
> > Do you have the fix for comment 20 somewhere online so that I can look at it
> > and integrate it to the patch? Or would you prefer to fix it in a separate
> > bug?
> 
> If it take a long time to fix it correctly, just add a temporary fix in this
> bug seems reasonable. 

And add a correct fix in another bug, if it is the case.

The fix for b2g v1.2 will come soon; if we can fix it temporarily, checking in the temporary fix first and then fixing the problem correctly in a separate bug seems a good way to go.
Attached patch Part 1 - New gralloc textures (obsolete) — Splinter Review
Here is a new version of the patch which contains fixes for comments from Peter and Sotaro.

I also filed bug 909356 for the EGLImage/unlocking gralloc issue.
In the mean time I took Sotaro's solution for this patch.
Attachment #794681 - Attachment is obsolete: true
Attachment #794681 - Flags: review?(pchang)
Attachment #794681 - Flags: review?(bjacob)
Attachment #795468 - Flags: review?(sotaro.ikeda.g)
Attachment #795468 - Flags: review?(pchang)
Attachment #795468 - Flags: review?(bjacob)
Also, the patch is rebased so that you can apply this patch queue on top of trunk without patches from other bugs.
Oops, I forgot to make it so that it is the TextureClient that unlocks the GraphicBufferLocked during destruction instead of the GrallocImage.

New patch coming soon.
Comment on attachment 795468 [details] [diff] [review]
Part 1 - New gralloc textures

Sorry, my comment 28 was bad: even when GrallocTextureClientOGL holds a reference to GraphicBufferLocked, gralloc buffers are returned to GonkNativeWindow too early, because GrallocImage's destructor calls GraphicBufferLocked::Unlock(). I also locally changed the implementation to return the gralloc buffer in the destructor of the GraphicBufferLocked-derived classes. It is OK to fix the problem in bug 909356 or another bug.
Attachment #795468 - Flags: review?(sotaro.ikeda.g) → review+
Sadly, after I made it so that the GraphicBufferLocked is unlocked by the texture client's destructor rather than the GrallocImage's destructor, I consistently get the following crash at the end of every hardware-decoded video.

Sotaro, does this stack trace ring any bell? (sorry for needinfo'ing you several times a day like that, I really want to land this asap)


Crash on line: CHECK(onlyThoseWeOwn || buffers->isEmpty());
with onlyThoseWeOwn = false and buffers->isEmpty() returning false.

#0  __libc_android_abort () at bionic/libc/unistd/abort.c:82
#1  0x4009244c in __android_log_assert (cond=<value optimized out>, 
    tag=0x42e00325 "OMXCodec", fmt=0x42dfbc3b "%s") at system/core/liblog/logd_write.c:246
#2  0x42d5f2c6 in android::OMXCodec::freeBuffersOnPort (this=0x44b939d0, portIndex=1, 
    onlyThoseWeOwn=false) at frameworks/base/media/libstagefright/OMXCodec.cpp:3722
#3  0x42d5fcbc in android::OMXCodec::onStateChange (this=0x44b939d0, 
    newState=<value optimized out>)
    at frameworks/base/media/libstagefright/OMXCodec.cpp:3592
#4  0x42d60826 in android::OMXCodec::onCmdComplete (this=0x44b939d0, 
    cmd=<value optimized out>, data=1121958971)
    at frameworks/base/media/libstagefright/OMXCodec.cpp:3405
#5  0x42d61660 in android::OMXCodec::onEvent (this=0x44b939d0, 
    event=OMX_EventCmdComplete, data1=0, data2=2)
    at frameworks/base/media/libstagefright/OMXCodec.cpp:3278
#6  0x42d618e4 in android::OMXCodec::on_message (this=0x44b939d0, msg=...)
    at frameworks/base/media/libstagefright/OMXCodec.cpp:2966
#7  0x42d61fd4 in android::OMXCodecObserver::onMessage (this=<value optimized out>, 
    msg=...) at frameworks/base/media/libstagefright/OMXCodec.cpp:325
#8  0x42e9679e in android::BnOMXObserver::onTransact (this=0x44d9f460, 
    code=<value optimized out>, data=..., reply=<value optimized out>, flags=17)
    at frameworks/base/media/libmedia/IOMX.cpp:778
#9  0x402b8e8a in android::BBinder::transact (this=0x44d9f464, code=22, data=..., 
    reply=0x463d3dc0, flags=17) at frameworks/base/libs/binder/Binder.cpp:107
#10 0x402bc194 in android::IPCThreadState::executeCommand (this=0x48a0aa80, 
    cmd=<value optimized out>) at frameworks/base/libs/binder/IPCThreadState.cpp:1028
#11 0x402bc372 in android::IPCThreadState::joinThreadPool (this=0x48a0aa80, isMain=false)
    at frameworks/base/libs/binder/IPCThreadState.cpp:468
(In reply to Nicolas Silva [:nical] from comment #35)
> Sadly, after I made it so the GraphicBufferLocked is unlocked by the texture
> client's destructor rather than the GrallocImage's destructor I consistently
> get the following crash at the end of every hardware decoded video.
> 
> Sotaro, does this stack trace ring any bell? (sorry for needinfo'ing you
> several times a day like that, I really want to land this asap)
> 

It seems that some buffers are not returned to OMXCodec during port close.
Before OMXCodec closes, ImageContainer::SetCurrentImage(nullptr) should be called; it should trigger the return of all gralloc buffers to GonkNativeWindow. The crash in comment 35 happened in relation to this.
The current ImageClientBuffered implementation seems not to handle the case in comment #37. I confirmed the same problem on master with unagi.
Blocks: 906793
(In reply to Sotaro Ikeda [:sotaro] from comment #37)
> Before OMXCodec close, ImageContainer::SetCurrentImage(nullptr) should be
> called, it should trigger all gralloc buffer return to GonkNativeWindow. The
> crashin Comment 35 happned by related to this.

Interesting, so what happens here is that the TextureClient's destructor is called after ImageContainer::SetCurrentImage(nullptr) for the last frame.

I would prefer to fix the frame-ordering problem as a followup if that's okay with you, because it is most likely going to need some more involved logic.
(In reply to Nicolas Silva [:nical] from comment #39)
> (In reply to Sotaro Ikeda [:sotaro] from comment #37)
> > Before OMXCodec close, ImageContainer::SetCurrentImage(nullptr) should be
> > called, it should trigger all gralloc buffer return to GonkNativeWindow. The
> > crashin Comment 35 happned by related to this.
> 
> Interesting, so what happens here is that the TextureClient's destructor is
> called after ImageConainer::SetCurrentImage(nullptr) for the last frame.

TextureClient's destructor needs to be called synchronously. It is a very strong requirement from OMXCodec.

http://mxr.mozilla.org/mozilla-b2g18/source/gfx/layers/ImageContainer.cpp#171

> I would prefer to fix the frame ordering problem as a followup if that's
> okay with you because it is most likely going to need some more involved
> logic.

It's okay.
Same patch without the attempt at fixing the frame ordering problem.
Attachment #795468 - Attachment is obsolete: true
Attachment #795468 - Flags: review?(pchang)
Attachment #795468 - Flags: review?(bjacob)
Attachment #795596 - Flags: review?(pchang)
Attachment #795596 - Flags: review?(bjacob)
In attachment 795596 [details] [diff] [review], DeallocateDeviceData() is called before calling fEGLImageTargetTexture2D(). It still leaks EGLImages and makes the b2g system unstable and unresponsive if you activate and deactivate the camera several times. I added a DeallocateDeviceData() call after the bind, like the following. The code is scary, but it works and is used in b2g18.

-------------------------------------
  GLuint tex = mCompositor->GetTemporaryTexture(aTextureUnit);
  GLuint textureTarget = GetTextureTarget();

  gl()->fActiveTexture(aTextureUnit);
  gl()->fBindTexture(textureTarget, tex);
  if (!mEGLImage) {
    mEGLImage = gl()->CreateEGLImageForNativeBuffer(mGraphicBuffer->getNativeBuffer());
  }
  gl()->fEGLImageTargetTexture2D(textureTarget, mEGLImage);
  DeallocateDeviceData();
https://hg.mozilla.org/mozilla-central/rev/2a2d743f28ca
Status: NEW → RESOLVED
Closed: 6 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla26
Only part 0 has landed.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
When I built attachment 795596 [details] [diff] [review] on master nexus-4, I got some errors like following.

-----------------------------------
../../dist/include/mozilla/layers/GrallocTextureHost.h:87:16: error: 'virtual bool mozilla::layers::GrallocTextureHostOGL::IsValid() const' marked override, but does not overrid

../../dist/include/mozilla/layers/GrallocTextureClient.h:77:16: error: 'virtual bool mozilla::layers::GrallocTextureClientOGL::AllocateForYCbCr(mozilla::gfx::IntSize, mozilla::gfx::IntSize)' marked override, but does not override
(In reply to Sotaro Ikeda [:sotaro] from comment #45)
> When I built attachment 795596 [details] [diff] [review] on master nexus-4,
> I got some errors like following.
> 
> -----------------------------------
> ../../dist/include/mozilla/layers/GrallocTextureHost.h:87:16: error:
> 'virtual bool mozilla::layers::GrallocTextureHostOGL::IsValid() const'
> marked override, but does not overrid
> 
> ../../dist/include/mozilla/layers/GrallocTextureClient.h:77:16: error:
> 'virtual bool
> mozilla::layers::GrallocTextureClientOGL::AllocateForYCbCr(mozilla::gfx::
> IntSize, mozilla::gfx::IntSize)' marked override, but does not override

I don't have this build error on my side after rebasing to latest m-c and applying attachment 795596 [details] [diff] [review].
Whiteboard: [leave open]
Comment on attachment 795596 [details] [diff] [review]
Part 1 - New gralloc textures (carrying sotaro: r+)

Review of attachment 795596 [details] [diff] [review]:
-----------------------------------------------------------------

LGTM
Attachment #795596 - Flags: review?(pchang) → review+
No longer blocks: 880780
Blocks: 908276
Duplicate of this bug: 901219
Blocks: 901220
Blocks: 909851
(In reply to Sotaro Ikeda [:sotaro] from comment #40)
> 
> TextureClient's destructor needs to be called synchronously. It is a very
> strong requirement from OMXCodec.
> 
> http://mxr.mozilla.org/mozilla-b2g18/source/gfx/layers/ImageContainer.cpp#171
> 
> > I would prefer to fix the frame ordering problem as a followup if that's
> > okay with you because it is most likely going to need some more involved
> > logic.
> 
> It's Okey.

I changed Bug 901224 to " ImageBridgeChild returns gralloc buffers too early to GonkNativeWindow".
Blocks: 901224
Blocks: 909746
No longer blocks: 909746
Blocks: 909564
Attachment #794684 - Flags: review?(bjacob) → review+
Comment on attachment 795596 [details] [diff] [review]
Part 1 - New gralloc textures (carrying sotaro: r+)

Review of attachment 795596 [details] [diff] [review]:
-----------------------------------------------------------------

I only reviewed the TextureClient and TextureHost files, assuming that Peter and Sotaro covered the rest.

I have many little nits, but I trust that you can address them and land without another round of reviews, since this is time-sensitive. I didn't find any definite bug.

I couldn't easily tell what is new code and what is from existing code, so I sort of re-reviewed these files entirely; you decide what applies to your patch vs. existing code. Please file follow-ups as appropriate.

::: gfx/layers/opengl/GrallocTextureClient.cpp
@@ +69,5 @@
> +{
> +  // XXX- it would be cleaner to take the openMode into account or to check
> +  // that aMode is coherent with mGrallocFlags (which carries more information
> +  // than OpenMode).
> +  int32_t rv = mGraphicBuffer->lock(mGrallocFlags, &(void*&)mMappedBuffer);

Never use C-style casts on pointers.

Here you are using references in a tricky way to ensure that you're actually taking the address of mMappedBuffer, not of a temporary copied from it. A simpler way would be to take its address right away, and convert _that_ to void**.

int32_t rv = mGraphicBuffer->lock(mGrallocFlags, reinterpret_cast<void**>(&mMappedBuffer));
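The suggested cast can be illustrated with a self-contained sketch (the mock lock function and both names below are hypothetical, not Gecko or Android API; `MockLock` stands in for `GraphicBuffer::lock`, which writes the mapped address through an out-parameter):

```cpp
// Mock stand-in for GraphicBuffer::lock(): writes the mapped address
// through the void** out-parameter and returns 0 on success.
int MockLock(void** aOut) {
    static int sPixels = 0;  // pretend this is the mapped pixel memory
    *aOut = &sPixels;
    return 0;
}

// Taking the member's address first and converting *that* to void**
// (as suggested above) makes it obvious that the lock writes into the
// caller's pointer, not into a temporary copied from it.
int LockInto(unsigned char*& aMappedBuffer) {
    return MockLock(reinterpret_cast<void**>(&aMappedBuffer));
}
```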

@@ +71,5 @@
> +  // that aMode is coherent with mGrallocFlags (which carries more information
> +  // than OpenMode).
> +  int32_t rv = mGraphicBuffer->lock(mGrallocFlags, &(void*&)mMappedBuffer);
> +  if (rv) {
> +    NS_WARNING("Couldn't lock graphic buffer");

This should never happen, right? So I'd crash debug builds on that. Easier debugging than having to figure out why something is not correctly rendered on screen.

@@ +172,5 @@
> +
> +bool
> +GrallocTextureClientOGL::IsAllocated() const
> +{
> +  return !!mGraphicBuffer.get();

Both the !! and the .get() are useless here.

::: gfx/layers/opengl/GrallocTextureClient.h
@@ +17,5 @@
> +
> +/**
> + * A TextureClient implementation to share TextureMemory that is already
> + * on the GPU, for the OpenGL backend, using a platform-specific optimization
> + * called Gralloc

Ouch, this could be better phrased. "TextureMemory"?  Gralloc an "optimization"?

(And we should gradually switch away from calling Android GraphicBuffer "Gralloc").

@@ +33,5 @@
> +  GrallocTextureClientOGL(CompositableClient* aCompositable,
> +                          gfx::SurfaceFormat aFormat,
> +                          TextureFlags aFlags = TEXTURE_FLAGS_DEFAULT);
> +
> +  ~GrallocTextureClientOGL();

Same comment as in GrallocTextureHost.h:

 - why is the dtor not marked virtual MOZ_OVERRIDE like other methods?
 - please consider const-qualifying various methods in a follow-up.

::: gfx/layers/opengl/GrallocTextureHost.cpp
@@ +2,5 @@
> +//  * This Source Code Form is subject to the terms of the Mozilla Public
> + * License, v. 2.0. If a copy of the MPL was not distributed with this
> + * file, You can obtain one at http://mozilla.org/MPL/2.0/. */
> +
> +#ifdef MOZ_WIDGET_GONK

This ifdef really needs to go elsewhere, e.g. in a Makefile.

@@ +9,5 @@
> +#include "GLContext.h"
> +#include "gfxImageSurface.h"
> +#include "gfx2DGlue.h"
> +#include <ui/GraphicBuffer.h>
> +#include "mozilla/layers/CompositorOGL.h"

Please group the #includes by similarity e.g. the mozilla/layers/ ones together.

@@ +112,5 @@
> +  // This is a temporary fix to a bad lock/unlock race with the camera.
> +  // It is bad for performances so we need to find a better way asap.
> +  DeallocateDeviceData();
> +
> +  GLuint tex = mCompositor->GetTemporaryTexture(aTextureUnit);

Wait, I'm not sure I get it. As we discussed, there needs to be separate temporary textures for each texture target. Is this intended to be taken care of in a follow-up bug?

@@ +127,5 @@
> +
> +bool
> +GrallocTextureSourceOGL::IsValid() const
> +{
> +  return !!gl() && !!mGraphicBuffer.get();

The !! are useless here, as && takes boolean operands, right?

@@ +139,5 @@
> +
> +GLenum
> +GrallocTextureSourceOGL::GetTextureTarget() const
> +{
> +  MOZ_ASSERT(mGraphicBuffer.get());

Is this .get() really necessary? I thought that the implicit conversion-to-T* operator would be sufficient here.

@@ +145,5 @@
> +}
> +
> +gfx::SurfaceFormat
> +GrallocTextureSourceOGL::GetFormat() const MOZ_OVERRIDE {
> +  if (!mGraphicBuffer.get()) {

Same.

@@ +151,5 @@
> +  }
> +  if (GetTextureTarget() == LOCAL_GL_TEXTURE_EXTERNAL) {
> +    return gfx::FORMAT_R8G8B8A8;
> +  }
> +  return mFormat;

Thinking about this function... maybe we just shouldn't have a mFormat member that is a lie half of the time? We can get this information cheaply anyways from the mGraphicBuffer.

@@ +158,5 @@
> +gfx::IntSize
> +GrallocTextureSourceOGL::GetSize() const
> +{
> +  if (!IsValid()) {
> +    NS_WARNING("Trying to access the size of an invalid GrallocTextureSourceOGL");

I'm uncomfortable about this being a warning as opposed to an assertion. Are we really expecting to use invalid texture sources that often? If this is from preexisting code, it's worth filing a follow-up bug.

@@ +165,5 @@
> +  return gfx::IntSize(mGraphicBuffer->getWidth(), mGraphicBuffer->getHeight());
> +}
> +
> +void
> +GrallocTextureSourceOGL::DeallocateDeviceData()

Oh, I had no idea what DeallocateDeviceData did based on its name.

If you think that it's a good name, please explain what 'device' means here in general. Otherwise, please find another terminology :-)

I see that this is already in mozilla-central.

@@ +171,5 @@
> +  if (mEGLImage) {
> +    MOZ_ASSERT(gl());
> +    gl()->MakeCurrent();
> +    gl()->DestroyEGLImage(mEGLImage);
> +    mEGLImage = 0;

LOCAL_EGL_NO_IMAGE_KHR

@@ +184,5 @@
> +  mGrallocActor =
> +    static_cast<GrallocBufferActor*>(aDescriptor.bufferParent());
> +
> +  android::sp<android::GraphicBuffer> graphicBuffer =
> +    mGrallocActor->GetGraphicBuffer();

For local variables like this, there's no need for a ref-counting pointer; a plain android::GraphicBuffer* will do (and avoids needless refcounting work).

@@ +191,5 @@
> +  gfx::SurfaceFormat format =
> +    SurfaceFormatForAndroidPixelFormat(graphicBuffer->getPixelFormat(),
> +                                     aFlags & TEXTURE_RB_SWAPPED);
> +  mTextureSource = new GrallocTextureSourceOGL(nullptr,
> +                                               graphicBuffer.get(),

Needless .get() here.

@@ +284,5 @@
> +  gl()->fBindTexture(GetTextureTarget(), tex);
> +  if (!mEGLImage) {
> +    mEGLImage = gl()->CreateEGLImageForNativeBuffer(mGraphicBuffer->getNativeBuffer());
> +  }
> +  gl()->fEGLImageTargetTexture2D(GetTextureTarget(), mEGLImage);

Can't this all use the above BindTexture() method, rather than duplicating most of its code?

Also, if BindTexture() is so careful to restore the current active texture to TEXTURE0, why doesn't this method do the same?

@@ +290,5 @@
> +  nsRefPtr<gfxImageSurface> surf = IsValid() ?
> +    gl()->GetTexImage(tex,
> +                      false,
> +                      GetFormat())
> +    : nullptr;

The indentation in this multi-line ?: is really too funky. In general, try to have ? and : vertically aligned, and try to have the operands fit on one line each.

::: gfx/layers/opengl/GrallocTextureHost.h
@@ +4,5 @@
> + * file, You can obtain one at http://mozilla.org/MPL/2.0/. */
> +
> +#ifndef MOZILLA_GFX_GRALLOCTEXTUREHOST_H
> +#define MOZILLA_GFX_GRALLOCTEXTUREHOST_H
> +#ifdef MOZ_WIDGET_GONK

Can't you remove this MOZ_WIDGET_GONK? If this all needs to be inside an #ifdef, it's more customary to have the ifdef around the #include on the side of the file that does the #include.

@@ +64,5 @@
> +protected:
> +  CompositorOGL* mCompositor;
> +  android::sp<android::GraphicBuffer> mGraphicBuffer;
> +  EGLImage mEGLImage;
> +  gfx::SurfaceFormat mFormat;

See comment in GetFormat(): mFormat is redundant information (with mGraphicBuffer) and is a lie half of the time. Remove it?

@@ +75,5 @@
> +  GrallocTextureHostOGL(uint64_t aID,
> +                        TextureFlags aFlags,
> +                        const NewSurfaceDescriptorGralloc& aDescriptor);
> +
> +  ~GrallocTextureHostOGL();

Since you're putting virtual and MOZ_OVERRIDE on all the other overrides below, why not also on the destructor?

@@ +105,5 @@
> +
> +  virtual already_AddRefed<gfxImageSurface> GetAsSurface() MOZ_OVERRIDE;
> +
> +#ifdef MOZ_LAYERS_HAVE_LOG
> +  virtual const char* Name() { return "GrallocTextureHostOGL"; }

This is just a nit and I realize it's a preexisting problem, but many of the methods here, like Name(), should really be const-qualified.
Attachment #795596 - Flags: review?(bjacob) → review+
(In reply to Benoit Jacob [:bjacob] from comment #50)
> Comment on attachment 795596 [details] [diff] [review]
> Part 1 - New gralloc textures (carrying sotaro: r+)
> 
> Review of attachment 795596 [details] [diff] [review]:
> -----------------------------------------------------------------
> @@ +71,5 @@
> > +  // that aMode is coherent with mGrallocFlags (which carries more information
> > +  // than OpenMode).
> > +  int32_t rv = mGraphicBuffer->lock(mGrallocFlags, &(void*&)mMappedBuffer);
> > +  if (rv) {
> > +    NS_WARNING("Couldn't lock graphic buffer");
> 
> This should never happen, right? So I'd crash debug builds on that. Easier
> debugging than having to figure out why something is not correctly rendered
> on screen.

I'll add this as a followup. I don't want to go through another round of testing to make sure the patch will not get backed out.

> 
> @@ +172,5 @@
> > +
> > +bool
> > +GrallocTextureClientOGL::IsAllocated() const
> > +{
> > +  return !!mGraphicBuffer.get();
> 
> Both the !! and the .get() are useless here.

error: cannot convert 'const android::sp<android::GraphicBuffer>' to 'bool' in return

Also, I dislike converting pointers to bool in a return statement because it obfuscates the meaning of the code. In short functions like this it's okay, because the function signature is close enough for your eyes to catch it, but then we get into the (bad) habit of doing it everywhere, and in long function bodies you end up reading "return mFooBar" and thinking the function returns a FooBar when it is just returning a boolean.

The !! is syntactically redundant, but it makes the code more explicit and understandable without adding complexity.

> @@ +33,5 @@
> > +  GrallocTextureClientOGL(CompositableClient* aCompositable,
> > +                          gfx::SurfaceFormat aFormat,
> > +                          TextureFlags aFlags = TEXTURE_FLAGS_DEFAULT);
> > +
> > +  ~GrallocTextureClientOGL();
> 
> Same comment as in GrallocTextureHost.h:
> 
>  - why is the dtor not marked virtual MOZ_OVERRIDE like other methods?

MOZ_OVERRIDE is useful where you risk getting the method prototype wrong. With the ~ in the name it's pretty safe. Plus, dtors are a bit special when it comes to virtual calls, and "override" sort of conveys the idea that it replaces the inherited version.
> @@ +112,5 @@
> > +  // This is a temporary fix to a bad lock/unlock race with the camera.
> > +  // It is bad for performances so we need to find a better way asap.
> > +  DeallocateDeviceData();
> > +
> > +  GLuint tex = mCompositor->GetTemporaryTexture(aTextureUnit);
> 
> Wait, I'm not sure I get it. As we discussed, there needs to be separate
> temporary textures for each texture target. Is this intended to be taken
> care of in a follow-up bug?

The patch that does that has not landed yet, and I was under the impression that Jeff is working on a better way to unlock things (a temporary texture per compositable rather than per Compositor), because even the current way has problems (even with the other patch).

> 
> @@ +127,5 @@
> > +
> > +bool
> > +GrallocTextureSourceOGL::IsValid() const
> > +{
> > +  return !!gl() && !!mGraphicBuffer.get();
> 
> The !! are useless here, as && takes boolean operands, right?

They are not for sp<GraphicBuffer> as I mentioned above.
And I like to explicitly do the boolean cast as I mentioned above too.

> 
> @@ +139,5 @@
> > +
> > +GLenum
> > +GrallocTextureSourceOGL::GetTextureTarget() const
> > +{
> > +  MOZ_ASSERT(mGraphicBuffer.get());
> 
> Is this .get() really necessary? I thought that the implicit
> conversion-to-T* operator would be sufficient here.

Same thing.

> @@ +145,5 @@
> > +}
> > +
> > +gfx::SurfaceFormat
> > +GrallocTextureSourceOGL::GetFormat() const MOZ_OVERRIDE {
> > +  if (!mGraphicBuffer.get()) {
> 
> Same.

counter-same

> 
> @@ +151,5 @@
> > +  }
> > +  if (GetTextureTarget() == LOCAL_GL_TEXTURE_EXTERNAL) {
> > +    return gfx::FORMAT_R8G8B8A8;
> > +  }
> > +  return mFormat;
> 
> Thinking about this function... maybe we just shouldn't have a mFormat
> member that is a lie half of the time? We can get this information cheaply
> anyways from the mGraphicBuffer.

This is not really in my comfort zone, but my understanding is that we need to use RGBA shaders for external texture formats, and this relies on GetFormat returning RGBA.

> 
> @@ +158,5 @@
> > +gfx::IntSize
> > +GrallocTextureSourceOGL::GetSize() const
> > +{
> > +  if (!IsValid()) {
> > +    NS_WARNING("Trying to access the size of an invalid GrallocTextureSourceOGL");
> 
> I'm uncomfortable about this being a warning as opposed to an assertion. Are
> we really expecting to be using invalid texturesources that often? If this
> is from preexisting code, it's worth filing a follow-up bug.

I am okay with making our checks stricter as a followup, but cases like this one are tricky because it is very hard to prove that this can never happen, given all the asynchrony in play. So it is something I would play with locally but not land in the middle of a rush period (rather land it after the uplift instead).

> 
> @@ +165,5 @@
> > +  return gfx::IntSize(mGraphicBuffer->getWidth(), mGraphicBuffer->getHeight());
> > +}
> > +
> > +void
> > +GrallocTextureSourceOGL::DeallocateDeviceData()
> 
> Oh, I had no idea what DeallocateDeviceData did based on its name.
> 
> If you think that it's a good name, please explain what 'device' means here
> in general. Otherwise, please find another terminology :-)
> 
> I see that this is already in mozilla-central.

There is a distinction between "shared data" (stuff that is shared between processes, like shmem) and "device data" (stuff that belongs to lower-level APIs, like OpenGL textures). The term "device" perhaps makes more sense in D3D terminology, but I can't think of a better name just now (agreed, I am not good at naming things).

> 
> @@ +171,5 @@
> > +  if (mEGLImage) {
> > +    MOZ_ASSERT(gl());
> > +    gl()->MakeCurrent();
> > +    gl()->DestroyEGLImage(mEGLImage);
> > +    mEGLImage = 0;
> 
> LOCAL_EGL_NO_IMAGE_KHR

This means I need to include gfx/angle/include/EGL/eglext.h just for something that casts 0 into a void* :(

> > +  gl()->fBindTexture(GetTextureTarget(), tex);
> > +  if (!mEGLImage) {
> > +    mEGLImage = gl()->CreateEGLImageForNativeBuffer(mGraphicBuffer->getNativeBuffer());
> > +  }
> > +  gl()->fEGLImageTargetTexture2D(GetTextureTarget(), mEGLImage);
> 
> Can't this all use the above BindTexture() method, rather than duplicating
> most of its code?
> 

Right now BindTexture does the temporary hack of calling DeallocateDeviceData to avoid locking issues. Let's factor out that code when we get rid of this.

> ::: gfx/layers/opengl/GrallocTextureHost.h
> @@ +4,5 @@
> > + * file, You can obtain one at http://mozilla.org/MPL/2.0/. */
> > +
> > +#ifndef MOZILLA_GFX_GRALLOCTEXTUREHOST_H
> > +#define MOZILLA_GFX_GRALLOCTEXTUREHOST_H
> > +#ifdef MOZ_WIDGET_GONK
> 
> Can't you remove this MOZ_WIDGET_GONK? If this all needs to be inside an
> #ifdef, it's more customary to have the ifdef around the #include on the
> side of the file that does the #include.

I much prefer to do this #ifdef dance once in the header rather than in every place we #include the header, because I tend to forget to do it and the error produced when I forget is hard to understand. And typically people make the mistake of forgetting the #ifdef while working on B2G, so they only find out later that they broke the other platforms, which adds to the confusion (like when they push to try and find in the morning that they need to fix it and push to try again).

If anything I'd rather do a followup to change the current custom of #ifdefing around the #include.
To those who need the new gralloc textures to fix their bugs: it has landed, but is preffed off because of an m3 regression. Do not block your work on this; just flip the pref locally.

Set "layers.use-deprecated-textures" to false on B2G.
Attached file build error log
After the patches were committed to m-c, I always fail to build for b2g nexus-4, as I wrote in Comment 45. This time I got all the source code from scratch and did a build, but it failed.
Might have a clue into the media mochitest failures:

With layers.use-deprecated-textures=false, playing the WebM video on this page,

  http://www.quirksmode.org/html5/tests/video.html

we have continuously increasing memory usage in both the B2G main process and the browser process. Looking more closely, we see that the memory usage is mostly under shmem-mapped, with a huge amount (I hit 177 MB at some point) growing simultaneously in both processes. There is NO abnormal gralloc memory usage.

Conclusion: we're likely leaking ordinary shmems (not gralloc) shared between the b2g main process and the browser process during WebM video playback.
Created Bug 910921 for the WebM playback problem.
Depends on: 910921
Blocks: 910928
(In reply to Sotaro Ikeda [:sotaro] from comment #57)
> Created attachment 797509 [details]
> build error log
> 
> After the patches are committed to m-c, I always fail to build for b2g
> nexus-4. As I have written in Comment 45. This time I get the all source
> code from scratch and did build. But failed it.

I do not see the build failure if the B2G build is for ICS gonk.
Nical, bug 910928 is for the build failure, can you fix it as soon as possible?
Flags: needinfo?(nical.bugzilla)
Flags: needinfo?(nical.bugzilla)
Depends on: 911941
Blocks: 913299
Status: REOPENED → RESOLVED
Closed: 6 years ago
Resolution: --- → FIXED
Whiteboard: [leave open]
Target Milestone: --- → mozilla26
No longer blocks: GFXB2G1.2
blocking-b2g: koi? → koi+