Closed Bug 885345 Opened 11 years ago Closed 11 years ago

HwcComposer2D doesn't render Camera or Video frames

Categories

(Firefox OS Graveyard :: General, defect)

ARM
Gonk (Firefox OS)
defect
Not set
normal

Tracking

(blocking-b2g:leo+, firefox23 wontfix, firefox24 wontfix, firefox25 fixed, b2g18 fixed, b2g18-v1.0.0 wontfix, b2g18-v1.0.1 wontfix, b2g-v1.1hd fixed)

RESOLVED FIXED
1.1 QE4 (15jul)
blocking-b2g leo+
Tracking Status
firefox23 --- wontfix
firefox24 --- wontfix
firefox25 --- fixed
b2g18 --- fixed
b2g18-v1.0.0 --- wontfix
b2g18-v1.0.1 --- wontfix
b2g-v1.1hd --- fixed

People

(Reporter: diego, Assigned: pchang)

References

Details

(Whiteboard: [LeoVB+])

Attachments

(2 files, 6 obsolete files)

Camera and Video are the key hardware composer use cases as it improved power performance on v1 and v1.1 Currently on the trunk I can't run the camera at all and video playback falls back to GPU rendering. This could be due to the composite layer refactoring.
I found playing video from browser didn't go through HWComporitor path. And the following log said "couldn't find the gralloc buffer from canvaslayer". 06-21 10:55:47.349 149 291 D HWComposer: Render aborted. Nothing was drawn to the screen 06-21 10:55:47.379 149 291 D HWComposer: CanvasLayer Layer doesn't have a gralloc buffer 06-21 10:55:47.379 149 291 D HWComposer: Render aborted. Nothing was drawn to the screen 06-21 10:55:47.419 149 291 D HWComposer: CanvasLayer Layer doesn't have a gralloc buffer
I just dump all layers when playing youtube video by using broswer. And the problem was the last one canvas layer which is the network icon with 16x16 size. And B2G didn't allocate graphic buffer for too small buffer under 64. Did we really need to use canvas layer for this network icon? http://mxr.mozilla.org/mozilla-central/source/gfx/layers/ipc/ShadowLayerUtilsGralloc.cpp#364 aLayer 0x487b3c00 name ContainerLayer fmt 0 vis(0 0 320 480) sf(0 0) visReg xxx offset0 (0 0) opacity 1.00 aLayer 0x487b4400 name ThebesLayerComposite fmt 0 vis(0 0 0 0) sf(0 0) visReg xxx offset0 (0 0) opacity 1.00 aLayer 0x487b5000 name ColorLayer fmt 0 vis(0 0 0 0) sf(0 0) visReg xxx offset0 (0 0) opacity 1.00 aLayer 0x487b4800 name ContainerLayer fmt 0 vis(0 0 320 480) sf(0 0) visReg xxx offset0 (0 0) opacity 1.00 aLayer 0x487b4c00 name ThebesLayerComposite fmt 4 vis(0 0 320 480) sf(320 480) visReg xxx offset1 (0 0) opacity 1.00 aLayer 0x488c1000 name RefLayer fmt 0 vis(0 0 320 460) sf(0 0) visReg xxx offset0 (0 0) opacity 1.00 aLayer 0x48860800 name ContainerLayer fmt 0 vis(0 0 320 460) sf(0 0) visReg xxx offset0 (0 0) opacity 1.00 aLayer 0x488bec00 name ThebesLayerComposite fmt 1 vis(0 0 0 0) sf(320 460) visReg xxx offset1 (0 0) opacity 1.00 aLayer 0x488c3400 name ContainerLayer fmt 0 vis(0 0 320 180) sf(0 0) visReg xxx offset0 (0 0) opacity 1.00 aLayer 0x48bbc400 name ImageLayer fmt 266 vis(0 0 640 360) sf(640 360) visReg xxx offset0 (0 0) opacity 1.00 aLayer 0x44a27c00 name CanvasLayer fmt 0 vis(0 0 16 16) sf(0 0) visReg xxx offset0 (0 0) opacity 1.00
The "network activity" icon was used canvas element. Removing the following line from system app, youtube went though HWComposer now. https://github.com/mozilla-b2g/gaia/blob/v1-train/apps/system/index.html#L343
That was introduced in bug 822345 to resolve a performance issue(!)
Yes, canvas allowed us to hit a fast path that images weren't hitting. We never got around to fixing bug 836372 which would have made sure animated images hit the same path...
Update more detail because index.html is still changing. Just remove the following line could solve the problem. <canvas id="statusbar-network-activity" width="16" height="16" hidden></canvas> https://github.com/mozilla-b2g/gaia/blob/v1-train/apps/system/index.html#L352 (In reply to peter chang[:pchang] from comment #3) > The "network activity" icon was used canvas element. > Removing the following line from system app, youtube went though HWComposer > now. > https://github.com/mozilla-b2g/gaia/blob/v1-train/apps/system/index.html#L343
Is this bug against v1 or trunk? (Seems v1, but comment 0 also mentions trunk, so I'm not sure). Are there known steps-to-reproduce?
It's important to note there are two different things happening here. (1) On v1.1 I see the "network activity" icon issue pchang mentions. Video frames are rendered by HWC when the icon is hidden and by GPU when it's displayed. (2) On trunk all video frames fallback to the GPU regardless of the "network activity" icon. It seems (1) is the root cause of bug 884188.
(In reply to Benoit Jacob [:bjacob] from comment #7) > Is this bug against v1 or trunk? (Seems v1, but comment 0 also mentions > trunk, so I'm not sure). The priority is to fix this on b2g18.
(In reply to peter chang[:pchang] from comment #2) > And B2G didn't allocate graphic buffer for too small buffer under 64. > Did we really need to use canvas layer for this network icon? Note that we changed that as a workaround for a driver bug in the Otoro IIRC. We also felt that it shouldn't have had an impact on performance but if it does we could just try to gralloc small textures again and see if the problem doesn't affect the devices we're using now. It would only be a matter of reverting the patch for bug 792966.
It seems to me we could render small canvases with HWC even if we're not using gralloc for the canvas directly, by copying the canvas contents into a gralloc texture for compositing. But I think we should look into 836372 more and try to fix that.
(In reply to Robert O'Callahan (:roc) (Mozilla Corporation) from comment #11) > It seems to me we could render small canvases with HWC even if we're not > using gralloc for the canvas directly, by copying the canvas contents into a > gralloc texture for compositing. Unfortunately I fear we'll have to go there unless we can obtain an updated driver. I just tried reverting the workaround in bug 792966 and the driver issue is still there, even on newer devices such as the Leo. > But I think we should look into 836372 more and try to fix that. Yes, but I would also like not to have workarounds in our codebase for driver bugs; as we've seen in this bug and bug 884188 you never know when they'll be coming back to bite you.
Depends on: 888184
I've prepared a PR in bug 888184 (https://github.com/mozilla-b2g/gaia/pull/10724) that removes the canvas elements from the status-bar and puts animated images in their place; could someone try to apply it and see if it fixes this issue? It's against gaia/master but I'm preparing a version for gaia/v1-train too.
I've landed fixes that remove the small canvases from both gaia/master and v1-train; Peter could you verify if hardware composition is now working properly on the aforementioned scenarios?
Flags: needinfo?(pchang)
With new gaia patch, the problem still existed. Because it still used a small 16x16 layer for the network activity icon and the workaround from bug 792966 didn't create gralloc buffer for it. Diego, About the 64 GL rendering limitation, I still found it with my Leo device. But HWComposer (using copybit) could handle it well. Any comment? I/Gecko ( 1685): aLayer 0x4cb83090 name ShadowContainerLayer fmt 0 vis(0 0 16 16) sf(0 0) visReg xxx buffer 0x0 offset0 (0 0) opacity 1.00 I/Gecko ( 1685): aLayer 0x4cb83890 name ShadowThebesLayer fmt 0 vis(0 0 0 0) sf(0 0) visReg xxx buffer 0x0 offset0 (0 0) opacity 1.00 I/Gecko ( 1685): aLayer 0x4cba9c90 name ShadowImageLayer fmt 0 vis(0 0 16 16) sf(0 0) visReg xxx buffer 0x0 offset0 (0 0) opacity 1.00 D/HWComposer( 1685): ShadowImageLayer Layer doesn't have a gralloc buffer (In reply to Gabriele Svelto [:gsvelto] from comment #14) > I've landed fixes that remove the small canvases from both gaia/master and > v1-train; Peter could you verify if hardware composition is now working > properly on the aforementioned scenarios?
Flags: needinfo?(pchang) → needinfo?(dwilson)
This is a workaround patch based on :pchang's suggestion (see attachment 768825 [details] [diff] on bug 884188) that applies to mozilla-b2g18. From my testing it works around the problem effectively and works on both the Leo and other devices (Otoro/Unagi). Unfortunately the same approach doesn't seem to work on mozilla-central. I've tried numerous workarounds that enlarged the gralloc'd buffer but they all seem to fail. Even removing the previous workaround entirely and forcing all unmodified surfaces to be gralloc'd causes issues with the following messages being printed on the logcat output everytime an animation frame is redrawn: D/HWComposer( 1442): ImageLayer Layer doesn't have a gralloc buffer Sometimes this appears coupled with the following message which seems to be coming from vendor libraries: E/msm7627a.gralloc( 1471): gralloc_lock: genlock_lock_buffer (lockType=0x2) failed On other occasions I've seen this messages instead: D/HWComposer( 843): ThebesLayerComposite Layer doesn't have a gralloc buffer D/HWComposer( 843): Render aborted. Nothing was drawn to the screen Or even: D/HWComposer( 843): ContainerLayer Layer has planar semitransparency which is unsupported D/HWComposer( 843): Render aborted. Nothing was drawn to the screen All in all it seems we need another approach to fix this on central. BTW all tests were done with gaia/master.
Attachment #772465 - Flags: review?(bjacob)
(In reply to Gabriele Svelto [:gsvelto] from comment #16) > Created attachment 771989 [details] [diff] [review] > [WIP PATCH mozilla-b2g18] Force small surfaces to be allocated via gralloc > and enlarge them to work around display issues > > This is a workaround patch based on :pchang's suggestion (see attachment > 768825 [details] [diff] [review] on bug 884188) that applies to > mozilla-b2g18. From my testing it works around the problem effectively and > works on both the Leo and other devices (Otoro/Unagi). > > Unfortunately the same approach doesn't seem to work on mozilla-central. > I've tried numerous workarounds that enlarged the gralloc'd buffer but they > all seem to fail. Even removing the previous workaround entirely and forcing > all unmodified surfaces to be gralloc'd causes issues with the following > messages being printed on the logcat output everytime an animation frame is > redrawn: > > D/HWComposer( 1442): ImageLayer Layer doesn't have a gralloc buffer > The patches in comment 16/17 should be fixed no gralloc buffer for ImageLayer. Correct the buffer validation inside GetRenderState(), it could pass gralloc buffer to composer module. https://bugzilla.mozilla.org/show_bug.cgi?id=885345#c17 > Sometimes this appears coupled with the following message which seems to be > coming from vendor libraries: > > E/msm7627a.gralloc( 1471): gralloc_lock: genlock_lock_buffer (lockType=0x2) > failed > > On other occasions I've seen this messages instead: > > D/HWComposer( 843): ThebesLayerComposite Layer doesn't have a gralloc buffer > D/HWComposer( 843): Render aborted. Nothing was drawn to the screen This error was still existed after applying my patch. Need to investigate more detail. > > Or even: > > D/HWComposer( 843): ContainerLayer Layer has planar semitransparency which > is unsupported > D/HWComposer( 843): Render aborted. Nothing was drawn to the screen > > All in all it seems we need another approach to fix this on central. BTW all > tests were done with gaia/master.
(In reply to peter chang[:pchang] from comment #18) > The patches in comment 16/17 should be fixed no gralloc buffer for > ImageLayer. > Correct the buffer validation inside GetRenderState(), it could pass gralloc > buffer to composer module. > https://bugzilla.mozilla.org/show_bug.cgi?id=885345#c17 I've folded your changes into a new patch that applies to mozilla-central. I've tested it quickly on my Leo device and it seems to work. I also don't get any error/warning messages in the adb logcat output. I will try and test it on another device too. My only gripe is that I have very limited knowledge of the layers code and I'd like somebody with more experience to verify that this is the right approach to work around this issue.
(In reply to Gabriele Svelto [:gsvelto] from comment #19) > Created attachment 772640 [details] [diff] [review] > [WIP PATCH mozilla-central] Resize small gralloc'd surfaces to work around > issues when drawing small canvases on certain devices > > (In reply to peter chang[:pchang] from comment #18) > > The patches in comment 16/17 should be fixed no gralloc buffer for > > ImageLayer. > > Correct the buffer validation inside GetRenderState(), it could pass gralloc > > buffer to composer module. > > https://bugzilla.mozilla.org/show_bug.cgi?id=885345#c17 > > I've folded your changes into a new patch that applies to mozilla-central. > I've tested it quickly on my Leo device and it seems to work. I also don't > get any error/warning messages in the adb logcat output. I will try and test > it on another device too. > I would prefer to separate patch to two parts, because attachment 772465 [details] [diff] [review] just modified to the expected behavior. But we may revert the patch "resize gralloc surface" in the future. > My only gripe is that I have very limited knowledge of the layers code and > I'd like somebody with more experience to verify that this is the right > approach to work around this issue.
Attachment #772465 - Flags: review?(ncameron)
Comment on attachment 772465 [details] [diff] [review] Correct the validation inside GetRenderState() Review of attachment 772465 [details] [diff] [review]: ----------------------------------------------------------------- LGTM ::: gfx/layers/opengl/TextureHostOGL.cpp @@ +853,5 @@ > if (mIsRBSwapped) { > flags |= LAYER_RENDER_STATE_FORMAT_RB_SWAP; > } > > + nsIntSize bufSize(mGraphicBuffer->getWidth(), mGraphicBuffer->getHeight()); nit: I prefer bufferSize, no point in saving a few chars
Attachment #772465 - Flags: review?(ncameron) → review+
Assignee: nobody → pchang
Uploaded patch for b2g18 which was a little different with m-c patch. It will pass valid SurfaceDesctitpor to composer module and composer will check buffer is valid or not. http://mxr.mozilla.org/mozilla-b2g18/source/widget/gonk/HwcComposer2D.cpp#344
Attachment #773022 - Flags: review?(ncameron)
update variable name to bufferSize.
Attachment #772465 - Attachment is obsolete: true
Attachment #772465 - Flags: review?(bjacob)
Attachment #773025 - Flags: review+
Comment on attachment 773022 [details] [diff] [review] [b2g18]Correct the buffer validation inside GetRenderState() Review of attachment 773022 [details] [diff] [review]: ----------------------------------------------------------------- I think this is OK, but why was it going wrong before? The descriptor had a type which was not none, but was not SurfaceDescriptor::TSharedTextureDescriptor? What type was it? And why don't other parts of CanvasLayerOGL go wrong?
Right now only HwcComposer uses GetRenderState() and it will check surface type is TSurfaceDescriptorGralloc or not to get the right gralloc buffer. B2G didn't use SurfaceDescriptor::TSharedTextureDescriptor when using OOPC. http://mxr.mozilla.org/mozilla-b2g18/source/widget/gonk/HwcComposer2D.cpp#344 343 if (state.mSurface && 344 state.mSurface->type() == SurfaceDescriptor::TSurfaceDescriptorGralloc) { 345 surfaceSize = state.mSurface->get_SurfaceDescriptorGralloc().size(); 346 } (In reply to Nick Cameron [:nrc] from comment #24) > Comment on attachment 773022 [details] [diff] [review] > [b2g18]Correct the buffer validation inside GetRenderState() > > Review of attachment 773022 [details] [diff] [review]: > ----------------------------------------------------------------- > > I think this is OK, but why was it going wrong before? The descriptor had a > type which was not none, but was not > SurfaceDescriptor::TSharedTextureDescriptor? What type was it? And why don't > other parts of CanvasLayerOGL go wrong?
(In reply to peter chang[:pchang] from comment #25) > Right now only HwcComposer uses GetRenderState() and it will check surface > type is TSurfaceDescriptorGralloc or not to get the right gralloc buffer. > > B2G didn't use SurfaceDescriptor::TSharedTextureDescriptor when using OOPC. > If this is correct, then you should change IsValidSharedTexDescriptor or find and change all its other uses. For example, if we can't go the Hwc path and we render normally, then we won't be able to render because of this check http://mxr.mozilla.org/mozilla-b2g18/source/gfx/layers/opengl/CanvasLayerOGL.cpp#492 (unless we have mTexImage, but I'm not sure that we will).
(In reply to Nick Cameron [:nrc] from comment #26) > (In reply to peter chang[:pchang] from comment #25) > > Right now only HwcComposer uses GetRenderState() and it will check surface > > type is TSurfaceDescriptorGralloc or not to get the right gralloc buffer. > > > > B2G didn't use SurfaceDescriptor::TSharedTextureDescriptor when using OOPC. > > > > If this is correct, then you should change IsValidSharedTexDescriptor or > find and change all its other uses. For example, if we can't go the Hwc path > and we render normally, then we won't be able to render because of this > check If I understand correctly, SharedTexDescriptor is used for inside process flow and SurfaceDescriptorGralloc is used for cross process flow. And we can let composer choose which kind of surfacedescriptor it wants to use. > http://mxr.mozilla.org/mozilla-b2g18/source/gfx/layers/opengl/CanvasLayerOGL. > cpp#492 (unless we have mTexImage, but I'm not sure that we will). I used gdb to attach my leo device and confirmed mTexImage was not null. Therefore, RenderLayer still executed with TSurfaceDescriptorGralloc type. We might need to change the logic check in line CanvasLayerOGL.cpp#492.
Comment on attachment 773022 [details] [diff] [review] [b2g18]Correct the buffer validation inside GetRenderState() Review of attachment 773022 [details] [diff] [review]: ----------------------------------------------------------------- r=me. It looks like we ought to always have mTexImage from the def of Swap() and your experiment confirms that we do. I think all other uses of IsValidSharedTexDescriptor already take into account that we might have a valid, different kind of SurfaceDescriptor.
Attachment #773022 - Flags: review?(ncameron) → review+
Attachment #771989 - Attachment is obsolete: true
Attachment #772640 - Attachment is obsolete: true
add reviewer
Attachment #773743 - Flags: review+
Attachment #773022 - Attachment is obsolete: true
I have no idea how you generated this patch, but it is not in an applicable format for landing. Please make sure it follows the instructions below. https://developer.mozilla.org/en-US/docs/Mercurial_FAQ#How_can_I_generate_a_patch_for_somebody_else_to_check-in_for_me.3F Also, this can't land on b2g18 until it has leo+ blocking status.
blocking-b2g: --- → leo?
Keywords: checkin-needed
(In reply to Ryan VanderMeulen [:RyanVM UTC-4] from comment #32) > I have no idea how you generated this patch, but it is not in an applicable > format for landing. It looks like the patch contains ANSI escape codes for colors. > Also, this can't land on b2g18 until it has leo+ blocking status. This is blocking a leo+ bug so it should be leo+ as well, right?
Update correct patch format
Attachment #773743 - Attachment is obsolete: true
Attachment #774004 - Flags: review+
(In reply to Gabriele Svelto [:gsvelto] from comment #33) > (In reply to Ryan VanderMeulen [:RyanVM UTC-4] from comment #32) > > I have no idea how you generated this patch, but it is not in an applicable > > format for landing. > > It looks like the patch contains ANSI escape codes for colors. > > > Also, this can't land on b2g18 until it has leo+ blocking status. > > This is blocking a leo+ bug so it should be leo+ as well, right? I just uploaded correct patches again. leo, we need to land this bug to solve bug 884188. It should be leo+ also.
Flags: needinfo?(leo.bugzilla.gecko)
blocking-b2g: leo? → leo+
Flags: needinfo?(leo.bugzilla.gecko)
Status: NEW → RESOLVED
Closed: 11 years ago
Resolution: --- → FIXED
I just came back from vacation, this is still needinfo? me from 10 days ago, but already RESOLVED FIXED. I assume by default that I'm no longer needed.
Flags: needinfo?(bjacob)
Whiteboard: [LeoVB+]
Flags: needinfo?(dwilson)
See Also: → 1139541
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: