Closed Bug 987563 Opened 7 years ago Closed 7 years ago

SIGSEGV in HwcComposer2D::Commit() due to lack of both colorFill and RB-Swap

Categories

(Firefox OS Graveyard :: GonkIntegration, defect)

ARM
Gonk (Firefox OS)
defect
Not set
normal

Tracking

(blocking-b2g:1.4+, firefox29 wontfix, firefox30 wontfix, firefox31 fixed, b2g-v1.4 fixed, b2g-v2.0 fixed)

RESOLVED FIXED
1.4 S5 (11apr)
blocking-b2g 1.4+
Tracking Status
firefox29 --- wontfix
firefox30 --- wontfix
firefox31 --- fixed
b2g-v1.4 --- fixed
b2g-v2.0 --- fixed

People

(Reporter: schiu, Assigned: sushilchauhan)

References

Details

Attachments

(1 file, 3 obsolete files)

When both colorFill and RB-Swap are not supported in HWC, the variable mHwcayerMap will always be empty. Indexing mHwcLayerMap without checking its size leads to segmentation fault.
Summary: SIGSEG in HwcComposer2D::Commit() due to lack of both colorFill and RB-Swap → SIGSEGV in HwcComposer2D::Commit() due to lack of both colorFill and RB-Swap
\
Assignee: nobody → schiu
Attached patch check_hwclayermap.patch (obsolete) — Splinter Review
Attachment #8396244 - Flags: review?(sotaro.ikeda.g)
Attachment #8396244 - Flags: review?(hshih)
Attachment #8396244 - Flags: review?(sushilchauhan)
in file vendor/sprd/open-source/libs/hwcomposer$ gedit SprdUtil.cpp

refer to function  SprdUtil::composerLayers

for rgba8888 format,we can set y_word_endn/a_swap_mode value to control whether RB was swapped.

            gsp_cfg_info.layer_des_info.img_format = GSP_DST_FMT_ARGB888;
            gsp_cfg_info.layer_des_info.endian_mode.y_word_endn = GSP_WORD_ENDN_1;
            gsp_cfg_info.layer_des_info.endian_mode.a_swap_mode = GSP_A_SWAP_RGBA;

for gba565 we user 
            gsp_cfg_info.layer_des_info.endian_mode.rgb_swap_mode = GSP_RGB_SWP_BGR;

The values are defined in file  kernel/include/video/gsp_types_shark.h
Hmm, it is a wired situation. "mHwcLayerMap is empty" says that there is nothing to render, but mList->numHwLayers has valid number. There seems inconsistency. I think this check should not be necessary. mHwcLayerMap size and mList->numHwLayers needs to be consistent.

They are incremented by the following.
http://mxr.mozilla.org/mozilla-central/source/widget/gonk/HwcComposer2D.cpp#496

schiu, do you know why such inconsistent state between mHwcLayerMap and mList->numHwLayers happens? Such thing should not happen by default gecko code.
Flags: needinfo?(schiu)
Comment on attachment 8396244 [details] [diff] [review]
check_hwclayermap.patch

Review of attachment 8396244 [details] [diff] [review]:
-----------------------------------------------------------------

Which hwc HAL are you using?

::: widget/gonk/HwcComposer2D.cpp
@@ +682,5 @@
>              int fd = mList->hwLayers[j].releaseFenceFd;
>              mList->hwLayers[j].releaseFenceFd = -1;
>              sp<Fence> fence = new Fence(fd);
>  
> +            if (!mHwcLayerMap.IsEmpty()) {

When mColorFill and mRBSwapSupport are false and composition falls back to GPU, mList->numHwLayers is 2.
HWC layer[0] = HWC_SKIP_LAYER layer (Bogus layer since HAL expects at-least 1 App layer to avoid stale FB updates).
HWC layer[1] = FrameBuffer layer

HWC HAL should not set any release fence fd on layer[0] since it is SKIP layer, so it should have its default/unchanged value, which is -1. So you should never enter the if block (line 681) in this case:
| if (mList->hwLayers[j].releaseFenceFd >= 0) { |
Attachment #8396244 - Flags: review?(sushilchauhan) → review-
Also dump the values of j, mList->numHwLayers and mList->hwLayers[j].releaseFenceFd before entering if block.
(In reply to Sotaro Ikeda [:sotaro] from comment #4)
> Hmm, it is a wired situation. "mHwcLayerMap is empty" says that there is
> nothing to render, but mList->numHwLayers has valid number. There seems
> inconsistency. I think this check should not be necessary. mHwcLayerMap size
> and mList->numHwLayers needs to be consistent.
> 
> They are incremented by the following.
> http://mxr.mozilla.org/mozilla-central/source/widget/gonk/HwcComposer2D.
> cpp#496
> 
> schiu, do you know why such inconsistent state between mHwcLayerMap and
> mList->numHwLayers happens? Such thing should not happen by default gecko
> code.

Sushil already answered above question in comment#5. And the releaseFenceFd with HWC_SKIP_LAYER set is also not set with -1 every time (with a chipset other than QCT). I think we need to discuss with the vendor further for handling HWC_SKIP_LAYER.
Flags: needinfo?(schiu)
schiu, can you provide the actual crash info? In which code, the crash happens? If the crash happens because "mHwcLayerMap[j]->GetLayer() is null pointer". The following code might be reasonable.

------------------------------

+           if (!mHwcLayerMap[j]->GetLayer()) {
+               continue;
+           }

            LayerRenderState state = mHwcLayerMap[j]->GetLayer()->GetRenderState();
            if (!state.mTexture) {
                continue;
            }
Flags: needinfo?(schiu)
Sotaro,

Change in Comment 8 would still crash because mHwcLayerMap is empty. Let's not add a check here. HWC HAL should not handle HWC_SKIP_LAYER and its release fence fd should be unchanged after hwc set call.
I misunderstood about the use case of the crash. HWC_SKIP_LAYER is set only when OpenGL composition is done. In this case, HwcComposer2D::Render() is called.
Consistency between mHwcLayerMap and mList->numHwLayers as in Comment 4 exists only when composition is doen by HWC.
HwcComposer2D::Commit() does not recognize that it is called for HWC's layer composition or OpenGL's layer composition. So it seems ok to fix the problem by recognizing the difference.

Prevent the crash by "HWC HAL should not handle HWC_SKIP_LAYER" could fix the crash. But semantically, it is not a best way to fix the problem. Because HwcComposer2D::Commit() still does not recognize mList->hwLayers[0] is not for gfx layer when called from HwcComposer2D::Render().
I believe it is better to fix the root cause of crash than avoiding the crash. It does not add any value, if hwc HAL is setting release fence fd on GPU composed layers (HWC_SKIP_LAYER / HWC_FRAMEBUFFER).

Can you try this change on other HAL and see if you still see the crash ? In Render():
-        mList->hwLayers[0].compositionType = HWC_BACKGROUND;
+        mList->hwLayers[0].compositionType = HWC_FRAMEBUFFER;
Call stack for your reference:

-----
Program received signal SIGSEGV, Segmentation fault.
0xb54073fa in mozilla::HwcComposer2D::Commit (this=this@entry=0xb1d33e70) at ../../../../dolphin/gecko/widget/gonk/HwcComposer2D.cpp:676
676                 LayerRenderState state = mHwcLayerMap[j]->GetLayer()->GetRenderState();
(gdb) bt
#0  0xb54073fa in mozilla::HwcComposer2D::Commit (this=this@entry=0xb1d33e70) at ../../../../dolphin/gecko/widget/gonk/HwcComposer2D.cpp:676
#1  0xb5407530 in mozilla::HwcComposer2D::Render (this=0xb1d33e70, dpy=0x1, sur=<optimized out>)
    at ../../../../dolphin/gecko/widget/gonk/HwcComposer2D.cpp:618
#2  0xb51e3ec0 in mozilla::layers::CompositorOGL::EndFrame (this=0xafa78980) at ../../../../dolphin/gecko/gfx/layers/opengl/CompositorOGL.cpp:1271
#3  0xb51d21e2 in mozilla::layers::LayerManagerComposite::Render (this=this@entry=0xafeb4350)
    at ../../../../dolphin/gecko/gfx/layers/composite/LayerManagerComposite.cpp:511
#4  0xb51d2290 in EndTransaction (aFlags=<optimized out>, this=0xafeb4350, aCallback=<optimized out>, aCallbackData=<optimized out>)
    at ../../../../dolphin/gecko/gfx/layers/composite/LayerManagerComposite.cpp:245
#5  mozilla::layers::LayerManagerComposite::EndTransaction (this=0xafeb4350, aCallback=<optimized out>, aCallbackData=<optimized out>, 
    aFlags=<optimized out>) at ../../../../dolphin/gecko/gfx/layers/composite/LayerManagerComposite.cpp:202
#6  0xb51d1044 in mozilla::layers::LayerManagerComposite::EndEmptyTransaction (this=<optimized out>, aFlags=<optimized out>)
    at ../../../../dolphin/gecko/gfx/layers/composite/LayerManagerComposite.cpp:197
#7  0xb51d9ee6 in mozilla::layers::CompositorParent::CompositeToTarget (this=0xaed1d400, aTarget=<optimized out>)
    at ../../../../dolphin/gecko/gfx/layers/ipc/CompositorParent.cpp:696
#8  0xb4fc4438 in DispatchToMethod<mozilla::ipc::MessageChannel, void (mozilla::ipc::MessageChannel::*)()> (
    method=(void (mozilla::ipc::MessageChannel::*)(mozilla::ipc::MessageChannel * const)) 0xb51d9fb7 <mozilla::layers::CompositorParent::Composite()>, 
    obj=<optimized out>, arg=...) at ../../../../dolphin/gecko/ipc/chromium/src/base/tuple.h:383
#9  RunnableMethod<mozilla::ipc::MessageChannel, void (mozilla::ipc::MessageChannel::*)(), Tuple0>::Run (this=<optimized out>)
    at ../../../../dolphin/gecko/ipc/chromium/src/base/task.h:307
#10 0xb4fbc068 in MessageLoop::RunTask (this=0xb1001cb0, task=0xaebd1280) at ../../../../dolphin/gecko/ipc/chromium/src/base/message_loop.cc:344
#11 0xb4fbc72a in MessageLoop::DeferOrRunPendingTask (this=<optimized out>, pending_task=...)
    at ../../../../dolphin/gecko/ipc/chromium/src/base/message_loop.cc:352
#12 0xb4fbd75c in DoWork (this=<optimized out>) at ../../../../dolphin/gecko/ipc/chromium/src/base/message_loop.cc:430
#13 MessageLoop::DoWork (this=0xb1001cb0) at ../../../../dolphin/gecko/ipc/chromium/src/base/message_loop.cc:409
#14 0xb4fbd8ec in base::MessagePumpDefault::Run (this=0xb1490880, delegate=0xb1001cb0)
    at ../../../../dolphin/gecko/ipc/chromium/src/base/message_pump_default.cc:34
#15 0xb4fbbff6 in MessageLoop::RunInternal (this=this@entry=0xb1001cb0) at ../../../../dolphin/gecko/ipc/chromium/src/base/message_loop.cc:226
#16 0xb4fbc0a8 in RunHandler (this=0xb1001cb0) at ../../../../dolphin/gecko/ipc/chromium/src/base/message_loop.cc:219
#17 MessageLoop::Run (this=this@entry=0xb1001cb0) at ../../../../dolphin/gecko/ipc/chromium/src/base/message_loop.cc:193
#18 0xb4fbe89a in base::Thread::ThreadMain (this=0xb147dd30) at ../../../../dolphin/gecko/ipc/chromium/src/base/thread.cc:162
#19 0xb4fb5550 in ThreadFunc (closure=<optimized out>) at ../../../../dolphin/gecko/ipc/chromium/src/base/platform_thread_posix.cc:39
#20 0xb6f0021c in __thread_entry (func=0xb4fb5549 <ThreadFunc(void*)>, arg=0xb147dd30, tls=0xb1001dd0) at bionic/libc/bionic/pthread_create.cpp:105
#21 0xb6f003b8 in pthread_create (thread_out=0xb147dd38, attr=<optimized out>, start_routine=0xb4fb5549 <ThreadFunc(void*)>, arg=0x78)
    at bionic/libc/bionic/pthread_create.cpp:224
#22 0x00000000 in ?? ()
Flags: needinfo?(schiu)
(In reply to Sushil from comment #13)
> I believe it is better to fix the root cause of crash than avoiding the
> crash. It does not add any value, if hwc HAL is setting release fence fd on
> GPU composed layers (HWC_SKIP_LAYER / HWC_FRAMEBUFFER).
> 
> Can you try this change on other HAL and see if you still see the crash ? In
> Render():
> -        mList->hwLayers[0].compositionType = HWC_BACKGROUND;
> +        mList->hwLayers[0].compositionType = HWC_FRAMEBUFFER;

This change can stop b2g from crashing, because the release fence of bogus layer can be set with -1 after hwc->set() called.
(In reply to Sushil from comment #13)
> I believe it is better to fix the root cause of crash than avoiding the
> crash. It does not add any value, if hwc HAL is setting release fence fd on
> GPU composed layers (HWC_SKIP_LAYER / HWC_FRAMEBUFFER).
> 
> Can you try this change on other HAL and see if you still see the crash ? In
> Render():
> -        mList->hwLayers[0].compositionType = HWC_BACKGROUND;
> +        mList->hwLayers[0].compositionType = HWC_FRAMEBUFFER;

Yeah, it is better to fix HWC's side implementation to prevent unnecessary divergence of HWC's behavior.
Sotaro / Solomon,

Same change is needed in GonkDisplayJB as well. Can you upload this patch? Or do you want me to upload ?
I will upload the patch.
Set correct default composition type for fake layer in GPU Composition. This will help different HALs to respond uniformly, in case if some HWC HAL tries to set release fence fd on any non HWC_FRAMEBUFFER layer.
Assignee: schiu → sushilchauhan
Status: NEW → ASSIGNED
Attachment #8398107 - Flags: review?(sotaro.ikeda.g)
Updated Comment 19:

Set correct default composition type for fake layer in GPU Composition. This will help different HALs to respond uniformly, in case if a HWC HAL tries to set release fence fd on any non HWC_FRAMEBUFFER layer, even though it has been marked as HWC_SKIP_LAYER.
Attachment #8398107 - Flags: review?(sotaro.ikeda.g) → review+
(In reply to Solomon Chiu [:schiu] from comment #15)
> (In reply to Sushil from comment #13)
> > I believe it is better to fix the root cause of crash than avoiding the
> > crash. It does not add any value, if hwc HAL is setting release fence fd on
> > GPU composed layers (HWC_SKIP_LAYER / HWC_FRAMEBUFFER).
> > 
> > Can you try this change on other HAL and see if you still see the crash ? In
> > Render():
> > -        mList->hwLayers[0].compositionType = HWC_BACKGROUND;
> > +        mList->hwLayers[0].compositionType = HWC_FRAMEBUFFER;
> 
> This change can stop b2g from crashing, because the release fence of bogus
> layer can be set with -1 after hwc->set() called.

schiu, did you already confirmed attachment 8398107 [details] [diff] [review] on the deivce?
Flags: needinfo?(schiu)
Uploading HG friendly version of reviewed patch.
Attachment #8396244 - Attachment is obsolete: true
Attachment #8398107 - Attachment is obsolete: true
Attachment #8396244 - Flags: review?(sotaro.ikeda.g)
Attachment #8396244 - Flags: review?(hshih)
Attachment #8398640 - Flags: review+
Uploading HG friendly version of reviewed patch.
Attachment #8398640 - Attachment is obsolete: true
Attachment #8398643 - Flags: review+
blocking-b2g: --- → 1.4?
Keywords: checkin-needed
https://hg.mozilla.org/mozilla-central/rev/d4fcaf72d783
Status: ASSIGNED → RESOLVED
Closed: 7 years ago
Keywords: checkin-needed
Resolution: --- → FIXED
Target Milestone: --- → 1.5 S1 (9may)
Just checked, the patch is ok on the device. Sorry for late reply.
Flags: needinfo?(schiu)
Inder,

Does this block any test cases for QC CS?
Flags: needinfo?(ikumar)
Preeti -- this is not a blocker for us. As per my understanding with Sushil we don't use this feature but this is a good to have fix regardless.
Flags: needinfo?(ikumar)
(In reply to Inder from comment #28)
> Preeti -- this is not a blocker for us. As per my understanding with Sushil
> we don't use this feature but this is a good to have fix regardless.

In that case, moving to backlog. Feel free to ask for aurora approval on this though if it's a good fix to include.
blocking-b2g: 1.4? → backlog
(In reply to Ryan VanderMeulen [:RyanVM UTC-4] from comment #25)
> https://hg.mozilla.org/mozilla-central/rev/d4fcaf72d783

v1.4 need this patch to prevent crash in b2g.

But I'm not sure If I need to request for approval‑mozilla‑aurora or approval‑mozilla‑beta.

Ivan, can you help me to requset the correct one? Thanks!


[Approval Request Comment]
Bug caused by (feature/regressing bug #): 982040
User impact if declined: B2G will crash when "layers.composer2d.enabled" is set to true and hwc doesn't work.
Testing completed (on m-c, etc.): Manual test on MC, B2G will not crash anymore
Risk to taking this patch (and alternatives if risky): low because it set a correct default composition type to prevent error
String or IDL/UUID changes made by this patch: none
Flags: needinfo?(itsay)
Yes, it is a good fix to be included in v1.4 as well.
blocking-b2g: backlog → 1.4?
Flags: needinfo?(bbajaj)
Base on comment 31, Let's include this fix in v1.4.
blocking-b2g: 1.4? → 1.4+
Flags: needinfo?(itsay)
Target Milestone: 2.0 S1 (9may) → 1.4 S5 (11apr)
You need to log in before you can comment on or make changes to this bug.