Closed Bug 944420 Opened 12 years ago Closed 12 years ago

[crash] B2G process dies when loading an operator poweron mp4

Categories

(Core :: Graphics: Layers, defect)

ARM
Gonk (Firefox OS)
defect
Not set
critical

Tracking

()

RESOLVED FIXED
mozilla28
blocking-b2g koi+
Tracking Status
firefox26 --- wontfix
firefox27 --- wontfix
firefox28 --- fixed
b2g-v1.2 --- fixed

People

(Reporter: amac, Assigned: sotaro)

References

Details

(Keywords: crash, reproducible, Whiteboard: [b2g-crash])

Attachments

(2 files, 3 obsolete files)

+++ This bug was initially created as a clone of Bug #916824 +++

I'm duplicating the bug because it also happens on Hamachi, with the patches from [2] applied.

STR:
1. Get carrier_power_on.mp4 from [1] and copy it to the apps/system/resources/power directory in your gaia directory.
2. Make a build from that directory and flash it to the phone.

Expected:
The phone boots correctly and the initial video is shown.

Actual:
The B2G process dies. The complete log for the life of the process is attached. After the last trace, the log restarts, initializing the audio hardware again, and it loops indefinitely. The crash happens after:

E/VDL_RTOS( 693): ***YAMATO Enabled***
E/OMXNodeInstance( 693): !!! Observer died. Quickly, do something, ... anything...

Replacing /system/b2g/webapps/system.gaiamobile.org/application.zip on the phone with one that does not contain the offending file makes the phone work correctly.

Note that this particular video file is used on the current (1.1) phones, where it works correctly (and even if the file were invalid, it should most definitely not kill the process).

[1] https://github.com/telefonicaid/firefoxos-gaia-spain/blob/master/power/carrier_power_on.mp4
[2] https://github.com/mozilla-b2g/gonk-patches/tree/master/all-hamachi/frameworks/base

The build used was a self build; HEAD for frameworks/base/media is e353565bea6e7c8382219190afefcced4b50b7d0 and for B2G26 is changeset: 156677:21e2ad082d85. I tested it with gecko from Mozilla Central and it fails the same way.
blocking-b2g: --- → koi?
Keywords: reproducible
Whiteboard: [systemsfe][b2g-crash] → [b2g-crash]
Summary: [crash] B2G process with a SIGSEGV when loading an operator poweron mp4 → [crash] B2G process dies when loading an operator poweron mp4
I confirmed that the crash happens at TextureHost::Create().
(In reply to Sotaro Ikeda [:sotaro] from comment #1)
> I confirmed that the crash happens at TextureHost::Create().

The crash seems to be the same as Bug 929005 Comment 16.
Component: General → Graphics: Layers
Product: Firefox OS → Core
I reproduced the crash with GDB attached.
Assignee: nobody → sotaro.ikeda.g
(In reply to Sotaro Ikeda [:sotaro] from comment #1)
> I confirmed that the crash happens at TextureHost::Create().

It failed because LayerManagerComposite and CompositorOGL had not been created yet. They are created by the first call to nsWindow::GetLayerManager() on the top drawing nsWindow.
http://mxr.mozilla.org/mozilla-central/source/widget/gonk/nsWindow.cpp#548

The first nsWindow created is not a drawing window. If GetLayerManager() is called on that nsWindow, two LayerManagerComposites end up being created; the second HwcComposer2D::TryHwComposition() call then fell into a stuck state and drawing was no longer updated.
ImageBridge tried to create a TextureHost even though there was no ClientLayerManager or nsVideoFrame yet.
I am not sure where to add the nsWindow::GetLayerManager() call. I temporarily added it to HTMLMediaElement and confirmed that the power-on video was drawn correctly on master hamachi.
(In reply to Sotaro Ikeda [:sotaro] from comment #4)
>
> First created nsWindow is not a drawing window.

It seems to be the "hidden window".
With attachment 8340732 [details] [diff] [review], a white screen was drawn for a short period before video rendering started.
This patch did cause white screen before a power on video rendering.
Comment on attachment 8340858 [details] [diff] [review]
temporary patch - Create TextureHostOGL when Compositor is not present on gonk

Review of attachment 8340858 [details] [diff] [review]:
-----------------------------------------------------------------

::: gfx/layers/composite/TextureHost.cpp
@@ +103,5 @@
> aFlags);
> +#ifdef MOZ_WIDGET_GONK
> + case LAYERS_NONE:
> + return CreateTextureHostOGL(aID, aDesc, aDeallocator, aFlags);
> +#endif

That's very hacky, a comment explaining what's going on would be nice in the final patch.

Alternatively, how about not using ImageBridge for the poweron video? If ImageContainer::mImageClient is null, the video is rendered through the non-async transaction code paths which means you know for sure that frames will not reach the compositor thread before layers and compositors are created (To be honest I am a bit worried about how we manage to send video frames to the compositor early enough that no compositor has ever been created).

Anyway, just throwing this idea here in case it might help.
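For readers following the review, the idea behind the hunk quoted above can be sketched in isolation. This is only a minimal model, not the actual Gecko code: `LayersBackend`, `TextureHostBasic`, `CreateTextureHost`, `EarlyFrameBackend`, and the `aIsGonk` argument (a runtime stand-in for the `MOZ_WIDGET_GONK` build flag) are hypothetical names, and the sketch assumes that a gonk build ends up on the OpenGL backend once a compositor exists.

```cpp
#include <memory>
#include <string>

// Hypothetical stand-ins for the Gecko types discussed in this bug.
enum class LayersBackend { None, OpenGL, Basic };

struct TextureHost {
  virtual ~TextureHost() = default;
  virtual std::string BackendName() const = 0;
};

struct TextureHostOGL : TextureHost {
  std::string BackendName() const override { return "opengl"; }
};

struct TextureHostBasic : TextureHost {
  std::string BackendName() const override { return "basic"; }
};

// aIsGonk models the MOZ_WIDGET_GONK compile-time flag as a runtime argument
// so that both configurations can be exercised in one program.
std::unique_ptr<TextureHost> CreateTextureHost(LayersBackend aBackend,
                                               bool aIsGonk) {
  switch (aBackend) {
    case LayersBackend::OpenGL:
      return std::make_unique<TextureHostOGL>();
    case LayersBackend::Basic:
      return std::make_unique<TextureHostBasic>();
    case LayersBackend::None:
      // No compositor exists yet, so the backend is unknown. The sketch
      // assumes a gonk build will use OpenGL and falls back to it.
      if (aIsGonk) {
        return std::make_unique<TextureHostOGL>();
      }
      return nullptr;
  }
  return nullptr;  // Unreachable for valid enum values.
}

// Usage: a video frame arrives before any compositor has been created.
inline std::string EarlyFrameBackend(bool aIsGonk) {
  std::unique_ptr<TextureHost> host =
      CreateTextureHost(LayersBackend::None, aIsGonk);
  return host ? host->BackendName() : std::string("none");
}
```

Without the fallback branch, the factory returns a null host for an early frame, and a caller that dereferences it unconditionally would crash, which is consistent with the failure located at TextureHost::Create() earlier in this bug.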
Attachment #8340732 - Attachment is obsolete: true
(In reply to Nicolas Silva [:nical] from comment #11)
> If ImageContainer::mImageClient is null, the video is rendered through the
> non-async transaction code paths which means you know for sure that frames
> will not reach the compositor thread before layers and compositors are
> created (To be honest I am a bit worried about how we manage to send video
> frames to the compositor early enough that no compositor has ever been
> created).

ImageContainer::mImageClient is not null. It is created and uses ImageBridge. So, it cannot be used to fix the problem. But ImageClientBridge is not created yet.
When the problem happens, ImageBridge has already been created and works normally, but LayerTransaction has not been created yet.
This is a certification blocker. Triage reviewers, please set the appropriate flag here to avoid future issues.
(In reply to Sotaro Ikeda [:sotaro] from comment #12)
> ImageContainer::mImageClient is not null. It is created and uses
> ImageBridge. So, it can not be used to fix the problem. But
> ImageClientBridge is not created yet.

Yes, I mean we could try to create the ImageContainer in a way that it doesn't use ImageBridge (so mImageClient would then be null) in the case of the poweron video. This way the video would go through main-thread layers transactions.

Alternatively, I don't know how starting up the phone works, but making it so we display a black background (just a full-screen color layer) before we start the video could force a compositor to be created before the video (that would be a somewhat higher-level fix).
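The first alternative in the comment above can be illustrated with a small sketch. It is only a model of the idea under stated assumptions: the `aUseImageBridge` constructor flag, the `FramePath` enum, and `PathForFrames()` are hypothetical and do not correspond to the real ImageContainer interface.

```cpp
#include <memory>

// Hypothetical stand-in for the object that forwards frames over ImageBridge.
struct ImageClient {};

// The two routes a video frame could take in this model.
enum class FramePath { AsyncImageBridge, MainThreadTransaction };

class ImageContainer {
 public:
  // aUseImageBridge is a hypothetical flag; the poweron video would pass
  // false so that no image client is ever created for it.
  explicit ImageContainer(bool aUseImageBridge) {
    if (aUseImageBridge) {
      mImageClient = std::make_unique<ImageClient>();
    }
  }

  // With a null mImageClient, frames go through the main-thread layers
  // transaction, which cannot run before layers and a compositor exist.
  FramePath PathForFrames() const {
    return mImageClient ? FramePath::AsyncImageBridge
                        : FramePath::MainThreadTransaction;
  }

 private:
  std::unique_ptr<ImageClient> mImageClient;
};
```

In this model, `ImageContainer(false).PathForFrames()` yields `FramePath::MainThreadTransaction`, which is the ordering guarantee the suggestion relies on.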
Milan, can you set the bug to koi+ based on Comment 14?
Flags: needinfo?(milan)
(In reply to Nicolas Silva [:nical] from comment #15)
> Yes, I mean we could try to create the ImageContainer in a way that it
> doesn't use ImageBridge (so mImageClient would then be null) in the case of
> poweron video. This way video would go through main-thread layers
> transaction.

I feel that is dirtier than attachment 8340858 [details] [diff] [review]. With the patch, the only difference is in TextureHost::Create(), and it works well.
Add comments.
Attachment #8340858 - Attachment is obsolete: true
Attachment #8342145 - Flags: review?(nical.bugzilla)
(In reply to Sotaro Ikeda [:sotaro] from comment #16)
> Milan, can you set the bug to koi+ based on Comment 14?

I think it may have to be Fabrice at this point.
Flags: needinfo?(milan) → needinfo?(fabrice)
Comment on attachment 8342145 [details] [diff] [review]
patch - Create TextureHostOGL when Compositor is not present on gonk

Review of attachment 8342145 [details] [diff] [review]:
-----------------------------------------------------------------

If this works I am not particularly against it. I am slightly worried that other parts of the compositor code might fail under the assumption that a Compositor instance was created before, but I might just be too pessimistic about the robustness against this edge case. And in that case it'll be worth bullet-proofing our code against that anyway :)
Attachment #8342145 - Flags: review?(nical.bugzilla) → review+
I think there is no problem other than this bug. A problem could happen only when ImageBridgeParent is going to use LayerManagerComposite or Compositor. ImageBridgeParent seems independent from them until ImageBridge's ImageClient is connected to ImageClientBridge, except for TextureHost allocation.
Committable patch. Carry "nical review+".
Attachment #8342145 - Attachment is obsolete: true
Attachment #8342405 - Flags: review+
Keywords: checkin-needed
blocking-b2g: koi? → koi+
Note: I'm not approving anything for 1.2, only 1.3
Flags: needinfo?(fabrice)
Status: NEW → RESOLVED
Closed: 12 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla28