Closed Bug 791711 Opened 7 years ago Closed 7 years ago

[camera] camera sometimes shows completely black

Categories

(Core :: DOM: Device Interfaces, defect, P2)

ARM
Gonk (Firefox OS)
defect

Tracking

()

VERIFIED FIXED
mozilla19
blocking-basecamp +
Tracking Status
firefox18 --- fixed
firefox19 --- fixed

People

(Reporter: tchung, Assigned: kanru)

References

Details

Attachments

(1 file, 2 obsolete files)

Crossposting from https://github.com/mozilla-b2g/gaia/issues/3667

Camera preview screen shows up black.

retested on 09-17-2012 daily build:

    gaia commit: 392ad17
    gecko commit: a68e207

reproducible STR:
1) launch gallery App, switch to camera activity (button on bottom)
2) app switches to camera and enables the camera preview screen correctly
3) close the app. relaunch Camera App
4) verify the preview is black.
5) If step 4 works correctly, close the camera app and relaunch it again. eventually you'll hit the black preview.

If launching camera from gallery web activity doesnt do it, then close the Gallery and open the Camera directly. Sometimes the black screen will just sit there, until you tap the screen.

See screencast: http://youtu.be/ZRU7ZTUslj0
Logcat: https://gist.github.com/b5f1e55f59d8ff8045c4
Assignee: nobody → mhabicher
blocking-basecamp: ? → +
Someone else is going to have to take a look at this.  I don't have the cycles right now, and from a camera perspective, everything looks like it's running okay.
Assignee: mhabicher → nobody
Kan-Ru, can you take a look?
Assignee: nobody → kchen
(In reply to Tony Chung [:tchung] from comment #0)
> Sometimes the black screen will just sit there, until you tap the screen.

I had this too.
Compositing does not start until any redraw occurs.
This might be a gaia issue with visibility/focus.
Camera does receive visibility event and call camera.start(). MediaStreamGraph is running but the screen doesn't update, compositing counter shows zero.
Before I tap the screen, CompositorParent::GetCompositor() is nullptr :-/
That's always null in content processes.  If it ever goes nonnull, we have a bad bug.
(In reply to Chris Jones [:cjones] [:warhammer] from comment #8)
> That's always null in content processes.  If it ever goes nonnull, we have a
> bad bug.

It is null in parent process. ImageContainerParent failed to lookup via compositiorID, which I don't understand how it is assigned.
There should only ever be one toplevel compositor (widget) for master processes in b2g.  If that's not the case, we're messing something up.
Showing this issue in the latest build:
Just on launching camera.  If you take a picture it will crash due to null picture.

https://gist.github.com/3783526

Otoro phone, build 2012-09-25 eu
Taken from default.xml in b2g-distro: 
* "platform_build" revision= 7d202d6409d7aa95011870e40ecfe7f177c27539
* "gaia" revision= 71e89d4338d59a86e8599abeda24dbfaf1c80e1a 
* "releases-mozilla-central" revision= 1c333645ec70bfce2176876e4bfaf24bfd4bbde5
* "gonk-misc" revision= b9f30c9c745885e03b7d6dab574fd1e588ac1b94
This is a new bug, it's definitely what used to happen.

I see several

E/GeckoConsole(18691): [JavaScript Warning: "Invalid URI. Load of media resource  failed." {file: "app://camera.gaiamobile.org/index.html" line: 0}]

errors in logcat.

Maybe related to removing old camera code?
Um, also

I/QualcommCameraHardware(18691): setRecordSize: preview dimensions: 2097184x2097184
I/QualcommCameraHardware(18691): setRecordSize: video dimensions: 2097184x2097184

WTF?

The preview and picture sizes are sane though.
(In reply to Chris Jones [:cjones] [:warhammer] from comment #13)
> Um, also
> 
> I/QualcommCameraHardware(18691): setRecordSize: preview dimensions: 2097184x2097184
> I/QualcommCameraHardware(18691): setRecordSize: video dimensions: 2097184x2097184
> 
> WTF?

This is what the library logs when video recorder hasn't been configured yet.  The libcamera API uses a single massive class to track all camera properties, and parses the whole thing every time you change even a single setting--including unconfigured settings, which is what you're seeing here.  The values change, but it's always nonsense.
(In reply to Chris Jones [:cjones] [:warhammer] from comment #12)
> This is a new bug, it's definitely what used to happen.
> 
> I see several
> 
> E/GeckoConsole(18691): [JavaScript Warning: "Invalid URI. Load of media
> resource  failed." {file: "app://camera.gaiamobile.org/index.html" line: 0}]
> 
> errors in logcat.

I'm just seeing this too--any idea what this means?

> Maybe related to removing old camera code?

Which camera code was removed?
No idea.

ISTR the old hacky code in b2g/ was removed recently.  I wonder if the camera app was still depending on it, somehow.
(In reply to Chris Jones [:cjones] [:warhammer] from comment #16)
> No idea.
> 
> ISTR the old hacky code in b2g/ was removed recently.  I wonder if the
> camera app was still depending on it, somehow.

Ugh.  Maybe.  The rest of the logcat output looks like a normal camera start-up and configuration, but the preview stream is never started and everything just sits there, idle.
This error appears only twice in gecko:
content/html/content/src/nsHTMLMediaElement.cpp:775:      ReportLoadError("MediaLoadInvalidURI", params, ArrayLength(params));
content/html/content/src/nsHTMLMediaElement.cpp:882:      ReportLoadError("MediaLoadInvalidURI", params, ArrayLength(params));

'MediaLoadInvalidURI' is localized here:
dom/locales/en-US/chrome/dom/dom.properties:117:MediaLoadInvalidURI=Invalid URI. Load of media resource %S failed.

|export NSPR_LOG_MODULES=nsMediaElement:5| reports:
I/PRLog   (  407): 1074550120[42304160]: 42f3ad80 ChangeDelayLoadStatus(1) doc=0x43f54c00
I/PRLog   (  407): 1074550120[42304160]: 42f3ad80 Trying load from src=[object MediaStream @ 0x44445bc0 (native @ 0x44794460)]
I/PRLog   (  407): 1074550120[42304160]: NotifyLoadError(), no supported media error
I/PRLog   (  407): 1074550120[42304160]: 42f3ad80 ChangeDelayLoadStatus(0) doc=0x43f54c00

...where 0x44794460 matches the address of the DOMCameraPreview object:
I/PRLog   (  407): 1074550120[42304160]: mozilla::DOMCameraPreview::DOMCameraPreview(mozilla::ICameraControl*, uint32_t, uint32_t, uint32_t):147 : this=44794460
Yep, that change is what breaks the camera in gaia.  Change |viewfinder.src = stream| to |viewfinder.mozSrcObject = stream| in gaia/apps/camera/js/camera.js to fix.

Back to :tchung's original bug report: unable to reproduce.  Anyone else?
Can't reproduce now. Maybe it was a gaia issue?
Attached patch sNextID should start from zero (obsolete) — Splinter Review
Attachment #671403 - Flags: review?(nical.bugzilla)
Kan-Ru, I attentionally had sNextID start from a non-zero value to keep zero as the invalid ID. This is because mCompositorID is set to zero in CompositorParent's constructor and a task is dispatched from there, that sets the actual value. So it will be easier to debug eventually if we can see that we read mCompositorID somwhere and it is zero (invalid). I incidentally made sNextID it start from 2 instead of 1 but it does not really matter i suppose. Why do we want sNextID to start from zero?
Priority: -- → P2
(In reply to Nicolas Silva [:nical] from comment #23)
> Kan-Ru, I attentionally had sNextID start from a non-zero value to keep zero
> as the invalid ID. This is because mCompositorID is set to zero in
> CompositorParent's constructor and a task is dispatched from there, that
> sets the actual value. So it will be easier to debug eventually if we can
> see that we read mCompositorID somwhere and it is zero (invalid). I
> incidentally made sNextID it start from 2 instead of 1 but it does not
> really matter i suppose. Why do we want sNextID to start from zero?

OK, I thought it was a typo ;-) Actually I thought you had it intentionally start from a non-zero value until I found there is code sending zero. So I need to check how first image and its compositorID is paired.
Attachment #671403 - Flags: review?(nical.bugzilla)
(In reply to Kan-Ru Chen [:kanru] from comment #24)
> OK, I thought it was a typo ;-) Actually I thought you had it intentionally
> start from a non-zero value until I found there is code sending zero. So I
> need to check how first image and its compositorID is paired.

Here are some pointers to what you are looking for:

When the ImageLayer is composited, the compositorID of the image is set to the one of the layer:
http://dxr.mozilla.org/mozilla-central/gfx/layers/opengl/ImageLayerOGL.cpp.html?string=ImageLayerOGL.cpp#l917

The image itself has been queried through a ImageContainerID unique to each video element.
This ImageContainerID is passed to the layer through a layer transaction (the SharedImageID member in SharedImage that you can see here: http://dxr.mozilla.org/mozilla-central/gfx/layers/ipc/LayersSurfaces.ipdlh.html?string=LayersSurfaces.ipdlh#l89 ). The relevent code for the layer transaction is here: http://dxr.mozilla.org/mozilla-central/gfx/layers/basic/BasicImageLayer.cpp.html?string=BasicImageLayer.cpp#l259

Normally what happens is that the page loads, and the first time the ImageContainer's image is set, it calls Invalidate(), which will trigger the layer transaction that sends the ImageContainerID to the ShadowImageLayer. Note that Invalidate() is called only once per ImageContainer that use OMTC (vs each time the image is set for non OMTC).
Then, when composition happens, the ImageBridgeParent tries to find the CompositorID in the global SharedImage map (that is in ImageContainerParent.cpp) by looking up the ImageContainerID. if it does not find it (id = 0) then it can't retrigger composition for this frame (maybe it is what's happenning here?).
On the browser it works because the layer transaction that passes the ImageContainerID always happen before the video starts sending frames.
What happens here is that when nsVideoFrame is building the layer, the image container size is zero (not sure why yet) so the ImageLayer isn't built, the containerID is always zero. Rotate the phone or tap on the screen somehow triggered a reflow or something that rebuild the layer and the ImageLayer is created this time. Maybe a race condition?
Only after received the first frame from child (RecvPublishImage) the SetCompsotiorIDForImage will work. If SetCompsotiorIDForImage (in RenderLayer) is called before we receive the first frame, we won't be able to get the CompositorID for following RecvPublishImage and composition won't be scheduled again, then we are stuck!
(In reply to Kan-Ru Chen [:kanru] from comment #26)
> What happens here is that when nsVideoFrame is building the layer, the image
> container size is zero (not sure why yet) so the ImageLayer isn't built, the
> containerID is always zero. 

This zero size thing that does not create the layer does not look good to me (or if it is an expected behavior then async-video does not take it into account).

> Rotate the phone or tap on the screen somehow
> triggered a reflow or something that rebuild the layer and the ImageLayer is
> created this time. Maybe a race condition?

As long as the ShadowImageLayerOGL is not created and we don't invalidate it at least once (to give it the ImageContainerID), video frames won't reach the screen. If it's a race, it may be a race between the creation of the ImageLayer and the call to VideoFrameContainer::Invalidate(), because the later is called only once and expects that the layer exists. You can go there: http://dxr.mozilla.org/mozilla-central/content/media/VideoFrameContainer.cpp.html?string=VideoFrameContainer.cpp#l97 and remove the return so that it invalidates every frame just to see if there is a race there (but this is not a solution to the problem, because we don't want to trigger layer transactions for each frame).

btw, sorry I  mentioned ImageContainer::Invalidate but is VideoFrameContainer instead ImageContainer.

(In reply to Kan-Ru Chen [:kanru] from comment #27)
> Only after received the first frame from child (RecvPublishImage) the
> SetCompsotiorIDForImage will work. If SetCompsotiorIDForImage (in
> RenderLayer) is called before we receive the first frame, we won't be able
> to get the CompositorID for following RecvPublishImage and composition won't
> be scheduled again, then we are stuck!

SetCompositorIDForImage is called each time we composite, so it's not a big deal if it is called too early. I am more worried about the layer transaction that passes the ImageContainerID not hapenning until you manually trigger a reflow by rotating the screen or something similar.
Not tested yet, my phone is in office, but I think this should work.
Attachment #671403 - Attachment is obsolete: true
Yep, invalidate every frame does bring the preview back.
So the problem is that if we remove the SharedImage from the ImageContainer, ImageContainerParent looses the track of the SharedImageID <-> Compositor pair. VideoFrameContainer doesn't know that and won't invalidate again. We end up depend on something else to invalidate the layer _after_ we registered the ImageContainer to the parent and it does not always happen in the correct order.

This patch re-invalidate the image layer if the SharedImage had been removed. Only set the flag after the ImageContainer has received one image.
Attachment #671828 - Attachment is obsolete: true
Attachment #673142 - Flags: review?(nical.bugzilla)
Attachment #673142 - Flags: review?(nical.bugzilla) → review+
Comment on attachment 673142 [details] [diff] [review]
Invalidate again if the SharedImageID was removed.

Review of attachment 673142 [details] [diff] [review]:
-----------------------------------------------------------------

::: content/media/VideoFrameContainer.cpp
@@ +78,5 @@
>  
>    mImageContainer->SetCurrentImage(nullptr);
> +
> +  // We removed the current image so we will have to invalidate once
> +  // agaian to setup the ImageContainer <-> Compositor pair.

"agaian" --> "again"
https://hg.mozilla.org/mozilla-central/rev/6fb79aaa4092
Status: NEW → RESOLVED
Closed: 7 years ago
Flags: in-testsuite?
Resolution: --- → FIXED
Target Milestone: --- → mozilla19
Duplicate of this bug: 798164
verified fixed on 11-14 unagi daily build.
Status: RESOLVED → VERIFIED
You need to log in before you can comment on or make changes to this bug.