Closed Bug 784829 Opened 12 years ago Closed 12 years ago

Camera not working anymore

Categories

(Core :: Audio/Video, defect)

ARM
Gonk (Firefox OS)
defect
Not set
normal

Tracking

()

VERIFIED FIXED
mozilla18
blocking-basecamp +
Tracking Status
firefox17 --- fixed

People

(Reporter: gerard-majax, Assigned: roc)

References

Details

(Whiteboard: [qa!], [getUserMedia], [blocking-gum+])

Attachments

(3 files, 1 obsolete file)

No more preview working in Camera app, no more camera working. Since roughly 21st, August.

Logcat (when broken) :
$ adb logcat | grep -i "camera viewfinder"
I/Gecko   (  241): camera viewfinder is unknown
I/Gecko   (  241): camera viewfinder is consuming

Logcat (when working) :
$ adb logcat | grep -i "camera viewfinder"
I/Gecko   (  345): camera viewfinder is consuming
I/Gecko   (  345): camera viewfinder is not consuming

bisecting with git gives the following commit:

---
8d75feebe8cafe014d553d6f7653a13c8db737fa is the first bad commit
commit 8d75feebe8cafe014d553d6f7653a13c8db737fa
Author: Robert O'Callahan <robert@ocallahan.org>
Date:   Wed Aug 1 00:17:21 2012 +1200

    Bug 779715. Part 1: Add basic support for ProcessedMediaStreams. r=jesup
    The main thing this patch does is to add ProcessedMediaStream objects and
    MediaInputPorts connecting streams. ProcessedMediaStreams are an abstract
    class that doesn't constrain what processing is actually performed, except
    that for now we assume a stream's processing depends only on its inputs
    window of data between mCurrentTime and mStateComputedTime.
    This patch reorganizes the way the blocking status of each stream is computed.
    The streams are partitioned into groups so that every stream which can affect
    the blocking status of another stream is in the same group as that other stream.
    We also add a pass to order the streams by dependency so we can process the streams
    in order of dependency; this pass also identifies the streams that form part of a
    cycle.
    
    --HG--
    extra : rebase_source : c45c931a264e73f295642a934500bbeaa6448774

:040000 040000 9ec2cc6f184b9b2a14ca93317c14fc2775ca00b4 0cd45ea944fa09c23f9741c54bec158e9c69cfa9 M	content
(In reply to Alexandre LISSY from comment #0)
> $ adb logcat | grep -i "camera viewfinder"
> I/Gecko   (  241): camera viewfinder is unknown
> I/Gecko   (  241): camera viewfinder is consuming

That's weird. "unknown" seems wrong, since the only caller of NotifyConsumptionChanged is MediaStreamGraphImpl::UpdateConsumptionState, which passes
  MediaStreamListener::Consumption state =
      aStream->mIsConsumed ? MediaStreamListener::CONSUMED
      : MediaStreamListener::NOT_CONSUMED;
So I would love to see a stack of how we hit "unknown" there.
Attached patch patch (obsolete) — Splinter Review
Alexandre, can you try this patch with NSPR_LOG_MODULES=MediaStreamGraph:5 in your environment? It might possibly fix your bug, but I don't expect it to, but the log data would be helpful. Also, if you can get a stack for us hitting http://dxr.mozilla.org/mozilla-central/dom/camera/CameraPreview.cpp.html#l44 that would be helpful. I have absolutely no idea how we can hit that.
Assignee: nobody → roc
I'll give it a try, but FYI, I'm not the only one hitting this.
First, backtrace from GDB (your patch not applied):
Reading symbols from /home/alex/codaz/B2G/objdir-gecko/dist/bin/b2g...done.
Remote debugging from host 127.0.0.1
syscall () at bionic/libc/arch-arm/bionic/syscall.S:50
50	    ldmfd   sp!, {r4, r5, r6, r7}
(gdb) break CameraPreview.cpp
CameraPreview.cpp  cpp                cpp_t              
(gdb) break CameraPreview.cpp:45
Breakpoint 1 at 0x4094049e: file /home/alex/codaz/B2G/gecko/dom/camera/CameraPreview.cpp, line 45.
(gdb) continue 
Continuing.
[New Thread 77.241]
[Switching to Thread 77.241]

Breakpoint 1, CameraPreviewListener::NotifyConsumptionChanged (this=0x29ec008, aGraph=<value optimized out>, aConsuming=113)
    at /home/alex/codaz/B2G/gecko/dom/camera/CameraPreview.cpp:45
45	        state = "unknown";
(gdb) bt
#0  CameraPreviewListener::NotifyConsumptionChanged (this=0x29ec008, aGraph=<value optimized out>, aConsuming=113) at /home/alex/codaz/B2G/gecko/dom/camera/CameraPreview.cpp:45
#1  0x4094c1e6 in mozilla::MediaStreamGraphImpl::UpdateConsumptionState (this=0x20667c8, aStream=0x29ebee0) at /home/alex/codaz/B2G/gecko/content/media/MediaStreamGraph.cpp:572
#2  0x4094f8a4 in mozilla::MediaStreamGraphImpl::RunThread (this=0x20667c8) at /home/alex/codaz/B2G/gecko/content/media/MediaStreamGraph.cpp:1380
#3  0x4094faac in Run (this=<value optimized out>) at /home/alex/codaz/B2G/gecko/content/media/MediaStreamGraph.cpp:1512
#4  0x40c837fa in nsThread::ProcessNextEvent (this=0x2066898, mayWait=<value optimized out>, result=0x4ca65eb7) at /home/alex/codaz/B2G/gecko/xpcom/threads/nsThread.cpp:624
#5  0x40c63f06 in NS_ProcessNextEvent_P (thread=0xa0, mayWait=true) at /home/alex/codaz/B2G/objdir-gecko/xpcom/build/nsThreadUtils.cpp:220
#6  0x40c83c28 in nsThread::ThreadFunc (arg=<value optimized out>) at /home/alex/codaz/B2G/gecko/xpcom/threads/nsThread.cpp:257
#7  0x401909b4 in _pt_root (arg=<value optimized out>) at /home/alex/codaz/B2G/gecko/nsprpub/pr/src/pthreads/ptthread.c:156
#8  0x4004fc28 in __thread_entry (func=0x40190955 <_pt_root>, arg=0x20bc920, tls=<value optimized out>) at bionic/libc/bionic/pthread.c:217
#9  0x4004f77c in pthread_create (thread_out=<value optimized out>, attr=0xbe86c6b4, start_routine=0x40190955 <_pt_root>, arg=0x20bc920) at bionic/libc/bionic/pthread.c:357
#10 0x00000000 in ?? ()
With updated gecko (master at ef6746907b0a6a5d3160e1b26272ba38c5fdef73), and the proposed changes, the preview works again. Not sure if the patch is related, I'll retry without. In the meantime, please find the log with the changes applied and environment set accordingly.
Please also not that during this latest trial, I had no "unknown" status popping.
Build just finished, without your proposed patch and with master at ef6746907b0a6a5d3160e1b26272ba38c5fdef73, preview is broken, and I get "unknown" status in logcat.
I've played a bit with gdb, and adding a breakpoint in MediaStreamGraphImpl::UpdateConsumptionState() inside the "if (state != aStream->mLastConsumptionState)" is never being hit. Meaning state == aStream->mLastConsumptionState.
And I see nowhere in the sourcecode where aStream->mLastConsumptionState is changed.
Also, playing with breakpoints, I saw that both "aStream->mIsConsumed = true;" in MediaStreamGraphImpl::DetermineWhetherStreamIsConsumed gets called.
Issue is fixed by only applying this part of your patch:

@@ -940,16 +939,17 @@ void
 MediaStreamGraphImpl::UpdateStreamOrder()
 {
   nsTArray<nsRefPtr<MediaStream> > oldStreams;
   oldStreams.SwapElements(mStreams);
   for (PRUint32 i = 0; i < oldStreams.Length(); ++i) {
     MediaStream* stream = oldStreams[i];
     stream->mHasBeenOrdered = false;
     stream->mKnowIsConsumed = false;
+    stream->mIsConsumed = false;
     stream->mIsOnOrderingStack = false;
     stream->mInBlockingSet = false;
     ProcessedMediaStream* ps = stream->AsProcessedStream();
     if (ps) {
       ps->mInCycle = false;
     }   
   }
(In reply to Alexandre LISSY from comment #11)
> Issue is fixed by only applying this part of your patch:
> 
> @@ -940,16 +939,17 @@ void
>  MediaStreamGraphImpl::UpdateStreamOrder()
>  {
>    nsTArray<nsRefPtr<MediaStream> > oldStreams;
>    oldStreams.SwapElements(mStreams);
>    for (PRUint32 i = 0; i < oldStreams.Length(); ++i) {
>      MediaStream* stream = oldStreams[i];
>      stream->mHasBeenOrdered = false;
>      stream->mKnowIsConsumed = false;
> +    stream->mIsConsumed = false;
>      stream->mIsOnOrderingStack = false;
>      stream->mInBlockingSet = false;
>      ProcessedMediaStream* ps = stream->AsProcessedStream();
>      if (ps) {
>        ps->mInCycle = false;
>      }   
>    }

I can confirm that this patch fixes a lot of problems (previews shutdown properly and objects are properly collected when the camera is closed); but I'm still seeing the preview stall eventually--all of the GraphicsLockedBuffers seem to be stuck in the DEQUEUED or RENDERING state.
(In reply to Alexandre LISSY from comment #4)
> (gdb) bt
> #0  CameraPreviewListener::NotifyConsumptionChanged (this=0x29ec008,
> aGraph=<value optimized out>, aConsuming=113) at
> /home/alex/codaz/B2G/gecko/dom/camera/CameraPreview.cpp:45
> #1  0x4094c1e6 in mozilla::MediaStreamGraphImpl::UpdateConsumptionState
> (this=0x20667c8, aStream=0x29ebee0) at
> /home/alex/codaz/B2G/gecko/content/media/MediaStreamGraph.cpp:572

I guess the compiler is optimizing mIsConsumed ? CONSUMED : NOT_CONSUMED away to nothing since the only valid values for mIsConsumed are 0 and 1, so uninitialized and invalid mIsConsumed is passed through.

Thanks a ton for your help.
Attached patch fixSplinter Review
Attachment #654506 - Attachment is obsolete: true
Attachment #654852 - Flags: review?(rjesup)
Assignee: roc → nobody
Component: General → Video/Audio
Product: Boot2Gecko → Core
Assignee: nobody → roc
(In reply to Robert O'Callahan (:roc) (Mozilla Corporation) from comment #13)
> (In reply to Alexandre LISSY from comment #4)
> > (gdb) bt
> > #0  CameraPreviewListener::NotifyConsumptionChanged (this=0x29ec008,
> > aGraph=<value optimized out>, aConsuming=113) at
> > /home/alex/codaz/B2G/gecko/dom/camera/CameraPreview.cpp:45
> > #1  0x4094c1e6 in mozilla::MediaStreamGraphImpl::UpdateConsumptionState
> > (this=0x20667c8, aStream=0x29ebee0) at
> > /home/alex/codaz/B2G/gecko/content/media/MediaStreamGraph.cpp:572
> 
> I guess the compiler is optimizing mIsConsumed ? CONSUMED : NOT_CONSUMED
> away to nothing since the only valid values for mIsConsumed are 0 and 1, so
> uninitialized and invalid mIsConsumed is passed through.
> 
> Thanks a ton for your help.

Unfortunately even with this patch, the preview still freezes.

I've been instrumenting a lot of the code, and from what I can see, it looks like something gets confused in an upper layer.  The camera is blocked because there are no free buffers--all of the buffers are either DEQUEUED or RENDERING, which means they're either already in the pipe to be filled, or displayed, respectively.  A call to the GonkIOSurfaceImage dtor will mark the buffers FREE, but that never happens.  Meanwhile the media thread keeps on chugging (it sure does generate a lot of output. :)

I have a _very_ detailed log that shows what's going on, but it's 15MiB.

It prompts two questions:
1. Why is the camera holding onto so many buffers?
2. Why is media stream/renderer not consuming and releasing the frames?

This pastie shows a backtrace of what should be happening: http://pastebin.com/DieGstPZ
blocking-basecamp: --- → +
Attached file Detailed log dump
Shows:
1. open camera app, which opens preview successfully
2. takes picture
3. restarts preview successfully
4. takes pictures
5. preview fails to restart
I've confirmed that if ReceiveFrame() is replaced with a stub that just immediately frees the preview data buffer, the preview never freezes.
The patch on this bug is waiting for 6 days now for review. Is it not the right approach, or still valid fix? Does something else need to happen per Mike's previous comments?
That patch needs to land. I just nagged Jesup.
(In reply to Mike Habicher [:mikeh] from comment #16)
> Shows:
> 1. open camera app, which opens preview successfully
> 2. takes picture
> 3. restarts preview successfully
> 4. takes pictures
> 5. preview fails to restart

In that log, only three frames get written to the final stream 4538c3d0. Are we really expecting to have only three frames available? Have we failed to free frames from the previous stream(s)?
We remove media stream 4538d920 from the graph only after 50 seconds, which is after the hang and long after we stopped using it. So maybe it (and 4655c240) have kept some Images alive. They probably aren't removed from the graph until cycle collection (or at least GC) runs, and maybe they aren't running very often here. Not sure yet if that's the problem, although since each stream holds onto at least the last Image, it's not good.

I see that at about 1.8 seconds (right after "Updating current time to ... real 1.752319") we suddenly allocate all available buffers (according to your "Buffer [0]: state DEQUEUED" logs, before time 1.806122. Why is that? That's less than 50ms so you should only get a couple of frames in that time?
With patch from bug 781892, I've been able to go up to 10 photos before getting killed (OOM probably)
Comment on attachment 654852 [details] [diff] [review]
fix

Review of attachment 654852 [details] [diff] [review]:
-----------------------------------------------------------------

Sorry for the delay; I got swamped in bugmail by some QA add-myself-to-bugs and triage-generated mails, and haven't dug out, and missed this one
Attachment #654852 - Flags: review?(rjesup) → review+
I think we should consider this bug fixed by this patch. The problem with playback stalling is a different problem that deserves its own bug.
Comment on attachment 654852 [details] [diff] [review]
fix

Review of attachment 654852 [details] [diff] [review]:
-----------------------------------------------------------------

Fixes a serious regression introduced before the Aurora uplift.
Attachment #654852 - Flags: approval-mozilla-aurora?
Yep, the freeze is being tracked down in bug 781892.
(In reply to Robert O'Callahan (:roc) (Mozilla Corporation) from comment #25)
> Comment on attachment 654852 [details] [diff] [review]
> fix
> 
> Review of attachment 654852 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> Fixes a serious regression introduced before the Aurora uplift.

I don't see a comment for this landing on central yet, will wait for that to approve aurora uplift.
https://hg.mozilla.org/mozilla-central/rev/3d7e89acf4c2
Status: NEW → RESOLVED
Closed: 12 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla18
QA Contact: jsmith
Whiteboard: [qa+]
Whiteboard: [qa+] → [qa+], [getUserMedia], [blocking-gum+]
Verified on 8/30 Nightly build.
Status: RESOLVED → VERIFIED
Whiteboard: [qa+], [getUserMedia], [blocking-gum+] → [qa!], [getUserMedia], [blocking-gum+]
Attachment #654852 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: