Closed Bug 994903 Opened 6 years ago Closed 6 years ago

Camera app crashes during stability test in mozilla::layers::GrallocImage::~GrallocImage

Categories

(Firefox OS Graveyard :: Gaia::Camera, defect, P1, critical)

ARM
Gonk (Firefox OS)
defect

Tracking

(blocking-b2g:1.4+, firefox29 wontfix, firefox30 fixed, firefox31 fixed, b2g-v1.4 fixed, b2g-v2.0 fixed)

RESOLVED FIXED
1.4 S6 (25apr)
blocking-b2g 1.4+
Tracking Status
firefox29 --- wontfix
firefox30 --- fixed
firefox31 --- fixed
b2g-v1.4 --- fixed
b2g-v2.0 --- fixed

People

(Reporter: tkundu, Assigned: sotaro)

References

Details

(Keywords: crash, Whiteboard: [caf-crash 171][caf priority: p1][CR 646525][b2g-crash])

Crash Data

Attachments

(2 files, 3 obsolete files)

Attached file logsAndStackTrace.zip
STR:

1. Run stability test with Call, SMS, Browser, Camera, Camcorder, Music and video . 
2. Receive MT calls, while running automation test cases.
3. After night run, mini dumps are generated in the phone.
blocking-b2g: --- → 1.4?
Sotaro, looks like another graphics buffer lifetime issue.

Crash reason:  SIGSEGV
Crash address: 0x5a5a5a58

Thread 7 (crashed)
 0  0x5a5a5a58
     r4 = 0xb140a714    r5 = 0xb140a710    r6 = 0xc2c444fa    r7 = 0x000014fc
     r8 = 0xc2c1c4ec    r9 = 0x000014fc   r10 = 0xb329f670    fp = 0xb1341d40
     sp = 0xb3a01930    lr = 0xb5973ecb    pc = 0x5a5a5a58
    Found by: given as instruction pointer in context
 1  libxul.so!mozilla::layers::GrallocImage::~GrallocImage() [RefPtr.h : 274 + 0x5]
     sp = 0xb3a01940    pc = 0xb59a14d1
    Found by: stack scanning
 2  libxul.so!mozilla::layers::GrallocImage::~GrallocImage() [GrallocImages.cpp : 74 + 0x3]
     r4 = 0xb1341ea0    r5 = 0x00000000    r6 = 0xc2c444fa    sp = 0xb3a01988
     pc = 0xb59a1501
    Found by: call frame info
 3  libxul.so!CSF::CC_Call::Release() + 0x17
     r4 = 0xb1341ea0    r5 = 0x00000000    r6 = 0xc2c444fa    sp = 0xb3a01990
     pc = 0xb59002e9
    Found by: call frame info
 4  libxul.so!mozilla::VideoFrameContainer::SetCurrentFrame(nsIntSize const&, mozilla::layers::Image*, mozilla::TimeStamp) [nsAutoPtr.h : 900 + 0x5]
     r4 = 0xb1341ea0    r5 = 0x00000000    r6 = 0xc2c444fa    sp = 0xb3a019a0
     pc = 0xb5e8d7e3
    Found by: call frame info
 5  libxul.so!mozilla::CameraPreviewMediaStream::SetCurrentFrame(nsIntSize const&, mozilla::layers::Image*) [CameraPreviewMediaStream.cpp : 113 + 0xb]
     r4 = 0xc4bff438    r5 = 0x000014fc    r6 = 0xb471e3d0    r7 = 0x00000000
     r8 = 0xb32bfbc0    r9 = 0xb1341d40   r10 = 0xb3a01a30    fp = 0xb5d510ad
     sp = 0xb3a019e0    pc = 0xb5d51287
    Found by: call frame info
 6  libxul.so!mozilla::DOMCameraControlListener::OnNewPreviewFrame(mozilla::layers::Image*, unsigned int, unsigned int) [DOMCameraControlListener.cpp : 246 + 0x9]
     r4 = 0xb1341d40    r5 = 0x00000000    r6 = 0xb471e3d0    r7 = 0xb1341d40
     r8 = 0x00000160    r9 = 0x00000120   r10 = 0x00000230    fp = 0xb3298258
     sp = 0xb3a01a30    pc = 0xb5d53e4d
    Found by: call frame info
 7  libxul.so!mozilla::CameraControlImpl::OnNewPreviewFrame(mozilla::layers::Image*, unsigned int, unsigned int) [CameraControlImpl.cpp : 224 + 0x9]
     r4 = 0x00000001    r5 = 0x00000000    r6 = 0xb471e3d0    r7 = 0xb1341d40
     r8 = 0x00000160    r9 = 0x00000120   r10 = 0x00000230    fp = 0xb3298258
     sp = 0xb3a01a40    pc = 0xb5d50a89
    Found by: call frame info
 8  libxul.so!mozilla::nsGonkCameraControl::OnNewPreviewFrame(mozilla::layers::GraphicBufferLocked*) [GonkCameraControl.cpp : 1381 + 0xb]
     r4 = 0xb1341d40    r5 = 0xb471e3d0    r6 = 0xb14fe5e0    r7 = 0xb3a01b48
     r8 = 0x00000001    r9 = 0xb3298a70   r10 = 0x00000230    fp = 0xb3298258
     sp = 0xb3a01a68    pc = 0xb5d55c23
    Found by: call frame info
 9  libxul.so!android::GonkCameraHardware::OnNewFrame() [GonkCameraHwMgr.cpp : 56 + 0x5]
     r4 = 0xb3a01a88    r5 = 0xb2ae5d40    r6 = 0xb3298a50    r7 = 0xb3a01b48
     r8 = 0x00000001    r9 = 0xb3298a70   r10 = 0x00000230    fp = 0xb3298258
     sp = 0xb3a01a88    pc = 0xb5d56d17
    Found by: call frame info
10  libxul.so!android::GonkNativeWindow::onFrameAvailable() [GonkNativeWindowKK.cpp : 173 + 0x5]
     r4 = 0xb2add000    r5 = 0x00000007    r6 = 0xb3298a50    r7 = 0xb3a01b48
     r8 = 0x00000001    r9 = 0xb3298a70   r10 = 0x00000230    fp = 0xb3298258
     sp = 0xb3a01aa0    pc = 0xb5c047ed
    Found by: call frame info
11  libxul.so!android::GonkBufferQueue::ProxyConsumerListener::onFrameAvailable() [GonkBufferQueueKK.cpp : 1276 + 0x5]
     r4 = 0xb2c72e60    r5 = 0x00000007    r6 = 0xb3298a50    r7 = 0xb3a01b48
     r8 = 0x00000001    r9 = 0xb3298a70   r10 = 0x00000230    fp = 0xb3298258
     sp = 0xb3a01aa8    pc = 0xb5c0263d
    Found by: call frame info
12  libxul.so!android::GonkBufferQueue::queueBuffer(int, android::IGraphicBufferProducer::QueueBufferInput const&, android::IGraphicBufferProducer::QueueBufferOutput*) [GonkBufferQueueKK.cpp : 638 + 0x5]
     r4 = 0xb2c72e60    r5 = 0x00000007    r6 = 0xb3298a50    r7 = 0xb3a01b48
     r8 = 0x00000001    r9 = 0xb3298a70   r10 = 0x00000230    fp = 0xb3298258
     sp = 0xb3a01ab8    pc = 0xb5c0291f
    Found by: call frame info
13  libgui.so!android::BnGraphicBufferProducer::onTransact(unsigned int, android::Parcel const&, android::Parcel*, unsigned int) [IGraphicBufferProducer.cpp : 246 + 0x3]
     r4 = 0xb5c02649    r5 = 0xb3a01cd4    r6 = 0xb3298000    r7 = 0xb3a01c10
     r8 = 0x00000000    r9 = 0x00000007   r10 = 0xbee22c74    fp = 0xb6ec82ec
     sp = 0xb3a01bf0    pc = 0xb503a73b
    Found by: call frame info
Flags: needinfo?(sotaro.ikeda.g)
Hmm, GrallocImage is also thread safe referenced...
Flags: needinfo?(sotaro.ikeda.g)
Assignee: nobody → sotaro.ikeda.g
Tapas, like Bug 985681, can you test again by applying Bug 988713 fix?
Flags: needinfo?(tkundu)
blocking-b2g: 1.4? → 1.4+
pc is causing the crash. It seems that during GrallocImage::~GrallocImage(), jump to incorrect pc address. 

--------------------------------
 0  0x5a5a5a58
     r4 = 0xb140a714    r5 = 0xb140a710    r6 = 0xc2c444fa    r7 = 0x000014fc
     r8 = 0xc2c1c4ec    r9 = 0x000014fc   r10 = 0xb329f670    fp = 0xb1341d40
     sp = 0xb3a01930    lr = 0xb5973ecb    pc = 0x5a5a5a58
    Found by: given as instruction pointer in context
 1  libxul.so!mozilla::layers::GrallocImage::~GrallocImage() [RefPtr.h : 274 + 0x5]
     sp = 0xb3a01940    pc = 0xb59a14d1
    Found by: stack scanning
There seems a possibility that this bug is dup of Bug 985681. Bug 985681's crash calls stack also have GrallocImage::~GrallocImage().
Flags: needinfo?(tkundu)
(In reply to Sotaro Ikeda [:sotaro] from comment #3)
> Tapas, like Bug 985681, can you test again by applying Bug 988713 fix?

Sorry, I cancel the request. We found one problem on gecko source. :jrmuizel found it.
AtomicRefCountedWithFinalize::Release() has race condition problem.

http://mxr.mozilla.org/mozilla-central/source/gfx/layers/AtomicRefCountedWithFinalize.h#31
So, this bug seems a different problem than Bug 985681.
If two threads are almost concurrently call Release(), the following part could be called after the object deletion. It could happen depends on thread scheduling.

>  } else if (1 == currCount && mRecycleCallback) {
Only fix seems that changing AtomicRefCountedWithFinalize as recycling happen on count become 0. On current implementation recycling happens when count becomes 1. It is implemented like it just because of simplicity.
0x5a5a5a58 says that the code is trying to access already deleted(initialized) data.
This change request the followings to recycler.
- Recycler needs to hold RefPtr<> during waiting recycling callback.
- Recycler needs to call ClearRecycleCallback() before release object.
Fix nits.
Attachment #8405021 - Attachment is obsolete: true
Keywords: crash
Whiteboard: [CR 646525] → [CR 646525][b2g-crash]
This seems simpler than previous one.
Attachment #8405022 - Attachment is obsolete: true
Comment on attachment 8405025 [details] [diff] [review]
patch v3  - Copy mRecycleCallback on top of Release()

Jeff, can you review the patch?
Attachment #8405025 - Flags: review?(jmuizelaar)
Comment on attachment 8405025 [details] [diff] [review]
patch v3  - Copy mRecycleCallback on top of Release()

Review of attachment 8405025 [details] [diff] [review]:
-----------------------------------------------------------------

This seems like a reasonable short term solution. I'd suggest adding a comment that 
we read mRecycleCallback early so that it does not get set to deleted memory, if the
object is goes away.
Attachment #8405025 - Flags: review?(jmuizelaar) → review+
Update a comment. Carry "r=jrmuizel".
Attachment #8405025 - Attachment is obsolete: true
Attachment #8405405 - Flags: review+
Status: NEW → ASSIGNED
Attachment #8405405 - Flags: review?(bgirard)
Attachment #8405405 - Flags: review?(bgirard) → review+
https://hg.mozilla.org/mozilla-central/rev/9778da56668b
Status: ASSIGNED → RESOLVED
Closed: 6 years ago
Resolution: --- → FIXED
Target Milestone: --- → 1.5 S1 (9may)
Tapas, can you test again by applying attachment 8405405 [details] [diff] [review] in Bug 994903 and Bug 988713 fix?
Flags: needinfo?(tkundu)
(In reply to Sotaro Ikeda [:sotaro] from comment #21)
> Tapas, can you test again by applying attachment 8405405 [details] [diff] [review]
> [review] in Bug 994903 and Bug 988713 fix?

Issue is still present even after applying this fix .
Status: RESOLVED → REOPENED
Flags: needinfo?(tkundu) → needinfo?(sotaro.ikeda.g)
Resolution: FIXED → ---
(In reply to Tapas Kumar Kundu from comment #22)
> (In reply to Sotaro Ikeda [:sotaro] from comment #21)
> > Tapas, can you test again by applying attachment 8405405 [details] [diff] [review]
> > [review] in Bug 994903 and Bug 988713 fix?
> 
> Issue is still present even after applying this fix .

Tapas, can you provide the crash log? It is totally same as attachment 8404927 [details] in comment 0?
Flags: needinfo?(sotaro.ikeda.g)
Flags: needinfo?(tkundu)
(In reply to Ryan VanderMeulen [:RyanVM UTC-4] from comment #20)
> https://hg.mozilla.org/mozilla-central/rev/9778da56668b

https://hg.mozilla.org/releases/mozilla-aurora/rev/2a597beb4be7
Target Milestone: 1.5 S1 (9may) → 1.4 S6 (25apr)
(In reply to Sotaro Ikeda [:sotaro] from comment #23)
> (In reply to Tapas Kumar Kundu from comment #22)
> > (In reply to Sotaro Ikeda [:sotaro] from comment #21)
> > > Tapas, can you test again by applying attachment 8405405 [details] [diff] [review]
> > > [review] in Bug 994903 and Bug 988713 fix?
> > 
> > Issue is still present even after applying this fix .
> 
> Tapas, can you provide the crash log? It is totally same as attachment
> 8404927 [details] in comment 0?

Sorry, #comment 22 is my misunderstanding. Our internal test team is still testing build with your fix. I will update here soon .
(In reply to Tapas Kumar Kundu from comment #25)
> (In reply to Sotaro Ikeda [:sotaro] from comment #23)
> > (In reply to Tapas Kumar Kundu from comment #22)
> > > (In reply to Sotaro Ikeda [:sotaro] from comment #21)
> > > > Tapas, can you test again by applying attachment 8405405 [details] [diff] [review]
> > > > [review] in Bug 994903 and Bug 988713 fix?
> > > 
> > > Issue is still present even after applying this fix .
> > 
> > Tapas, can you provide the crash log? It is totally same as attachment
> > 8404927 [details] in comment 0?
> 
> Sorry, #comment 22 is my misunderstanding. Our internal test team is still
> testing build with your fix. I will update here soon .

It is not coming anymore in v1.4 tip of gaia/gecko. Sorry for re-opening it. It can be closed now. Thanks a lot for your work
Flags: needinfo?(tkundu)
Closed based on Comment 26.
Status: REOPENED → RESOLVED
Closed: 6 years ago6 years ago
Resolution: --- → FIXED
Whiteboard: [CR 646525][b2g-crash] → [caf priority: p1][CR 646525][b2g-crash]
Observed on: 

Device: 
Gonk Version: AU_LINUX_GECKO_B2G_KK_3.5.01.04.00.113.068
Moz BuildID: 20140331000202
B2G Version: 1.4
Gecko Version: 30.0a2
Gaia:  http://git.mozilla.org/?p=releases/gaia.git;a=commit;h=4c3b2f57f4229c5f36f0d8fd399e65f4db88f104
Gecko: http://git.mozilla.org/?p=releases/gecko.git;a=commit;h=d02b9250ef7fedafc6c709dfcc899844b8624ab6
Whiteboard: [caf priority: p1][CR 646525][b2g-crash] → [caf-crash 171][caf priority: p1][CR 646525][b2g-crash]
Flags: in-moztrap?(bzumwalt)
STR involves automation processes that are unusable to create test case.
QA Whiteboard: [QAnalyst-Triage?]
Flags: needinfo?(ktucker)
QA Whiteboard: [QAnalyst-Triage?] → [QAnalyst-Triage+]
Flags: needinfo?(ktucker)
Flags: in-moztrap?(bzumwalt)
Flags: in-moztrap-
You need to log in before you can comment on or make changes to this bug.