crash in mozilla::nsDOMCameraControl::OnUserError(mozilla::CameraControlListener::UserContext, nsresult)

VERIFIED FIXED in Firefox 40

Status

()

defect
--
critical
VERIFIED FIXED
4 years ago
4 years ago

People

(Reporter: dharris, Assigned: aosmond)

Tracking

({crash, regression})

Trunk
mozilla40
ARM
Gonk (Firefox OS)
Points:
---

Firefox Tracking Flags

(blocking-b2g:2.2+, firefox38 wontfix, firefox39 wontfix, firefox40 fixed, b2g-v2.1 unaffected, b2g-v2.2 verified, b2g-master verified)

Details

(Whiteboard: [3.0-Daily-Testing], [319mb-flame-support] [b2g-crash], crash signature, URL)

Attachments

(5 attachments)

This bug was filed from the Socorro interface and is report bp-c7542b40-a05d-4e07-aec1-d8d612150424.
=============================================================

Description:
Going to card view or the home screen while taking a picture in the Camera app can make the app try to recover while the flash is turning on, and then become stuck recovering. This causes the Camera app to crash, and once it has crashed this way, the user cannot successfully open the Camera app until the device is restarted. SHB was enabled when this bug occurred, but I don't think it's necessary.

Prerequisite: Camera has to have flash enabled, or on auto

Repro Steps:
1) Update a Flame to 20150424010200
2) Open Camera app
3) Take a picture in a dark area so that the flash fires, then quickly long-press the home button
4) Return to Camera app
5) Take a picture in a dark area so that the flash fires, then quickly tap the home button
6) Repeat steps 3-5 until the app crashes

Actual:
Once the camera app has crashed this way, the user will not be able to open the app without it crashing until they restart their device


Expected:
Camera app can transition between card view and home screen without crashing


Environmental Variables:
Device: Flame 3.0 (319mb)(Kitkat)(Full Flash)
Build ID: 20150424010200
Gaia: 0c5e2ee1173f3c53379ef3cd10de714836258fe8
Gecko: 22a157f7feb7
Gonk: b83fc73de7b64594cd74b33e498bf08332b5d87b
Version: 40.0a1 (Master)
Firmware Version: v18D-1
User Agent: Mozilla/5.0 (Mobile; rv:40.0) Gecko/40.0 Firefox/40.0


Repro frequency: 2/20
See attached: Logcat, Video - https://youtu.be/4wo0q0EZdpM
I was unable to get this issue to occur on Flame 2.2

The repro rate is very low, so the issue could be present at a lower rate on this branch. Not marking tracking flags or the regression keyword for this reason.

Environmental Variables:
Device: Flame 2.2 (319mb)(Kitkat)(Full Flash)
Build ID: 20150424002507
Gaia: b838d0e7c163e66660dcb6e387d8339944a7a30e
Gecko: 5fe76b26e55f
Gonk: ebad7da532429a6f5efadc00bf6ad8a41288a429
Version: 37.0 (2.2)
Firmware Version: v18D-1
User Agent: Mozilla/5.0 (Mobile; rv:37.0) Gecko/37.0 Firefox/37.0
QA Whiteboard: [QAnalyst-Triage?]
Flags: needinfo?(pbylenga)
Adding steps-wanted to see if we can find better STR with a higher reproducibility rate.
QA Whiteboard: [QAnalyst-Triage?]
Flags: needinfo?(pbylenga)
Keywords: steps-wanted
I can reliably reproduce a crash using the provided STR; however, some of my crashes are missing symbols. They look like this:
https://crash-stats.mozilla.com/report/index/f94bcbb1-79e4-424e-b283-627c62150501
https://crash-stats.mozilla.com/report/index/382f068b-98d9-451a-b3a2-1fa862150501

Three of the crashes do match this bug's signature:
https://crash-stats.mozilla.com/report/index/79cc75ce-fc3b-45ff-90ec-2c22b2150501
https://crash-stats.mozilla.com/report/index/aa5337f3-1013-454d-83bd-d58f82150501
https://crash-stats.mozilla.com/report/index/17d45ef6-3bcb-46f6-bee2-b47292150501

As described in comment 0, after I see those crashes, the Camera app can no longer be used.

At this point I'm assuming the missing-symbols crashes are the same as this bug, because I did the exact same things and they have the same outcome (the Camera app crashing continuously).

I think it is a combination of a timing issue, a memory-pressure issue (I can't repro it with unrestricted memory), and a window management issue. I got the first crash within 5 minutes of trying, and all the rest of the crashes within 2 minutes (less than 15 attempts).

SHB is not needed to repro the crash. After FTU I went straight to Camera app attempting STR. I denied Camera app geolocation permission.

I also can't repro in v2.2. Marking v2.2 as unaffected.

Device: Flame 3.0 Master (full flashed 319MB KK)
BuildID: 20150501010203
Gaia: 759a1f935a6a81c32ad66e39a6353b334dfa4f91
Gecko: 7723b15ea695
Gonk: b83fc73de7b64594cd74b33e498bf08332b5d87b
Version: 40.0a1 (3.0 Master)
Firmware Version: v18D-1
User Agent: Mozilla/5.0 (Mobile; rv:40.0) Gecko/40.0 Firefox/40.0
QA Whiteboard: [QAnalyst-Triage?]
Flags: needinfo?(ktucker)
Keywords: steps-wanted → regression
Whiteboard: [3.0-Daily-Testing] → [3.0-Daily-Testing], [319mb-flame-support]
(Need info to help with the investigation, and to look into whether symbols are missing and why.)
Flags: needinfo?(nhirata.bugzilla)
QA Whiteboard: [QAnalyst-Triage?] → [QAnalyst-Triage+]
Flags: needinfo?(ktucker)
f94bcbb1-79e4-424e-b283-627c62150501 is the same crash.
checking the other crash.
382f068b-98d9-451a-b3a2-1fa862150501 is also the same crash.
Investigating why the symbols weren't attached.
Frame 	Module 	Signature 	Source
0 	libxul.so 	mozilla::nsDOMCameraControl::OnUserError(mozilla::CameraControlListener::UserContext, nsresult) 	dom/camera/DOMCameraControl.cpp
1 	libxul.so 	mozilla::DOMCameraControlListener::DOMCallback::Run() 	dom/camera/DOMCameraControlListener.cpp
2 	libxul.so 	nsThread::ProcessNextEvent(bool, bool*) 	xpcom/threads/nsThread.cpp
3 	libxul.so 	NS_ProcessNextEvent(nsIThread*, bool) 	xpcom/glue/nsThreadUtils.cpp
4 	libxul.so 	mozilla::ipc::MessagePump::Run(base::MessagePump::Delegate*) 	ipc/glue/MessagePump.cpp
5 	libxul.so 	MessageLoop::RunInternal() 	ipc/chromium/src/base/message_loop.cc
6 	libxul.so 	MessageLoop::Run() 	ipc/chromium/src/base/message_loop.cc
7 	libxul.so 	nsBaseAppShell::Run() 	widget/nsBaseAppShell.cpp
8 	libxul.so 	XRE_RunAppShell 	toolkit/xre/nsEmbedFunctions.cpp
9 	libxul.so 	MessageLoop::RunInternal() 	ipc/chromium/src/base/message_loop.cc
10 	libxul.so 	MessageLoop::Run() 	ipc/chromium/src/base/message_loop.cc
11 	libxul.so 	XRE_InitChildProcess 	toolkit/xre/nsEmbedFunctions.cpp
12 	libxul.so 	content_process_main(int, char**) 	ipc/contentproc/plugin-container.cpp
13 	libxul.so 	mozilla::ipc::ProcLoaderLoadRunner::DoWork() 	ipc/glue/ProcessUtils_linux.cpp
14 	libxul.so 	XRE_ProcLoaderServiceRun 	ipc/glue/ProcessUtils_linux.cpp
15 	b2g 	main 	b2g/app/B2GLoader.cpp
16 	libc.so 	__libc_init 	/builds/slave/b2g_m-cen_flm-kk_ntly-00000000/build/bionic/libc/bionic/libc_init_dynamic.cpp:112
17 	b2g 	b2g@0xc06a 	
18 	linker 	set_soinfo_pool_protection 	/builds/slave/b2g_m-cen_flm-kk_ntly-00000000/build/bionic/linker/linker.cpp:291
19 		@0xbe9e3d95
Component: Gaia::Camera → DOM: Device Interfaces
Flags: needinfo?(nhirata.bugzilla)
OS: Android → Gonk (Firefox OS)
Product: Firefox OS → Core
Hardware: Unspecified → ARM
Whiteboard: [3.0-Daily-Testing], [319mb-flame-support] → [3.0-Daily-Testing], [319mb-flame-support] [b2g-crash]
Version: unspecified → Trunk
In regards to the symbols not being mapped to the 5/1 nightly builds ( prod / eng ) :

Asking rhelmer in irc : 
bug 1085557 landed before 5/1 so we should see the symbols mapped and it's not mapped for 2 builds in bug 1158378, should we create a bug for this?

bug 1144865 was closed off as working on 5/4.

Decided to NI him here, so he can reply in the bug.  :)  We may need to file a separate bug?
Flags: needinfo?(rhelmer)
(Assignee)

Comment 10

4 years ago
From the stack trace, we know that in the nsDOMCameraControl object, an ICameraControl request failed with OnUserError. The line where it crashed indicates mCameraControl is null, thus mSetInitialConfig is true and aContext is kInSetConfiguration (the only branch that could dereference a null pointer in any event). That user error can only be triggered if SetConfiguration is called on the mCameraControl object -- since mSetInitialConfig is true, it must be the call in the nsDOMCameraControl constructor, hence a pre-existing camera object where Start() has already been called.

From the logs:

04-24 14:26:54.294: W/CameraService(207): CameraService::connect X (pid 32140) rejected (existing client).
04-24 14:26:54.294: W/CameraBase(32140): An error occurred while connecting to camera: 0

We see that mCameraControl->Start *failed*, which would have happened before the SetConfiguration call was issued. We probably missed the initial OnHardwareStateChange w/ OpenFailed event, but received it after we registered as a listener in the nsDOMCameraControl constructor, by which point we had already called SetConfiguration. The hardware state event caused us to clear mCameraControl and reject the get camera promise to clean up the app, as expected. However, the SetConfiguration call then failed because there was no underlying camera hardware (as expected), and we tried to stop the camera even though it was already gone.
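
To make the event ordering above easier to follow, here is a minimal, self-contained sketch of the race and of one way to guard against it. This is not the landed patch (the contents of attachment 8603076 are not shown in this bug) and it does not use the real Gecko classes: FakeCameraControl, DomCameraSketch, ReplayLateErrorSequence and the simplified enums are hypothetical names invented for illustration. It only assumes what is described above: the OpenFailed hardware event clears the camera pointer, and a SetConfiguration user error arrives afterwards.

```cpp
#include <cassert>
#include <memory>
#include <utility>

// Hypothetical stand-in for the hardware-facing camera object
// (NOT Gecko's ICameraControl).
struct FakeCameraControl {
  bool stopped = false;
  void Stop() { stopped = true; }
};

// Simplified, hypothetical versions of the states and contexts named above.
enum class HardwareState { kOpen, kOpenFailed };
enum class UserContext { kInStartCamera, kInSetConfiguration };

// Hypothetical DOM-side wrapper, loosely modeled on the analysis above.
class DomCameraSketch {
public:
  explicit DomCameraSketch(std::shared_ptr<FakeCameraControl> aControl)
    : mCameraControl(std::move(aControl)) {}

  // The late OpenFailed event rejects the pending promise and drops the camera.
  void OnHardwareStateChange(HardwareState aState) {
    if (aState == HardwareState::kOpenFailed) {
      mPromiseRejected = true;
      mCameraControl.reset();
    }
  }

  // Returns true if the error was handled against a live camera, false if it
  // was dropped because the camera had already been torn down.
  bool OnUserError(UserContext aContext) {
    if (!mCameraControl) {
      // Assumed guard: without this check, the Stop() call below would
      // dereference a null pointer when the error arrives after OpenFailed.
      return false;
    }
    if (aContext == UserContext::kInSetConfiguration && mSetInitialConfig) {
      mCameraControl->Stop();
      mPromiseRejected = true;
    }
    return true;
  }

  bool HasCamera() const { return static_cast<bool>(mCameraControl); }
  bool PromiseRejected() const { return mPromiseRejected; }

private:
  std::shared_ptr<FakeCameraControl> mCameraControl;
  bool mSetInitialConfig = true;
  bool mPromiseRejected = false;
};

// Replays the order described above: open failure first, then the late
// SetConfiguration error. Returns true if the late error is dropped safely.
inline bool ReplayLateErrorSequence() {
  DomCameraSketch camera(std::make_shared<FakeCameraControl>());
  camera.OnHardwareStateChange(HardwareState::kOpenFailed);
  const bool handled = camera.OnUserError(UserContext::kInSetConfiguration);
  return !handled && !camera.HasCamera() && camera.PromiseRejected();
}
```

Calling ReplayLateErrorSequence() exercises the failing order; with the null check removed, the same sequence would call Stop() through an empty pointer, which corresponds to the crash in frame 0 of the stack above.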
Assignee: nobody → aosmond
Status: NEW → ASSIGNED
(Assignee)

Comment 11

4 years ago
The potential for this crash was introduced in 2.2 by bug 1104913. Fix should be very simple.
Blocks: 1104913
blocking-b2g: --- → 2.2?
(Assignee)

Comment 12

4 years ago
try on b2g-inbound: https://treeherder.mozilla.org/#/jobs?repo=try&revision=b9c55919e926
try on b2g37_v2_2: https://treeherder.mozilla.org/#/jobs?repo=try&revision=c76c0bd91f2d

No test case; the race condition is a bit tricky to capture.
Attachment #8603076 - Flags: review?(dhylands)
(In reply to Naoki Hirata :nhirata (please use needinfo instead of cc) from comment #9)
> In regards to the symbols not being mapped to the 5/1 nightly builds ( prod
> / eng ) :
> 
> Asking rhelmer in irc : 
> bug 1085557 landed before 5/1 so we should see the symbols mapped and it's
> not mapped for 2 builds in bug 1158378, should we create a bug for this?
> 
> bug 1144865 was closed off as working on 5/4.
> 
> Decided to NI him here, so he can reply in the bug.  :)  We may need to file
> a separate bug?

Sorry for the delay, was on PTO. Yes please do file a bug, we're not aware of any missing symbol problems on the server side.
Flags: needinfo?(rhelmer)
Attachment #8603076 - Flags: review?(dhylands) → review+
https://hg.mozilla.org/mozilla-central/rev/2f8b7d236713
Status: ASSIGNED → RESOLVED
Last Resolved: 4 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla40

Updated

4 years ago
See Also: → 1155490
Hi Shawn,
This is the one Naoki mentioned that should probably be uplifted to 2.2, as the cause of bug 1155490 was introduced there.
Can you help to check? Thanks!
Hi Alison, Gerry,

Could you also follow up on this if we have to land it?
Flags: needinfo?(gchang)
Flags: needinfo?(ashiue)

Comment 18

4 years ago
Comment on attachment 8603076 [details] [diff] [review]
bug1158378.patch, v1

Review of attachment 8603076 [details] [diff] [review]:
-----------------------------------------------------------------

Should 2.2 take this patch?
Attachment #8603076 - Flags: approval-mozilla-b2g37?

Updated

4 years ago
blocking-b2g: 2.2? → 2.2+
Flags: needinfo?(jocheng)
Comment on attachment 8603076 [details] [diff] [review]
bug1158378.patch, v1

Approving this patch per comment 11 as a regression caused by bug 1104913.
Attachment #8603076 - Flags: approval-mozilla-b2g37? → approval-mozilla-b2g37+
Hi Becker,
Naoki mentioned this might be related to bug 1155490.
Per comment 11, this is a regression caused by bug 1104913.
Just FYI. Thanks.
Flags: needinfo?(behsieh)
Hi Josh,
From my observation, the root cause of bug 1155490 is that the camera driver sends an error event, which causes libcameraservice to crash.

It should not be related to this bug.
Thanks
Flags: needinfo?(behsieh)

Comment 23

4 years ago
Removing NI per comment 22.
Flags: needinfo?(gchang)
Flags: needinfo?(ashiue)
Thanks Becker for the explanation.  Much appreciated.
This issue is verified fixed on the latest Nightly 3.0 and 2.2 builds.

Actual Results: The camera app did not crash after following the steps 25 times.

Environmental Variables:
Device: Flame 3.0 KK (Full Flash) (319 MB)
BuildID: 20150520010202
Gaia: 600fd8249960b8256af9de67d9171025bb9a3ff3
Gecko: ac277e615f8f
Gonk: 040bb1e9ac8a5b6dd756fdd696aa37a8868b5c67
Version: 41.0a1 (3.0) 
Firmware Version: v18D-1
User Agent: Mozilla/5.0 (Mobile; rv:41.0) Gecko/41.0 Firefox/41.0

Environmental Variables:
Device: Flame 2.2 KK (Full Flash) (319 MB)
BuildID: 20150520002502
Gaia: 63e9eeec3032318f8a240f80b6a184fa4b50b6e1
Gecko: a89755309dea
Gonk: bd9cb3af2a0354577a6903917bc826489050b40d
Version: 37.0 (2.2) 
Firmware Version: v18D-1
User Agent: Mozilla/5.0 (Mobile; rv:37.0) Gecko/37.0 Firefox/37.0
Status: RESOLVED → VERIFIED
QA Whiteboard: [QAnalyst-Triage+] → [QAnalyst-Triage?]
Flags: needinfo?(ktucker)
QA Whiteboard: [QAnalyst-Triage?] → [QAnalyst-Triage+]
Flags: needinfo?(ktucker)