Closed Bug 825687 Opened 7 years ago Closed 7 years ago

[B2G][Crash][Camera] User can crash the phone when on the lockscreen and tapping camera, then tapping on the unlock button.

Categories

(Firefox OS Graveyard :: General, defect, P1, critical)

ARM
Gonk (Firefox OS)
defect

Tracking

(blocking-basecamp:+, firefox19 fixed, firefox20 fixed, firefox21 fixed, b2g18 fixed)

VERIFIED FIXED
B2G C4 (2jan on)
blocking-basecamp +
Tracking Status
firefox19 --- fixed
firefox20 --- fixed
firefox21 --- fixed
b2g18 --- fixed

People

(Reporter: croesch, Assigned: kanru)

References

Details

(Keywords: crash, reproducible, Whiteboard: [b2g-crash])

Crash Data

Attachments

(3 files, 1 obsolete file)

User Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:17.0) Gecko/20100101 Firefox/17.0
Build ID: 20121128204232

Steps to reproduce:

1. Launch Unabi build 20121231070201 version 18.0
2. Go into settings and add a passcode lock to your phone.
3. Tap the Power button to lock the phone.
4. Tap the Power button again to bring up the phone lock screen. (Camera and Unlock buttons available here)
5. Tap the camera button then quickly tap the unlock button.
6. If done properly, the camera will try to launch but then the passcode screen appears. At this point the phone will appear to go dark and crash.

From this point I am unable to get the phone to recover/restart/power off etc....
The only way I can seem to fix this is by pulling the battery.

Repro 99% 9/10 attempts


Actual results:

when you tap the camera and then quickly tap the unlock button to bring up the security code screen, both screens try to load at the same time. For some reason this results in a extremely broken state for the phone.

There is a similar bug 818933 however I feel that this bug is a bit different and more severe. Plus the other bug is marked as fixed.


Expected results:

The two programs (Camera and Security screen?) should behave in a way that cannot result in a crash.
Attached image Phone Lock Camera Crash
Again, I'm not sure if this is related to bug 818933 or not.
Nominating since this is a 100% reproducible crash.

Report: https://crash-stats.mozilla.com/report/index/ead498e7-8583-4346-9354-c41872121231
Status: UNCONFIRMED → NEW
blocking-basecamp: --- → ?
Crash Signature: [@ jemalloc_crash | arena_dalloc | free | moz_free | mozilla::CameraControlImpl::~CameraControlImpl ]
Ever confirmed: true
Keywords: reproducible
Triage: BB+, P2, C4 - severe crash
blocking-basecamp: ? → +
Priority: -- → P2
Target Milestone: --- → B2G C4 (2jan on)
Assignee: nobody → alive
Duplicate of this bug: 823955
logcat output:

I/        (  108): Destroying camera 0
E/QualcommCamera(  108): Qint android::close_camera_device(hw_device_t*): device =0x48eb31a0 E
I/QualcommCamera(  108): void android::close_Hal_obj(camera_device*): E
I/QualcommCamera(  108): void android::close_Hal_obj(camera_device*): clear hw
I/QualcommCameraHardware(  108): ~QualcommCameraHardware E
V/QualcommCameraHardware(  108): ~MMCameraDL: E
V/QualcommCameraHardware(  108): closed MM Camera DL 
V/QualcommCameraHardware(  108): ~MMCameraDL: X
I/QualcommCameraHardware(  108): ~QualcommCameraHardware X
I/QualcommCamera(  108): void android::close_Hal_obj(camera_device*): X
I/PRLog   (  108): 1075070200[40404160]: virtual mozilla::nsGonkCameraControl::~nsGonkCameraControl():290
I/PRLog   (  108): 1075070200[40404160]: virtual mozilla::CameraControlImpl::~CameraControlImpl():34 : this=48232920
I/PRLog   (  108): 1075070200[40404160]: Camera hardware was closed
F/libc    (  108): Fatal signal 11 (SIGSEGV) at 0x5a5a5a5a (code=1)
From the above output, it looks like the CameraControl object is destroyed before the CameraControlImpl::OnClosedInternal() method is called.  OnClosedInternal() is called from an nsRunnable (see line 287--NS_NewRunnableMethod()) that holds a reference to CameraControl, so this looks like a double-free.
Assignee: alive → mhabicher
This patch will prevent the camera app/lockscreen from crashing.

With this patch, the following is observed:
1. press the power button to wake up the DUT (assume passcode is set)
2. quickly press the camera and the unlock buttons
  a. passcode-entry screen appears
  b. camera app may briefly appear
3. wait
  a. screen goes black, backlights turn off
4. press the power button
  a. softkeys light up
  b. display backlight seems to turn on
  c. screen remains black
5. wait
  a. screen and softkeys turn off

Once in this state, the lock screen cannot be brought up, but pressing and holding the power button will bring up the Phone menu, and the DUT can be restarted, etc.
Attachment #697248 - Flags: review?(kchen)
to P1 as it blocks 808607
Priority: P2 → P1
Component: Gaia → General
Comment on attachment 697248 [details] [diff] [review]
Prevent CameraControl destructors from creating self-referring runnables

Review of attachment 697248 [details] [diff] [review]:
-----------------------------------------------------------------

How did OnShutter and OnClosed been called? Are there dangling references? I'll need a closer look.
With or without this patch the b2g process does not crash. The bug is still reproducible.
(In reply to Kan-Ru Chen [:kanru] from comment #11)
> With or without this patch the b2g process does not crash. The bug is still
> reproducible.

Oh, it crashed. I tried the patch again and get the same result as in comment 8.
(In reply to Kan-Ru Chen [:kanru] from comment #10)
> Comment on attachment 697248 [details] [diff] [review]
> Prevent CameraControl destructors from creating self-referring runnables
> 
> Review of attachment 697248 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> How did OnShutter and OnClosed been called? Are there dangling references?
> I'll need a closer look.

Here's the code flow (MT = Main Thread, CT = Camera Thread):

0. [MT] DOMCameraControl and CameraControlImpl objects are created
1. [MT] InitGonkCameraControl nsRunnable is created and dispatched to CT
2. [MT] Navigator::OnNavigation() fires and removes the current window from the active window list in DOMCameraManager.cpp
3. [CT] InitGonkCameraControl runs, initializes the camera hardware, and calls DOMCameraControl::Result(), which creates a GetCameraResult nsRunnable and dispatches it to MT
4. [MT] GetCameraResult::Run() executes but the window was removed in step #2 so the new DOMCameraControl object is not returned to JavaScript
5. [MT] When the GetCameraResult runnable dies, it removes the last reference to the DOMCameraControl object and triggers its destructor
6. [MT] The DOMCameraControl object has an nsRefPtr<CameraControlImpl> (which really contains a GonkCameraControl reference), so those destructors are called as well
7. [MT] The GonkCameraControl destructor calls ReleaseHardwareImpl(), which calls GonkCameraHardware::ReleaseHandle()
8. [MT] ReleaseHandle() (which is a static member) calls |delete hw;| and the GonkCameraHardware destructor calls CameraControlImpl::OnClosed()
9. [MT] OnClosed() creates an nsRunnable using NS_NewRunnableMethod() that references the CameraControlImpl object and dispatches it to the MT
10. [MT] All of the destructors unwind
11. [MT] The runnable created in (9) executes

Things blow up because NS_NewRunnableMethod in step (9) ADDREFs the CameraControlImpl object inside the destructor chain and thinks it has a valid ref, even though the objects are already in the process of being destroyed.  This, when the new runnable in step (11) finishes, it triggers a double-free.

The solution in the patch I posted is to prevent step (9).  Looking at it again today, the guard in OnShutter() isn't required; the only case we need to handle here is OnClosed(), since that's the only code that can be triggered by the abrupt destruction of the CameraControl objects due to the missing window in step (4).

Regarding dangling references, the destructors clean everything up, and so long as the DOMCameraControl reference isn't returned to JS in step (4), there won't be any (other) asynchronous events waiting to get processed on the CT or inside the driver.
(In reply to Mike Habicher [:mikeh] from comment #13)
>
> This, when the new runnable in step (11) finishes, it triggers a double-free.

*This --> Thus
Status: NEW → ASSIGNED
I don't like that in step 8 we call to parent without knowing if it is dying. How about expose a IsAlive() method from CameraControl and check that before using it in the destructor? Otherwise we should check isDying in every camera driver callbacks.
Agreed--I want a cleaner way to do this.
Severity: normal → critical
Keywords: crash
Whiteboard: [b2g-crash]
Assignee: mhabicher → kchen
Attachment #697248 - Attachment is obsolete: true
Attachment #697248 - Flags: review?(kchen)
Attachment #698610 - Flags: review?(mhabicher)
Comment on attachment 698610 [details] [diff] [review]
Unregister camera control from the camera hardware

Review of attachment 698610 [details] [diff] [review]:
-----------------------------------------------------------------

Looks good!
Attachment #698610 - Flags: review?(mhabicher) → review+
Dammit, that should have been my jacket!  ;)
Now you know why I asked you ;)
Will Regress this after Test Pass 2.
Using the latest unagi nightly, 

Gecko: 8f2ef4998b60
Gaia: 45a3196a5517

I cannot get back in a good state - what is described in Comment 8 doesn't happen for me, and I end up having to reboot the phone to get back in a good working state.
In Unagi build 20130114073222 Version 18.0 I'm also getting into a unrecoverable state where it's just a black screen. This was using my original steps I wrote at the top. Phone must be restarted to be used.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Please file a new bug for the issue you are hitting and link it as a dependency. It makes tracking a royal pain when we reopen unless a patch is backed out.
Status: REOPENED → RESOLVED
Closed: 7 years ago7 years ago
Resolution: --- → FIXED
I'm going to close this and open a new bug (bug 830480) to track the lockscreen blackness.
LOL....damnit I just opened a new bug on it as well....sigh.

https://bugzilla.mozilla.org/show_bug.cgi?id=830483
Sorry, croesch. :)  I marked yours as a dup of mine.
OS: All → Gonk (Firefox OS)
Hardware: All → ARM
This issue does not reproduce on Unagi build 20130211070202 with Dec 5th Kernel

Phone does not crash when tapping "camera" and then tapping on the "unlock" button on the lockscreen.
Status: RESOLVED → VERIFIED
You need to log in before you can comment on or make changes to this bug.