Closed Bug 1440255 Opened 2 years ago Closed 2 years ago

Crash @ java.lang.RuntimeException: Camera thread already started! at org.webrtc.videoengine.VideoCaptureAndroid.startCapture(VideoCaptureAndroid.java) - Tokbox crashes on Nightly

Categories

(Core :: WebRTC: Audio/Video, defect, P1, critical)

60 Branch
All
Android
defect

Tracking

()

RESOLVED FIXED
mozilla61
Tracking Status
firefox-esr52 --- unaffected
firefox58 --- unaffected
firefox59 --- unaffected
firefox60 + verified
firefox61 --- verified

People

(Reporter: oana.horvath, Assigned: pehrsons)

References

Details

(Keywords: crash, regression)

Crash Data

Attachments

(6 files)

build: Nightly 60.0a1 (2018-02-21).
Devices:
Huawei Nexus 6P (Android 8.0)
Google Pixel (Android 8.0)
LG G4 (Android 5.1)
Oneplus Two (Android 6.0.1)

STR:
1. Go to opentokrtc.com
2. Enter a room and allow to share camera & mic.
3. A second permission request will be displayed, share again.


Expected result:
Permission request to access camera & mic should be displayed only once. No crash.

Actual results:
The permission to share camera & mic is requested twice before entering the chat room. A crash occurs after the second permission request is allowed.
If you are able to enter the room, the permission will be displayed again.
Rank: 15
Priority: -- → P2
Oana,

it looks like we haven't touched that code Android code in a long time.
Can I ask you to check if the same problem exists for 58 and 59?
Flags: needinfo?(oana.horvath)
Hi Nils,
It doesn't happen on 58 & 59.

Ran a bisection and found the following:
-Last good build: 2018-01-31
-First bad build: 2018-02-01

2018-02-28T11:22:24: DEBUG : Using url: https://hg.mozilla.org/integration/mozilla-inbound/json-pushes?changeset=93a8b769cb7a76aa8eb1c0b182c84d5bdf2fcb71&full=1
2018-02-28T11:22:25: DEBUG : Found commit message:
Bug 1426718 - Assert that we append at most once per stream per iteration. r=padenot

MozReview-Commit-ID: 3bNTZRhv839

Hope this helps.
Flags: needinfo?(oana.horvath)
Crash Signature: @ java.lang.RuntimeException: Camera thread already started! at org.webrtc.videoengine.VideoCaptureAndroid.startCapture(VideoCaptureAndroid.java) → [@ java.lang.RuntimeException: Camera thread already started! at org.webrtc.videoengine.VideoCaptureAndroid.startCapture(VideoCaptureAndroid.java)]
Paul, do you have any idea how your change in bug 1426718 could cause such a crash on Android.
Depends on: 1426718
Flags: needinfo?(padenot)
[Tracking Requested - why for this release]: crash regression

That's not the only bug in that push. Bug 1299515 is more likely as it did a lot of refactoring.
Blocks: 1299515
No longer depends on: 1426718
Flags: needinfo?(padenot)
Duplicate of this bug: 1442841
Assignee: nobody → apehrson
Rank: 15 → 11
Component: WebRTC → WebRTC: Audio/Video
Andreas, have you been able to get to this?
Flags: needinfo?(apehrson)
Not yet, but it's among the highest on my priority list now.
Status: NEW → ASSIGNED
Flags: needinfo?(apehrson)
(In reply to Andreas Pehrson [:pehrsons] from comment #7)
> Not yet, but it's among the highest on my priority list now.

checking back in on this.. any updates?
Flags: needinfo?(apehrson)
I've taken a quick look but nothing concrete yet. I'm increasing priority to keep this floating above some of the other regressions I'm battling.
Rank: 11 → 9
Flags: needinfo?(apehrson)
Priority: P2 → P1
Comment on attachment 8965404 [details]
Bug 1440255 - Backout bug 1420585 deadlock wallpaper fix

https://reviewboard.mozilla.org/r/234156/#review239782
Attachment #8965404 - Flags: review?(dminor) → review+
Comment on attachment 8965405 [details]
Bug 1440255 - Move thread exchanging out to runnables.

https://reviewboard.mozilla.org/r/234158/#review239784
Attachment #8965405 - Flags: review?(dminor) → review+
Comment on attachment 8965406 [details]
Bug 1440255 - Make VideoCaptureAndroid reconfigurable through subsequent startCapture.

https://reviewboard.mozilla.org/r/234160/#review239788
Attachment #8965406 - Flags: review?(dminor) → review+
Comment on attachment 8965407 [details]
Bug 1440255 - Consolidate failure paths.

https://reviewboard.mozilla.org/r/234162/#review239796

::: media/webrtc/trunk/webrtc/modules/video_capture/android/java/src/org/webrtc/videoengine/VideoCaptureAndroid.java:135
(Diff revision 1)
> +      // Starting failed on the camera thread. The looper has now quit and the
> +      // camera thread is dead.
> +      try {
> +        cameraThread.join();
> +      } catch (InterruptedException e) {
> +        throw new RuntimeException(e);

I think nothing catches this exception, so this will end up causing a crash. Is being interrupted on the thread join a severe enough error to justify crashing?
Attachment #8965407 - Flags: review?(dminor) → review+
Comment on attachment 8965408 [details]
Bug 1440255 - Remove tabs from VideoCaptureAndroid.

https://reviewboard.mozilla.org/r/234164/#review239800
Attachment #8965408 - Flags: review?(dminor) → review+
Comment on attachment 8965407 [details]
Bug 1440255 - Consolidate failure paths.

https://reviewboard.mozilla.org/r/234162/#review239796

> I think nothing catches this exception, so this will end up causing a crash. Is being interrupted on the thread join a severe enough error to justify crashing?

Can something magically interrupt the thread? We never call interrupt() on it.
I could let it pass silently I suppose.
Comment on attachment 8965407 [details]
Bug 1440255 - Consolidate failure paths.

https://reviewboard.mozilla.org/r/234162/#review239796

> Can something magically interrupt the thread? We never call interrupt() on it.
> I could let it pass silently I suppose.

Actually, stopCapture() has a similar uncaught exception on failed join(). With that in mind, this seems fine, and I'll drop the issue.
Pushed by pehrsons@gmail.com:
https://hg.mozilla.org/integration/autoland/rev/010d3951685d
Backout bug 1420585 deadlock wallpaper fix r=dminor
https://hg.mozilla.org/integration/autoland/rev/f4d6d91cd350
Move thread exchanging out to runnables. r=dminor
https://hg.mozilla.org/integration/autoland/rev/662f4b6498ed
Make VideoCaptureAndroid reconfigurable through subsequent startCapture. r=dminor
https://hg.mozilla.org/integration/autoland/rev/54d05863ef6a
Consolidate failure paths. r=dminor
https://hg.mozilla.org/integration/autoland/rev/6764192d7dc5
Remove tabs from VideoCaptureAndroid. r=dminor
Basic STR in comment 0.

Also, a more complex case would be using two tabs at the same time:
1. In one tab, load https://jsfiddle.net/pehrsons/r1jz5q9h/
2. Click Start. Approve the permission prompt. It should say "Started" below the video element.
3. In another tab, load https://jsfiddle.net/pehrsons/yrbxrzhq/
4. Click Start. Approve all the permission prompts. On desktop, it's ok to persist the permissions.

Note that in both fiddles you must select the same camera.

We expect the first tab to not change resolution and to not crash.

We expect the second tab to pass all tests. It will say "DONE All 27 cases tested" at the bottom when done.
Flags: qe-verify+
(In reply to Andreas Pehrson [:pehrsons] from comment #28)
> Also, a more complex case would be using two tabs at the same time:
> 1. In one tab, load https://jsfiddle.net/pehrsons/r1jz5q9h/
> 2. Click Start. Approve the permission prompt. It should say "Started" below
> the video element.
> 3. In another tab, load https://jsfiddle.net/pehrsons/yrbxrzhq/
> 4. Click Start. Approve all the permission prompts. On desktop, it's ok to
> persist the permissions.
> 
> Note that in both fiddles you must select the same camera.
> 
> We expect the first tab to not change resolution and to not crash.
> 
> We expect the second tab to pass all tests. It will say "DONE All 27 cases
> tested" at the bottom when done.

Now that I just wrote this test case and tested it on Android, we're hanging when trying to go beyond 480x640 (the default, and what the first tab uses). I'll file a new bug to get this fixed.
See Also: → 1452048
So for the verification, please test the basic case in comment 0, and we'll take the complex one to bug 1452048.
Comment on attachment 8965404 [details]
Bug 1440255 - Backout bug 1420585 deadlock wallpaper fix

This uplift request applies to all patches in this bug.

Approval Request Comment
[Feature/Bug causing the regression]: bug 1299515
[User impact if declined]: Crashes on Android on some websites that use camera capture.
[Is this code covered by automated tests?]: No, we have no automated test coverage of real device backends on Android.
[Has the fix been verified in Nightly?]: No. I have verified it locally however.
[Needs manual test from QE? If yes, steps to reproduce]: Yes. See comment 0 for STR.
[List of other uplifts needed for the feature/fix]: None, but note that this uplift will block other uplifts.
[Is the change risky?]: No
[Why is the change risky/not risky?]: There's one path here that carries some risk, and it's the one where we reconfigure the camera. The other changes are very simple. This risk comes into play when we before this patch would always crash, and after this patch still have issues *sometimes* (bug 1452048, a deadlock that I will request uplift for later). This deadlock will freeze the camera thread and will not affect main thread. This will appear as the camera image freezing, which is IMO better than a crash, but you can also decide to wait for bug 1452048 and uplift them together.
[String changes made/needed]: None
Attachment #8965404 - Flags: approval-mozilla-beta?
Comment on attachment 8965404 [details]
Bug 1440255 - Backout bug 1420585 deadlock wallpaper fix

Bug 1452048 sounds like something we'll want, but it also seems to me that this would benefit from getting an extra week of testing in the mean time by getting into today's b11 build.
Attachment #8965404 - Flags: approval-mozilla-beta? → approval-mozilla-beta+
Verified as fixed on Beta 60.0b11. 
Verified on:
Prestigio Grace X5 (Android 4.4.2)
Motorola Nexus 6 (Android 7.0)
HTC Desire 820 (Android 6.0.1)
Flags: qe-verify+
You need to log in before you can comment on or make changes to this bug.