Android 6.0 devices (eg Nexus 5X/6P) reboot/restart or hang when playing audio/video

VERIFIED FIXED in Firefox 42, Firefox OS v2.5

Status

()

VERIFIED FIXED
3 years ago
2 years ago

People

(Reporter: snorp, Assigned: snorp)

Tracking

unspecified
Firefox 45
Points:
---

Firefox Tracking Flags

(firefox42+ verified, firefox43 verified, firefox44 fixed, firefox45 verified, b2g-v2.5 fixed, fennec42+)

Details

(URL)

Attachments

(7 attachments, 1 obsolete attachment)

At least it does on mine. Nightly. Seems pretty easy to repro.
So far I have a 100% reproduction rate by playing the video on this page: https://gma.yahoo.com/video-shows-taco-bell-execs-alleged-drunk-attack-061200233--abc-news-topstories.html#

Logcat didn't show anything interesting before the reboot.
With a local build it doesn't crash, but instead freezes the entire device. Logcat shows:

11-03 13:29:32.144 14239 14239 D GeckoBrowserApp: BrowserApp.onTabChanged: 0: AUDIO_PLAYING_CHANGE
11-03 13:29:32.152  3266  5493 D audio_hw_primary: select_devices: out_snd_device(2: speaker) in_snd_device(0: none)
11-03 13:29:32.152  3266  5493 D msm8974_platform: platform_send_audio_calibration: sending audio calibration for snd_device(2) acdb_id(15)
11-03 13:29:32.152  3266  5493 I soundtrigger: audio_extn_sound_trigger_update_device_status: device 0x2 of type 0 for Event 1
11-03 13:29:33.779  4404  9881 D NetlinkSocketObserver: NeighborEvent{elapsedMs=1060170, 192.168.1.1, [AC220B880DCA], RTM_NEWNEIGH, NUD_REACHABLE}
11-03 13:29:36.078 13651 13651 I ServiceManager: Waiting for service AtCmdFwd...
11-03 13:29:37.079 13651 13651 I ServiceManager: Waiting for service AtCmdFwd...
11-03 13:29:38.079 13651 13651 I ServiceManager: Waiting for service AtCmdFwd...
11-03 13:29:39.080 13651 13651 I ServiceManager: Waiting for service AtCmdFwd...
11-03 13:29:40.081 13651 13651 I ServiceManager: Waiting for service AtCmdFwd...
11-03 13:29:41.081 13651 13651 W Atfwd_Sendcmd: AtCmdFwd service not published, waiting... retryCnt : 3
11-03 13:29:56.082 13651 13651 I ServiceManager: Waiting for service AtCmdFwd...
11-03 13:29:57.082 13651 13651 I ServiceManager: Waiting for service AtCmdFwd...
11-03 13:29:58.083 13651 13651 I ServiceManager: Waiting for service AtCmdFwd...
11-03 13:29:59.083 13651 13651 I ServiceManager: Waiting for service AtCmdFwd...
11-03 13:29:59.649  4404  5546 D ConnectivityService: notifyType CAP_CHANGED for NetworkAgentInfo [WIFI () - 102]
11-03 13:30:00.084 13651 13651 I ServiceManager: Waiting for service AtCmdFwd...
11-03 13:30:01.023  4404  4430 I ActivityManager: Start proc 14464:com.google.android.deskclock/u0a66 for broadcast com.google.android.deskclock/com.android.alarmclock.DigitalAppWidgetProvider
11-03 13:30:01.084 13651 13651 W Atfwd_Sendcmd: AtCmdFwd service not published, waiting... retryCnt : 4
11-03 13:30:02.659  4404  5546 D ConnectivityService: notifyType CAP_CHANGED for NetworkAgentInfo [WIFI () - 102]
11-03 13:30:04.881  4404  5544 D WifiStateMachine: starting scan for "Willcox"WPA_PSK with 5680,2437
11-03 13:30:05.106  4404  5544 D WifiStateMachine: shouldSwitchNetwork  txSuccessRate=8.08 rxSuccessRate=5.75 delta 999 -> 993
11-03 13:30:05.107  4404  5544 D WifiStateMachine: CMD_AUTO_ROAM sup state CompletedState my state ConnectedState nid=0 config "Willcox"WPA_PSK roam=1 to any targetRoamBSSID any
11-03 13:30:05.107  4404  5544 D WifiStateMachine: AUTO_ROAM nothing to do
11-03 13:30:16.173  8014  8014 I clatd   : Detecting NAT64 prefix from DNS...
11-03 13:30:16.176  8014  8014 E clatd   : plat_prefix/dns(ipv4only.arpa) status = 7/No address associated with hostname
11-03 13:30:16.176  8014  8014 W clatd   : dns64_detection -- error, sleeping for 120 seconds
11-03 13:30:21.490  4404  9881 D NetlinkSocketObserver: NeighborEvent{elapsedMs=1107880, 192.168.1.1, [AC220B880DCA], RTM_NEWNEIGH, NUD_STALE}
11-03 13:30:26.790  4404  9881 D NetlinkSocketObserver: NeighborEvent{elapsedMs=1113180, 192.168.1.1, [AC220B880DCA], RTM_NEWNEIGH, NUD_REACHABLE}
11-03 13:30:41.351  4404  6314 I Process : Sending signal. PID: 4404 SIG: 3
11-03 13:30:41.352  4404  4408 I art     : Thread[2,tid=4408,WaitingInMainSignalCatcherLoop,Thread*=0x7f85d43000,peer=0x12c010a0,"Signal Catcher"]: reacting to signal 3
11-03 13:30:46.419  4404  4408 W libbacktrace: bool ThreadEntry::Wait(int): pthread_cond_timedwait for value 1 failed: Connection timed out
11-03 13:30:46.420  4404  4408 E libbacktrace: bool BacktraceCurrent::UnwindThread(size_t): Timed out waiting for signal handler to get ucontext data.
11-03 13:30:51.480  4404  4408 W libbacktrace: bool ThreadEntry::Wait(int): pthread_cond_timedwait for value 1 failed: Connection timed out
11-03 13:30:51.480  4404  4408 E libbacktrace: bool BacktraceCurrent::UnwindThread(size_t): Timed out waiting for signal handler to get ucontext data.
11-03 13:30:56.510  4404  4408 W libbacktrace: bool ThreadEntry::Wait(int): pthread_cond_timedwait for value 1 failed: Connection timed out
11-03 13:30:56.511  4404  4408 E libbacktrace: bool BacktraceCurrent::UnwindThread(size_t): Timed out waiting for signal handler to get ucontext data.
11-03 13:30:56.603 14239 14256 I GeckoWebappsUpdateTimer: periodic check for webapp updates
11-03 13:30:56.603 14239 14256 D GeckoWebappManager: checkForUpdates
11-03 13:30:56.614 14239 14256 D GeckoWebapps: Saving /data/user/0/org.mozilla.fennec_snorp/files/mozilla/webapps/webapps.json
11-03 13:30:56.623 14239 14256 D GeckoWebapps: Success saving /data/user/0/org.mozilla.fennec_snorp/files/mozilla/webapps/webapps.json
11-03 13:30:56.626 14239 14256 D GeckoWebapps: Saving /data/user/0/org.mozilla.fennec_snorp/files/mozilla/webapps/webapps.json
11-03 13:30:56.633 14239 14256 D GeckoWebapps: Success saving /data/user/0/org.mozilla.fennec_snorp/files/mozilla/webapps/webapps.json
11-03 13:31:01.558  4404  4408 W libbacktrace: bool ThreadEntry::Wait(int): pthread_cond_timedwait for value 1 failed: Connection timed out
11-03 13:31:01.559  4404  4408 E libbacktrace: bool BacktraceCurrent::UnwindThread(size_t): Timed out waiting for signal handler to get ucontext data.
11-03 13:31:06.996  4404  4408 W libbacktrace: bool ThreadEntry::Wait(int): pthread_cond_timedwait for value 1 failed: Connection timed out
11-03 13:31:06.996  4404  4408 E libbacktrace: bool BacktraceCurrent::UnwindThread(size_t): Timed out waiting for signal handler to get ucontext data.
11-03 13:31:12.313  4404  4408 W libbacktrace: bool ThreadEntry::Wait(int): pthread_cond_timedwait for value 1 failed: Connection timed out
11-03 13:31:12.313  4404  4408 E libbacktrace: bool BacktraceCurrent::UnwindThread(size_t): Timed out waiting for signal handler to get ucontext data.
11-03 13:31:17.341  4404  4408 W libbacktrace: bool ThreadEntry::Wait(int): pthread_cond_timedwait for value 1 failed: Connection timed out
11-03 13:31:17.341  4404  4408 E libbacktrace: bool BacktraceCurrent::UnwindThread(size_t): Timed out waiting for signal handler to get ucontext data.
11-03 13:31:22.509  4404  4408 W libbacktrace: bool ThreadEntry::Wait(int): pthread_cond_timedwait for value 1 failed: Connection timed out
11-03 13:31:22.509  4404  4408 E libbacktrace: bool BacktraceCurrent::UnwindThread(size_t): Timed out waiting for signal handler to get ucontext data.


It looks like the AtCmdFwd service is blocked, or something, and that's hosing system_server.
AtCmdFwd is some kind of qualcomm-specific service. I'm wondering if my device is somehow just hosed.
tracking-fennec: --- → ?
Well this isn't very nice...

#0  0xf6f9e088 in __reboot () from /Users/snorp/source/jimdb-arm/lib/84B5T15A15005343/system/lib/libc.so
#1  0xf71c4134 in systemTime () from /Users/snorp/source/jimdb-arm/lib/84B5T15A15005343/system/lib/libutils.so
#2  0xf6648bd8 in android::AudioTrack::processAudioBuffer() () from /Users/snorp/source/jimdb-arm/lib/84B5T15A15005343/system/lib/libmedia.so
#3  0xf6649a6c in android::AudioTrack::AudioTrackThread::threadLoop() () from /Users/snorp/source/jimdb-arm/lib/84B5T15A15005343/system/lib/libmedia.so
#4  0xf71c4072 in android::Thread::_threadLoop(void*) () from /Users/snorp/source/jimdb-arm/lib/84B5T15A15005343/system/lib/libutils.so
#5  0xf6f9c814 in __pthread_start(void*) () from /Users/snorp/source/jimdb-arm/lib/84B5T15A15005343/system/lib/libc.so
#6  0xf6f76f76 in __start_thread () from /Users/snorp/source/jimdb-arm/lib/84B5T15A15005343/system/lib/libc.so
#7  0x00000000 in ?? ()

Comment 5

3 years ago
Have a similar problem running  beta although its not 100% reproducible with above video

Comment 6

3 years ago
Forgot to mention I am using nexus 5x

Comment 7

3 years ago
I think I might be hitting the same crash on my Nexus 5 (Can't test with the video from comment 1 because of a geo restriction), device just reboots sometimes when playing video.
Apparently AtCmdFwd is some kind of Wifi display forwarding service. Not sure why I seem to get so many messages about it while playing video. If I disable Wifi I don't get any messages, but the Yahoo video still reboots and/or hangs. Red herring.
Nevermind, I do still see the AtCmdFwd messages even with wifi off...might still be related?
Sotaro, do we still have contacts at QC that could help on that?
'adb bugreport' shows some interesting stuff in the last kmsg

[46941.303851] msm_vidc: info: Closed video instance: ffffffc021d34000
[46941.601653] msm_vidc: info: Opening video instance: ffffffc0023be000, 1
[46941.947502] msm8994_quat_mi2s_snd_startup: dai name qcom,msm-dai-q6-mi2s-quat.210 ffffffc0b893a410  substream = subdevice #0  stream = 1 bit width =6 sample rate =48000
[46941.951550] afe_get_cal_topology_id: [AFE_TOPOLOGY_CAL] not initialized for this port 4103
[46941.961241] msm8994_quat_mi2s_snd_startup: dai name qcom,msm-dai-q6-mi2s-quat.210 ffffffc0b893a410  substream = subdevice #0  stream = 0 bit width =6 sample rate =48000
[46941.962594] afe_get_cal_topology_id: [AFE_TOPOLOGY_CAL] not initialized for this port 4102
[46958.523245] msm_vidc: info: Closed video instance: ffffffc0023be000
[46958.704096] msm8994_quat_mi2s_snd_shutdown: dai name qcom,msm-dai-q6-mi2s-quat.210 ffffffc0b893a410  substream = subdevice #0  stream = 0
[46958.718849] msm8994_quat_mi2s_snd_shutdown: dai name qcom,msm-dai-q6-mi2s-quat.210 ffffffc0b893a410  substream = subdevice #0  stream = 1
[46958.719758] msm8994_quat_mi2s_snd_shutdown Quaternary MI2S Clock is Disabled
[46958.721100] max98925_left_en_put: spk&rcver switch gpio had pulled down[46958.804060] msm_vidc: info: Opening video instance: ffffffc094fa6000, 1
[46959.023967] msm8994_quat_mi2s_snd_startup: dai name qcom,msm-dai-q6-mi2s-quat.210 ffffffc0b893a410  substream = subdevice #0  stream = 1 bit width =6 sample rate =48000
[46965.831599] Watchdog bark! Now = 46965.831593
[46965.831615] Watchdog last pet at 46954.773015
[46965.831623] cpu alive mask from last pet 0-7
[46965.831658] Causing a watchdog bite!
[46965.832668] Wdog - STS: 0x1, CTL: 0x1, BARK TIME: 0x57fdf, BITE TIME: 0x1[46965.832678] Kernel panic - not syncing: Failed to cause a watchdog bite! - Falling back to kernel panic!
[46965.832689] CPU: 0 PID: 17228 Comm: AudioTrack Tainted: G        W    3.10.73-gcf36678 #1
[46965.832694] Call trace:
[46965.832710] [<ffffffc000206b88>] dump_backtrace+0x0/0x250
[46965.832718] [<ffffffc000206de8>] show_stack+0x10/0x1c
[46965.832730] [<ffffffc000c907e4>] dump_stack+0x1c/0x28
[46965.832737] [<ffffffc000c8efec>] panic+0x14c/0x300
[46965.832747] [<ffffffc000505150>] wdog_bark_handler+0xd8/0xdc
[46965.832757] [<ffffffc00026ace8>] handle_irq_event_percpu+0xb4/0x244
[46965.832764] [<ffffffc00026aec0>] handle_irq_event+0x48/0x78
[46965.832773] [<ffffffc00026db8c>] handle_fasteoi_irq+0xc0/0x128
[46965.832780] [<ffffffc00026a530>] generic_handle_irq+0x28/0x3c
[46965.832788] [<ffffffc0002040d8>] handle_IRQ+0x84/0xa8
[46965.832795] [<ffffffc0002006d8>] gic_handle_irq+0x54/0x84
[46965.832801] Exception stack(0xffffffc012d3feb0 to 0xffffffc012d3ffd0)
[46965.832808] fea0:                                     00400000 00000000 00000000 00000000
[46965.832815] fec0: ffffffff ffffffff f757c6aa 00000000 bfd9c6b0 00000000 c0aa7888 00000000
[46965.832823] fee0: 00000002 00000000 bfd9c6b4 00000000 bfd9c6b0 00000000 f6a43910 00000000
[46965.832830] ff00: ffffff92 00000000 00000000 00000000 bf7cacb0 00000000 00000000 00000000
[46965.832838] ff20: c0aa7860 00000000 00000000 00000000 f7587c28 00000000 c0aa7828 00000000
[46965.832845] ff40: f6a02ee1 00000000 00000000 00000000 00000000 00000000 00000000 00000000
[46965.832853] ff60: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
[46965.832861] ff80: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
[46965.832868] ffa0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
[46965.832874] ffc0: 00000000 00000000 00000000 00000000
[46965.832887] CPU4: stopping
[46965.832905]
[46965.832924] CPU: 4 PID: 16939 Comm: Gecko Tainted: G        W    3.10.73-gcf36678 #1
[46965.832941] task: ffffffc048ad60c0 ti: ffffffc087a74000 task.ti: ffffffc087a74000
[46965.832954] PC is at 0xd77eb2c6
[46965.832967] LR is at 0xf6b
Alright, so I think the watchdog bite means it has detected that the device is hung, so it panics to cause a reboot. I just got it to hang without triggering the watchdog, and there is some interesting stuff. It looks like the kernel gets hung up waiting on the decoder to do some stuff. I'm not sure why we would be triggering this and other apps wouldn't....
Created attachment 8683742 [details]
dmesg output during a hang
comment #10
Flags: needinfo?(sotaro.ikeda.g)
Now in another instance I get some kind of GPU deadlock?

[ 4368.612169] msm_vidc: info: Opening video instance: ffffffc0ba374000, 1
[ 4368.804081] msm8994_quat_mi2s_snd_startup: dai name qcom,msm-dai-q6-mi2s-quat.210 ffffffc0b8a33410  substream = subdevice #0  stream = 1 bit width =6 sample rate =48000  
[ 4369.353773] afe_get_cal_topology_id: [AFE_TOPOLOGY_CAL] not initialized for this port 4103
[ 4369.365721] msm8994_quat_mi2s_snd_startup: dai name qcom,msm-dai-q6-mi2s-quat.210 ffffffc0b8a33410  substream = subdevice #0  stream = 0 bit width =6 sample rate =48000  
[ 4369.367352] afe_get_cal_topology_id: [AFE_TOPOLOGY_CAL] not initialized for this port 4102
[ 4391.046292] kgsl kgsl-3d0: kgsl: possible gpu syncpoint deadlock for context 10 timestamp 0
[ 4391.046315] kgsl kgsl-3d0:   context[10]: queue=4700, submit=4699, start=4699, retire=4699
[ 4391.046323] kgsl kgsl-3d0:   possible deadlock. Context 10 might be blocked for itself
[ 4391.046335] kgsl kgsl-3d0:   context[10]: submit times: 5.354 5.392 5.548 5.393 5.345 5.308 5.375 
[ 4391.046343] [ffffffc091f9f700] SurfaceView:0: active
[ 4391.046347] waiters:
[ 4391.046363]  kgsl_sync_callback+0x0/0x38
[ 4391.046368] syncpoints:
[ 4391.046374]   kgsl-3d0_surfaceflinger(380)-surfaceflinger(380)-2_pt signaled@4386.030689: 41 / 41 queued:41 retired:41
[ 4391.046396]   kgsl-3d0_la.fennec_snorp(19650)-Thread-173(19777)-10_pt signaled@4386.030689: 4697 / 4699 queued:4700 retired:4699
[ 4391.046413]   mdss_fb_0_pt active: 13688 / 13687
[ 4391.046428] kgsl kgsl-3d0: --gpu syncpoint deadlock print end--
According to the angler-kernel repo[0] there have been quite a few updates since the one installed on my device (3.10.73-gcf36678). I think I'll try to install a newer kernel.


[0] https://android.googlesource.com/device/huawei/angler-kernel/
I updated my kernel to the latest one in the angler-kernel repo, 3.10.73-g549e459, and I can still reproduce the issue. Ugh.
tracking-fennec: ? → 42+

Comment 18

3 years ago
noise
I just ordered nexus6P yesterday. The phone seems to be in Hongkong now in delivery tracking system.
(In reply to [:fabrice] Fabrice Desré from comment #10)
> Sotaro, do we still have contacts at QC that could help on that?

:diego, might still respond to a question. But they did not spend time if the bug does not related to fxos mobile products and basically we needed to analyze it.
(In reply to Alex from comment #7)
> I think I might be hitting the same crash on my Nexus 5 (Can't test with the
> video from comment 1 because of a geo restriction), device just reboots
> sometimes when playing video.

I also cannot test the video because of geo restriction.
Flags: needinfo?(sotaro.ikeda.g)
(In reply to Ka-Hing Cheung from comment #6)
> Forgot to mention I am using nexus 5x

It might be a problem that happen both nexus6p and nexus5x. From it, it might be a problem that commonly happen on new qualcomm 64bit chipset.
  https://en.wikipedia.org/wiki/Nexus_6P
  https://en.wikipedia.org/wiki/Nexus_5X
Yes can confirm that this is an issue for the 5x as well. Snorp ashed me to check Yahoo video on the 5x, the first video I found reboots the device every time. https://screen.yahoo.com/popular/star-wars-lego-destruction-star-140000599.html

Comment 23

3 years ago
My Nexus 5 (not 5x) crashed on that video fairly quickly.

Before it crashed the video was jumpy, started playing a second or so into the video, then froze and jumped back to the start, then crashed.
(In reply to Alex from comment #23)
> My Nexus 5 (not 5x) crashed on that video fairly quickly.
> 
> Before it crashed the video was jumpy, started playing a second or so into
> the video, then froze and jumped back to the start, then crashed.

Alex, can you take logcat and dmesg log?

Comment 25

3 years ago
Created attachment 8684033 [details]
nexus5_logs.zip

Sure, I'm not sure how useful dmesg will be, since I can't find a way to tail its output, but I did get the output of logcat and last_kmsg, and they give a bit more info (Included everything though)
Thanks Alex! From last_kmsg.log in attachment 8684033 [details], problem seems to happen around SurfaceView fence handling.

[  420.640043] fence timeout on [c3604280] after 10000ms
[  420.640430] objs:
[  420.640433] --------------
>>> snip
[  420.658320] [c3604280] SurfaceView:0: active
[  420.658323]   kgsl-timeline_pt signaled@410.514145
[  420.659780]   kgsl-timeline_pt signaled@410.511896
[  420.659784]   mdss_fb_0_pt active: 2388 / 2387
>>> snip
[  421.190340] Watchdog bark! Now = 421.190336
[  421.190414] Watchdog last pet at 410.190056
[  421.190453] cpu alive mask from last pet 0-3
[  421.190491] Causing a watchdog bite!
Flags: needinfo?(the.decryptor)
I saw the SurfaceView watchdog bite once too. I am not sure if that's the root cause or if the kernel is fouled up from something else. Media playback clearly plays some kind of part...
I thought maybe the per-frame attach/detach that we do on the SurfaceTexture could be responsible for this badness, but disabling that doesn't seem to help here.
In the file I attached, nearly all of the blocked tasks are waiting on the main binder mutex. I don't see anything there which would be holding that mutex, so I guess the owner is not in the blocked state (and therefore does not appear in this list)?
Another thing interesting from that log: All CPUs are in lpm_cpuidle_enter, except CPU 4 which is servicing the SysRq and CPU 0 which is the AudioTrack task. AudioTrack is not listed in set of blocked tasks, but presumably it's holding the binder mutex and then gets stuck somewhere else?
Modifying Fennec to drop the audio track on the floor results in no problems, so audio does indeed some to be part of the problem.
Doing the opposite -- keeping the audio track, but dropping the video track, results in reboot.
We're using OpenSL for audio output, which might be the reason we are a unique snowflake here.
Apparently OpenSL uses AudioTrack on this device, because the AudioTrack thread that is hung on CPU 0 is from fennec. We are definitely not using AudioTrack directly, because that cubeb backend doesn't even work on 6.0 (it's missing a bunch of symbols). In the latest hang, it's stuck at ktime_get_ts+0x74/0x100 in the kernel. This matches the userspace stack I once got above in comment #4.

top shows fennec using 12% CPU, which on this 8 core device equates to one whole CPU. I think we must be spinning in ktime_get_ts for some reason.

Updated

3 years ago
Flags: needinfo?(the.decryptor)
I can reproduce this as well
I am able to build my own kernels now, so I can debug a little closer. So far, I have determined:

1) The wait_for_completion in msm_mpm_work_fn is probably harmless. I made it timeout, and the behavior was pretty much the same.
2) All of the tasks waiting on the binder mutex do not appear to be deadlocked. After the device is hung I still see successful locks taking place. I guess there is just a lot of contention?
3) I am in the deep end of the pool here and have no idea what I am doing.
Inder, do you know someone that could help us there? thanks!
Flags: needinfo?(ikumar)
I have been assuming the watchdog bite was because it determined there was some hardware badness, but it appears the watchdog is most likely kept satisfied by userspace pings. Consequently, I could be barking up the wrong tree here.
The watchdog daemon source here[0] looks pretty tiny, though, so if that is hosed I would guess there is kernel badness.


[0] http://androidxref.com/6.0.0_r1/xref/system/core/init/watchdogd.cpp
Created attachment 8684980 [details]
userspace hang traces

The userspace watchdog service is what generates the kernel sysrq traces, and it also does a userspace dump on some system services as well. I grabbed this one just now.
Created attachment 8685047 [details]
More userspace traces, this time it includes Fennec (org.mozilla.fennec_snorp)

This is another set of userspace traces, and I was able to get fennec this time as well. The main thread appears to be stuck in  android.view.DisplayEventReceiver.nativeScheduleVsync, which ends up stuck in binder.
Summary: Nexus 6P crashes (reboots) when playing video in Fennec → Nexus 6P reboots or hangs when playing video in Fennec
Might be getting somewhere. Binder does indeed some to be...bound up.

[  259.317110] SNORP: binder locked! Binder_4
[  259.317135] SNORP: attempting binder lock.. Thread-152
[  259.317141] SNORP: binder locked! Thread-152
[  259.317567] SNORP: attempting binder lock.. Thread-153
[  259.317575] SNORP: binder locked! Thread-153
[  259.317605] SNORP: attempting binder lock.. Thread-153
[  259.317611] SNORP: binder locked! Thread-153
[  259.317622] SNORP: attempting binder lock.. Thread-153
[  259.317627] SNORP: binder locked! Thread-153
[  259.317652] SNORP: attempting binder lock.. Binder_2
[  259.317658] SNORP: binder locked! Binder_2
[  259.317768] SNORP: attempting binder lock.. Binder_2
[  259.317774] SNORP: binder locked! Binder_2
[  259.317792] SNORP: attempting binder lock.. Binder_2
[  259.317798] SNORP: binder locked! Binder_2
[  259.317822] SNORP: attempting binder lock.. Binder_2
[  259.317827] SNORP: binder locked! Binder_2
[  259.317853] SNORP: attempting binder lock.. Thread-153
[  259.317859] SNORP: binder locked! Thread-153
[  259.318313] SNORP: attempting binder lock.. Binder_3
[  259.318350] SNORP: attempting binder lock.. MediaPl~ack #11
[  259.977658] SNORP: attempting binder lock.. system_server
[  260.233574] SNORP: attempting binder lock.. m.android.phone
[  264.617285] SNORP: attempting binder lock.. ActivityManager
[  264.617461] SNORP: attempting binder lock.. ConnectivitySer
[  267.351033] healthd: battery l=77 v=4034 t=37.0 h=2 st=2 c=-318 chg=u 2015-11-09 22:45:37.804400605 UTC
[  267.351096] SNORP: attempting binder lock.. healthd
[  267.634189] SNORP: attempting binder lock.. Gecko
[  273.038252] SNORP: attempting binder lock.. ATFWD-daemon
[  284.436335] SNORP: attempting binder lock.. GCMReader
[  322.738802] SNORP: attempting binder lock.. Binder_1
[  327.761553] SNORP: attempting binder lock.. Binder_2
[  343.445929] SNORP: attempting binder lock.. Binder_3
[  348.469079] SNORP: attempting binder lock.. Binder_4
[  350.513281] SMBCHG: determine_target_input_current: target_current = 500
[  353.711504] SNORP: attempting binder lock.. Binder_5
[  358.732076] SNORP: attempting binder lock.. Binder_6
[  363.857570] SNORP: attempting binder lock.. Binder_7
[  368.920602] SNORP: attempting binder lock.. Binder_8
[  373.942060] SNORP: attempting binder lock.. Binder_9
[  378.981073] SNORP: attempting binder lock.. Binder_A
[  384.002085] SNORP: attempting binder lock.. Binder_B
[  389.147634] SNORP: attempting binder lock.. Binder_C
[  394.171567] SNORP: attempting binder lock.. Binder_D
[  399.207093] SNORP: attempting binder lock.. Binder_E
[  399.221849] SNORP: attempting binder lock.. Binder_1

Thread-153 (whatever that is) is hosing us. Or maybe the target thread is hosing us, not sure.

Comment 43

3 years ago
I am also experiencing this issue. I am able to reproduce it using the following demo: http://www.wothke.ch/websid/ . If you tap on "What's this?" and then repeatedly tap on the next button ">>|" the phone will eventually either hang or reboot. I am on Nexus 6P.
(In reply to rqou from comment #43)
> I am also experiencing this issue. I am able to reproduce it using the
> following demo: http://www.wothke.ch/websid/ . If you tap on "What's this?"
> and then repeatedly tap on the next button ">>|" the phone will eventually
> either hang or reboot. I am on Nexus 6P.

Thanks for the report. I can confirm that I get the same hang if I follow those instructions. There is no decoder activity with this demo, so I think we can narrow this down to something related to sound playback.
(In reply to rqou from comment #43)
> I am also experiencing this issue. I am able to reproduce it using the
> following demo: http://www.wothke.ch/websid/ . If you tap on "What's this?"
> and then repeatedly tap on the next button ">>|" the phone will eventually
> either hang or reboot. I am on Nexus 6P.

I also confirmed system reboot on nexus-5.
I am going to think, the problem might be caused by very high cpu usage of AudioTrack thread. The thread handle callback from AudioFlinger and run very high thread priority.

Before the system reboot on nexus 5, "adb shell top -m 5 -d 3 -t" showed the following result.

----------------------------------------------------
User 68%, System 19%, IOW 0%, IRQ 0%
User 212 + Nice 0 + Sys 60 + Idle 37 + IOW 0 + IRQ 0 + SIRQ 0 = 309

  PID   TID PR CPU% S     VSS     RSS PCY UID      Thread          Proc
 3028  3182  0 126% R 1695784K 404936K  fg u0_a53   AudioTrack      org.mozilla.fennec_sotaro
 3086  3086  0   6% R   3220K   1376K  fg shell    top             top
  201   675  0   2% R 138232K  10480K  fg media    FastMixer       /system/bin/mediaserver
 3028  3141  0   2% S 1695784K 404936K  fg u0_a53   Thread-169      org.mozilla.fennec_sotaro
 3028  3097  0   1% S 1695784K 404936K  fg u0_a53   Gecko           org.mozilla.fennec_sotaro
On fxos nexus-5-l, top command showed very different result. The following showed that Browser main thread was most busy. On android, AudioTrack thread was most busy and its cpu usage was too high.

----------------------------------------------

User 46%, System 9%, IOW 0%, IRQ 0%
User 283 + Nice 0 + Sys 59 + Idle 259 + IOW 2 + IRQ 0 + SIRQ 1 = 604

  PID   TID PR CPU% S     VSS     RSS PCY UID      Thread          Proc
 1808  1808  1  44% R 278148K  68784K unk u0_a1808 Browser         /system/b2g/b2g
  196   790  1   2% S 568660K 138248K  fg root     Compositor      /system/b2g/b2g
  196  1830  1   2% S 573224K 138248K  fg root     BufMgrParent#18 /system/b2g/b2g
 1808  2566  1   1% S 278148K  68784K  fg u0_a1808 AudioTrack      /system/b2g/b2g
 2685  2685  0   1% R  10672K   1264K  fg root     top             top
Both on android and fxos nexus-5, AudioTrack thread seemed to have same thread priority like the following.

> APPLICATION PRIO  NICE  RTPRI SCHED  NAME
> AudioTrack  -3    -16   2     1      AudioTrack

AudioTrack thread is very high priority, if the thread consume almost all cpu resource, another thread could not use cpu. It might cause this problem.
(In reply to Sotaro Ikeda [:sotaro] from comment #47)
> On fxos nexus-5-l, top command showed very different result. The following
> showed that Browser main thread was most busy. On android, AudioTrack thread
> was most busy and its cpu usage was too high.
> 
> ----------------------------------------------
> 
> User 46%, System 9%, IOW 0%, IRQ 0%
> User 283 + Nice 0 + Sys 59 + Idle 259 + IOW 2 + IRQ 0 + SIRQ 1 = 604
> 
>   PID   TID PR CPU% S     VSS     RSS PCY UID      Thread          Proc
>  1808  1808  1  44% R 278148K  68784K unk u0_a1808 Browser         /system/b2g/b2g


On fxos nexus-5 case, Browser main thread cpu usage is mainly for rendering. When display is off, cpu usage was the following.

-------------------------------------------------

  PID   TID PR CPU% S     VSS     RSS PCY UID      Thread          Proc
 1688  1688  0  17% S 259240K  66444K unk u0_a1688 Browser         /system/b2g/b2g
 1688  2398  0  12% S 259240K  66444K  fg u0_a1688 AudioTrack      /system/b2g/b2g
  192   732  0   8% S  47136K   9172K  fg media    FastMixer       /system/bin/mediaserver
 2877  2877  0   4% R  10672K   1252K  fg root     top             top
  910   927  0   1% S  20256K    880K  fg root     mpdecision      /system/bin/mpdecision
On android 5.1.1 factory image of nexus-5, http://www.wothke.ch/websid/ showed the following cpu usage by "adb shell top -m 5 -d 3 -t". It seems same to fxos master nexus-5-l.

----------------------------------------------------------

User 51%, System 8%, IOW 0%, IRQ 0%
User 312 + Nice 0 + Sys 54 + Idle 239 + IOW 0 + IRQ 0 + SIRQ 0 = 605

  PID   TID PR CPU% S     VSS     RSS PCY UID      Thread          Proc
 5914  5963  0  46% R 2258460K 391556K  fg u0_a83   Gecko           org.mozilla.fennec_sotaro
 5914  6465  1   3% S 2258460K 391556K  fg u0_a83   AudioTrack      org.mozilla.fennec_sotaro
 6440  6440  1   3% R  10672K   1560K  fg shell    top             top
  171   217  1   2% S  18628K   2740K  fg logd     logd.writer     /system/bin/logd
 5914  6041  0   2% S 2258460K 391556K  fg u0_a83   Thread-530      org.mozilla.fennec_sotaro
Reproducing using comment 43 on Nexus 7 and 5 with Android M. Using the hardware volume buttons seems to be broken in this situation and triggers the hang/reboot for me.
(In reply to Sotaro Ikeda [:sotaro] from comment #48)
> Both on android and fxos nexus-5, AudioTrack thread seemed to have same
> thread priority like the following.
> 
> > APPLICATION PRIO  NICE  RTPRI SCHED  NAME
> > AudioTrack  -3    -16   2     1      AudioTrack
> 
> AudioTrack thread is very high priority, if the thread consume almost all
> cpu resource, another thread could not use cpu. It might cause this problem.

This is interesting, but the 6P has 8 cores. It could easily schedule stuff on one of those, right? Maybe something prevents that?
Created attachment 8686167 [details]
kernel stack traces from fennec + all blocked tasks during hang

I modified the kernel to show all of the fennec tasks when the userspace watchdog does the sysrq during the hang. The AudioTrack thread is interesting:

[  995.648185] AudioTrack      R  running task        0  8840    530 0x00400000
[  995.648196] Call trace:
[  995.648202] [<ffffffc000204c88>] __switch_to+0x7c/0x88
[  995.648209] [<ffffffc0002040dc>] handle_IRQ+0x88/0xa8
[  995.648217] [<ffffffc0002006d8>] gic_handle_irq+0x54/0x84
[  995.648222] Exception stack(0xffffffc037cdbd10 to 0xffffffc037cdbe30)
[  995.648230] bd00:                                     37cdbe80 ffffffc0 37cdbec0 ffffffc0
[  995.648238] bd20: 37cdbe50 ffffffc0 0042d308 ffffffc0 00000000 00000000 37cdbe90 ffffffc0
[  995.648246] bd40: ffffffff ffffffff 2625a023 00000000 37cdbed0 ffffffc0 000003e2 00000000
[  995.648254] bd60: 34155555 00000000 00000018 00000000 a9bc78e0 ffffffff 00000000 00000000
[  995.648262] bd80: 00000000 00000000 00000000 00000000 00000000 00000000 c67fb820 00000000
[  995.648271] bda0: f767c135 00000000 00000000 00000000 0028231c ffffffc0 00000000 00000000
[  995.648278] bdc0: 00000000 00000000 37cdbe80 ffffffc0 37cdbec0 ffffffc0 00000001 00000000
[  995.648286] bde0: 37cdbdf0 ffffffc0 00911e28 ffffffc0 37cdbe00 ffffffc0 00272078 ffffffc0
[  995.648295] be00: 37cdbe40 ffffffc0 0023e7ec ffffffc0 37cdbe80 ffffffc0 37cdbec0 ffffffc0
[  995.648301] be20: 00000001 00000000 00000000 00000080
[  995.648308] [<ffffffc0002035a4>] el1_irq+0x64/0xd4
[  995.648316] [<ffffffc00028234c>] compat_sys_clock_gettime+0x30/0x64

I guess clock_gettime() was interrupted by an IRQ, and the handler never returns? I need to add a way to trigger this sysrq whenever I want to see if it's really stuck there, but selinux is being a pain.
Does the presence of the __switch_to in the top frame mean the task was preempted? Is that supposed to be allowed in an interrupt handler? I'm about at my wits end here (probably well past that point, actually).
(In reply to James Willcox (:snorp) (jwillcox@mozilla.com) from comment #52)
> (In reply to Sotaro Ikeda [:sotaro] from comment #48)
> > Both on android and fxos nexus-5, AudioTrack thread seemed to have same
> > thread priority like the following.
> > 
> > > APPLICATION PRIO  NICE  RTPRI SCHED  NAME
> > > AudioTrack  -3    -16   2     1      AudioTrack
> > 
> > AudioTrack thread is very high priority, if the thread consume almost all
> > cpu resource, another thread could not use cpu. It might cause this problem.
> 
> This is interesting, but the 6P has 8 cores. It could easily schedule stuff
> on one of those, right? Maybe something prevents that?

Yea, on multi core device, on thread's priority should not prevent hole devices. There should be something.
AudioTrack:s sampling rate seems to be related to the problem. At http://www.wothke.ch/websid/, sampleRate was 48000. At youtube.com, sampleRate was 44100. Android normally use 44100 as sample rate.

The followings are AudioTrack params dump.
----------------

- audio track prams at http://www.wothke.ch/websid/
> 11-11 23:54:19.349 14855 14981 V AudioTrack: set(): streamType 3, sampleRate 48000, format 0x1, channelMask 0x3, frameCount 0, flags #4, notificationFrames 0, sessionId 56, transferType 0, uid -1, pid -1

- audio params at youtube.com
> 11-11 23:52:26.975  6681 14713 V AudioTrack: set(): streamType 3, sampleRate 44100, format 0x1, channelMask 0x3, frameCount 0, flags #4, notificationFrames 0, sessionId 50, transferType 0, uid -1, pid -1
Just the following change(hard code 44100 sample rate) seemed to fix the problem, though audio pitch and play speed became slow.

diff --git a/media/libcubeb/src/cubeb_opensl.c b/media/libcubeb/src/cubeb_opensl.c
--- a/media/libcubeb/src/cubeb_opensl.c
+++ b/media/libcubeb/src/cubeb_opensl.c
@@ -475,16 +475,18 @@ opensl_stream_init(cubeb * ctx, cubeb_st
 
   *stream = NULL;
 
   if (stream_params.channels < 1 || stream_params.channels > 32 ||
       latency < 1 || latency > 2000) {
     return CUBEB_ERROR_INVALID_FORMAT;
   }
 
+  stream_params.rate = 44100;
+
   SLDataFormat_PCM format;
When audio sample rate was 48000, audio out seemed to use low-latency-playback.

----------------------------------------------------------------

11-12 01:34:04.856  1686  1938 V AudioTrack: set(): streamType 3, sampleRate 48000, format 0x1, channelMask 0x3, frameCount 0, flags #4, notificationFrames 0, sessionId 22, transferType 0, uid -1, pid -1
11-12 01:34:04.856  1686  1938 V AudioTrack: set() streamType 3 frameCount 0 flags 0004
11-12 01:34:04.856  1686  1938 V AudioTrack: createTrack_l() output 2 afLatency 50
11-12 01:34:04.856   205   856 V AudioFlinger: createTrack() lSessionId: 22
11-12 01:34:04.857  1686  1938 V AudioTrack: AUDIO_OUTPUT_FLAG_FAST successful; frameCount 240
11-12 01:34:04.857   205   781 V AudioFlinger: acquiring 22 from 1686, for 1686
11-12 01:34:04.857   205   781 V AudioFlinger:  added new entry for 22
11-12 01:34:04.868   205   774 D audio_hw_primary: out_set_parameters: enter: usecase(1: low-latency-playback) kvpairs: routing=2
11-12 01:34:04.868  1686  1943 V AudioTrack: timeout 0.008
Created attachment 8686386 [details] [diff] [review]
patch - Wrokaround of phone reboot

This patch is a temporary workaround of phone reboot on android M. It seemed to work for http://www.wothke.ch/websid/ phone reboot on master android firefox on nexus-5 of android M.
(In reply to Sotaro Ikeda [:sotaro] from comment #46)
> I am going to think, the problem might be caused by very high cpu usage of
> AudioTrack thread. The thread handle callback from AudioFlinger and run very
> high thread priority.
> 
> Before the system reboot on nexus 5, "adb shell top -m 5 -d 3 -t" showed the
> following result.
> 
> ----------------------------------------------------
> User 68%, System 19%, IOW 0%, IRQ 0%
> User 212 + Nice 0 + Sys 60 + Idle 37 + IOW 0 + IRQ 0 + SIRQ 0 = 309
> 
>   PID   TID PR CPU% S     VSS     RSS PCY UID      Thread          Proc
>  3028  3182  0 126% R 1695784K 404936K  fg u0_a53   AudioTrack     
> org.mozilla.fennec_sotaro
>  3086  3086  0   6% R   3220K   1376K  fg shell    top             top
>   201   675  0   2% R 138232K  10480K  fg media    FastMixer      
> /system/bin/mediaserver
>  3028  3141  0   2% S 1695784K 404936K  fg u0_a53   Thread-169     
> org.mozilla.fennec_sotaro
>  3028  3097  0   1% S 1695784K 404936K  fg u0_a53   Gecko          
> org.mozilla.fennec_sotaro

I seems to understand why AudioTrack thread uses a lot of cpu. When the problem happened AudioTrack::setMarkerPosition() was called with very big marker value. It caused the problem. When very bit marker value is set, AudioTrack::processAudioBuffer() does not work correctly.

The following time unit becomes 0. If it happens, AudioTrack::obtainBuffer() is called without timeout wait and AudioTrack thread becomes not to sleep.

------------------------------------

    // Convert frame units to time units
    nsecs_t ns = NS_WHENEVER;
    if (minFrames != (uint32_t) ~0) {
        ns = framesToNanoseconds(minFrames, sampleRate, speed) + kWaitPeriodNs;
        ns -= (timeAfterCallbacks - timeBeforeCallbacks);  // account for callback time
        // TODO: Should we warn if the callback time is too long?
        if (ns < 0) {
           ns = 0;
        }
    }
 
http://androidxref.com/6.0.0_r1/xref/frameworks/av/media/﷒0
Comment on attachment 8686386 [details] [diff] [review]
patch - Wrokaround of phone reboot

Set to obsolete.
Attachment #8686386 - Attachment is obsolete: true
(In reply to Sotaro Ikeda [:sotaro] from comment #60)
> (In reply to Sotaro Ikeda [:sotaro] from comment #46)
> > I am going to think, the problem might be caused by very high cpu usage of
> > AudioTrack thread. The thread handle callback from AudioFlinger and run very
> > high thread priority.
> > 
> > Before the system reboot on nexus 5, "adb shell top -m 5 -d 3 -t" showed the
> > following result.
> > 
> > ----------------------------------------------------
> > User 68%, System 19%, IOW 0%, IRQ 0%
> > User 212 + Nice 0 + Sys 60 + Idle 37 + IOW 0 + IRQ 0 + SIRQ 0 = 309
> > 
> >   PID   TID PR CPU% S     VSS     RSS PCY UID      Thread          Proc
> >  3028  3182  0 126% R 1695784K 404936K  fg u0_a53   AudioTrack     
> > org.mozilla.fennec_sotaro
> >  3086  3086  0   6% R   3220K   1376K  fg shell    top             top
> >   201   675  0   2% R 138232K  10480K  fg media    FastMixer      
> > /system/bin/mediaserver
> >  3028  3141  0   2% S 1695784K 404936K  fg u0_a53   Thread-169     
> > org.mozilla.fennec_sotaro
> >  3028  3097  0   1% S 1695784K 404936K  fg u0_a53   Gecko          
> > org.mozilla.fennec_sotaro
> 
> I seems to understand why AudioTrack thread uses a lot of cpu. When the
> problem happened AudioTrack::setMarkerPosition() was called with very big
> marker value. It caused the problem. When very bit marker value is set,
> AudioTrack::processAudioBuffer() does not work correctly.
> 
> The following time unit becomes 0. If it happens, AudioTrack::obtainBuffer()
> is called without timeout wait and AudioTrack thread becomes not to sleep.
> 

I tried to prevent the above situation by local modification. The problem of http://www.wothke.ch/websid/ was addressed.
(In reply to James Willcox (:snorp) (jwillcox@mozilla.com) from comment #52)
> > 
> > > APPLICATION PRIO  NICE  RTPRI SCHED  NAME
> > > AudioTrack  -3    -16   2     1      AudioTrack
> > 
> > AudioTrack thread is very high priority, if the thread consume almost all
> > cpu resource, another thread could not use cpu. It might cause this problem.
> 
> This is interesting, but the 6P has 8 cores. It could easily schedule stuff
> on one of those, right? Maybe something prevents that?

AudioTrack is very high priority(RT priority) thread and it continuously called system calls when the problem happened. If system call holds global lock within kernel, it could almost freeze the whole system, I assume.
(In reply to Sotaro Ikeda [:sotaro] from comment #62)
> (In reply to Sotaro Ikeda [:sotaro] from comment #60)
> > (In reply to Sotaro Ikeda [:sotaro] from comment #46)
> > > I am going to think, the problem might be caused by very high cpu usage of
> > > AudioTrack thread. The thread handle callback from AudioFlinger and run very
> > > high thread priority.
> > > 
> > > Before the system reboot on nexus 5, "adb shell top -m 5 -d 3 -t" showed the
> > > following result.
> > > 
> > > ----------------------------------------------------
> > > User 68%, System 19%, IOW 0%, IRQ 0%
> > > User 212 + Nice 0 + Sys 60 + Idle 37 + IOW 0 + IRQ 0 + SIRQ 0 = 309
> > > 
> > >   PID   TID PR CPU% S     VSS     RSS PCY UID      Thread          Proc
> > >  3028  3182  0 126% R 1695784K 404936K  fg u0_a53   AudioTrack     
> > > org.mozilla.fennec_sotaro
> > >  3086  3086  0   6% R   3220K   1376K  fg shell    top             top
> > >   201   675  0   2% R 138232K  10480K  fg media    FastMixer      
> > > /system/bin/mediaserver
> > >  3028  3141  0   2% S 1695784K 404936K  fg u0_a53   Thread-169     
> > > org.mozilla.fennec_sotaro
> > >  3028  3097  0   1% S 1695784K 404936K  fg u0_a53   Gecko          
> > > org.mozilla.fennec_sotaro
> > 
> > I seems to understand why AudioTrack thread uses a lot of cpu. When the
> > problem happened AudioTrack::setMarkerPosition() was called with very big
> > marker value. It caused the problem. When very bit marker value is set,
> > AudioTrack::processAudioBuffer() does not work correctly.
> > 
> > The following time unit becomes 0. If it happens, AudioTrack::obtainBuffer()
> > is called without timeout wait and AudioTrack thread becomes not to sleep.
> > 
> 
> I tried to prevent the above situation by local modification. The problem of
> http://www.wothke.ch/websid/ was addressed.

Interesting! I changed opensl_get_preferred_sample() to always return 44100, and things do seem improved. However, I am still able to get a hang on the Yahoo test. I think perhaps the higher iteration frequency made it easier to get into whatever bad state we're hitting.
AFAICT, the only place wilhelm calls AudioTrack::setMarkerPosition() is here[0], which is in response to registering for the SL_PLAYEVENT_HEADATMARKER event. I think the reason the marker value would be huge is because the marker position is initialized[1] to SL_TIME_UNKNOWN, which is 0xFFFFFFFF[2]. Ahem.

Setting the marker position to 0 before setting the event mask seems to avoid the badness. Patch incoming.

[0] http://androidxref.com/6.0.0_r1/xref/frameworks/wilhelm/src/﷒0﷓
[1] http://androidxref.com/6.0.0_r1/xref/frameworks/wilhelm/src/﷒1﷓
[2] http://androidxref.com/6.0.0_r1/xref/frameworks/wilhelm/include/﷒2
Note that the spec[0] does not say the marker must be set before registering for this callback, so I think wilhelm is just busted here. It should check for SL_TIME_UNKNOWN.

[0] https://www.khronos.org/registry/sles/specs/OpenSL_ES_Specification_1.0.1.pdf
Created attachment 8686774 [details] [diff] [review]
Work around busted OpenSL causing hangs/reboots on Android
Attachment #8686774 - Flags: review?(padenot)
:esawin asked me why this would only happen on Android M, and I'm not totally sure. It looks like the busted marker position stuff has been in wilhelm for a while. My guess is that they only recently made the AudioTrack thread RT?
On andorid v5.5.1, AudioTrack thread is also RT. I think the following code affect to the problem as in Comment 60. The following was changed from android v5.5.1 to android 6.0.0.

------------------------------------

    // Convert frame units to time units
    nsecs_t ns = NS_WHENEVER;
    if (minFrames != (uint32_t) ~0) {
        ns = framesToNanoseconds(minFrames, sampleRate, speed) + kWaitPeriodNs;
        ns -= (timeAfterCallbacks - timeBeforeCallbacks);  // account for callback time
        // TODO: Should we warn if the callback time is too long?
        if (ns < 0) {
           ns = 0;
        }
    }
Created attachment 8686942 [details] [diff] [review]
temporary patch - Partially revert AudioTrack to android v5.5.1
attachment 8686942 [details] [diff] [review] revert a part of AudioTrack code that is related to comment 69. By applying the patch to android v6.0.0, I confirmed the problem of http://www.wothke.ch/websid/ was addressed on nexus-5.
From comment 71, the change of attachment 8686942 [details] [diff] [review] that happened from v5.1.1 to v6.0.0 seems to trigger the problem.
diego, do you know if this is a known problem of android 6.0?
Flags: needinfo?(dwilson)
Comment on attachment 8686774 [details] [diff] [review]
Work around busted OpenSL causing hangs/reboots on Android

Review of attachment 8686774 [details] [diff] [review]:
-----------------------------------------------------------------

?!?
Attachment #8686774 - Flags: review?(padenot) → review+
Duplicate of this bug: 1220485
Summary: Nexus 6P reboots or hangs when playing video in Fennec → 6.0 devices reboot or hang when playing audio/video in Fennec
Comment on attachment 8686774 [details] [diff] [review]
Work around busted OpenSL causing hangs/reboots on Android

Approval Request Comment
[Feature/regressing bug #]: Android 6.0
[User impact if declined]: Hangs and reboots when playing media
[Describe test coverage new/current, TreeHerder]: local, and soon on nightly
[Risks and why]: Pretty low, but we should obviously test to make sure the patch doesn't somehow cause problems.
[String/UUID change made/needed]: None
Attachment #8686774 - Flags: approval-mozilla-release?
Attachment #8686774 - Flags: approval-mozilla-beta?
Attachment #8686774 - Flags: approval-mozilla-aurora?
Duplicate of this bug: 1224739

Comment 79

3 years ago
Can confirm with Nexus 5X with FF only. Restarts everytime with videogular example.

http://www.videogular.com/examples/simplest-videogular-player/
I'll check back to make sure this lands and is ok on m-c before we uplift it to beta.  
Let's aim this for the beta 4 build on Monday.

Comment 81

3 years ago
bugherder
https://hg.mozilla.org/mozilla-central/rev/84c21e328c74
Status: NEW → RESOLVED
Last Resolved: 3 years ago
status-firefox45: --- → fixed
Resolution: --- → FIXED
Target Milestone: --- → Firefox 45
rqou, can you please verify that the hang/reboot issue you are experiencing is fixed on the latest Nightly build (date 11/15/2015)? Thanks.
Flags: needinfo?(rqou)

Comment 83

3 years ago
Bug seems fixed in Nightly. I tested on both the videogular example I posted, and, on my own site. Thankyou!

Comment 84

3 years ago
Tested the 11/15/2015 Nightly, bug seems fixed on all the sites/WebAudio demos I tried it on.
Flags: needinfo?(rqou)

Comment 85

3 years ago
Fixed for me, nothing I tried caused a crash (Still played with issues, I was hoping that was down to the CPU being hogged, but it seems to be something else)
status-firefox42: --- → affected
status-firefox43: --- → affected
status-firefox44: --- → affected
status-firefox45: fixed → verified
Comment on attachment 8686774 [details] [diff] [review]
Work around busted OpenSL causing hangs/reboots on Android

Taking it on all branches as it is an important issue. However, we haven't decided if we do a 42.0.1 or not.
Attachment #8686774 - Flags: approval-mozilla-release?
Attachment #8686774 - Flags: approval-mozilla-release+
Attachment #8686774 - Flags: approval-mozilla-beta?
Attachment #8686774 - Flags: approval-mozilla-beta+
Attachment #8686774 - Flags: approval-mozilla-aurora?
Attachment #8686774 - Flags: approval-mozilla-aurora+
Assignee: nobody → snorp
https://hg.mozilla.org/releases/mozilla-aurora/rev/a8608fc16c09
status-firefox44: affected → fixed
Based on comment 83, 84 and 85, this is verified on trunk (Nightly45).
Status: RESOLVED → VERIFIED
(In reply to rqou from comment #84)
> Tested the 11/15/2015 Nightly, bug seems fixed on all the sites/WebAudio
> demos I tried it on.

Thanks for the verification! This is very helpful.

Updated

3 years ago
Duplicate of this bug: 1225415
(Making summary a bit more keyword rich, since I wasn't able to find this bug prior to filing bug 1225415.)
Summary: 6.0 devices reboot or hang when playing audio/video in Fennec → Android 6.0 devices (eg Nexus 5X/6P) reboot/restart or hang when playing audio/video

Comment 94

3 years ago
Verified as fixed on Nexus 5 (Android 6.0) on Firefox 43 Beta 4.
I was able to reproduce this issue on Firefox 43 Beta 2, on http://www.videogular.com/examples/simplest-videogular-player/
status-firefox43: fixed → verified
[Tracking Requested - why for this release]: We should probably track this as it may be included in a 42.0.1 dot release. 

Snorp can you request uplift to m-r? Sylvestre can decide on this tomorrow, I just want to line up everything for him to do so. Thanks.
tracking-firefox42: --- → ?
Flags: needinfo?(snorp)
(In reply to Liz Henry (:lizzard) (needinfo? me) from comment #95)
> [Tracking Requested - why for this release]: We should probably track this
> as it may be included in a 42.0.1 dot release. 
> 
> Snorp can you request uplift to m-r? Sylvestre can decide on this tomorrow,
> I just want to line up everything for him to do so. Thanks.

This is done, see comment #86
Flags: needinfo?(snorp)

Comment 98

3 years ago
Verified as fixed on Nexus 5 (Android 6.0) on Firefox 42.0.1.
status-firefox42: fixed → verified

Updated

3 years ago
Flags: needinfo?(dwilson)
tracking-firefox42: ? → +
Flags: needinfo?(ikumar)
You need to log in before you can comment on or make changes to this bug.