Closed
Bug 1221228
Opened 8 years ago
Closed 8 years ago
Android 6.0 devices (eg Nexus 5X/6P) reboot/restart or hang when playing audio/video
Categories
(Firefox for Android Graveyard :: Audio/Video, defect)
Firefox for Android Graveyard
Audio/Video
Tracking
(firefox42+ verified, firefox43 verified, firefox44 fixed, firefox45 verified, b2g-v2.5 fixed, fennec42+)
VERIFIED
FIXED
Firefox 45
People
(Reporter: snorp, Assigned: snorp)
References
()
Details
Attachments
(7 files, 1 obsolete file)
63.08 KB,
text/plain
|
Details | |
127.20 KB,
application/zip
|
Details | |
154.05 KB,
text/plain
|
Details | |
306.64 KB,
text/plain
|
Details | |
288.29 KB,
text/plain
|
Details | |
1.03 KB,
patch
|
padenot
:
review+
Sylvestre
:
approval-mozilla-aurora+
Sylvestre
:
approval-mozilla-beta+
Sylvestre
:
approval-mozilla-release+
|
Details | Diff | Splinter Review |
2.16 KB,
patch
|
Details | Diff | Splinter Review |
At least it does on mine. Nightly. Seems pretty easy to repro.
Assignee | ||
Comment 1•8 years ago
|
||
So far I have a 100% reproduction rate by playing the video on this page: https://gma.yahoo.com/video-shows-taco-bell-execs-alleged-drunk-attack-061200233--abc-news-topstories.html# Logcat didn't show anything interesting before the reboot.
Assignee | ||
Comment 2•8 years ago
|
||
With a local build it doesn't crash, but instead freezes the entire device. Logcat shows: 11-03 13:29:32.144 14239 14239 D GeckoBrowserApp: BrowserApp.onTabChanged: 0: AUDIO_PLAYING_CHANGE 11-03 13:29:32.152 3266 5493 D audio_hw_primary: select_devices: out_snd_device(2: speaker) in_snd_device(0: none) 11-03 13:29:32.152 3266 5493 D msm8974_platform: platform_send_audio_calibration: sending audio calibration for snd_device(2) acdb_id(15) 11-03 13:29:32.152 3266 5493 I soundtrigger: audio_extn_sound_trigger_update_device_status: device 0x2 of type 0 for Event 1 11-03 13:29:33.779 4404 9881 D NetlinkSocketObserver: NeighborEvent{elapsedMs=1060170, 192.168.1.1, [AC220B880DCA], RTM_NEWNEIGH, NUD_REACHABLE} 11-03 13:29:36.078 13651 13651 I ServiceManager: Waiting for service AtCmdFwd... 11-03 13:29:37.079 13651 13651 I ServiceManager: Waiting for service AtCmdFwd... 11-03 13:29:38.079 13651 13651 I ServiceManager: Waiting for service AtCmdFwd... 11-03 13:29:39.080 13651 13651 I ServiceManager: Waiting for service AtCmdFwd... 11-03 13:29:40.081 13651 13651 I ServiceManager: Waiting for service AtCmdFwd... 11-03 13:29:41.081 13651 13651 W Atfwd_Sendcmd: AtCmdFwd service not published, waiting... retryCnt : 3 11-03 13:29:56.082 13651 13651 I ServiceManager: Waiting for service AtCmdFwd... 11-03 13:29:57.082 13651 13651 I ServiceManager: Waiting for service AtCmdFwd... 11-03 13:29:58.083 13651 13651 I ServiceManager: Waiting for service AtCmdFwd... 11-03 13:29:59.083 13651 13651 I ServiceManager: Waiting for service AtCmdFwd... 11-03 13:29:59.649 4404 5546 D ConnectivityService: notifyType CAP_CHANGED for NetworkAgentInfo [WIFI () - 102] 11-03 13:30:00.084 13651 13651 I ServiceManager: Waiting for service AtCmdFwd... 11-03 13:30:01.023 4404 4430 I ActivityManager: Start proc 14464:com.google.android.deskclock/u0a66 for broadcast com.google.android.deskclock/com.android.alarmclock.DigitalAppWidgetProvider 11-03 13:30:01.084 13651 13651 W Atfwd_Sendcmd: AtCmdFwd service not published, waiting... retryCnt : 4 11-03 13:30:02.659 4404 5546 D ConnectivityService: notifyType CAP_CHANGED for NetworkAgentInfo [WIFI () - 102] 11-03 13:30:04.881 4404 5544 D WifiStateMachine: starting scan for "Willcox"WPA_PSK with 5680,2437 11-03 13:30:05.106 4404 5544 D WifiStateMachine: shouldSwitchNetwork txSuccessRate=8.08 rxSuccessRate=5.75 delta 999 -> 993 11-03 13:30:05.107 4404 5544 D WifiStateMachine: CMD_AUTO_ROAM sup state CompletedState my state ConnectedState nid=0 config "Willcox"WPA_PSK roam=1 to any targetRoamBSSID any 11-03 13:30:05.107 4404 5544 D WifiStateMachine: AUTO_ROAM nothing to do 11-03 13:30:16.173 8014 8014 I clatd : Detecting NAT64 prefix from DNS... 11-03 13:30:16.176 8014 8014 E clatd : plat_prefix/dns(ipv4only.arpa) status = 7/No address associated with hostname 11-03 13:30:16.176 8014 8014 W clatd : dns64_detection -- error, sleeping for 120 seconds 11-03 13:30:21.490 4404 9881 D NetlinkSocketObserver: NeighborEvent{elapsedMs=1107880, 192.168.1.1, [AC220B880DCA], RTM_NEWNEIGH, NUD_STALE} 11-03 13:30:26.790 4404 9881 D NetlinkSocketObserver: NeighborEvent{elapsedMs=1113180, 192.168.1.1, [AC220B880DCA], RTM_NEWNEIGH, NUD_REACHABLE} 11-03 13:30:41.351 4404 6314 I Process : Sending signal. PID: 4404 SIG: 3 11-03 13:30:41.352 4404 4408 I art : Thread[2,tid=4408,WaitingInMainSignalCatcherLoop,Thread*=0x7f85d43000,peer=0x12c010a0,"Signal Catcher"]: reacting to signal 3 11-03 13:30:46.419 4404 4408 W libbacktrace: bool ThreadEntry::Wait(int): pthread_cond_timedwait for value 1 failed: Connection timed out 11-03 13:30:46.420 4404 4408 E libbacktrace: bool BacktraceCurrent::UnwindThread(size_t): Timed out waiting for signal handler to get ucontext data. 11-03 13:30:51.480 4404 4408 W libbacktrace: bool ThreadEntry::Wait(int): pthread_cond_timedwait for value 1 failed: Connection timed out 11-03 13:30:51.480 4404 4408 E libbacktrace: bool BacktraceCurrent::UnwindThread(size_t): Timed out waiting for signal handler to get ucontext data. 11-03 13:30:56.510 4404 4408 W libbacktrace: bool ThreadEntry::Wait(int): pthread_cond_timedwait for value 1 failed: Connection timed out 11-03 13:30:56.511 4404 4408 E libbacktrace: bool BacktraceCurrent::UnwindThread(size_t): Timed out waiting for signal handler to get ucontext data. 11-03 13:30:56.603 14239 14256 I GeckoWebappsUpdateTimer: periodic check for webapp updates 11-03 13:30:56.603 14239 14256 D GeckoWebappManager: checkForUpdates 11-03 13:30:56.614 14239 14256 D GeckoWebapps: Saving /data/user/0/org.mozilla.fennec_snorp/files/mozilla/webapps/webapps.json 11-03 13:30:56.623 14239 14256 D GeckoWebapps: Success saving /data/user/0/org.mozilla.fennec_snorp/files/mozilla/webapps/webapps.json 11-03 13:30:56.626 14239 14256 D GeckoWebapps: Saving /data/user/0/org.mozilla.fennec_snorp/files/mozilla/webapps/webapps.json 11-03 13:30:56.633 14239 14256 D GeckoWebapps: Success saving /data/user/0/org.mozilla.fennec_snorp/files/mozilla/webapps/webapps.json 11-03 13:31:01.558 4404 4408 W libbacktrace: bool ThreadEntry::Wait(int): pthread_cond_timedwait for value 1 failed: Connection timed out 11-03 13:31:01.559 4404 4408 E libbacktrace: bool BacktraceCurrent::UnwindThread(size_t): Timed out waiting for signal handler to get ucontext data. 11-03 13:31:06.996 4404 4408 W libbacktrace: bool ThreadEntry::Wait(int): pthread_cond_timedwait for value 1 failed: Connection timed out 11-03 13:31:06.996 4404 4408 E libbacktrace: bool BacktraceCurrent::UnwindThread(size_t): Timed out waiting for signal handler to get ucontext data. 11-03 13:31:12.313 4404 4408 W libbacktrace: bool ThreadEntry::Wait(int): pthread_cond_timedwait for value 1 failed: Connection timed out 11-03 13:31:12.313 4404 4408 E libbacktrace: bool BacktraceCurrent::UnwindThread(size_t): Timed out waiting for signal handler to get ucontext data. 11-03 13:31:17.341 4404 4408 W libbacktrace: bool ThreadEntry::Wait(int): pthread_cond_timedwait for value 1 failed: Connection timed out 11-03 13:31:17.341 4404 4408 E libbacktrace: bool BacktraceCurrent::UnwindThread(size_t): Timed out waiting for signal handler to get ucontext data. 11-03 13:31:22.509 4404 4408 W libbacktrace: bool ThreadEntry::Wait(int): pthread_cond_timedwait for value 1 failed: Connection timed out 11-03 13:31:22.509 4404 4408 E libbacktrace: bool BacktraceCurrent::UnwindThread(size_t): Timed out waiting for signal handler to get ucontext data. It looks like the AtCmdFwd service is blocked, or something, and that's hosing system_server.
Assignee | ||
Comment 3•8 years ago
|
||
AtCmdFwd is some kind of qualcomm-specific service. I'm wondering if my device is somehow just hosed.
Assignee | ||
Updated•8 years ago
|
tracking-fennec: --- → ?
Assignee | ||
Comment 4•8 years ago
|
||
Well this isn't very nice... #0 0xf6f9e088 in __reboot () from /Users/snorp/source/jimdb-arm/lib/84B5T15A15005343/system/lib/libc.so #1 0xf71c4134 in systemTime () from /Users/snorp/source/jimdb-arm/lib/84B5T15A15005343/system/lib/libutils.so #2 0xf6648bd8 in android::AudioTrack::processAudioBuffer() () from /Users/snorp/source/jimdb-arm/lib/84B5T15A15005343/system/lib/libmedia.so #3 0xf6649a6c in android::AudioTrack::AudioTrackThread::threadLoop() () from /Users/snorp/source/jimdb-arm/lib/84B5T15A15005343/system/lib/libmedia.so #4 0xf71c4072 in android::Thread::_threadLoop(void*) () from /Users/snorp/source/jimdb-arm/lib/84B5T15A15005343/system/lib/libutils.so #5 0xf6f9c814 in __pthread_start(void*) () from /Users/snorp/source/jimdb-arm/lib/84B5T15A15005343/system/lib/libc.so #6 0xf6f76f76 in __start_thread () from /Users/snorp/source/jimdb-arm/lib/84B5T15A15005343/system/lib/libc.so #7 0x00000000 in ?? ()
Comment 5•8 years ago
|
||
Have a similar problem running beta although its not 100% reproducible with above video
Comment 6•8 years ago
|
||
Forgot to mention I am using nexus 5x
I think I might be hitting the same crash on my Nexus 5 (Can't test with the video from comment 1 because of a geo restriction), device just reboots sometimes when playing video.
Assignee | ||
Comment 8•8 years ago
|
||
Apparently AtCmdFwd is some kind of Wifi display forwarding service. Not sure why I seem to get so many messages about it while playing video. If I disable Wifi I don't get any messages, but the Yahoo video still reboots and/or hangs. Red herring.
Assignee | ||
Comment 9•8 years ago
|
||
Nevermind, I do still see the AtCmdFwd messages even with wifi off...might still be related?
Comment 10•8 years ago
|
||
Sotaro, do we still have contacts at QC that could help on that?
Assignee | ||
Comment 11•8 years ago
|
||
'adb bugreport' shows some interesting stuff in the last kmsg [46941.303851] msm_vidc: info: Closed video instance: ffffffc021d34000 [46941.601653] msm_vidc: info: Opening video instance: ffffffc0023be000, 1 [46941.947502] msm8994_quat_mi2s_snd_startup: dai name qcom,msm-dai-q6-mi2s-quat.210 ffffffc0b893a410 substream = subdevice #0 stream = 1 bit width =6 sample rate =48000 [46941.951550] afe_get_cal_topology_id: [AFE_TOPOLOGY_CAL] not initialized for this port 4103 [46941.961241] msm8994_quat_mi2s_snd_startup: dai name qcom,msm-dai-q6-mi2s-quat.210 ffffffc0b893a410 substream = subdevice #0 stream = 0 bit width =6 sample rate =48000 [46941.962594] afe_get_cal_topology_id: [AFE_TOPOLOGY_CAL] not initialized for this port 4102 [46958.523245] msm_vidc: info: Closed video instance: ffffffc0023be000 [46958.704096] msm8994_quat_mi2s_snd_shutdown: dai name qcom,msm-dai-q6-mi2s-quat.210 ffffffc0b893a410 substream = subdevice #0 stream = 0 [46958.718849] msm8994_quat_mi2s_snd_shutdown: dai name qcom,msm-dai-q6-mi2s-quat.210 ffffffc0b893a410 substream = subdevice #0 stream = 1 [46958.719758] msm8994_quat_mi2s_snd_shutdown Quaternary MI2S Clock is Disabled [46958.721100] max98925_left_en_put: spk&rcver switch gpio had pulled down[46958.804060] msm_vidc: info: Opening video instance: ffffffc094fa6000, 1 [46959.023967] msm8994_quat_mi2s_snd_startup: dai name qcom,msm-dai-q6-mi2s-quat.210 ffffffc0b893a410 substream = subdevice #0 stream = 1 bit width =6 sample rate =48000 [46965.831599] Watchdog bark! Now = 46965.831593 [46965.831615] Watchdog last pet at 46954.773015 [46965.831623] cpu alive mask from last pet 0-7 [46965.831658] Causing a watchdog bite! [46965.832668] Wdog - STS: 0x1, CTL: 0x1, BARK TIME: 0x57fdf, BITE TIME: 0x1[46965.832678] Kernel panic - not syncing: Failed to cause a watchdog bite! - Falling back to kernel panic! [46965.832689] CPU: 0 PID: 17228 Comm: AudioTrack Tainted: G W 3.10.73-gcf36678 #1 [46965.832694] Call trace: [46965.832710] [<ffffffc000206b88>] dump_backtrace+0x0/0x250 [46965.832718] [<ffffffc000206de8>] show_stack+0x10/0x1c [46965.832730] [<ffffffc000c907e4>] dump_stack+0x1c/0x28 [46965.832737] [<ffffffc000c8efec>] panic+0x14c/0x300 [46965.832747] [<ffffffc000505150>] wdog_bark_handler+0xd8/0xdc [46965.832757] [<ffffffc00026ace8>] handle_irq_event_percpu+0xb4/0x244 [46965.832764] [<ffffffc00026aec0>] handle_irq_event+0x48/0x78 [46965.832773] [<ffffffc00026db8c>] handle_fasteoi_irq+0xc0/0x128 [46965.832780] [<ffffffc00026a530>] generic_handle_irq+0x28/0x3c [46965.832788] [<ffffffc0002040d8>] handle_IRQ+0x84/0xa8 [46965.832795] [<ffffffc0002006d8>] gic_handle_irq+0x54/0x84 [46965.832801] Exception stack(0xffffffc012d3feb0 to 0xffffffc012d3ffd0) [46965.832808] fea0: 00400000 00000000 00000000 00000000 [46965.832815] fec0: ffffffff ffffffff f757c6aa 00000000 bfd9c6b0 00000000 c0aa7888 00000000 [46965.832823] fee0: 00000002 00000000 bfd9c6b4 00000000 bfd9c6b0 00000000 f6a43910 00000000 [46965.832830] ff00: ffffff92 00000000 00000000 00000000 bf7cacb0 00000000 00000000 00000000 [46965.832838] ff20: c0aa7860 00000000 00000000 00000000 f7587c28 00000000 c0aa7828 00000000 [46965.832845] ff40: f6a02ee1 00000000 00000000 00000000 00000000 00000000 00000000 00000000 [46965.832853] ff60: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 [46965.832861] ff80: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 [46965.832868] ffa0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 [46965.832874] ffc0: 00000000 00000000 00000000 00000000 [46965.832887] CPU4: stopping [46965.832905] [46965.832924] CPU: 4 PID: 16939 Comm: Gecko Tainted: G W 3.10.73-gcf36678 #1 [46965.832941] task: ffffffc048ad60c0 ti: ffffffc087a74000 task.ti: ffffffc087a74000 [46965.832954] PC is at 0xd77eb2c6 [46965.832967] LR is at 0xf6b
Assignee | ||
Comment 12•8 years ago
|
||
Alright, so I think the watchdog bite means it has detected that the device is hung, so it panics to cause a reboot. I just got it to hang without triggering the watchdog, and there is some interesting stuff. It looks like the kernel gets hung up waiting on the decoder to do some stuff. I'm not sure why we would be triggering this and other apps wouldn't....
Assignee | ||
Comment 13•8 years ago
|
||
Assignee | ||
Comment 15•8 years ago
|
||
Now in another instance I get some kind of GPU deadlock? [ 4368.612169] msm_vidc: info: Opening video instance: ffffffc0ba374000, 1 [ 4368.804081] msm8994_quat_mi2s_snd_startup: dai name qcom,msm-dai-q6-mi2s-quat.210 ffffffc0b8a33410 substream = subdevice #0 stream = 1 bit width =6 sample rate =48000 [ 4369.353773] afe_get_cal_topology_id: [AFE_TOPOLOGY_CAL] not initialized for this port 4103 [ 4369.365721] msm8994_quat_mi2s_snd_startup: dai name qcom,msm-dai-q6-mi2s-quat.210 ffffffc0b8a33410 substream = subdevice #0 stream = 0 bit width =6 sample rate =48000 [ 4369.367352] afe_get_cal_topology_id: [AFE_TOPOLOGY_CAL] not initialized for this port 4102 [ 4391.046292] kgsl kgsl-3d0: kgsl: possible gpu syncpoint deadlock for context 10 timestamp 0 [ 4391.046315] kgsl kgsl-3d0: context[10]: queue=4700, submit=4699, start=4699, retire=4699 [ 4391.046323] kgsl kgsl-3d0: possible deadlock. Context 10 might be blocked for itself [ 4391.046335] kgsl kgsl-3d0: context[10]: submit times: 5.354 5.392 5.548 5.393 5.345 5.308 5.375 [ 4391.046343] [ffffffc091f9f700] SurfaceView:0: active [ 4391.046347] waiters: [ 4391.046363] kgsl_sync_callback+0x0/0x38 [ 4391.046368] syncpoints: [ 4391.046374] kgsl-3d0_surfaceflinger(380)-surfaceflinger(380)-2_pt signaled@4386.030689: 41 / 41 queued:41 retired:41 [ 4391.046396] kgsl-3d0_la.fennec_snorp(19650)-Thread-173(19777)-10_pt signaled@4386.030689: 4697 / 4699 queued:4700 retired:4699 [ 4391.046413] mdss_fb_0_pt active: 13688 / 13687 [ 4391.046428] kgsl kgsl-3d0: --gpu syncpoint deadlock print end--
Assignee | ||
Comment 16•8 years ago
|
||
According to the angler-kernel repo[0] there have been quite a few updates since the one installed on my device (3.10.73-gcf36678). I think I'll try to install a newer kernel. [0] https://android.googlesource.com/device/huawei/angler-kernel/
Assignee | ||
Comment 17•8 years ago
|
||
I updated my kernel to the latest one in the angler-kernel repo, 3.10.73-g549e459, and I can still reproduce the issue. Ugh.
Updated•8 years ago
|
tracking-fennec: ? → 42+
Comment 18•8 years ago
|
||
noise |
I just ordered nexus6P yesterday. The phone seems to be in Hongkong now in delivery tracking system.
Comment 19•8 years ago
|
||
(In reply to [:fabrice] Fabrice Desré from comment #10) > Sotaro, do we still have contacts at QC that could help on that? :diego, might still respond to a question. But they did not spend time if the bug does not related to fxos mobile products and basically we needed to analyze it.
Comment 20•8 years ago
|
||
(In reply to Alex from comment #7) > I think I might be hitting the same crash on my Nexus 5 (Can't test with the > video from comment 1 because of a geo restriction), device just reboots > sometimes when playing video. I also cannot test the video because of geo restriction.
Flags: needinfo?(sotaro.ikeda.g)
Comment 21•8 years ago
|
||
(In reply to Ka-Hing Cheung from comment #6) > Forgot to mention I am using nexus 5x It might be a problem that happen both nexus6p and nexus5x. From it, it might be a problem that commonly happen on new qualcomm 64bit chipset. https://en.wikipedia.org/wiki/Nexus_6P https://en.wikipedia.org/wiki/Nexus_5X
Comment 22•8 years ago
|
||
Yes can confirm that this is an issue for the 5x as well. Snorp ashed me to check Yahoo video on the 5x, the first video I found reboots the device every time. https://screen.yahoo.com/popular/star-wars-lego-destruction-star-140000599.html
Comment 23•8 years ago
|
||
My Nexus 5 (not 5x) crashed on that video fairly quickly. Before it crashed the video was jumpy, started playing a second or so into the video, then froze and jumped back to the start, then crashed.
Comment 24•8 years ago
|
||
(In reply to Alex from comment #23) > My Nexus 5 (not 5x) crashed on that video fairly quickly. > > Before it crashed the video was jumpy, started playing a second or so into > the video, then froze and jumped back to the start, then crashed. Alex, can you take logcat and dmesg log?
Comment 25•8 years ago
|
||
Sure, I'm not sure how useful dmesg will be, since I can't find a way to tail its output, but I did get the output of logcat and last_kmsg, and they give a bit more info (Included everything though)
Comment 26•8 years ago
|
||
Thanks Alex! From last_kmsg.log in attachment 8684033 [details], problem seems to happen around SurfaceView fence handling. [ 420.640043] fence timeout on [c3604280] after 10000ms [ 420.640430] objs: [ 420.640433] -------------- >>> snip [ 420.658320] [c3604280] SurfaceView:0: active [ 420.658323] kgsl-timeline_pt signaled@410.514145 [ 420.659780] kgsl-timeline_pt signaled@410.511896 [ 420.659784] mdss_fb_0_pt active: 2388 / 2387 >>> snip [ 421.190340] Watchdog bark! Now = 421.190336 [ 421.190414] Watchdog last pet at 410.190056 [ 421.190453] cpu alive mask from last pet 0-3 [ 421.190491] Causing a watchdog bite!
Flags: needinfo?(the.decryptor)
Assignee | ||
Comment 27•8 years ago
|
||
I saw the SurfaceView watchdog bite once too. I am not sure if that's the root cause or if the kernel is fouled up from something else. Media playback clearly plays some kind of part...
Assignee | ||
Comment 28•8 years ago
|
||
I thought maybe the per-frame attach/detach that we do on the SurfaceTexture could be responsible for this badness, but disabling that doesn't seem to help here.
Assignee | ||
Comment 29•8 years ago
|
||
In the file I attached, nearly all of the blocked tasks are waiting on the main binder mutex. I don't see anything there which would be holding that mutex, so I guess the owner is not in the blocked state (and therefore does not appear in this list)?
Assignee | ||
Comment 30•8 years ago
|
||
Another thing interesting from that log: All CPUs are in lpm_cpuidle_enter, except CPU 4 which is servicing the SysRq and CPU 0 which is the AudioTrack task. AudioTrack is not listed in set of blocked tasks, but presumably it's holding the binder mutex and then gets stuck somewhere else?
Assignee | ||
Comment 31•8 years ago
|
||
Modifying Fennec to drop the audio track on the floor results in no problems, so audio does indeed some to be part of the problem.
Assignee | ||
Comment 32•8 years ago
|
||
Doing the opposite -- keeping the audio track, but dropping the video track, results in reboot.
Assignee | ||
Comment 33•8 years ago
|
||
We're using OpenSL for audio output, which might be the reason we are a unique snowflake here.
Assignee | ||
Comment 34•8 years ago
|
||
Apparently OpenSL uses AudioTrack on this device, because the AudioTrack thread that is hung on CPU 0 is from fennec. We are definitely not using AudioTrack directly, because that cubeb backend doesn't even work on 6.0 (it's missing a bunch of symbols). In the latest hang, it's stuck at ktime_get_ts+0x74/0x100 in the kernel. This matches the userspace stack I once got above in comment #4. top shows fennec using 12% CPU, which on this 8 core device equates to one whole CPU. I think we must be spinning in ktime_get_ts for some reason.
Updated•8 years ago
|
Flags: needinfo?(the.decryptor)
Comment 35•8 years ago
|
||
I can reproduce this as well
Assignee | ||
Comment 36•8 years ago
|
||
I am able to build my own kernels now, so I can debug a little closer. So far, I have determined: 1) The wait_for_completion in msm_mpm_work_fn is probably harmless. I made it timeout, and the behavior was pretty much the same. 2) All of the tasks waiting on the binder mutex do not appear to be deadlocked. After the device is hung I still see successful locks taking place. I guess there is just a lot of contention? 3) I am in the deep end of the pool here and have no idea what I am doing.
Comment 37•8 years ago
|
||
Inder, do you know someone that could help us there? thanks!
Flags: needinfo?(ikumar)
Assignee | ||
Comment 38•8 years ago
|
||
I have been assuming the watchdog bite was because it determined there was some hardware badness, but it appears the watchdog is most likely kept satisfied by userspace pings. Consequently, I could be barking up the wrong tree here.
Assignee | ||
Comment 39•8 years ago
|
||
The watchdog daemon source here[0] looks pretty tiny, though, so if that is hosed I would guess there is kernel badness. [0] http://androidxref.com/6.0.0_r1/xref/system/core/init/watchdogd.cpp
Assignee | ||
Comment 40•8 years ago
|
||
The userspace watchdog service is what generates the kernel sysrq traces, and it also does a userspace dump on some system services as well. I grabbed this one just now.
Assignee | ||
Comment 41•8 years ago
|
||
This is another set of userspace traces, and I was able to get fennec this time as well. The main thread appears to be stuck in android.view.DisplayEventReceiver.nativeScheduleVsync, which ends up stuck in binder.
Assignee | ||
Updated•8 years ago
|
Summary: Nexus 6P crashes (reboots) when playing video in Fennec → Nexus 6P reboots or hangs when playing video in Fennec
Assignee | ||
Comment 42•8 years ago
|
||
Might be getting somewhere. Binder does indeed some to be...bound up. [ 259.317110] SNORP: binder locked! Binder_4 [ 259.317135] SNORP: attempting binder lock.. Thread-152 [ 259.317141] SNORP: binder locked! Thread-152 [ 259.317567] SNORP: attempting binder lock.. Thread-153 [ 259.317575] SNORP: binder locked! Thread-153 [ 259.317605] SNORP: attempting binder lock.. Thread-153 [ 259.317611] SNORP: binder locked! Thread-153 [ 259.317622] SNORP: attempting binder lock.. Thread-153 [ 259.317627] SNORP: binder locked! Thread-153 [ 259.317652] SNORP: attempting binder lock.. Binder_2 [ 259.317658] SNORP: binder locked! Binder_2 [ 259.317768] SNORP: attempting binder lock.. Binder_2 [ 259.317774] SNORP: binder locked! Binder_2 [ 259.317792] SNORP: attempting binder lock.. Binder_2 [ 259.317798] SNORP: binder locked! Binder_2 [ 259.317822] SNORP: attempting binder lock.. Binder_2 [ 259.317827] SNORP: binder locked! Binder_2 [ 259.317853] SNORP: attempting binder lock.. Thread-153 [ 259.317859] SNORP: binder locked! Thread-153 [ 259.318313] SNORP: attempting binder lock.. Binder_3 [ 259.318350] SNORP: attempting binder lock.. MediaPl~ack #11 [ 259.977658] SNORP: attempting binder lock.. system_server [ 260.233574] SNORP: attempting binder lock.. m.android.phone [ 264.617285] SNORP: attempting binder lock.. ActivityManager [ 264.617461] SNORP: attempting binder lock.. ConnectivitySer [ 267.351033] healthd: battery l=77 v=4034 t=37.0 h=2 st=2 c=-318 chg=u 2015-11-09 22:45:37.804400605 UTC [ 267.351096] SNORP: attempting binder lock.. healthd [ 267.634189] SNORP: attempting binder lock.. Gecko [ 273.038252] SNORP: attempting binder lock.. ATFWD-daemon [ 284.436335] SNORP: attempting binder lock.. GCMReader [ 322.738802] SNORP: attempting binder lock.. Binder_1 [ 327.761553] SNORP: attempting binder lock.. Binder_2 [ 343.445929] SNORP: attempting binder lock.. Binder_3 [ 348.469079] SNORP: attempting binder lock.. Binder_4 [ 350.513281] SMBCHG: determine_target_input_current: target_current = 500 [ 353.711504] SNORP: attempting binder lock.. Binder_5 [ 358.732076] SNORP: attempting binder lock.. Binder_6 [ 363.857570] SNORP: attempting binder lock.. Binder_7 [ 368.920602] SNORP: attempting binder lock.. Binder_8 [ 373.942060] SNORP: attempting binder lock.. Binder_9 [ 378.981073] SNORP: attempting binder lock.. Binder_A [ 384.002085] SNORP: attempting binder lock.. Binder_B [ 389.147634] SNORP: attempting binder lock.. Binder_C [ 394.171567] SNORP: attempting binder lock.. Binder_D [ 399.207093] SNORP: attempting binder lock.. Binder_E [ 399.221849] SNORP: attempting binder lock.. Binder_1 Thread-153 (whatever that is) is hosing us. Or maybe the target thread is hosing us, not sure.
Comment 43•8 years ago
|
||
I am also experiencing this issue. I am able to reproduce it using the following demo: http://www.wothke.ch/websid/ . If you tap on "What's this?" and then repeatedly tap on the next button ">>|" the phone will eventually either hang or reboot. I am on Nexus 6P.
Assignee | ||
Comment 44•8 years ago
|
||
(In reply to rqou from comment #43) > I am also experiencing this issue. I am able to reproduce it using the > following demo: http://www.wothke.ch/websid/ . If you tap on "What's this?" > and then repeatedly tap on the next button ">>|" the phone will eventually > either hang or reboot. I am on Nexus 6P. Thanks for the report. I can confirm that I get the same hang if I follow those instructions. There is no decoder activity with this demo, so I think we can narrow this down to something related to sound playback.
Comment 45•8 years ago
|
||
(In reply to rqou from comment #43) > I am also experiencing this issue. I am able to reproduce it using the > following demo: http://www.wothke.ch/websid/ . If you tap on "What's this?" > and then repeatedly tap on the next button ">>|" the phone will eventually > either hang or reboot. I am on Nexus 6P. I also confirmed system reboot on nexus-5.
Comment 46•8 years ago
|
||
I am going to think, the problem might be caused by very high cpu usage of AudioTrack thread. The thread handle callback from AudioFlinger and run very high thread priority. Before the system reboot on nexus 5, "adb shell top -m 5 -d 3 -t" showed the following result. ---------------------------------------------------- User 68%, System 19%, IOW 0%, IRQ 0% User 212 + Nice 0 + Sys 60 + Idle 37 + IOW 0 + IRQ 0 + SIRQ 0 = 309 PID TID PR CPU% S VSS RSS PCY UID Thread Proc 3028 3182 0 126% R 1695784K 404936K fg u0_a53 AudioTrack org.mozilla.fennec_sotaro 3086 3086 0 6% R 3220K 1376K fg shell top top 201 675 0 2% R 138232K 10480K fg media FastMixer /system/bin/mediaserver 3028 3141 0 2% S 1695784K 404936K fg u0_a53 Thread-169 org.mozilla.fennec_sotaro 3028 3097 0 1% S 1695784K 404936K fg u0_a53 Gecko org.mozilla.fennec_sotaro
Comment 47•8 years ago
|
||
On fxos nexus-5-l, top command showed very different result. The following showed that Browser main thread was most busy. On android, AudioTrack thread was most busy and its cpu usage was too high. ---------------------------------------------- User 46%, System 9%, IOW 0%, IRQ 0% User 283 + Nice 0 + Sys 59 + Idle 259 + IOW 2 + IRQ 0 + SIRQ 1 = 604 PID TID PR CPU% S VSS RSS PCY UID Thread Proc 1808 1808 1 44% R 278148K 68784K unk u0_a1808 Browser /system/b2g/b2g 196 790 1 2% S 568660K 138248K fg root Compositor /system/b2g/b2g 196 1830 1 2% S 573224K 138248K fg root BufMgrParent#18 /system/b2g/b2g 1808 2566 1 1% S 278148K 68784K fg u0_a1808 AudioTrack /system/b2g/b2g 2685 2685 0 1% R 10672K 1264K fg root top top
Comment 48•8 years ago
|
||
Both on android and fxos nexus-5, AudioTrack thread seemed to have same thread priority like the following.
> APPLICATION PRIO NICE RTPRI SCHED NAME
> AudioTrack -3 -16 2 1 AudioTrack
AudioTrack thread is very high priority, if the thread consume almost all cpu resource, another thread could not use cpu. It might cause this problem.
Comment 49•8 years ago
|
||
(In reply to Sotaro Ikeda [:sotaro] from comment #47) > On fxos nexus-5-l, top command showed very different result. The following > showed that Browser main thread was most busy. On android, AudioTrack thread > was most busy and its cpu usage was too high. > > ---------------------------------------------- > > User 46%, System 9%, IOW 0%, IRQ 0% > User 283 + Nice 0 + Sys 59 + Idle 259 + IOW 2 + IRQ 0 + SIRQ 1 = 604 > > PID TID PR CPU% S VSS RSS PCY UID Thread Proc > 1808 1808 1 44% R 278148K 68784K unk u0_a1808 Browser /system/b2g/b2g On fxos nexus-5 case, Browser main thread cpu usage is mainly for rendering. When display is off, cpu usage was the following. ------------------------------------------------- PID TID PR CPU% S VSS RSS PCY UID Thread Proc 1688 1688 0 17% S 259240K 66444K unk u0_a1688 Browser /system/b2g/b2g 1688 2398 0 12% S 259240K 66444K fg u0_a1688 AudioTrack /system/b2g/b2g 192 732 0 8% S 47136K 9172K fg media FastMixer /system/bin/mediaserver 2877 2877 0 4% R 10672K 1252K fg root top top 910 927 0 1% S 20256K 880K fg root mpdecision /system/bin/mpdecision
Comment 50•8 years ago
|
||
On android 5.1.1 factory image of nexus-5, http://www.wothke.ch/websid/ showed the following cpu usage by "adb shell top -m 5 -d 3 -t". It seems same to fxos master nexus-5-l. ---------------------------------------------------------- User 51%, System 8%, IOW 0%, IRQ 0% User 312 + Nice 0 + Sys 54 + Idle 239 + IOW 0 + IRQ 0 + SIRQ 0 = 605 PID TID PR CPU% S VSS RSS PCY UID Thread Proc 5914 5963 0 46% R 2258460K 391556K fg u0_a83 Gecko org.mozilla.fennec_sotaro 5914 6465 1 3% S 2258460K 391556K fg u0_a83 AudioTrack org.mozilla.fennec_sotaro 6440 6440 1 3% R 10672K 1560K fg shell top top 171 217 1 2% S 18628K 2740K fg logd logd.writer /system/bin/logd 5914 6041 0 2% S 2258460K 391556K fg u0_a83 Thread-530 org.mozilla.fennec_sotaro
Comment 51•8 years ago
|
||
Reproducing using comment 43 on Nexus 7 and 5 with Android M. Using the hardware volume buttons seems to be broken in this situation and triggers the hang/reboot for me.
Assignee | ||
Comment 52•8 years ago
|
||
(In reply to Sotaro Ikeda [:sotaro] from comment #48) > Both on android and fxos nexus-5, AudioTrack thread seemed to have same > thread priority like the following. > > > APPLICATION PRIO NICE RTPRI SCHED NAME > > AudioTrack -3 -16 2 1 AudioTrack > > AudioTrack thread is very high priority, if the thread consume almost all > cpu resource, another thread could not use cpu. It might cause this problem. This is interesting, but the 6P has 8 cores. It could easily schedule stuff on one of those, right? Maybe something prevents that?
Assignee | ||
Comment 53•8 years ago
|
||
I modified the kernel to show all of the fennec tasks when the userspace watchdog does the sysrq during the hang. The AudioTrack thread is interesting: [ 995.648185] AudioTrack R running task 0 8840 530 0x00400000 [ 995.648196] Call trace: [ 995.648202] [<ffffffc000204c88>] __switch_to+0x7c/0x88 [ 995.648209] [<ffffffc0002040dc>] handle_IRQ+0x88/0xa8 [ 995.648217] [<ffffffc0002006d8>] gic_handle_irq+0x54/0x84 [ 995.648222] Exception stack(0xffffffc037cdbd10 to 0xffffffc037cdbe30) [ 995.648230] bd00: 37cdbe80 ffffffc0 37cdbec0 ffffffc0 [ 995.648238] bd20: 37cdbe50 ffffffc0 0042d308 ffffffc0 00000000 00000000 37cdbe90 ffffffc0 [ 995.648246] bd40: ffffffff ffffffff 2625a023 00000000 37cdbed0 ffffffc0 000003e2 00000000 [ 995.648254] bd60: 34155555 00000000 00000018 00000000 a9bc78e0 ffffffff 00000000 00000000 [ 995.648262] bd80: 00000000 00000000 00000000 00000000 00000000 00000000 c67fb820 00000000 [ 995.648271] bda0: f767c135 00000000 00000000 00000000 0028231c ffffffc0 00000000 00000000 [ 995.648278] bdc0: 00000000 00000000 37cdbe80 ffffffc0 37cdbec0 ffffffc0 00000001 00000000 [ 995.648286] bde0: 37cdbdf0 ffffffc0 00911e28 ffffffc0 37cdbe00 ffffffc0 00272078 ffffffc0 [ 995.648295] be00: 37cdbe40 ffffffc0 0023e7ec ffffffc0 37cdbe80 ffffffc0 37cdbec0 ffffffc0 [ 995.648301] be20: 00000001 00000000 00000000 00000080 [ 995.648308] [<ffffffc0002035a4>] el1_irq+0x64/0xd4 [ 995.648316] [<ffffffc00028234c>] compat_sys_clock_gettime+0x30/0x64 I guess clock_gettime() was interrupted by an IRQ, and the handler never returns? I need to add a way to trigger this sysrq whenever I want to see if it's really stuck there, but selinux is being a pain.
Assignee | ||
Comment 54•8 years ago
|
||
Does the presence of the __switch_to in the top frame mean the task was preempted? Is that supposed to be allowed in an interrupt handler? I'm about at my wits end here (probably well past that point, actually).
Comment 55•8 years ago
|
||
(In reply to James Willcox (:snorp) (jwillcox@mozilla.com) from comment #52) > (In reply to Sotaro Ikeda [:sotaro] from comment #48) > > Both on android and fxos nexus-5, AudioTrack thread seemed to have same > > thread priority like the following. > > > > > APPLICATION PRIO NICE RTPRI SCHED NAME > > > AudioTrack -3 -16 2 1 AudioTrack > > > > AudioTrack thread is very high priority, if the thread consume almost all > > cpu resource, another thread could not use cpu. It might cause this problem. > > This is interesting, but the 6P has 8 cores. It could easily schedule stuff > on one of those, right? Maybe something prevents that? Yea, on multi core device, on thread's priority should not prevent hole devices. There should be something.
Comment 56•8 years ago
|
||
AudioTrack:s sampling rate seems to be related to the problem. At http://www.wothke.ch/websid/, sampleRate was 48000. At youtube.com, sampleRate was 44100. Android normally use 44100 as sample rate. The followings are AudioTrack params dump. ---------------- - audio track prams at http://www.wothke.ch/websid/ > 11-11 23:54:19.349 14855 14981 V AudioTrack: set(): streamType 3, sampleRate 48000, format 0x1, channelMask 0x3, frameCount 0, flags #4, notificationFrames 0, sessionId 56, transferType 0, uid -1, pid -1 - audio params at youtube.com > 11-11 23:52:26.975 6681 14713 V AudioTrack: set(): streamType 3, sampleRate 44100, format 0x1, channelMask 0x3, frameCount 0, flags #4, notificationFrames 0, sessionId 50, transferType 0, uid -1, pid -1
Comment 57•8 years ago
|
||
Just the following change(hard code 44100 sample rate) seemed to fix the problem, though audio pitch and play speed became slow. diff --git a/media/libcubeb/src/cubeb_opensl.c b/media/libcubeb/src/cubeb_opensl.c --- a/media/libcubeb/src/cubeb_opensl.c +++ b/media/libcubeb/src/cubeb_opensl.c @@ -475,16 +475,18 @@ opensl_stream_init(cubeb * ctx, cubeb_st *stream = NULL; if (stream_params.channels < 1 || stream_params.channels > 32 || latency < 1 || latency > 2000) { return CUBEB_ERROR_INVALID_FORMAT; } + stream_params.rate = 44100; + SLDataFormat_PCM format;
Comment 58•8 years ago
|
||
When audio sample rate was 48000, audio out seemed to use low-latency-playback. ---------------------------------------------------------------- 11-12 01:34:04.856 1686 1938 V AudioTrack: set(): streamType 3, sampleRate 48000, format 0x1, channelMask 0x3, frameCount 0, flags #4, notificationFrames 0, sessionId 22, transferType 0, uid -1, pid -1 11-12 01:34:04.856 1686 1938 V AudioTrack: set() streamType 3 frameCount 0 flags 0004 11-12 01:34:04.856 1686 1938 V AudioTrack: createTrack_l() output 2 afLatency 50 11-12 01:34:04.856 205 856 V AudioFlinger: createTrack() lSessionId: 22 11-12 01:34:04.857 1686 1938 V AudioTrack: AUDIO_OUTPUT_FLAG_FAST successful; frameCount 240 11-12 01:34:04.857 205 781 V AudioFlinger: acquiring 22 from 1686, for 1686 11-12 01:34:04.857 205 781 V AudioFlinger: added new entry for 22 11-12 01:34:04.868 205 774 D audio_hw_primary: out_set_parameters: enter: usecase(1: low-latency-playback) kvpairs: routing=2 11-12 01:34:04.868 1686 1943 V AudioTrack: timeout 0.008
Comment 59•8 years ago
|
||
This patch is a temporary workaround of phone reboot on android M. It seemed to work for http://www.wothke.ch/websid/ phone reboot on master android firefox on nexus-5 of android M.
Comment 60•8 years ago
|
||
(In reply to Sotaro Ikeda [:sotaro] from comment #46) > I am going to think, the problem might be caused by very high cpu usage of > AudioTrack thread. The thread handle callback from AudioFlinger and run very > high thread priority. > > Before the system reboot on nexus 5, "adb shell top -m 5 -d 3 -t" showed the > following result. > > ---------------------------------------------------- > User 68%, System 19%, IOW 0%, IRQ 0% > User 212 + Nice 0 + Sys 60 + Idle 37 + IOW 0 + IRQ 0 + SIRQ 0 = 309 > > PID TID PR CPU% S VSS RSS PCY UID Thread Proc > 3028 3182 0 126% R 1695784K 404936K fg u0_a53 AudioTrack > org.mozilla.fennec_sotaro > 3086 3086 0 6% R 3220K 1376K fg shell top top > 201 675 0 2% R 138232K 10480K fg media FastMixer > /system/bin/mediaserver > 3028 3141 0 2% S 1695784K 404936K fg u0_a53 Thread-169 > org.mozilla.fennec_sotaro > 3028 3097 0 1% S 1695784K 404936K fg u0_a53 Gecko > org.mozilla.fennec_sotaro I seems to understand why AudioTrack thread uses a lot of cpu. When the problem happened AudioTrack::setMarkerPosition() was called with very big marker value. It caused the problem. When very bit marker value is set, AudioTrack::processAudioBuffer() does not work correctly. The following time unit becomes 0. If it happens, AudioTrack::obtainBuffer() is called without timeout wait and AudioTrack thread becomes not to sleep. ------------------------------------ // Convert frame units to time units nsecs_t ns = NS_WHENEVER; if (minFrames != (uint32_t) ~0) { ns = framesToNanoseconds(minFrames, sampleRate, speed) + kWaitPeriodNs; ns -= (timeAfterCallbacks - timeBeforeCallbacks); // account for callback time // TODO: Should we warn if the callback time is too long? if (ns < 0) { ns = 0; } } http://androidxref.com/6.0.0_r1/xref/frameworks/av/media/libmedia/AudioTrack.cpp#1925
Comment 61•8 years ago
|
||
Comment on attachment 8686386 [details] [diff] [review] patch - Wrokaround of phone reboot Set to obsolete.
Attachment #8686386 -
Attachment is obsolete: true
Comment 62•8 years ago
|
||
(In reply to Sotaro Ikeda [:sotaro] from comment #60) > (In reply to Sotaro Ikeda [:sotaro] from comment #46) > > I am going to think, the problem might be caused by very high cpu usage of > > AudioTrack thread. The thread handle callback from AudioFlinger and run very > > high thread priority. > > > > Before the system reboot on nexus 5, "adb shell top -m 5 -d 3 -t" showed the > > following result. > > > > ---------------------------------------------------- > > User 68%, System 19%, IOW 0%, IRQ 0% > > User 212 + Nice 0 + Sys 60 + Idle 37 + IOW 0 + IRQ 0 + SIRQ 0 = 309 > > > > PID TID PR CPU% S VSS RSS PCY UID Thread Proc > > 3028 3182 0 126% R 1695784K 404936K fg u0_a53 AudioTrack > > org.mozilla.fennec_sotaro > > 3086 3086 0 6% R 3220K 1376K fg shell top top > > 201 675 0 2% R 138232K 10480K fg media FastMixer > > /system/bin/mediaserver > > 3028 3141 0 2% S 1695784K 404936K fg u0_a53 Thread-169 > > org.mozilla.fennec_sotaro > > 3028 3097 0 1% S 1695784K 404936K fg u0_a53 Gecko > > org.mozilla.fennec_sotaro > > I seems to understand why AudioTrack thread uses a lot of cpu. When the > problem happened AudioTrack::setMarkerPosition() was called with very big > marker value. It caused the problem. When very bit marker value is set, > AudioTrack::processAudioBuffer() does not work correctly. > > The following time unit becomes 0. If it happens, AudioTrack::obtainBuffer() > is called without timeout wait and AudioTrack thread becomes not to sleep. > I tried to prevent the above situation by local modification. The problem of http://www.wothke.ch/websid/ was addressed.
Comment 63•8 years ago
|
||
(In reply to James Willcox (:snorp) (jwillcox@mozilla.com) from comment #52) > > > > > APPLICATION PRIO NICE RTPRI SCHED NAME > > > AudioTrack -3 -16 2 1 AudioTrack > > > > AudioTrack thread is very high priority, if the thread consume almost all > > cpu resource, another thread could not use cpu. It might cause this problem. > > This is interesting, but the 6P has 8 cores. It could easily schedule stuff > on one of those, right? Maybe something prevents that? AudioTrack is very high priority(RT priority) thread and it continuously called system calls when the problem happened. If system call holds global lock within kernel, it could almost freeze the whole system, I assume.
Assignee | ||
Comment 64•8 years ago
|
||
(In reply to Sotaro Ikeda [:sotaro] from comment #62) > (In reply to Sotaro Ikeda [:sotaro] from comment #60) > > (In reply to Sotaro Ikeda [:sotaro] from comment #46) > > > I am going to think, the problem might be caused by very high cpu usage of > > > AudioTrack thread. The thread handle callback from AudioFlinger and run very > > > high thread priority. > > > > > > Before the system reboot on nexus 5, "adb shell top -m 5 -d 3 -t" showed the > > > following result. > > > > > > ---------------------------------------------------- > > > User 68%, System 19%, IOW 0%, IRQ 0% > > > User 212 + Nice 0 + Sys 60 + Idle 37 + IOW 0 + IRQ 0 + SIRQ 0 = 309 > > > > > > PID TID PR CPU% S VSS RSS PCY UID Thread Proc > > > 3028 3182 0 126% R 1695784K 404936K fg u0_a53 AudioTrack > > > org.mozilla.fennec_sotaro > > > 3086 3086 0 6% R 3220K 1376K fg shell top top > > > 201 675 0 2% R 138232K 10480K fg media FastMixer > > > /system/bin/mediaserver > > > 3028 3141 0 2% S 1695784K 404936K fg u0_a53 Thread-169 > > > org.mozilla.fennec_sotaro > > > 3028 3097 0 1% S 1695784K 404936K fg u0_a53 Gecko > > > org.mozilla.fennec_sotaro > > > > I seems to understand why AudioTrack thread uses a lot of cpu. When the > > problem happened AudioTrack::setMarkerPosition() was called with very big > > marker value. It caused the problem. When very bit marker value is set, > > AudioTrack::processAudioBuffer() does not work correctly. > > > > The following time unit becomes 0. If it happens, AudioTrack::obtainBuffer() > > is called without timeout wait and AudioTrack thread becomes not to sleep. > > > > I tried to prevent the above situation by local modification. The problem of > http://www.wothke.ch/websid/ was addressed. Interesting! I changed opensl_get_preferred_sample() to always return 44100, and things do seem improved. However, I am still able to get a hang on the Yahoo test. I think perhaps the higher iteration frequency made it easier to get into whatever bad state we're hitting.
Assignee | ||
Comment 65•8 years ago
|
||
AFAICT, the only place wilhelm calls AudioTrack::setMarkerPosition() is here[0], which is in response to registering for the SL_PLAYEVENT_HEADATMARKER event. I think the reason the marker value would be huge is because the marker position is initialized[1] to SL_TIME_UNKNOWN, which is 0xFFFFFFFF[2]. Ahem. Setting the marker position to 0 before setting the event mask seems to avoid the badness. Patch incoming. [0] http://androidxref.com/6.0.0_r1/xref/frameworks/wilhelm/src/android/AudioPlayer_to_android.cpp#2029 [1] http://androidxref.com/6.0.0_r1/xref/frameworks/wilhelm/src/itf/IPlay.c#466 [2] http://androidxref.com/6.0.0_r1/xref/frameworks/wilhelm/include/SLES/OpenSLES.h#921
Assignee | ||
Comment 66•8 years ago
|
||
Note that the spec[0] does not say the marker must be set before registering for this callback, so I think wilhelm is just busted here. It should check for SL_TIME_UNKNOWN. [0] https://www.khronos.org/registry/sles/specs/OpenSL_ES_Specification_1.0.1.pdf
Assignee | ||
Comment 67•8 years ago
|
||
Attachment #8686774 -
Flags: review?(padenot)
Assignee | ||
Comment 68•8 years ago
|
||
:esawin asked me why this would only happen on Android M, and I'm not totally sure. It looks like the busted marker position stuff has been in wilhelm for a while. My guess is that they only recently made the AudioTrack thread RT?
Comment 69•8 years ago
|
||
On andorid v5.5.1, AudioTrack thread is also RT. I think the following code affect to the problem as in Comment 60. The following was changed from android v5.5.1 to android 6.0.0. ------------------------------------ // Convert frame units to time units nsecs_t ns = NS_WHENEVER; if (minFrames != (uint32_t) ~0) { ns = framesToNanoseconds(minFrames, sampleRate, speed) + kWaitPeriodNs; ns -= (timeAfterCallbacks - timeBeforeCallbacks); // account for callback time // TODO: Should we warn if the callback time is too long? if (ns < 0) { ns = 0; } }
Comment 70•8 years ago
|
||
Comment 71•8 years ago
|
||
attachment 8686942 [details] [diff] [review] revert a part of AudioTrack code that is related to comment 69. By applying the patch to android v6.0.0, I confirmed the problem of http://www.wothke.ch/websid/ was addressed on nexus-5.
Comment 72•8 years ago
|
||
From comment 71, the change of attachment 8686942 [details] [diff] [review] that happened from v5.1.1 to v6.0.0 seems to trigger the problem.
Comment 73•8 years ago
|
||
diego, do you know if this is a known problem of android 6.0?
Flags: needinfo?(dwilson)
Comment 74•8 years ago
|
||
Comment on attachment 8686774 [details] [diff] [review] Work around busted OpenSL causing hangs/reboots on Android Review of attachment 8686774 [details] [diff] [review]: ----------------------------------------------------------------- ?!?
Attachment #8686774 -
Flags: review?(padenot) → review+
Assignee | ||
Updated•8 years ago
|
Summary: Nexus 6P reboots or hangs when playing video in Fennec → 6.0 devices reboot or hang when playing audio/video in Fennec
Assignee | ||
Comment 77•8 years ago
|
||
Comment on attachment 8686774 [details] [diff] [review] Work around busted OpenSL causing hangs/reboots on Android Approval Request Comment [Feature/regressing bug #]: Android 6.0 [User impact if declined]: Hangs and reboots when playing media [Describe test coverage new/current, TreeHerder]: local, and soon on nightly [Risks and why]: Pretty low, but we should obviously test to make sure the patch doesn't somehow cause problems. [String/UUID change made/needed]: None
Attachment #8686774 -
Flags: approval-mozilla-release?
Attachment #8686774 -
Flags: approval-mozilla-beta?
Attachment #8686774 -
Flags: approval-mozilla-aurora?
Comment 79•8 years ago
|
||
Can confirm with Nexus 5X with FF only. Restarts everytime with videogular example. http://www.videogular.com/examples/simplest-videogular-player/
I'll check back to make sure this lands and is ok on m-c before we uplift it to beta. Let's aim this for the beta 4 build on Monday.
Comment 81•8 years ago
|
||
bugherder |
https://hg.mozilla.org/mozilla-central/rev/84c21e328c74
Status: NEW → RESOLVED
Closed: 8 years ago
status-firefox45:
--- → fixed
Resolution: --- → FIXED
Target Milestone: --- → Firefox 45
rqou, can you please verify that the hang/reboot issue you are experiencing is fixed on the latest Nightly build (date 11/15/2015)? Thanks.
Flags: needinfo?(rqou)
Comment 83•8 years ago
|
||
Bug seems fixed in Nightly. I tested on both the videogular example I posted, and, on my own site. Thankyou!
Comment 84•8 years ago
|
||
Tested the 11/15/2015 Nightly, bug seems fixed on all the sites/WebAudio demos I tried it on.
Flags: needinfo?(rqou)
Comment 85•8 years ago
|
||
Fixed for me, nothing I tried caused a crash (Still played with issues, I was hoping that was down to the CPU being hogged, but it seems to be something else)
Updated•8 years ago
|
Comment 86•8 years ago
|
||
Comment on attachment 8686774 [details] [diff] [review] Work around busted OpenSL causing hangs/reboots on Android Taking it on all branches as it is an important issue. However, we haven't decided if we do a 42.0.1 or not.
Attachment #8686774 -
Flags: approval-mozilla-release?
Attachment #8686774 -
Flags: approval-mozilla-release+
Attachment #8686774 -
Flags: approval-mozilla-beta?
Attachment #8686774 -
Flags: approval-mozilla-beta+
Attachment #8686774 -
Flags: approval-mozilla-aurora?
Attachment #8686774 -
Flags: approval-mozilla-aurora+
Updated•8 years ago
|
Assignee: nobody → snorp
Based on comment 83, 84 and 85, this is verified on trunk (Nightly45).
Status: RESOLVED → VERIFIED
(In reply to rqou from comment #84) > Tested the 11/15/2015 Nightly, bug seems fixed on all the sites/WebAudio > demos I tried it on. Thanks for the verification! This is very helpful.
Comment 92•8 years ago
|
||
(Making summary a bit more keyword rich, since I wasn't able to find this bug prior to filing bug 1225415.)
Summary: 6.0 devices reboot or hang when playing audio/video in Fennec → Android 6.0 devices (eg Nexus 5X/6P) reboot/restart or hang when playing audio/video
Comment 93•8 years ago
|
||
bugherder uplift |
https://hg.mozilla.org/releases/mozilla-b2g44_v2_5/rev/a8608fc16c09
status-b2g-v2.5:
--- → fixed
Comment 94•8 years ago
|
||
Verified as fixed on Nexus 5 (Android 6.0) on Firefox 43 Beta 4. I was able to reproduce this issue on Firefox 43 Beta 2, on http://www.videogular.com/examples/simplest-videogular-player/
[Tracking Requested - why for this release]: We should probably track this as it may be included in a 42.0.1 dot release. Snorp can you request uplift to m-r? Sylvestre can decide on this tomorrow, I just want to line up everything for him to do so. Thanks.
tracking-firefox42:
--- → ?
Flags: needinfo?(snorp)
Assignee | ||
Comment 96•8 years ago
|
||
(In reply to Liz Henry (:lizzard) (needinfo? me) from comment #95) > [Tracking Requested - why for this release]: We should probably track this > as it may be included in a 42.0.1 dot release. > > Snorp can you request uplift to m-r? Sylvestre can decide on this tomorrow, > I just want to line up everything for him to do so. Thanks. This is done, see comment #86
Flags: needinfo?(snorp)
Comment 98•8 years ago
|
||
Verified as fixed on Nexus 5 (Android 6.0) on Firefox 42.0.1.
Updated•8 years ago
|
Flags: needinfo?(dwilson)
Updated•8 years ago
|
Updated•7 years ago
|
Flags: needinfo?(ikumar)
Updated•3 years ago
|
Product: Firefox for Android → Firefox for Android Graveyard
You need to log in
before you can comment on or make changes to this bug.
Description
•