Closed Bug 1755700 Opened 2 years ago Closed 2 years ago

Crash in [@ mozilla::widget::AudioSession::Start]

Categories

(Core :: Widget: Win32, defect)

Firefox 99
Unspecified
Windows 11
defect

Tracking

()

RESOLVED FIXED
105 Branch
Tracking Status
thunderbird_esr102 --- fixed
firefox-esr91 --- unaffected
firefox-esr102 --- fixed
firefox97 --- unaffected
firefox98 --- unaffected
firefox99 + fixed
firefox102 --- wontfix
firefox103 --- wontfix
firefox104 --- fixed
firefox105 --- fixed

People

(Reporter: calixte, Assigned: handyman)

References

(Blocks 1 open bug, Regression)

Details

(Keywords: crash, regression, Whiteboard: [tbird crash])

Crash Data

Attachments

(4 files)

Crash report: https://crash-stats.mozilla.org/report/index/d058ccd3-496b-4ca9-910c-ca1b60220216

Reason: EXCEPTION_ACCESS_VIOLATION_READ

Top 10 frames of crashing thread:

0 xul.dll mozilla::widget::AudioSession::Start widget/windows/AudioSession.cpp:214
1 xul.dll mozilla::detail::RunnableFunction<`lambda at /builds/worker/checkouts/gecko/widget/windows/AudioSession.cpp:98:7'>::Run xpcom/threads/nsThreadUtils.h:531
2 xul.dll nsThreadPool::Run xpcom/threads/nsThreadPool.cpp:306
3 xul.dll nsThread::ProcessNextEvent xpcom/threads/nsThread.cpp:1165
4 xul.dll mozilla::ipc::MessagePumpForNonMainThreads::Run ipc/glue/MessagePump.cpp:300
5 xul.dll MessageLoop::RunHandler ipc/chromium/src/base/message_loop.cc:324
6 xul.dll MessageLoop::Run ipc/chromium/src/base/message_loop.cc:306
7 xul.dll static nsThread::ThreadFunc xpcom/threads/nsThread.cpp:387
8 nss3.dll _PR_NativeRunThread nsprpub/pr/src/threads/combined/pruthr.c:399
9 nss3.dll pr_root nsprpub/pr/src/md/windows/w95thred.c:139

There is 1 crash in nightly 99 with buildid 20220216094238. In analyzing the backtrace, the regression may have been introduced by patch [1] to fix bug 1617283.

[1] https://hg.mozilla.org/mozilla-central/rev?node=e1b946a41694

Flags: needinfo?(ssummar.bee16seecs)

Backout of bug 1617283 incoming.

Status: NEW → RESOLVED
Crash Signature: [@ mozilla::widget::AudioSession::Start] → [@ mozilla::widget::AudioSession::Start] [@ shutdownhang | nsThreadPool::ShutdownWithTimeout] [@ shutdownhang | kernelbase.dll | _tailMerge_d3dcompiler_47.dll | nsThreadPool::ShutdownWithTimeout]
Has Regression Range: --- → yes
Closed: 2 years ago
Flags: needinfo?(ssummar.bee16seecs)
Resolution: --- → FIXED
Target Milestone: --- → 99 Branch
Crash Signature: [@ mozilla::widget::AudioSession::Start] [@ shutdownhang | nsThreadPool::ShutdownWithTimeout] [@ shutdownhang | kernelbase.dll | _tailMerge_d3dcompiler_47.dll | nsThreadPool::ShutdownWithTimeout] → [@ audioses.dll | RtlpProtectHeap | RtlProtectHeap | <unknown in ntdll.dll> | RtlProtectHeap | RtlpCallVectoredHandlers | mozilla::widget::AudioSession::Start] [@ ATL::CComPtrBase<T>::CComPtrBase<T> ] [@ mozilla::widget::AudioSession::Start] [@ RtlpPro…
No longer blocks: 1617283

Crash signature got reports again starting with Firefox 101 after it had been fixed by backout for v99 and v100. ~500 crash reports stored for a release, most are startup crashes. Windows only. CC David as reviewer of bug 1617283 whose relanding revived the signature.

Crash report: https://crash-stats.mozilla.org/report/index/c8ce22b9-b1d2-4d46-b3b6-f591f0220711

Reason: EXCEPTION_ACCESS_VIOLATION_READ

Top 10 frames of crashing thread:

0 xul.dll mozilla::widget::AudioSession::Start widget/windows/AudioSession.cpp:219
1 xul.dll mozilla::detail::RunnableFunction<`lambda at /builds/worker/checkouts/gecko/widget/windows/AudioSession.cpp:99:7'>::Run xpcom/threads/nsThreadUtils.h:531
2 xul.dll nsThreadPool::Run xpcom/threads/nsThreadPool.cpp:310
3 xul.dll nsThread::ProcessNextEvent xpcom/threads/nsThread.cpp:1174
4 xul.dll mozilla::ipc::MessagePumpForNonMainThreads::Run ipc/glue/MessagePump.cpp:300
5 xul.dll MessageLoop::RunHandler ipc/chromium/src/base/message_loop.cc:373
6 xul.dll MessageLoop::Run ipc/chromium/src/base/message_loop.cc:355
7 xul.dll static nsThread::ThreadFunc xpcom/threads/nsThread.cpp:378
8 nss3.dll _PR_NativeRunThread nsprpub/pr/src/threads/combined/pruthr.c:399
9 nss3.dll pr_root nsprpub/pr/src/md/windows/w95thred.c:139
Status: RESOLVED → REOPENED
Flags: needinfo?(davidp99)
Resolution: FIXED → ---
Target Milestone: 99 Branch → ---
Assignee: nobody → davidp99
Flags: needinfo?(davidp99)

These crashes (and some hangs) are "startup crashes at shutdown". AudioSession::Start is dispatched as part of browser startup (before the main loop) or as part of handing AudioSession::OnSessionDisconnected. Nearly all of the crashes with these signatures are the former (the other issues are part of bug 1777518). Specifically, they tend to have the main thread in ShutdownXPCOM. This means that StopAudioSession has already run and potentially destroyed the AudioSession. AudioSession::Start will then crash accessing this.


Some examples:

Stack too corrupt to tell if in Shutdown:
https://crash-stats.mozilla.org/report/index/d6069dd6-f330-4c08-9f0b-ac1a30220604
https://crash-stats.mozilla.org/report/index/4cd49f41-b40e-4e75-a5f3-07b800220712

[run < 10 seconds] Stack is in shutdown, past StopAudioSession:
https://crash-stats.mozilla.org/report/index/d058ccd3-496b-4ca9-910c-ca1b60220216
https://crash-stats.mozilla.org/report/index/fe1ba3d1-8ae8-4913-b39c-0390c0220621

[run >10 seconds] Stack is in shutdown, past StopAudioSession:
https://crash-stats.mozilla.org/report/index/d2b902c4-e8fa-439e-9724-5d24f0220714 (1260 s)
https://crash-stats.mozilla.org/report/index/352df25c-5ecb-47b0-bdcb-3a2de0220715 (20 s)
https://crash-stats.mozilla.org/report/index/685405ea-57ea-4a86-b93f-7a2ca0220715 (218 s)
https://crash-stats.mozilla.org/report/index/df64e2c6-fd36-45ca-9d8c-348110220625 (35 s)


AudioSession::Start usually completes in the first few seconds of running and these stacks are not related to OnSessionDisconnected so it is odd that some of these are complete browser runs that seem not to have finished it. It is possible that some COM method is hanging the Start call until XPCOM starts terminating threads but that's strange. I don't have any other theories.

The simplest solution is to have the AudioSession outlive XPCOM threads (e.g. clear it during ShutdownPhase::XPCOMShutdownFinal).

Moves StopAudioSession earlier -- to nsAppShell::Run -- where we call StartAudioSession.

Concurrent operations in the MTA make the required lifetime of the AudioSession complex. In particular, it cannot be known when/if any methods are currently queued or being executed (in particular, Start). In order to make this safe, we keep the AudioSession at least as long as XPCOM is running background threads.

This patch also removes a long-defunct state machine and does some basic code cleanup.

Depends on D152299

A deadlock issue with our main thread audio playback handling arises when we try to destroy the AudioSessionControl from the MTA, despite it being an MTA object. We therefore dispatch its destruction to the main thread (STA) so the deadlock is impossible. In order to use it from the STA, we should wrap it in an AgileReference.

Depends on D152300

Destroying the IAudioSessionControl has concurrency issues with audio playback that require it to be done on the main thread. We have been trying to create a new AudioSessionControl on a background thread while this happens but some AudioSessionControl internals/dependencies are not thread safe. In order to avoid concurrency crashes, we destroy the old IAudioSessionControl on the main thread, then dispatch the restart to a background (MTA) thread asynchronously.

Depends on D152301

Attachment #9286292 - Attachment description: Bug 1755700: Serialize IAudioSessionControl destruction/re-creation r=cmartin! → Bug 1755700: Perform IAudioSessionControl destruction/re-creation in sequence r=cmartin!
Attachment #9286292 - Attachment description: Bug 1755700: Perform IAudioSessionControl destruction/re-creation in sequence r=cmartin! → Bug 1755700: Serialize IAudioSessionControl destruction/re-creation r=cmartin!
Pushed by daparks@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/b8e85f2c1831
Use StartAudioSession and StopAudioSession symmetrically r=cmartin
https://hg.mozilla.org/integration/autoland/rev/1a94e7579066
Simplify ownership of AudioSession r=cmartin
https://hg.mozilla.org/integration/autoland/rev/9e7a0d712d9e
Use AgileReference for cross-apartment AudioSessionControl use r=Jamie
https://hg.mozilla.org/integration/autoland/rev/62e28f4dc1af
Serialize IAudioSessionControl destruction/re-creation r=cmartin

The patch landed in nightly and beta is affected.
:handyman, is this bug important enough to require an uplift?

  • If yes, please nominate the patch for beta approval.
  • If no, please set status-firefox104 to wontfix.

For more information, please visit auto_nag documentation.

Flags: needinfo?(davidp99)

Leaving NI open while I monitor this for fallout (so far so good). If it still looks stable in a few days then we'll try for uplift.

Comment on attachment 9286292 [details]
Bug 1755700: Serialize IAudioSessionControl destruction/re-creation r=cmartin!

Beta/Release Uplift Approval Request

  • User impact if declined: Although the crash signatures attached to this bug are too wide and include a lot of unrelated crashes, this crash happens frequently, especially when connecting to remote desktop sessions running Fx. Windows only.
  • Is this code covered by automated tests?: No
  • Has the fix been verified in Nightly?: Yes
  • Needs manual test from QE?: No
  • If yes, steps to reproduce:
  • List of other uplifts needed: None
  • Risk to taking this patch: Medium
  • Why is the change risky/not risky? (and alternatives if risky): Contains some substantial changes to object ownership and some advanced COM and threading activities, but nothing more elaborate than what was there. The riskiest times are 1) changing audio devices and 2) shutdown. Those are the same intermittent crashes this is intended to fix.
  • String changes made/needed: N/A
  • Is Android affected?: No
Flags: needinfo?(davidp99)
Attachment #9286292 - Flags: approval-mozilla-beta?
Attachment #9286289 - Flags: approval-mozilla-beta?
Attachment #9286290 - Flags: approval-mozilla-beta?
Attachment #9286291 - Flags: approval-mozilla-beta?

Comment on attachment 9286289 [details]
Bug 1755700: Use StartAudioSession and StopAudioSession symmetrically r=cmartin!

Approved for 104.0b8

Attachment #9286289 - Flags: approval-mozilla-beta? → approval-mozilla-beta+
Attachment #9286290 - Flags: approval-mozilla-beta? → approval-mozilla-beta+
Attachment #9286291 - Flags: approval-mozilla-beta? → approval-mozilla-beta+
Attachment #9286292 - Flags: approval-mozilla-beta? → approval-mozilla-beta+
Whiteboard: [tbird crash]
See Also: → 1785808

What do you thinking about taking this on ESR102 as well? RDP connections seem enterprisey in nature and it appears TB is also affected per the recent See Also. The patches graft cleanly.

Flags: needinfo?(davidp99)

Comment on attachment 9286292 [details]
Bug 1755700: Serialize IAudioSessionControl destruction/re-creation r=cmartin!

ESR Uplift Approval Request

  • If this is not a sec:{high,crit} bug, please state case for ESR consideration: This is a pretty ugly crash that happens a lot before complete startup, during shutdown, or during certain special audio device events like connecting through RDP.
  • User impact if declined: General instability, especially when reconnecting during RDP.
  • Fix Landed on Version: 104 (uplifted)
  • Risk to taking this patch: Medium
  • Why is the change risky/not risky? (and alternatives if risky): It would be high as these are serious changes but the patches have been very stable and cleared up a lot of crashes. There is apparently a very small remaining Windows 7 issue that may or may not have been there before. But stability seems much improved for all Windows versions.
Flags: needinfo?(davidp99)
Attachment #9286292 - Flags: approval-mozilla-esr102?
Attachment #9286289 - Flags: approval-mozilla-esr102?
Attachment #9286290 - Flags: approval-mozilla-esr102?
Attachment #9286291 - Flags: approval-mozilla-esr102?

(In reply to Ryan VanderMeulen [:RyanVM] from comment #16)

What do you thinking about taking this on ESR102 as well? RDP connections seem enterprisey in nature and it appears TB is also affected per the recent See Also. The patches graft cleanly.

Let's try. This is looking very stable.

Comment on attachment 9286289 [details]
Bug 1755700: Use StartAudioSession and StopAudioSession symmetrically r=cmartin!

This is performing well in the field, approved for 102.3esr so TB and enterprise users can also benefit.

Attachment #9286289 - Flags: approval-mozilla-esr102? → approval-mozilla-esr102+
Attachment #9286290 - Flags: approval-mozilla-esr102? → approval-mozilla-esr102+
Attachment #9286291 - Flags: approval-mozilla-esr102? → approval-mozilla-esr102+
Attachment #9286292 - Flags: approval-mozilla-esr102? → approval-mozilla-esr102+

Copying crash signatures from duplicate bugs.

Crash Signature: _tailMerge_d3dcompiler_47.dll | nsThreadPool::ShutdownWithTimeout] [@ shutdownhang | ntdll.dll | _tailMerge_d3dcompiler_47.dll | profiler_thread_sleep] → _tailMerge_d3dcompiler_47.dll | nsThreadPool::ShutdownWithTimeout] [@ shutdownhang | ntdll.dll | _tailMerge_d3dcompiler_47.dll | profiler_thread_sleep] [@ mozilla::widget::AudioSession::Start()]
See Also: 1785808
See Also: → 1794799
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: