Closed Bug 1551856 Opened 6 years ago Closed 8 months ago

Crash in [@ shutdownhang | NtAlpcSendWaitReceivePort]

Categories

(Core :: Widget: Win32, defect, P2)

x86
Windows 7
defect

Tracking

()

RESOLVED FIXED
141 Branch
Tracking Status
firefox-esr60 --- wontfix
firefox-esr115 --- wontfix
firefox-esr128 --- wontfix
firefox-esr140 --- fixed
firefox66 --- wontfix
firefox67 --- wontfix
firefox68 --- wontfix
firefox69 --- wontfix
firefox70 --- wontfix
firefox139 --- wontfix
firefox140 --- wontfix
firefox141 --- fixed

People

(Reporter: pascalc, Assigned: handyman)

References

(Blocks 1 open bug)

Details

(Keywords: crash, perf-alert)

Crash Data

Attachments

(1 file)

This bug is for crash report bp-ed41785f-17be-4d4d-8314-713470190515.

Top 10 frames of crashing thread:

0 ntdll.dll KiFastSystemCallRet 
1 ntdll.dll NtAlpcSendWaitReceivePort 
2 rpcrt4.dll LRPC_CASSOCIATION::AlpcSendWaitReceivePort 
3 rpcrt4.dll LRPC_BASE_CCALL::DoSendReceive 
4 rpcrt4.dll LRPC_BASE_CCALL::SendReceive 
5 rpcrt4.dll LRPC_CCALL::SendReceive 
6 rpcrt4.dll I_RpcSendReceive 
7 rpcrt4.dll NdrSendReceive 
8 rpcrt4.dll NdrpSendReceive 
9 rpcrt4.dll _imp_load__FreeAddrInfoW 

These crashes either have Moz Crash reason MOZ_CRASH(Shutdown too long, probably frozen, causing a crash.) or MOZ_CRASH(Shutdown hanging before starting.)

Volume isn't terribly high in 67 - under 200 crashes so far. Affects 68/69 as well.

Some of the stacks here have widget/windows/AudioSession.cpp:282 stuff in them. Maybe jim would know better where to bucket this one? Volume isn't super high on release.

Flags: needinfo?(jmathies)
Component: Untriaged → Widget: Win32
Flags: needinfo?(jmathies)
Product: Firefox → Core
Priority: -- → P2
QA Whiteboard: qa-not-actionable

This is spiking again along with bug 1755700.

Severity: critical → S2
Assignee: nobody → davidp99

The bug is linked to a topcrash signature, which matches the following criterion:

  • Top 20 desktop browser crashes on release (startup)

For more information, please visit BugBot documentation.

Based on the topcrash criteria, the crash signature linked to this bug is not a topcrash signature anymore.

For more information, please visit BugBot documentation.

:handyman is well aware, but for others' reference: these crashes are largely or entirely arising from within "FreeAudioSession" when it releases the IAgileReference for the CAudioSessionControl. This is probably not distinct from bug 1794799.

I suspect this won't be fixable without a rearchitecture of AudioSession to allow for proper asynchronous disposal.

See Also: → 1794799

(In reply to Ray Kraesig [:rkraesig] from comment #6)

:handyman is well aware, but for others' reference: these crashes are largely or entirely arising from within "FreeAudioSession" when it releases the IAgileReference for the CAudioSessionControl. This is probably not distinct from bug 1794799.

I suspect this won't be fixable without a rearchitecture of AudioSession to allow for proper asynchronous disposal.

As a hack, I wonder if we could just forget and leak the mAudioSessionControl here if

  • we are the parent
  • we are not in a leak checking build
  • we are AppShutdown::IsInOrBeyond(ShutdownPhase::AppShutdownConfirmed)

assuming that the process exit will make the OS cleanup everything.

(In reply to Jens Stutte [:jstutte] from comment #7)

As a hack, I wonder if we could just forget and leak the mAudioSessionControl here if

That's a good idea, although it's not uncommon for Windows COM services to crash atexit when there are leaks, so it may just move the crash.

I'm currently playing with two patches to experiment with this:

  1. Makes the AudioSession a proper XPCOM service instead of the static object it is. The hope is that the service can cleanly uninit earlier than we do with AudioSession, hopefully concurrent with other shutdown actions, so we won't time out.
  2. Just moves the StartAudioSession/StopAudioSession calls to nsLayoutStatics so that it can (I think) assume that cubeb isn't playing anything -- meaning that shutdown doesn't have to synchronize with the main thread, apart from its role as STA. This could also end up just moving the crash.

This allows us to shut down the IAudioSessionManager after Cubeb has stopped,
so there shouldn't be STA apartment deadlock issues with it.
IAudioSessionManager's shutdown always requires the STA. This patch eliminates
MTA threads involvement at shutdown, which was previously needed for thread
safety. We keep other operations on MTA (background) threads, for performance.

This also removes a lot of unused functionality. AudioSession hasn't been
needed outside of the parent process since audio remoting landed.

Severity: S2 → S3
Pushed by sstanca@mozilla.com: https://github.com/mozilla-firefox/firefox/commit/90d9be235e2c https://hg.mozilla.org/integration/autoland/rev/08cf64b4e65b Revert "Bug 1551856: Move AudioSession to LayoutStatics r=mccr8,win-reviewers,gstoll" for causing gtests assertion failures in AudioSession.cpp.

Reverted this because it was causing gtests assertion failures in AudioSession.cpp.

Flags: needinfo?(davidp99)
Flags: needinfo?(davidp99)
Status: NEW → RESOLVED
Closed: 8 months ago
Resolution: --- → FIXED
Target Milestone: --- → 141 Branch

The patch landed in nightly and beta is affected.
:handyman, is this bug important enough to require an uplift?

For more information, please visit BugBot documentation.

Flags: needinfo?(davidp99)

No uplift needed.

Flags: needinfo?(davidp99)
Regressions: 1973493
QA Whiteboard: qa-not-actionable → [qa-triage-done-c142/b141] qa-not-actionable

(In reply to Pulsebot from comment #11)

Pushed by sstanca@mozilla.com:
https://github.com/mozilla-firefox/firefox/commit/90d9be235e2c
https://hg.mozilla.org/integration/autoland/rev/08cf64b4e65b
Revert "Bug 1551856: Move AudioSession to LayoutStatics
r=mccr8,win-reviewers,gstoll" for causing gtests assertion failures in
AudioSession.cpp.

Perfherder has detected a browsertime performance change from push 08cf64b4e65bacbdc488fc6ee0e5acea61ff8d19. As author of one of the patches included in that push, we need your help to address this regression.

If you have any questions or need any help with the investigation, please reach out to a performance sheriff. Alternatively, you can find help on Slack by joining #perf-help, and on Matrix you can find help by joining #perftest.

Regressions:

Ratio Test Platform Options Absolute values (old vs new) Performance Profiles
2% speedometer jQuery-TodoMVC/Adding100Items/Async linux1804-64-nightlyasrelease-qr fission webrender 3.24 -> 3.30 Before/After

Details of the alert can be found in the alert summary, including links to graphs and comparisons for each of the affected tests.

If you need the profiling jobs you can trigger them yourself from treeherder job view or ask a performance sheriff to do that for you.

You can run all of these tests on try with ./mach try perf --alert 45585

The following documentation link provides more information about this command.

Keywords: perf-alert

Please nominate this for ESR140 uplift.

Flags: needinfo?(davidp99)

Comment on attachment 9491603 [details]
Bug 1551856: Move AudioSession to LayoutStatics r=#win-reviewers!,mccr8!

ESR Uplift Approval Request

  • If this is not a sec:{high,crit} bug, please state case for ESR consideration: Large reduction in crashes
  • User impact if declined: More crashes, mostly at shutdown
  • Fix Landed on Version: 141
  • Risk to taking this patch: Low
  • Why is the change risky/not risky? (and alternatives if risky): The change is already in the latest release. We do not believe we have seen new crashes.
Flags: needinfo?(davidp99)
Attachment #9491603 - Flags: approval-mozilla-esr140?

Comment on attachment 9491603 [details]
Bug 1551856: Move AudioSession to LayoutStatics r=#win-reviewers!,mccr8!

Approved for 140.2esr.

Attachment #9491603 - Flags: approval-mozilla-esr140? → approval-mozilla-esr140+
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: