Startup Crash in [@ sandbox::InterceptionManager::PatchNtdll] via mozilla::SandboxBroker::LaunchApp
Categories
(Core :: Security: Process Sandboxing, defect, P3)
Tracking
()
People
(Reporter: wsmwk, Unassigned)
References
Details
(Keywords: crash, regression, Whiteboard: [tbird crash])
Crash Data
Attachments
(2 obsolete files)
This occurs in Firefox at a low crash rate, but Thunderbird 91 is much higher, as can be seen in https://crash-stats.mozilla.org/signature/?signature=logging%3A%3ALogMessage%3A%3A~LogMessage&date=%3E%3D2021-04-24T00%3A00%3A00.000Z&date=%3C2021-10-24T23%3A59%3A00.000Z#graphs
Earlist Crash report is : https://crash-stats.mozilla.org/report/index/8ad2318c-a60e-4136-acfd-452270210903 tb91.0.3
Reason: EXCEPTION_BREAKPOINT
Top 10 frames of crashing thread:
0 thunderbird.exe logging::LogMessage::~LogMessage security/sandbox/chromium-shim/base/logging.cpp:111
1 thunderbird.exe sandbox::InterceptionManager::PatchNtdll security/sandbox/chromium/sandbox/win/src/interception.cc:392
2 thunderbird.exe sandbox::InterceptionManager::InitializeInterceptions security/sandbox/chromium/sandbox/win/src/interception.cc:150
3 thunderbird.exe sandbox::PolicyBase::SetupAllInterceptions security/sandbox/chromium/sandbox/win/src/sandbox_policy_base.cc:705
4 thunderbird.exe sandbox::PolicyBase::AddTarget security/sandbox/chromium/sandbox/win/src/sandbox_policy_base.cc:524
5 thunderbird.exe sandbox::BrokerServicesBase::SpawnTarget security/sandbox/chromium/sandbox/win/src/broker_services.cc:638
6 xul.dll mozilla::SandboxBroker::LaunchApp security/sandbox/win/src/sandboxbroker/sandboxBroker.cpp:282
7 xul.dll mozilla::ipc::WindowsProcessLauncher::DoLaunch ipc/glue/GeckoChildProcessHost.cpp:1523
8 xul.dll mozilla::ipc::BaseProcessLauncher::PerformAsyncLaunch ipc/glue/GeckoChildProcessHost.cpp:1012
9 xul.dll mozilla::detail::ProxyRunnable<mozilla::MozPromise<mozilla::ipc::LaunchResults, mozilla::ipc::LaunchError, 0>, RefPtr<mozilla::MozPromise<mozilla::ipc::LaunchResults, mozilla::ipc::LaunchError, 0> > xpcom/threads/MozPromise.h:1536
More recent Thunderbird crash bp-0b29108e-c947-4a73-89b4-7d8f50211023
Comment 2•4 years ago
|
||
Not sure why the crash reason isn't shown, it usually does for MOZ_CRASH: "Hit fatal chromium sandbox condition."
https://searchfox.org/mozilla-central/rev/ff5309e67b20a9976ea762f9ec5f657da0801490/security/sandbox/chromium-shim/base/logging.cpp#111
Not too much to go on in the crash report.
| Reporter | ||
Comment 3•4 years ago
|
||
#154, so No longer a topcrash
| Reporter | ||
Comment 4•3 years ago
|
||
Recent Firefox example bp-d78d7c30-8c0d-41ec-bb00-7ca7d0221231
Recent Thunderbird example bp-91308484-a7e1-4cb1-9d18-e157e0221226
Updated•3 years ago
|
Comment 5•2 years ago
•
|
||
These crashes are failure to reserve and/or commit memory in the child process from code running in the main process in sandbox::InterceptionManager::PatchNtdll. Most of the crashes show a memory exhaustion condition on the system, with only few megabytes left of page file and/or physical memory.
// Reserve a full 64k memory range in the child process.
HANDLE child = child_->Process();
BYTE* thunk_base = reinterpret_cast<BYTE*>(::VirtualAllocEx(
child, nullptr, kAllocGranularity, MEM_RESERVE, PAGE_NOACCESS));
// Find an aligned, random location within the reserved range.
size_t thunk_bytes =
interceptions_.size() * sizeof(ThunkData) + sizeof(DllInterceptionData);
size_t thunk_offset = internal::GetGranularAlignedRandomOffset(thunk_bytes);
// Split the base and offset along page boundaries.
thunk_base += thunk_offset & ~(kPageSize - 1);
thunk_offset &= kPageSize - 1;
// Make an aligned, padded allocation, and move the pointer to our chunk.
size_t thunk_bytes_padded = (thunk_bytes + kPageSize - 1) & ~(kPageSize - 1);
thunk_base = reinterpret_cast<BYTE*>(
::VirtualAllocEx(child, thunk_base, thunk_bytes_padded, MEM_COMMIT,
PAGE_EXECUTE_READWRITE));
CHECK(thunk_base); // If this fails we'd crash anyway on an invalid access.
We could potentially recycle the weird trick here and sleep and retry (bug 1716727) to try to save the main process.
Updated•2 years ago
|
Updated•2 years ago
|
Comment 6•2 years ago
|
||
The bug is linked to a topcrash signature, which matches the following criterion:
- Top 20 desktop browser crashes on release (startup)
For more information, please visit BugBot documentation.
Comment 7•1 year ago
|
||
This is pretty high on the parent process topcrash list. Request re-review of the severity. I think we may want to prioritize comment 5 as well if we think it can help mitigate this.
Comment 8•1 year ago
|
||
It's an OOM so it's only partially actionable. I don't think it should be a priority.
Comment 9•1 year ago
|
||
Based on the topcrash criteria, the crash signature linked to this bug is not a topcrash signature anymore.
For more information, please visit BugBot documentation.
| Reporter | ||
Comment 10•3 months ago
|
||
Toward the end of July, the Firefox crash rate doubles.
https://crash-stats.mozilla.org/signature/?product=Firefox&signature=logging%3A%3ALogMessage%3A%3A~LogMessage&date=%3E%3D2025-04-04T02%3A46%3A00.000Z&date=%3C2025-10-04T02%3A46%3A00.000Z#graphs
Comment 11•3 months ago
|
||
I assume the assessment from comment 5 is still valid?
| Reporter | ||
Comment 12•3 months ago
|
||
Curiously, the initial bump is ~July 24:
Comment 13•3 months ago
|
||
(In reply to Jens Stutte [:jstutte] from comment #11)
I assume the assessment from comment 5 is still valid?
Thanks for the ping! While the OOM volume is still present as background volume, there is new volume with startup crashes from people that don't show symptoms of memory exhaustion (examples 1, 2, 3). So a new, serious bug seems to have appeared within this old signature. I'll take a look.
Updated•3 months ago
|
Comment 14•3 months ago
|
||
Regarding the code shared in comment 5 -- despite the comment next to it, I don't think that a CHECK is required here. It should be safe to just return SBOX_ERROR_CANNOT_WRITE_INTERCEPTION_THUNK instead of crashing. This would cause a failure to start the child process, rather than a parent process crash.
My only concern with that would be: do we monitor process launch failures enough, e.g. would we have been able to catch bug 1992678 if this CHECK was just a return SBOX_ERROR_CANNOT_WRITE_INTERCEPTION_THUNK;?
| Comment hidden (obsolete) |
Updated•2 months ago
|
| Comment hidden (obsolete) |
Updated•2 months ago
|
Updated•2 months ago
|
Comment 17•2 months ago
|
||
I got my bug IDs mixed up, sorry for the noise.
Updated•2 months ago
|
Comment 18•2 months ago
|
||
Reflecting the signature change after bug 1998112. We will no longer be mixing all chromium sandbox CHECK failures in the same signature.
Updated•2 months ago
|
Description
•