Crash in [@ mozilla::nt::PEHeaders::FindResourceEntry]
Categories
(Toolkit :: Telemetry, defect, P1)
Tracking
()
People
(Reporter: yannis, Assigned: yannis)
References
(Regression)
Details
(Keywords: regression, Whiteboard: [win:stability])
Crash Data
Attachments
(1 file)
48 bytes,
text/x-phabricator-request
|
diannaS
:
approval-mozilla-beta+
dmeehan
:
approval-mozilla-release+
dmeehan
:
approval-mozilla-esr115+
|
Details | Review |
When we send an untrusted modules ping, we sometimes crash with the following call stack:
00 xul!mozilla::nt::PEHeaders::FindResourceEntry+0x4 [/builds/worker/workspace/obj-build/dist/include/mozilla/NativeNt.h @ 972]
01 xul!mozilla::nt::PEHeaders::FindResourceLeaf+0x35 [/builds/worker/workspace/obj-build/dist/include/mozilla/NativeNt.h @ 824]
02 xul!mozilla::nt::PEHeaders::GetVersionInfo+0x35 [/builds/worker/workspace/obj-build/dist/include/mozilla/NativeNt.h @ 705]
03 xul!AddSharedLibraryFromModuleInfo+0x415 [/builds/worker/checkouts/gecko/tools/profiler/core/shared-libraries-win32.cc @ 129]
04 xul!SharedLibraryInfo::GetInfoFromPath+0x33 [/builds/worker/checkouts/gecko/tools/profiler/core/shared-libraries-win32.cc @ 162]
05 xul!mozilla::Telemetry::SerializeModule+0x535 [/builds/worker/checkouts/gecko/toolkit/components/telemetry/other/UntrustedModulesDataSerializer.cpp @ 177]
06 xul!mozilla::Telemetry::UntrustedModulesDataSerializer::AddSingleData::<lambda_1>::operator()+0x55c [/builds/worker/checkouts/gecko/toolkit/components/telemetry/other/UntrustedModulesDataSerializer.cpp @ 452]
07 xul!nsBaseHashtable<nsStringHashKey,unsigned int,unsigned int,nsDefaultConverter<unsigned int,unsigned int> >::WithEntryHandle<`lambda at /builds/worker/checkouts/gecko/toolkit/components/telemetry/other/UntrustedModulesDataSerializer.cpp:447:52'>::<lambda_1>::operator()+0x577 [/builds/worker/workspace/obj-build/dist/include/nsBaseHashtable.h @ 836]
08 xul!nsTHashtable<nsBaseHashtableET<nsStringHashKey,unsigned int> >::WithEntryHandle<`lambda at /builds/worker/workspace/obj-build/dist/include/nsBaseHashtable.h:835:15'>::<lambda_1>::operator()+0x58a [/builds/worker/workspace/obj-build/dist/include/nsTHashtable.h @ 437]
09 xul!PLDHashTable::WithEntryHandle+0x598 [/builds/worker/workspace/obj-build/dist/include/PLDHashTable.h @ 605]
0a xul!nsTHashtable<nsBaseHashtableET<nsStringHashKey,unsigned int> >::WithEntryHandle+0x598 [/builds/worker/workspace/obj-build/dist/include/nsTHashtable.h @ 434]
0b xul!nsBaseHashtable<nsStringHashKey,unsigned int,unsigned int,nsDefaultConverter<unsigned int,unsigned int> >::WithEntryHandle+0x598 [/builds/worker/workspace/obj-build/dist/include/nsBaseHashtable.h @ 834]
0c xul!mozilla::Telemetry::UntrustedModulesDataSerializer::AddSingleData+0x675 [/builds/worker/checkouts/gecko/toolkit/components/telemetry/other/UntrustedModulesDataSerializer.cpp @ 447]
0d xul!mozilla::Telemetry::UntrustedModulesDataSerializer::Add+0xa8 [/builds/worker/checkouts/gecko/toolkit/components/telemetry/other/UntrustedModulesDataSerializer.cpp @ 577]
0e xul!mozilla::Telemetry::MultiGetUntrustedModulesData::Serialize+0xa3 [/builds/worker/checkouts/gecko/toolkit/components/telemetry/other/UntrustedModules.cpp @ 228]
0f xul!mozilla::Telemetry::GetUntrustedModuleLoadEvents::<lambda_1>::operator()+0x9 [/builds/worker/checkouts/gecko/toolkit/components/telemetry/other/UntrustedModules.cpp @ 285]
10 xul!mozilla::MozPromise<bool,nsresult,1>::InvokeMethod+0x9 [/builds/worker/workspace/obj-build/dist/include/mozilla/MozPromise.h @ 654]
11 xul!mozilla::MozPromise<bool,nsresult,1>::InvokeCallbackMethod+0x9 [/builds/worker/workspace/obj-build/dist/include/mozilla/MozPromise.h @ 685]
12 xul!mozilla::MozPromise<bool,nsresult,1>::ThenValue<`lambda at /builds/worker/checkouts/gecko/toolkit/components/telemetry/other/UntrustedModules.cpp:285:7',`lambda at /builds/worker/checkouts/gecko/toolkit/components/telemetry/other/UntrustedModules.cpp:286:7'>::DoResolveOrRejectInternal+0x33 [/builds/worker/workspace/obj-build/dist/include/mozilla/MozPromise.h @ 870]
13 xul!mozilla::MozPromise<bool,nsresult,1>::ThenValueBase::DoResolveOrReject+0x22 [/builds/worker/workspace/obj-build/dist/include/mozilla/MozPromise.h @ 623]
14 xul!mozilla::MozPromise<bool,nsresult,1>::ThenValueBase::ResolveOrRejectRunnable::Run+0x54 [/builds/worker/workspace/obj-build/dist/include/mozilla/MozPromise.h @ 491]
15 xul!mozilla::RunnableTask::Run+0x11 [/builds/worker/checkouts/gecko/xpcom/threads/TaskController.cpp @ 555]
16 xul!mozilla::TaskController::DoExecuteNextTaskOnlyMainThreadInternal+0x7a5 [/builds/worker/checkouts/gecko/xpcom/threads/TaskController.cpp @ 879]
17 xul!mozilla::TaskController::ExecuteNextTaskOnlyMainThreadInternal+0xb [/builds/worker/checkouts/gecko/xpcom/threads/TaskController.cpp @ 702]
18 xul!mozilla::TaskController::ProcessPendingMTTask+0x17 [/builds/worker/checkouts/gecko/xpcom/threads/TaskController.cpp @ 491]
19 xul!mozilla::TaskController::TaskController::<lambda_4>::operator()+0x23 [/builds/worker/checkouts/gecko/xpcom/threads/TaskController.cpp @ 218]
1a xul!mozilla::detail::RunnableFunction<`lambda at /builds/worker/checkouts/gecko/xpcom/threads/TaskController.cpp:218:7'>::Run+0x4a [/builds/worker/checkouts/gecko/xpcom/threads/nsThreadUtils.h @ 548]
1b xul!nsThread::ProcessNextEvent+0xb53 [/builds/worker/checkouts/gecko/xpcom/threads/nsThread.cpp @ 1240]
1c xul!NS_ProcessNextEvent+0xbab [/builds/worker/checkouts/gecko/xpcom/threads/nsThreadUtils.cpp @ 479]
1d xul!mozilla::ipc::MessagePump::Run+0x25f [/builds/worker/checkouts/gecko/ipc/glue/MessagePump.cpp @ 85]
1e xul!MessageLoop::RunInternal+0x16 [/builds/worker/checkouts/gecko/ipc/chromium/src/base/message_loop.cc @ 368]
1f xul!MessageLoop::RunHandler+0x2f [/builds/worker/checkouts/gecko/ipc/chromium/src/base/message_loop.cc @ 362]
20 xul!MessageLoop::Run+0x4e [/builds/worker/checkouts/gecko/ipc/chromium/src/base/message_loop.cc @ 344]
21 xul!nsBaseAppShell::Run+0x28 [/builds/worker/checkouts/gecko/widget/nsBaseAppShell.cpp @ 150]
22 xul!nsAppShell::Run+0x3a [/builds/worker/checkouts/gecko/widget/windows/nsAppShell.cpp @ 615]
23 xul!nsAppStartup::Run+0x41 [/builds/worker/checkouts/gecko/toolkit/components/startup/nsAppStartup.cpp @ 296]
24 xul!XREMain::XRE_mainRun+0xc12 [/builds/worker/checkouts/gecko/toolkit/xre/nsAppRunner.cpp @ 5659]
25 xul!XREMain::XRE_main+0x323 [/builds/worker/checkouts/gecko/toolkit/xre/nsAppRunner.cpp @ 5859]
26 xul!XRE_main+0x6b [/builds/worker/checkouts/gecko/toolkit/xre/nsAppRunner.cpp @ 5915]
27 firefox!do_main+0xc6 [/builds/worker/checkouts/gecko/browser/app/nsBrowserApp.cpp @ 227]
28 firefox!NS_internal_main+0x497 [/builds/worker/checkouts/gecko/browser/app/nsBrowserApp.cpp @ 445]
29 firefox!wmain+0x729 [/builds/worker/checkouts/gecko/toolkit/xre/nsWindowsWMain.cpp @ 167]
2a firefox!invoke_main+0x22 [D:\a\_work\1\s\src\vctools\crt\vcstartup\src\startup\exe_common.inl @ 90]
2b firefox!__scrt_common_main_seh+0x10c [D:\a\_work\1\s\src\vctools\crt\vcstartup\src\startup\exe_common.inl @ 288]
2c kernel32!BaseThreadInitThunk+0xd
2d ntdll!RtlUserThreadStart+0x1d
If I am correct, this is especially likely to occur when sending pings for a module that is blocked in the main process but not in other processes. So, this crash could become dangerously high volume after bug 1837242, where this is exactly what we did.
The technical reason for the crash would be that when we map the module to send the ping, we use LOAD_LIBRARY_AS_DATAFILE | LOAD_LIBRARY_AS_IMAGE_RESOURCE
instead of just LOAD_LIBRARY_AS_IMAGE_RESOURCE
. We thus let Windows decide between the two options: loading as a data file (RVA computations will not work), or as an image file (RVA computations will work). In fact our code is designed in a way that will only work if the module is loaded as an image file because we do RVA computations, e.g. in GetImageDirectoryEntry.
When a module is already loaded in the main process, it is expected that Windows will choose to recycle the existing mapping and thus map it as an image file, and we will not experience a crash. However, if a module is blocked in the main process but not in child processes, Windows is likely to choose to map it as a data file since it is not already loaded as an image file. This would make our RVA computations wrong and can likely result in this crash.
Updated•2 years ago
|
Updated•2 years ago
|
Comment 1•2 years ago
|
||
:gstoll, since you are the author of the regressor, bug 1536901, could you take a look?
For more information, please visit BugBot documentation.
Assignee | ||
Comment 2•2 years ago
|
||
When we prepare an untrusted modules ping, we currently let Windows
choose between loading the module as a data file or as an image file.
However our code relies on the module being loaded as an image file,
because we do RVA computations. We must use the proper flag
LOAD_LIBRARY_AS_IMAGE_RESOURCE alone, which guarantees what we want:
loading as an image file, but not for execution.
Updated•2 years ago
|
Assignee | ||
Comment 4•2 years ago
•
|
||
Comment on attachment 9342896 [details]
Bug 1842368 - Force loading modules as image files for untrusted modules ping. r=gstoll,handyman,mhowell
Beta/Release Uplift Approval Request
- User impact if declined: Crash spike of unpredictable volume due to untrusted module pings for modules that are loaded only in child processes. The volume could presumably be high due to the way we mitigated bug 1837242 - where we blocked a DLL only in the main process.
- Is this code covered by automated tests?: Unknown
- Has the fix been verified in Nightly?: No
- Needs manual test from QE?: Yes
- If yes, steps to reproduce: - With custom executable TestHooks.exe (please ask)
- On Windows 7 64-bit
- Launch TestHooks.exe
- Launch Firefox
- Open
about:third-party
- Firefox should not crash
- List of other uplifts needed: None
- Risk to taking this patch: Low
- Why is the change risky/not risky? (and alternatives if risky): This change restricts the options Windows has when mapping a module. Windows used to have two options and will now only have one (the correct one). This should enforce the correct behavior.
- String changes made/needed:
- Is Android affected?: No
ESR Uplift Approval Request
- If this is not a sec:{high,crit} bug, please state case for ESR consideration: This is especially important for Windows 7 users who will be redirected to ESR 115 after end of support.
- User impact if declined: Crash spike of unpredictable volume due to untrusted module pings for modules that are loaded only in child processes. The volume could presumably be high due to the way we mitigated bug 1837242 - where we blocked a DLL only in the main process.
- Fix Landed on Version: 115
- Risk to taking this patch: Low
- Why is the change risky/not risky? (and alternatives if risky): This change restricts the options Windows has when mapping a module. Windows used to have two options and will now only have one (the correct one). This should enforce the correct behavior.
Assignee | ||
Updated•2 years ago
|
Comment 5•2 years ago
|
||
bugherder |
Comment 7•2 years ago
|
||
bugherder |
Comment 8•2 years ago
|
||
Comment on attachment 9342896 [details]
Bug 1842368 - Force loading modules as image files for untrusted modules ping. r=gstoll,handyman,mhowell
Approved for 116.0b3
Comment 10•2 years ago
|
||
bugherder uplift |
Updated•2 years ago
|
Updated•2 years ago
|
Comment 11•2 years ago
|
||
Comment on attachment 9342896 [details]
Bug 1842368 - Force loading modules as image files for untrusted modules ping. r=gstoll,handyman,mhowell
Approved for 115.0.2
Approved for 115.0.2esr
Comment 12•2 years ago
|
||
uplift |
Updated•2 years ago
|
Comment 13•2 years ago
|
||
uplift |
Updated•2 years ago
|
Comment 14•2 years ago
|
||
uplift |
Comment 15•2 years ago
|
||
Quick question @Yannis @Dianna how can we verify the fix here if the users are not allowed to install or update Fireox later than 115 on Windows 7 ? is there a command we can run to force the install for 116 or 117?
We were able to reproduce the crash in 115 but we cant verify the fix in 116 or 117.
We were able to Verify the fix in 115.0.2 as well as 115.0.2esr. I will update the flags.
Comment 16•2 years ago
|
||
We were able to Reproduce the crash with Zipped builds and we could Verify the fix for 116 and 117 using zipped builds on Windows 7.
We also tested 116 and 117 on a Windows 10 machine to make sure the crash was not introduced there with this fix, and the issue does not happen on Windows 10.
Assignee | ||
Comment 17•2 years ago
•
|
||
To clarify the surface of this crash: Windows 7 users could experience crashes while Firefox was gathering data about blocked DLL load attempts. If the DLL was forbidden to load in the main process by the blocklist, Firefox would load it as a data file rather than as an image, thereby breaking our assumptions for parsing its contents, which in turn could result in a crash.
Description
•