Closed Bug 1842368 Opened 2 years ago Closed 2 years ago

Crash in [@ mozilla::nt::PEHeaders::FindResourceEntry]

Categories

(Toolkit :: Telemetry, defect, P1)

defect

Tracking

()

VERIFIED FIXED
117 Branch
Tracking Status
firefox-esr102 --- unaffected
firefox-esr115 115+ verified
firefox115 + verified
firefox116 + verified
firefox117 + verified

People

(Reporter: yannis, Assigned: yannis)

References

(Regression)

Details

(Keywords: regression, Whiteboard: [win:stability])

Crash Data

Attachments

(1 file)

When we send an untrusted modules ping, we sometimes crash with the following call stack:

00 xul!mozilla::nt::PEHeaders::FindResourceEntry+0x4 [/builds/worker/workspace/obj-build/dist/include/mozilla/NativeNt.h @ 972]
01 xul!mozilla::nt::PEHeaders::FindResourceLeaf+0x35 [/builds/worker/workspace/obj-build/dist/include/mozilla/NativeNt.h @ 824]
02 xul!mozilla::nt::PEHeaders::GetVersionInfo+0x35 [/builds/worker/workspace/obj-build/dist/include/mozilla/NativeNt.h @ 705]
03 xul!AddSharedLibraryFromModuleInfo+0x415 [/builds/worker/checkouts/gecko/tools/profiler/core/shared-libraries-win32.cc @ 129] 
04 xul!SharedLibraryInfo::GetInfoFromPath+0x33 [/builds/worker/checkouts/gecko/tools/profiler/core/shared-libraries-win32.cc @ 162]
05 xul!mozilla::Telemetry::SerializeModule+0x535 [/builds/worker/checkouts/gecko/toolkit/components/telemetry/other/UntrustedModulesDataSerializer.cpp @ 177] 
06 xul!mozilla::Telemetry::UntrustedModulesDataSerializer::AddSingleData::<lambda_1>::operator()+0x55c [/builds/worker/checkouts/gecko/toolkit/components/telemetry/other/UntrustedModulesDataSerializer.cpp @ 452] 
07 xul!nsBaseHashtable<nsStringHashKey,unsigned int,unsigned int,nsDefaultConverter<unsigned int,unsigned int> >::WithEntryHandle<`lambda at /builds/worker/checkouts/gecko/toolkit/components/telemetry/other/UntrustedModulesDataSerializer.cpp:447:52'>::<lambda_1>::operator()+0x577 [/builds/worker/workspace/obj-build/dist/include/nsBaseHashtable.h @ 836]
08 xul!nsTHashtable<nsBaseHashtableET<nsStringHashKey,unsigned int> >::WithEntryHandle<`lambda at /builds/worker/workspace/obj-build/dist/include/nsBaseHashtable.h:835:15'>::<lambda_1>::operator()+0x58a [/builds/worker/workspace/obj-build/dist/include/nsTHashtable.h @ 437]
09 xul!PLDHashTable::WithEntryHandle+0x598 [/builds/worker/workspace/obj-build/dist/include/PLDHashTable.h @ 605] 
0a xul!nsTHashtable<nsBaseHashtableET<nsStringHashKey,unsigned int> >::WithEntryHandle+0x598 [/builds/worker/workspace/obj-build/dist/include/nsTHashtable.h @ 434] 
0b xul!nsBaseHashtable<nsStringHashKey,unsigned int,unsigned int,nsDefaultConverter<unsigned int,unsigned int> >::WithEntryHandle+0x598 [/builds/worker/workspace/obj-build/dist/include/nsBaseHashtable.h @ 834] 
0c xul!mozilla::Telemetry::UntrustedModulesDataSerializer::AddSingleData+0x675 [/builds/worker/checkouts/gecko/toolkit/components/telemetry/other/UntrustedModulesDataSerializer.cpp @ 447] 
0d xul!mozilla::Telemetry::UntrustedModulesDataSerializer::Add+0xa8 [/builds/worker/checkouts/gecko/toolkit/components/telemetry/other/UntrustedModulesDataSerializer.cpp @ 577] 
0e xul!mozilla::Telemetry::MultiGetUntrustedModulesData::Serialize+0xa3 [/builds/worker/checkouts/gecko/toolkit/components/telemetry/other/UntrustedModules.cpp @ 228] 
0f xul!mozilla::Telemetry::GetUntrustedModuleLoadEvents::<lambda_1>::operator()+0x9 [/builds/worker/checkouts/gecko/toolkit/components/telemetry/other/UntrustedModules.cpp @ 285] 
10 xul!mozilla::MozPromise<bool,nsresult,1>::InvokeMethod+0x9 [/builds/worker/workspace/obj-build/dist/include/mozilla/MozPromise.h @ 654] 
11 xul!mozilla::MozPromise<bool,nsresult,1>::InvokeCallbackMethod+0x9 [/builds/worker/workspace/obj-build/dist/include/mozilla/MozPromise.h @ 685] 
12 xul!mozilla::MozPromise<bool,nsresult,1>::ThenValue<`lambda at /builds/worker/checkouts/gecko/toolkit/components/telemetry/other/UntrustedModules.cpp:285:7',`lambda at /builds/worker/checkouts/gecko/toolkit/components/telemetry/other/UntrustedModules.cpp:286:7'>::DoResolveOrRejectInternal+0x33 [/builds/worker/workspace/obj-build/dist/include/mozilla/MozPromise.h @ 870] 
13 xul!mozilla::MozPromise<bool,nsresult,1>::ThenValueBase::DoResolveOrReject+0x22 [/builds/worker/workspace/obj-build/dist/include/mozilla/MozPromise.h @ 623] 
14 xul!mozilla::MozPromise<bool,nsresult,1>::ThenValueBase::ResolveOrRejectRunnable::Run+0x54 [/builds/worker/workspace/obj-build/dist/include/mozilla/MozPromise.h @ 491] 
15 xul!mozilla::RunnableTask::Run+0x11 [/builds/worker/checkouts/gecko/xpcom/threads/TaskController.cpp @ 555] 
16 xul!mozilla::TaskController::DoExecuteNextTaskOnlyMainThreadInternal+0x7a5 [/builds/worker/checkouts/gecko/xpcom/threads/TaskController.cpp @ 879] 
17 xul!mozilla::TaskController::ExecuteNextTaskOnlyMainThreadInternal+0xb [/builds/worker/checkouts/gecko/xpcom/threads/TaskController.cpp @ 702]
18 xul!mozilla::TaskController::ProcessPendingMTTask+0x17 [/builds/worker/checkouts/gecko/xpcom/threads/TaskController.cpp @ 491] 
19 xul!mozilla::TaskController::TaskController::<lambda_4>::operator()+0x23 [/builds/worker/checkouts/gecko/xpcom/threads/TaskController.cpp @ 218] 
1a xul!mozilla::detail::RunnableFunction<`lambda at /builds/worker/checkouts/gecko/xpcom/threads/TaskController.cpp:218:7'>::Run+0x4a [/builds/worker/checkouts/gecko/xpcom/threads/nsThreadUtils.h @ 548] 
1b xul!nsThread::ProcessNextEvent+0xb53 [/builds/worker/checkouts/gecko/xpcom/threads/nsThread.cpp @ 1240] 
1c xul!NS_ProcessNextEvent+0xbab [/builds/worker/checkouts/gecko/xpcom/threads/nsThreadUtils.cpp @ 479] 
1d xul!mozilla::ipc::MessagePump::Run+0x25f [/builds/worker/checkouts/gecko/ipc/glue/MessagePump.cpp @ 85] 
1e xul!MessageLoop::RunInternal+0x16 [/builds/worker/checkouts/gecko/ipc/chromium/src/base/message_loop.cc @ 368] 
1f xul!MessageLoop::RunHandler+0x2f [/builds/worker/checkouts/gecko/ipc/chromium/src/base/message_loop.cc @ 362] 
20 xul!MessageLoop::Run+0x4e [/builds/worker/checkouts/gecko/ipc/chromium/src/base/message_loop.cc @ 344] 
21 xul!nsBaseAppShell::Run+0x28 [/builds/worker/checkouts/gecko/widget/nsBaseAppShell.cpp @ 150] 
22 xul!nsAppShell::Run+0x3a [/builds/worker/checkouts/gecko/widget/windows/nsAppShell.cpp @ 615] 
23 xul!nsAppStartup::Run+0x41 [/builds/worker/checkouts/gecko/toolkit/components/startup/nsAppStartup.cpp @ 296] 
24 xul!XREMain::XRE_mainRun+0xc12 [/builds/worker/checkouts/gecko/toolkit/xre/nsAppRunner.cpp @ 5659]
25 xul!XREMain::XRE_main+0x323 [/builds/worker/checkouts/gecko/toolkit/xre/nsAppRunner.cpp @ 5859] 
26 xul!XRE_main+0x6b [/builds/worker/checkouts/gecko/toolkit/xre/nsAppRunner.cpp @ 5915] 
27 firefox!do_main+0xc6 [/builds/worker/checkouts/gecko/browser/app/nsBrowserApp.cpp @ 227] 
28 firefox!NS_internal_main+0x497 [/builds/worker/checkouts/gecko/browser/app/nsBrowserApp.cpp @ 445] 
29 firefox!wmain+0x729 [/builds/worker/checkouts/gecko/toolkit/xre/nsWindowsWMain.cpp @ 167] 
2a firefox!invoke_main+0x22 [D:\a\_work\1\s\src\vctools\crt\vcstartup\src\startup\exe_common.inl @ 90] 
2b firefox!__scrt_common_main_seh+0x10c [D:\a\_work\1\s\src\vctools\crt\vcstartup\src\startup\exe_common.inl @ 288]
2c kernel32!BaseThreadInitThunk+0xd
2d ntdll!RtlUserThreadStart+0x1d

If I am correct, this is especially likely to occur when sending pings for a module that is blocked in the main process but not in other processes. So, this crash could become dangerously high volume after bug 1837242, where this is exactly what we did.

The technical reason for the crash would be that when we map the module to send the ping, we use LOAD_LIBRARY_AS_DATAFILE | LOAD_LIBRARY_AS_IMAGE_RESOURCE instead of just LOAD_LIBRARY_AS_IMAGE_RESOURCE. We thus let Windows decide between the two options: loading as a data file (RVA computations will not work), or as an image file (RVA computations will work). In fact our code is designed in a way that will only work if the module is loaded as an image file because we do RVA computations, e.g. in GetImageDirectoryEntry.

When a module is already loaded in the main process, it is expected that Windows will choose to recycle the existing mapping and thus map it as an image file, and we will not experience a crash. However, if a module is blocked in the main process but not in child processes, Windows is likely to choose to map it as a data file since it is not already loaded as an image file. This would make our RVA computations wrong and can likely result in this crash.

See Also: → 1837242
Severity: -- → S2
Priority: -- → P1

:gstoll, since you are the author of the regressor, bug 1536901, could you take a look?

For more information, please visit BugBot documentation.

Flags: needinfo?(gstoll)

When we prepare an untrusted modules ping, we currently let Windows
choose between loading the module as a data file or as an image file.
However our code relies on the module being loaded as an image file,
because we do RVA computations. We must use the proper flag
LOAD_LIBRARY_AS_IMAGE_RESOURCE alone, which guarantees what we want:
loading as an image file, but not for execution.

Assignee: nobody → yjuglaret
Status: NEW → ASSIGNED
Pushed by yjuglaret@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/c94d7d0dccef Force loading modules as image files for untrusted modules ping. r=gstoll

Comment on attachment 9342896 [details]
Bug 1842368 - Force loading modules as image files for untrusted modules ping. r=gstoll,handyman,mhowell

Beta/Release Uplift Approval Request

  • User impact if declined: Crash spike of unpredictable volume due to untrusted module pings for modules that are loaded only in child processes. The volume could presumably be high due to the way we mitigated bug 1837242 - where we blocked a DLL only in the main process.
  • Is this code covered by automated tests?: Unknown
  • Has the fix been verified in Nightly?: No
  • Needs manual test from QE?: Yes
  • If yes, steps to reproduce: - With custom executable TestHooks.exe (please ask)
  • On Windows 7 64-bit
  • Launch TestHooks.exe
  • Launch Firefox
  • Open about:third-party
  • Firefox should not crash
  • List of other uplifts needed: None
  • Risk to taking this patch: Low
  • Why is the change risky/not risky? (and alternatives if risky): This change restricts the options Windows has when mapping a module. Windows used to have two options and will now only have one (the correct one). This should enforce the correct behavior.
  • String changes made/needed:
  • Is Android affected?: No

ESR Uplift Approval Request

  • If this is not a sec:{high,crit} bug, please state case for ESR consideration: This is especially important for Windows 7 users who will be redirected to ESR 115 after end of support.
  • User impact if declined: Crash spike of unpredictable volume due to untrusted module pings for modules that are loaded only in child processes. The volume could presumably be high due to the way we mitigated bug 1837242 - where we blocked a DLL only in the main process.
  • Fix Landed on Version: 115
  • Risk to taking this patch: Low
  • Why is the change risky/not risky? (and alternatives if risky): This change restricts the options Windows has when mapping a module. Windows used to have two options and will now only have one (the correct one). This should enforce the correct behavior.
Attachment #9342896 - Flags: approval-mozilla-release?
Attachment #9342896 - Flags: approval-mozilla-esr115?
Attachment #9342896 - Flags: approval-mozilla-beta?
Flags: qe-verify+
Status: ASSIGNED → RESOLVED
Closed: 2 years ago
Resolution: --- → FIXED
Target Milestone: --- → 117 Branch
Pushed by ryanvm@gmail.com: https://hg.mozilla.org/integration/autoland/rev/087a1c95dc7c Enable test_ThirdPartyModulesPing.js on all branches. r=me

Comment on attachment 9342896 [details]
Bug 1842368 - Force loading modules as image files for untrusted modules ping. r=gstoll,handyman,mhowell

Approved for 116.0b3

Attachment #9342896 - Flags: approval-mozilla-beta? → approval-mozilla-beta+
QA Whiteboard: [qa-triaged]
Flags: needinfo?(gstoll)

Comment on attachment 9342896 [details]
Bug 1842368 - Force loading modules as image files for untrusted modules ping. r=gstoll,handyman,mhowell

Approved for 115.0.2
Approved for 115.0.2esr

Attachment #9342896 - Flags: approval-mozilla-release?
Attachment #9342896 - Flags: approval-mozilla-release+
Attachment #9342896 - Flags: approval-mozilla-esr115?
Attachment #9342896 - Flags: approval-mozilla-esr115+

Quick question @Yannis @Dianna how can we verify the fix here if the users are not allowed to install or update Fireox later than 115 on Windows 7 ? is there a command we can run to force the install for 116 or 117?

We were able to reproduce the crash in 115 but we cant verify the fix in 116 or 117.

We were able to Verify the fix in 115.0.2 as well as 115.0.2esr. I will update the flags.

QA Whiteboard: [qa-triaged]
Flags: qe-verify+
Flags: needinfo?(yjuglaret)
Flags: needinfo?(dsmith)

We were able to Reproduce the crash with Zipped builds and we could Verify the fix for 116 and 117 using zipped builds on Windows 7.
We also tested 116 and 117 on a Windows 10 machine to make sure the crash was not introduced there with this fix, and the issue does not happen on Windows 10.

Status: RESOLVED → VERIFIED
Flags: needinfo?(yjuglaret)
Flags: needinfo?(dsmith)

To clarify the surface of this crash: Windows 7 users could experience crashes while Firefox was gathering data about blocked DLL load attempts. If the DLL was forbidden to load in the main process by the blocklist, Firefox would load it as a data file rather than as an image, thereby breaking our assumptions for parsing its contents, which in turn could result in a crash.

You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: