Open Bug 1936749 Opened 1 year ago Updated 1 year ago

Crash in [@ CxxThrowException | MpDetoursHookAll]

Categories

(External Software Affecting Firefox :: Other, defect)

Unspecified
Windows
defect

Tracking

(firefox-esr128 fixed, firefox135 wontfix, firefox136 fixed, firefox137 fixed)

REOPENED
137 Branch
Tracking Status
firefox-esr128 --- fixed
firefox135 --- wontfix
firefox136 --- fixed
firefox137 --- fixed

People

(Reporter: gsvelto, Assigned: yannis)

Details

(Keywords: crash)

Crash Data

Attachments

(3 files)

Crash report: https://crash-stats.mozilla.org/report/index/37858fa9-0525-4fae-b5ac-8ad400241210

Reason:

Unhandled C++ Exception

Top 8 frames:

0  KERNELBASE.dll  RaiseException
1  MpDetours.dll  CxxThrowException
2  MpDetours.dll  MpDetoursHookAll(bool)
3  MpDetours.dll  MpDetoursHookThreadProc(void*)
4  kernel32.dll  BaseThreadInitThunk
5  mozglue.dll  mozilla::interceptor::FuncHook<mozilla::interceptor::WindowsDllInterceptor<mo...  toolkit/xre/dllservices/mozglue/nsWindowsDllInterceptor.h:150
5  mozglue.dll  patched_BaseThreadInitThunk(int, void*, void*)  toolkit/xre/dllservices/mozglue/WindowsDllBlocklist.cpp:562
6  ntdll.dll  RtlUserThreadStart

This innocuous-looking crash on Socorro is the highest volume non-main process crash in the crash ping data, with 1951 crashes affecting 1287 users over a week on versions 133-135 of Firefox.

The severity field is not set for this bug.
:gstoll, could you have a look please?

For more information, please visit BugBot documentation.

Flags: needinfo?(gstoll)
Severity: -- → S3
Flags: needinfo?(gstoll)

Last error was ERROR_DYNAMIC_CODE_BLOCKED

We only have a handful of socorro crashes here, but there are currently ~3700 crashes listed with ~2100 clients in the last 7 days -- the #2 crasher after OOM small! This also implies people hit it more than once (and may well switch browsers afterwards). What can we do to look into this?

Flags: needinfo?(gsvelto)
Flags: needinfo?(gpascutto)

MpDetours appears to be a Microsoft Windows Defender dll

Flags: needinfo?(gpascutto)
Flags: needinfo?(gsvelto)

The high crash volume was already identified in the first comment. I'm not sure why it was marked S3, Greg?

Flags: needinfo?(gstoll)

Pouring through the crash ping data, this is socket process only, >50% of the crash volume for that process.
1 56.7% CxxThrowException | MpDetoursHookAll

the #2 crasher after OOM small! This also implies people hit it more than once (and may well switch browsers afterwards)

See above - not sure how disruptive this actually is! Yay for separating things out into processes.

Sorry, I must have missed that comment. I'll take a look.

Flags: needinfo?(gstoll)

Analyzing MpDetoursHook.dll's MpDetoursHookAll shows two conditions under which we can end up in this CxxThrowException code path:
(1) reaching MpDetoursHookAll while another thread is already running it;
(2) if a call to VirtualProtect(..., PAGE_EXECUTE_READWRITE, ...) fails.

(1) does not match with what we see in the crash reports. For example we don't have multiple threads running MpDetoursHook.dll in the crash dumps. However (2) makes a lot of sense.

In security/sandbox/win/src/sandboxbroker/sandboxBroker.cpp, we can see that MITIGATION_DYNAMIC_CODE_DISABLE is used in socket, RDD and utility processes. This matches precisely with the process types we find on crash-stats. The mitigation will make the call in (2) fail, by design. Also matches with comment 2.

I'll write to our contacts our Microsoft to ask if they could check if the mitigation is active and fail silently rather than raise an exception in this case.

I'm not sure this is related to Windows Defender - all the crash reports I see now on crash-stats have QIPCAP64.dll loaded, whose publisher is Forcepoint LLC.

Looking at the 32 crashes that reached crash-stats on this signature in the last six months gives these numbers of occurences for third-party DLLs compared to MpDetours.dll:

amapphook.dll: 1 (Ivanti, Inc.)
amldrappinit.dll: 1 (Ivanti, Inc.)
igd10iumd64.dll: 1 (IntelVPGSigning2021)
igdgmm64.dll: 1 (IntelVPGSigning2021)
pghook.dll: 2 (BeyondTrust Corporation)
qipcap64.dll: 4 (Forcepoint LLC)
cyinjct.dll: 5 (Palo Alto Networks (Netherlands) B.V.)
cyvera.dll: 5 (Palo Alto Networks (Netherlands) B.V.)
cyvrtrap.dll: 5 (Palo Alto Networks (Netherlands) B.V.)
ntnativeapi.dll: 5 (Palo Alto Networks (Netherlands) B.V.)
mpdetours.dll: 32
Assignee: nobody → yjuglaret
Status: NEW → ASSIGNED
Pushed by yjuglaret@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/391773dafbd6 Disable ACG if we detect MpDetours.dll injection. r=bobowen
Status: ASSIGNED → RESOLVED
Closed: 1 year ago
Resolution: --- → FIXED
Target Milestone: --- → 137 Branch

Increasing severity based on comment 0 and comment 2.

Severity: S3 → S2
Attachment #9465645 - Flags: approval-mozilla-beta?

beta Uplift Approval Request

  • User impact if declined: Repeated crashes of socket/rdd/utility processes for some Windows Defender users
  • Code covered by automated testing: no
  • Fix verified in Nightly: no
  • Needs manual QE test: no
  • Steps to reproduce for manual QE testing: Unclear
  • Risk associated with taking this patch: Low
  • Explanation of risk level: The patch disables a security mitigation if we detect injection of the faulty DLL. This should only impact the affected users and can only improve the situation.
  • String changes made/needed: None
  • Is Android affected?: no

I'm thinking this probably doesn't need to be rushed into next week's Fx135 planned dot release, but LMK if you feel strongly otherwise. Also, I'm guessing we'll probably want to take this on ESR128 also?

Attachment #9465645 - Flags: approval-mozilla-beta? → approval-mozilla-beta+

(In reply to Ryan VanderMeulen [:RyanVM] from comment #16)

Flags: needinfo?(yjuglaret)

Beta uplift should be enough. I'll push a request for ESR.

Attachment #9466150 - Flags: approval-mozilla-esr128?

esr128 Uplift Approval Request

  • User impact if declined: Repeated crashes of socket/rdd/utility processes for some Windows Defender users
  • Code covered by automated testing: no
  • Fix verified in Nightly: no
  • Needs manual QE test: no
  • Steps to reproduce for manual QE testing: Unclear
  • Risk associated with taking this patch: Low
  • Explanation of risk level: The patch disables a security mitigation if we detect injection of the faulty DLL. This should only impact the affected users and can only improve the situation.
  • String changes made/needed: None
  • Is Android affected?: no
Attachment #9466150 - Flags: approval-mozilla-esr128? → approval-mozilla-esr128+
Flags: needinfo?(yjuglaret)

Early crash ping telemetry shows this crash is now only <1% of the socket process crash volume, suggesting that our patch has almost completely mitigated the issue. Reopening with lower severity still for tracking purposes, since it is present despite low volume and we are hoping that Microsoft will acknowledge and fix this issue.

Severity: S2 → S3
Status: RESOLVED → REOPENED
Resolution: FIXED → ---

Based on the topcrash criteria, the crash signature linked to this bug is not a topcrash signature anymore.

For more information, please visit BugBot documentation.

Keywords: topcrash
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: