Closed Bug 1992678 Opened 22 days ago Closed 15 days ago

Startup Crash with Avast (or other antivirus software) in [@ logging::LogMessage::~LogMessage] via sandbox::InterceptionManager with Firefox and Thunderbird

Categories

(External Software Affecting Firefox :: Other, defect)

Firefox 141
Desktop
Windows
defect

Tracking

(firefox-esr115 wontfix, firefox-esr140 wontfix, firefox144 fixed, firefox145 fixed, firefox146 fixed)

RESOLVED FIXED
146 Branch
Tracking Status
firefox-esr115 --- wontfix
firefox-esr140 --- wontfix
firefox144 --- fixed
firefox145 --- fixed
firefox146 --- fixed

People

(Reporter: yannis, Assigned: yannis)

References

Details

(Keywords: crash, regression, Whiteboard: [tbird crash])

Crash Data

Attachments

(4 files, 1 obsolete file)

As initially noticed by :wsmwk in bug 1737467 comment 10, new crash volume has started arriving on this signature, which was known as an OOM crash signature. As shown by bug 1737467 comment 12 (especially graphing crashes per day by major version), this new crash volume seems to have started with version 141.

The correlations suggest that the new volume could be a startup crash with Avast, accounting for two thirds of the crashes that end up on this signature, the remaining third likely being the previously known OOM crash volume:

(82.63% in signature vs 08.03% overall) Module "aswJsFlt.dll" = true [100.0% vs 10.87% if startup_crash = 1]
(67.66% in signature vs 02.06% overall) startup_crash = 1
No longer depends on: 1737467
See Also: → 1737467

On second look, we have volume with versions prior to 141, even for the startup crash variation, the volume just seems to have exploded around the release of 141. And it's not only Avast, although Avast is the biggest correlation. The offending code is the CHECK in security/sandbox/chromium/sandbox/win/src/interception.cc:

  // Reserve a full 64k memory range in the child process.
  HANDLE child = child_->Process();
  BYTE* thunk_base = reinterpret_cast<BYTE*>(::VirtualAllocEx(
      child, nullptr, kAllocGranularity, MEM_RESERVE, PAGE_NOACCESS));

  // Find an aligned, random location within the reserved range.
  size_t thunk_bytes =
      interceptions_.size() * sizeof(ThunkData) + sizeof(DllInterceptionData);
  size_t thunk_offset = internal::GetGranularAlignedRandomOffset(thunk_bytes);

  // Split the base and offset along page boundaries.
  thunk_base += thunk_offset & ~(kPageSize - 1);
  thunk_offset &= kPageSize - 1;

  // Make an aligned, padded allocation, and move the pointer to our chunk.
  size_t thunk_bytes_padded = base::bits::AlignUp(thunk_bytes, kPageSize);
  thunk_base = reinterpret_cast<BYTE*>(
      ::VirtualAllocEx(child, thunk_base, thunk_bytes_padded, MEM_COMMIT,
                       PAGE_EXECUTE_READWRITE));
  CHECK(thunk_base);  // If this fails we'd crash anyway on an invalid access.

It could be that Avast (and other AV products) has heuristics that block remote RWX allocations. We could try using a file mapping mapped RW in the parent and RX in the child and see if that fixes the issue.

So far I haven't been able to reproduce with the free version of Avast but I haven't officially started the trial (i.e. I have not entered credit card info yet), so maybe I don't have all features enabled although injection does happen.

Summary: Startup Crash with Avast in [@ logging::LogMessage::~LogMessage] with Firefox and Thunderbird 141+ → Startup Crash with Avast in [@ logging::LogMessage::~LogMessage] with Firefox and Thunderbird
Severity: -- → S3

Hi Greg! As a startup crash, why do we not mark this S2?

One of the problems here is that this is upstream chromium sandbox code, and quite a critical path, but nonetheless I think that the only solution from our side might be the file mapping thing, if that works. So I'll continue digging in that direction.

Blocking the DLLs may or may not be a solution too. I'm not convinced that us being prevented from allocating remote RWX memory is something that is achieved through DLL injection. More likely a kernel driver, with ObRegisterCallbacks, and in that case we can't do anything (except not do remote RWX allocations).

Yeah, good point, thanks!

Severity: S3 → S2

This removes the need for a remote RWX allocation in the child, or for a
remote change of protection from RW to RX. Instead, sandboxed child
processes only ever get a RX view on the mapping, which the parent
initializes with a RW view.

The calls to WriteProcessMemory do not need to occur anymore, since the
parent updates through the RW view automatically propagate to the
child's RX view.

We also delete ServiceResolverThunk::CopyThunk which is dead code.

Assignee: nobody → yjuglaret
Status: NEW → ASSIGNED

This bug has been marked as a regression. Setting status flag for Nightly to affected.

Summary: Startup Crash with Avast in [@ logging::LogMessage::~LogMessage] with Firefox and Thunderbird → Startup Crash with Avast (or other antivirus software) in [@ logging::LogMessage::~LogMessage] via sandbox::InterceptionManager with Firefox and Thunderbird

I have analyzed the crash volume more rigorously after noticing that there were different proto signatures involved. This is because all CHECK() failures in the sandbox code end up in this signature currently (in particular, one CHECK() in AddHandleToShare mostly affecting old ESR 128). So here is my new update.

In the last six months, if I ignore all potentially-OOM crashes and all proto signatures that don't contain PatchNtdll, thus only looking at the failures that should be caused by failure to commit PAGE_EXECUTE_READWRITE pages in a remote process, I get:

Accordingly, I estimate that Avast represents at least ~97% of the non-OOM volume on the PatchNtdll proto signature.

Hi :atrif, I believe that you are responsible for testing with Avast, is that correct? If you have the time, could you try to see if there are some Avast options that may trigger this crash in Firefox Release 143.0.4? Likely candidates would be things around blocking exploits or monitoring the system. Here (2, 3, 4) are suggestions from ChatGPT, though I don't know if these options really exist. The crash should occur at startup if you have the correct configuration. Thank you!

Flags: needinfo?(atrif)

(In reply to Yannis Juglaret [:yannis] from comment #8)

Hi :atrif, I believe that you are responsible for testing with Avast, is that correct? If you have the time, could you try to see if there are some Avast options that may trigger this crash in Firefox Release 143.0.4? Likely candidates would be things around blocking exploits or monitoring the system. Here (2, 3, 4) are suggestions from ChatGPT, though I don't know if these options really exist. The crash should occur at startup if you have the correct configuration. Thank you!

Hello! I have tried to reproduce the crash with Firefox 143.0.4, but with no luck.
I have enabled I think everything in Avast > Core Shields, and then started Firefox and navigated to google.com. I am using Avast Premium (25.9.10453g- build 25.9.10453.954). I enabled:

  • Cybercapture
  • Hardened Mode
  • Scan for tools
  • Anti-root shield
  • Anti-exploit shield
  • I changed to High from Medium Sensitivity for core shields
  • Sandbox to run Firefox always Virtualized

If more testing is needed, please let me know. Thank you!

QA Whiteboard: [p1] [qa-investig-done-c145/b144]
Flags: needinfo?(atrif)

If commiting RWX memory fails, we fall back to RW instead, which is
enough to initialize the thunks. In that case, we make sure that the
later transition to RX permission succeeds.

Status: ASSIGNED → RESOLVED
Closed: 15 days ago
Resolution: --- → FIXED
Target Milestone: --- → 146 Branch

If commiting RWX memory fails, we fall back to RW instead, which is
enough to initialize the thunks. In that case, we make sure that the
later transition to RX permission succeeds.

Original Revision: https://phabricator.services.mozilla.com/D268123

Attachment #9519889 - Flags: approval-mozilla-beta?
Attachment #9519891 - Flags: approval-mozilla-beta?

(The second beta uplift request in comment 14 was to get a trace to file bug 1994123; I will now remove it)

Attachment #9519891 - Attachment is obsolete: true
Attachment #9519891 - Flags: approval-mozilla-beta?

firefox-beta Uplift Approval Request

  • User impact if declined: Startup crash for some Avast users (and maybe occasionally for some users of other antivirus software).
  • Code covered by automated testing: yes
  • Fix verified in Nightly: no
  • Needs manual QE test: no
  • Steps to reproduce for manual QE testing:
  • Risk associated with taking this patch: low
  • Explanation of risk level: This patch preserves the usual code path for users not affected by the Avast crash. Only users that would have run into the crash will go through the alternate code path, which is an attempt at mitigating the underlying issue.
  • String changes made/needed: No
  • Is Android affected?: no

If commiting RWX memory fails, we fall back to RW instead, which is
enough to initialize the thunks. In that case, we make sure that the
later transition to RX permission succeeds.

Original Revision: https://phabricator.services.mozilla.com/D268123

Attachment #9519902 - Flags: approval-mozilla-release?

firefox-release Uplift Approval Request

  • User impact if declined: Startup crash for some Avast users (and maybe occasionally for some users of other antivirus software).
  • Code covered by automated testing: yes
  • Fix verified in Nightly: no
  • Needs manual QE test: no
  • Steps to reproduce for manual QE testing:
  • Risk associated with taking this patch: low
  • Explanation of risk level: This patch preserves the usual code path for users not affected by the Avast crash. Only users that would have run into the crash will go through the alternate code path, which is an attempt at mitigating the underlying issue.
  • String changes made/needed: No
  • Is Android affected?: no

The patch that has landed in Nightly and for which I'm asking uplifts is a naive and uplift-safe attempt at addressing the issue, which may or may not be enough (worst case, we'll still be receiving the crashes). If this attempt doesn't work, we can consider the more involved patch in D267839.

Attachment #9519889 - Flags: approval-mozilla-beta? → approval-mozilla-beta+
QA Contact: atrif
Attachment #9519902 - Flags: approval-mozilla-release? → approval-mozilla-release+
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: