Startup Crash with Avast (or other antivirus software) in [@ logging::LogMessage::~LogMessage] via sandbox::InterceptionManager with Firefox and Thunderbird
Categories
(External Software Affecting Firefox :: Other, defect)
Tracking
(firefox-esr115 wontfix, firefox-esr140 wontfix, firefox144 fixed, firefox145 fixed, firefox146 fixed)
People
(Reporter: yannis, Assigned: yannis)
References
Details
(Keywords: crash, regression, Whiteboard: [tbird crash])
Crash Data
Attachments
(4 files, 1 obsolete file)
|
48 bytes,
text/x-phabricator-request
|
Details | Review | |
|
48 bytes,
text/x-phabricator-request
|
Details | Review | |
|
48 bytes,
text/x-phabricator-request
|
phab-bot
:
approval-mozilla-beta+
|
Details | Review |
|
48 bytes,
text/x-phabricator-request
|
phab-bot
:
approval-mozilla-release+
|
Details | Review |
As initially noticed by :wsmwk in bug 1737467 comment 10, new crash volume has started arriving on this signature, which was known as an OOM crash signature. As shown by bug 1737467 comment 12 (especially graphing crashes per day by major version), this new crash volume seems to have started with version 141.
The correlations suggest that the new volume could be a startup crash with Avast, accounting for two thirds of the crashes that end up on this signature, the remaining third likely being the previously known OOM crash volume:
(82.63% in signature vs 08.03% overall) Module "aswJsFlt.dll" = true [100.0% vs 10.87% if startup_crash = 1]
(67.66% in signature vs 02.06% overall) startup_crash = 1
| Assignee | ||
Updated•22 days ago
|
| Assignee | ||
Comment 1•22 days ago
•
|
||
On second look, we have volume with versions prior to 141, even for the startup crash variation, the volume just seems to have exploded around the release of 141. And it's not only Avast, although Avast is the biggest correlation. The offending code is the CHECK in security/sandbox/chromium/sandbox/win/src/interception.cc:
// Reserve a full 64k memory range in the child process.
HANDLE child = child_->Process();
BYTE* thunk_base = reinterpret_cast<BYTE*>(::VirtualAllocEx(
child, nullptr, kAllocGranularity, MEM_RESERVE, PAGE_NOACCESS));
// Find an aligned, random location within the reserved range.
size_t thunk_bytes =
interceptions_.size() * sizeof(ThunkData) + sizeof(DllInterceptionData);
size_t thunk_offset = internal::GetGranularAlignedRandomOffset(thunk_bytes);
// Split the base and offset along page boundaries.
thunk_base += thunk_offset & ~(kPageSize - 1);
thunk_offset &= kPageSize - 1;
// Make an aligned, padded allocation, and move the pointer to our chunk.
size_t thunk_bytes_padded = base::bits::AlignUp(thunk_bytes, kPageSize);
thunk_base = reinterpret_cast<BYTE*>(
::VirtualAllocEx(child, thunk_base, thunk_bytes_padded, MEM_COMMIT,
PAGE_EXECUTE_READWRITE));
CHECK(thunk_base); // If this fails we'd crash anyway on an invalid access.
It could be that Avast (and other AV products) has heuristics that block remote RWX allocations. We could try using a file mapping mapped RW in the parent and RX in the child and see if that fixes the issue.
So far I haven't been able to reproduce with the free version of Avast but I haven't officially started the trial (i.e. I have not entered credit card info yet), so maybe I don't have all features enabled although injection does happen.
Updated•22 days ago
|
| Assignee | ||
Comment 2•22 days ago
|
||
Hi Greg! As a startup crash, why do we not mark this S2?
One of the problems here is that this is upstream chromium sandbox code, and quite a critical path, but nonetheless I think that the only solution from our side might be the file mapping thing, if that works. So I'll continue digging in that direction.
Blocking the DLLs may or may not be a solution too. I'm not convinced that us being prevented from allocating remote RWX memory is something that is achieved through DLL injection. More likely a kernel driver, with ObRegisterCallbacks, and in that case we can't do anything (except not do remote RWX allocations).
| Assignee | ||
Comment 4•21 days ago
|
||
This removes the need for a remote RWX allocation in the child, or for a
remote change of protection from RW to RX. Instead, sandboxed child
processes only ever get a RX view on the mapping, which the parent
initializes with a RW view.
The calls to WriteProcessMemory do not need to occur anymore, since the
parent updates through the RW view automatically propagate to the
child's RX view.
We also delete ServiceResolverThunk::CopyThunk which is dead code.
Updated•21 days ago
|
| Comment hidden (obsolete) |
Comment 6•20 days ago
|
||
This bug has been marked as a regression. Setting status flag for Nightly to affected.
Updated•20 days ago
|
| Assignee | ||
Comment 7•20 days ago
|
||
I have analyzed the crash volume more rigorously after noticing that there were different proto signatures involved. This is because all CHECK() failures in the sandbox code end up in this signature currently (in particular, one CHECK() in AddHandleToShare mostly affecting old ESR 128). So here is my new update.
In the last six months, if I ignore all potentially-OOM crashes and all proto signatures that don't contain PatchNtdll, thus only looking at the failures that should be caused by failure to commit PAGE_EXECUTE_READWRITE pages in a remote process, I get:
- 1600 startup crashes with at least one Avast DLL present;
- 10 startup crashes without (0,1,2,3,4,5,6,7,8,9);
- 268 non-startup crashes with at least one Avast DLL present;
- 56 non-startup crashes without (0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55).
Accordingly, I estimate that Avast represents at least ~97% of the non-OOM volume on the PatchNtdll proto signature.
| Assignee | ||
Comment 8•20 days ago
|
||
Hi :atrif, I believe that you are responsible for testing with Avast, is that correct? If you have the time, could you try to see if there are some Avast options that may trigger this crash in Firefox Release 143.0.4? Likely candidates would be things around blocking exploits or monitoring the system. Here (2, 3, 4) are suggestions from ChatGPT, though I don't know if these options really exist. The crash should occur at startup if you have the correct configuration. Thank you!
Comment 9•19 days ago
|
||
(In reply to Yannis Juglaret [:yannis] from comment #8)
Hi :atrif, I believe that you are responsible for testing with Avast, is that correct? If you have the time, could you try to see if there are some Avast options that may trigger this crash in Firefox Release 143.0.4? Likely candidates would be things around blocking exploits or monitoring the system. Here (2, 3, 4) are suggestions from ChatGPT, though I don't know if these options really exist. The crash should occur at startup if you have the correct configuration. Thank you!
Hello! I have tried to reproduce the crash with Firefox 143.0.4, but with no luck.
I have enabled I think everything in Avast > Core Shields, and then started Firefox and navigated to google.com. I am using Avast Premium (25.9.10453g- build 25.9.10453.954). I enabled:
- Cybercapture
- Hardened Mode
- Scan for tools
- Anti-root shield
- Anti-exploit shield
- I changed to High from Medium Sensitivity for core shields
- Sandbox to run Firefox always Virtualized
If more testing is needed, please let me know. Thank you!
| Assignee | ||
Comment 10•19 days ago
|
||
If commiting RWX memory fails, we fall back to RW instead, which is
enough to initialize the thunks. In that case, we make sure that the
later transition to RX permission succeeds.
Comment 11•15 days ago
|
||
Comment 12•15 days ago
|
||
| bugherder | ||
Updated•15 days ago
|
| Assignee | ||
Comment 13•14 days ago
|
||
If commiting RWX memory fails, we fall back to RW instead, which is
enough to initialize the thunks. In that case, we make sure that the
later transition to RX permission succeeds.
Original Revision: https://phabricator.services.mozilla.com/D268123
Updated•14 days ago
|
| Comment hidden (obsolete) |
Updated•14 days ago
|
| Assignee | ||
Comment 15•14 days ago
|
||
(The second beta uplift request in comment 14 was to get a trace to file bug 1994123; I will now remove it)
| Assignee | ||
Updated•14 days ago
|
Comment 16•14 days ago
|
||
firefox-beta Uplift Approval Request
- User impact if declined: Startup crash for some Avast users (and maybe occasionally for some users of other antivirus software).
- Code covered by automated testing: yes
- Fix verified in Nightly: no
- Needs manual QE test: no
- Steps to reproduce for manual QE testing:
- Risk associated with taking this patch: low
- Explanation of risk level: This patch preserves the usual code path for users not affected by the Avast crash. Only users that would have run into the crash will go through the alternate code path, which is an attempt at mitigating the underlying issue.
- String changes made/needed: No
- Is Android affected?: no
| Assignee | ||
Comment 17•14 days ago
|
||
If commiting RWX memory fails, we fall back to RW instead, which is
enough to initialize the thunks. In that case, we make sure that the
later transition to RX permission succeeds.
Original Revision: https://phabricator.services.mozilla.com/D268123
Updated•14 days ago
|
Comment 18•14 days ago
|
||
firefox-release Uplift Approval Request
- User impact if declined: Startup crash for some Avast users (and maybe occasionally for some users of other antivirus software).
- Code covered by automated testing: yes
- Fix verified in Nightly: no
- Needs manual QE test: no
- Steps to reproduce for manual QE testing:
- Risk associated with taking this patch: low
- Explanation of risk level: This patch preserves the usual code path for users not affected by the Avast crash. Only users that would have run into the crash will go through the alternate code path, which is an attempt at mitigating the underlying issue.
- String changes made/needed: No
- Is Android affected?: no
| Assignee | ||
Comment 19•14 days ago
|
||
The patch that has landed in Nightly and for which I'm asking uplifts is a naive and uplift-safe attempt at addressing the issue, which may or may not be enough (worst case, we'll still be receiving the crashes). If this attempt doesn't work, we can consider the more involved patch in D267839.
Updated•13 days ago
|
Updated•13 days ago
|
Comment 20•13 days ago
|
||
| uplift | ||
Updated•8 days ago
|
Updated•5 days ago
|
Updated•5 days ago
|
Comment 21•5 days ago
|
||
| uplift | ||
Description
•