Crash in [@ RedBlackTree<T>::TreeNode::SetColor] with Trend Micro
Categories
(External Software Affecting Firefox :: Other, defect)
Tracking
(firefox-esr115 wontfix, firefox-esr140 wontfix, firefox143+ fixed, firefox144 fixed, firefox145 fixed)
People
(Reporter: merlino37, Assigned: gstoll)
References
(Blocks 1 open bug, Regression)
Details
(Keywords: crash, regression, topcrash)
Crash Data
Attachments
(3 files)
- 48 bytes, text/x-phabricator-request
- 48 bytes, text/x-phabricator-request (phab-bot: approval-mozilla-beta+)
- 48 bytes, text/x-phabricator-request (phab-bot: approval-mozilla-release+)
Crash report: https://crash-stats.mozilla.org/report/index/a7e6eda1-46ca-42c2-a566-1206c0231228
MOZ_CRASH Reason: MOZ_RELEASE_ASSERT(mNode)
Top 10 frames of crashing thread:
0 firefox-bin RedBlackTree<arena_chunk_map_t, ArenaRunTreeTrait>::TreeNode::SetColor memory/build/rb.h:182
0 firefox-bin RedBlackTree<arena_chunk_map_t, ArenaRunTreeTrait>::MoveRedRight memory/build/rb.h:636
0 firefox-bin RedBlackTree<arena_chunk_map_t, ArenaRunTreeTrait>::Remove memory/build/rb.h:533
0 firefox-bin RedBlackTree<arena_chunk_map_t, ArenaRunTreeTrait>::Remove memory/build/rb.h:137
0 firefox-bin arena_t::DallocSmall memory/build/mozjemalloc.cpp:3702
0 firefox-bin arena_dalloc memory/build/mozjemalloc.cpp:3781
0 firefox-bin BaseAllocator::free memory/build/mozjemalloc.cpp:4591
0 firefox-bin Allocator<MozJemallocBase>::free memory/build/malloc_decls.h:54
0 firefox-bin free memory/build/malloc_decls.h:54
1 libxul.so mozilla::layers::CompositableClient::Release gfx/layers/client/CompositableClient.h:75
Comment 2•1 year ago
The Bugbug bot thinks this bug should belong to the 'Core::Graphics' component, and is moving the bug to that component. Please correct in case you think the bot is wrong.
Comment 3•1 year ago
I looked at about 10 crashes with this signature; none of them had the same stack as the linked crash. There were many distinct stacks, some not in graphics. There are likely multiple different problems under this signature.
Comment 4•1 year ago
The bug has a crash signature, thus the bug will be considered confirmed.
Comment 5•1 year ago
This feels like the class of bug that might be background-level memory corruption.
I think this is probably not graphics, but rather many stacks will contain graphics, because graphics is pervasive.
Comment 6•3 months ago
The bug is linked to a topcrash signature, which matches the following criteria:
- Top 20 desktop browser crashes on beta (startup)
- Top 10 content process crashes on beta
:glandium, could you consider increasing the severity of this top-crash bug?
For more information, please visit BugBot documentation.
Comment 7•3 months ago
Based on the topcrash criteria, the crash signature linked to this bug is not a topcrash signature anymore.
For more information, please visit BugBot documentation.
Comment 8•3 months ago
Is it known why the crash volume spiked for Firefox 143 builds?
Comment 9•3 months ago
The bug is marked as tracked for firefox143 (release). However, the bug still isn't assigned and has low severity.
:jstutte, could you please find an assignee and increase the severity for this tracked bug? If you disagree with the tracking decision, please talk with the release managers.
For more information, please visit BugBot documentation.
Comment 10•3 months ago
This could be some external DLL thing causing issues? I looked at the correlations tab for release and it has this:
(100.0% in signature vs 01.88% overall) moz_crash_reason = MOZ_RELEASE_ASSERT(mNode)
(100.0% in signature vs 02.57% overall) Module "TmUmEvt64.dll" = true
(100.0% in signature vs 02.57% overall) Module "tmmon64.dll" = true
(100.0% in signature vs 25.69% overall) Module "bcryptprimitives.dll" = true
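To make the "in signature vs overall" figures concrete, here is a minimal sketch of how such a module correlation could be computed. The report structure, the `module_correlation` helper, and the sample data are assumptions for illustration only; they are not the crash-stats implementation or its data.

```python
def module_correlation(reports, signature, module):
    """Return (% of reports loading `module` within `signature`, % overall).

    `reports` is a hypothetical list of dicts with "signature" and "modules".
    """
    wanted = module.lower()  # Windows module names are case-insensitive

    def share(subset):
        if not subset:
            return 0.0
        hits = sum(
            1 for r in subset if wanted in {m.lower() for m in r["modules"]}
        )
        return 100.0 * hits / len(subset)

    in_signature = [r for r in reports if r["signature"] == signature]
    return share(in_signature), share(reports)


# Hypothetical sample data, not taken from real crash reports.
reports = [
    {"signature": "SetColor", "modules": ["tmmon64.dll", "TmUmEvt64.dll"]},
    {"signature": "SetColor", "modules": ["tmmon64.dll", "bcryptprimitives.dll"]},
    {"signature": "Other", "modules": ["bcryptprimitives.dll"]},
    {"signature": "Other", "modules": []},
]
in_sig, overall = module_correlation(reports, "SetColor", "tmmon64.dll")
print(f"({in_sig:.1f}% in signature vs {overall:.2f}% overall)")
```

A module that appears in every report for a signature but rarely overall, as with the two Trend Micro DLLs above, is a strong hint that the module is involved.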
Comment 11•3 months ago
Google says that tmmon64.dll is associated with Trend Micro UMH Monitor Engine, and that TmUmEvt64.dll "belongs to AMSP UMH module".
Comment 12•3 months ago
Comment 13•3 months ago
firefox-beta Uplift Approval Request
- User impact if declined: Trend Micro users will experience tab crashes
- Code covered by automated testing: no
- Fix verified in Nightly: no
- Needs manual QE test: yes
- Steps to reproduce for manual QE testing: Install Trend Micro trial version and verify normal browsing works as expected
- Risk associated with taking this patch: low
- Explanation of risk level: Just a content process block of Trend Micro DLLs
- String changes made/needed: no
- Is Android affected?: no
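As a rough illustration of what "a content process block" means, the sketch below models a per-process DLL blocklist. The `ProcessType` enum, the `BLOCKED_DLLS` table, and `is_blocked` are hypothetical names chosen for this example; they do not reflect how Firefox's actual blocklist is defined or enforced.

```python
from enum import Enum


class ProcessType(Enum):
    MAIN = "main"
    CONTENT = "content"


# Hypothetical table: DLLs are blocked only in the listed process types.
BLOCKED_DLLS = {
    ProcessType.CONTENT: {"tmmon64.dll", "tmumevt64.dll"},
}


def is_blocked(dll_name, process_type):
    """Return True if `dll_name` should be refused in `process_type`."""
    # Windows DLL names are case-insensitive, so compare in lower case.
    return dll_name.lower() in BLOCKED_DLLS.get(process_type, set())


print(is_blocked("TmUmEvt64.dll", ProcessType.CONTENT))
print(is_blocked("TmUmEvt64.dll", ProcessType.MAIN))
```

Scoping the block to content processes keeps the risk low: the DLLs can still load in the main process, and only the sandboxed tab processes refuse them.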
Comment 14•3 months ago
Original Revision: https://phabricator.services.mozilla.com/D265326
Comment 15•3 months ago
firefox-release Uplift Approval Request
- User impact if declined: Trend Micro users will experience tab crashes
- Code covered by automated testing: no
- Fix verified in Nightly: no
- Needs manual QE test: yes
- Steps to reproduce for manual QE testing: Install Trend Micro trial version and verify normal browsing works as expected
- Risk associated with taking this patch: low
- Explanation of risk level: Just a content process block of Trend Micro DLLs
- String changes made/needed: no
- Is Android affected?: no
Comment 16•3 months ago
Original Revision: https://phabricator.services.mozilla.com/D265326
Comment 17•3 months ago
Comment 18•3 months ago
uplift
Comment 19•3 months ago
uplift
Comment 20•3 months ago
bugherder
Comment 21•3 months ago
We attempted to reproduce the issue using Firefox 143.0 on both Windows 10 and 11, with Trend Micro Antivirus+ (v. 17.8.1476) installed. While browsing popular webpages and performing copy-paste actions across various websites, we were unable to reproduce the crash.
We also tested the above-mentioned scenarios with Firefox 143.0.1 and Firefox 144.0b3 (treeherder build from Comment 18) under the same setup on both Windows 10 and 11, verifying that tmmon64.dll and TmUmEvt64.dll were loaded in about:third-party with the Occurrences value “1”.
Additionally, we noticed that sometimes the “Block this module” option appears in about:third-party only after refreshing the page. @Greg, is this expected?
Testing with the 143.0.1 RC (also with tmmon64.dll and TmUmEvt64.dll blocked) and the 144.0b3 build also did not result in any crashes. However, since we were unable to reproduce the issue in the first place, we are unable to mark this bug as verified.
Comment 22•3 months ago
(In reply to Bianca Hidecuti, Desktop Test Engineering [:bhidecuti] from comment #21)
Additionally, we noticed that sometimes the “Block this module” option appears in about:third-party only after refreshing the page. @Greg, is this expected?
Yes, this is expected. At the same time that the option appears, you should see a button on the far right containing a down arrow that looks like a v. You can use that button to get more info about a specific DLL entry. You can use it here to confirm which processes the faulty DLLs are loaded into. The patch is working if there is no "Tab" process listed with status "Loaded" for these two DLLs, even after you load some web pages in some tabs.
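The verification rule above can be expressed as a small check. This is only a sketch: the entry format (dicts with "dll", "process", and "status" keys) is an assumed transcription of what about:third-party displays, not an export format the page provides.

```python
TREND_MICRO_DLLS = {"tmmon64.dll", "tmumevt64.dll"}


def patch_is_working(entries):
    """Return True if neither DLL is listed as "Loaded" in a "Tab" process.

    `entries` is a hypothetical, hand-transcribed list of rows from
    about:third-party, one dict per (DLL, process) pair.
    """
    return not any(
        e["dll"].lower() in TREND_MICRO_DLLS
        and e["process"] == "Tab"
        and e["status"] == "Loaded"
        for e in entries
    )


# Hypothetical observations after loading a few web pages in tabs.
observed = [
    {"dll": "tmmon64.dll", "process": "Main", "status": "Loaded"},
    {"dll": "TmUmEvt64.dll", "process": "Main", "status": "Loaded"},
]
print(patch_is_working(observed))
```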
Comment 23•3 months ago
Cleaning up earlier needinfos.
Comment 24•3 months ago
(In reply to Yannis Juglaret [:yannis] from comment #22)
Thank you for the reply! I can confirm that there is no "Tab" process listed after loading different webpages (only a "Main" process with the status "Loaded").
Comment 25•2 months ago
Adding some STR for documentation. They seem to reliably reproduce the issue for me with bad builds. Essentially, open many tabs, wait for a while, and continue using the browser. The number of tabs and time to wait may need to be adjusted per machine.
STR:
- install Trend Micro Maximum Security (trial version in my case);
- open Firefox, double-check the presence of Trend Micro DLLs in about:third-party;
- install the URLs List addon;
- copy "many" URLs from the top 1000 domains list (for me 100 is a good number, 1000 works for sure, and 25 seems to be rather reliable);
- paste into the URLs List addon and click Open;
- leave the computer unattended for 20 minutes to 1 hour (for me 30 min is a good number);
- navigate between the existing tabs, open new tabs.
Expected behavior: normal navigation.
Bad behavior: at least one of the new tabs or old tabs crashes.
Comment 26•2 months ago
Following mozregression and initial suspicion by :mccr8, I'm marking bug 1970638 as a regressor.
Since the landing of the patch in bug 1970638, after the main thread of a sandboxed content process calls RevertToSelf to start the sandbox, it remains possible to load well-known DLLs. Before that patch, loading any well-known DLL would fail once the sandbox is started.
As noted by :bobowen, the Trend Micro DLLs depend on psapi.dll, which is a well-known DLL that Firefox itself does not load during content process initialization. And based on debugging, the Trend Micro DLLs load on the main thread after Trend Micro queues a user-mode APC.
We suspect that there could be a race condition between the non-deterministic point where the user-mode APC gets called and the call to RevertToSelf that starts the sandbox. More precisely, the crashes could be occurring when the DLLs successfully load after the sandbox is started. Before bug 1970638, the DLLs would fail to load if the sandbox is started, because of their dependency on psapi.dll. After bug 1970638, they will successfully load and they might not expect to be running their initialization code in a sandboxed environment where some initialization calls would fail, ultimately causing crashes.
We have not confirmed this theory, but it sounds like a good starting point for investigation if Trend Micro wants to address this issue in the DLLs directly.
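The suspected ordering can be summarized as a small decision model. This is a sketch of the unconfirmed theory described above, not of real Windows or Firefox behavior: the function name, parameters, and outcome strings are all hypothetical.

```python
def trend_micro_load_outcome(apc_runs_after_sandbox_start, regressor_landed):
    """Model the suspected outcome of the Trend Micro DLL load attempt.

    apc_runs_after_sandbox_start: True if the queued user-mode APC runs
        after the main thread has called RevertToSelf.
    regressor_landed: True for builds containing the patch from bug 1970638.
    """
    if not apc_runs_after_sandbox_start:
        # The DLLs initialize before the sandbox starts, in either build.
        return "loaded before sandbox"
    if not regressor_landed:
        # The psapi.dll dependency cannot load once the sandbox is started.
        return "load fails"
    # Well-known DLLs can still load, so initialization runs sandboxed.
    return "loaded inside sandbox (suspected crash path)"


for after_sandbox in (False, True):
    for landed in (False, True):
        print(after_sandbox, landed, trend_micro_load_outcome(after_sandbox, landed))
```

Only one of the four combinations leads to the suspected crash path, which would be consistent with the crashes being intermittent and appearing only after the regressor landed.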