Crash in [@ arena_t::~arena_t | ArenaCollection::DisposeArena] via SnowWhiteKiller
Categories
(Core :: DOM: Core & HTML, defect)
Tracking
()
People
(Reporter: wsmwk, Assigned: sefeng211)
References
(Regression)
Details
(Keywords: crash, regression, topcrash-thunderbird)
Crash Data
Attachments
(1 file, 1 obsolete file)
48 bytes, text/x-phabricator-request | jcristau: approval-mozilla-beta-
The signature first appears in version 77. All crashes are in Thunderbird.
First crash is bp-85acf350-a2bf-45fa-8dbc-3afcc0200602.
Top 10 frames of crashing thread:
0 mozglue.dll arena_t::~arena_t memory/build/mozjemalloc.cpp:3565
1 mozglue.dll ArenaCollection::DisposeArena memory/build/mozjemalloc.cpp:1075
2 xul.dll nsIContent::Destroy dom/base/FragmentOrElement.cpp:149
3 xul.dll SnowWhiteKiller::Visit xpcom/base/nsCycleCollector.cpp:2457
4 xul.dll nsPurpleBuffer::VisitEntries<SnowWhiteKiller> xpcom/base/nsCycleCollector.cpp:957
5 xul.dll nsCycleCollector_doDeferredDeletionWithBudget xpcom/base/nsCycleCollector.cpp:3889
6 xul.dll AsyncFreeSnowWhite::Run js/xpconnect/src/XPCJSRuntime.cpp:147
7 xul.dll IdleRunnableWrapper::Run xpcom/threads/nsThreadUtils.cpp:326
8 xul.dll nsThread::ProcessNextEvent xpcom/threads/nsThread.cpp:1200
9 xul.dll NS_ProcessNextEvent xpcom/threads/nsThreadUtils.cpp:481
Another is bp-63f36bf9-d636-4c3a-92d6-032240200617
Reporter | ||
Comment 1•5 years ago
How can all of the crashes be happening only on the beta channel? Having 100% of crashes on a single channel is quite odd.
bp-4e8d568e-a5f3-47e6-a02a-9fbe40201231 Windows
Mac 85 beta arena_t::~arena_t | arena_t::DallocSmall | arena_t::DallocSmall | ArenaCollection::DisposeArena bp-b732f931-58c7-4bdb-b393-f2b5e0201215
0 libmozglue.dylib arena_t::~arena_t() memory/build/mozjemalloc.cpp:3579
1 libmozglue.dylib arena_t::DallocSmall(arena_chunk_t*, void*, arena_chunk_map_t*) memory/build/mozjemalloc.cpp:3290
2 libmozglue.dylib arena_t::DallocSmall(arena_chunk_t*, void*, arena_chunk_map_t*) memory/build/mozjemalloc.cpp:3290
3 libmozglue.dylib ArenaCollection::DisposeArena(arena_t*) memory/build/mozjemalloc.cpp:1075
4 XUL nsIContent::Destroy() dom/base/FragmentOrElement.cpp:149
5 XUL SnowWhiteKiller::Visit(nsPurpleBuffer&, nsPurpleBufferEntry*) xpcom/base/nsCycleCollector.cpp:2457
6 XUL void nsPurpleBuffer::VisitEntries<SnowWhiteKiller>(SnowWhiteKiller&) xpcom/base/nsCycleCollector.cpp:957
7 libsystem_pthread.dylib _pthread_cond_updateval
8 XUL nsCycleCollector_doDeferredDeletionWithBudget(js::SliceBudget&) xpcom/base/nsCycleCollector.cpp:3889
9 XUL AsyncFreeSnowWhite::Run() js/xpconnect/src/XPCJSRuntime.cpp:147
10 XUL XUL@0x41c73f
11 XUL <name omitted> xpcom/threads/nsThreadUtils.cpp:344
Comment 2•5 years ago
The original bug might be different. Possibly this one is a rare crash from bug 1211292.
Reporter | ||
Comment 4•5 years ago
now a topcrash on beta - #1
For some reason a big spike in last 24 hours - not just a few people according to https://crash-stats.mozilla.org/signature/?product=Thunderbird&signature=arena_t%3A%3A~arena_t%20%7C%20ArenaCollection%3A%3ADisposeArena&date=%3E%3D2021-01-03T16%3A57%3A00.000Z&date=%3C2021-01-04T16%3A57%3A00.000Z#summary
Reporter | ||
Comment 5•5 years ago
(In reply to Wayne Mery (:wsmwk) from comment #4)
> now a topcrash on beta - #1
> For some reason a big spike in last 24 hours - not just a few people according to https://crash-stats.mozilla.org/signature/?product=Thunderbird&signature=arena_t%3A%3A~arena_t%20%7C%20ArenaCollection%3A%3ADisposeArena&date=%3E%3D2021-01-03T16%3A57%3A00.000Z&date=%3C2021-01-04T16%3A57%3A00.000Z#summary
Spike appears to be related to gconversations addon 3.2.11 bp-849a9100-2238-4ad7-bc29-67d600210104
Comment 6•5 years ago
The update I've just released for Conversations changes how the JavaScript files are packaged: we're now using webpack to bundle them all together. That should generally be standard JavaScript code, though some of it does run in chrome context, but the packaging shouldn't matter.
The crash stacks here, though, are cycle collection related, and I doubt there's anything I could do about it without actually having steps to reproduce. So you're probably better off moving this to Core / JavaScript: GC as a starting point.
Reporter | ||
Comment 7•5 years ago
jonco, can you give an assessment of this topcrash?
Comment 8•5 years ago
This is the cycle collector, not the GC.
Comment 9•5 years ago
MOZ_CRASH Reason (Sanitized): MOZ_RELEASE_ASSERT(!mStats.allocated_small && !mStats.allocated_large) (Arena is not empty)
The large uptick is in TB beta builds with build ID 20201222142912, plus some (4%) with 20201217170743.
Looking at https://hg.mozilla.org/mozilla-central/log/tip/memory/build/mozjemalloc.cpp perhaps bug 1681003 could be the cause?
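To make the assertion in this comment concrete, here is a minimal sketch of the condition it checks. The `ToyArena` and `ToyArenaStats` types and the `SafeToDispose` helper are hypothetical stand-ins invented for illustration; only the field names `allocated_small` and `allocated_large` come from the assertion text, and the real `arena_t` in mozjemalloc is considerably more involved.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical, simplified counters; the real allocator tracks far more state.
struct ToyArenaStats {
  std::size_t allocated_small = 0;
  std::size_t allocated_large = 0;
};

struct ToyArena {
  ToyArenaStats mStats;

  void AllocSmall(std::size_t aBytes) { mStats.allocated_small += aBytes; }

  void FreeSmall(std::size_t aBytes) {
    assert(aBytes <= mStats.allocated_small);
    mStats.allocated_small -= aBytes;
  }

  // Same condition as the quoted assertion: no small or large
  // allocations may remain in the arena.
  bool IsEmpty() const {
    return !mStats.allocated_small && !mStats.allocated_large;
  }
};

// Reports whether disposing the arena now would pass the emptiness check.
// In this toy model we return a bool instead of crashing.
bool SafeToDispose(const ToyArena& aArena) { return aArena.IsEmpty(); }
```

In this model, an arena with one outstanding small allocation fails the check, which corresponds to the "Arena is not empty" crash reason: something is still allocated in the arena at the moment it is torn down.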
Comment 10•5 years ago
Unlikely, since that's a private arena. It seems something allocated in a DOMArena is outliving the arena itself...
(In reply to Jon Coppeard (:jonco) from comment #8)
> This is the cycle collector, not the GC.
This is actually not the cycle collector either, though it is more the cycle collector than the GC. :) All cycle-collected objects are destroyed by the "SnowWhiteKiller", so this just means that we crashed while running the destructor for some cycle-collected object. It looks like these stacks all have nsIContent::Destroy() in them, so this is some issue with the arena allocation of DOM nodes.
Sean Feng worked on the arena allocator for DOM stuff in bug 1377999 and other bugs, so maybe he has some ideas of the next steps for investigating what is going wrong here.
Comment 13•5 years ago
(In reply to Andrew McCreight [:mccr8] from comment #11)
Ah, apologies. I saw something about the cycle collector in the stack and plumped for that :)
Comment 14•5 years ago
(In reply to Jon Coppeard (:jonco) from comment #13)
> Ah, apologies. I saw something about the cycle collector in the stack and plumped for that :)
No worries. It is a very common point of confusion. It might be worth reworking the names of some of that stuff to make it clearer to people reading the stacks what is going on.
Assignee | ||
Comment 15•5 years ago
A quick update: this seems related to cross-docGroup node adoption. After a node that has been adopted into a different docGroup is freed via the nsIContent::Destroy() call, DOMArena thinks it's safe to dispose of the arena because nobody owns it any more, so the underlying arena should be empty. However, jemalloc tells us it's not empty.
I don't know how this can possibly happen, though, and I am still investigating.
Assignee | ||
Comment 16•5 years ago
We use nsINode::Adopt to store the arenas in a hashtable to keep them
alive. However, this method is not guaranteed to be called, so
arenas may be disposed before all nodes are destroyed.
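The failure mode described in this patch summary can be illustrated with a small toy model. This is only a sketch under stated assumptions: `ToyArena`, `ToyNode`, `ToyKeepAliveTable`, and `ArenaSurvivesOwnerRelease` are hypothetical names, and `std::shared_ptr` stands in for whatever reference counting the real DOMArena uses. The point is simply that a keep-alive table only helps if the code path that fills it actually runs.

```cpp
#include <cassert>
#include <memory>
#include <unordered_map>
#include <utility>

struct ToyArena {};  // stand-in for an arena; holds nothing in this sketch
struct ToyNode {};   // stand-in for a DOM node allocated from an arena

// Hypothetical table mapping adopted nodes to the arena they came from.
// Holding a shared_ptr here is what keeps the arena alive.
class ToyKeepAliveTable {
 public:
  void Record(const ToyNode* aNode, std::shared_ptr<ToyArena> aArena) {
    mEntries[aNode] = std::move(aArena);
  }
  void Forget(const ToyNode* aNode) { mEntries.erase(aNode); }

 private:
  std::unordered_map<const ToyNode*, std::shared_ptr<ToyArena>> mEntries;
};

// Returns true if the arena is still alive after its original owner lets go
// while the node still exists.
bool ArenaSurvivesOwnerRelease(bool aAdoptionRecorded) {
  auto arena = std::make_shared<ToyArena>();
  std::weak_ptr<ToyArena> watcher = arena;  // observes without owning
  ToyNode node;
  ToyKeepAliveTable table;
  if (aAdoptionRecorded) {
    table.Record(&node, arena);
  }
  arena.reset();  // the original owner drops its reference
  assert(!arena);
  return !watcher.expired();
}
```

When the adoption is recorded, the table's reference keeps the arena alive. When the recording step is skipped, the arena goes away while the node still exists, which in the real allocator would correspond to disposing an arena that is not empty.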
Assignee | ||
Comment 17•5 years ago
In a scenario where a node is adopted from docGroup A to B and
then from B to C, the cached arena will be updated to B's arena
during the B-to-C adoption, which is not correct (we want to keep
A's arena alive because the node was created there).
Depends on D101042
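The A to B to C scenario can be sketched as follows. Everything here is a hypothetical illustration: arenas are represented by plain strings, and `AdoptOverwriting` and `AdoptKeepingFirst` are invented names that contrast an overwriting cache update with one that keeps the first recorded arena. Whether the landed patch works this way is an assumption, not something this thread confirms.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>

struct ToyNode {};  // stand-in for a DOM node

// Hypothetical cache: node -> name of the arena being kept alive for it.
using ToyArenaCache = std::unordered_map<const ToyNode*, std::string>;

// Overwriting update: each adoption replaces the cached arena with the
// arena of the docGroup the node is leaving.
void AdoptOverwriting(ToyArenaCache& aCache, const ToyNode& aNode,
                      const std::string& aFromArena) {
  aCache[&aNode] = aFromArena;
}

// Keep-first update: emplace() does nothing if the key already exists, so
// the arena recorded on the first adoption (where the node was created)
// is preserved.
void AdoptKeepingFirst(ToyArenaCache& aCache, const ToyNode& aNode,
                       const std::string& aFromArena) {
  aCache.emplace(&aNode, aFromArena);
}

// Simulates a node created in A, adopted into B, then adopted into C, and
// returns the arena name left in the cache.
std::string CachedArenaAfterTwoAdoptions(bool aKeepFirst) {
  ToyNode node;
  ToyArenaCache cache;
  auto adopt = aKeepFirst ? AdoptKeepingFirst : AdoptOverwriting;
  adopt(cache, node, "A");  // A -> B: the node leaves A
  adopt(cache, node, "B");  // B -> C: the node leaves B
  return cache.at(&node);
}
```

With the overwriting update the cache ends up pointing at B's arena, so nothing keeps A's arena alive even though the node's memory lives there. With the keep-first update the cache still names A.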
Comment 18•5 years ago
Comment 19•5 years ago
bugherder |
Assignee | ||
Comment 20•5 years ago
Comment on attachment 9195889 [details]
Bug 1646604 - Fix cross docGroup node adoption may not correctly keep the arena alive r=smaug
Beta/Release Uplift Approval Request
- User impact if declined: Users experience crashes.
- Is this code covered by automated tests?: No
- Has the fix been verified in Nightly?: Yes
- Needs manual test from QE?: Yes
- If yes, steps to reproduce: Note that so far we've only found this bug to be reproducible in Thunderbird.
  - Subscribe to the mozilla dev-platform mailing list https://lists.mozilla.org/listinfo/dev-platform
  - Install the Thunderbird Conversations extension. Make sure you've clicked "Apply changes" after installing the extension.
  - Keep opening emails from this mailing list randomly.
  - If there are no crashes after opening about 20 emails, then we are good.
- List of other uplifts needed: None
- Risk to taking this patch: Medium
- Why is the change risky/not risky? (and alternatives if risky): Medium, because I didn't land a crash test for it, so it requires some manual testing. Also, the patch touches DOM node adoption, which has a fair amount of complexity.
- String changes made/needed:
Comment 21•5 years ago
Given the risk called out in comment 20, the lack of crashes due to this bug in Firefox, and the short time before the 85 RC, I'd prefer to let this ride to 86.
Reporter | ||
Comment 22•5 years ago
Good news, the last Thunderbird daily to crash is buildid 20210107105528