Moz_crash "[unhandlable oom] Failed to allocate new chunk during GC"
Categories
(Core :: JavaScript: GC, defect, P3)
Tracking
()
People
(Reporter: planetman1125, Unassigned)
References
(Blocks 1 open bug)
Details
(Keywords: regression)
Crash Data
Attachments
(2 files)
Crash report: https://crash-stats.mozilla.org/report/index/8b28b2c1-2521-427b-aabe-771cf0240930
MOZ_CRASH Reason:
[unhandlable oom] Failed to allocate new chunk during GC
Top 10 frames:
0 xul.dll MOZ_Crash(char const*, int, char const*) mfbt/Assertions.h:317
0 xul.dll js::AutoEnterOOMUnsafeRegion::crash_impl(char const*) js/src/vm/JSContext.cpp:1344
1 xul.dll js::gc::AllocateTenuredCellInGC(JS::Zone*, js::gc::AllocKind) js/src/gc/Allocator.cpp:350
1 xul.dll js::gc::TenuringTracer::allocCell(JS::Zone*, js::gc::AllocKind, js::gc::Alloc... js/src/gc/Tenuring.cpp:771
1 xul.dll js::gc::TenuringTracer::alloc(JS::Zone*, js::gc::AllocKind, js::gc::Cell*) js/src/gc/Tenuring.cpp:732
1 xul.dll js::gc::TenuringTracer::promotePlainObject(js::PlainObject*) js/src/gc/Tenuring.cpp:865
1 xul.dll js::gc::TenuringTracer::promoteObject(JSObject*) js/src/gc/Tenuring.cpp:148
1 xul.dll js::gc::TenuringTracer::promoteOrForward(JSObject*) js/src/gc/Tenuring.cpp:132
1 xul.dll js::gc::StoreBuffer::CellPtrEdge<JSObject>::trace(js::gc::TenuringTracer&) const js/src/gc/Tenuring.cpp:624
1 xul.dll js::gc::StoreBuffer::MonoTypeBuffer<js::gc::StoreBuffer::CellPtrEdge<JSObject... js/src/gc/Tenuring.cpp:320
I'm unsure whether this is a safe crash (feel free to remove the security flag if it's not needed).
Comment 2•1 year ago
This is an intentional crash (MOZ_CRASH) in order to exit safely rather than risk continuing. It is unfortunately a fairly common crash. I'm really curious about how we get an OOM when you appear to have so much memory available.
Comment 3•1 year ago
I'm going to un-dupe this from bug 1472062. This specific signature went from background noise to 500 crashes a day in June. It might just be signature morph, but maybe that's a clue of some kind and worth looking at separately from the dozens of signatures in the generic bug. We can always re-dupe it if there's nothing to go on.
Comment 4•1 year ago
Preserving the current crash frequency graph in case we wish we had it when the historical data about the regression time frame and shape expires in a few months.
Comment 5•1 year ago
:jonco, do you have any idea why this signature would jump up in June?
I wonder if bug 1921979 is a duplicate of this bug, as they both OOM when there is plenty of memory. The only differences are the crash signature and that it affects the MediaManager. I think it's probably due to the screen reader (NVDA).
Given that it's so high volume, maybe it should be a higher severity or priority rating at least for initial investigation.
Comment 8•1 year ago
This bug has been marked as a regression. Setting status flag for Nightly to affected.
Comment 9•1 year ago
(In reply to Daniel Veditz [:dveditz] from comment #2)
> I'm really curious about how we get an OOM when you appear to have so much memory available.
If you are referring to the crash in comment 0, the system is running out of page file, which can explain the OOM.
(In reply to Daniel Veditz [:dveditz] from comment #3)
> I'm going to un-dupe this from bug 1472062. This specific signature went from background noise to 500 crashes a day in June. It might just be signature morph, but maybe that's a clue of some kind and worth looking at separately from the dozens of signatures in the generic bug. We can always re-dupe it if there's nothing to go on.
It looks like the volume might have approximately transferred between the three signatures below:
OOM | unknown | js::AutoEnterOOMUnsafeRegion::crash_impl | js::AutoEnterOOMUnsafeRegion::crash | js::gc::AllocateCellInGC
OOM | unknown | js::AutoEnterOOMUnsafeRegion::crash_impl | js::AutoEnterOOMUnsafeRegion::crash | js::gc::AllocateTenuredCellInGC
OOM | unknown | js::AutoEnterOOMUnsafeRegion::crash_impl | js::gc::AllocateTenuredCellInGC
Attached is the volume for these three signatures over the past six months. I think what's important is that the overall cumulative volume in bug 1472062 stays stable. If one isolated signature in the mix were really at fault, we should see a spike in that total as well.
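For anyone who wants to repeat this check on their own export, here is a minimal Python sketch. It assumes the daily counts per signature are already available as plain dictionaries; the data layout, the function names, and the 20% tolerance are illustrative assumptions and not part of crash-stats.

```python
from collections import defaultdict


def combined_volume(daily_counts):
    """Sum crash counts per day across all signatures.

    daily_counts maps signature -> {day: count} (hypothetical layout).
    Returns {day: total}, sorted by day.
    """
    totals = defaultdict(int)
    for per_day in daily_counts.values():
        for day, count in per_day.items():
            totals[day] += count
    return dict(sorted(totals.items()))


def is_stable_total(daily_counts, tolerance=0.2):
    """Return True if every daily total stays within `tolerance` of the mean.

    A stable total while individual signatures rise and fall suggests the
    volume moved between signatures rather than a new crash appearing.
    The tolerance value is an arbitrary assumption.
    """
    totals = list(combined_volume(daily_counts).values())
    if not totals:
        return True
    mean = sum(totals) / len(totals)
    return all(abs(total - mean) <= tolerance * mean for total in totals)
```

Usage, with made-up placeholder counts (not real crash data):

```python
counts = {
    "old signature": {"day1": 500, "day2": 480, "day3": 20},
    "renamed signature": {"day1": 10, "day2": 30, "day3": 100},
    "inlined signature": {"day1": 0, "day2": 10, "day3": 400},
}
print(combined_volume(counts))
print(is_stable_total(counts))
```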
Comment 10•1 year ago
(In reply to Yannis Juglaret [:yannis] from comment #9)
Thanks for looking into this.
> It looks like the volume might have approximately transferred between the three signatures below:
> OOM | unknown | js::AutoEnterOOMUnsafeRegion::crash_impl | js::AutoEnterOOMUnsafeRegion::crash | js::gc::AllocateCellInGC
> OOM | unknown | js::AutoEnterOOMUnsafeRegion::crash_impl | js::AutoEnterOOMUnsafeRegion::crash | js::gc::AllocateTenuredCellInGC
AllocateCellInGC was renamed to AllocateTenuredCellInGC by bug 1925197 which landed 19th March so this explains the shift when this hit release.
> OOM | unknown | js::AutoEnterOOMUnsafeRegion::crash_impl | js::gc::AllocateTenuredCellInGC
Likely due to an inlining change.
Based on the above, I'll dupe this to bug 1472062.
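To illustrate why a rename and an inlining change can split one crash into several signatures, here is a minimal Python sketch that folds the three signature strings from this bug back into a single key. It is not how crash-stats generates signatures. The rename table, the set of frames treated as possibly inlined, and the function name are assumptions made for illustration only.

```python
# Assumed rename, taken from the explanation in this comment.
RENAMES = {
    "js::gc::AllocateCellInGC": "js::gc::AllocateTenuredCellInGC",
}

# Assumed: this frame may disappear from the stack when it is inlined.
MAYBE_INLINED = {
    "js::AutoEnterOOMUnsafeRegion::crash",
}


def canonical_signature(signature):
    """Normalize a ' | '-separated signature string.

    Drops frames that may be inlined away and applies known renames, so
    variants of the same crash compare equal.
    """
    frames = [frame.strip() for frame in signature.split("|")]
    kept = [RENAMES.get(f, f) for f in frames if f not in MAYBE_INLINED]
    return " | ".join(kept)
```

Applied to the three signatures quoted in this bug, the function yields the same string for each, which is consistent with them being one underlying crash.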