Open Bug 1864087 Opened 11 months ago Updated 10 months ago

Crash in [@ js::gc::FreeSpan::allocate]

Categories

(Core :: JavaScript: GC, defect, P3)

x86
All
defect

Tracking Status
firefox121 --- affected

People

(Reporter: release-mgmt-account-bot, Unassigned)

References

(Blocks 1 open bug)

Details

(Keywords: crash, stalled)

Crash Data

Crash report: https://crash-stats.mozilla.org/report/index/6c29cc90-c818-4e64-94da-ebfa20231104

Reason: EXCEPTION_ACCESS_VIOLATION_READ

Top 10 frames of crashing thread:

0  xul.dll  js::gc::FreeSpan::allocate  js/src/gc/Heap.h:120
0  xul.dll  js::gc::FreeLists::allocate  js/src/gc/ArenaList-inl.h:238
0  xul.dll  js::gc::ArenaLists::allocateFromFreeList  js/src/gc/ArenaList-inl.h:306
0  xul.dll  js::gc::AllocateCellInGC  js/src/gc/Allocator.cpp:295
0  xul.dll  js::gc::TenuringTracer::allocTenured  js/src/gc/Tenuring.cpp:528
0  xul.dll  js::gc::TenuringTracer::allocTenuredString  js/src/gc/Tenuring.cpp:533
0  xul.dll  js::gc::TenuringTracer::moveToTenured  js/src/gc/Tenuring.cpp:773
0  xul.dll  js::gc::TenuringTracer::onStringEdge  js/src/gc/Tenuring.cpp:94
1  xul.dll  js::gc::DispatchToOnEdge  js/src/gc/Tracer.h:403
1  xul.dll  js::gc::StoreBuffer::CellPtrEdge<JSString>::trace const  js/src/gc/Tenuring.cpp:444

Querying Nightly crashes reported within the last 2 months gives the following insights about the signature:

  • First crash report: 2023-09-16
  • Process type: Multiple distinct types
  • Is startup crash: No
  • Has user comments: No
  • Is null crash: Yes - 1 out of 7 crashes happened on null or near null memory address
  • Is use after free crash: Yes - 2 out of 7 crashes happened on or near an allocator poison value
Group: core-security → javascript-core-security
Component: General → JavaScript: GC

jonco, when you have time, could you take a look at this bug? We were triaging it today and had difficulty; it needs someone with a little more GC expertise.

Flags: needinfo?(jcoppeard)

Almost all of these crashes happen when we're trying to tenure a nursery cell. We read the zone pointer stored before the cell, but it isn't valid, and we crash. There are various possible reasons for it being invalid: the original pointer could be wrong, or the nursery itself could be corrupted.

Crashes with the 0x4b4b... poison pattern are interesting because this is JS_SWEPT_TENURED_PATTERN, so the crash suggests we're trying to tenure a string that was already in the tenured heap but has been swept. Not sure how that can happen (well, corruption of the chunk header so that the system thinks a tenured chunk is a nursery chunk would be one way).
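
To make the poison-pattern reasoning concrete, here is a minimal sketch of how a crash address could be checked against a repeated poison byte. The byte 0x4b is the one identified above; the helper names, the constant name, and the 0x1000 slack are assumptions made for illustration and are not SpiderMonkey or crash-stats code.

```cpp
#include <cstdint>

// 0x4b is the byte this comment associates with JS_SWEPT_TENURED_PATTERN.
// The constant name is local to this sketch.
constexpr uint8_t kSweptTenuredByte = 0x4b;

// Hypothetical helper: build a value whose low `widthBytes` bytes (at most 8)
// all equal `pattern`, e.g. 0x4b4b4b4b for a 4-byte pointer.
constexpr uint64_t fillPattern(uint8_t pattern, unsigned widthBytes) {
  uint64_t value = 0;
  for (unsigned i = 0; i < widthBytes; i++) {
    value = (value << 8) | pattern;
  }
  return value;
}

// Hypothetical helper: true if `address` lies within `slack` bytes of the
// poison value. The default slack is an arbitrary choice for this sketch; it
// allows for a small field offset being added to a poisoned pointer.
constexpr bool isNearPoison(uint64_t address, uint8_t pattern,
                            unsigned widthBytes, uint64_t slack = 0x1000) {
  const uint64_t poison = fillPattern(pattern, widthBytes);
  const uint64_t distance =
      address > poison ? address - poison : poison - address;
  return distance <= slack;
}
```

For a 32-bit build, isNearPoison(crashAddress, kSweptTenuredByte, 4) would flag an address such as 0x4b4b4b53, which is what a read through a swept tenured cell's field would look like.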

I think this is another heap corruption issue. It's possible there are some genuine problems lurking here, but the signal-to-noise ratio is too low to tell.

It is suggestive that ESR crashes ramped up and then died down a little though. I haven't been able to find a signature shift to explain that.

Flags: needinfo?(jcoppeard)
Crash Signature: [@ js::gc::FreeSpan::allocate] → [@ js::gc::FreeSpan::allocate] [@ js::gc::FreeLists::allocate]

I briefly went through the crashes here. Several are marked as potentially caused by a bit-flip, and many of the affected machines are old. I agree with the reasoning in comment 2: these are likely due to flaky hardware, and if there's something relevant, it's drowned out by the unactionable crashes.
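
For readers unfamiliar with the bit-flip idea, the sketch below shows the basic test: a corrupted value is a bit-flip candidate when it differs from a plausible value in exactly one bit. How crash-stats actually marks a report as a possible bit-flip is not described in this bug, so the function names and the approach here are assumptions for illustration only.

```cpp
#include <cstdint>

// Count the set bits in `x` by clearing the lowest set bit until none remain.
constexpr int countSetBits(uint64_t x) {
  int count = 0;
  while (x != 0) {
    x &= x - 1;
    count++;
  }
  return count;
}

// Hypothetical helper: true if `observed` could be produced from `expected`
// by flipping exactly one bit.
constexpr bool differsByOneBit(uint64_t observed, uint64_t expected) {
  return countSetBits(observed ^ expected) == 1;
}
```

A pointer that is one flipped bit away from a valid heap address points at hardware rather than at a GC bug, which is why such reports are treated as unactionable.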

Keywords: stalled
Severity: -- → S3
Priority: -- → P3

I'm just going to unhide this. It is yet another mysterious GC heap corruption crash.

Group: javascript-core-security