Closed Bug 1406732 Opened 7 years ago Closed 7 years ago

Crash in arena_t::SplitRun | arena_t::AllocRun | je_malloc | js::LifoAlloc::newChunkWithCapacity

Categories

(Core :: JavaScript Engine, defect)

58 Branch
x86
Windows 10
defect
Not set
critical

Tracking


RESOLVED DUPLICATE of bug 1229384
Tracking Status
firefox-esr52 --- unaffected
firefox56 --- unaffected
firefox57 --- unaffected
firefox58 --- fixed

People

(Reporter: calixte, Unassigned)

References

(Blocks 1 open bug)

Details

(Keywords: crash, regression, Whiteboard: [clouseau])

Crash Data

This bug was filed from the Socorro interface and is report bp-25cef28b-6a9a-44bb-a0b1-ee1f80171007.
=============================================================

There are 3 crashes (from the same installation) in nightly 58 with buildid 20171007100142. Based on the backtrace, the regression may have been introduced by patch [1], which landed to fix bug 1405795.

[1] https://hg.mozilla.org/mozilla-central/rev?node=80d704c6678142343050bb95e630a24727a240a7
Flags: needinfo?(nicolas.b.pierron)
This signature already existed prior to my modification of LifoAlloc:
  https://crash-stats.mozilla.com/search/?signature=~arena_t%3A%3ASplitRun&date=%3E%3D2017-04-09T10%3A02%3A44.000Z&date=%3C2017-10-09T10%3A02%3A44.000Z&_sort=-date&_facets=signature&_facets=version&_facets=platform&_facets=platform_version&_facets=available_page_file&_columns=date&_columns=signature&_columns=product&_columns=version&_columns=build_id&_columns=platform#facet-available_page_file

From what I can tell so far, this seems to be an out-of-memory condition that hits the following MOZ_CRASH:
  https://hg.mozilla.org/mozilla-central/annotate/6b0491f83229/memory/build/mozjemalloc.cpp#l1453

Maybe we should return an error code, and propagate this allocation failure up to the LifoAlloc.
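
To make the suggestion concrete, here is a minimal sketch of what propagating the failure could look like. It is not the mozjemalloc or LifoAlloc code: every name below (`gCommitBudget`, `CommitPagesFallible`, `AllocRunFallible`, `NewChunkWithCapacityFallible`) is hypothetical, and the commit step is simulated with a byte budget rather than a real OS call.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Simulated remaining commit space in bytes (hypothetical; stands in for the
// page file headroom that the crash reports show as nearly exhausted).
std::size_t gCommitBudget = 0;

// Hypothetical fallible commit step: reports failure instead of crashing.
bool CommitPagesFallible(std::size_t bytes) {
  if (bytes > gCommitBudget) {
    return false;
  }
  gCommitBudget -= bytes;
  return true;
}

// Hypothetical run allocation: forwards the commit failure as nullptr.
void* AllocRunFallible(std::size_t bytes) {
  if (!CommitPagesFallible(bytes)) {
    return nullptr;
  }
  void* run = std::malloc(bytes);
  if (!run) {
    gCommitBudget += bytes;  // Undo the simulated commit if malloc fails.
  }
  return run;
}

// Hypothetical chunk creation in the spirit of a LIFO allocator: the caller
// sees nullptr and can report OOM through its usual path.
void* NewChunkWithCapacityFallible(std::size_t capacity) {
  assert(capacity > 0);
  return AllocRunFallible(capacity);
}
```

With a budget of 4096 bytes, a first 4096-byte chunk request succeeds and any further request returns nullptr, so the decision about how to handle the OOM moves to the chunk allocator's caller.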

A common symptom across these reports is that they all have an Available Page File on the order of a few MB.
Crash Signature: [@ arena_t::SplitRun | arena_t::AllocRun | je_malloc | js::LifoAlloc::newChunkWithCapacity] → [@ arena_t::SplitRun | arena_t::AllocRun | je_malloc | js::LifoAlloc::newChunkWithCapacity] [@ arena_t::SplitRun | arena_t::AllocRun | je_malloc | js::LifoAlloc::getOrCreateChunk]
Flags: needinfo?(nicolas.b.pierron) → needinfo?(emanuel.hoogeveen)
I will note that this bug only appears on Windows NT, with Firefox 58.0a1, starting with build-id 20170922100051.
This might be related to Bug 1401099.
Blocks: 1401099
No longer blocks: 1405795
Flags: needinfo?(emanuel.hoogeveen) → needinfo?(mh+mozilla)
Crash Signature: [@ arena_t::SplitRun | arena_t::AllocRun | je_malloc | js::LifoAlloc::newChunkWithCapacity] [@ arena_t::SplitRun | arena_t::AllocRun | je_malloc | js::LifoAlloc::getOrCreateChunk] → [@ arena_t::SplitRun | arena_t::AllocRun | je_malloc | js::LifoAlloc::newChunkWithCapacity] [@ arena_t::SplitRun | arena_t::AllocRun | je_malloc | js::LifoAlloc::getOrCreateChunk] [@ arena_t::SplitRun | arena_t::AllocRun | je_malloc]
I think we've been starting to see more of these pages_commit crashes recently. I wonder if the multiple processes from e10s and the increased address space from getting more users onto 64-bit are shifting our OOMs away from per-process limits and toward system-wide limits. It might make sense to start treating pages_commit and similar functions as fallible.
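
As a rough illustration of the per-process versus system-wide distinction, the toy model below tracks both limits and reports which one a commit request runs into. It is only a sketch under simplifying assumptions: the types and the `TryCommit` function are hypothetical, sizes are abstract units, and nothing here reflects how Windows or mozjemalloc actually account for memory.

```cpp
#include <cassert>
#include <cstdint>

// Toy model; every name here is hypothetical.
struct SystemCommit {
  std::uint64_t limit;      // System-wide commit limit (RAM plus page file).
  std::uint64_t committed;  // Total committed by all processes.
};

struct Process {
  std::uint64_t addressSpace;  // Per-process address space limit.
  std::uint64_t used;          // Amount this process has committed.
};

enum class CommitResult { Ok, ProcessLimit, SystemLimit };

// Fallible commit: returns which limit was hit instead of crashing.
CommitResult TryCommit(SystemCommit& sys, Process& proc, std::uint64_t units) {
  assert(proc.used <= proc.addressSpace);
  assert(sys.committed <= sys.limit);
  if (units > proc.addressSpace - proc.used) {
    return CommitResult::ProcessLimit;
  }
  if (units > sys.limit - sys.committed) {
    return CommitResult::SystemLimit;
  }
  proc.used += units;
  sys.committed += units;
  return CommitResult::Ok;
}
```

In this model a single process with a small address space fails with `ProcessLimit`, while several processes with large address spaces exhaust the shared pool first and fail with `SystemLimit`, which matches the shift described above.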
This is not new. What's new is that bug 1401099 changed the crash signature.
https://crash-stats.mozilla.com/signature/?_sort=-date&signature=arena_run_split%20%7C%20arena_run_alloc%20%7C%20je_malloc%20%7C%20js%3A%3ALifoAlloc%3A%3AgetOrCreateChunk%20%7C%20js%3A%3Ajit%3A%3AMBasicBlock%3A%3ANew&date=%3E%3D2017-10-03T00%3A15%3A00.000Z&date=%3C2017-10-10T00%3A15%3A00.000Z

> It might make sense to start treating pages_commit and similar functions as fallible.

While that's true, and per bug 1229384 comment 9 we should do this (I'm actually in the middle of doing it, but I've gone down a rabbit hole refactoring mutexes along the way), that's not going to help much. It will just move where the OOM crashes happen.
Flags: needinfo?(mh+mozilla)
Status: NEW → RESOLVED
Closed: 7 years ago
Resolution: --- → DUPLICATE
Fixed in 58 in bug 1229384. Awesome!