Closed Bug 1344539 Opened 4 years ago Closed 3 years ago

Crash in ProcessExecutableMemory::allocate

Categories

(Core :: JavaScript Engine: JIT, defect, P1)

52 Branch
All
Windows
defect

Tracking

()

RESOLVED FIXED
mozilla60
Tracking Status
firefox-esr45 --- unaffected
firefox52 --- wontfix
firefox-esr52 --- wontfix
firefox53 --- wontfix
firefox54 --- wontfix
firefox55 --- wontfix
firefox56 --- wontfix
firefox57 --- wontfix
firefox58 --- wontfix
firefox59 --- fixed
firefox60 --- fixed

People

(Reporter: philipp, Assigned: jandem)

Details

(Keywords: crash, regression)

Crash Data

Attachments

(2 files)

This bug was filed from the Socorro interface and is 
report bp-2af8c4be-a324-403e-90c2-11a4f2170305.
=============================================================
Crashing Thread (0)
Frame 	Module 	Signature 	Source
0 	xul.dll 	ProcessExecutableMemory::allocate(unsigned int, js::jit::ProtectionSetting) 	js/src/jit/ProcessExecutableMemory.cpp:553
1 	xul.dll 	js::jit::ExecutableAllocator::createPool(unsigned int) 	js/src/jit/ExecutableAllocator.cpp:221
2 	xul.dll 	js::jit::ExecutableAllocator::poolForSize(unsigned int) 	js/src/jit/ExecutableAllocator.cpp:158
3 	xul.dll 	js::jit::Linker::newCode<1>(JSContext*, js::jit::CodeKind, bool) 	js/src/jit/Linker.cpp:36
4 	xul.dll 	js::jit::IonCache::linkCode(JSContext*, js::jit::MacroAssembler&, js::jit::IonCache::StubAttacher&, js::jit::IonScript*, js::jit::JitCode**) 	js/src/jit/IonCaches.cpp:284
5 	xul.dll 	js::jit::IonCache::linkAndAttachStub(JSContext*, js::jit::MacroAssembler&, js::jit::IonCache::StubAttacher&, js::jit::IonScript*, char const*, JS::TrackedOutcome) 	js/src/jit/IonCaches.cpp:316
6 	xul.dll 	js::jit::GetPropertyIC::tryAttachNative(JSContext*, JS::Handle<JSScript*>, js::jit::IonScript*, JS::Handle<JSObject*>, JS::Handle<jsid>, void*, bool*) 	js/src/jit/IonCaches.cpp:1532
7 	xul.dll 	js::jit::GetPropertyIC::tryAttachStub(JSContext*, JS::Handle<JSScript*>, js::jit::IonScript*, JS::Handle<JSObject*>, JS::Handle<JS::Value>, bool*) 	js/src/jit/IonCaches.cpp:2196
8 	mozglue.dll 	mozglue.dll@0x479f

this crash signature on windows 32bit & 64bit builds of firefox was first showing up in 52.0b5 and regularly thereafter. it looks related to the uplifts in bug 1334933
Flags: needinfo?(jdemooij)
Crash reason is MOZ_CRASH(CommitPages failed). All crashes except one were on 32-bit (and the x64 one looked like a potential physical memory OOM). I'm not sure why VirtualAlloc would fail like this, maybe some internal OOM or limit in virtual memory mappings?
Flags: needinfo?(jdemooij)
Interestingly, almost half of the crashes with this crash reason have the "EMPTY: no crashing thread identified; ERROR_NO_MINIDUMP_HEADER" signature [1]. The available physical memory and available virtual memory don't look particularly low for those either (assuming those stats are to be believed), so I'm not sure what to make of them.

While most crashes are on 32-bit Firefox, they seem split between 32-bit and 64-bit Windows (from looking at the total virtual memory).

[1] https://crash-stats.mozilla.com/search/?moz_crash_reason=%3DMOZ_CRASH%28CommitPages%20failed%29&product=Firefox&version=52.0b&version=52.0esr&version=52.0&version=52.0a2&version=52.0a1&version=53.0a2&version=53.0a1&version=54.0a1&date=%3E%3D2017-02-07T00%3A00%3A00.000Z&date=%3C2018-02-06T00%3A00%3A00.000Z&_sort=date&_facets=signature&_facets=total_virtual_memory&_columns=date&_columns=signature&_columns=product&_columns=version&_columns=build_id&_columns=platform#facet-signature
(In reply to Jan de Mooij [:jandem] from comment #1)
> Crash reason is MOZ_CRASH(CommitPages failed). All crashes except one were
> on 32-bit (and the x64 one looked like a potential physical memory OOM). I'm
> not sure why VirtualAlloc would fail like this, maybe some internal OOM or
> limit in virtual memory mappings?

If you have minidump access, can you try "!gle" (GetLastError) on one of the 32-bit reports? I don't remember whether that works in minidumps. If not, maybe we could log GetLastError explicitly.

https://msdn.microsoft.com/en-us/library/windows/desktop/aa366887(v=vs.85).aspx only mentions one error by name, ERROR_INVALID_ADDRESS, but who knows what else may turn up.
Flags: needinfo?(jdemooij)
xc(In reply to Emanuel Hoogeveen [:ehoogeveen] from comment #2)
> Interestingly, almost half of the crashes with this crash reason have the
> "EMPTY: no crashing thread identified; ERROR_NO_MINIDUMP_HEADER" signature
> [1]. The available physical memory and available virtual memory don't look
> particularly low for those either (assuming those stats are to be believed),
> so I'm not sure what to make of them.
> 
> While most crashes are on 32-bit Firefox, they seem split between 32-bit and
> 64-bit Windows (from looking at the total virtual memory).
> 
> [1]
> https://crash-stats.mozilla.com/search/
> ?moz_crash_reason=%3DMOZ_CRASH%28CommitPages%20failed%29&product=Firefox&vers
> ion=52.0b&version=52.0esr&version=52.0&version=52.0a2&version=52.
> 0a1&version=53.0a2&version=53.0a1&version=54.0a1&date=%3E%3D2017-02-
> 07T00%3A00%3A00.000Z&date=%3C2018-02-06T00%3A00%3A00.
> 000Z&_sort=date&_facets=signature&_facets=total_virtual_memory&_columns=date&
> _columns=signature&_columns=product&_columns=version&_columns=build_id&_colum
> ns=platform#facet-signature

For the most part, those crashes ran out of pagefile.
Too late for 52 and likely too late for 53 but we could still take a patch for aurora 54.
dmajor is the implication that they are effective out of virtual memory and this an unactionable terminal OOM from our perspective?
Flags: needinfo?(dmajor)
More likely out of physical memory than virtual, but yes.
Flags: needinfo?(dmajor)
Sounds like it is not actionable, at least for 54.  There are currently about 350 crashes per week with this signature in 53 release.
Marking 57 wontfix since we're a few days away from 57 release. This is still a moderate volume crash on release, around 500 crashes per week.
This jumped in frequency in mid-November, from ~200 per day to ~600 per day.
Firefox 	59.0a1 	141 	11.4% 	37
Firefox 	58.0b13 278 	22.5% 	209
Firefox 	58.0b12 34 	2.8% 	32
Firefox 	58.0b14 22 	1.8% 	20
Firefox 	58.0b10 4 	0.3% 	3
Attached patch Log GetLastErrorSplinter Review
Let's log GetLastError() when the VirtualAlloc in CommitPages fails, as suggested in comment 3. Cargo culted from [0].

[0] https://searchfox.org/mozilla-central/rev/f42618c99dcb522fb674221acfbc68c2d92e7936/js/src/gc/Memory.cpp#859-860
Flags: needinfo?(jdemooij)
Attachment #8940683 - Flags: review?(emanuel.hoogeveen)
Comment on attachment 8940683 [details] [diff] [review]
Log GetLastError

Review of attachment 8940683 [details] [diff] [review]:
-----------------------------------------------------------------

Worth a shot.
Attachment #8940683 - Flags: review?(emanuel.hoogeveen) → review+
Pushed by jandemooij@gmail.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/32371cb3206a
Log GetLastError() when we fail to commit executable pages on Windows. r=ehoogeveen
Keywords: leave-open
Assignee: nobody → jdemooij
So far, all crashes (since the instrumentation landed) have error code 1455. According to [0], that means:

ERROR_COMMITMENT_LIMIT
    1455 (0x5AF)
    The paging file is too small for this operation to complete.

That matches comment 4.

[0] https://msdn.microsoft.com/en-us/library/windows/desktop/ms681385(v=vs.85).aspx
Since this patch landed, crash stat reported the following "moz crash reason" (and the equivalent error reasons[0]):

436	CommitPages failed! Error code: 1455 (ERROR_COMMITMENT_LIMIT: The paging file is too small for this operation to complete.)
1	CommitPages failed! Error code: 87   (ERROR_INVALID_PARAMETER: The parameter is incorrect.)

What would be the next step?

[0] https://msdn.microsoft.com/en-us/library/windows/desktop/ms681381(v=vs.85).aspx
Flags: needinfo?(jdemooij)
Priority: -- → P1
Attached patch PatchSplinter Review
We could try to propagate OOM from CommitPages. This might move (some of) the crashes elsewhere, but ProcessExecutableMemory::allocate already returns nullptr when we allocate too much executable code so it's pretty easy to make this fallible.
Flags: needinfo?(jdemooij)
Attachment #8951566 - Flags: review?(luke)
Comment on attachment 8951566 [details] [diff] [review]
Patch

Review of attachment 8951566 [details] [diff] [review]:
-----------------------------------------------------------------

Makes sense.
Attachment #8951566 - Flags: review?(luke) → review+
Pushed by jandemooij@gmail.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/9680ddb02478
Make committing executable memory fallible. r=luke
Status: NEW → RESOLVED
Closed: 3 years ago
Keywords: leave-open
Resolution: --- → FIXED
Target Milestone: --- → mozilla60
Is this something we should consider backporting to 59?
Flags: needinfo?(jdemooij)
Comment on attachment 8951566 [details] [diff] [review]
Patch

(In reply to Ryan VanderMeulen [:RyanVM] from comment #23)
> Is this something we should consider backporting to 59?

Considering the volume it might make sense.

Approval Request Comment
[Feature/Bug causing the regression]: Bug 1334933.
[User impact if declined]: Crashes on OOM.
[Is this code covered by automated tests?]: Yes.
[Has the fix been verified in Nightly?]: Yes.
[Needs manual test from QE? If yes, steps to reproduce]: No.
[List of other uplifts needed for the feature/fix]: None.
[Is the change risky?]: No.
[Why is the change risky/not risky?]: Only affects what happens when we OOM here.
[String changes made/needed]: None.
Flags: needinfo?(jdemooij)
Attachment #8951566 - Flags: approval-mozilla-beta?
Comment on attachment 8951566 [details] [diff] [review]
Patch

Not a new issue, but fixes a long-running OOM crash. Taking for 59b12.
Attachment #8951566 - Flags: approval-mozilla-beta? → approval-mozilla-beta+
You need to log in before you can comment on or make changes to this bug.