Annotate OOM size for pages_commit (physical memory exhaustion)

RESOLVED FIXED in Firefox 43

Status

()

RESOLVED FIXED
4 years ago
3 years ago

People

(Reporter: dmajor, Assigned: dmajor)

Tracking

39 Branch
mozilla43
Points:
---

Firefox Tracking Flags

(firefox43 fixed)

Details

Attachments

(1 attachment, 1 obsolete attachment)

(Assignee)

Description

4 years ago
The crash signature "moz_abort | pages_commit" indicates a physical memory exhaustion. https://hg.mozilla.org/mozilla-central/file/46dde6cfd303/memory/mozjemalloc/jemalloc.c#l2053

As more pre-release channel users move to 64-bit builds, we're seeing address-space OOMs disappear and physical OOMs rise as physical memory and pagefile become the limiting factors. We should record the size of the failed allocations in crash reports, like we do for address space OOMs.

The naive fix is to call mozalloc_handle_oom() when pages_commit fails, but that didn't link, so I'll need some build-wrangling help. (I don't have the build log in front of me right now; I'll attach it when I do)
(Assignee)

Comment 1

4 years ago
Created attachment 8631757 [details]
attempt 1

Here's my first attempt and the resulting error.
(Assignee)

Comment 2

4 years ago
Glandium, sorry... you proposed a workaround a few weeks ago but I don't remember what it was.
Flags: needinfo?(mh+mozilla)
My irc logs say that I told you I don't care how you hook things up because jemalloc3 doesn't have this code path. Nothing about build failures.
Flags: needinfo?(mh+mozilla)
(Assignee)

Comment 4

4 years ago
I'd appreciate help in getting this build working.
(Assignee)

Comment 5

3 years ago
I should revisit this now that we have jemalloc4. I think the new equivalent is: 
[@ JemallocInit::CommitHook ]
Flags: needinfo?(dmajor)
(Assignee)

Comment 6

3 years ago
After a fresh look, I'm starting to understand why I had trouble building. The proper "jemalloc code" knows nothing about our OOM reporting system. It's only concerned with returning a pointer or null. The translation from null to a crash happens further up, in the wrappers in memory/mozalloc/mozalloc.cpp. My attempt to call mozalloc's abort handler from the guts of jemalloc was a layering violation, so no surprise that it didn't work.
Flags: needinfo?(dmajor)
(Assignee)

Comment 7

3 years ago
So, for MEM_COMMIT failures, instead of crashing deep inside jemalloc, it would be nice to bubble up a null pointer and let the infallibility-wrappers do the crashing. That would also conveniently handle the size annotation. And it would allow fallible allocations to survive a MEM_COMMIT failure.

Glandium: I notice that the CommitHook returns a bool. Instead of crashing, could we use that return value to indicate failure?
Flags: needinfo?(mh+mozilla)
Yes
Flags: needinfo?(mh+mozilla)
(Assignee)

Comment 9

3 years ago
Created attachment 8660132 [details] [diff] [review]
Allow the CommitHook to fail

I'm somewhat scared of this change, because I have no way to see its effects. There's no test coverage, since any test whose behavior this changes would have previously crashed.

At first glance jemalloc _looks_ like it can handle this failure, but I don't know the code very well. (What happens to that run?)

If this messes up some state, how will that appear? Random crashes later?
Assignee: nobody → dmajor
Attachment #8631757 - Attachment is obsolete: true
Attachment #8660132 - Flags: review?(mh+mozilla)
Attachment #8660132 - Flags: review?(mh+mozilla) → review+
https://hg.mozilla.org/mozilla-central/rev/408c024f3dc0
Status: NEW → RESOLVED
Last Resolved: 3 years ago
status-firefox43: --- → fixed
Resolution: --- → FIXED
Target Milestone: --- → mozilla43
You need to log in before you can comment on or make changes to this bug.