Retire mozalloc_handle_oom()
Categories
(Core :: Memory Allocator, task)
Tracking
()
People
(Reporter: gsvelto, Unassigned)
References
(Blocks 1 open bug)
Details
We had introduced the mozalloc_handle_oom() function to let user code try and recover for OOM conditions, however as of now the only thing it does is crashing the calling process with a message. I think we can safely retire this after the changes we introduced in bug 1716727 to reduce OOM crashes on Windows. The reasoning behind this is the following:
- On Windows it's not useful anymore, as we handle recovering from OOM conditions directly inside the allocation functions
- On Linux it's almost unused as Linux processes are killed by the kernel when running out of memory. We have a very small number of crashes on file caused via
mozalloc_handle_oom()by they almost only affect 32-bit builds running out of address space, not a very interesting scenario and a vanishingly small one too. - On macOS we never run out of memory, so it's effectively unused.
I suggest removing it entirely from the callers that expect it to recover from OOM and just use infallible allocations instead (as they'd do the right thing). For other callers we could replace it with NS_ABORT_OOM() or implement it in the same way and remove the calls to NS_ABORT_OOM() (as there would be no point in having two functions doing the same thing). The only other adjustment I'd make is to make sure that the allocation size is never 0, for when it's 0 we don't flag the crash as an OOM crash which leads to false negatives.
Comment 1•3 years ago
|
||
Hello,
I'd be happy to start working on this bug soon. I have a question about this final note:
(In reply to Gabriele Svelto [:gsvelto] from comment #0)
The only other adjustment I'd make is to make sure that the allocation size is never 0, for when it's 0 we don't flag the crash as an OOM crash which leads to false negatives.
Callers of mozalloc_handle_oom and NS_ABORT_OOM seem to use 0 currently when they don't know the allocation size, so this looks like something we should fix indeed. Should we thus have two special values for gOOMAllocationSize: 0 if it's not an OOM, and SIZE_MAX if the allocation size is not known? And then we would crash differently if we happen to call NS_ABORT_OOM with 0? I saw that gOOMAllocationSize is also read from rust, so I guess having special values looks like a simpler adjustment to make compared to, say, changing the type?
Comment 2•3 years ago
|
||
If you are changing the meaning of OOM allocation size, you'll have to coordinate with Soccorro as it affects the signature. Specifically, it uses it to turn the signature into either OOM | small or OOM | large | someFunction.
| Reporter | ||
Comment 3•3 years ago
|
||
(In reply to Yannis Juglaret from comment #1)
Callers of
mozalloc_handle_oomandNS_ABORT_OOMseem to use 0 currently when they don't know the allocation size, so this looks like something we should fix indeed. Should we thus have two special values forgOOMAllocationSize: 0 if it's not an OOM, andSIZE_MAXif the allocation size is not known? And then we would crash differently if we happen to callNS_ABORT_OOMwith 0? I saw thatgOOMAllocationSizeis also read from rust, so I guess having special values looks like a simpler adjustment to make compared to, say, changing the type?
The way crashes are handled on crash-stats if the value of the OOMAllocationSize annotation is less than 256 KiB then the crash will be lumped together with all other "small" OOM crashes, if it's larger then it gets it's own "OOM | large" signature (see the code in socorro that handles this).
Note that we have a concept of OOMs of unknown size so we could leverage that: in cases where we really don't have the OOM allocation size we could leave it as zero and crash with mozalloc_abort(). Then it would be just a matter of adjusting Socorro to recognize those crashes (by adding mozalloc_abort() here) and they would be classified as OOM | unknown because the allocation size is absent. I can do that part. So to recap:
- Where possible crash with the allocation size
- Where not possible crash with
mozalloc_abort(0)and we'll deal with it server-side
| Reporter | ||
Comment 4•3 years ago
|
||
Actually there's something that just occurred to me which puts a stake in that plan: mozalloc_abort(0) is also used for regular calls to abort(), see bug 1771212. So we can't just flag all the crashes that call mozalloc_abort(0) as OOM because some aren't. Meh.
| Reporter | ||
Comment 5•3 years ago
|
||
After thinking a bit more about this I'd say let's go ahead with replacing all the uses including the ones where we need to call mozalloc_abort(0), we'll figure out how to tell them apart at a later time. No need to block on them.
Description
•