Last Comment Bug 716638 - mozalloc_handle_oom should propogate the failed allocation size into crash reports
: mozalloc_handle_oom should propogate the failed allocation size into crash re...
Status: RESOLVED FIXED
:
Product: Core
Classification: Components
Component: General (show other bugs)
: Trunk
: All All
: -- normal (vote)
: mozilla12
Assigned To: Benjamin Smedberg [:bsmedberg]
:
Mentors:
Depends on:
Blocks: 725024 733262 669108
  Show dependency treegraph
 
Reported: 2012-01-09 12:35 PST by Ted Mielczarek [:ted.mielczarek]
Modified: 2013-01-08 09:14 PST (History)
6 users (show)
See Also:
Crash Signature:
(edit)
QA Whiteboard:
Iteration: ---
Points: ---
Has Regression Range: ---
Has STR: ---


Attachments
Annotate with the size before crashing due to OOM, unsafe WIP (7.68 KB, patch)
2012-01-09 13:41 PST, Benjamin Smedberg [:bsmedberg]
no flags Details | Diff | Splinter Review
Annotate using a static var, rev. 1 (8.02 KB, patch)
2012-01-21 11:57 PST, Benjamin Smedberg [:bsmedberg]
ted: review-
Details | Diff | Splinter Review
Annotate using a static var, rev. 2 (16.31 KB, patch)
2012-01-23 14:38 PST, Benjamin Smedberg [:bsmedberg]
no flags Details | Diff | Splinter Review
Annotate using a static var, rev. 2.1 (17.50 KB, patch)
2012-01-24 03:47 PST, Benjamin Smedberg [:bsmedberg]
ted: review+
Details | Diff | Splinter Review
google-breakpad patch (794 bytes, patch)
2012-01-24 08:59 PST, Benjamin Smedberg [:bsmedberg]
ted: review+
Details | Diff | Splinter Review

Description Ted Mielczarek [:ted.mielczarek] 2012-01-09 12:35:03 PST
We should figure out a way to get the failed allocation size into crash reports from mozalloc_handle_oom. Currently it's hard to tell if we're OOMing from memory/address space exhaustion, or from a single large failed allocation (where we might want to change the caller to use fallible alloc).
Comment 1 Benjamin Smedberg [:bsmedberg] 2012-01-09 13:27:20 PST
<bsmedberg> ted: I was going to implement the mozalloc abort size annotation
<bsmedberg> but then I noticed that if we're really low on memory at that point...
<bsmedberg> we shouldn't be calling AppendAppNotesToCrashReport because it allocates
<bsmedberg> unless all of AppendAppNotesToCrashReport is OOM-safe

Not sure what to do here...
Comment 2 Benjamin Smedberg [:bsmedberg] 2012-01-09 13:41:09 PST
Created attachment 587111 [details] [diff] [review]
Annotate with the size before crashing due to OOM, unsafe WIP

As noted in the patch, this is incomplete and unsafe and the whole approach is probably flawed, but I thought I'd attach it for record keeping.
Comment 3 Ted Mielczarek [:ted.mielczarek] 2012-01-09 15:27:44 PST
I mentioned this in IRC, but I'll repeat it here:
What I'd probably do is just add a new API for it. Add a static  uint64_t OOMAllocationSize = 0; and SetOOMAllocationSize(size_t), and write it out specially like we do with a few other things:
http://mxr.mozilla.org/mozilla-central/source/toolkit/crashreporter/nsExceptionHandler.cpp#427
Comment 4 Chris Jones [:cjones] inactive; ni?/f?/r? if you need me 2012-01-09 16:44:34 PST
Note that that interface technically needs to be thread-safe, but if we hit a small-alloc OOM then stomping that value from another small-alloc failure doesn't matter too much (as long as the read/write are atomic).
Comment 5 Benjamin Smedberg [:bsmedberg] 2012-01-21 11:57:11 PST
Created attachment 590502 [details] [diff] [review]
Annotate using a static var, rev. 1

This totally ignores child processes but I'm pretty sure it should work for the main process.
Comment 6 Ted Mielczarek [:ted.mielczarek] 2012-01-23 13:08:40 PST
Comment on attachment 590502 [details] [diff] [review]
Annotate using a static var, rev. 1

Review of attachment 590502 [details] [diff] [review]:
-----------------------------------------------------------------

::: memory/mozalloc/mozalloc_oom.cpp
@@ +60,5 @@
>      mozalloc_abort("out of memory");
>  }
> +
> +void
> +mozalloc_set_oom_abort_handler(mozalloc_oom_abort_handler handler)

I don't see anywhere in this patch that you actually call this.

::: toolkit/crashreporter/nsExceptionHandler.cpp
@@ +333,5 @@
>  
>    return str;
>  }
>  
> +static size_t gOOMAllocationSize;

You want to initialize this, right?

@@ +478,5 @@
> +        _ui64toa(gOOMAllocationSize, buffer, 10);
> +        bufferLen = strlen(buffer);
> +        WriteFile(hFile, buffer, bufferlen, &nBytes, NULL);
> +        WriteFile(hFile, "\n", 1, &nBytes, NULL);
> +      }

You only handled the Windows case here, not the POSIX case.
Comment 7 Benjamin Smedberg [:bsmedberg] 2012-01-23 14:38:49 PST
Created attachment 590880 [details] [diff] [review]
Annotate using a static var, rev. 2
Comment 8 Benjamin Smedberg [:bsmedberg] 2012-01-24 03:47:35 PST
Created attachment 591054 [details] [diff] [review]
Annotate using a static var, rev. 2.1

This time with feeling! (and tests that pass)
Comment 9 Ted Mielczarek [:ted.mielczarek] 2012-01-24 04:38:46 PST
Comment on attachment 591054 [details] [diff] [review]
Annotate using a static var, rev. 2.1

Review of attachment 591054 [details] [diff] [review]:
-----------------------------------------------------------------

::: toolkit/crashreporter/google-breakpad/src/common/linux/linux_libc_support.h
@@ +109,1 @@
>    if (!i)

This change needs to wind up upstream. Can you give me a patch against Breakpad SVN?

::: toolkit/crashreporter/nsExceptionHandler.cpp
@@ +338,5 @@
>  
>    return str;
>  }
>  
> +static size_t gOOMAllocationSize;

Still needs to be initialized...

::: toolkit/crashreporter/test/unit/test_crash_oom.js
@@ +12,5 @@
> +    },
> +    function(mdump, extra) {
> +      do_check_eq(extra.TestingOOMCrash, "Yes");
> +      do_check_true("OOMAllocationSize" in extra);
> +      do_check_true(Number(extra.OOMAllocationSize) > 0);

Thanks for adding a test. I think you should also add a test that does a non-OOM crash, and checks that "OOMAllocationSize" is not present.
Comment 10 Benjamin Smedberg [:bsmedberg] 2012-01-24 08:59:51 PST
Created attachment 591120 [details] [diff] [review]
google-breakpad patch
Comment 11 Benjamin Smedberg [:bsmedberg] 2012-01-24 10:34:22 PST
https://hg.mozilla.org/integration/mozilla-inbound/rev/4d6cb73f892b
Comment 12 Ted Mielczarek [:ted.mielczarek] 2012-01-24 11:05:18 PST
You should probably file a Socorro bug about getting this exposed in the UI so we don't have to grovel around in the raw JSON for it.
Comment 13 Benjamin Smedberg [:bsmedberg] 2012-01-24 13:27:49 PST
I added this field to the request in bug 669108.
Comment 14 :Ms2ger (⌚ UTC+1/+2) 2012-01-25 07:28:12 PST
https://hg.mozilla.org/mozilla-central/rev/4d6cb73f892b
Comment 15 Ted Mielczarek [:ted.mielczarek] 2012-01-27 04:43:32 PST
Comment on attachment 591120 [details] [diff] [review]
google-breakpad patch

Review of attachment 591120 [details] [diff] [review]:
-----------------------------------------------------------------

I'll land this upstream for you in a minute.
Comment 16 Ted Mielczarek [:ted.mielczarek] 2012-01-27 04:53:13 PST
http://code.google.com/p/google-breakpad/source/detail?r=909
Comment 17 Gian-Carlo Pascutto [:gcp] 2013-01-08 08:54:20 PST
Did we regress this? We now crash in nsTArrayInfallibleAllocator::SizeTooBig with no info on the size of the allocation.
Comment 18 Benjamin Smedberg [:bsmedberg] 2013-01-08 09:14:39 PST
No, this has not regressed. nsTArrayInfallibleAllocator doesn't use the infallible malloc API but only calls mozalloc_abort directly, so there is no size information to put in the crash report. See also bug 811483 which intends to annotate whenever an allocation fails, regardless of whether it was an infallible alloc or not.

Note You need to log in before you can comment on or make changes to this bug.