Closed
Bug 644707
Opened 13 years ago
Closed 13 years ago
Avoid calling system abort(), generate useful stack traces through a manual segfault instead
Categories
(Core :: XPCOM, defect)
Tracking
()
RESOLVED
FIXED
mozilla8
People
(Reporter: jdm, Assigned: azakai)
References
Details
Attachments
(1 file, 1 obsolete file)
1.89 KB,
patch
|
ted
:
review+
|
Details | Diff | Splinter Review |
abort can drag us into libc and screw up backtraces on tinderbox. I've never seen JS_ASSERT fail, so let's use their method for mozalloc_abort.
Reporter | ||
Comment 1•13 years ago
|
||
Comment 2•13 years ago
|
||
raise(SIGABRT) is not caught by the crash reporter, which is why mozalloc_abort uses it only after actually trying to crash first.
Comment 3•13 years ago
|
||
Or maybe I'm wrong, ted knows I think.
Reporter | ||
Comment 4•13 years ago
|
||
Comment on attachment 521576 [details] [diff] [review] Raise an abort signal instead of calling abort. Oh, boo.
Attachment #521576 -
Attachment is obsolete: true
Comment 5•13 years ago
|
||
I really don't remember, honestly. I'm pretty sure whatever we're doing for JS_ASSERT works on tinderbox. It's not hard to test, though, just stick one in somewhere and run with MOZ_CRASHREPORTER=1 in your environment, and see if you get the crash reporter.
Assignee | ||
Comment 6•13 years ago
|
||
I tested on Linux, and both |abort()| and |raise(SIGABRT)| trigger the crash reporter. So that part seems fine. The issue with SIGABRT not caught is apparently limited to OS X, as the source comment says in that file. The patch doesn't change anything on OS X, so should be fine, I am in favor of moving forward with it. But I do have some other concerns: 1. There are other places that call abort(), not just mozalloc_abort. Should we change all of them? Also, some are in things like jemalloc, chromium ipc code, etc. - not our core codebase. What do we want to do about those? 2. I didn't actually check if crash report stack traces look better with this. One thing that concerns me is that we will still have a stack trace inside libc, won't we? Just it will be inside raise() instead of abort(). That is presumably less deep (if abort calls raise), but it doesn't seem guaranteed to work, unless I am missing something?
Comment 7•13 years ago
|
||
The other possibility here is just to use TouchBadMemory() like we do on OS X. That will give us a reliable crash with a stack entirely in libxul.
Assignee | ||
Comment 8•13 years ago
|
||
This patch overrides the system abort() with a redirect to mozalloc_abort, in order to catch all calls to abort() in the browser. It also tries to touch bad memory first, then raise(SIGABORT) (except on OS X where this is known not to work), then _exit, in that order. This seems like the safest thing to do, or can we do better somehow?
Attachment #544339 -
Flags: review?(ted.mielczarek)
Reporter | ||
Comment 9•13 years ago
|
||
I went with raise(SIGABORT) because every crash stack I've seen using JS_ASSERT has worked just fine.
Assignee | ||
Comment 10•13 years ago
|
||
The patch does try raise(SIGABORT), after trying to manually segfault. Manual segfaults seem most likely to work since the stack trace will be entirely in our code. But in any case, I am fine with either way, both seem likely to be improvements over the current situation. ted, which do you prefer?
Comment 11•13 years ago
|
||
I'm in favor of whatever gets us the best stack traces. Have you tested this on Android to see what the stacks look like?
Assignee | ||
Comment 12•13 years ago
|
||
I don't have access to an Android device at the moment, I should be able to test this in a few days.
Updated•13 years ago
|
Assignee: nobody → azakai
Comment 13•13 years ago
|
||
Comment on attachment 544339 [details] [diff] [review] patch v2 Review of attachment 544339 [details] [diff] [review]: ----------------------------------------------------------------- Seems reasonable. ::: memory/mozalloc/mozalloc_abort.cpp @@ +81,2 @@ > > + // On *NIX platforms the preffered way to abort is by touching bad memory, "prefered"
Attachment #544339 -
Flags: review?(ted.mielczarek) → review+
Assignee | ||
Updated•13 years ago
|
Summary: Avoid calling abort() and use raise(SIGABRT) instead → Avoid calling system abort(), generate useful stack traces through a manual segfault instead
Assignee | ||
Comment 14•13 years ago
|
||
Pushed to m-i http://hg.mozilla.org/integration/mozilla-inbound/rev/f986c9b02c1d For posterity's sake, we talked on irc and ted ran some tests which showed that we get useful stack data using manual segfaults, but not with raise(SIGABORT) or abort(), confirming the reasoning before.
Comment 15•13 years ago
|
||
http://hg.mozilla.org/mozilla-central/rev/f986c9b02c1d
Status: NEW → RESOLVED
Closed: 13 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla8
Comment 16•13 years ago
|
||
I wondered what the reason was for interposing the global abort symbol even for aborts from within system libraries, because we end up depending on stack scanning through those system libraries anyway, but I guess it may be useful as it means that all aborts will all have the same signature instead of different operating systems having slightly different offsets within linux-gate.so.
Comment 17•13 years ago
|
||
Note that we could also add symbols to our symbol store for linux-gate. Breakpad has some outdated ones in the source tree: http://code.google.com/p/google-breakpad/source/browse/#svn%2Ftrunk%2Fsrc%2Fclient%2Flinux%2Fdata I landed a patch recently that made linux-gate get a reasonable debug ID again, but I didn't take the time to generate symbol files for it.
You need to log in
before you can comment on or make changes to this bug.
Description
•