Closed
Bug 148453
Opened 23 years ago
Closed 21 years ago
Mozilla no longer dumps core on SEGV signal!
Categories
(Core Graveyard :: Profile: BackEnd, defect)
Tracking
(Not tracked)
RESOLVED
FIXED
People
(Reporter: mozilla-bugs, Assigned: ccarlen)
References
Details
(Keywords: dataloss, regression)
Attachments
(2 files, 3 obsolete files)
1.46 KB,
patch
|
bryner
:
review-
dbaron
:
superreview+
|
Details | Diff | Splinter Review |
1.30 KB,
patch
|
brendan
:
review+
dbaron
:
superreview+
|
Details | Diff | Splinter Review |
I've noticed that Mozilla have recently stopped dumping the core when it
crashes; instead it just exits with exit status 11. I believe that this was
caused by the bug 76431 checkin - the SEGV signal gets intercepted by the
profile manager which then deletes the lock file and exits and core is never dumped!
This is pretty bad. It makes it significantly harder to investigate unexpected
(e.g. those that I didn't cause intentially when investigating a previous crash)
crashes.
Reporter | ||
Comment 1•23 years ago
|
||
This is a recent regression. It's also a dataloss - important stack trace
information gets lost instead of being saved in a core file.
Keywords: dataloss,
regression
Updated•23 years ago
|
Keywords: mozilla1.0,
nsbeta1
Comment 2•23 years ago
|
||
This will also presumably affect the stack-trace-on-crash hack which dbaron will
check in one of these days. The code chains signal handlers, so as long as we
run after talkback is initialied, talkback should still work. If the profile
code runs before, then the lock files probably aren't cleaned up - anyone with a
talkback build (or the comm source) want to test that?
Can we reset the sigaction to SIG_DFL, and then raise() from a signal handler?
Or should the final call be to abort(), not _exit ?
Keywords: mozilla1.0.1
Comment 3•23 years ago
|
||
re-assign and cc some people from bug 76431
Assignee: Matti → ccarlen
Component: Browser-General → Profile Manager BackEnd
QA Contact: imajes-qa → ktrina
Comment 4•23 years ago
|
||
I wonder whether this affects talkback as well?
Comment 5•23 years ago
|
||
this doesn't seem to affect talkback, but it prevents platforms without talkback
from getting stacktraces on unreproducible crashes.
Reporter | ||
Comment 7•22 years ago
|
||
Any progress on this? A recent Mozilla build (2002122113) crashed twice on me
over the last two days without me doing anything distinctive, and I have no idea
what's wrong. :-( It would be *really* nice to get the core dumps back!
Depends on: 76431
Keywords: mozilla1.3
Reporter | ||
Comment 8•22 years ago
|
||
This patch ammends the current nsProfileAccess signal handler to check whether
the old handler was SIG_DFL and if so, then restore that SIG_DFL sigaction and
re-raise the signal to make sure the default action is executed.
This ensures that the core is actually dumped on SIGSEGV/SIGABRT/etc.
Reporter | ||
Comment 9•22 years ago
|
||
Comment on attachment 110048 [details] [diff] [review]
When old action was SIG_DFL, make sure that default handler is actually executed.
Please review. Thanks!
Attachment #110048 -
Flags: superreview?
Attachment #110048 -
Flags: review?
Reporter | ||
Comment 10•22 years ago
|
||
Comment on attachment 110048 [details] [diff] [review]
When old action was SIG_DFL, make sure that default handler is actually executed.
Oops, made a typo, fixing.
Attachment #110048 -
Attachment is obsolete: true
Attachment #110048 -
Flags: superreview?
Attachment #110048 -
Flags: review?
Reporter | ||
Comment 11•22 years ago
|
||
Fixed a small typo in the patch.
Reporter | ||
Updated•22 years ago
|
Attachment #110116 -
Flags: superreview?
Attachment #110116 -
Flags: review?
Attachment #110116 -
Flags: superreview?(dveditz)
Attachment #110116 -
Flags: superreview?
Attachment #110116 -
Flags: review?(blizzard)
Attachment #110116 -
Flags: review?
Comment 12•22 years ago
|
||
does this maybe fix bug 176886?
Reporter | ||
Comment 13•22 years ago
|
||
I doubt it (although I do not really know how the talkback works). My code only
makes sure that the default handler is executed properly when there was no
non-default handler specified before the profile was locked. If talkback is
implemented as a signal handler, then profile locking should start with
non-default handler and my code shouldn't change anything...
But if something really strange is going with signal handling, it might help, I
do not know.
Comment 14•22 years ago
|
||
By the definitions on <http://bugzilla.mozilla.org/bug_status.html#severity> and
<http://bugzilla.mozilla.org/enter_bug.cgi?format=guided>, crashing and dataloss
bugs are of critical or possibly higher severity. Only changing open bugs to
minimize unnecessary spam. Keywords to trigger this would be crash, topcrash,
topcrash+, zt4newcrash, dataloss.
Severity: major → critical
Updated•22 years ago
|
Flags: blocking1.3b? → blocking1.3b-
Blocks: 176886
Updated•22 years ago
|
QA Contact: ktrina → gbush
Reporter | ||
Comment 15•22 years ago
|
||
Updating for bitrot.
Attachment #110116 -
Attachment is obsolete: true
Reporter | ||
Comment 16•22 years ago
|
||
Comment on attachment 113656 [details] [diff] [review]
When original action was SIG_DFL, make sure the default handler is still executed before we exit.
It's a pretty simple fix, please review.
Attachment #113656 -
Flags: superreview?(dveditz)
Attachment #113656 -
Flags: review?(blizzard)
Reporter | ||
Updated•22 years ago
|
Attachment #110116 -
Flags: superreview?(dveditz)
Attachment #110116 -
Flags: review?(blizzard)
Reporter | ||
Comment 17•22 years ago
|
||
Comment on attachment 113656 [details] [diff] [review]
When original action was SIG_DFL, make sure the default handler is still executed before we exit.
Blizzard thinks dbaron should look at it.
Attachment #113656 -
Flags: review?(blizzard) → review?(dbaron)
Comment 18•22 years ago
|
||
removing obsolete mozilla1.0 keywords and the misleading "dataloss" (which
applies to losing user data not an inability to get debugging core dumps).
Accordingly bumping severity back down to normal, but it would be nice to get fixed.
Severity: critical → normal
Keywords: dataloss,
mozilla1.0,
mozilla1.0.1
Comment on attachment 113656 [details] [diff] [review]
When original action was SIG_DFL, make sure the default handler is still executed before we exit.
> struct sigaction act, oldact;
> act.sa_handler = FatalSignalHandler;
>- act.sa_flags = 0;
>+ act.sa_flags = SA_NOMASK;
> sigfillset(&act.sa_mask);
I'm surprised this does anything useful, given that |man sigaction| seems to
suggest that, with the full |sa_mask|, it won't, since "In addition [to
sa_mask], the signal which triggered the handler will be blocked, unless the
SA_NODEFER or SA_NOMASK flags are used." But I'm guessing that one of them is
undone by the new |sigaction| call and the other isn't, or something like that?
But it looks like this code was moved in revision 1.80. Could you attach a new
patch? Sorry to take so long to get to this...
Attachment #113656 -
Flags: superreview?(dveditz)
Attachment #113656 -
Flags: superreview+
Attachment #113656 -
Flags: review?(dbaron)
Attachment #113656 -
Flags: review?(ccarlen)
Assignee | ||
Comment 21•22 years ago
|
||
Comment on attachment 113656 [details] [diff] [review]
When original action was SIG_DFL, make sure the default handler is still executed before we exit.
>@@ -2131,7 +2135,7 @@
> // because mozilla is run via nohup.
> struct sigaction act, oldact;
> act.sa_handler = FatalSignalHandler;
>- act.sa_flags = 0;
>+ act.sa_flags = SA_NOMASK;
I moved the patch to where this code is now. Problem is, SA_NOMASK doesn't
exist on OS X. Then probably not BSD either? If this is really needed, you need
to figure out the #ifdefs.
Attachment #113656 -
Flags: review?(ccarlen) → review-
Attachment #113656 -
Flags: superreview+ → superreview-
Comment 22•21 years ago
|
||
I disagree with comment #18---"dataloss" is not necessarily
misleading, it can be right on the point because there
is _no_ autosave (see bug 16359 and bug 16360).
My wife lost several hours of work yesterday and I failed
to find any of it in kernel memory (not sure why, probably
destroyed it while making a copy of it; I've been successful
before with other programs crashing). A coredump, I believe,
would have allowed me to recover her text fairly easily.
My Mozilla 1.4 for GNU/Linux does not dump core regardless of
whether the talkback agent is or is not in use. If people want
coredumps (which they show by raising the ulimit for it from
the normally default 0), then they should get coredumps. Please
fix this. It may be a weird way to recover data but it will be
something. Autosave hasn't happened for 3+ years already. This
should be easier to fix, I think.
Updated•21 years ago
|
Severity: normal → critical
Flags: blocking1.6a?
Flags: blocking1.5?
Flags: blocking1.4.2?
Flags: blocking1.4.1?
Keywords: mozilla1.3 → dataloss
Comment 23•21 years ago
|
||
There's no need to set blocking1.6a yet, the trunk is still open for checkins.
/be
Status: NEW → ASSIGNED
Flags: blocking1.6a?
Updated•21 years ago
|
Flags: blocking1.5?
Flags: blocking1.4.1?
Comment 24•21 years ago
|
||
Attachment #113656 -
Attachment is obsolete: true
Comment 25•21 years ago
|
||
Comment on attachment 133812 [details] [diff] [review]
compiles on solaris and linux
With the patch:
$ ./run-mozilla.sh ./xpcshell
js> Components.classes['@mozilla.org/generic-factory;1'].createInstance()
Segmentation Fault - core dumped
$ uname -a
SunOS DEATHSTARII 5.9 Generic_112233-08 sun4u sparc SUNW,Ultra-80
Yes I have a patch for generic factory too, but it's known broken today, so it
was convenient :)
Attachment #133812 -
Flags: superreview?(dbaron)
Attachment #133812 -
Flags: review?(bryner)
Comment on attachment 133812 [details] [diff] [review]
compiles on solaris and linux
Sure, sr=dbaron, but don't invent new bracing styles.
...
} else if (...)
{
...
isn't normal. Prevailing style here suggests adding a break before the else
and a break before the { following the if.
Attachment #133812 -
Flags: superreview?(dbaron) → superreview+
Comment 27•21 years ago
|
||
Comment on attachment 133812 [details] [diff] [review]
compiles on solaris and linux
And how about space after , in argument lists (sigaction)?
/be
Comment 28•21 years ago
|
||
Comment on attachment 133812 [details] [diff] [review]
compiles on solaris and linux
I'm a little concerned about signal handling recursion should we ever crash in
the signal handler, due to NODEFER being used. Could you instead just unblock
the signal before calling raise() by calling sigprocmask() ?
Comment 29•21 years ago
|
||
unblock the signal in question just before invoking the default handler
Comment 30•21 years ago
|
||
Comment on attachment 134344 [details] [diff] [review]
this is what i had in mind
>+ if (oldact->sa_handler == SIG_DFL) {
>+ // Make sure the default sig handler is executed
period at end of comment sentence.
>+ // We need it to get Mozilla to dump core.
>+ sigaction(signo,oldact,NULL);
>+
>+ // Now that we've restored the default handler, unmask the
>+ // signal and invoke it.
>+
Why the extra blank line here, but not previously. I'd say axe this one, but
in any event be consistent.
>+ sigset_t unblock_sigs;
>+ sigemptyset(&unblock_sigs);
>+ sigaddset(&unblock_sigs, signo);
>+
>+ sigprocmask(SIG_UNBLOCK, &unblock_sigs, NULL);
>+
>+ raise(signo);
>+ } else if (oldact->sa_handler && oldact->sa_handler != SIG_IGN)
>+ {
>+ oldact->sa_handler(signo);
>+ }
> }
>
> // Backstop exit call, just in case.
Sorry for the drive-by nit-picking, it looks good to me otherwise. Timeless,
darin?
/be
Comment 31•21 years ago
|
||
Comment on attachment 134344 [details] [diff] [review]
this is what i had in mind
this patch makes good sense to me.
to be nit picky, the brace style is not consistent near the "else" and there
should be spaces in the parameter list of sigaction ;-)
r/sr=darin fwiw
Comment 32•21 years ago
|
||
Comment on attachment 133812 [details] [diff] [review]
compiles on solaris and linux
Minusing, I prefer the patch I attached that doesn't use SA_NODEFER...
Attachment #133812 -
Flags: review?(bryner) → review-
Comment 33•21 years ago
|
||
Comment on attachment 134344 [details] [diff] [review]
this is what i had in mind
Brendan, what do you think?
Attachment #134344 -
Flags: review?(brendan)
Comment 34•21 years ago
|
||
Comment on attachment 134344 [details] [diff] [review]
this is what i had in mind
Fix the nit mentioned in comment 27 and r/sr=me.
/be
Attachment #134344 -
Flags: review?(brendan) → review+
Updated•21 years ago
|
Attachment #134344 -
Flags: superreview?(dbaron)
Comment on attachment 134344 [details] [diff] [review]
this is what i had in mind
>+ } else if (oldact->sa_handler && oldact->sa_handler != SIG_IGN)
>+ {
Put the |else| on a new line after the }, make sure someone's tested this, and
sr=dbaron.
Attachment #134344 -
Flags: superreview?(dbaron) → superreview+
Comment 36•21 years ago
|
||
checked in.
Status: ASSIGNED → RESOLVED
Closed: 21 years ago
Resolution: --- → FIXED
Comment 37•21 years ago
|
||
Brian,
Can you check the correct version of this patch into 1.4.2 and mark fixed1.4.2
please?
Flags: blocking1.4.2? → blocking1.4.2+
Updated•9 years ago
|
Product: Core → Core Graveyard
You need to log in
before you can comment on or make changes to this bug.
Description
•