1309230 - Do not use raise in UnixExceptionHandler

Reporter

Description

•

8 years ago

15:21:57 from decoder:
> did we change anything about how signals are handled in the JS engine
> i keep seeing these stack frames now for many crashes or assertion failures:
> # 00    raise
> # 01    js::UnixExceptionHandler
> # 02    AsmJSFaultHandler<(Signal)0>
> # 03    <signal handler called>
> + the original stack then
> and i ve never seen that before

From the source of both SEGV handler that we have in SpiderMonkey, I can notice that we use different mechanism for forwarding the issue.  Apparently "raise" [1] causes the SEGV-handler to remain on the stack, as opposed to the sigaction [2] call made in the bound-check handler used by Wasm/Asm.js.

[1] http://searchfox.org/mozilla-central/rev/e8dd21b1673c07307068303ab3e99360c78f4d12/js/src/ds/MemoryProtectionExceptionHandler.cpp#253
[2] http://searchfox.org/mozilla-central/rev/e8dd21b1673c07307068303ab3e99360c78f4d12/js/src/asmjs/WasmSignalHandlers.cpp#1197

Luke Wagner [:luke]

Comment 1

•

8 years ago

Yeah, I'm not sure why raise() is used instead of what WasmSignalHandlers.cpp is doing.

Unrelated, I noticed that sHandlingException won't have effect unless SA_NODEFER is passed to sigaction().

Also, didn't we discuss making this feature #ifdef NIGHTLY?

Emanuel Hoogeveen [:ehoogeveen]

Assignee

Comment 2

•

8 years ago

I did this because "It is not safe to return normally from the handler for a program error signal, because the behavior of the program when the handler function returns is not defined after a program error." according to [1], though I don't know what the POSIX standard has to say about it. If returning is the right thing to do then I'll prepare a patch!

> Unrelated, I noticed that sHandlingException won't have effect unless SA_NODEFER is passed to sigaction().

Yeah, that's why I didn't guard against reentrance in the initial patch, but I wasn't sure about it.

> Also, didn't we discuss making this feature #ifdef NIGHTLY?

We did, but I thought the decision was that it wasn't necessary. Sorry if I misinterpreted that!

(unfortunately, or perhaps fortunately the crash we were tracking seems to have been resolved by unrelated work, so now all that code doesn't seem to be gaining us anything in practice :( )

[1] https://www.gnu.org/software/libc/manual/html_node/Handler-Returns.html

Luke Wagner [:luke]

Comment 3

•

8 years ago

(In reply to Emanuel Hoogeveen [:ehoogeveen] from comment #2)
> I did this because "It is not safe to return normally from the handler for a
> program error signal, because the behavior of the program when the handler
> function returns is not defined after a program error." according to [1],
> though I don't know what the POSIX standard has to say about it. If
> returning is the right thing to do then I'll prepare a patch!

Yeah, you're right that it's UB according to the spec, but so are the asm.js fixups and PC redirection we're doing for OOB SIGSEGVs so I think we (and probably all debuggers and JVMs :) are already relying on what happens when you resume from a SIGSEGV.

> > Also, didn't we discuss making this feature #ifdef NIGHTLY?
> 
> We did, but I thought the decision was that it wasn't necessary. Sorry if I
> misinterpreted that!
> 
> (unfortunately, or perhaps fortunately the crash we were tracking seems to
> have been resolved by unrelated work, so now all that code doesn't seem to
> be gaining us anything in practice :( )

Oh, maybe I misunderstood.  Exciting to hear the bug might've been fixed!  Do you have the bug #?  Jan I suppose you're already considering this question of what to do here (release, #ifdef NIGHTLY, remove)?

Flags: needinfo?(jdemooij)

Emanuel Hoogeveen [:ehoogeveen]

Assignee

Comment 4

•

8 years ago

(In reply to Luke Wagner [:luke] from comment #3)
> Oh, maybe I misunderstood.  Exciting to hear the bug might've been fixed! 
> Do you have the bug #?

Not a clue unfortunately - we can only infer that the bug is fixed by the fact that we haven't gotten any annotated crashes, and none of the memory exception crashes look related (they're usually memsets, not memcpys). It's still possible that the exception handler isn't even working, though that seems unlikely!

Regarding what to do next, we can 1) do nothing, 2) install the exception handler only on Nightly or 3) remove the code again (I'd rather not, but if no one is using it..). Orthogonal to that, we can switch AssemblerBuffer back to a normal mozilla::Vector (which will leave PageProtectingVector and the exception handlers with no users for now).

Emanuel Hoogeveen [:ehoogeveen]

Assignee

Comment 5

•

8 years ago

Oh, maybe I misunderstood: the bug where we've been tracking the crashes is bug 1124397.

Luke Wagner [:luke]

Comment 6

•

8 years ago

Did the drop in crashes coincide precisely with the landing of MemoryProtectionExceptionHandler?  Because if so, it seems possible we're simply losing those crash reports by breakpad not getting called.  A way to test that would be to disable MemoryProtectionExceptionHandler and see if the crashes come back.

Emanuel Hoogeveen [:ehoogeveen]

Assignee

Comment 7

•

8 years ago

Attached patch Return from the unix signal handler rather than re-raising. — Details — Splinter Review

(In reply to Emanuel Hoogeveen [:ehoogeveen] from comment #4)
> Regarding what to do next, we can 1) do nothing

To clarify, I didn't mean "don't fix this bug", I just meant not making the exception handler Nightly only :)

Luke, this replaces the raise call with a call to sigaction like the wasm signal handler uses. Technically we should have already restored the signal handler at this point (since it's the first thing we do when we receive a signal), but setting the same signal handler twice can't hurt. I also added SA_NODEFER to match the wasm signal handlers (and since we already check for reentry).

I'll leave the decision of whether to make the signal handlers Nightly only for another patch or bug.

Assignee: nobody → emanuel.hoogeveen

Status: NEW → ASSIGNED

Attachment #8799871 - Flags: review?(luke)

Emanuel Hoogeveen [:ehoogeveen]

Assignee

Comment 8

•

8 years ago

(In reply to Luke Wagner [:luke] from comment #6)
> Did the drop in crashes coincide precisely with the landing of
> MemoryProtectionExceptionHandler?  Because if so, it seems possible we're
> simply losing those crash reports by breakpad not getting called.  A way to
> test that would be to disable MemoryProtectionExceptionHandler and see if
> the crashes come back.

It's a possibility but.. even on Windows? The exception handler is so simple there. I agree that we need to investigate - Jan says he'll look at the crashes again tomorrow.

Luke Wagner [:luke]

Comment 9

•

8 years ago

(In reply to Emanuel Hoogeveen [:ehoogeveen] from comment #8)
Yeah, I was thinking the same thing (this is one of the few areas where I feel the Windows API got it right-er than POSIX); maybe it's just a coincidence.

Luke Wagner [:luke]

Updated

•

8 years ago

Attachment #8799871 - Flags: review?(luke) → review+

Emanuel Hoogeveen [:ehoogeveen]

Assignee

Comment 10

•

8 years ago

Thanks!

I confirmed locally that it still works, try looks good:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=b348cd341e3bd5b52a988111ff23f5964842a3d3

Keywords: checkin-needed

Pulsebot

Comment 11

•

8 years ago

Pushed by cbook@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/860edbf262f7
Return from the unix signal handler rather than re-raising. r=luke

Keywords: checkin-needed

Nicolas B. Pierron [:nbp]

Reporter

Comment 12

•

8 years ago

(In reply to Emanuel Hoogeveen [:ehoogeveen] from comment #7)
> Luke, this replaces the raise call with a call to sigaction like the wasm
> signal handler uses. Technically we should have already restored the signal
> handler at this point (since it's the first thing we do when we receive a
> signal), but setting the same signal handler twice can't hurt. I also added
> SA_NODEFER to match the wasm signal handlers (and since we already check for
> reentry).

That's one of the thing I did not understood from the current code.  It does not sounds safe from my point of view to do that if the UnixExceptionHandler is not the last SEGV handler registered.  So I wonder if we have risks of accidentally remove the AsmJS OOB handler.

Also, I did not understood the condition with the atomic which guards that we do not re-enter, as I did not found the location where we reset the flag to re-enter this SEGV handler a second time, if the SEGV is handled by one of the previous SEGV handler registered.

Flags: needinfo?(emanuel.hoogeveen)

Emanuel Hoogeveen [:ehoogeveen]

Assignee

Comment 13

•

8 years ago

This signal handler is installed early during startup - after the crash reporter, but before the wasm signal handler. As a result it's always called *after* the wasm signal handler, and we don't expect to be able to recover from a SEGV at that point. The signal handler it restores should always be the crash reporter or the default signal handler.

That's why we make sure to only run it once - if we somehow recover and enter the exception handler again, someone is probably doing some pretty hefty debugging, so losing the annotation is no great loss.

Flags: needinfo?(emanuel.hoogeveen)

Emanuel Hoogeveen [:ehoogeveen]

Assignee

Comment 14

•

8 years ago

As a heads up, we figured out that the crashes aren't getting annotated because *no* SpiderMonkey crashes get annotated: bug 1309573.

Wes Kocher (:KWierso) (Not reading bugmail; email directly if needed)

Comment 15

•

8 years ago

bugherder

https://hg.mozilla.org/mozilla-central/rev/860edbf262f7

Status: ASSIGNED → RESOLVED

Closed: 8 years ago

status-firefox52: --- → fixed

Resolution: --- → FIXED

Target Milestone: --- → mozilla52

Jan de Mooij [:jandem]

Updated

•

7 years ago

Flags: needinfo?(jdemooij)

Bugzilla

Quick Search

Do not use raise in UnixExceptionHandler

Categories

(Core :: JavaScript Engine, defect)

Tracking

()

People

(Reporter: nbp, Assigned: ehoogeveen)

References

Details

Crash Data

Security

(public)

User Story

Attachments

(1 file)

Description

Comment 1

Comment 2

Comment 3

Comment 4

Comment 5

Comment 6

Comment 7

Comment 8

Comment 9

Updated

Comment 10

Comment 11

Comment 12

Comment 13

Comment 14

Comment 15

Updated

Attachment

General

Description

File Name

Content Type