Closed Bug 1393287 Opened 2 years ago Closed 2 years ago

Linux sandbox shims should intercept sigaction to remove SIGSYS from sa_mask

Categories

(Core :: Security: Process Sandboxing, enhancement, P2)

enhancement

Tracking

()

RESOLVED FIXED
mozilla59
Tracking Status
firefox59 --- fixed

People

(Reporter: jld, Assigned: jld)

Details

(Whiteboard: sb+)

Attachments

(2 files)

We currently intercept sigprocmask and pthread_sigmask to prevent SIGSYS from being blocked, because of a misfeature in seccomp-bpf — if the signal is *either* blocked *or* ignored, it both unblocks it *and* resets the signal disposition, causing the process to be immediately killed without a chance to print an error message or do a crash report.  (The “signal broadcast” gadget from bug 970676 is handled similarly, but in that case it's only a problem if the thread never unblocks the signal.)

However, that's not the only way to block signals; the sigaction() call, when it establishes a handler, supplies a set of signals (the sa_mask field) to block while running the handler.

So we should remove SIGSYS from that in processes we intend to sandbox; otherwise, if a handler masks all signals (e.g., [1]) and uses a syscall that's either disallowed or brokered, the process will immediately exit for no obvious reason.

This could already be happening, for all we know.  We wouldn't get crash reports, and if any bugs have been filed they might be sitting around untriaged or wrongly triaged because there'd be no error messages about sandboxing, just the usual unhelpful broken pipe errors from IPC.


[1] https://searchfox.org/mozilla-central/rev/48ea452803907f2575d81021e8678634e8067fc2/ipc/chromium/src/third_party/libevent/signal.c#256
Priority: -- → P2
Whiteboard: sb+
Assignee: nobody → jld
Comment on attachment 8935998 [details]
Bug 1393287 - Intercept sigaction() to fix signal masks for sandboxing.

https://reviewboard.mozilla.org/r/206848/#review213660
Attachment #8935998 - Flags: review?(gpascutto) → review+
Pushed by jedavis@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/8a71f6e05783
Intercept sigaction() to fix signal masks for sandboxing. r=gcp
Backed out for hazard build bustage

backout: https://hg.mozilla.org/integration/autoland/rev/095928b49caa8918ac6354044100ed3be6c0b9f6

push with failures: https://treeherder.mozilla.org/#/jobs?repo=autoland&revision=8a71f6e05783c13f38d5bcc9733e955b9fd7543b&selectedJob=152224035

failure log: https://treeherder.mozilla.org/logviewer.html#?job_id=152225082&repo=autoland&lineNumber=39556

[task 2017-12-18T23:49:11.130Z] TinderboxPrint: heap write hazards<br/>14
39556
[task 2017-12-18T23:49:11.130Z] TEST-UNEXPECTED-FAIL 1 rooting hazards detected
39557
[task 2017-12-18T23:49:11.130Z] TinderboxPrint: documentation<br/><a href='https://wiki.mozilla.org/Javascript:Hazard_Builds#Diagnosing_a_rooting_hazards_failure'>static rooting hazard analysis failures</a>, visit "Inspect Task" link for hazard details
39558
[task 2017-12-18T23:49:11.130Z] TEST-UNEXPECTED-FAIL 14 heap write hazards detected out of 0 allowed
39559
[task 2017-12-18T23:49:11.130Z] TinderboxPrint: documentation<br/><a href='https://wiki.mozilla.org/Javascript:Hazard_Builds#Diagnosing_a_heap_write_hazard_failure'>heap write hazard analysis failures</a>, visit "Inspect Task" link for hazard details
39560
[task 2017-12-18T23:49:11.130Z] /builds/worker/checkouts/gecko/taskcluster/scripts/builder/hazard-analysis.sh: line 170: exit_status: command not found
Flags: needinfo?(jld)
The hazard reports don't seem related, but I figured that backing the patch out was the easiest way to find out.
The hazard reports are totally related.  :-)  Behold:

Time: Mon Dec 18 2017 23:46:59 GMT+0000 (UTC)

Function '_ZN2js6ctypes7LibraryL5CloseEP9JSContextjPN2JS5ValueE$Library.cpp:uint8 js::ctypes::Library::Close(JSContext*, uint32, JS::Value*)' has unrooted 'obj' of type 'JSObject*' live across GC call '_ZN2js6ctypesL13UnloadLibraryEP8JSObject$Library.cpp:void js::ctypes::UnloadLibrary(JSObject*)' at js/src/ctypes/Library.cpp:255
    Library.cpp:241: Call(4,5, obj := __temp_2.toObjectOrNull())
    Library.cpp:242: Assume(5,7, null(obj*), false)
    Library.cpp:244: Call(7,8, __temp_3 := IsLibrary(obj*))
    Library.cpp:244: Assume(8,11, !__temp_3*, false)
    Library.cpp:249: Call(11,12, __temp_4 := args.field:0.length())
    Library.cpp:249: Assume(12,15, (__temp_4* != 0), false)
    Library.cpp:255: Call(15,16, UnloadLibrary(obj*)) [[GC call]]
    Library.cpp:256: Call(16,17, __temp_5 := PrivateValue(0))
    Library.cpp:256: Call(17,18, JS_SetReservedSlot(obj*,0,__temp_5))
GC Function: _ZN2js6ctypesL13UnloadLibraryEP8JSObject$Library.cpp:void js::ctypes::UnloadLibrary(JSObject*)
    PR_UnloadLibrary
    PR_LogPrint
    _PR_ImplicitInitialization
    prinit.c:_PR_InitStuff
    _PR_UnixInit
    sigaction
    IndirectCall: _ZZ9sigactionE9sRealFunc$sigaction:int (* sRealFunc)(int

It's complaining about a hazard because a JSObject* is kept alive over a function-call that might GC, that being PR_UnloadLibrary, which per the stack at bottom there, runs down into sigaction and the sRealFunc indirect function call that, unless certain Things are Done, is treated as potentially invoking a GC that would make the JSObject* bogus.

This is easy to fix -- just make the raw JSObject* a rooted JSObject*.  And along the way I got rid of the JS_THIS_OBJECT in the code, seeing as it was nearby and offended my sensibilities.
Attachment #8937614 - Flags: review?(sphink)
Attachment #8937614 - Flags: review?(sphink) → review+
With my patch here applied, you should be able to apply your patch on top of it and land the two without any further hazard-issues.
Pushed by jedavis@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/a3d82454cdd5
Rejigger ctypes library close() code so that it's safe against PR_UnloadLibrary incidentally invoking an indirect function call (which happens if sandboxing code overrides some of the signal code underlying the library-unload entrypoint). r=sfink
https://hg.mozilla.org/integration/mozilla-inbound/rev/8be14b21f4b1
Intercept sigaction() to fix signal masks for sandboxing. r=gcp
To add to comment #6 (as if it weren't… impressive enough already): the “bad” path requires PR_UnloadLibrary to wind up logging something, and for that to be the first time any NSPR function was called (even though it doesn't really make sense for PR_UnloadLibrary to precede PR_Load{,Static}Library, because PR_LogPrint doesn't know that), because the “hazard” is NSPR initialization ignoring SIGPIPE.
Flags: needinfo?(jld)
But the analysis won't give up that easily! If you take away the logging, it will find

  #9308 = PR_UnloadLibrary
  #8954 = PR_SetError
  #9142 = PR_GetCurrentThread
  #8802 = _PR_ImplicitInitialization
  #19396 = prinit.c:_PR_InitStuff
  #9443 = _PR_UnixInit
  #9445 = sigaction

Admittedly, that *still* has to be the first time NSPR is called. (So yeah, this was a false positive.)

If I prevent it from going through _PR_ImplicitInitialization, it seems to give up for good. So if this were a common problem, I'd probably tell the analysis that _PR_ImplicitInitialization isn't going to GC. On the other hand, it seems slightly sketchy to me to have an unrooted GC pointer live across PR_UnloadLibrary -- I'm not sure I trust shared library shutdown code to never GC.

So even though the analysis may be complaining for the wrong reason, I'm inclined to let it complain, at least until it shows up in a situation where it's difficult to avoid.
https://hg.mozilla.org/mozilla-central/rev/a3d82454cdd5
https://hg.mozilla.org/mozilla-central/rev/8be14b21f4b1
Status: NEW → RESOLVED
Closed: 2 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla59
You need to log in before you can comment on or make changes to this bug.