Closed Bug 1393287 Opened 5 years ago Closed 5 years ago

Linux sandbox shims should intercept sigaction to remove SIGSYS from sa_mask


(Core :: Security: Process Sandboxing, enhancement, P2)




Tracking Status
firefox59 --- fixed


(Reporter: jld, Assigned: jld)


(Whiteboard: sb+)


(2 files)

We currently intercept sigprocmask and pthread_sigmask to prevent SIGSYS from being blocked, because of a misfeature in seccomp-bpf — if the signal is *either* blocked *or* ignored, it both unblocks it *and* resets the signal disposition, causing the process to be immediately killed without a chance to print an error message or do a crash report.  (The “signal broadcast” gadget from bug 970676 is handled similarly, but in that case it's only a problem if the thread never unblocks the signal.)

However, that's not the only way to block signals; the sigaction() call, when it establishes a handler, supplies a set of signals (the sa_mask field) to block while running the handler.

So we should remove SIGSYS from that in processes we intend to sandbox; otherwise, if a handler masks all signals (e.g., [1]) and uses a syscall that's either disallowed or brokered, the process will immediately exit for no obvious reason.

This could already be happening, for all we know.  We wouldn't get crash reports, and if any bugs have been filed they might be sitting around untriaged or wrongly triaged because there'd be no error messages about sandboxing, just the usual unhelpful broken pipe errors from IPC.

Priority: -- → P2
Whiteboard: sb+
Assignee: nobody → jld
Comment on attachment 8935998 [details]
Bug 1393287 - Intercept sigaction() to fix signal masks for sandboxing.
Attachment #8935998 - Flags: review?(gpascutto) → review+
Pushed by
Intercept sigaction() to fix signal masks for sandboxing. r=gcp
Backed out for hazard build bustage


push with failures:

failure log:

[task 2017-12-18T23:49:11.130Z] TinderboxPrint: heap write hazards<br/>14
[task 2017-12-18T23:49:11.130Z] TEST-UNEXPECTED-FAIL 1 rooting hazards detected
[task 2017-12-18T23:49:11.130Z] TinderboxPrint: documentation<br/><a href=''>static rooting hazard analysis failures</a>, visit "Inspect Task" link for hazard details
[task 2017-12-18T23:49:11.130Z] TEST-UNEXPECTED-FAIL 14 heap write hazards detected out of 0 allowed
[task 2017-12-18T23:49:11.130Z] TinderboxPrint: documentation<br/><a href=''>heap write hazard analysis failures</a>, visit "Inspect Task" link for hazard details
[task 2017-12-18T23:49:11.130Z] /builds/worker/checkouts/gecko/taskcluster/scripts/builder/ line 170: exit_status: command not found
Flags: needinfo?(jld)
The hazard reports don't seem related, but I figured that backing the patch out was the easiest way to find out.
The hazard reports are totally related.  :-)  Behold:

Time: Mon Dec 18 2017 23:46:59 GMT+0000 (UTC)

Function '_ZN2js6ctypes7LibraryL5CloseEP9JSContextjPN2JS5ValueE$Library.cpp:uint8 js::ctypes::Library::Close(JSContext*, uint32, JS::Value*)' has unrooted 'obj' of type 'JSObject*' live across GC call '_ZN2js6ctypesL13UnloadLibraryEP8JSObject$Library.cpp:void js::ctypes::UnloadLibrary(JSObject*)' at js/src/ctypes/Library.cpp:255
    Library.cpp:241: Call(4,5, obj := __temp_2.toObjectOrNull())
    Library.cpp:242: Assume(5,7, null(obj*), false)
    Library.cpp:244: Call(7,8, __temp_3 := IsLibrary(obj*))
    Library.cpp:244: Assume(8,11, !__temp_3*, false)
    Library.cpp:249: Call(11,12, __temp_4 := args.field:0.length())
    Library.cpp:249: Assume(12,15, (__temp_4* != 0), false)
    Library.cpp:255: Call(15,16, UnloadLibrary(obj*)) [[GC call]]
    Library.cpp:256: Call(16,17, __temp_5 := PrivateValue(0))
    Library.cpp:256: Call(17,18, JS_SetReservedSlot(obj*,0,__temp_5))
GC Function: _ZN2js6ctypesL13UnloadLibraryEP8JSObject$Library.cpp:void js::ctypes::UnloadLibrary(JSObject*)
    IndirectCall: _ZZ9sigactionE9sRealFunc$sigaction:int (* sRealFunc)(int

It's complaining about a hazard because a JSObject* is kept alive over a function-call that might GC, that being PR_UnloadLibrary, which per the stack at bottom there, runs down into sigaction and the sRealFunc indirect function call that, unless certain Things are Done, is treated as potentially invoking a GC that would make the JSObject* bogus.

This is easy to fix -- just make the raw JSObject* a rooted JSObject*.  And along the way I got rid of the JS_THIS_OBJECT in the code, seeing as it was nearby and offended my sensibilities.
Attachment #8937614 - Flags: review?(sphink)
Attachment #8937614 - Flags: review?(sphink) → review+
With my patch here applied, you should be able to apply your patch on top of it and land the two without any further hazard-issues.
Pushed by
Rejigger ctypes library close() code so that it's safe against PR_UnloadLibrary incidentally invoking an indirect function call (which happens if sandboxing code overrides some of the signal code underlying the library-unload entrypoint). r=sfink
Intercept sigaction() to fix signal masks for sandboxing. r=gcp
To add to comment #6 (as if it weren't… impressive enough already): the “bad” path requires PR_UnloadLibrary to wind up logging something, and for that to be the first time any NSPR function was called (even though it doesn't really make sense for PR_UnloadLibrary to precede PR_Load{,Static}Library, because PR_LogPrint doesn't know that), because the “hazard” is NSPR initialization ignoring SIGPIPE.
Flags: needinfo?(jld)
But the analysis won't give up that easily! If you take away the logging, it will find

  #9308 = PR_UnloadLibrary
  #8954 = PR_SetError
  #9142 = PR_GetCurrentThread
  #8802 = _PR_ImplicitInitialization
  #19396 = prinit.c:_PR_InitStuff
  #9443 = _PR_UnixInit
  #9445 = sigaction

Admittedly, that *still* has to be the first time NSPR is called. (So yeah, this was a false positive.)

If I prevent it from going through _PR_ImplicitInitialization, it seems to give up for good. So if this were a common problem, I'd probably tell the analysis that _PR_ImplicitInitialization isn't going to GC. On the other hand, it seems slightly sketchy to me to have an unrooted GC pointer live across PR_UnloadLibrary -- I'm not sure I trust shared library shutdown code to never GC.

So even though the analysis may be complaining for the wrong reason, I'm inclined to let it complain, at least until it shows up in a situation where it's difficult to avoid.
Closed: 5 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla59
You need to log in before you can comment on or make changes to this bug.