glib "gmain" thread may block the signal used for BroadcastSetThreadSandbox()

RESOLVED FIXED in Firefox 50

Status

()

defect
RESOLVED FIXED
3 years ago
3 years ago

People

(Reporter: tedd, Assigned: tedd)

Tracking

unspecified
mozilla50
Points:
---

Firefox Tracking Flags

(firefox50 fixed)

Details

(Whiteboard: sblc1)

Attachments

(2 attachments, 3 obsolete attachments)

To enable seccomp for all threads on systems that do not support TSYNC, we currently have a workaround, where we install a signal handler for an unused signal and then signal each thread of the process which in return triggers the signal handler which then enables seccomp.

Since glib 'gmain' thread blocks all signals (Bug 1176099), we can't install the signal handler which causes the process to crash [1], because the 'gmain' thread is non responsive.

Similar to Bug 1176099, we need to unmask the signal being used for the broadcast, but the problem here is that the signal number is not fixed and determined at run-time [2]

[1] https://dxr.mozilla.org/mozilla-central/rev/16663eb3dcfa759f25b5e27b101bc79270c156f2/security/sandbox/linux/Sandbox.cpp#429
[2] https://dxr.mozilla.org/mozilla-central/rev/16663eb3dcfa759f25b5e27b101bc79270c156f2/security/sandbox/linux/Sandbox.cpp#331
(Assignee)

Comment 1

3 years ago
I did some testing regarding this problem using the approach mentioned in Bug 1176099 Comment 1. In theory it should work to register the signal handler in SandboxEarlyInit(), but during my testing it seemed that the handler didn't run.

It turned out that the first signal to be found free was SIGRTMIN+2, after the signal handler was registered, it was overwritten later on [1]. The code in nsMemoryInfoDumper overwrites any previous signal handler without checking if the signal is used or not.

For test purposes, I modified the FindFreeSignalNumber() [2] to search from SIGRTMAX to SIGRTMIN which solved the problem.

[1] https://dxr.mozilla.org/mozilla-central/rev/16663eb3dcfa759f25b5e27b101bc79270c156f2/xpcom/base/nsMemoryInfoDumper.cpp#279
[2] https://dxr.mozilla.org/mozilla-central/rev/16663eb3dcfa759f25b5e27b101bc79270c156f2/security/sandbox/linux/Sandbox.cpp#254
(Assignee)

Comment 3

3 years ago
With this patch applied, the WIP patch works. I am not sure how we want to handle this problem, so it is just a PoC
(Assignee)

Comment 4

3 years ago
:jld, how do you think we should handle the nsMemoryInfoDumper problem? Maybe fix its code to properly check for free signals?
Flags: needinfo?(jld)
Some history: The original version (from bug 970676) used to just hardcode SIGRTMIN + 3, to get the next signal after the nsMemoryInfoDumper signals.  But then there was bug 1038900, when something on B2G devices was doing that kind of dynamic allocation and also got SIGRTMIN + 3, which is why it works the way it does now.

The thing about the nsMemoryInfoDumper signals is that they're used from outside the process — definitely on B2G, and I've also seen it suggested (using kill -NN with whatever the actual signal number is) as a way to get info out of Linux desktop Firefox — so they can't be dynamic.

All that said: I can't think of anything that would break if having the sandbox code searches from SIGRTMAX decreasing.
Flags: needinfo?(jld)
(Assignee)

Comment 6

3 years ago
Attachment #8755529 - Attachment is obsolete: true
Attachment #8756861 - Flags: review?(jld)
(Assignee)

Comment 7

3 years ago
Attachment #8755527 - Attachment is obsolete: true
Attachment #8756862 - Flags: review?(jld)
(Assignee)

Comment 8

3 years ago
Given Comment 5, I added a patch that changes the search order.

Try push for build: https://treeherder.mozilla.org/#/jobs?repo=try&revision=aa622e0b7293

I tested it locally with MOZ_FAKE_NO_SECCOMP_TSYNC=1. My latest try push with tests, include these two patches as well (additionally a patch to enable seccomp with triggers this code):

https://treeherder.mozilla.org/#/jobs?repo=try&revision=855e7084f80d

Test failures are not related to this patch, rather to missing entries in the seccomp whitelist which is another story.
Attachment #8756861 - Flags: review?(jld) → review+
Comment on attachment 8756862 [details] [diff] [review]
Part 2: Move signal handler set up to SandboxEarlyInit() r=jld

Review of attachment 8756862 [details] [diff] [review]:
-----------------------------------------------------------------

Looks good, but a few suggestions:

::: security/sandbox/linux/Sandbox.cpp
@@ +64,5 @@
>  __sanitizer_sandbox_on_notify(__sanitizer_sandbox_arguments *args);
>  } // extern "C"
>  #endif // MOZ_ASAN
>  
> +extern int gBroadcastSignum;

A more specific name would be nice, given that this has extern linkage and is in the root namespace.

Also, if you can move the definition here and the extern declaration into SandboxHooks, that will avoid breaking the B2G build.

@@ +349,5 @@
>          continue;
>        }
>        // Reset the futex cell and signal.
>        gSetSandboxDone = 0;
> +      if (syscall(__NR_tgkill, pid, tid, gBroadcastSignum) != 0) {

Is there any way this could be reached with gBroadcastSignum == 0?  That might be worth a MOZ_RELEASE_ASSERT.  (The result would be a crash anyway, since the signal handler will never run, but a specific assertion would be easier to debug.)

@@ +421,5 @@
>      rewinddir(taskdp);
>    } while (sandboxProgress);
> +
> +  void (*oldHandler)(int);
> +  oldHandler = signal(gBroadcastSignum, SIG_DFL);

Maybe gBroadcastSignum should be set back to 0 after this?
Attachment #8756862 - Flags: review?(jld) → review+
(Assignee)

Comment 10

3 years ago
Carry over r+ from :jld.
Try push for build: https://treeherder.mozilla.org/#/jobs?repo=try&revision=8adbdc577057

Thanks jed, I incorporated your suggestions, and tested it locally with MOZ_FAKE_NO_SECCOMP_TSYNC=1.

Before I request checkin-needed, I renamed the signal number variable to:
gSeccompTsyncBroadcastSignum

Is that descriptive enough?
Attachment #8756862 - Attachment is obsolete: true
Flags: needinfo?(jld)
Attachment #8759216 - Flags: review+
Looks good.
Flags: needinfo?(jld)
(Assignee)

Updated

3 years ago
Keywords: checkin-needed

Comment 12

3 years ago
Pushed by cbook@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/ae5286493f15
Part 1: Change search order for free signal r=jld
https://hg.mozilla.org/integration/mozilla-inbound/rev/62646bfa1f95
Part 2: Move signal handler set up to SandboxEarlyInit() r=jld
Keywords: checkin-needed
sorry had to back this out, seems this caused https://treeherder.mozilla.org/logviewer.html#?job_id=29651943&repo=mozilla-inbound after landing
Flags: needinfo?(julian.r.hector)

Comment 14

3 years ago
Backout by cbook@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/b070f6f0ace2
Backed out changeset 62646bfa1f95 
https://hg.mozilla.org/integration/mozilla-inbound/rev/25cf270ae8c5
Backed out changeset ae5286493f15 for frequent timeouts in browser_ManifestObtainer_obtain.js
(Assignee)

Comment 15

3 years ago
Sorry to have the caused the trouble, I will dig into it. Thanks for backing it out again.
Flags: needinfo?(julian.r.hector)

Comment 16

3 years ago
Pushed by cbook@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/1526b47e25e4
Part 1: Change search order for free signal r=jld
https://hg.mozilla.org/integration/mozilla-inbound/rev/17c1f2315eb5
Part 2: Move signal handler set up to SandboxEarlyInit() r=jld
(In reply to Julian Hector [:tedd] [:jhector] from comment #15)
> Sorry to have the caused the trouble, I will dig into it. Thanks for backing
> it out again.

Hi Julian,
very sorry turned out your patch is innocent and was something else that landed before. Relanded your patches!

Comment 18

3 years ago
bugherder
https://hg.mozilla.org/mozilla-central/rev/1526b47e25e4
https://hg.mozilla.org/mozilla-central/rev/17c1f2315eb5
Status: NEW → RESOLVED
Last Resolved: 3 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla50
You need to log in before you can comment on or make changes to this bug.