Closed Bug 1041885 Opened 10 years ago Closed 10 years ago

Import Chromium setuid sandbox

Categories

(Core :: Security, defect)

All
Linux
defect
Not set
normal

RESOLVED DUPLICATE of bug 986397
mozilla36

People

(Reporter: jld, Assigned: jld)

Details

We're going to need to use the setuid sandbox from Chromium — see https://code.google.com/p/chromium/wiki/LinuxSUIDSandbox — to get reliable sandboxing on desktop.  This is a hard requirement for EME, but it will also help improve content process security in the future (when desktop content processes have been fully separated from direct filesystem/network access).

Challenges:

1. If Firefox is installed without root access, it will not have sandboxing.

2. If Firefox is built and run from the build directory (`./mach run`) as a non-root user, it will not have sandboxing.

3. One or both of the above may apply to our continuous integration environment.

These have been true of Chromium on Linux since its initial release; this fact may or may not be persuasive to Firefox users/developers.  Possible mitigations:

1. Allow referencing an existing copy of the sandbox instead of one created by the build.  This is what Chromium does: iff the user running it also owns its executable, it will use the CHROME_DEVEL_SANDBOX environment variable.

2. Allow using only seccomp-bpf sandboxing.  While this was the initial plan, in this context it means that anyone who touches the seccomp-bpf policy has to be careful not to make any changes that would compromise its effectiveness when it's used alone, even if they'd be harmless during normal operation.  (Also, the Chromium sandbox developers have had a lot more time to think about this than we have, and they think it's a bad idea in general.)  I'd rather not do this.  I might consider a build-time option with a sufficiently scary name, if there's a specific use case that needs it.
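
The ownership rule in mitigation 1 can be sketched as follows. This is a minimal Python illustration that only paraphrases the rule as stated above; it is not Chromium's code, and `devel_sandbox_path` and its parameters are hypothetical names.

```python
import os


def devel_sandbox_path(executable, env=None, uid=None):
    """Return the sandbox path named by CHROME_DEVEL_SANDBOX, or None.

    Hypothetical helper: the override is honored only when the invoking
    user also owns `executable` (the browser binary).
    """
    env = os.environ if env is None else env
    uid = os.getuid() if uid is None else uid
    override = env.get("CHROME_DEVEL_SANDBOX")
    if not override:
        return None
    try:
        owner = os.stat(executable).st_uid
    except OSError:
        # A missing or unreadable binary never enables the override.
        return None
    return override if owner == uid else None
```

The point of the ownership check is that a developer running their own build can point at an already-installed sandbox helper, while a system-wide install owned by another user ignores the variable.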
Depends on: 1041886
Here's another problem: for media plugin loading, I'm currently using seccomp-bpf to intercept and emulate the open() system call from dlopen().  If we don't have seccomp-bpf, then that doesn't work.

So it will also be necessary to change GMP on Linux to use the mozglue linker.  It looks like it should be usable to replace just that one dlopen call without too much difficulty.


Content processes are not as fortunate.  I'm still willing to say "when" rather than "if" with respect to using the setuid sandbox in conjunction with seccomp-bpf, but not by itself.
(In reply to Jed Davis [:jld] from comment #1)
> So it will also be necessary to change GMP on Linux to use the mozglue linker.

It opens /proc/self/maps:

#3  0x00007fe3779dc254 in __fopen_internal (filename=0x7fe37b3d76c0 "/proc/self/maps", 
#4  0x00007fe379f216a6 in EnsureWritable::getProt
#5  0x00007fe379f2192b in EnsureWritable::EnsureWritable<ElfLoader::link_map*>
#6  0x00007fe379f20912 in ElfLoader::DebuggerHelper::Add
#7  0x00007fe379f20e30 in ElfLoader::Register

Can I make it not do that, somehow?  glibc didn't need it.
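
For context, here is a rough sketch of what a lookup against /proc/self/maps involves. Judging only by the function names in the backtrace, getProt appears to find a mapping's protection flags by reading that file; that is an inference, not something confirmed in this bug. The sample text and the name `prot_for_address` are made up for illustration.

```python
def prot_for_address(maps_text, addr):
    """Return the permission field (e.g. 'r-xp') of the mapping that
    contains addr, or None if no mapping covers it."""
    for line in maps_text.splitlines():
        fields = line.split()
        if len(fields) < 2:
            continue
        start_s, _, end_s = fields[0].partition("-")
        try:
            start, end = int(start_s, 16), int(end_s, 16)
        except ValueError:
            continue  # skip anything that is not a start-end address range
        if start <= addr < end:
            return fields[1]
    return None


# Made-up sample in the /proc/<pid>/maps layout; the path is hypothetical.
SAMPLE_MAPS = """\
7fe379f00000-7fe379f40000 r-xp 00000000 08:01 1234 /hypothetical/libexample.so
7fe379f40000-7fe379f41000 r--p 00040000 08:01 1234 /hypothetical/libexample.so
"""
```

With the sample above, `prot_for_address(SAMPLE_MAPS, 0x7fe379f216a6)` returns `'r-xp'`. On a live process the text would come from `open("/proc/self/maps").read()`, which is exactly the open() that a sandboxed process can no longer perform.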
Flags: needinfo?(mh+mozilla)
I looked at the code some more and asked about things on IRC, and: we can add an environment variable to disable the DebuggerHelper, and set it when sandboxed.  The code to do this is more nonmodular than I'd like — the IPC framework apparently lacks an interface for setting env vars when launching a child — but it seems to work.
Flags: needinfo?(mh+mozilla)
In light of bug 1039819 comment #6, we shouldn't need to require this after all.

But we currently have ~60% of the Linux desktop userbase on kernels that should be new enough to support user namespaces — and that should be going up to more like 70% in a month or two, as Ubuntu 12.04.5 upgrades all the 3.5 and 3.8 kernels (from Precise Pangolins that started as, respectively, 12.04.2 and 12.04.3; details at https://wiki.ubuntu.com/Kernel/LTSEnablementStack) to 3.13.  Hopefully.

This raises the possibility of using the "setuid sandbox" without needing actual root privileges, as an extra layer of security where available.  As mentioned in comment #0 this will require keeping in mind that about 1 in 4 sandboxed users won't have it, and we'll be in that position until 2017… but maybe that's doable.

And I'd add that Chromium, which has had the setuid sandbox in its actually-setuid form as part of their processes since 2009, nonetheless seems to have been finding it to be a source of technical debt and would like to switch to user namespaces where possible: https://code.google.com/p/chromium/issues/detail?id=312380
No longer blocks: 1021232
> This raises the possibility of using the "setuid sandbox" without needing actual root privileges

But that still requires a setuid bit, or at least a way to get CAP_SYS_ADMIN to do CLONE_NEWUSER.
(In reply to Mike Hommey [:glandium] from comment #5)
> But that still requires a setuid bit, or at least a way to get CAP_SYS_ADMIN
> to do CLONE_NEWUSER.

/proc/sys/kernel/unprivileged_userns_clone (and its default of 0) seems to be a Debian feature.  Oddly, despite the patch that adds it having an ubuntu.com address and mentioning an Ubuntu release name, it doesn't seem to be present in the Ubuntu kernels I have on hand — unprivileged CLONE_NEWUSER works fine on the 3.13 series and is EINVAL on the 3.11 series.
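
A probe along these lines can reproduce the check on a given kernel. It is a sketch, not the test actually used for the results above: it reads the Debian sysctl if present, then tries unshare(CLONE_NEWUSER) in a throwaway child so the calling process is unaffected. The function names are hypothetical, and loading libc through `ctypes.CDLL(None)` assumes a typical Linux libc.

```python
import ctypes
import errno
import os

CLONE_NEWUSER = 0x10000000  # value from <linux/sched.h>
USERNS_SYSCTL = "/proc/sys/kernel/unprivileged_userns_clone"


def read_userns_sysctl(path=USERNS_SYSCTL):
    """Return the sysctl value as an int, or None if the knob is absent."""
    try:
        with open(path) as f:
            return int(f.read().strip())
    except (OSError, ValueError):
        return None


def probe_unprivileged_userns():
    """Try unshare(CLONE_NEWUSER) in a forked child.

    Returns 0 on success, otherwise the errno the child saw
    (for example errno.EINVAL or errno.EPERM).
    """
    pid = os.fork()
    if pid == 0:
        code = 255  # reported if something unexpected fails in the child
        try:
            libc = ctypes.CDLL(None, use_errno=True)
            if libc.unshare(CLONE_NEWUSER) == 0:
                code = 0
            else:
                code = ctypes.get_errno()
        finally:
            os._exit(code)  # never fall back into the parent's code path
    _, status = os.waitpid(pid, 0)
    if os.WIFEXITED(status):
        return os.WEXITSTATUS(status)
    return errno.EPERM  # child was killed; treat that as "denied"
```

`errno.errorcode.get(probe_unprivileged_userns(), "OK")` gives a readable name for the result. The fork matters because unshare(CLONE_NEWUSER) is refused in a multithreaded process and would otherwise change the caller's own namespace.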

All of this means I have to get some new numbers:

40%: Ubuntu on 3.13; has userns.
20%: Ubuntu on 3.{5,8,11}; will be upgraded to 3.13 next month, in theory.
25%: Ubuntu on 3.2; no userns, no forced upgrade until 2017.
15%: Not Ubuntu
(In reply to Jed Davis [:jld] from comment #6) 
> 20%: Ubuntu on 3.{5,8,11}; will be upgraded to 3.13 next month, in theory.

…or not.  Reading https://wiki.ubuntu.com/Kernel/LTSEnablementStack a bit more closely:

> When the 12.10 enablement stack reaches its EOL, 12.10 enablement stack users will NOT be
> automatically upgraded to the 14.04 enablement stack in Precise. Users will need to manually
> upgrade to the 14.04 enablement stack in order to continue receiving official support (ie security
> updates and bug fixes). We will aggressively message (ie MOTD, USN, update-manager, etc.) when
> the 12.10 enablement stack is reaching its EOL and provide instructions on updating to the 14.04
> enablement stack.

> When the Raring HWE stack reaches its EOL, Raring HWE stack users will NOT be automatically
> upgraded to the 14.04 HWE stack in Precise. Users will need to manually upgrade to the 14.04 HWE
> stack in order to continue receiving official support (ie security updates and bug fixes). We will
> aggressively message when the Raring HWE stack is reaching its EOL (ie MOTD, USN, update-manager,
> etc.) and provide instructions on updating to the 14.04 HWE stack.

> When the Saucy HWE stack reaches its EOL, Saucy HWE stack users will NOT be automatically upgraded
> to the 14.04 HWE stack in Precise. Users will need to manually upgrade to the 14.04 HWE stack
> in order to continue receiving official support (ie security updates and bug fixes). We will
> aggressively message when the Saucy HWE stack is reaching its EOL (ie MOTD, USN, update-manager,
> etc.) and provide instructions on updating to the 14.04 HWE stack.

So far, FHR suggests that many (most?) users on those version combinations have not upgraded.
See Also: → 1070036
(In reply to Jed Davis [:jld] from comment #7)
> So far, FHR suggests that many (most?) users on those version combinations
> have not upgraded.

…except that I was still looking at reports from Firefox 30, not the current release.  I retract the above-quoted comment.  Hot off the presses, for the 32 release cycle so far:

65%: Ubuntu, with userns
 5%: Ubuntu, no userns, should upgrade
20%: Ubuntu, no userns, needn't upgrade
10%: not Ubuntu

So, as expected, the Quantal/Raring/Saucy kernels are being upgraded to Trusty/Utopic.

Also, the "not Ubuntu" can be broken down a little more: 2.2% are Fedoras new enough for https://bugzilla.redhat.com/show_bug.cgi?id=917708 (which, if I'm reading correctly, turns it on unpreffed) while 5.7% are either a too-old base version or from distributions that have unprivileged userns preffed off (Debian, Arch, and derivatives of both).

All told, that's about 2/3 of the profiles in question.
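
The "about 2/3" figure can be checked against the numbers above. This assumes it is the sum of the Ubuntu-with-userns share and the new-enough Fedora share; that grouping is one reading of the comment, not something it states outright.

```python
# Shares of profiles, in percent, taken from the figures quoted above.
shares_with_userns = {
    "Ubuntu, with userns": 65.0,
    "Fedora, new enough": 2.2,
}
total_with_userns = sum(shares_with_userns.values())

# Part of the "not Ubuntu" 10% that the breakdown does not classify.
not_ubuntu = 10.0
unclassified = not_ubuntu - 2.2 - 5.7

print(f"with userns: {total_with_userns:.1f}%")          # with userns: 67.2%
print(f"two thirds:  {200 / 3:.1f}%")                    # two thirds:  66.7%
print(f"not Ubuntu, unclassified: {unclassified:.1f}%")  # not Ubuntu, unclassified: 2.1%
```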
The Chromium user-namespace sandbox (https://crbug.com/312380) hasn't landed yet; and as long as Chromium still supports the setuid sandbox (probably until April 2017, for Ubuntu 12.04) it's not as much of a priority for them as it is for us.

But, compared with the suid sandbox proper, a purely user-namespace sandbox has some advantages:

1. No actual root privileges are involved, and certainly not combined with potentially untrusted code (chromium-sandbox can be used to launch arbitrary executables, which could be malicious; it is intended to be robust against this); the level of paranoia needed is lower.  This may or may not be helpful per se, but it probably doesn't hurt.

2. It doesn't need to execve to gain permissions; CLONE_NEWUSER can be done at the same time as CLONE_NEWPID etc., or earlier with unshare(2).  So we could just fork the child process directly into its own namespaces, instead of forking twice and leaving a process around with nothing to do but propagate exit status and signals (as the suid sandbox does).

However:

3. Forking processes by directly calling clone() will probably break libc functions like pthread_kill() and raise() which use a copy of the pid that won't be updated, because this isn't actually libc's fork().  That's fixed by the exec of plugin-container — and if CLONE_NEWPID is used, that process will be pid 1 in a namespace containing no other processes, so if raise() is called in the window between fork and exec (e.g., in a signal handler to re-raise the signal) it will fail instead of killing something unrelated.

4. Nuwa means that doesn't apply on B2G, but this is probably fixable.  More importantly, there isn't as much value in pid namespaces on B2G, where we already have every child as a separate uid (so they can see other processes but not touch them, and other sandboxing layers should mitigate even that), and without that this gets a lot simpler: the network namespace can be unshare()d in an existing process, and chroot() can be used with actual root privileges.
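
The "pid 1 in a namespace containing no other processes" behavior from point 3 can be observed with a small experiment. Note that this sketch takes the unshare(2) route mentioned in point 2 and then uses an ordinary fork; it does not call clone() directly, so it does not exercise the stale-pid problem itself. The function name is hypothetical, and it returns None where unprivileged user namespaces are unavailable.

```python
import ctypes
import os

CLONE_NEWUSER = 0x10000000  # values from <linux/sched.h>
CLONE_NEWPID = 0x20000000


def pid_seen_in_new_namespace():
    """Return the pid that the first process in a fresh pid namespace
    sees for itself, or None if the namespaces could not be created."""
    r, w = os.pipe()
    outer = os.fork()
    if outer == 0:
        code = 1
        try:
            os.close(r)
            libc = ctypes.CDLL(None, use_errno=True)
            if libc.unshare(CLONE_NEWUSER | CLONE_NEWPID) == 0:
                # unshare(CLONE_NEWPID) only affects children created
                # afterwards, so the next fork is the one that enters
                # the new pid namespace.
                inner = os.fork()
                if inner == 0:
                    os.write(w, str(os.getpid()).encode())
                    os._exit(0)
                os.waitpid(inner, 0)
                code = 0
        finally:
            os._exit(code)
    os.close(w)
    data = os.read(r, 32)  # empty once every writer has exited without writing
    os.close(r)
    os.waitpid(outer, 0)
    return int(data) if data else None
```

Where the namespaces are available, the reported pid should be 1, which is why a raise() in that process cannot hit an unrelated process: nothing else exists in its pid namespace.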


So the main thing to extract from Chromium (or reimplement) is the chroot helper: a process that's forked from the sandbox target with CLONE_FS (which shares the root and working directories, so a chroot/chdir in one affects both) while the target still has permission to chroot.  The helper is signaled over a socketpair (and wait()ed for) once the target has dropped any relevant permissions and is ready to become sandboxed.  That might want to be a new bug, if only so that it doesn't have all the clutter about actual setuid executables confusing people who read it.
(In reply to Jed Davis [:jld] from comment #9)
> That might want to be a new bug

…or an old bug.
Status: NEW → RESOLVED
Closed: 10 years ago
Resolution: --- → DUPLICATE