Implement sandboxing on FreeBSD with Capsicum


My latest work in progress… sandboxing content with Capsicum capability mode on FreeBSD.

The idea behind Capsicum is treating file descriptors as capabilities. Once a process enters capability mode, it's in a very tight sandbox that does not have any access whatsoever to "global namespaces" — no open, connect, nothing like that — you can only derive new file descriptors from the ones you have: openat (open below a directory if you have the directory opened), connectat and other *at calls, accept, recvmsg (IPC fd-passing), dup and so on. Descriptors can have additional irreversible restrictions imposed by cap_rights_limit, and these limits are inherited all the way (e.g. if a directory fd was limited to not having CAP_WRITE, you won't be able to write to anything you openat from that directory).

Unfortunately, most software was not written in this capability style, so we have to LD_PRELOAD a library that overrides libc functions with ones that try to use pre-opened directory descriptors. One such library already exists (and has been partially reused in WASI libc!), but it's overkill for our use in some ways (we don't impose sandboxing on unsuspecting programs, so we don't need to serialize the info about opened fds to shared memory) and not enough in others (sysctl, fopen/opendir, symlink resolution, etc.). It's also all C, and it's just better to own the code here, so I added a little mozcapsicum library.
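For readers unfamiliar with the pre-opened descriptor trick, the core of such a wrapper can be sketched roughly like this. This is not the mozcapsicum code: struct preopened_dir, lookup_preopened and sandboxed_open are hypothetical names, and the sketch ignores "..", symlink resolution and the O_CREAT mode argument. A real LD_PRELOAD override would also have to export the libc symbol names themselves (open, fopen, opendir and so on).

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical table entry: an absolute directory prefix (no trailing
 * slash) and a descriptor for it, opened before entering the sandbox. */
struct preopened_dir {
    const char *prefix;
    int fd;
};

/* Finds the first entry whose prefix contains path. On success, stores
 * the path relative to that directory in *relative and returns the fd.
 * Returns -1 if no entry matches. */
int lookup_preopened(const struct preopened_dir *dirs, size_t count,
                     const char *path, const char **relative) {
    assert(path != NULL && relative != NULL);
    for (size_t i = 0; i < count; i++) {
        size_t len = strlen(dirs[i].prefix);
        if (strncmp(path, dirs[i].prefix, len) != 0)
            continue;
        const char *rest = path + len;
        /* "/tmp" must match "/tmp" and "/tmp/x" but not "/tmpfoo". */
        if (*rest != '\0' && *rest != '/')
            continue;
        while (*rest == '/')
            rest++;
        *relative = (*rest != '\0') ? rest : ".";
        return dirs[i].fd;
    }
    return -1;
}

/* Stand-in for an overridden open(): rewrites an absolute path into
 * an openat() call on a pre-opened directory descriptor. */
int sandboxed_open(const struct preopened_dir *dirs, size_t count,
                   const char *path, int flags) {
    const char *relative = NULL;
    int dirfd = lookup_preopened(dirs, count, path, &relative);
    if (dirfd < 0) {
        errno = ENOENT; /* nothing pre-opened covers this path */
        return -1;
    }
    return openat(dirfd, relative, flags);
}
```

The table would be filled in before cap_enter(), while path-based open() still works.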

This is a work in progress. It mostly works already, but it only supports Wayland for now. Forkserver (bug 1607103) works; in fact, I've only tested with it enabled. GPU-accelerated WebGL content works (tested on Mesa RadeonSI). Audio works with PulseAudio. Various other things (X11, sndio, multi-GPU, nvidia GPU?) are not tested/supported yet. Some paths are hardcoded, etc.

Requires patch from bug 1550891.

Currently dealing with a very VERY VERY weird bug: SecurityInformation fails to deserialize in the HttpChannelChild! Disabled assertion for now, but this is terrible. Can anyone help me debug this?

do not commit yet, see bug for info

Do you have more specifics on the problem you're seeing?

Oh, all I had to do was use a debug build:

[Child 77196, Main Thread] WARNING: 'NS_FAILED(rv)', file /usr/home/greg/src/, line 355

and that's getting the;1 service.

That's because NSS_NoDB_Init(nullptr) != SECSuccess when initializing NSS.

And NSS needs to dlopen by path (and in development everything is in its own dir like …/security/nss/lib/freebl/freebl_freebl3) and to open /dev/urandom (even though there's getentropy/getrandom). I have it working with some extra preloading now.

…yeah, that was also affecting WebCrypto by the way.

tested X11 (Xwayland), it does work. (glxtest failed but I'm pretty sure that's my weird setup; webgl.force-enabled showed that even WebGL works)

other things discovered by debug build:

MOZ_ASSERT_UNREACHABLE("PR_GetPhysicalMemorySize not implemented here") in image/SurfaceCache.cpp, because it uses sysctl and my hook for that doesn't work right now. Changing it to sysconf (the Linux/Solaris code), which uses sysctlbyname, makes it work easily. It would probably make sense to kill the sysctl section, because I think all BSDs support sysconf for this.
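For reference, the sysconf approach amounts to something like the sketch below. This is not the actual NSPR code; physical_memory_size is a made-up name, and _SC_PHYS_PAGES is a common extension rather than strict POSIX, so availability on every BSD is an assumption here.

```c
#include <assert.h>
#include <stdint.h>
#include <unistd.h>

/* Hypothetical helper: returns the amount of physical memory in bytes,
 * or 0 if the system does not report it. */
uint64_t physical_memory_size(void) {
    long pages = sysconf(_SC_PHYS_PAGES);
    long page_size = sysconf(_SC_PAGESIZE);
    if (pages <= 0 || page_size <= 0)
        return 0;
    return (uint64_t)pages * (uint64_t)page_size;
}
```

No path or global namespace lookup is visible to the caller, which is why this variant is easier to keep working in capability mode.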

js::GetNativeStackBaseImpl() sometimes fails MOZ_ASSERT(stackBase) but that doesn't seem to affect anything.

huh, if PulseAudio is not running yet, Firefox tries to start it from the content process (which is not possible when sandboxed of course). Well, I guess more like libpulse does that. Dang.

After a recent rebase, looks like it starts PulseAudio fine now, but NS_GRE_DIR broke o_0

// in a recent commit I see that Windows is preloading nss/softokn/freebl for the network process, hmm, I should try dlopen-before-cap_enter instead of shoving more things into LD_PRELOAD

media.cubeb.backend=alsa doesn't work here at least without env MOZ_CAPSICUM=1. Hiding ipc/glue/GeckoChildProcessHost.cpp changes behind PR_GetEnv() seems to help.

