Linux sandbox broker support for connect()

RESOLVED FIXED in Firefox 60

Status

()

enhancement
P3
normal
RESOLVED FIXED
a year ago
a year ago

People

(Reporter: jld, Assigned: jld)

Tracking

(Blocks 1 bug)

60 Branch
mozilla60
Unspecified
Linux
Points:
---

Firefox Tracking Flags

(firefox60 fixed)

Details

(crash signature)

Attachments

(1 attachment)

We've encountered some uses of sockets that were blocked in bug 1126437:

1. The NVIDIA proprietary GPU driver communicates with the X server over an abstract socket whose name starts with "nvidia"; I'm told that they're used to dealing with sandboxes like this and there's a fallback, but it's slower.  Empirically it doesn't seem to be a huge difference, but we probably don't want to leave performance on the floor like this.

2. The Bumblebee project, open-source software for using NVIDIA's “Optimus” GPU switching, tries to connect to a named socket to talk to a daemon and appears to exit the process if it fails.  (Currently it's breaking on Nightly because it doesn't check for errors from socket() and tries to connect(-1, ...) and that causes a sandbox crash.)

Fortunately, unlike with SysV IPC, we can broker this.  For socket(), we can allow it under the same restrictions as socketpair(), assuming nothing needs SOCK_DGRAM; for connect() we can read off the domain/type with getsockopt(), have the broker send down an already-connected fd, and dup2() it on top of the old one.  This won't be fully semantics-preserving, I think (consider dup()ing the socket, connect()ing one fd, and using the other), but it should be enough.  Alternately, socket() could return a placeholder.  I don't know if there are other socket options that might be changed between socket() and connect() that we'd need to worry about.  The important benefit of this approach is that the client doesn't have to pass two fds to the server (the response socket and the socket to operate on).

One annoyance is that this needs to apply policy to abstract addresses, which aren't pathnames.  I talked about this with gcp and we've decided the least bad approach is to just hardcode a check for "nvidia", rather than add infrastructure to translate them into pseudo-pathnames; if we encounter any other cases we could reconsider that.
Priority: -- → P1
Crash Signature: [@ libpthread-2.26.so@0x10d4c ]
So there are some problems.  The NVIDIA socket turns out to be SOCK_DGRAM, which is a problem for reasons described in bug 1066750.  In theory it could still work if we were going to unshare the network namespace and chroot, but we'd have to get that information from SandboxLaunch into the broker policy somehow, and it'd be kind of a hack ­— on top of what's already needed to get the opaque octet sequence that is an abstract address (e.g., trailing NULs are significant) through a bunch of code written for C strings and specifically pathnames.  And… it doesn't actually need it.  I'm increasingly of the opinion that this isn't worth it.

The Primus/Bumblebee socket, we need to do *something* about.  Options I've come up with are:
1. Detect it via the GLX client vendor string (needs extra plumbing in GfxInfo) and start GL eagerly in content process startup.
2. Detect it as above but weaken the sandbox instead (clamp level at 3).
3. Land the named/stream case of the connect() brokering just for this one uncommon use case; this is hard to justify.
4. Sandbox the glxtest process with the content policy, so that any GPU driver that can't work under sandboxing is implicitly blocked; this also “fixes” our problems with fglrx and VirtualGL and so on, but it's a little drastic.

And that should probably be a separate bug in any case.
See Also: → 1442055
Created bug 1442055 for the part of this that will cause sad-tabs if it ships.
No longer blocks: 1126437
Priority: P1 → P3
Comment hidden (mozreview-request)

Comment 4

a year ago
mozreview-review
Comment on attachment 8957422 [details]
Bug 1440206 - Allow brokered access to a subset of connect() in the Linux content sandbox.

https://reviewboard.mozilla.org/r/226346/#review232386

::: security/sandbox/common/SandboxSettings.cpp:30
(Diff revision 1)
>      level = 1;
>    }
>  #endif
>  #ifdef XP_LINUX
>    // Level 4 and up will break direct access to audio.
>    // Bug 1438391: also VirtualGL lazily connecting to X.

This comment is now out of date.

::: security/sandbox/linux/SandboxBrokerClient.cpp:253
(Diff revision 1)
> +SandboxBrokerClient::Connect(const sockaddr_un* aAddr, size_t aLen, int aType)
> +{
> +  static const size_t maxLen = sizeof(aAddr->sun_path);
> +  const char* path = aAddr->sun_path;
> +  const auto addrEnd = reinterpret_cast<const char*>(aAddr) + aLen;
> +  // Ensure that the length covers the non-path member(s).

If we assume that aLen must necessarily cover path, and path must be included even if it's a single \0, this check can be tightened to <=.
 
The are some interesting horror stories in unix(7) wrt Linux behavior here, but I think they do not apply on this line.

::: security/sandbox/linux/SandboxBrokerClient.cpp:266
(Diff revision 1)
> +  // How much of sun_path may be accessed?
> +  auto bufLen = static_cast<size_t>(addrEnd - path);
> +  if (bufLen > maxLen) {
> +    bufLen = maxLen;
> +  }
> +  // Require null-termination.

I read unix(7) BUGS and wept, but upon further look, the client code deals with that case and so we don't have to.

::: security/sandbox/linux/SandboxFilter.cpp:705
(Diff revision 1)
> +    }
> +    const int newFd = aBroker->Connect(aAddr, aLen, *maybeType);
> +    if (newFd < 0) {
> +      return newFd;
> +    }
> +    if (fcntl(newFd, F_SETFL, oldFlags & O_NONBLOCK) != 0) {

What does the O_NONBLOCK do here?

::: security/sandbox/linux/SandboxFilter.cpp:768
(Diff revision 1)
>          // them by pointer.
>          return Some(Allow());
>        }
>        Arg<int> domain(0), type(1);
>        return Some(If(domain == AF_UNIX,
> -                     Switch(type & ~SOCK_CLOEXEC)
> +                     Switch(type & ~(SOCK_CLOEXEC | SOCK_NONBLOCK))

I guess this relates to the previous question.

::: security/sandbox/linux/SandboxFilter.cpp:780
(Diff revision 1)
>  
>  #ifdef ANDROID
>      case SYS_SOCKET:
>        return Some(Error(EACCES));
>  #else // #ifdef DESKTOP
> -    case SYS_SOCKET: // DANGEROUS
> +    case SYS_SOCKET: { // DANGEROUS

I guess the // DANGEROUS comment can go now.
Attachment #8957422 - Flags: review?(gpascutto) → review+
(In reply to Gian-Carlo Pascutto [:gcp] from comment #4)
> Comment on attachment 8957422 [details]
> >    // Bug 1438391: also VirtualGL lazily connecting to X.
> 
> This comment is now out of date.

Nice catch; thanks.

> > +    if (fcntl(newFd, F_SETFL, oldFlags & O_NONBLOCK) != 0) {
> 
> What does the O_NONBLOCK do here?

It's to carry over the nonblocking flag from the original socket.  (The other fcntl-settable flags are either irrelevant to sockets, like O_APPEND, or would be denied by the seccomp-bpf policy.)  I'll comment it.  However…

> > -                     Switch(type & ~SOCK_CLOEXEC)
> > +                     Switch(type & ~(SOCK_CLOEXEC | SOCK_NONBLOCK))
> 
> I guess this relates to the previous question.

…it turns out we don't strictly need to handle this, at least on my system: the only thing using SOCK_NONBLOCK was glibc's getaddrinfo() calling nscd, but that won't be allowed to connect.  When I was writing this patch I thought it was the X11 libraries that were using SOCK_NONBLOCK, but actually they're not.

However, we at least need this to fail gracefully (vs. being treated as an unknown socket type), and I already wrote the code to support it, so we might as well.

> > -    case SYS_SOCKET: // DANGEROUS
> > +    case SYS_SOCKET: { // DANGEROUS
> 
> I guess the // DANGEROUS comment can go now.

I suppose.  They're still dangerous to allow, but now that they're getting special treatment we probably don't need them to be flagged like that.

Comment 6

a year ago
Pushed by jedavis@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/792ab44dd9ec
Allow brokered access to a subset of connect() in the Linux content sandbox. r=gcp
Duplicate of this bug: 1442055
`git mozreview push` was giving me HTTP errors so I pushed the updated patch directly.

And I realized immediately after pushing that I forgot to edit the commit message to mention that SOCK_NONBLOCK is now allowed for socketpair, so I'm mentioning it here.

Comment 9

a year ago
bugherder
https://hg.mozilla.org/mozilla-central/rev/792ab44dd9ec
Status: NEW → RESOLVED
Last Resolved: a year ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla60
Crash Signature: [@ libpthread-2.26.so@0x10d4c ] → [@ libpthread-2.26.so@0x10d4c ] [@ libpthread-2.26.so@0x11a3c ]
You need to log in before you can comment on or make changes to this bug.