Closed Bug 1259508 Opened 8 years ago Closed 8 years ago

Seccomp sandbox violation: sys_clone called in content process of Firefox desktop

Tracking

()

Status:

RESOLVED FIXED

Milestone:

mozilla49

Tracking Flags:

Tracking

Status

firefox49

---

fixed

People

(Reporter: tedd, Assigned: tedd)

References

Details

(Whiteboard: sblc1)

Attachments

(4 files, 4 obsolete files)

Seccomp error log 8 years ago Julian Hector [:tedd] [:jhector] 3.33 KB, text/plain		Details
Seccomp error log + libxul symbols 8 years ago Julian Hector [:tedd] [:jhector] 3.91 KB, text/plain		Details
strace log of preceded sys_clone 8 years ago Julian Hector [:tedd] [:jhector] 1.42 KB, text/plain		Details
DO NOT COMMIT - trigger init code before seccomp 8 years ago Julian Hector [:tedd] [:jhector] 820 bytes, patch		Details \| Diff \| Splinter Review
libpulse Seccomp error log 8 years ago Julian Hector [:tedd] [:jhector] 2.29 KB, text/plain		Details
libpulse Seccomp error log + libxul symbols 8 years ago Julian Hector [:tedd] [:jhector] 2.85 KB, text/plain		Details
DO NOT COMMIT - trigger init code before seccomp 8 years ago Julian Hector [:tedd] [:jhector] 881 bytes, patch		Details \| Diff \| Splinter Review
Cubeb init before seccomp 8 years ago Julian Hector [:tedd] [:jhector] 1.87 KB, patch	jld : review+	Details \| Diff \| Splinter Review

Julian Hector [:tedd] [:jhector]

Assignee

Description

•

8 years ago

Attached file Seccomp error log (obsolete) — Details

When enabling content sandboxing on Linux desktop (using ac_add_options --enable-content-sandbox), the content process crashes early on due to a seccomp violation when calling sys_clone.

We currently have a policy for clone in place [1]. It seems that non whitelisted clone flags were used. (clone flag: 0x1200011)

A more readable form of the flags would be:

> CLONE_CHILD_SETTID | CLONE_CHILD_CLEARTID | SIGCHLD

used in the ARCH_FORK of libc [2]. Only CLONE_CHILD_CLEARTID is part of our whitelist, CLONE_CHILD_SETTID and SIGCHLD are not whitelisted.

The backtrace suggests that it is caused by libgconf, frame #16 is part of nsGConfService::GetString.

How do we want to handle this, if possible we should avoid adding to the whitelist, maybe we can remote that code somehow?

As far as the missing symbols are concerned, currently I would have to look them up manually with gdb (maybe there is a better way?)... Since those offsets in the log are specific to my compiled version, I can supply more information if required, just ni? me.

[1] https://dxr.mozilla.org/mozilla-central/rev/6202ade0e6d688ffb67932398e56cfc6fa04ceb3/security/sandbox/linux/SandboxFilter.cpp#90
[2] https://github.com/lattera/glibc/blob/59ba27a63ada3f46b71ec99a314dfac5a38ad6d2/nptl/sysdeps/unix/sysv/linux/x86_64/fork.c#L27

Jed Davis [:jld] (on leave until 08-31)

Comment 1

•

8 years ago

The last time I tried this I ran into a problem with fork() and gstreamer, but it looks like I didn't file a bug about it.

fork() isn't hugely bad by itself, but it's probably going to execve.  The biggest problem with that is probably that the seccomp-bpf filter would be preserved, but not the SIGSYS handler.  execve means filesystem access, but there's no way to broker it.  (There's an fexecve(), but on Linux it's implemented on top of execve via procfs.)

If there's some way to pre-perform whatever this is before starting the sandbox, that's one solution.  Changing the clone() policy to just return an error for this kind of thing might also work (but maybe via a trap function in order to optionally log it?), depending on what the library is trying to do.  And of course there's ELF symbol interposition.

The backtrace would help figure out what to do.  mozilla-central has tools/rb/fix_linux_stack.py but I don't know if that knows how to look up system libraries (or how hard it would be to add that, if not).

Julian Hector [:tedd] [:jhector]

Assignee

Comment 2

•

8 years ago

Attached file Seccomp error log + libxul symbols — Details

Thanks :jld, I think the pre-perform way would be ideal, I think in Bug 942698, they were looking for a way to turn on the sandbox once most of the initialization was completed, although the patches are outdated I think.

On a different note, isn't audio only started when required? So if the user browses the web without ever needing audio, this wouldn't be triggered right? 

I looked up the symbols in gdb for libxul

Attachment #8734436 - Attachment is obsolete: true

Julian Hector [:tedd] [:jhector]

Assignee

Comment 3

•

8 years ago

Ah, never mind my last comment about audio, shouldn't have anything to do with this bug, I confused it with a different bug.

Jim Mathies [:jimm]

Updated

•

8 years ago

Whiteboard: sb? → sb+

Julian Hector [:tedd] [:jhector]

Assignee

Comment 4

•

8 years ago

Attached file strace log of preceded sys_clone — Details

I did some further investigation and as you mentioned :jld, the fork() is followed by a execve(), the function in question is: _read_subprocess_line_argv() inside dbus/dbus-sysdeps-unix.c, of the dbus repository.

Now, libgconf eventually calls ensure_dbus_connection() which triggers the fork/execve if dbus-launch isn't running (or it doesn't have a connection to it).

Since we can't do much about libdbus nor libgconf, I started looking into our code that leads there. It seems that we use the GConfClient API by calling gconf_client_get_string() in nsGConfService, which in return goes into the libgconf library.

This all seems to be part of the initialization of nsProtocolProxyService, which is probably triggered by some events (I haven't looked into that).

I patched the RecvSetProcessSandbox() function to trigger the init code of nsProtocolProxyService, before seccomp is applied (I will add the patch later) to see if the clone call can be avoided by letting it spawn the process before seccomp is running.

The attached file is a strace log the child, which clearly shows the sys_clone call (which would have hit seccomp) followed by sys_execve() of dbus-launch, followed by sys_prctl() for NO_NEW_PRIVS and seccomp.

The patch is quick and dirty, just to illustrate the point, but it may be worth considering to trigger certain functions before applying seccomp, at least this way we would avoid adding sys_execve to the whitelist. 

But I don't know if this would expose us to additional risks, at least in this case I don't see it being a problem, but then again I don't know the code too well.

Julian Hector [:tedd] [:jhector]

Assignee

Comment 5

•

8 years ago

Attached patch DO NOT COMMIT - trigger init code before seccomp (obsolete) — Details — Splinter Review

Patch to trigger init code before seccomp is applied.

Jed Davis [:jld] (on leave until 08-31)

Comment 6

•

8 years ago

(In reply to Julian Hector [:tedd] [:jhector] from comment #4)
> The patch is quick and dirty, just to illustrate the point, but it may be
> worth considering to trigger certain functions before applying seccomp, at
> least this way we would avoid adding sys_execve to the whitelist. 

We'll probably need to do that.  This has come up before, and I think I suggested having a “PreloadUnsandboxableThings” or something like that (cf. the PreloadSlowThings in ContentChild and TabChild for the B2G preallocated process), but whatever it was wound up being worked around differently.  I think Chromium takes this approach for some things, but I don't remember the details offhand.

Julian Hector [:tedd] [:jhector]

Assignee

Comment 7

•

8 years ago

Attached file libpulse Seccomp error log (obsolete) — Details

Another instance where sys_clone() is called, this time it seems to be out of libpulse.

Julian Hector [:tedd] [:jhector]

Assignee

Comment 8

•

8 years ago

Attached file libpulse Seccomp error log + libxul symbols — Details

I looked up some symbol names in libxul.

Attachment #8742374 - Attachment is obsolete: true

Julian Hector [:tedd] [:jhector]

Assignee

Comment 9

•

8 years ago

Attached patch DO NOT COMMIT - trigger init code before seccomp (obsolete) — Details — Splinter Review

Made some modifications to also solve the libpulse problem.

Attachment #8738705 - Attachment is obsolete: true

Julian Hector [:tedd] [:jhector]

Assignee

Comment 10

•

8 years ago

:padenot, :badger, I ni? you two because I saw you in the git blame of the related code. If you aren't the right person to talk to and you know someone how can help out, please put them on ni?.

In the attached patch, I called |CubebUtils::GetCubebContext();| and the constructor of |NS_PROTOCOLPROXYSERVICE_CONTRACTID| before activating seccomp, to trigger the initialization code which does some fork/execve. 

Now this seems to fix my problem, but the question is, does this expose us to any additional security risks? I.e. does the code handling the initialization, use untrusted content in some way (like content that is provided by the webserver)

I think for the GetCubebContext part, calling it before seccomp is enabled, means that code is initialized which may not be needed during the session (I believe it is audio related, so I think browsing the web without the need for audio would avoid this code path). It probably isn't a big problem, but what are you thoughts on that? (as for ProtocolProxyService, I have not idea what this code does).

Thanks.

Flags: needinfo?(padenot)

Flags: needinfo?(daniel)

Julian Hector [:tedd] [:jhector]

Assignee

Updated

•

8 years ago

Whiteboard: sb+ → sblc1

Paul Adenot (:padenot)

Comment 11

•

8 years ago

(In reply to Julian Hector [:tedd] [:jhector] from comment #10)
> :padenot, :badger, I ni? you two because I saw you in the git blame of the
> related code. If you aren't the right person to talk to and you know someone
> how can help out, please put them on ni?.
> 
> In the attached patch, I called |CubebUtils::GetCubebContext();| and the
> constructor of |NS_PROTOCOLPROXYSERVICE_CONTRACTID| before activating
> seccomp, to trigger the initialization code which does some fork/execve. 
> 
> Now this seems to fix my problem, but the question is, does this expose us
> to any additional security risks? I.e. does the code handling the
> initialization, use untrusted content in some way (like content that is
> provided by the webserver)
> 
> I think for the GetCubebContext part, calling it before seccomp is enabled,
> means that code is initialized which may not be needed during the session (I
> believe it is audio related, so I think browsing the web without the need
> for audio would avoid this code path). It probably isn't a big problem, but
> what are you thoughts on that? (as for ProtocolProxyService, I have not idea
> what this code does).

As you correctly determined, GetCubebContext loads a bunch of audio-related things. Depending on the platform, it can be almost nothing (a malloc and a mutex init on OSX), or more involved (a bunch of syscalls, opening DLL and fetching symbols, on windows and linux). The execution of these methods does not depend on web content at all, we just call it the first time we need to output some audio. 

Last we checked, calling it early meant a longer startup time, but maybe it's better to have seccomp enabled, yeah.

Gian-Carlo Pascutto [:gcp]

Comment 12

•

8 years ago

(In reply to Paul Adenot (:padenot) from comment #11)
> Last we checked, calling it early meant a longer startup time, but maybe
> it's better to have seccomp enabled, yeah.

This will become a non-issue if audio gets remoted (which will trigger lazy init in the chrome process).

Paul Adenot (:padenot)

Updated

•

8 years ago

Flags: needinfo?(padenot)

Julian Hector [:tedd] [:jhector]

Assignee

Comment 13

•

8 years ago

Thanks Paul and Gian-Carlo, so for the time being (until Bug 1104619 is resolved) we would probably do this early initialization.

Daniel Stenberg [:bagder]

Comment 14

•

8 years ago

> as for ProtocolProxyService, I have not idea what this code does

ProtocolProxyService handles proxies, like PAC and related stuff. Sorry for being slow but I don't quite understand what you're saying ProtocolProxyService does wrong or what exactly the problem is there.

I'll ni mcmanus here to see if we can bring some insights.

Flags: needinfo?(daniel) → needinfo?(mcmanus)

Jed Davis [:jld] (on leave until 08-31)

Comment 15

•

8 years ago

(In reply to Daniel Stenberg [:bagder] from comment #14)
> ProtocolProxyService handles proxies, like PAC and related stuff. Sorry for
> being slow but I don't quite understand what you're saying
> ProtocolProxyService does wrong or what exactly the problem is there.

As I understand it, from the attacked stack trace and earlier comments: it's trying to read the systemwide proxy configuration, which calls into some external frameworks (gconf, dbus), and that's what's trying to fork/exec dbus-launch.

Patrick McManus [:mcmanus]

Comment 16

•

8 years ago

1 - do you have a stack that has more xul symbols in it?

2 - as you say it would be ok to remote this, I'm assuming the problem is that this happens in the child and it would be ok in the parent?

2a depending on exactly what the stack is (see 1), its possible that this class isn't needed at all in the child.. so we could just skip init of it.

Flags: needinfo?(mcmanus)

Patrick McManus [:mcmanus]

Comment 17

•

8 years ago

(In reply to Patrick McManus [:mcmanus] from comment #16)
> 1 - do you have a stack that has more xul symbols in it?


found it :)

Patrick McManus [:mcmanus]

Comment 18

•

8 years ago


Sandbox: frame #21: nsProtocolProxyServiceConstructor[/home/user/build/dist/bin/libxul.so +0xd32e84]
Sandbox: frame #22: nsComponentManagerImpl::CreateInstanceByContractID[/home/user/build/dist/bin/libxul.so +0xb29747]
Sandbox: frame #23: nsComponentManagerImpl::GetServiceByContractID[/home/user/build/dist/bin/libxul.so +0xb2a91f]
Sandbox: frame #24: nsGetServiceByContractIDWithError::operator()[/home/user/build/dist/bin/libxul.so +0xb4deeb]
Sandbox: frame #25: nsCOMPtr_base::assign_from_gs_contractid_with_error[/home/user/build/dist/bin/libxul.so +0xb46806]
Sandbox: frame #26: nsIOService::SpeculativeConnectInternal[/home/user/build/dist/bin/libxul.so +0xba0be9]
Sandbox: frame #27: nsDocument::MaybePreconnect[/home/user/build/dist/bin/libxul.so +0x141aecb]

Patrick McManus [:mcmanus]

Comment 19

•

8 years ago

Daniel, I think you can fix this.. the path being called is into nsIOService::SpeculativeConnect() rather than nsHttphandler::SpeculativeConnect().. there is an intentional difference, but it eludes me at the moment.

but one unintentional difference is that you'll see the nsHttpHandler version does indeed remote this.. https://mxr.mozilla.org/mozilla-central/source/netwerk/protocol/http/nsHttpHandler.cpp#2200

I think the IO service would be able to call the same remoting code..

Julian Hector [:tedd] [:jhector]

Assignee

Comment 20

•

8 years ago

Thanks Patrick, if we can remote this with a couple of lines code, that would probably be the preferred solution.

As for question 2 in Comment 16, the problem is that it happens in the child after we enabled seccomp, so simply calling the code before seccomp is enabled (see attached "DO NOT COMMIT" patch) solves this problem.

I am just not sure if that leads to any security problems, because an attacker could leverage that code that is triggered, to his advantage. So I was hoping someone who knows the code could answer that.

Patrick McManus [:mcmanus]

Comment 21

•

8 years ago

this bug is currently a blocker for enabling the seccomp sandbox on nightly

can you try and work up a patch? I don't think it will be very involved?

Flags: needinfo?(daniel)

Julian Hector [:tedd] [:jhector]

Assignee

Updated

•

8 years ago

Depends on: 1270147

Patrick McManus [:mcmanus]

Comment 22

•

8 years ago

necko bit now in 1270147

Flags: needinfo?(daniel)

Julian Hector [:tedd] [:jhector]

Assignee

Comment 23

•

8 years ago

With Daniel taking care of Bug 1270147, the only thing left here is audio. According to Comment 11, a early initialization could cause a performance impact.

Which won't be a problem anymore when audio is remoted (Comment 12).

As a workaround, until Bug 1104619 is resolved, I tried to do the early initialization in a not so hackish way by moving it into a separate function located in security/sandbox/linux/, but I couldn't get it to work, because mozilla::CubebUtils::GetCubebContext() is defined in libxul and not exported.

It probably isn't a problem to make it somehow available (export it etc), but I think it is even more hackish as a workaround.

Paul: is there maybe a way to trigger the cubeb init code that can be easily reached from security/sandbox/linux? I looked at some of the functions in the backtrace, but all of them require arguments and I don't know the code.

Jed: if there isn't an 'easy' way to trigger the init code from within the sandbox code, would it be acceptable to add the GetCubebContext() call inside the RecvSetProcessSandbox() function, until Bug 1104619 is resolved? (I know it is ugly, but we could mark it as a workaround with a comment)

Of course there is also the point of having more code that needs to be initialized before seccomp, but so far I only encountered these two instances.

So what do you both think?

Flags: needinfo?(padenot)

Flags: needinfo?(jld)

Jed Davis [:jld] (on leave until 08-31)

Comment 24

•

8 years ago

(In reply to Julian Hector [:tedd] [:jhector] from comment #23)
> Jed: if there isn't an 'easy' way to trigger the init code from within the
> sandbox code, would it be acceptable to add the GetCubebContext() call
> inside the RecvSetProcessSandbox() function, until Bug 1104619 is resolved?
> (I know it is ugly, but we could mark it as a workaround with a comment)

I don't have a problem with that.  But, yes, a comment that we're planning to eventually remove it (and mentioning the bug number) would be good.

If we're concerned about performance, it should be fairly easy to measure the time that that call takes.  Also, there might already be telemetry for content process startup time.

Flags: needinfo?(jld)

Paul Adenot (:padenot)

Comment 25

•

8 years ago

Maybe you can call `cubeb_init()` from the sandbox, and hand off the pointer to Gecko right after ?

You just need a pointer to a `cubeb*` (which is an opaque type), and a c-string.

Flags: needinfo?(padenot)

Julian Hector [:tedd] [:jhector]

Assignee

Comment 26

•

8 years ago

Attached patch Cubeb init before seccomp — Details — Splinter Review

Thanks for the feedback, I decided to place the call inside the RecvSetProcessSandbox() function, because the context passed to cubeb_init() is used inside CubebUtils to check if it is already initialized or not.
So I wanted to avoid passing around too many pointers.

Try push for build: https://treeherder.mozilla.org/#/jobs?repo=try&revision=fa412e1d72b7

Attachment #8742960 - Attachment is obsolete: true

Attachment #8751730 - Flags: review?(jld)

Julian Hector [:tedd] [:jhector]

Assignee

Updated

•

8 years ago

Assignee: nobody → julian.r.hector

Jed Davis [:jld] (on leave until 08-31)

Updated

•

8 years ago

Attachment #8751730 - Flags: review?(jld) → review+

Jed Davis [:jld] (on leave until 08-31)

Comment 27

•

8 years ago

FYI, bug 1234092 removed gstreamer support, so comment #1 and anything I've mentioned about it trying to run gst-plugin-scanner should no longer be an issue.

Julian Hector [:tedd] [:jhector]

Assignee

Comment 28

•

8 years ago

Try run with all tests: https://treeherder.mozilla.org/#/jobs?repo=try&revision=92e937bb5c2b

Julian Hector [:tedd] [:jhector]

Assignee

Comment 29

•

8 years ago

Try run seems good and given Comment 11, the potential performance impact is bearable for the sake of enabling seccomp. After all, this is a temporary solution and will no longer be required once Bug 1104619 is resolved.

Keywords: checkin-needed

Pulsebot

Comment 30

•

8 years ago

https://hg.mozilla.org/integration/mozilla-inbound/rev/e5e3a65e8220

Keywords: checkin-needed

Wes Kocher (:KWierso) (Not reading bugmail; email directly if needed)

Comment 31

•

8 years ago

bugherder

https://hg.mozilla.org/mozilla-central/rev/e5e3a65e8220

Status: NEW → RESOLVED

Closed: 8 years ago

status-firefox49: --- → fixed

Resolution: --- → FIXED

Target Milestone: --- → mozilla49

Haik Aftandilian [:haik]

Updated

•

6 years ago

Updated

•

6 years ago