Bugzilla

Reporter

Comment 1

•

9 years ago

while running some tests on a production docker image locally, I see dumps being created- one less headache to figure out.

Reporter

Comment 2

•

9 years ago

doing kill -ABRT works; I guess we need to figure out where the missing glue is for getting the actual dump out of there.

Reporter

Comment 3

•

9 years ago

trying to reproduce this on the above mentioned tests, I run into errors while waiting for focus.

Reporter

Comment 4

•

9 years ago

this might be specific to the lack of a window manager (think dialogs not working, etc.)- will update this when we test with a proper window manager on try.  locally it seems to fix the problem.

Comment 5

•

9 years ago

Hm, that's interesting, I wouldn't expect that to have any effect here. We don't launch the crash reporter client during these tests. I would expect lots of other tests to fail without a window manager though, honestly.

I know we're running some sort of window manager under Xvfb on the buildbot Linux testers.

Comment 6

•

9 years ago

I was poking through open Breakpad issues and came across https://bugs.chromium.org/p/google-breakpad/issues/detail?id=644. I did some local testing and it looks like Breakpad is actually just broken under Docker. I'll see if it's something fixable.

Comment 7

•

9 years ago

Okay, so it turns out this is due to Ubuntu's AppArmor profiles. If I pass `--cap-add SYS_PTRACE --security-opt apparmor:unconfined` to `docker run` on my machine I can run the Breakpad unit tests and they pass inside Docker. We'll have to get Taskcluster's docker-worker to do that.

Assignee

Comment 8

•

9 years ago

--security-opt apparmor:unconfined seems .. unconfined

14:41:55 <ted> ¯\_(ツ)_/¯
14:42:07 <ted> that's the best i could get out of that docker issue
14:42:44 <ted> i can give you a super simple command to test and you can try to find something that's more restrictive but still works
14:43:28 <ted> `od -x /proc/$$/auxv`
14:43:35 <ted> if that prints an error, breakpad won't work
14:43:40 <ted> if it prints output, it will
14:44:15 <ted> $ docker run --rm -t -i ubuntu:14.04 /bin/sh -c 'od -x /proc/$$/auxv'
14:44:15 <ted> od: /proc/1/auxv: read error: Permission denied
14:44:51 <ted> $ docker run --rm -t -i --cap-add SYS_PTRACE --security-opt apparmor:unconfined ubuntu:14.04 /bin/sh -c 'od -x /proc/$$/auxv' 
14:44:51 <ted> 0000000 0021 0000 0000 0000 c000 84fe 7ffe 0000
14:44:54 <ted> <...>

Comment 9

•

9 years ago

I'm trying to figure out where to append --cap-add SYS_PTRACE --security-opt apparmor:unconfined to test jobs and I've failed to find where this could be changed.

Any suggestions? Is starting the docker image part of one of the TC repos?

Comment 10

•

9 years ago

dustin mentioned on IRC that it is too risky to use --security-opt apparmor:unconfined

Are we OK to having green TC jobs running side by side as tier2 *across* the board?
What would the value of running tests without having a solution here?

Reporter

Comment 11

•

9 years ago

have we determined on try that adding --security-opt will fix the tests?  If we had that data, then we can determine which tests pass/fail and document it.  Maybe a mozinfo flag and manifest condition could help out?

Assignee

Comment 12

•

9 years ago

This isn't something we can test in try -- it requires changes to the docker-worker which invokes docker.

From the little bit I've been able to read, the proposal is the equivalent of removing your front door and doorframe because your spare key doesn't work in the lock.  Effective?  yes.  A good idea?  maybe not.

Ted gave a pretty good reproduction recipe that we could use to test other, more refined solutions.  Someone -- and it might be me, just not now -- needs to dig into this and look at how AppArmor works, how docker uses it, how that might be configured, and so on, then devise the minimal solution with a full understanding of the security implications.  I suspect that might start by putting AppArmor in complain mode and running Ted's recipe in a docker container.  Then probably duplicate the docker-default profile with the necessary changes (after understanding them -- I'd guess it's just allowing ptrace and/or reading auxv).

https://github.com/docker/docker/blob/master/docs/security/apparmor.md
https://github.com/docker/docker/blob/master/daemon/execdriver/native/apparmor.go#L32

It looks like the profile is pretty basic.  Docker rewrites its own profile on startup, but we could easily copy/paste it to another profile named `tc-docker` and invoke that with `--security-opt apparmor:tc-docker`.

Comment 13

•

9 years ago

As I understand it, this is a tier1 blocker and the top one atm.
I assume we can continue adding more tier2 jobs for now without this.

Assignee

Updated

•

9 years ago

Assignee: nobody → dustin

Comment 14

•

9 years ago

Docker looks to have code to work around this on trunk:
https://github.com/docker/docker/blob/master/daemon/execdriver/native/apparmor.go#L67

Adding `ptrace (trace,read) peer=docker-default,` to /etc/apparmor.d/docker on my local machine and restarting apparmor with `service apparmor restart` makes the simple test (`docker run --rm -t -i ubuntu:14.04 /bin/sh -c 'od -x /proc/$$/auxv'`) work locally. However, these changes don't survive a docker daemon restart.

Doing what dustin suggested in comment 12 seems to work fine though:
```
cp /etc/apparmor.d/docker /etc/apparmor.d/docker-ptrace
patch /etc/apparmor.d/docker-ptrace <<EOF
6c6
< profile docker-default flags=(attach_disconnected,mediate_deleted) {
---
> profile docker-ptrace flags=(attach_disconnected,mediate_deleted) {
14a15
>   ptrace (trace,read) peer=docker-ptrace,
EOF
service apparmor restart
```

Running my test command with this apparmor profile works fine then: `docker run --rm -t -i --security-opt apparmor:docker-ptrace ubuntu:14.04 /bin/sh -c 'od -x /proc/$$/auxv'`.

Running a simple Breakpad test works as well, so I think this should fix the unittest problems.

Comment 15

•

9 years ago

Just as clarification (because I wasn't sure), Docker containers use PID namespaces for each container:
http://man7.org/linux/man-pages/man7/pid_namespaces.7.html

So processes running inside a container cannot inspect processes outside of that container, even with ptrace enabled. I think that enabling ptrace shouldn't actually weaken security.

Assignee

Comment 16

•

9 years ago

So basically this new profile would allow processes in docker containers to ptrace one another (since they're all in the docker-ptrace profile), but due to PID namespaces they can practically only ptrace other processes in the same container.  That seems pretty reasonable.

:kang, is that about right?

I'll start in on a patch to accomplish this as a docker-worker feature, preferably controlled by workerType configuration, so we can enable it only for test jobs' workerTypes, where inter-container interference isn't such a big deal.

Flags: needinfo?(gdestuynder)

Guillaume Destuynder [:kang] [this account is no longer active]

Assignee

Comment 17

•

9 years ago

Oh, hey, we can isolate ptrace to within the single container by creating an apparmor profile per task..

Comment 18

•

9 years ago

Some disclaimer:
ptrac'ing a process is the equivalent of full control of the said process (both memory and execution).

The namespaces in kernel have been "patched" in over time so that all kernel code would be namespace aware and properly separate resources. However, that also means there's a lower confidence in how well separated these namespaces are / higher probability for bugs. Due to that, apparmor (or selinux or else) are used to ensure all potentially dangerous calls are forbidden (its much easier to audit the system call table, than what the code behind the calls actually do). It's a little safer that way and ptrace() is always among the calls to be denied when making such lists.

Likewise, there might be privileged processes inside a container that you don't want to be ptraced even as root - obviously, removing the limitation will also allow that.

With that in mind, it should otherwise work (i.e. can't ptrace the other PID namespace).
Basically, a test case would be:

unshare -f -p /bin/bash
ps aux #pick a process from parent - you can see them because you're not remounting proc.
gdb -p 999 #gdb does a ptrace() call on 999
<No such process 999> #yay!

Guillaume Destuynder [:kang] [this account is no longer active]

Updated

•

9 years ago

Flags: needinfo?(gdestuynder)

Assignee

Comment 19

•

9 years ago

Attached file https://github.com/taskcluster/docker-worker/pull/192 — Details

Attachment #8695305 - Flags: review?(garndt)

Greg Arndt [:garndt]

Comment 20

•

9 years ago

Comment on attachment 8695305 [details] [review]
https://github.com/taskcluster/docker-worker/pull/192

After reviewing some failures with Dustin, he was able to get it running in his environment and we also tested on an actual deployed ami of the worker.

For some reason the test fails in the vagrant environment but since we tested by other means, we should let this PR merge and figure out the test later.

Attachment #8695305 - Flags: review?(garndt) → review+

Assignee

Comment 21

•

9 years ago

Merged!  We'll deploy tomorrow, only to the desktop-test workerType.

Comment 22

•

9 years ago

\o/

What do I need to do to make use of it?
I will try it Tuesday evening.

Assignee

Comment 23

•

9 years ago

Add

task:
  payload:
    features:
      allowPtrace: true

Comment 24

•

9 years ago

I've pushed with ptrace enabled (let's see what the results look like):
https://hg.mozilla.org/try/rev/c3cb854e7fa0

Greg Arndt [:garndt]

Comment 25

•

9 years ago

New amis were added for desktop-test.  There's some existing pending jobs that will keep the old machines around for a bit.  Once those complete and die off the new ones will rollout.  We can kill old instances to speed up with the process if you need it.

us-east-1: ami-2e397344
us-west-1: ami-461e7526
us-west-2: ami-c94f50a8

Assignee

Comment 26

•

9 years ago

Using the proper AMIs:
  https://treeherder.mozilla.org/#/jobs?repo=try&revision=199520ba8bfe

Assignee

Comment 28

•

9 years ago

Argh, I lied -- we also need to add docker-worker:feature:allowPtrace to the scopes.

Assignee

Comment 29

•

9 years ago

https://treeherder.mozilla.org/#/jobs?repo=try&revision=2945b0340d53

Assignee

Comment 30

•

9 years ago

Looks promsing?

Assignee

Comment 31

•

9 years ago

Attached file MozReview Request: Bug 1221661: enable ptrace for desktop tests; r?jmaher — Details

Bug 1221661: enable ptrace for desktop tests; r?jmaher

Attachment #8698518 - Flags: review?(jmaher)

Reporter

Comment 32

•

9 years ago

Comment on attachment 8698518 [details]
MozReview Request: Bug 1221661: enable ptrace for desktop tests; r?jmaher

https://reviewboard.mozilla.org/r/28017/#review25095

awesome, thanks!

Attachment #8698518 - Flags: review?(jmaher) → review+