STR: - Get a one-click loaner for this R1 job: https://treeherder.mozilla.org/#/jobs?repo=try&revision=6813000f2a777de4ed8d410522ef9dc4bf8f6de5&selectedJob=117730537 - Observe that the task definition has `allowPtrace: true` - Create the task - In the shell, select `1` to resume task - Open another shell and select `4` - try to attach gdb to the firefox process that ends up running after a while Result: Attaching to process 1116 Could not attach to process. If your uid matches the uid of the target process, check the setting of /proc/sys/kernel/yama/ptrace_scope, or try again as the root user. For more details, see /etc/sysctl.d/10-ptrace.conf ptrace: Operation not permitted. /proc/sys/kernel/yama/ptrace_scope is 1, which is what prevents this, and can't be written to. ptracing child processes does work, but that also works on tasks without allowPtrace set, so for this use case, allowPtrace makes no difference.
IMO we should probably just allow ptrace of anything by-default on test-machines. Security isn't so important on test machines, and given that the allowPtrace option doesn't really work... Let's just either allow it completely or forbid it completely, on a per-workerType config.
@julien, So I thought why not ask a security expert... I heard a bird mention that you where looking into locking down CI, notably arguing for closing all ports and proxying livelogs and interactive session which we're working on separately. What are your thoughts on allowing ptrace of any process within a docker container? In combination with interactive features we have this would arguably be super useful for debugging tests. And we would probably forbid ptrace on workerTypes that build firefox, while allowing it on workers that run tests. pros: + docker uses process id namespaces, so ideally ptrace can only reference processes within the container + people with interactive access are at-least try users + security of test machines might be less important that developer productivity cons: - security isolation of docker containers have had a lot of issues historically, allowing ptrace could add more. Thoughts?
Bringing in le kang for some help here, as he's our resident kernel security expert. My understanding is that you'll need to run containers in privileged mode, or with --security-opt seccomp=unconfined, to get ptrace. If correct (:kang?), this means losing container isolation, and becomes equivalent to running jobs on the base host.
Flags: needinfo?(jvehent) → needinfo?(gdestuynder)
Thanks for taking an interest :) I think docker-worker has a hack where it makes a custom apparmor profile: https://github.com/taskcluster/docker-worker/blob/241eaa401d0c66f307f29098cbae36f3e77271a4/src/lib/task.js#L406-L444
TLDR: You'll have to disable ptrace protections in Yama (likely host wide though I haven't checked if Yama namespaces this - that may be important if both debug and release things run on the same hosts, and I'd recommend not running them on the same host regardless) or use a different isolation mechanism than namespaces (likely, VMs, f.e using the cc-oci runtime for Docker), or decide that debugging cannot occur in these tasks. Longish: Several things may disallow ptrace. In this case as per comment 0 you can attach to child processes but not other processes due to Yama (if this also runs AppArmor it could also block it "next"). ptrace allows full control over other processes in the same namespace, so in this case I'd argue that all you'd be doing is increase the attack surface (as long as your boundary is the 'container'. if you expect process level separation ptrace is obviously a problem) Another issue is that /proc values aren't always namespaced in the kernel, which means changing /proc/sys/kernel/yama/ptrace_scope might change the setting host-wide (on the flip side that'd mean you can set it on the host and have it for all containers ;-) Finally.. I believe more recent versions of Docker ship with a seccomp profile that also disallows ptrace altogether, in which case you'd also need to run with the Docker option `--security-opt seccomp:unconfined`whenever you upgrade as :ulfr mentioned. Note that this does not disable "all isolation", but definitely isn't an option you'd normally want to use :) At the end of the day ptrace (and other calls, really) is/are absolutely required for debugging and if tasks/containers need to be debugged, isolation has to be lowered when using namespaces. Hope this somewhat helps ;-)
Assignee: nobody → wcosta
Priority: -- → P3
You need to log in before you can comment on or make changes to this bug.