Closed Bug 1461848 Opened 6 years ago Closed 6 years ago

Snap sandbox breaks crash reporting

Categories

(Release Engineering Graveyard :: Release Automation: Snap, defect, P2)

Unspecified
Linux

Tracking

(firefox61 wontfix, firefox62 affected, firefox63 affected)

VERIFIED FIXED
Tracking Status
firefox61 --- wontfix
firefox62 --- affected
firefox63 --- affected

People

(Reporter: jld, Assigned: jld)

References

Details

STR: start Snap-packaged Firefox; find a content process with `ps x`; do `kill -11 [pid]`.

Expected: the sad-tab page offers to submit a report for the (fake) crash

Actual: it just offers to reload or close the tab

Control: the same STR but running /snap/firefox/current/firefox directly (without the Snap sandboxing)


There's a big clue in dmesg — 10 instances of this line, differing only in timestamps and serial number:

[  439.899770] audit: type=1326 audit(1526420849.130:182): auid=4294967295 uid=1000 gid=1000 ses=4294967295 pid=5205 comm="firefox" exe="/snap/firefox/85/firefox" sig=0 arch=c000003e syscall=101 compat=0 ip=0x7f76ac2309d9 code=0x50000

It's not explicitly stated in the message, but from browsing the kernel source, this is a seccomp-bpf denial from a policy with logging enabled.  Specifically: 0x50000 is SECCOMP_RET_ERRNO, pid 5205 is the Firefox parent process, and syscall 101 is ptrace.

(And I'm cheating, because I went into this expecting that crash reporting was probably broken and it'd probably be the Snap sandbox not allowing ptrace, and in fact that's why I tested it in the first place.)


But it turns out there's a reason ptrace isn't allowed: on older kernels it can be used to bypass the seccomp-bpf policy, as described (and patched) in https://lkml.org/lkml/2016/5/26/354

It might be possible to filter the ptrace arguments with seccomp-bpf to allow what breakpad uses but not the operations needed for the exploit; I haven't looked into the details yet, and I don't know how expressive Snap's BPF compiler's source language is.

And to prevent confusion: none of this has anything to do with our own sandboxing, except that they both use seccomp-bpf.
Thanks for the thorough report, Jed! Ken, have you guys already seen a case where `ptrace` calls were needed in Snaps?
Flags: needinfo?(ken.vandine)
ptrace commands used by Breakpad, from searching the source: PTRACE_ATTACH, PTRACE_DETACH, PTRACE_GETREGS, PTRACE_GETREGSET, PTRACE_GETFPREGS, PTRACE_GETFPXREGS, PTRACE_PEEKDATA

In particular, Breakpad doesn't appear to need anything that would modify the tracee's registers or memory, which makes sense given what it does.  In contrast, the bypass exploit seems to need one of the register-changing operations.  The Linux kernel commit linked above added some regression tests to tools/testing/selftests/seccomp/seccomp_bpf.c; the function change_syscall seems to be the important part, and it uses either PTRACE_SETREGS or PTRACE_SETREGSET.
(Adjusting summary because parent process crashes are handled the same way — except that the ptracer is a child process cloned in the crash handler and authorized with prctl(PR_SET_PTRACER, ...) — so they'd have the same problem.)
Summary: Snap sandbox breaks child process crash reporting → Snap sandbox breaks crash reporting
I have a patch for snapd (and snap-seccomp) that works for content process crashes; proof-of-concept at bp-eb12584b-e3a5-4fd8-8b7c-e488e0180523.  I'll clean up the changes and send a PR.

Currently I'm just appending the ptrace rules to the browser-support policy.  This might make more sense as part of a separate interface just for Breakpad, because there are other applications that use it, but I don't know how much else from Firefox's collection of policies would need to be copied over.

For a parent process crash, the crashreporter executable fails to start because the dynamic linker can't find `libgtk-3.so.0`.  I haven't investigated that yet, and that may need to be a separate bug.
Assignee: nobody → jld
Flags: needinfo?(ken.vandine)
Priority: -- → P2
Jed, do you have an update on this? It would be nice to have the crashes from snap to be able to evaluate it stability!
Blocks: snappy
Severity: normal → major
Flags: needinfo?(jld)
The Snap packages aren't very popular yet, if this Telemetry query I did on Linux builds' distribution IDs is accurate: https://sql.telemetry.mozilla.org/queries/56359/source (link is MoCo only; `is_snap` being false vs. null is approximately distro builds vs. Mozilla's tarball releases).  I should have explicitly stated this earlier, but that's why I haven't prioritized this bug higher.

Also, it's worth keeping in mind that the fix will have to be released by Snapcraft and incorporated into distributions, so we're going to be looking at a limited subset of Snap users when we start getting crash reports.  And parent process crashes have other problems in addition to this one; see comment #4, and I'll file a separate bug.

That said, I've cleaned up the FIXME comments, and once I've figured out the legal situation with Canonical's CLA I'll send a PR.
mkaply, rumor goes that you might know if we've got a CLA with Canonical/Ubuntu? We'd like to contribute code fixes upstream.
Flags: needinfo?(mozilla)
Sorry this took so long. Yes, we have an approved CLA with Canonical. You can make contributions/submit patches.
Flags: needinfo?(mozilla)
Finally got enough of the 'i's dotted and 't's crossed with Legal.

PR: https://github.com/snapcore/snapd/pull/5727
Flags: needinfo?(jld)
Status update: the upstream PR is still waiting for review.
Landed upstream: https://github.com/snapcore/snapd/commit/093b3662047123fe684a0d1dae27775593f1dc1f

(But not in a release yet.)
Status: NEW → RESOLVED
Closed: 6 years ago
Resolution: --- → FIXED
Verified on Ubuntu 16.04 with `kill -11` (and by overwriting the distribution's `snapd` and `snap-seccomp` binaries with the ones from building the upstream master branch, which probably shouldn't work but does): bp-da203858-0f19-4081-bb92-c29cc0181031
Status: RESOLVED → VERIFIED
See Also: → 1653852
Product: Release Engineering → Release Engineering Graveyard
You need to log in before you can comment on or make changes to this bug.