Broken browser when running under a firejail sandbox

RESOLVED FIXED in Firefox 57

Status

()

Core
Security: Process Sandboxing
P2
normal
RESOLVED FIXED
11 months ago
9 months ago

People

(Reporter: Ralf Jung, Assigned: jld)

Tracking

({hang, regression})

Trunk
mozilla58
hang, regression
Points:
---

Firefox Tracking Flags

(firefox-esr52 unaffected, firefox55 unaffected, firefox56 disabled, firefox57 fixed, firefox58 fixed)

Details

(Whiteboard: sb+)

MozReview Requests

()

Submitter Diff Changes Open Issues Last Updated
Loading...
Error loading review requests:

Attachments

(3 attachments)

(Reporter)

Description

11 months ago
Since the latest nightly upgrade, when I go to <youtube.com>, the website loads (most of the time, sometimes it doesn't even seem to do that) and then freezes.  Hovering the links doesn't affect the mouse cursor, clicking them doesn't do anything

Most other tabs still work, but some do not. Quitting Firefox most of the times leaves a dead content process behind; I assume the one hosting YouTube somehow froze.

about:support says
 Name 	Firefox
Version 	56.0a1
Build ID 	20170726100241
Update History 	
Update Channel 	nightly
User Agent 	Mozilla/5.0 (X11; Linux x86_64; rv:56.0) Gecko/20100101 Firefox/56.0
OS 	Linux 4.11.0+

Multiprocess Windows 	1/1 (Enabled by default)
Web Content Processes 	4/4
Stylo 	false (disabled by default)

I was using the nightly from the 24th most of the day; YouTube still worked fine there.
(Reporter)

Comment 1

11 months ago
Actually this doesn't just happen with YouTube, it also sometimes happens e.g. on Rust's Discourse forum.  Nightly is pretty much unusable right now for me.

Comment 2

11 months ago
Can you find a regression window?
Component: General → Untriaged
Flags: needinfo?(post+mozilla)
Keywords: regression, regressionwindow-wanted
(Reporter)

Comment 3

11 months ago
What does that mean? The July 26th nightly has the problem, the July 24th nightly did not.  Is there some way to get more builds in between?
Flags: needinfo?(post+mozilla)

Comment 4

11 months ago
(In reply to Ralf Jung from comment #3)
> What does that mean? The July 26th nightly has the problem, the July 24th
> nightly did not.  Is there some way to get more builds in between?

Yes! You can use http://mozilla.github.io/mozregression/ and tell it to narrow down the range between July 24th and July 26th, given that that range likely still contains several hundreds of changes.
Flags: needinfo?(post+mozilla)
(Reporter)

Comment 5

11 months ago
Hm, the bug doesn't show up with mozregression. Must be something about my profile. I will investigate.
(Reporter)

Comment 6

11 months ago
Okay, so it's something about the sandbox I use. I run Firefox in a "firejail" sandbox, just to contain the browser a little better and, in particular, to not give it access to sensitive parts of the file system like private keys. Disabling the use of seccomp by firejail fixes the issue. So something about the seccomp filters firejail applies must be in conflict with something that happened the last few days.

I will try to run mozregression inside the sandbox.

Updated

11 months ago
Component: Untriaged → Audio/Video: Playback
Keywords: hang
Product: Firefox → Core
(Reporter)

Comment 7

11 months ago
Created attachment 8891542 [details]
mozregression-log

Result of the regression test:

 8:46.09 INFO: Last good revision: 8e1e06adf80f82d3d5cf08eadaf569a107bd1ecf
 8:46.09 INFO: First bad revision: b5fa08551d6e74d8300fa94f3161afdffd867764
 8:46.09 INFO: Pushlog:
https://hg.mozilla.org/integration/mozilla-inbound/pushloghtml?fromchange=8e1e06adf80f82d3d5cf08eadaf569a107bd1ecf&tochange=b5fa08551d6e74d8300fa94f3161afdffd867764

I have attached the full log.
Flags: needinfo?(post+mozilla)

Comment 8

11 months ago
(In reply to Ralf Jung from comment #6)
> Okay, so it's something about the sandbox I use. I run Firefox in a
> "firejail" sandbox, just to contain the browser a little better and, in
> particular, to not give it access to sensitive parts of the file system like
> private keys. Disabling the use of seccomp by firejail fixes the issue. So
> something about the seccomp filters firejail applies must be in conflict
> with something that happened the last few days.

I'm assuming this is https://firejail.wordpress.com/ ?

gcp, seems this regressed in bug 1308400 - can you take a look?
Blocks: 1308400
Flags: needinfo?(gpascutto)
Keywords: regressionwindow-wanted
Firejail and Firefox's sandbox both use seccomp-bpf to filter and intercept syscalls. I'm a bit surprised this still worked after Firefox 54 made use of seccomp-bpf itself, as far as I remember you couldn't nest seccomp-bpf filters. There might be a possibility here that firejail quietly broke Firefox's sandbox, and this quiet breakage is fatal now that seccomp-bpf usage is used to broker filesystem reads.

I'll defer to jld who's the expert on this, but I suspect this will be a WONTFIX and you can either have firejail, or Firefox's own sandbox, but not both at the same time.
Flags: needinfo?(gpascutto) → needinfo?(jld)
Component: Audio/Video: Playback → Security: Process Sandboxing
Summary: youtube.com freezes → Broken browser when running under a firejail sandbox
Whiteboard: sb?
(Reporter)

Comment 10

11 months ago
> I'm a bit surprised this still worked after Firefox 54 made use of seccomp-bpf itself, as far as I remember you couldn't nest seccomp-bpf filters. There might be a possibility here that firejail quietly broke Firefox's sandbox, and this quiet breakage is fatal now that seccomp-bpf usage is used to broker filesystem reads.

Yeah, I guessed the same.

> you can either have firejail, or Firefox's own sandbox, but not both at the same time.

I can also just turn off seccomp in firejail, which I did and now things all work again.  (I still prefer having Firefox in a separate mount namespace that blocks access to sensitive files like my private keys.  You know, better safe than sorry.)
(Reporter)

Comment 11

11 months ago
It seems that at the very least, seccomp diagnosis is not working the way it should.  No matter whether I have seccomp enabled in firejail (i.e., no matter whether Firefox is in a nested seccomp or not), about:support shows

Seccomp-BPF (System Call Filtering): true
Seccomp Thread Synchronization: true
User Namespaces: true
Content Process Sandboxing: true
Media Plugin Sandboxing: true
Content Process Sandbox Level: 3
Effective Content Process Sandbox Level: 3
I digged a bit and nested seccomp-bpf is supported, we specifically fixed that a year ago. But something firejail is doing might obviously interfere with our sandbox. Needs further investigation.
(Assignee)

Comment 13

11 months ago
Specifically, with multiple filters the kernel will run all of them and take the “most restrictive” result.  More specifically, it uses the numerically lowest return value: kill < trap < error < trace < allow.

Firejail seems to use SECCOMP_RET_KILL[1], so the process will immediately exit with no error messages if it uses a syscall firejail doesn't like.  Given that chroot is one of them[2], I'd expect media plugins (EME CDMs, and OpenH264 which is used by WebRTC but not <video>) to have been broken since Firefox 40 (bug 1151607) on systems with unprivileged user namespaces, and if so this will break the entire browser relatively soon (bug 1213998).

But I haven't seen anything that would interact with bug 1308400 yet.


[1] https://github.com/netblue30/firejail/blob/9a3344f9a569de5a2b619ff9ebc01cbd195ee1d0/src/include/seccomp.h#L139
[2] https://github.com/netblue30/firejail/blob/9a3344f9a569de5a2b619ff9ebc01cbd195ee1d0/src/fseccomp/seccomp.c#L178
(Assignee)

Comment 14

11 months ago
I can reproduce this locally.  This happens:

Sandbox: rewriting /proc/self/status -> /proc/242/status
Sandbox: SandboxBroker: denied op=open rflags=0 perms=0 path=/proc/242/status for pid=242
Sandbox: Failed errno -13 op 0 flags 00 path /proc/242/status

And that's the last anyone heard of the process with local pid 242, which I'm pretty sure is the one with global pid 73946:

jld       73946  1.9  0.0      0     0 pts/7    Zl+  10:32   0:06 [Web Content] <defunct>

Not sure why the parent process hasn't cleaned that up yet.  I think the only time I've seen something like that happen was when we backported seccomp-bpf to 3.0.x kernels for Boot2Gecko and SECCOMP_RET_KILL was buggy such that the parent/ptracer wouldn't be notified; as far as I know it force-exits the process normally on mainline kernels that have seccomp-bpf.

Next is to find out who's looking at /proc/self/status and what they try to do when they fail…
(Assignee)

Comment 15

11 months ago
libnuma (required by libx265, required by ffmpeg).  Fails to open the file at [1], falls back to calling get_mempolicy at [2], killed by firejail due to [3].

So if we allow /proc/self/status, which doesn't seem to have anything too sensitive, then (a) youtube works again without changes to firejail, and (b) we might be able to stop allowing get_mempolicy in our own seccomp policy.

[1] https://github.com/numactl/numactl/blob/v2.0.11/libnuma.c#L382
[2] https://github.com/numactl/numactl/blob/v2.0.11/libnuma.c#L403
[3] https://github.com/netblue30/firejail/blob/9a3344f9a569de5a2b619ff9ebc01cbd195ee1d0/src/fseccomp/seccomp.c#L163
Flags: needinfo?(jld)
(Assignee)

Comment 16

11 months ago
Also, in case someone else needs to debug something like this: I changed the broker policy to have an entry for /proc/%d/status (like the one for /proc/%d/maps) with the SandboxBroker::CRASH_INSTEAD flag set, ran with the env var MOZ_DISABLE_CRASHREPORTER set (probably unnecessary because it's a debug build), and fed the stack dump to tools/rb/fix_linux_stack.py (with the libnuma-dbg package installed so I could get a line number for that frame instead of tracing through disassembly).

Updated

11 months ago
Assignee: nobody → jld
Priority: -- → P2

Updated

11 months ago
Whiteboard: sb? → sb+
(Reporter)

Comment 17

11 months ago
FYI, I reported this against Firejail as <https://github.com/netblue30/firejail/issues/1414> and they, for now, stop blocking get_mempolicy.
(In reply to Jed Davis [:jld] (⏰UTC-6) from comment #15)
> So if we allow /proc/self/status, which doesn't seem to have anything too
> sensitive, then (a) youtube works again without changes to firejail, and (b)
> we might be able to stop allowing get_mempolicy in our own seccomp policy.

This seems like an easy change. Looking into /prof/self/status, I see a bunch of pids including parent pid. We currently block getppid in preparation for bug 1151624. Presumably, ppid in here will be 0 when running inside a namespace.

Given that this is the only thing that broke on /proc/self/status so far, I don't think we have to be afraid that stuff ends up accidentally using ppids this way.
Comment hidden (mozreview-request)
I verified that blocking get_mempolicy makes me crash on YouTube, and that whitelisting that proc entry makes it work again.
(Assignee)

Comment 21

10 months ago
(In reply to Gian-Carlo Pascutto [:gcp] from comment #18)
> This seems like an easy change. Looking into /prof/self/status, I see a
> bunch of pids including parent pid. We currently block getppid in
> preparation for bug 1151624. Presumably, ppid in here will be 0 when running
> inside a namespace.

procfs uses the pid namespace where it was mounted, so that would be the outer ppid.  (This is not how it treats other namespaces, like net and user.)  I'm not too concerned about revealing it.  If at some point in the distant future we take Chrome's approach of remoting all logging and hiding the process's own outer pid from it, then we might reconsider.
(Assignee)

Comment 22

10 months ago
mozreview-review
Comment on attachment 8898342 [details]
Bug 1384804 - Allow libnuma to read /proc/self/status, block get_mempolicy.

https://reviewboard.mozilla.org/r/169704/#review175038
Attachment #8898342 - Flags: review?(jld) → review+

Comment 23

10 months ago
Pushed by gpascutto@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/afdd35ed8902
Allow libnuma to read /proc/self/status, block get_mempolicy. r=jld

Comment 24

10 months ago
bugherder
https://hg.mozilla.org/mozilla-central/rev/afdd35ed8902
Status: NEW → RESOLVED
Last Resolved: 10 months ago
status-firefox57: --- → fixed
Resolution: --- → FIXED
Target Milestone: --- → mozilla57
status-firefox55: --- → unaffected
status-firefox56: --- → disabled
status-firefox-esr52: --- → unaffected
(Assignee)

Comment 25

10 months ago
I backed this out for crashes on get_mempolicy in libnuma.  Example crash report: bp-a2754076-0449-4777-9662-4be9c0170823
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
status-firefox57: fixed → affected
Target Milestone: mozilla57 → ---
(In reply to Jed Davis [:jld] (⏰UTC-6) from comment #25)
> I backed this out for crashes on get_mempolicy in libnuma.  Example crash
> report: bp-a2754076-0449-4777-9662-4be9c0170823

We should've only backed out the get_mempolicy block, not the unblock of the /proc/pid/status.
Created attachment 8913220 [details] [diff] [review]
Allow reading /proc/self/status for libnuma

MozReview-Commit-ID: LLwmPVtj0PE
Attachment #8913220 - Flags: review?(jld)
(Assignee)

Updated

9 months ago
Attachment #8913220 - Flags: review?(jld) → review+

Comment 30

9 months ago
bugherder
https://hg.mozilla.org/mozilla-central/rev/d07bcfc779ee
Status: REOPENED → RESOLVED
Last Resolved: 10 months ago9 months ago
status-firefox58: --- → fixed
Resolution: --- → FIXED
Target Milestone: --- → mozilla58
gcp, do you want to uplift that to 57 (if really affected)
Flags: needinfo?(gpascutto)
Comment on attachment 8913220 [details] [diff] [review]
Allow reading /proc/self/status for libnuma

Approval Request Comment
[Feature/Bug causing the regression]: bug 1308400 and the previously landed part this bug itself
[User impact if declined]: Running Firefox under older firejails can crash. (Very limited amount of users affected)
[Is this code covered by automated tests?]: No
[Has the fix been verified in Nightly?]: Landed a few days ago
[Needs manual test from QE? If yes, steps to reproduce]: No
[List of other uplifts needed for the feature/fix]: N/A
[Is the change risky?]: No
[Why is the change risky/not risky?]: Adding an entry to a whitelist
[String changes made/needed]: N/A
Flags: needinfo?(gpascutto)
Attachment #8913220 - Flags: approval-mozilla-beta?

Comment 33

9 months ago
Comment on attachment 8913220 [details] [diff] [review]
Allow reading /proc/self/status for libnuma

Recent regression, Beta57+
Attachment #8913220 - Flags: approval-mozilla-beta? → approval-mozilla-beta+

Comment 34

9 months ago
bugherderuplift
https://hg.mozilla.org/releases/mozilla-beta/rev/4a1a1eab35ce
status-firefox57: affected → fixed
You need to log in before you can comment on or make changes to this bug.