Firefox hangs with Xephyr, both running inside an LXC container when sysctl kernel.unprivileged_userns_clone=1
Categories
(Core :: Security: Process Sandboxing, defect, P1)
Tracking
()
People
(Reporter: francois.lesueur, Assigned: jld)
Details
User Agent: Mozilla/5.0 (X11; Linux x86_64; rv:60.0) Gecko/20100101 Firefox/60.0
Steps to reproduce:
- Install Debian stretch (tested in a VirtualBox VM and on my bare-metal host).
- Install LXC.
- Activate networking in LXC: set USE_LXC_BRIDGE="true" in /etc/default/lxc-net.
- Create a Debian stretch or buster container: lxc-create -n debian -t download -- -d debian -r stretch -a amd64
- Add network and X11 to the container. At the end of /var/lib/lxc/debian/config, remove the existing lxc.network line and add:
  lxc.network.0.type = veth
  lxc.network.0.link = lxcbr0
  lxc.network.0.flags = up
  lxc.mount.entry = /tmp/.X11-unix tmp/.X11-unix none ro,bind,create=dir 0 0
- Log into the container: lxc-start -n debian && lxc-attach -n debian
- Install Xephyr and Firefox: apt install xserver-xephyr firefox-esr
- Run "sysctl kernel.unprivileged_userns_clone=1".
- Create an unprivileged user to start Firefox.
- su - to this unprivileged user.
- Run "DISPLAY=:0 Xephyr :2 &".
- Run "DISPLAY=:2 firefox".
Actual results:
Firefox runs but cannot render any tab. The console outputs many errors:
###!!! [Parent][MessageChannel] Error: (msgtype=0x160061,name=PBrowser::Msg_UpdateDimensions) Channel error: cannot send/recv
###!!! [Parent][MessageChannel] Error: (msgtype=0x160080,name=PBrowser::Msg_Destroy) Channel error: cannot send/recv
Unable to init server: Could not connect: Connection refused
Switching off multiprocess (browser.tabs.remote.autostart in about:config) or setting the sysctl kernel.unprivileged_userns_clone to 0 solves the issue on stretch. With a Debian buster host, even setting kernel.unprivileged_userns_clone to 0 does not solve the issue.
I have tested with both firefox-esr from debian repo and from the up-to-date tar.bz2 on mozilla website, the behavior is similar.
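The sysctl mentioned above can also be inspected programmatically, which may help when comparing stretch and buster hosts. The following is a minimal sketch, assuming the Debian-specific sysctl is exposed at /proc/sys/kernel/unprivileged_userns_clone; the function name `unprivileged_userns_enabled` is hypothetical and not part of any tool discussed in this bug.

```python
from pathlib import Path
from typing import Optional

# Assumption: Debian exposes the sysctl at this path; the file is absent on
# kernels that do not carry the Debian patch.
USERNS_SYSCTL = Path("/proc/sys/kernel/unprivileged_userns_clone")


def unprivileged_userns_enabled(path: Path = USERNS_SYSCTL) -> Optional[bool]:
    """Return True if the sysctl reads "1", False for any other value,
    or None if the file cannot be read (hypothetical helper)."""
    try:
        return path.read_text().strip() == "1"
    except OSError:
        return None


if __name__ == "__main__":
    print("kernel.unprivileged_userns_clone enabled:", unprivileged_userns_enabled())
```

A result of None only means the file could not be read; it says nothing about whether unprivileged user namespaces are actually available.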
Expected results:
Firefox should render tabs
Comment 1•5 years ago
The STR and environment for this bug seem a bit too convoluted for me to try to reproduce, so I'll first take the short path: triaging it to Core::GTK and maybe NI Martin, who may have a better idea of a starting component.
Comment 2•5 years ago
I'm afraid this is out of my scope. It looks like an IPC issue. Does it work if you disable e10s by setting browser.tabs.remote.autostart to false in about:config?
Reporter
Comment 3•5 years ago
Yes, it works when I disable browser.tabs.remote.autostart (setting it to false)
Cheers
Francois
Reporter
Comment 4•5 years ago
(but in my case, this does not solve my issue, since I auto-provision LXC containers, which then run with the default Firefox config, so I'm unable to disable e10s automatically)
Assignee
Comment 6•5 years ago
Looks like sandboxing (kernel.unprivileged_userns_clone) and I'm guessing this is a case of the X11 socket detection code doing not quite the right thing, and possibly a duplicate of bug 1559368 (but with Xephyr instead of Xwayland) given the comment about running Firefox as a different unprivileged user from the X server.
Can you paste the output of ls -l /tmp/.X11-unix?
Reporter
Comment 7•5 years ago
Hi
Inside the container :
ls -l /tmp/.X11-unix
total 0
srwxrwxrwx 1 root root 0 juil. 10 22:19 X0
On the host :
$ ls -l /tmp/.X11-unix
total 0
srwxrwxrwx 1 root root 0 juil. 10 22:19 X0
Bug 1559368 definitely seems related. During my testing, I saw the same behavior at some point (Firefox crashing with the exact same screen as in bug 1559368, and also crashing the same way on restart).
/tmp/.X11-unix is bind-mounted read-only inside the container, so the Xephyr running inside the container cannot create its socket there. I do not know where this Xephyr socket is. I have this line in the output of netstat -alnp inside the container:
netstat -alnp | grep Xephyr
unix 2 [ ACC ] STREAM LISTENING 9455950 509/Xephyr @/tmp/.X11-unix/X15
But I can't "ls" this @/tmp/.X11-unix/X15 ...
Cheers
François
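The netstat line above shows a socket that cannot be listed with ls. A minimal sketch of why, assuming a Linux system: a Unix socket bound to a name that starts with a null byte lives in the abstract namespace and never creates a filesystem entry, although it is normally listed in /proc/net/unix with an @ prefix. The function name and the demo socket name are hypothetical, chosen only for illustration.

```python
import os
import socket


def abstract_socket_demo(name: str = "/tmp/.X11-unix/X-demo-abstract") -> dict:
    """Bind a listening socket in the Linux abstract namespace and report
    where it is visible (hypothetical helper, Linux only)."""
    srv = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    try:
        # A leading null byte selects the abstract namespace; no file is created.
        srv.bind(b"\0" + name.encode())
        srv.listen(1)
        try:
            with open("/proc/net/unix") as f:
                # Abstract names are printed with "@" in place of the null byte.
                listed = any(line.rstrip().endswith("@" + name) for line in f)
        except OSError:
            listed = None  # /proc/net/unix could not be read
        return {"on_filesystem": os.path.exists(name), "in_proc_net_unix": listed}
    finally:
        srv.close()


if __name__ == "__main__":
    print(abstract_socket_demo())
```

As long as no file with the demo name already exists, `on_filesystem` is reported as False even while the socket is listening, which matches the observation that @/tmp/.X11-unix/X15 cannot be seen with ls.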
Reporter
Comment 8•5 years ago
I forgot to mention, to be more precise : Xephyr creates this :15 display and Firefox is running on this :15 display
Reporter
Comment 9•5 years ago
Hi,
From the comment in the X11 socket detection code, I think you're right!
In my case:
- I have a /tmp/.X11-unix in the container (it is bind-mounted from the host). The assumption that this directory does not exist in a container (a snap is the example given in the comment) is thus false.
- The listening socket for :15 is an abstract socket address (the @ at the beginning) and, still according to the comment, this display should then be considered remote.
Cheers,
Francois
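The two observations above can be checked per display rather than per directory. The following is only an illustrative sketch, not Firefox's actual detection code: it tests whether a filesystem socket exists for a given display number and whether an abstract socket with the same name accepts connections. The names `probe_display` and `_can_connect` are hypothetical.

```python
import os
import socket
import stat

X11_DIR = "/tmp/.X11-unix"


def _can_connect(address) -> bool:
    """Return True if a stream connection to the Unix socket address succeeds."""
    s = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    try:
        s.connect(address)
        return True
    except OSError:
        return False
    finally:
        s.close()


def probe_display(display_number: int, x11_dir: str = X11_DIR) -> dict:
    """Report which kinds of local socket exist for an X display
    (hypothetical helper, Linux only)."""
    path = os.path.join(x11_dir, f"X{display_number}")
    try:
        is_fs_socket = stat.S_ISSOCK(os.stat(path).st_mode)
    except OSError:
        is_fs_socket = False
    return {
        "path": path,
        "filesystem_socket": is_fs_socket,
        # Abstract namespace: same name, prefixed with a null byte.
        "abstract_reachable": _can_connect(b"\0" + path.encode()),
    }


if __name__ == "__main__":
    print(probe_display(15))
```

In the setup described here, one would expect display 15 to show no filesystem socket but a reachable abstract one, while display 0 has the bind-mounted filesystem socket; this is an expectation derived from the thread, not a measured result.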
Assignee
Comment 10•5 years ago
Yes, that's an abstract address; that @ is how netstat prints a null byte (a.k.a. \0 or ^@).
So this looks like we'll need to refine that check to look for the actual socket. I'll take this, because I have an idea for how to do this without too much complication.
As a temporary workaround, setting MOZ_ASSUME_USER_NS=0 in the environment should work, similarly to turning off kernel.unprivileged_userns_clone.
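For auto-provisioned containers, the workaround can be applied by whatever launches Firefox instead of by editing its profile. A minimal sketch, assuming Firefox is started from a small Python wrapper; `firefox_env` and `launch_firefox` are hypothetical names, and only the MOZ_ASSUME_USER_NS variable and the DISPLAY value come from this bug.

```python
import os
import subprocess
from typing import Mapping, Optional


def firefox_env(display: str, base_env: Optional[Mapping[str, str]] = None) -> dict:
    """Return a copy of the environment with the workaround applied
    (hypothetical helper; the input mapping is not modified)."""
    env = dict(os.environ if base_env is None else base_env)
    env["DISPLAY"] = display
    # Workaround from this bug: tell Firefox not to assume user namespaces.
    env["MOZ_ASSUME_USER_NS"] = "0"
    return env


def launch_firefox(display: str = ":2", binary: str = "firefox") -> subprocess.Popen:
    """Start Firefox on the given display with the workaround environment."""
    return subprocess.Popen([binary], env=firefox_env(display))


if __name__ == "__main__":
    launch_firefox()
```

Setting the variable in the launcher keeps the default Firefox configuration untouched, which suits containers that are provisioned automatically.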
Reporter
Comment 11•5 years ago
Hi,
Thanks! The workaround MOZ_ASSUME_USER_NS=0 allows my setup to work as expected.
Let me know if you need more tests when you refine the check.
Cheers,
Francois