Closed Bug 1640612 Opened 5 years ago Closed 4 years ago

socket process gets killed due to sandboxing

Categories

(Core :: Networking: HTTP, task, P2)

x86_64
Linux
task

Tracking

RESOLVED FIXED
mozilla80
Tracking Status
firefox80 --- fixed

People

(Reporter: kershaw, Assigned: kershaw)

References

Details

(Whiteboard: [necko-triaged])

Attachments

(3 files)

I found this when I tried to enable the socket process on try.

[task 2020-05-25T08:56:45.597Z] 08:56:45     INFO - GECKO(1235) | Sandbox: seccomp sandbox violation: pid 1264, tid 1586, syscall 63, args 140327444886880 255 16 140327805271072 0 101008359324075.  Killing process.
[task 2020-05-25T08:56:45.597Z] 08:56:45     INFO - GECKO(1235) | [Parent 1235, Breakpad Server] WARNING: Resource acquired is being released in non-LIFO order; why?
[task 2020-05-25T08:56:45.597Z] 08:56:45     INFO - GECKO(1235) | : file /builds/worker/checkouts/gecko/xpcom/threads/BlockingResourceBase.cpp, line 292
[task 2020-05-25T08:56:45.597Z] 08:56:45     INFO - GECKO(1235) | --- Mutex : dumpSafetyLock (currently acquired)
[task 2020-05-25T08:56:45.597Z] 08:56:45     INFO - GECKO(1235) |  calling context
[task 2020-05-25T08:56:45.597Z] 08:56:45     INFO - GECKO(1235) |   [stack trace unavailable]

I'm not sure what syscall 63 is or what code triggers it.

Hi Michael,
Could you take a look at this? Is there anything the Necko team should do to address this?
Thanks.

Flags: needinfo?(mfroman)

This is a question for the sandboxing team. For Linux, most likely either :jld or :gcp.

Flags: needinfo?(mfroman)
Flags: needinfo?(jld)
Flags: needinfo?(gpascutto)

Syscall 63 is sys_uname.

We don't generally permit this in at-risk processes due to the infoleak it presents, e.g: https://searchfox.org/mozilla-central/source/security/sandbox/linux/SandboxFilter.cpp#1395

You should figure out what wants to know the exact machine and why. The stack seems to point to nsIPowerManagerService but I don't see why that would access it? The tests it has running are in nsHttp which has this: https://searchfox.org/mozilla-central/source/netwerk/protocol/http/nsHttpHandler.cpp#1034
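
As a cross-check, the syscall number in the violation line can be pulled out and compared against the kernel headers. The following is a minimal sketch in C, not code from the tree; the helper names `violation_syscall` and `is_uname` are made up for illustration, and the value 63 for `SYS_uname` holds on x86_64 Linux only.

```c
#include <stdio.h>
#include <string.h>
#include <sys/syscall.h>

/* Hypothetical helper: extract N from the "syscall N" field of a
 * "seccomp sandbox violation" log line. Returns -1 if the field is absent. */
long violation_syscall(const char *line) {
    const char *field = strstr(line, "syscall ");
    long nr;
    if (field == NULL || sscanf(field, "syscall %ld", &nr) != 1) {
        return -1;
    }
    return nr;
}

/* Hypothetical helper: true if nr is uname on the architecture this
 * file is compiled for (SYS_uname is 63 on x86_64 Linux). */
int is_uname(long nr) {
    return nr == SYS_uname;
}
```

Feeding the log line from the description to `violation_syscall` and passing the result to `is_uname` should confirm the mapping on an x86_64 build; other architectures number their syscalls differently.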

Flags: needinfo?(gpascutto)
Flags: needinfo?(jld)
Component: Security: Process Sandboxing → Networking: HTTP

ni? for triage purposes.

Flags: needinfo?(kershaw)

(In reply to Gian-Carlo Pascutto [:gcp] from comment #3)

Syscall 63 is sys_uname.

We don't generally permit this in at-risk processes due to the infoleak it presents, e.g: https://searchfox.org/mozilla-central/source/security/sandbox/linux/SandboxFilter.cpp#1395

You should figure out what wants to know the exact machine and why. The stack seems to point to nsIPowerManagerService but I don't see why that would access it? The tests it has running are in nsHttp which has this: https://searchfox.org/mozilla-central/source/netwerk/protocol/http/nsHttpHandler.cpp#1034

Thanks for the explanation!

Assignee: nobody → kershaw
Flags: needinfo?(kershaw)
Priority: -- → P2
Whiteboard: [necko-triaged]

You should figure out what wants to know the exact machine and why. The stack seems to point to nsIPowerManagerService but I don't see why that would access it? The tests it has running are in nsHttp which has this: https://searchfox.org/mozilla-central/source/netwerk/protocol/http/nsHttpHandler.cpp#1034

I tried removing the call to uname in nsHttpHandler, but the socket process still crashes.
The log below, captured with MOZ_SANDBOX_LOGGING=1, is a bit longer.

0:12.47 GECKO(630588) Sandbox: SandboxBroker: denied op=open rflags=2000000 perms=0 path=/etc/nsswitch.conf for pid=630619
 0:12.47 GECKO(630588) Sandbox: Failed errno -13 op open flags 02000000 path /etc/nsswitch.conf
 0:12.47 GECKO(630588) Sandbox: SandboxBroker: denied op=stat rflags=0 perms=0 path=/run/NetworkManager/resolv.conf for pid=630619
 0:12.47 GECKO(630588) Sandbox: Failed errno -13 op stat flags 00 path /etc/resolv.conf
 0:12.47 GECKO(630588) Sandbox: SandboxBroker: denied op=open rflags=2000000 perms=0 path=/etc/host.conf for pid=630619
 0:12.47 GECKO(630588) Sandbox: Failed errno -13 op open flags 02000000 path /etc/host.conf
 0:12.47 GECKO(630588) Sandbox: SandboxBroker: denied op=open rflags=2000000 perms=0 path=/run/NetworkManager/resolv.conf for pid=630619
 0:12.47 GECKO(630588) Sandbox: Failed errno -13 op open flags 02000000 path /etc/resolv.conf
 0:12.47 GECKO(630588) Sandbox: seccomp sandbox violation: pid 630619, tid 630889, syscall 63, args 139691352176032 255 16 0 0 40912313412987.  Killing process.
 0:12.52 GECKO(630588) [Parent 630588, Breakpad Server] WARNING: Resource acquired is being released in non-LIFO order; why?
 0:12.53 GECKO(630588) : file /home/kershaw/mozilla-central/xpcom/threads/BlockingResourceBase.cpp, line 292
 0:12.53 GECKO(630588) --- Mutex : dumpSafetyLock (currently acquired)
 0:12.53 GECKO(630588)  calling context
 0:12.53 GECKO(630588)   [stack trace unavailable]

It's still difficult to figure out what code in the socket process causes this crash.
:gcp, do you have any idea about this?

Flags: needinfo?(gpascutto)

What test on try produces this failure?

The syscall is the same, so something else is calling uname.

Flags: needinfo?(gpascutto)

(In reply to Gian-Carlo Pascutto [:gcp] from comment #7)

What test on try produces this failure?

The syscall is the same, so something else is calling uname.

I finally got an incomplete stack trace, which shows that this is caused by DNS resolution.
This is not good, since we really want to move DNS resolution to the socket process. I am not sure whether this issue happens only on Linux, so I'll do more tests.

:gcp, is it possible to allow calling uname by adjusting sandboxing rules?

Sandbox: seccomp sandbox violation: pid 845761, tid 846077, syscall 63, args 140737043543456 255 16 0 0 127672244520948.  Killing process.
Sandbox: crash reporter is disabled (or failed); trying stack trace:
Sandbox: frame #01: uname[/lib/x86_64-linux-gnu/libc.so.6 +0xcaa97]
Sandbox: frame #02: gethostname[/lib/x86_64-linux-gnu/libc.so.6 +0xf4d7e]
Sandbox: frame #03: ???[/lib/x86_64-linux-gnu/libc.so.6 +0x11bb4d]
Sandbox: frame #04: ???[/lib/x86_64-linux-gnu/libc.so.6 +0x11d3fb]
Sandbox: frame #05: ???[/lib/x86_64-linux-gnu/libc.so.6 +0x11be35]
Sandbox: frame #06: ???[/lib/x86_64-linux-gnu/libc.so.6 +0x11cd1a]
Sandbox: frame #07: ???[/lib/x86_64-linux-gnu/libc.so.6 +0xe7184]
Sandbox: frame #08: getaddrinfo[/lib/x86_64-linux-gnu/libc.so.6 +0xe7c4d]
Sandbox: frame #09: PR_GetAddrInfoByName[/home/kershaw/mozilla-central/objdir/dist/bin/libnspr4.so +0x294f4]
Sandbox: frame #10: ???[/home/kershaw/mozilla-central/objdir/dist/bin/libxul.so +0x68831e4]
Sandbox: frame #11: ???[/home/kershaw/mozilla-central/objdir/dist/bin/libxul.so +0x6876ab6]
Sandbox: frame #12: ???[/home/kershaw/mozilla-central/objdir/dist/bin/libxul.so +0x687d415]
Sandbox: frame #13: ???[/home/kershaw/mozilla-central/objdir/dist/bin/libxul.so +0x66d4110]
Sandbox: frame #14: ???[/home/kershaw/mozilla-central/objdir/dist/bin/libxul.so +0x66cf6c8]
Sandbox: frame #15: ???[/home/kershaw/mozilla-central/objdir/dist/bin/libxul.so +0x66d2863]
Sandbox: frame #16: ???[/home/kershaw/mozilla-central/objdir/dist/bin/libxul.so +0x6db4416]
Sandbox: frame #17: ???[/home/kershaw/mozilla-central/objdir/dist/bin/libxul.so +0x6d38fdd]
Sandbox: frame #18: ???[/home/kershaw/mozilla-central/objdir/dist/bin/libxul.so +0x6d38f35]
Sandbox: frame #19: ???[/home/kershaw/mozilla-central/objdir/dist/bin/libxul.so +0x66cd064]
Sandbox: frame #20: ???[/home/kershaw/mozilla-central/objdir/dist/bin/libnspr4.so +0x366fa]
Sandbox: frame #21: ???[/lib/x86_64-linux-gnu/libpthread.so.0 +0x8f27]
Sandbox: frame #22: clone[/lib/x86_64-linux-gnu/libc.so.6 +0xfd31f]
Sandbox: frame #23: ??? (???:???)
Sandbox: end of stack.
Flags: needinfo?(gpascutto)

Thanks, that's revealing.

The call sequence makes no sense, but the behavior is "correct", for the strange value of "correct" that defines current Linux implementations. See for example the notes about "nodename" here: https://www.man7.org/linux/man-pages/man2/uname.2.html#NOTES

So, a dummy implementation like we have is probably enough: https://searchfox.org/mozilla-central/source/security/sandbox/linux/SandboxFilter.cpp#1402
Just need a similar one in the socket process.
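
The stack in comment 8 shows glibc's gethostname reaching uname, and on Linux the hostname is the `nodename` field that uname fills in. A minimal sketch in C that compares the two, assuming a Linux libc; the function name `hostname_matches_nodename` is hypothetical and this is not code from the tree:

```c
#include <string.h>
#include <sys/utsname.h>
#include <unistd.h>

/* Hypothetical helper: returns 1 if gethostname() and uname().nodename
 * agree, 0 if they differ, and -1 if either call fails. */
int hostname_matches_nodename(void) {
    char host[256];
    struct utsname u;
    if (gethostname(host, sizeof host) != 0 || uname(&u) != 0) {
        return -1;
    }
    host[sizeof host - 1] = '\0'; /* gethostname may not terminate on truncation */
    return strcmp(host, u.nodename) == 0;
}
```

This is why a resolver that only wants the local hostname can still trip a seccomp rule on uname.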

Flags: needinfo?(gpascutto)

Found another crash due to system call 16. I think it's PR_Available() that triggers this.

Sandbox: seccomp sandbox violation: pid 965314, tid 965320, syscall 16, args 27 21531 140736865490644 140737347350731 140737345324512 0.  Killing process.
Sandbox: crash reporter is disabled (or failed); trying stack trace:
Sandbox: frame #01: ???[/lib/x86_64-linux-gnu/libpthread.so.0 +0x14110]
Sandbox: frame #02: ioctl[/lib/x86_64-linux-gnu/libc.so.6 +0xf4457]
Sandbox: frame #03: ???[/home/kershaw/mozilla-central/objdir/dist/bin/libnspr4.so +0x4e8b2]
Sandbox: frame #04: PR_Available[/home/kershaw/mozilla-central/objdir/dist/bin/libnspr4.so +0x194dd]

:gcp, it seems that I am not the right person to fix this. Could you find someone to have a look?
Thanks.

Flags: needinfo?(gpascutto)

Jed, can you help the Necko team with this ioctl?

Flags: needinfo?(gpascutto) → needinfo?(jld)

That ioctl is FIONREAD, which is part of the tty group for historical reasons:

#define FIONREAD        0x541B
#define TIOCINQ         FIONREAD

As for uname, I think we can allow it in the socket process. The handler gcp refers to was specifically for GMP, where as part of normal operation we're running closed-source code that the user is not expected to trust, and which has an incentive to gather identifying information, so revealing details like the hostname and kernel build identifier is problematic. In that case uname appeared to be used for a coarse-grained kernel version check, and answering 3 was good enough.

The socket process, in contrast, may have legitimate reasons to know the hostname (and I strongly suggest not hiding the hostname unless we know what it's being used for in getaddrinfo as seen in comment #8). Also, because it has direct access to networking, it can see local IP addresses, talk to the Internet to find public IP addresses, and probably has some way to read ethernet hardware IDs, so uname may not be adding anything significant.
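
For context on the FIONREAD request above: it asks the kernel how many bytes are queued for reading on a descriptor, which fits the PR_Available frame in the stack. A minimal sketch in C, assuming Linux; the helper name `bytes_available` is hypothetical and this is not NSPR's implementation:

```c
#include <sys/ioctl.h>

/* Hypothetical helper: number of bytes that can be read from fd
 * without blocking, or -1 if the ioctl fails. */
int bytes_available(int fd) {
    int n = 0;
    if (ioctl(fd, FIONREAD, &n) != 0) {
        return -1;
    }
    return n;
}
```

Called on the read end of a pipe or on a socket, this returns the size of the pending data. Under a seccomp policy that does not list FIONREAD, the same call is what gets the process killed.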

Flags: needinfo?(jld)
Blocks: 1640105

(In reply to Jed Davis [:jld] ⟨⏰|UTC-6⟩ ⟦he/him⟧ from comment #13)

That ioctl is FIONREAD, which is part of the tty group for historical reasons:

#define FIONREAD        0x541B
#define TIOCINQ         FIONREAD

I am not sure I understand how to fix this. Could you explain more?
Thanks.

Flags: needinfo?(jld)

(In reply to Kershaw Chang [:kershaw] from comment #14)

I am not sure I understand how to fix this. Could you explain more?
Thanks.

FIONREAD needs to be added to the request codes allowed here.
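
The idea of an allow-list of ioctl request codes can be sketched as follows. This is plain C for illustration only and does not reproduce the policy code in SandboxFilter.cpp; the function name `ioctl_request_allowed` is hypothetical, and FIONBIO stands in as a placeholder for whatever codes were already permitted, which this thread does not list.

```c
#include <stdbool.h>
#include <sys/ioctl.h>

/* Hypothetical allow-list check: only the listed request codes pass;
 * everything else is treated as a violation. */
bool ioctl_request_allowed(unsigned long request) {
    switch (request) {
        case FIONBIO:  /* placeholder for a previously allowed code */
        case FIONREAD: /* the addition this bug asks for (0x541B on x86_64) */
            return true;
        default:
            return false;
    }
}
```

Keeping the list explicit means other tty-group requests stay blocked even though FIONREAD shares their number range.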

Flags: needinfo?(jld)

The severity field is not set for this bug.
:mayhemer, could you have a look please?

For more information, please visit auto_nag documentation.

Flags: needinfo?(honzab.moz)

(In reply to Release mgmt bot [:sylvestre / :calixte / :marco for bugbug] from comment #17)

The severity field is not set for this bug.
:mayhemer, could you have a look please?

For more information, please visit auto_nag documentation.

This bug is more like a task.

Type: defect → task
Flags: needinfo?(honzab.moz)

Please see the log below. I found that GetAddrInfo fails because the socket process is not allowed to read files in /etc.
Jed, can we allow the socket process to do this?

 1:01.18 GECKO(2197199) Sandbox: SandboxBroker: denied op=open rflags=2000000 perms=0 path=/etc/nsswitch.conf for pid=2197231
 1:01.18 GECKO(2197199) Sandbox: Failed errno -13 op open flags 02000000 path /etc/nsswitch.conf
 1:01.18 GECKO(2197199) Sandbox: SandboxBroker: denied op=stat rflags=0 perms=0 path=/run/NetworkManager/resolv.conf for pid=2197231
 1:01.18 GECKO(2197199) Sandbox: Failed errno -13 op stat flags 00 path /etc/resolv.conf
 1:01.18 GECKO(2197199) Sandbox: SandboxBroker: denied op=open rflags=2000000 perms=0 path=/etc/host.conf for pid=2197231
 1:01.18 GECKO(2197199) Sandbox: Failed errno -13 op open flags 02000000 path /etc/host.conf
 1:01.18 GECKO(2197199) Sandbox: SandboxBroker: denied op=open rflags=2000000 perms=0 path=/run/NetworkManager/resolv.conf for pid=2197231
 1:01.19 GECKO(2197199) Sandbox: Failed errno -13 op open flags 02000000 path /etc/resolv.conf
 1:01.19 GECKO(2197199) Sandbox: Failed errno -2 op open flags 02000000 path /home/kershaw/mozilla-central/objdir/dist/bin/libnss_dns.so.2
 1:01.19 GECKO(2197199) Sandbox: SandboxBroker: denied op=open rflags=2000000 perms=0 path=/etc/ld.so.cache for pid=2197231
 1:01.19 GECKO(2197199) Sandbox: Failed errno -13 op open flags 02000000 path /etc/ld.so.cache
 1:01.19 GECKO(2197199) Sandbox: Failed errno -2 op open flags 02000000 path /home/kershaw/mozilla-central/objdir/dist/bin/libnss_files.so.2
 1:01.19 GECKO(2197199) Sandbox: SandboxBroker: denied op=open rflags=2000000 perms=0 path=/etc/hosts for pid=2197231
 1:01.19 GECKO(2197199) Sandbox: Failed errno -13 op open flags 02000000 path /etc/hosts
Flags: needinfo?(jld)

In theory the socket process would only need a few files from /etc, but since the content process currently allows full read access (https://searchfox.org/mozilla-central/source/security/sandbox/linux/broker/SandboxBrokerPolicyFactory.cpp#306), the socket process might as well.

We could try with a restricted set of files but we'd likely end up with some people with weird configs breaking, so let's not bother for now. (That will make it harder to restrict it later, though!)
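
A directory-wide read rule of the kind discussed here can be sketched as a prefix check. This is a minimal illustration in C, not the SandboxBroker policy API; the function name `read_allowed` is hypothetical.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical policy check: allow reads of /etc itself and anything
 * below it. Requiring '/' or end-of-string after the prefix keeps
 * paths such as "/etcetera" out. Symlinks and ".." are not resolved
 * here, so a target such as /run/NetworkManager/resolv.conf would
 * need its own rule. */
bool read_allowed(const char *path) {
    static const char dir[] = "/etc";
    size_t len = sizeof dir - 1;
    if (strncmp(path, dir, len) != 0) {
        return false;
    }
    return path[len] == '\0' || path[len] == '/';
}
```

A restricted variant would replace the single prefix with an explicit list of files (nsswitch.conf, resolv.conf, host.conf, hosts, ld.so.cache in the log above), at the cost of breaking unusual configurations.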

Flags: needinfo?(jld)

(In reply to Gian-Carlo Pascutto [:gcp] from comment #20)

In theory the socket process would only need a few files from /etc, but since the content process currently allows full read access (https://searchfox.org/mozilla-central/source/security/sandbox/linux/broker/SandboxBrokerPolicyFactory.cpp#306), the socket process might as well.

We could try with a restricted set of files but we'd likely end up with some people with weird configs breaking, so let's not bother for now. (That will make it harder to restrict it later, though!)

Thanks! I'll make a patch as you suggested.

Attachment #9155568 - Attachment description: Bug 1640612 - Allow FIONREAD ioctl for socket process → Bug 1640612 - Allow FIONREAD ioctl for socket process, r=jld
Attachment #9153965 - Attachment description: Bug 1640612 - Deal with uname() for socket process → Bug 1640612 - Deal with uname() for socket process, r=jld
Attachment #9158663 - Attachment description: Bug 1640612 - Allow socket process to read /etc → Bug 1640612 - Allow socket process to read /etc, r=gcp
Pushed by kjang@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/2fe263339790 Deal with uname() for socket process, r=jld
https://hg.mozilla.org/integration/autoland/rev/188dc24e864a Allow FIONREAD ioctl for socket process, r=jld
https://hg.mozilla.org/integration/autoland/rev/c2d1a0de6874 Allow socket process to read /etc, r=gcp
Regressions: 1648624
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Target Milestone: mozilla79 → ---

I believe that bug 1648624 was actually caused by bug 1648189, not this one.
The try push included the faulty patch from bug 1648189. Since bug 1648189 is now verified, I think we can reland the patches from this bug.

Pushed by kjang@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/e36ee4cd7d5c Deal with uname() for socket process, r=jld
https://hg.mozilla.org/integration/autoland/rev/f752ecf794f5 Allow FIONREAD ioctl for socket process, r=jld
https://hg.mozilla.org/integration/autoland/rev/2b560ad42c0d Allow socket process to read /etc, r=gcp
Status: REOPENED → RESOLVED
Closed: 4 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla80