Enable sending more than 4 file descriptors in the IPC channel

RESOLVED FIXED in mozilla33

Status

()

defect
RESOLVED FIXED
5 years ago
4 years ago

People

(Reporter: chiajung, Assigned: billm)

Tracking

unspecified
mozilla33
x86_64
Gonk (Firefox OS)
Points:
---
Dependency tree / graph

Firefox Tracking Flags

(tracking-b2g:backlog)

Details

Attachments

(2 attachments, 2 obsolete attachments)

After NUWA's enabling, content process ask all top level fd mapping on init, however, current IPC system limit the file_descriptor_set size to 4 in one IPC message, which makes it impossible to add more top-level protocol.

Make NUWA use the PFileDescriptorSet should fix the problem
blocking-b2g: --- → 1.5?
We'll need to send the file descriptors separately as in bug 910010 or we'll run into the limit in PContentChild::SendAddNewProcess()
See Also: → 910010
I can't reuse PFileDecriptorSet protocol for sending/receiving FDs when Nuwa forks a new child because it's sending an array of protocolID->FD mapping, not a FD array. Or maybe I we should add a new protocol similar to PFileDescriptorSet.
Assignee: nobody → cyu
It works, any plan for land this?
Actually I prefer fixing this problem in the IPC layer to hide the fact that that we can only send 4 FDs over the wire in one shot. We can send the FDs in batches of 4 and this can be totally transparent to the upper layer.
Component: General → IPC
What's the user impact if we don't fix this?
Flags: needinfo?(cyu)
At this point, with bug 959089 patched in a way that it doesn't depend on this, I'd say the user impact is non-existent.  It's about good code at this point.
Sounds like we shouldn't block on this then.
blocking-b2g: 2.0? → backlog
Flags: needinfo?(cyu)
I'll revisit this bug and start working on v2 patch to fix this problem for good from the bottom layer of the IPC channel. The idea is sending the file descriptor in separate messages on the sending side and reassemble it on the receiving side. This is totally transparent to the upper layer so we don't need to work around the limit in the IPC actor layer.
This handles a large file descriptor set purely in the IPC layer without resorting to PFileDescriptorSet and the like. An IPC actor can continue to use nsTArray<FileDescriptor> with an arbitrarily large number of file descriptors. The IPC layer will send the file descriptors using multiple messages transparently in the lower layer.

TODO: Find a cleaner, less-copy approach in the receiving side.
Attachment #8398455 - Attachment is obsolete: true
Attachment #8429209 - Flags: feedback?(benjamin)
I change MAX_DESCRIPTORS_PER_MESSAGE from 7 back to 4 and it works to send 5 file descriptors. MAX_DESCRIPTORS_PER_MESSAGE can be a small as 1, which effectively does the same thing as PFileDescriptor, but transparently in the lower layer.
Comment on attachment 8429209 [details] [diff] [review]
WIP: Span the file descriptor set in multiple messages

Somebody else needs to look at this; I'm overloaded.
Attachment #8429209 - Flags: feedback?(benjamin)
Summary: Make NUWA use PFileDescriptorSet to support over 4 toplevel protocols → Enable sending more than 4 file descriptors in the IPC channel
Cervantes, what's the story for Windows? Does it have this problem also?
Flags: needinfo?(cyu)
Comment on attachment 8431528 [details] [diff] [review]
Send a large file descriptor set using multiple messages in the IPC channel

Review of attachment 8431528 [details] [diff] [review]:
-----------------------------------------------------------------

billm has agreed to look over this so that we can expand the set of eyes folks who know a bit about this code (thanks!).

I just looked through quickly and had a few comments but I didn't do a full review.

::: ipc/chromium/src/chrome/common/file_descriptor_set_posix.cc
@@ +120,5 @@
> +      descriptors_.begin() + MAX_DESCRIPTORS_PER_MESSAGE,
> +      descriptors_.end());
> +  descriptors_.resize(MAX_DESCRIPTORS_PER_MESSAGE);
> +
> +  return overflow_fds;

Are you sure that this won't copy again? You could change the signature to ensure it:

  void FileDescriptorSet::StripOverflowFileDescriptors(std::vector<base::FileDescriptor>&)

::: ipc/chromium/src/chrome/common/ipc_channel_posix.cc
@@ -553,3 @@
>          if (m.header()->num_fds) {
>            // the message has file descriptors
> -          const char* error = NULL;

I'm concerned that we're dropping this protection. We should still be able to reason about the number of messages that we should receive in every message, right? As well as the total number of messages that we will eventually get.

@@ +694,5 @@
> +        Message* msg_overflow_fds =
> +            new Message(msg->routing_id(),
> +                        msg->type(),
> +                        msg->priority(),
> +                        Message::MessageCompression(msg->compress()),

This seems wrong... If we compress we're supposed to toss out the most recent previous message. I guess we won't actually do that so let's just always say no compression here.
Attachment #8431528 - Flags: review?(bent.mozilla) → review?(wmccloskey)
(In reply to ben turner [:bent] (use the needinfo? flag!) from comment #14)
> Cervantes, what's the story for Windows? Does it have this problem also?

I don't think we have this problem on windows. File descriptor passing is available only in unix domain socket, not on Windows.
Flags: needinfo?(cyu)
(In reply to ben turner [:bent] (use the needinfo? flag!) from comment #15)
> Comment on attachment 8431528 [details] [diff] [review]
> Send a large file descriptor set using multiple messages in the IPC channel
> 
> Review of attachment 8431528 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> billm has agreed to look over this so that we can expand the set of eyes
> folks who know a bit about this code (thanks!).
> 
> I just looked through quickly and had a few comments but I didn't do a full
> review.
> 
> ::: ipc/chromium/src/chrome/common/file_descriptor_set_posix.cc
> @@ +120,5 @@
> > +      descriptors_.begin() + MAX_DESCRIPTORS_PER_MESSAGE,
> > +      descriptors_.end());
> > +  descriptors_.resize(MAX_DESCRIPTORS_PER_MESSAGE);
> > +
> > +  return overflow_fds;
> 
> Are you sure that this won't copy again? You could change the signature to
> ensure it:
> 
This vector can be moved, right? Using a return value makes it clear that the overflow fds are returned after this function call.

>   void
> FileDescriptorSet::StripOverflowFileDescriptors(std::vector<base::
> FileDescriptor>&)
> 
> ::: ipc/chromium/src/chrome/common/ipc_channel_posix.cc
> @@ -553,3 @@
> >          if (m.header()->num_fds) {
> >            // the message has file descriptors
> > -          const char* error = NULL;
> 
> I'm concerned that we're dropping this protection. We should still be able
> to reason about the number of messages that we should receive in every
> message, right? As well as the total number of messages that we will
> eventually get.
> 
Yes, we can assert that. Will add the checks in next version's patch.

> @@ +694,5 @@
> > +        Message* msg_overflow_fds =
> > +            new Message(msg->routing_id(),
> > +                        msg->type(),
> > +                        msg->priority(),
> > +                        Message::MessageCompression(msg->compress()),
> 
> This seems wrong... If we compress we're supposed to toss out the most
> recent previous message. I guess we won't actually do that so let's just
> always say no compression here.

Actually this message will never escape from the chromium IPC layer into the glue layer. It will consumed in Channel::ChannelImpl::ProcessIncomingMessages() so setting compress here doesn't really matter. But this points out that we can assert that the message group used to send the file descriptor set should contain the same routing ID, type, priority values.
I have a few questions, since this code is pretty new to me. Is this patch a clean-up or something we need? It sounds like we increased the limit from 4 to 7 for bug 959089. Do we need more than 7 now?

Also, I don't understand why we have a hard limit at all. Does the OS enforce a limit? The best reason I can see for a limit is here:

http://mxr.mozilla.org/mozilla-central/source/ipc/chromium/src/chrome/common/ipc_channel_posix.h#107

It looks like we have a very conservative calculation of how big the control buffer needs to be for a given message buffer size. If that's the reason for this limitation, it seems like there might be a nicer way to get around it. Perhaps we could send an initial message telling how many file descriptors we're going to send in the main message. Then the receiving side could allocate a large enough control buffer before reading the main message.
(In reply to Bill McCloskey (:billm) from comment #18)
> I have a few questions, since this code is pretty new to me. Is this patch a
> clean-up or something we need? It sounds like we increased the limit from 4
> to 7 for bug 959089. Do we need more than 7 now?
> 
It removes one limitation that we currently have so I think it's something we will need. Bug 959089 increases the limit to 7, but we (maybe soon) will reach this limit one day. Project silk (bug 987532) could bump the number of used file descriptors and things like this will increase the pressure of bumping the limit again.

> Also, I don't understand why we have a hard limit at all. Does the OS
> enforce a limit? The best reason I can see for a limit is here:
> 
> http://mxr.mozilla.org/mozilla-central/source/ipc/chromium/src/chrome/common/
> ipc_channel_posix.h#107
> 
> It looks like we have a very conservative calculation of how big the control
> buffer needs to be for a given message buffer size. If that's the reason for
> this limitation, it seems like there might be a nicer way to get around it.
> Perhaps we could send an initial message telling how many file descriptors
> we're going to send in the main message. Then the receiving side could
> allocate a large enough control buffer before reading the main message.
Yes, it's very conservative as to how many file descriptors can be sent in one message. The limit of 4 is also quite arbitrary. Maybe chromium doesn't need to send so many file descriptors, or it implements in a higher layer something similar as in this patch.

Your suggestion will not work because problem is not in the receiving side but in the kernel. The kernel implementation of sendmsg() has a limited buffer for passing file descriptors. If we send more file descriptors than what the kernel allows, the kernel just discard the overflow ones and tells the receiving side:
http://mxr.mozilla.org/mozilla-central/source/ipc/chromium/src/chrome/common/ipc_channel_posix.cc#497

So a large file descriptor set needs to be sent using multiple sendmsg() calls anyway, which is what this patch does (only with a conservative batch size).
(In reply to Cervantes Yu from comment #19)
> Your suggestion will not work because problem is not in the receiving side
> but in the kernel. The kernel implementation of sendmsg() has a limited
> buffer for passing file descriptors. If we send more file descriptors than
> what the kernel allows, the kernel just discard the overflow ones and tells
> the receiving side:
> http://mxr.mozilla.org/mozilla-central/source/ipc/chromium/src/chrome/common/
> ipc_channel_posix.cc#497
> 
> So a large file descriptor set needs to be sent using multiple sendmsg()
> calls anyway, which is what this patch does (only with a conservative batch
> size).

In Linux, we're limited to 253 file descriptors:
http://lxr.free-electrons.com/source/include/net/scm.h#L13
http://lxr.free-electrons.com/source/net/core/scm.c#L79

On Mac, the limit seems to be 254 based on testing. However, the default ulimit for max number of file descriptors on Macs is 256, so I had to boost that to even test this.

How many file descriptors will we need to send? 253 seems like a lot.
(In reply to Bill McCloskey (:billm) from comment #20)
> In Linux, we're limited to 253 file descriptors:
> http://lxr.free-electrons.com/source/include/net/scm.h#L13
> http://lxr.free-electrons.com/source/net/core/scm.c#L79
> 
> On Mac, the limit seems to be 254 based on testing. However, the default
> ulimit for max number of file descriptors on Macs is 256, so I had to boost
> that to even test this.
> 
> How many file descriptors will we need to send? 253 seems like a lot.

Yes, 253 is much more than we need. I think we sill soon hit the limit of 7, but the increase should be slow. So 253 will be enough, at least in the foreseeable future.
It should be noted that JS can create DOM blobs that are backed by an arbitrary number of file descriptors currently...
(In reply to ben turner [:bent] (use the needinfo? flag!) from comment #22)
> It should be noted that JS can create DOM blobs that are backed by an
> arbitrary number of file descriptors currently...

We should probably not allow a given origin to have more than 10 or 20 files open. Otherwise a web page could use up all your file descriptors.

However, I also realized that what I proposed in comment 18 doesn't make sense. The chromium code seems to assume that it's possible for a single recvmsg call to get file descriptors that came from multiple sendmsg calls. I haven't actually been able to get that to happen on Linux, but maybe it's possible. If so, my proposal wouldn't work.

The main assumption the chromium code makes is that file descriptors are "associated" with buffer data if the file descriptors were sent in the same sendmsg call as the buffer. And a recvmsg won't return a file descriptor if it doesn't also return the associated buffer data. Let's assume that's true.

In that case, if we're calling recvmsg and asking for 4096 bytes of buffer data, then we need to make the control buffer big enough to fit all the file descriptors that were associated with those 4096 bytes. The way chromium does this is to limit you to 4 (or 7) fds per Message, figure out the maximum number of messages that could possibly fit into 4096 bytes based on the size of MessageHeader, and then multiply that by the max number of fds per message.

I think there's a better way to do it though. Every time we attach a file descriptor to a message, we also write 4 bytes into the message data:
http://mxr.mozilla.org/mozilla-central/source/ipc/chromium/src/chrome/common/ipc_message.cc#131
That means that, to send one file descriptor, we need to use 4 bytes of buffer data. So 4096 bytes of buffer data could never contain more than 1024 file descriptors. Therefore, the control data never needs to be bigger than 4096 bytes itself.

So, as best as I can tell, it would be fine to make the control buffer be the same size as the data buffer (Channel::kReadBufferSize). And then we can completely eliminate MAX_DESCRIPTORS_PER_MESSAGE. How does that sound to you guys?
Flags: needinfo?(cyu)
Flags: needinfo?(bent.mozilla)
(In reply to Bill McCloskey (:billm) from comment #23)
> (In reply to ben turner [:bent] (use the needinfo? flag!) from comment #22)
> > It should be noted that JS can create DOM blobs that are backed by an
> > arbitrary number of file descriptors currently...
> 
> We should probably not allow a given origin to have more than 10 or 20 files
> open. Otherwise a web page could use up all your file descriptors.
> 
This places a hidden implementation detail to the web apps that run on the implementation. A resource-hungry app will fail to run even the system still has enough resource and the user is willing to dedicate the whole system just to let the app run. This sounds like a bad thing to the user or the app developer. I don't think we should put such a hidden constraint to web apps.

> I think there's a better way to do it though. Every time we attach a file
> descriptor to a message, we also write 4 bytes into the message data:
> http://mxr.mozilla.org/mozilla-central/source/ipc/chromium/src/chrome/common/
> ipc_message.cc#131
> That means that, to send one file descriptor, we need to use 4 bytes of
> buffer data. So 4096 bytes of buffer data could never contain more than 1024
> file descriptors. Therefore, the control data never needs to be bigger than
> 4096 bytes itself.
> 
The 4 bytes of data in the buffer is redundant and are only used to limit the number of file descriptors sent over the wire and let us eliminate MAX_DESCRIPTORS_PER_MESSAGE in buffer allocation, right?

> So, as best as I can tell, it would be fine to make the control buffer be
> the same size as the data buffer (Channel::kReadBufferSize). And then we can
> completely eliminate MAX_DESCRIPTORS_PER_MESSAGE. How does that sound to you
> guys?
Even if we eliminate MAX_DESCRIPTORS_PER_MESSAGE there is still a limit of 253 in the linux kernel (and a similar value of 254 on OSX). I think the only limit should be RLIMIT_NOFILE and we should try our best to satisfy the needs of an app, even with the kernel limit of 253.
Flags: needinfo?(cyu)
(In reply to Cervantes Yu from comment #24)
> > We should probably not allow a given origin to have more than 10 or 20 files
> > open. Otherwise a web page could use up all your file descriptors.
> > 
> This places a hidden implementation detail to the web apps that run on the
> implementation. A resource-hungry app will fail to run even the system still
> has enough resource and the user is willing to dedicate the whole system
> just to let the app run. This sounds like a bad thing to the user or the app
> developer. I don't think we should put such a hidden constraint to web apps.

Well, I guess apps are different. I was thinking of web pages. Even still, though, it would be nice if we could ensure that a few file descriptors were left for the browser itself.

> The 4 bytes of data in the buffer is redundant and are only used to limit
> the number of file descriptors sent over the wire and let us eliminate
> MAX_DESCRIPTORS_PER_MESSAGE in buffer allocation, right?

Well, we do use those 4 bytes to decide how to read file descriptors out of the message. The code already exists in chromium.

> Even if we eliminate MAX_DESCRIPTORS_PER_MESSAGE there is still a limit of
> 253 in the linux kernel (and a similar value of 254 on OSX). I think the
> only limit should be RLIMIT_NOFILE and we should try our best to satisfy the
> needs of an app, even with the kernel limit of 253.

I guess if there were a legitimate use case for sending more than 250 file descriptors in a single message, while never using more than 1024 file descriptors total in the process, then I would feel better about doing this. If you can think of something, I'll reconsider my position. Even then, though, the app isn't going to work on Macs, which seems kinda broken.

Besides the added complexity, the cost of taking this patch is that our IPC code will diverge further from chromium's. We have several different copies of the chromium base libraries in the tree and it's a significant problem. People working on sandboxing are running into tricky linker issues because of this. I suspect that eventually we'll just want to have one copy that tracks Google's version. If we take this patch, it will just make it harder to do that.
Posted patch ipc-more-fdsSplinter Review
Just to make things more concrete, here's what I'm proposing.
(In reply to Bill McCloskey (:billm) from comment #25)
> I guess if there were a legitimate use case for sending more than 250 file
> descriptors in a single message, while never using more than 1024 file
> descriptors total in the process, then I would feel better about doing this.
> If you can think of something, I'll reconsider my position. Even then,
> though, the app isn't going to work on Macs, which seems kinda broken.
> 
I can't think of a real use case that can send such many file descriptors in one message. Ben mentioned DOM blobs backed by an arbitrary number of file descriptors in comment #22, but that is handled by using PFileDescriptorSet protocol to send the file descriptors in separate messages, right?

If that's the case, I am good with your proposed approach.
I really like billm's approach!
Flags: needinfo?(bent.mozilla)
Cervantes, do you have an easy way to test this patch?
Flags: needinfo?(cyu)
It doesn't look like we have an easy way to test this. What I can think of is adding a debug-only IPC message that generates and sends many file descriptors.
Flags: needinfo?(cyu)
Comment on attachment 8444579 [details] [diff] [review]
ipc-more-fds

I guess I'll ask for review on this. I'm afraid I don't have time to write tests for it though.
Attachment #8444579 - Flags: review?(bent.mozilla)
Comment on attachment 8444579 [details] [diff] [review]
ipc-more-fds

Review of attachment 8444579 [details] [diff] [review]:
-----------------------------------------------------------------

r=me as long as the slop thing is addressed.

::: ipc/chromium/src/chrome/common/ipc_channel_posix.h
@@ +103,5 @@
>    // We read from the pipe into this buffer
>    char input_buf_[Channel::kReadBufferSize];
>  
>    enum {
> +    kControlBufferSlopBytes = 32

I don't see how you came up with this number. Don't you need to use the CMSG_SPACE macro here? This at least needs a comment saying how it was determined.

@@ +109,5 @@
>  
> +  // This is a control message buffer large enough to hold all the file
> +  // descriptors that will be read in when reading Channel::kReadBufferSize
> +  // bytes of data. Message::WriteFileDescriptor always writes one word of
> +  // data for ever file descriptor added to the message, so kReadBufferSize

Nit: s/ever/every/
Attachment #8444579 - Flags: review?(bent.mozilla) → review+
There's a comment in the current version that says:
  // TODO(agl): OSX appears to have non-constant CMSG macros!
Based on this, I assumed that we wouldn't be able to use CMSG_SPACE to determine the size of input_cmsg_bug_.

The google people "solved" this by just randomly picking a number for the size of input_cmsg_bug_ on Macs. Rather than do that, I picked a number that would be as big as CMSG_SPACE(0) on all platforms and put it in kControlBufferSlopBytes and then asserted at runtime that kControlBufferSlopBytes >= CMSG_SPACE(0).

Should I just add a comment to this effect or do you want something differen?
Flags: needinfo?(bent.mozilla)
That would be totally fine!
Flags: needinfo?(bent.mozilla)
Assignee: cyu → wmccloskey
https://hg.mozilla.org/mozilla-central/rev/4c5c545d7874
Status: NEW → RESOLVED
Closed: 5 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla33
blocking-b2g: backlog → ---
You need to log in before you can comment on or make changes to this bug.