Should file descriptor/handle limits should be larger to accomodate IPC shared memory?

RESOLVED FIXED in Firefox 62

Status

()

enhancement
P3
normal
RESOLVED FIXED
2 years ago
Last year

People

(Reporter: jld, Assigned: jld)

Tracking

Trunk
mozilla62
Points:
---

Firefox Tracking Flags

(firefox62 fixed)

Details

(Whiteboard: sb+)

Attachments

(1 attachment)

Currently, the code that deals with increasing file descriptor/handle limits is in the SocketTransportService.  I assume this is because historically the main user of that resource was networking.  But, post-e10s, it's possible to have hundreds of shared memory segments as well as hundreds of sockets, and maybe a total limit of 1000 for the parent process isn't as high as it used to be.

(Specifically, I just had to file bug 1401774 because of a case where the parent process really did run out of file descriptors, although I gather that involved several dozen open windows, so I don't know what fraction of the userbase might be affected.  Still, if we can make things not break by increasing a number somewhere, that seems like a good idea.)
Just FYI, we recently changed DOM Cache API code to be less aggressive about opening files since we say crashes due to file descriptor exhaustion on some sites/platforms.  See bug 1397128.
Priority: -- → P3
Whiteboard: sb+
Potentially useful insights:

https://bugzilla.mozilla.org/show_bug.cgi?id=1389812#c10

I got the impression that yes, 1k is on the low side.  Then again, raising the limit doesn't remove the possibility of running out (and not all system allow more fds), so it's no substitute for fixes like what bkelly did for bug 1397128.
The default on Fedora (due to systemd it seems) is 1024 fd's.  I don't think an application can raise it higher, but I can check if you can tell me what to check
For each resource limit there's a “soft limit” (rlim_cur), which can be raised up to the “hard limit” (rlim_max) even by unprivileged processes.  So the hard limit can't be exceeded, but the application has to explicitly opt in to anything above the soft limit.  From a shell, it looks like this:

$ limit desc
descriptors     1024
$ limit -h desc
descriptors     1048576

And/or like this:

$ ulimit -n
1024
$ ulimit -Hn
1048576

nsSocketTransportService::DiscoverMaxCount is where we seem to be inspecting and, if needed, modifying the descriptor limit: https://searchfox.org/mozilla-central/rev/15ce5cb2db0c85abbabe39a962b0e697c9ef098f/netwerk/base/nsSocketTransportService2.cpp#1555
I *think* (since I modified it I can't check) that the soft and hard limits were 1024 on Fedora 26 - but maybe I mis-remember.  Worthwhile doing an install on a VM and checking, or just find a Fedora where no one has mucked with it.  Right now mine says 1024/4096

This bug comment from July is relevant: https://bugzilla.redhat.com/show_bug.cgi?id=1338760#c8

We're probably ok for a fair while with 4096, if we get that on most linux systems
I'll check as I've been on Fedora a bit to fix bug 1382323.
Flags: needinfo?(gpascutto)
Fedora 26:

ulimit -n
1024
ulimit -Hn
4096

This is as of Fedora 24: https://bugzilla.redhat.com/show_bug.cgi?id=1338760 and probably an unintended consequence of systemd.

On Ubuntu & friends this is indeed much higher. I couldn't find where the limit actually comes from on either system (there's a dozen places it can be *changed*, but they all appear to be using the defaults.

In any case, we'll have to live with this, I guess.
Flags: needinfo?(gpascutto)
See Also: → 1434501
(Fedora 27 here)
I came here after reading https://bugzilla.mozilla.org/show_bug.cgi?id=1421353#c20
Since I would like to make some tests, I need infos about how to start Firefox with an increased max file descriptor limit. I would like to not change it for the whole system.
I tried to edit the /usr/bin/firefox script, but I had no success.

Best regards
Firefox hasn't used a shell script wrapper in a long time.  I wrote a patch yesterday that raises the limit in nsAppRunner (and forgot to assign myself the bug before I did that; sorry), so that should take care of this, assuming there's agreement that that's the right place to do that.

For testing, you can run `ulimit -n 4096` (or `limit descriptors 4096`, if using csh) and then run firefox from the same shell.
Comment on attachment 8967239 [details]
Bug 1401776 - Raise fd limit to 4096 on Unix.

I'm not quite sure who to ask this, so I'll try the last person to touch SOCKET_LIMIT_TARGET: Does it seem reasonable to change the file descriptor limits like this patch is doing, and leave the constants in nsSocketTransportService2 unchanged?
Attachment #8967239 - Flags: feedback?(mcmanus)
Comment on attachment 8967239 [details]
Bug 1401776 - Raise fd limit to 4096 on Unix.

https://reviewboard.mozilla.org/r/235940/#review241784

I think that's fine.. you can also just ifdef XPUNIX SOCKET_LIMIT_TARGET to 4096 in nsSocketTraansportService2 and accomplish the same thing with less code.. but I certainly see the argument for putting it in nsAppRunner - I think that would play fine with the existing code in netwerk
Attachment #8967239 - Flags: review+
There's something else going on here; see bug 1453735 comment #1 about how I don't normally see *any* open fds to IPC shared memory, because ipc::Shmem closes them after they're mapped.

But we might still want to do this in some form even if that's fixed: IPC transports (top-level actors) do consume fds on both regular Unix and Mac, so if Necko reaches network.http.max-connections then things may break.
Comment on attachment 8967239 [details]
Bug 1401776 - Raise fd limit to 4096 on Unix.

https://reviewboard.mozilla.org/r/235940/#review247142

::: toolkit/xre/nsAppRunner.cpp:3268
(Diff revision 2)
> +    return;
> +  }
> +  // Don't decrease the limit if it's already high enough, but don't
> +  // try to go over the hard limit.  (RLIM_INFINITY isn't required to
> +  // be the numerically largest rlim_t, so don't assume that.)
> +  if (rlim.rlim_cur != RLIM_INFINITY && rlim.rlim_cur < kFDs) {

you should check if rlim_cur is < rlim_max too ; no need to setrlimit if you won't be changing it.

Please also remove the similar code in gfx/thebes/gfxPlatformMac.cpp (for mac) and toolkit/xre/nsSigHandlers.cpp (for solaris).

Not sure there's a reason to keep the necko code either.
Attachment #8967239 - Flags: review?(mh+mozilla)
Assignee: nobody → jld
Comment on attachment 8967239 [details]
Bug 1401776 - Raise fd limit to 4096 on Unix.

https://reviewboard.mozilla.org/r/235940/#review249808
Attachment #8967239 - Flags: review?(mh+mozilla) → review+
Pushed by jedavis@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/fcfbfc7d96d9
Raise fd limit to 4096 on Unix. r=glandium,mcmanus
https://hg.mozilla.org/mozilla-central/rev/fcfbfc7d96d9
Status: NEW → RESOLVED
Closed: Last year
Resolution: --- → FIXED
Target Milestone: --- → mozilla62
You need to log in before you can comment on or make changes to this bug.