Closed Bug 147659 Opened 22 years ago Closed 7 years ago

[UNIX] Helper apps inherit open file descriptors

Categories

(Core :: Networking: File, defect, P3)

x86
Linux
defect

Tracking

()

RESOLVED DUPLICATE of bug 1406971
Tracking Status
blocking2.0 --- .x+

People

(Reporter: tenthumbs, Assigned: jld)

References

Details

(Keywords: helpwanted, sec-want, Whiteboard: [sg:want][necko-backlog])

Attachments

(3 files, 1 obsolete file)

I ran into this while investigating something else. I set up /usr/bin/gs 
as a helper app for postscript files. In the attachment note that gs is 
a child of one of the mozilla threads. Note also that gs has a lot of 
open file descriptors and could write to a number of interesting things 
if it were so inclined. An unpleasant thought.

Clearly, mozilla needs to enforce the usual close-on-exec behavior but I 
haven't the foggiest idea where that should be. I'll start this off in file
handling but feel free to change the component.
Attached file helper app data
This is an nsIProcess bug...

ccing some people who may know who it belongs to.
Assignee: law → dougt
Component: File Handling → Networking: File
QA Contact: sairuh → benc
Just in case there's any confusion, this bug is about closing all file 
descriptors except stdin, stdout, and stderr.
I think it might be a necko bug.  We should set FD_CLOEXEC on all the fds we open.

Or maybe people are getting these fds via the OpenANSIDescriptor thingy?  Then
nsLocalFileUnix needs to set FD_CLOEXEC on the underlying fd, which is perhaps
_mildly_ exciting to do portably, but hey.
LXR finds FD_CLOEXEC in nsprpub/pr/src/md/unix/unix.c which mumbles 
about inheritance or some such. But nspr #defines so many things that I 
can't figure out what it really means.
NSPR has a PR_SetFDInheritable function but no one seems to use it.

Maybe I'm more paranoid than most but the potential for mischief is 
enormous here.
How so?  A helper app that's running as the user can do anything Mozilla can do.
 It can write to any files, open /proc/$mozillapid/fd/*, etc.

If you don't trust your helper apps, you're screwed in ways we can't begin to
protect against.
An external app can't read the pipes or sockets. I'm not sure how much
of a problem that is but I certainly don't want to find out.

Yes, there are many ways to get screwed. I just prefer to eliminate as
many as possible. Armor plating is nice.

And no, I don't trust helper apps or anything else that isn't guaranteed
bug free. I've been burned too often.

Take a look at bug 125505. There I proposed a mime assistant that could
execv the real helper. You could use the same idea to sanitize the
helper environment. It would be a lot faster than fixing all of mozilla.
On Linux, an external app _can_ read the pipes or sockets, via the
/proc/$mozillapid/fd/* mechanism I described earlier.  There are almost
certainly similar mechanisms on other platforms, since debuggers provide such
facilities.

I'm all for mitigating the damage buggy apps can do by accident, but protection
against "mischief" on the part of a malicious helper-app author is something
we're not going to be able to provide, and I'd rather we spent the effort elsewhere.

Closing file descriptors is a correctness issue, not a security/protection one.
You're better than I am. I tried and I can't get an app that isn't a
mozilla child to read from anonymous pipes or socketpairs. Sure you can
attach a process with ptrace but that's almost certainly going to be
noticed by the user. On Linux, the kernel stops an attached process.

So maybe I shouldn't have used the word "mischief." I didn't mean it in
quite the nefarious sense you seem to. I meant for it to include the
typical buggy app. That's what gotten me in the past. I've never
actually been burned by a malicious app but I have been severely burned
by buggy ones.

This might be a correctness issue but protection against data loss makes
it much more important, at least to me.
wtc, what is the default inherit trait of a file descriptor opened via
PR_Open()?  Is there a way to set all file descriptors opened by an application
not-inheritable by a child process without explictly calling
|PR_SetFDInheritable| after every PR_Open?
Keywords: helpwanted
Target Milestone: --- → Future
The default inherit trait of a file descriptor opened via
PR_Open() is inheritable on Unix and not-inheritable on Windows.

There is no convenient way to set all file descriptors opened
by an application not-inheritable by a child process on Unix.
I've seen people who resort to a for loop that closes all the
possible Unix fd's.  That for loop is rather expensive.

A possible solution is to use a helper process whose only
purpose is to fork and exec helper apps for the Mozilla client.
You know exactly which file descriptors are open in this helper
process and you can set FD_CLOEXEC on all these fd's.
Here's a first try at a mime-helper. Rather than closing descriptors it sets
FD_CLOEXEC which some problems. It prefers to read /proc/self/exe but falls
back to a brute force strategy.

It works on Linux but that's not saying much.
Blocks: 147274
An alternative XP solution to this problem is to try to utilize the IPC daemon.
 Since it is spawned early at startup it should not have access to most (or all)
of the file descriptors we are concerned with.  It'd be relatively easy to add a
protocol extension for spawning processes via the IPC daemon.  We'd of course
have to take care to not make XPCOM (nsIProcess) depend on the IPC daemon. 
Thoughts?
How about we just use the NSPR function that is meant for this purpose:

PR_NewProcessAttr
http://www.mozilla.org/projects/nspr/reference/html/prprocess.html#24416
(see the description)

and use that processattr in PR_CreateProcess:
http://www.mozilla.org/projects/nspr/reference/html/prprocess.html#24535

at this place in mozilla code:
http://lxr.mozilla.org/seamonkey/source/xpcom/threads/nsProcessCommon.cpp#283
Attached patch like this (obsolete) — Splinter Review
note, this is currently untested
biesi: well, NSPR process attr only allows us to control specific values passed
on to the child process, like stdin for example.  currently, it doesn't provide
a way to make other file descriptors close-on-exec, which is what we'd need.  my
concern is with portability... basically, we need to iterate over any open file
descriptors (other than those for stdin, et al.) and mark them as close-on-exec.
 the portability problem is how to determine the set of open file descriptors. 
we know how to do this under linux using the /proc filesystem, but that
obviously doesn't work on other platforms.  for example, what about OSX?

if we use the IPC daemon then that problem is avoided.
darin: well, the first url I mentioned in my comment contains this sentence:

 No file descriptors are inherited by the new process.

Is that not what we want here?
i may be mistaken, but i think NSPR's notion of inheritable file descriptors is
just a mechanism to allow a child process to get access to a PRFileDesc handle
that references the OS file descriptor.  in other words, it is just a XP
mechanism for accessing inherited file descriptors.  it does not seem to get in
the way of the native mechanisms.  it just makes it possible to get a PRFileDesc
from a native file descriptor.  in fact, it seems to create a name to PRFileDesc
dictionary, which is passed to the child process as an environment variable (at
least that's how the UNIX impl works).  so, i think the comments in the NSPR
reference are just misleading.  but, you should probably give your patch a try ;)
Comment on attachment 127178 [details] [diff] [review]
like this

aww... it seems you are right, unfortunately :( the child process still has the
file descriptors open
Attachment #127178 - Attachment is obsolete: true
One solution is to execute the following for loop in
the child process before it calls exec:

    int fd;

    for (fd = 3; fd <= MaxFdOfTheProcess; fd++) {
        (void) close(fd);
    }

You can use PR_GetSysfdTableMax as an example of getting
the value of 'MaxFdOfTheProcess'.  (PR_GetSysfdTableMax
is an unsupported function, so its code may be incorrect.)
In the time since I field this bug I realized that open file descriptors 
are just once aspect of the problem of gracefully exiting the 
mozilla/gre/whatever environment. In particular, you can't leave it 
properly unless you know the start-up environment, which you don't. you 
really need to think about the proper way to start before you can leave.

I don't see how you can have a clean secure XP way to do any of this so 
I don't think how IPC (whatever that is) can help. I also don't think an 
XP solution is appropriate anyway.

See bug 125505, attachment 105336 [details]. I try to handle setting 
FD_CLOEXEC by testing /proc/self but fall back to looping over all fd's.
I noticed today that the esd (ESound Daemon) spawned by TBird playing the "new mail" sound (mail.biff.play_sound.url) had also inherited all the descriptors. Is this another instance of this bug, or a different one?
Different bug.
That patch looks like it should go in a separate bug, really, since it doesn't solve this bug.  It's a good special case that we should check in, though.
OK I will think about making this a separate bug.  It is a little depressing that nsProfileLock.cpp is a preprocessor mess.  I had assumed that the #ifdef jungle was supposed to be confined to NSPR.  Alas.

Also, I developed this after getting user complaints that helper applications spawned by previous invocations would prevent Firefox from starting, but now I have to go investigate that more, because on my platform (Linux) fcntl locks are not inherited by the child.
on ubuntu 6.06 with firefox 2.0 with an NFS $HOME i've had the problem where i've killed every firefox instance but still have an acroread instance going (which was spawned via firefox -- not the plugin, i hate the plugin) and firefox won't restart... and the acroread has a good 60 fds open which were inherited from the firefox, including a couple .parentlock fds.

when i kill the acroread i can start firefox again.

so something is inherited here.
you know, it occurs to me that NSPR should endeavour to present the same close-on-exec behaviour on all platforms... and honestly, the unix API is totally broken in this area -- this is something which NT (actually VMS, which NT inherits from) get right.  the unix behaviour is OK when you're dealing with a half dozen fds... but real programs (especially multithreaded programs) have way more than a half dozen open.  when you spawn a process you should explicitly specify the fds to be inherited.  NSPR should default to setting F_CLOEXEC on everything on unix.
Unfortunately in NSPR the same path is used to spawn an unrelated process and to fork your own process.  If you explicitly close all the FDs when NSPR forks, half the unit tests fail.
Sounds like we could use an NSPR bug about passing more information into that code, eh?
Blocks: 372734
Now that this is blocking a bug I'm not allowed to view, we can only assume that the security implications here have finally been realized?
They were realized all along.  All it means in this case is someone filed a duplicate of this bug with the security flag set and someone else didn't mark it duplicate.
(In reply to comment #20)
> (From update of attachment 127178 [details] [diff] [review])
> aww... it seems you are right, unfortunately :( the child process still has the
> file descriptors open

But shouldn't it close them ? If so, let's just add something like in comment #21 to NSPR.
mass reassigning to nobody.
Assignee: dougt → nobody
(In reply to comment #12)
> The default inherit trait of a file descriptor opened via
> PR_Open() is inheritable on Unix and not-inheritable on Windows.

Any reason why we can't make the behavior of PR_Open() consistent on both platforms by getting PR_Open to make the file descriptor close-on-exec?

(It does get a little tricky with threads: it would be best if one thread didn't exec while another is between open and fcntl, but isn't NSPR the best place to deal with this?)
as you note you can't create any fd (i.e. via open/socket/accept/dup/dup2/F_DUPFD/...) without a race condition between the creation and setting F_CLOEXEC in a threaded program.  unix wasn't designed for threads.  there are a few choices:

(1) all fd creation under a mutex... gross.

(2) portable:  after fork() use dup2 to set up stdin/out/err then getrlimit(RLIMIT_NOFILE) and iterate for (int i = 2; i < N; ++i) { close(i); }

(3) not portable:  linux (and perhaps others) is slowly acquiring O_CLOEXEC, F_DUPFD_CLOEXEC and other such flags so that you can cause fds to be created with F_CLOEXEC set from the start.

imho (2) is the best solution.  you just have to catch all uses of fork().  i.e. popen() is hell.

-dean
Karl: unfortunately this will break any normal fork that Mozilla does to communicate with itself or a subproc.  See comment #31.  If you make PR_Open set close-on-exec by default, most of the NSPR test suite breaks.
(In reply to comment #31)
> Unfortunately in NSPR the same path is used to spawn an unrelated process
> and to fork your own process.  If you explicitly close all the FDs when NSPR
> forks, half the unit tests fail.

I didn't understand how this applied to a default close-on-exec approach.
Sure this would be a problem if explicitly closing all FDs in the sense of
choice (2) of comment #41.

But for a default close-on-exec approach, PR_SetFDInheritable would be needed
on any FD needed across exec.  (They would still be available after fork
but:)

Should an NSPR process wanting to fork use PR_CreateProcess to exec another
instance of itself?  And is that what Mozilla does?
http://developer.mozilla.org/en/docs/Process_Forking_in_NSPR

I guess PR_SetFDInheritable would be needed in quite a few places then.
And if it can't be done after fork, that could get complicated.
(In reply to comment #41)
> (2) portable:  after fork() use dup2 to set up stdin/out/err then
> getrlimit(RLIMIT_NOFILE) and iterate for (int i = 2; i < N; ++i) { close(i); }

This would be best to watch out for race conditions too.
N could be large, but /proc/PID/fd could be an optimization I guess.

> 
> (3) not portable:  linux (and perhaps others) is slowly acquiring O_CLOEXEC,
> F_DUPFD_CLOEXEC and other such flags so that you can cause fds to be created
> with F_CLOEXEC set from the start.

This seemed ideal to me (and a portable library seemed the best way to decide
whether to use this) but I guess this is only going to help with open, not all
the fd creation functions.

> 
> imho (2) is the best solution.

I'm thinking this might be easiest.

If an external helper process like tenthumbs suggests is used, it wouldn't
need to worry about race conditions.  And once its a separate process, doing a
little work to close file descriptors is not a significant issue.

If we were already using the IPC daemon then that would seem the way to go.
If this or a similar daemon was launched early enough, then it would avoid the
problem of being unable to fork when using more than half the available
virtual memory (http://www.cmiss.org/cm/tracker/227/).  But maybe that's for
the future.

>  you just have to catch all uses of fork(). 
> i.e. popen() is hell.

:-) I'm hoping we are not using popen.
> :-) I'm hoping we are not using popen.

In general, we are.  Not for helper apps, though.

Obvious popen() users in the tree:

* Cycle collector debugging
* some camino unarchiver code
* JS engine code (whatever file_open is)
* Mailnews smime code
(In reply to comment #45)
> Obvious popen() users in the tree:
> * JS engine code (whatever file_open is)

Unused, unbuilt, unloved jsfile.c.  Nothing we're going to see in any product.
there's also pthread_atfork()...
Flags: blocking1.9? → blocking1.9+
Priority: -- → P5
Flags: tracking1.9+ → wanted-next+
Flags: wanted1.9.1?
Priority: P5 → P2
Whiteboard: [sg:want]
This bug hits me *every single day*.  I play mp3's (usually internet radio using rhythmbox or songbird) and use firefox extensively, reading lots of pdf documents with xpdf.  After some time, xpdf will block the sound device and I have to go track down which pid is blocking it and kill all the documents I have open (usually I have several).  The choice by ubuntu to use pulseaudio seems to make the situation worse.  I usually find it's the first thing I have to kill, then it's xpdf sessions, or firefox itself.

This may be a deeper bug.  Clearly it works with 2 pid's holding the same file handle, but shows up as more and more pid's hold the filehandle.  Exactly which circumstance leads to the sound device being blocked?  The reported programs which cause this (xpdf, evince, gv) do not read or write the sound device at all.  So why does it get blocked in the first place?  

Closing the file handle may or may not fix the problem depending on the ultimate reason that the sound device gets blocked.  Has anyone actually tested the theory that it's just holding file handles that causes the problem?  (e.g. with a helper script as suggested above?)
Have also seen this bug on gentoo and ubuntu (32 and 64 bit).  If I open a pdf from firefox (using kpdf), the next day the pdf will be stealing the sound devices and trying to play flash sound from mozilla causes a spike in cpu.  I usually resort to killing firefox and the app stealing the sound device.
This bug makes the "open with" functionality potentially dangerous, since it leaks file descriptors concerning hardware access (raw sound devices). Maybe "open with" should be disabled on affected systems to avoid security issues.
Flags: wanted1.9.2?
QA Contact: benc → networking.file
Target Milestone: Future → ---
This bug is very annoying, as I regularly find Ktorrent (launched from Firefox) stealing away my soundcard (I debug with "fuser"). I am pretty sure many users actually don't realize it's due to Firefox that their sound doesn't work, and merely blame 'Linux' in general. I am convinced that fixing this bug would help many: it would be great to get this fixed after 7 years.
This bug is embarassing. Of all the applications I'm using, Firefox is the /only/ one that reliably manages to break sound, of all functionality. Sound doesn't work? Find out what application was spawned by FF, terminate it, sound resumes.

It can't be that hard to iterate over and close all file descriptors >= 2.
(In reply to comment #51)
> This bug hits me *every single day*.  I play mp3's (usually internet radio
> using rhythmbox or songbird) and use firefox extensively, reading lots of pdf
> documents with xpdf.  After some time, xpdf will block the sound device and I
> have to go track down which pid is blocking it and kill all the documents I
> have open (usually I have several).  The choice by ubuntu to use pulseaudio
> seems to make the situation worse.  I usually find it's the first thing I have
> to kill, then it's xpdf sessions, or firefox itself.

I am pretty sure that the PA libraries set O_CLOEXEC on all its fds. What makes you think PA is at fault here?

That said, regardless whether PA is misbehaving here or not, firefox definitely needs to close all open fds after fork()ing and before exec()ing an application. On linux this is best done by iterating through /proc/self/fd. Something like this:
http://git.0pointer.de/?p=pulseaudio.git;a=blob;f=src/pulsecore/core-util.c;h=b64c51e18fb71d2a111b0a0236bc81bc5d849950;hb=HEAD#l2230

That code is fast on Linux and still compatible with non-Linux Unixes, so should be a safe choice.
Just think how fun it will be to have a tenth birthday party for this bug in 2012 when it's still not fixed.  Security and platform parity used to be important features of the Mozilla project but not these days.
OS: Linux → Windows Server 2003
OS: Windows Server 2003 → Linux
Assignee: nobody → karlt
blocking2.0: --- → final+
Roc, why is this blocking?  It's been around forever and we have a TON of bugs blocking 4.0 at this point.  Something has to give.
It's blocking because it's a pretty bad bug, we have no other procedure for getting bugs like this fixed, and I was hoping Karl would have time to do it without impacting anything else important.
Comment 59 is nasty but correct. Sometimes you just have to put your foot down and fix it.
We do need to fix this, but we're not going to hold Gecko 2 for it.  Moving to the 2.x list, pending Karl actually having time...
blocking2.0: final+ → .x
yeah let's not fix this before it's a decade old.  not much longer to wait!
Hey, we only have 13 months before the 10-year anniversary of this bug!

Here's a doodle for scheduling the party:

http://www.doodle.com/9uav5v7igabaxndv

*Maybe* we could get the ticket status updated to a status that actually exists  for the occasion :) "NEW" doesn't seem to be on the list anymore: https://bugzilla.mozilla.org/page.cgi?id=fields.html#status 

I won't go so far as to suggest RESOLVED, but IN_PROGRESS would be *awesome*.
So this is the culprit?
http://mxr.mozilla.org/mozilla-central/source/nsprpub/pr/src/md/unix/uxproces.c#263

How exactly do you trigger this? I thought download code used GIO or GnomeVFS now to spawn processes.

Is there any valid case for wanting to keep the file descriptors open after a fork? I was thinking that, in that function ForkAndExec(), if attr is null then I close the fds. karl, thoughts?
Some relevant reference and code over at stackoverflow:

http://stackoverflow.com/questions/5713242/linux-fork-file-descriptors-inheritance
Michael, I'd love it if you could take this.
(In reply to comment #67)
> So this is the culprit?
> http://mxr.mozilla.org/mozilla-central/source/nsprpub/pr/src/md/unix/
> uxproces.c#263
> 
> How exactly do you trigger this? I thought download code used GIO or
> GnomeVFS now to spawn processes.
>
> Is there any valid case for wanting to keep the file descriptors open after
> a fork? I was thinking that, in that function ForkAndExec(), if attr is null
> then I close the fds. karl, thoughts?

My 2 cents:
- The problem is not keeping the file descriptors open after a fork, but after an exec.
- There are a lot of valid reasons to keep file descriptors open after a fork
- There are a few valid reasons to keep file descriptors open after an exec, but I don't think any of these reasons are in the scope of a browser
- Modifying ForkAndExec() behaviour in nspr means a change in behaviour in all applications using NSPR, which may or may not expect file descriptors not to be closed.
- As you pointed out, nowadays GIO and GnomeVFS is used (except for mailcap entries), but I don't think they close file descriptors either.
wtc doesn't want the proposed approach as he mentioned in bug 372734. We need to find a way to do this without touching NSPR.

This might possibly mean auditing all PRFileDesc* in mozilla-central and calling PR_SetFDInheritable(..., PR_FALSE) on them....
Doing some more code searching, the most important place to put this call might be nsLocalFile::OpenNSPRFileDesc (and possibly OpenANSIFileDesc). Writing that here for my own records.
gnome_vfs_mime_application_launch_with_env calls g_spawn_async without
G_SPAWN_LEAVE_DESCRIPTORS_OPEN so helper apps should be fine with GnomeVFS.

Non-GNOME (and newer GNOME) systems without --enable-gio will use nsIProcess
which uses PR_CreateProcess, thus hitting this bug.

Helper apps should be fine in --enable-gio builds because
g_desktop_app_info_launch_uris also calls g_spawn_async without
G_SPAWN_LEAVE_DESCRIPTORS_OPEN.

For plugin/content processes this is addressed by CloseSuperfluousFds.

Most systems should have GIO these days, so it looks like the appropriate fix
here is to update glib on our build systems so we can --enable-gio.

(I haven't done an audit of other nsIProcess consumers.)
Ideally at the same time we can install GTK3 on our build systems and migrate to that asap.
Depends on: 678369
(In reply to Karl Tomlinson (:karlt) from comment #73)
> Helper apps should be fine in --enable-gio builds because
> g_desktop_app_info_launch_uris also calls g_spawn_async without
> G_SPAWN_LEAVE_DESCRIPTORS_OPEN.

Actually, Non-GNOME systems won't be covered with --enable-gio, because the gio service is in the mozgnome component, which depends on GNOME libs, which will most likely fail to load on non-GNOME systems. And if mozgnome loads anyway, it contains the GnomeVFSService, which would already use gnome_vfs_mime_application_launch. So --enable-gio would change nothing.
(In reply to Mike Hommey [:glandium] from comment #75)
> Actually, Non-GNOME systems won't be covered with --enable-gio, because the
> gio service is in the mozgnome component, which depends on GNOME libs, which
> will most likely fail to load on non-GNOME systems.

We'd have to either --disable-gconf --disable-gnomevfs (as I expect GIO distributions will do), or make mozgnome dynamically check for gconf and gnomevfs if still useful.
Also note that mozgnome also may depend on libnotify. (mozilla builds don't but distros' do)
Mozilla builds depend on libnotify.so.1.  The ABI and soname has since changed, so that may be a problem on newer distros.  If there's no stable notification library we can use instead, then we could dynamically detect.
Actually, I was thinking about this last week, but why do we have the gio support in the mozgnome component? gio exists on any system with a gtk version of 2.14 or newer (the version there is just plucked from a brief look through the git history of gtk)

And if we modified the mozgnome component to dynamically check for gconf/gnomevfs, then is there any point to having the separate component for them anyway? (perhaps I'm missing the point of putting them in libmozgnome in the first place though?)
Ah, I answered my own question - the minimum gtk requirement is 2.10
Something else: mailcap support uses nsIProcess to start commands. These wouldn't be covered either.
Depends on: 713802
Whiteboard: [sg:want] → [sg:want][necko-backlog]
The same issue is handled in the open source digital PVR recorder OpenVix.

The solution there is to intercept the open calls using a library that is loaded using LD_PRELOAD in the startup (so is a Unix/Linux solution...).

See:
   https://github.com/OpenViX/enigma2/blob/master/tools/libopen.c
Bulk change to priority: https://bugzilla.mozilla.org/show_bug.cgi?id=1399258
Priority: P2 → P1
Bulk change to priority: https://bugzilla.mozilla.org/show_bug.cgi?id=1399258
Priority: P1 → P3
Depends on: 1406971
Jed has fixed this via bug 1406971, thank you!
Assignee: karlt → jld
Status: NEW → RESOLVED
Closed: 7 years ago
Resolution: --- → DUPLICATE
See Also: → 1276388
See Also: → 1463873
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: