Closed Bug 101016 Opened 23 years ago Closed 23 years ago

when the installer runs mozilla, it dies immediately with "getcwd: Function not implemented"

Categories

(SeaMonkey :: Installer, defect)

x86
Linux
defect
Not set
normal

Tracking

(Not tracked)

VERIFIED FIXED
mozilla1.0

People

(Reporter: zwol, Assigned: netscape)

References

Details

(Keywords: relnote, Whiteboard: [adt2 rtm] [fixed on trunk and 1.0 branch])

Attachments

(1 file)

I do
$ mozilla-installer/mozilla-installer
and it churns away. Then it tries to run the just-installed mozilla, which immediately dies like so:
/home/zack/mozilla/run-mozilla.sh /home/zack/mozilla/mozilla-bin -installer
getcwd() failed: Function not implemented
If I run that command myself from a shell window, mozilla comes up normally. My theory is that the installer is executing mozilla with its current directory set to a directory that has been deleted, but I have no proof of this.
QA Contact: bugzilla → ktrina
QA Contact: ktrina → gbush
cannot reproduce here.
Zack, what build are you using? Have you tried a recent nightly build and a fresh profile?
Just reproduced the bug with the installer at <http://ftp.mozilla.org/pub/mozilla/nightly/2001-10-08-08-trunk/mozilla-i686-pc-linux-gnu-installer.tar.gz> and no profiles directory at all. I install into $HOME/mozilla, running as myself not root, might that have something to do with it?
OK, I tried that very same installer build and it works fine. I only get this error message on the console, but mozilla comes up fine:
shell-init: could not get current directory: getcwd: cannot access parent directories: No such file or directory
Aha, sounds like a difference in shell implementations. My /bin/sh is "ash" (NetBSD's sh, as hacked up by Debian). What do you have?
I just have good ole bash (2.05.0), as hacked up by Debian, as well :-)
I just did an install as a normal user and got the same message. Was able to install fine and run fine. btw, I have had this 'shell-init * getcwd' message appear on almost all recent releases. It does not seem to interfere with installation or running for me. (Talkback 0.95 tar.gz install)

Points of interest:

__Error shows up after calls to GTK are made:
[dunk@skippy mozilla-installer]$ ./mozilla-installer
Gtk-CRITICAL **: file gtkwidget.c: line 1510 (gtk_widget_hide): assertion \
`widget != NULL' failed.
[dunk@skippy mozilla-installer]$ /home/dunk/9.5/run-mozilla.sh \
/home/dunk/9.5/mozilla-bin -installer
shell-init: could not get current directory: getcwd: cannot access parent \
directories: No such file or directory
MOZILLA_FIVE_HOME=/home/dunk/9.5
[Install continues as normal here..]

__Doesn't appear to be in the .sh installer scripts:
[dunk@skippy mozilla-installer]$ strings --print-file-name * | grep getcwd
mozilla-installer-bin: getcwd

__Shell version:
[dunk@skippy mozilla-installer]$ /bin/sh --version
GNU bash, version 2.05.1(1)-release (i586-mandrake-linux-gnu)
Copyright 2000 Free Software Foundation, Inc.

__After installation.. (In application's directory):
[dunk@skippy 9.5]$ strings --print-file-name * | grep getcwd
libxpcom.so: getcwd
libxpistub.so: getcwd

__Who I am:
[dunk@skippy 9.5]$ uname -a
Linux skippy 2.4.6-5mdk #1 Wed Jul 18 19:59:39 CEST 2001 i686 unknown

I can provide any other information needed of my system. Running somewhere between RedHat 6.1 and Mandrake 8.1. What is the process for getting this bug confirmed and fixed?
thanks
dunk
OK, I think it is safe to confirm this bug now; we seem to depend on the shell implementation. Maybe the bug is with ash, but then we should add a comment to the release notes. ---> NEW

Zack, can you try to switch your /bin/sh to bash for a single test and see if the problem still occurs? Also please do try it with milestone 0.9.5. That way we should be able to pin down the problem further.

Dunk: To have a bug confirmed you need to convince somebody with canconfirm privileges (like myself) that the bug is real. Writing good reports and following suggestions (like Zack did) does help a lot here. To have a bug fixed you need to convince somebody with the proper skillset that it is worth her precious time to fix it. Or you just dive into the code and do it yourself :-)
Status: UNCONFIRMED → NEW
Ever confirmed: true
I can certainly do these tests. Here's what I get. This is with the mozilla0.9.5 installer. It was instructed to install Navigator only, into /home/zack/m095/install, which did not exist before the installation, and to preserve .xpi modules. I did a dry run first, then installed repeatedly from the .xpi modules.

With /bin/sh = ash:
~/m095 $ mozilla-installer/mozilla-installer
Gtk-CRITICAL **: file gtkwidget.c: line 1510 (gtk_widget_hide): assertion `widget != NULL' failed.
~/m095 $ getcwd() failed: Function not implemented
and Mozilla does not load.

With /bin/sh = bash:
Gtk-CRITICAL **: file gtkwidget.c: line 1510 (gtk_widget_hide): assertion `widget != NULL' failed.
~/m095 $ mozilla-installer/mozilla-installer
~/m095 $ shell-init: could not get current directory: getcwd: cannot access parent directories: No such file or directory
/home/zack/m095/install/run-mozilla.sh /home/zack/m095/install/mozilla-bin -installer
shell-init: could not get current directory: getcwd: cannot access parent directories: No such file or directory
MOZILLA_FIVE_HOME=/home/zack/m095/install
LD_LIBRARY_PATH=/home/zack/m095/install:/home/zack/m095/install/plugins:.:
LIBRARY_PATH=/home/zack/m095/install:/home/zack/m095/install/components
SHLIB_PATH=/home/zack/m095/install
LIBPATH=/home/zack/m095/install
ADDON_PATH=/home/zack/m095/install
MOZ_PROGRAM=/home/zack/m095/install/mozilla-bin
MOZ_TOOLKIT=
moz_debug=0
moz_debugger=
shell-init: could not get current directory: getcwd: cannot access parent directories: No such file or directory
*** QfaServices is being registered
I am inside the initialize
Hey : You are in QFA Startup
(QFA)Talkback loaded Ok.
and Mozilla proceeds to load up.

Since mozilla loaded completely when sh = bash, I guessed that it might have retained the bogus working directory.
~/m095 $ ps xc | grep mozilla
17905 pts/4 S 0:00 run-mozilla.sh
17911 pts/4 S 0:09 mozilla-bin
17912 pts/4 S 0:00 mozilla-bin
17913 pts/4 S 0:00 mozilla-bin
17914 pts/4 S 0:00 mozilla-bin
17916 pts/4 S 0:00 mozilla-bin
~/m095 $ ls -l /proc/17911/cwd
lrwx------ 1 zack zack 0 Oct 15 19:58 cwd -> /var/tmp/.tmp.xi.0/bin (deleted)
Indeed it did. The installer is running mozilla from inside a directory that has been deleted.

Linux, and some other Unixes, have no problem whatsoever with deleting a directory that is empty but still some process's working directory. The directory inode is not recycled until the last process using it leaves or exits - just like when you delete an open file. However, in order to maintain file system consistency, you cannot create files in a deleted directory, the '.' and '..' links are removed, and most important in this case, getcwd(2) will fail.
$ mkdir x
$ cd x
$ rmdir $HOME/x
$ ls -la
total 0
$ strace -e getcwd /bin/pwd
getcwd(0xbffff73c, 1024) = -1 ENOENT (No such file or directory)
/bin/pwd: cannot get current directory: No such file or directory
Like that.

Now, watch what happens if I try to run shells from that deleted directory:
$ strace -e getcwd ash -c 'echo hello'
getcwd(0xbffff90c, 256) = -1 ENOENT (No such file or directory)
getcwd() failed: No such file or directory
$ strace -e getcwd bash -c 'echo hello'
getcwd(0x80b800c, 4095) = -1 ENOENT (No such file or directory)
shell-init: could not get current directory: getcwd: cannot access parent directories: No such file or directory
hello

Conclusion: Both bash and ash call getcwd() during their initialization, even if they don't strictly need to (neither of the above commands makes any use of the result of getcwd). If it fails, ash treats that as a fatal error; bash carries on.

/var/tmp/.tmp.xi.0/bin was presumably a scratch directory created by the installer. It changed into that directory while unpacking, then when it was done it deleted the directory - but didn't bother changing back out again. Thus, when it spawned run-mozilla.sh, that script got the deleted directory for its working directory, and (if executed by ash) it gave up.

Proposed fix: Have the installer remember where it came from, and chdir back there before deleting the temporary directories.

[p.s. dunk - getcwd is the name of a system library function, not a shell command. the corresponding shell command is "pwd". however, as you can see, ash and bash both call getcwd even if pwd is never used in the script.]
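The failure and the proposed fix described in the comment above can be sketched in C. This is only an illustration, not installer code: the helper name `visit_deleted_scratch_dir` and the `/tmp/scratch.XXXXXX` template are made up for the example, and the ENOENT result is what Linux reports (other systems may differ).

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdlib.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from the Mozilla tree): remember the starting
 * directory, enter a scratch directory, delete it while still inside, record
 * what getcwd() reports there, then chdir back to the starting directory.
 *
 * On return, *errno_inside holds the errno that getcwd() produced inside the
 * deleted directory (0 if it unexpectedly succeeded).
 * Returns 0 if we ended up back in a live directory, -1 otherwise.
 */
int visit_deleted_scratch_dir(int *errno_inside)
{
    char original[PATH_MAX];
    char probe[PATH_MAX];
    char scratch[] = "/tmp/scratch.XXXXXX";  /* assumed scratch location */
    int removed;

    assert(errno_inside != NULL);
    *errno_inside = 0;

    /* Remember where we came from before going anywhere. */
    if (getcwd(original, sizeof original) == NULL)
        return -1;

    if (mkdtemp(scratch) == NULL)
        return -1;
    if (chdir(scratch) != 0) {
        rmdir(scratch);
        return -1;
    }

    /* Delete the directory we are standing in, then ask where we are. */
    removed = (rmdir(scratch) == 0);
    if (removed) {
        errno = 0;
        if (getcwd(probe, sizeof probe) == NULL)
            *errno_inside = errno;
    }

    /* Step back out so any child spawned later inherits a live directory. */
    if (chdir(original) != 0)
        return -1;
    if (!removed)
        return -1;

    return getcwd(probe, sizeof probe) != NULL ? 0 : -1;
}
```

Calling `visit_deleted_scratch_dir(&e)` on a Linux system should leave `e` equal to ENOENT, matching the strace output above, while the final getcwd() succeeds because the process has left the deleted directory.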
Target Milestone: --- → M1
Target Milestone: M1 → Future
Shouldn't we warn about this problem in the release notes?
Keywords: relnote
*** Bug 113667 has been marked as a duplicate of this bug. ***
I'm also seeing this behaviour using Mandrake's bash2. Another note - if this is known to only work correctly with bash, why is it using /bin/sh? If it's not a sh script, it shouldn't pretend to be one!
[askwar@teich askwar]$ ls -la /bin/sh /bin/bash ; rpm -qf /bin/bash
-rwxr-xr-x 1 root root 580940 Nov 19 14:14 /bin/bash*
lrwxrwxrwx 1 root root 4 Dez 4 18:04 /bin/sh -> bash*
bash-2.05-15mdk
yikes, we're seeing this on donner's RH 7.3 machine. several bad things happen: 4.x migration fails, you can't create a new profile, the app won't launch, etc.
dmose tells me that bash is the default shell. (It is a sign that I'm old school that I use /bin/tcsh?)
> It is a sign that I'm old school that I use /bin/tcsh? Yes.
zack's suggestion: "Proposed fix: Have the installer remember where it came from, and chdir back there before deleting the temporary directories." I'll see if that fixes it for me. un-futuring this.
Keywords: nsbeta1
Target Milestone: Future → ---
data point about donner's RH 7.3 machine: he installed it as a redhat workstation, with no developer tools, if that makes sense.
ok, some info: it's not the installer. I installed mozilla on another machine, zipped it up, brought it over to donner's machine, unzipped it, and it fails. he's got /bin/sh -> bash, and his version of bash is Bash version 2.05a.0(1) release GNU
Seth, could you compile up the following and tell me if you get any link or runtime errors on donner's system?

#include <unistd.h>
#include <stdio.h>

int main( int argc, char *argv[] )
{
    char buf[1024];

    /* getcwd() returns NULL on failure */
    if ( getcwd( buf, sizeof( buf ) ) != (char *) NULL )
        printf( "Current working dir is %s\n", buf );
    else
        printf( "getcwd failed\n" );
    return 0;
}

Just
cc -o testme testme.c
is necessary for me to compile, link, and run on Linux here. getcwd is in manual 3 so I can't imagine what the shell has to do with it.
syd, it's not that getcwd() fails, it's that doing getcwd() from a deleted directory fails. I built and tested your code on donner's machine, and it worked fine. output: "Current working dir is /home/stephend"

note, just running /bin/bash in a deleted directory will cause this problem.
[stephend@h-10-169-108-235 stephend]$ mkdir foo
[stephend@h-10-169-108-235 stephend]$ cd foo
[stephend@h-10-169-108-235 foo]$ rm -rf ~/foo
[stephend@h-10-169-108-235 foo]$ bash
shell-init: could not get current directory: getcwd: cannot access parent directories: No such file or directory
Ok, I misread one of the earlier comments. And this happens in the browser. hrmmmm. So, can we load a debug build of mozilla into gdb, do a shar libc, set a break on getcwd, and find out who is calling it?
Seth: How did you run the browser after extracting it on a different machine? I have never had any trouble running the browser from the command line once it gets installed. This sounds like a different bug from mine. I looked through the installer code and it's quite clear where the bug I reported is. Look at http://lxr.mozilla.org/seamonkey/source/xpinstall/wizard/unix/src2/nsXIEngine.cpp. Search for mOriginalDir and mTmp. You will see that there is code which is supposed to save the original directory, create a temp directory, chdir() into it, unpack some zipfiles, chdir() back to the original directory, and delete the temp directory. It doesn't work: strace indicates that the final chdir() is to the directory that the installer is already in. I am guessing that it doesn't work because mOriginalDir gets set in nsXIEngine::LoadXPIStub(). That function gets called once for each .xpi module downloaded. The first time, it correctly sets mOriginalDir, but the second time through it clobbers that with the path to the temp directory. Therefore the chdir() in nsXIEngine::~nsXIEngine doesn't go anywhere. If I'm right, this can be fixed by moving the getcwd() call from LoadXPIStub() into the nsXIEngine constructor. I don't have a Mozilla checkout or any idea how to test that change if I did, though.
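The "save the original directory once" idea in the comment above can be sketched in plain C. This illustrates the guess only, not the actual nsXIEngine code: `dir_memo`, `dir_memo_save_once`, and `dir_memo_restore` are hypothetical names, and the real class is C++.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <limits.h>
#include <unistd.h>

/* Hypothetical stand-in for the engine's saved-directory state. */
typedef struct {
    char original_dir[PATH_MAX];
    int  saved;                 /* nonzero once original_dir is valid */
} dir_memo;

/*
 * Record the current directory the first time only. Later calls are no-ops,
 * so a call made after chdir() into a temp directory cannot overwrite the
 * value captured at startup.
 */
int dir_memo_save_once(dir_memo *m)
{
    assert(m != NULL);
    if (m->saved)
        return 0;
    if (getcwd(m->original_dir, sizeof m->original_dir) == NULL)
        return -1;
    m->saved = 1;
    return 0;
}

/* Return to the remembered directory; fails if nothing was saved. */
int dir_memo_restore(const dir_memo *m)
{
    assert(m != NULL);
    if (!m->saved)
        return -1;
    return chdir(m->original_dir);
}
```

Initialising the struct to zero and calling `dir_memo_save_once` from a per-module routine then has the same effect as capturing the directory in a constructor: only the first call stores anything.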
>Seth: How did you run the browser after extracting it on a different machine?
>I have never had any trouble running the browser from the command line once it
>gets installed. This sounds like a different bug from mine.

I installed on another linux box, then zipped up the whole beast, ftp'd it over to the machine with the problem, and unzipped it. after doing that, doing ./mozilla -installer will fail. (I can attach the strace log.) I was trying to take the actual installer out of it. based on my zip test, even once we fix the installer problem, we'll have other problems. I can get a build with the suggestion you made, and try it out and see if I get further.
Peculiar. When I run ./mozilla -installer it works just fine. I think we've definitely got two different bugs here.
In my initial strace of the installer run, I saw that we don't chdir() back to the installer dir until *after* the sub-process has forked. This patch causes us to chdir() to the location of the binary (because it was handily available) before running it.
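The ordering this comment describes (chdir() first, fork() second) can be sketched as follows. It is a minimal illustration under stated assumptions, not the attached patch: `launch_from_dir` is a hypothetical name, and waiting for the child is added only so the example has a result to return.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper: move the parent into `dest` (a directory known to
 * exist, such as the one holding the binary), then fork and exec argv[0].
 * The child inherits the parent's working directory at fork() time, so the
 * chdir() has to happen before the fork to have any effect on the child.
 *
 * Returns the child's exit status, or -1 on any failure.
 */
int launch_from_dir(const char *dest, char *const argv[])
{
    pid_t pid;
    int status;

    assert(dest != NULL && argv != NULL && argv[0] != NULL);

    if (chdir(dest) != 0)
        return -1;

    pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        execv(argv[0], argv);
        _exit(127);             /* only reached if execv() failed */
    }

    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

A shell started this way finds a valid working directory during its own getcwd() call at startup, regardless of which temp directories the parent visited and deleted earlier.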
Zack: try this ./mozilla -ProfileManager are you able to create a profile manually?
seth: Yeah, it works just fine. chris: Aha, so _that_'s what's really going on. That patch looks like the right idea to me.
in LoadXPIStub /home/syd/machvbeta/netscape-installer
In XIEngine destructor /home/syd/machvbeta/netscape-installer
shell-init: could not get current directory: getcwd: cannot access parent directories: No such file or directory

So, the above printfs show that the chdirs are matched; it appears no clobbering is happening before we hit the destructor.
Ok, I think we might do something better than fprintf in handling the error (however unlikely that is)
Comment on attachment 83463 [details] [diff] [review] chdir to destination dir before fork() r=syd
Attachment #83463 - Flags: review+
I just rebuilt with cls' patch and tried it on donner's machine. now, install works, and running mozilla works, everything is fine. Is this something we want for 1.0? I'm not sure how common it is to have a /bin/bash that has this bug. re-assign to cls to land on the trunk.
Assignee: syd → seawood
Comment on attachment 83463 [details] [diff] [review] chdir to destination dir before fork() sr=sspitzer It's probably one of our golden rules not to printf / fprintf anything in optimized builds. I think you should consider wrapping #include <errno.h> and the "if (chdir(dest) < 0)" block with #ifdef DEBUG.
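One way to read the review suggestion above is to keep the chdir() call unconditional and compile only the diagnostic under DEBUG. The sketch below assumes that reading; `chdir_quietly` is a hypothetical name and the message text is invented, since the patch itself is not quoted in this report.

```c
#include <assert.h>
#include <unistd.h>
#ifdef DEBUG
#include <errno.h>
#include <stdio.h>
#include <string.h>
#endif

/*
 * Hypothetical helper: change directory and report a failure on stderr in
 * debug builds only. Optimized builds stay silent but still get the return
 * value, so callers can react to the error either way.
 */
int chdir_quietly(const char *dest)
{
    int rc;

    assert(dest != NULL);
    rc = chdir(dest);
#ifdef DEBUG
    if (rc < 0)
        fprintf(stderr, "chdir(%s) failed: %s\n", dest, strerror(errno));
#endif
    return rc;
}
```

Building with `-DDEBUG` enables the message; without it the function reduces to a thin wrapper around chdir().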
Attachment #83463 - Flags: superreview+
> Is this something we want for 1.0? I'm not sure how common it is to have a
> /bin/bash that has this bug.

Very common. All recent versions of bash exhibit this behavior. Nominating for 1.0.
Keywords: mozilla1.0
getting on Mach V rtm radar.
Keywords: nsbeta1 → nsbeta1+
Whiteboard: [adt2 rtm]
fix checked in as is on the trunk. thanks to cls for the fix, and zackw@panix.com for all the debugging / testing. I'll talk to drivers about getting this for 1.0.
Whiteboard: [adt2 rtm] → [adt2 rtm] [fixed on the trunk]
Target Milestone: --- → mozilla1.0
Keywords: adt1.0.0
Comment on attachment 83463 [details] [diff] [review] chdir to destination dir before fork() a=blizzard on behalf of drivers for the 1.0 branch
Attachment #83463 - Flags: approval+
The patch has been checked into the moz1.0.0 branch.
Status: NEW → RESOLVED
Closed: 23 years ago
Keywords: adt1.0.0 → fixed1.0.0
Resolution: --- → FIXED
Whiteboard: [adt2 rtm] [fixed on the trunk] → [adt2 rtm]
thanks for landing this on the 1.0 branch, as that will mean it will make it as part of the machv rtm.
Whiteboard: [adt2 rtm] → [adt2 rtm] [fixed on trunk and 1.0 branch]
removing the item for this fixed bug from the mozilla 1.0 rc3 release notes and future versions.
verifying on trunk 2002053108, unable to verify on branch until bug 145776 lands and am able to get to ftp.mozilla.org to get a build!
Status: RESOLVED → VERIFIED
verifying on branch build for Mozilla 6/12
Product: Browser → Seamonkey