Last Comment Bug 552864 - Throw away wrapper shell script on unix and lazily load libxul
: Throw away wrapper shell script on unix and lazily load libxul
Status: RESOLVED FIXED
:
Product: Toolkit
Classification: Components
Component: Startup and Profile System (show other bugs)
: unspecified
: x86 Linux
: -- normal with 1 vote (vote)
: mozilla7
Assigned To: Mike Hommey [:glandium]
:
Mentors:
Depends on: 675585 642450 658273 658995 665509 666657 670788 677247 678438 689719
Blocks: 658847 658850 554421 632404 648581 668869 673899 722262
  Show dependency treegraph
 
Reported: 2010-03-16 19:05 PDT by Zack Weinberg (:zwol)
Modified: 2012-01-30 03:19 PST (History)
41 users (show)
See Also:
Crash Signature:
(edit)
QA Whiteboard:
Iteration: ---
Points: ---
Has Regression Range: ---
Has STR: ---


Attachments
shell overhead benchmark (919 bytes, text/x-csrc)
2010-03-17 09:33 PDT, Zack Weinberg (:zwol)
no flags Details
source for /tmp/true (x86-64 assembly) (118 bytes, application/octet-stream)
2010-03-17 09:34 PDT, Zack Weinberg (:zwol)
no flags Details
Ditch wrapper script, use rpath=$ORIGIN (2.58 KB, patch)
2010-08-17 12:09 PDT, Ted Mielczarek [:ted.mielczarek]
no flags Details | Diff | Splinter Review
get rid of firefox shell script (2.95 KB, patch)
2010-09-02 06:08 PDT, Ted Mielczarek [:ted.mielczarek]
no flags Details | Diff | Splinter Review
PoC using the standalone glue (13.38 KB, patch)
2011-03-17 04:21 PDT, Mike Hommey [:glandium]
no flags Details | Diff | Splinter Review
PoC using the standalone glue, v2 (17.41 KB, patch)
2011-03-17 10:11 PDT, Mike Hommey [:glandium]
no flags Details | Diff | Splinter Review
part 1 - Move DLL blocklist in XRE, and inline NS_SetDllDirectory (7.26 KB, patch)
2011-03-30 06:05 PDT, Mike Hommey [:glandium]
benjamin: review+
Details | Diff | Splinter Review
part 2 - Load the XUL library with the standalone glue and get rid of the startup shell script (16.74 KB, patch)
2011-03-30 06:08 PDT, Mike Hommey [:glandium]
benjamin: review-
Details | Diff | Splinter Review
cleanup patch, getting rid of XRE_WANT_DLL_BLOCKLIST (1.74 KB, patch)
2011-03-30 06:09 PDT, Mike Hommey [:glandium]
no flags Details | Diff | Splinter Review
part 0.5 - Move environment sanitization in wmain (3.20 KB, patch)
2011-05-16 01:17 PDT, Mike Hommey [:glandium]
benjamin: review+
ehsan: review-
Details | Diff | Splinter Review
part 1 - Move DLL blocklist in XRE, and inline NS_SetDllDirectory (7.41 KB, patch)
2011-05-16 01:18 PDT, Mike Hommey [:glandium]
no flags Details | Diff | Splinter Review
part 1.5 - Move XRE_GetBinaryPath code in a dedicated class (12.47 KB, patch)
2011-05-16 09:49 PDT, Mike Hommey [:glandium]
no flags Details | Diff | Splinter Review
part 2 - Load the XUL library with the standalone glue and get rid of the startup shell script (13.67 KB, patch)
2011-05-16 09:51 PDT, Mike Hommey [:glandium]
no flags Details | Diff | Splinter Review
part 1 - Move DLL blocklist in XRE, and inline NS_SetDllDirectory and environment sanitization, which now needs to be called manually (11.82 KB, patch)
2011-05-18 11:46 PDT, Mike Hommey [:glandium]
ehsan: review+
Details | Diff | Splinter Review
part 2 - Load the XUL library with the standalone glue and get rid of the startup shell script (13.69 KB, patch)
2011-05-18 11:51 PDT, Mike Hommey [:glandium]
ted: review+
taras.mozilla: review+
Details | Diff | Splinter Review
part 1.5 - Move XRE_GetBinaryPath code in a dedicated class. (12.52 KB, patch)
2011-05-20 11:57 PDT, Mike Hommey [:glandium]
no flags Details | Diff | Splinter Review
part 1.5 - Move XRE_GetBinaryPath code in a dedicated class. (12.52 KB, patch)
2011-05-20 12:31 PDT, Mike Hommey [:glandium]
taras.mozilla: review+
Details | Diff | Splinter Review
part 2 - Load the XUL library with the standalone glue and get rid of the startup shell script. (13.83 KB, patch)
2011-05-21 23:16 PDT, Mike Hommey [:glandium]
no flags Details | Diff | Splinter Review
part 1 - Move DLL blocklist in XRE, and inline NS_SetDllDirectory and environment sanitization, which now needs to be called manually. (11.28 KB, patch)
2011-05-23 12:09 PDT, Mike Hommey [:glandium]
no flags Details | Diff | Splinter Review
part 1.5 - Move XRE_GetBinaryPath code in a dedicated class. (12.49 KB, patch)
2011-05-23 12:09 PDT, Mike Hommey [:glandium]
no flags Details | Diff | Splinter Review
part 2 - Load the XUL library with the standalone glue and get rid of the startup shell script. (11.55 KB, patch)
2011-05-23 12:10 PDT, Mike Hommey [:glandium]
no flags Details | Diff | Splinter Review
part 2 - Load the XUL library with the standalone glue and get rid of the startup shell script. (11.57 KB, patch)
2011-06-14 22:34 PDT, Mike Hommey [:glandium]
no flags Details | Diff | Splinter Review
rungdb.sh (355 bytes, text/plain)
2011-06-24 07:35 PDT, Benjamin Smedberg [:bsmedberg]
no flags Details

Description Zack Weinberg (:zwol) 2010-03-16 19:05:34 PDT
Bug 550310 observes that the 'firefox' and 'run-mozilla.sh' wrapper scripts add significant startup overhead on systems where they are used (Unix that is not OSX, if I understand correctly).  We should see if we can get rid of them.  The purpose of the wrapper scripts is to tell the dynamic linker where our private libraries live, so here are three possible ways to get rid of them:

1) If all the shared libraries installed by an --enable-libxul build have sufficiently stable public ABIs, we could give them SONAMEs, install them in /usr/lib, install firefox-bin etc. as just plain 'firefox' etc., and throw the wrapper scripts away in an installed configuration.  (We would still need a wrapper script for running out of the build tree, but that's not the case we care about for performance, and that script could be much simpler.)  The SONAMEs we assigned would presumably involve the Gecko version -- it seems to me that we do make pretty strong binary compatibility guarantees within one Gecko version.

2) If we decide we want to go for static linkage of most of those libraries into the 'firefox' binary anyway (bug 525013) we could make firefox-on-top-of-xulrunner the normal release build configuration (I don't know how this works but I know several Linux distributions do it), put xulrunner in /usr/bin, and only worry about binary compatibility for the libraries that are still dynamic (presumably just libxpcom.so at that point, plus system stuff?)

3) If (1) and (2) are not feasible we could look into expanded use of $ORIGIN, which used to be a Solaris-ism but I believe is now quite widely available.  Here's what my manpages say about that:

       $ORIGIN
              ld.so understands the string $ORIGIN (or equivalently ${ORIGIN})
              in  an  rpath specification to mean the directory containing the
              application  executable.  Thus,  an   application   located   in
              somedir/app   could   be  compiled  with  gcc  -Wl,-rpath,'$ORI‐
              GIN/../lib' so that it finds an  associated  shared  library  in
              somedir/lib  no matter where somedir is located in the directory
              hierarchy.

That would have the advantage of not needing the shell script even when running out of the build directory, and not having to worry about expanded binary compatibility constraints, but the disadvantage of maybe not working universally.
Comment 1 Benjamin Smedberg [:bsmedberg] 2010-03-17 08:42:24 PDT
#1 is not an option
The static build and XULRunner are incompatible options
#3 is worth looking into (I think there's already a bug, but we can use this one). I think that would drop support for older versions of ld.so, so we should figure out exactly when it was added and which Linux versions we'd be breaking.

Although I'm surprised that, if we fix the arithmetic in bug 550310, we'd still be seing a lot of overhead on Linux systems: typically execing a single process is very lightweight. Do we have numbers measuring the actual cost?
Comment 2 Mike Hommey [:glandium] 2010-03-17 08:51:50 PDT
FWIW, the only really interesting thing in the firefox and run-mozilla scripts are the -g and --debugger options. The rest is pretty pointless nowadays with the xpcom standalone glue. On Debian, we don't even use these scripts, and have our own wrapper script that only provide -g and --debugger options, on top of some extra sugar for dsp wrappers (esddsp, artsdsp, aoss, etc.), and a check to prevent users to shoot themselves in the foot when running through sudo (yes, there are people doing that and they end up thrashing their profile as a result)
Comment 3 Mike Hommey [:glandium] 2010-03-17 09:06:35 PDT
(In reply to comment #1)
> #3 is worth looking into (I think there's already a bug, but we can use this
> one). I think that would drop support for older versions of ld.so, so we should
> figure out exactly when it was added and which Linux versions we'd be breaking.

According to glibc's git, version 2.0.96, released in 1998.
Comment 4 Zack Weinberg (:zwol) 2010-03-17 09:32:48 PDT
(In reply to comment #1)
> #1 is not an option

Please elaborate.

> The static build and XULRunner are incompatible options

That seems like it should just get fixed.

> Although I'm surprised that, if we fix the arithmetic in bug 550310, we'd still
> be seing a lot of overhead on Linux systems: typically execing a single process
> is very lightweight. Do we have numbers measuring the actual cost?

It's shell startup overhead that I was concerned with.  That's a lot cheaper than it used to be, though.  I'll attach a test program that runs the same minimal just-call-exit binary directly, via /bin/sh, and via /bin/bash.  This is what I get on my computer -- numbers are in seconds:

direct: 0.000165
    sh: 0.000578
  bash: 0.001584

So that's less than two milliseconds of overhead even with the heavyweight shell.
Comment 5 Zack Weinberg (:zwol) 2010-03-17 09:33:24 PDT
Created attachment 433066 [details]
shell overhead benchmark
Comment 6 Zack Weinberg (:zwol) 2010-03-17 09:34:35 PDT
Created attachment 433067 [details]
source for /tmp/true (x86-64 assembly)

Yeah, I wrote a minimal just-exit binary in assembler, because /bin/true itself costs on the order of 600 microseconds, comparable to the numbers we're measuring :-(
Comment 7 Wolfgang Rosenauer [:wolfiR] 2010-03-17 09:58:18 PDT
It seems $ORIGIN is what you want to use.

Speaking of the Linux distributor's perspective a few comments:
1) for Firefox-on-xulrunner that doesn't make a real difference as basically
everything is covered by the glue stuff and xulrunner system registration
2) we still (and probably will always) have a small wrapper script to set up
some environment (but not using run-mozilla.sh since around 8 years for that
(back in mozilla (the suite) times)
3) it's probably possible to get rid of the wrapper script for Thunderbird or
alike if ORIGIN is used

Actually all discussion is basically just about Mozilla.com's release builds
for Linux. While I don't have numbers I'd think that many people are not using
these because their handling is pretty inconvenient.
Comment 8 Benjamin Smedberg [:bsmedberg] 2010-03-18 06:50:52 PDT
We currently have a stable method of doing binary linking (using XPCOM glue dynamic linking). This allows binary extensions to link against libxpcom.so and be compatible with multiple major versions if they are written to do so. Using a versioning scheme which changes the soname at each major release would make this difficult/impossible, and would increase the likelihood of loading two versions of XPCOM in the same process, which is obviously bad.
Comment 9 Mike Hommey [:glandium] 2010-03-18 10:12:31 PDT
Isn't the firefox-bin on mozilla release builds using the standalone glue under the hood ? As such, isn't loading libxpcom.so from the gredir ? Then, the nspr and nss libs are loaded by being listed in dependentlibs.list. Then what exactly is LD_LIBRARY_PATH supposed to add ?
Comment 10 Benjamin Smedberg [:bsmedberg] 2010-03-18 10:47:36 PDT
No, mozilla release builds don't use the standalone glue/xulrunner stub (they aren't firefox+xulrunner, just an all-in-one build).
Comment 11 Ted Mielczarek [:ted.mielczarek] 2010-04-03 12:58:47 PDT
I have a patch in bug 557027 that fixes the Fennec build to use -rpath=$ORIGIN and it works quite nicely on my Ubuntu system. We should just do the same thing for the Firefox build and ditch the wrapper script.
Comment 12 Zack Weinberg (:zwol) 2010-04-05 12:18:26 PDT
FWIW I just did some experiments with a program that lets you change the rpath after the fact (it's called 'patchelf' and it's a little buggy, but not too much for experiments) and it appears that we would need -rpath=$ORIGIN:$ORIGIN/.. for every .so in components/, as well as -rpath=$ORIGIN for the main executable.  I'm not sure.  The difference between DT_RPATH and DT_RUNPATH (which I have not been able to find clear documentation on) may be coming into play here.

Nonetheless, I'm all for this approach.  I would like to know what the implications for a firefox-atop-xulrunner build are, though, since that seems to be how the distros do it.

(In reply to comment #8)
> We currently have a stable method of doing binary linking (using XPCOM glue
> dynamic linking). This allows binary extensions to link against libxpcom.so and
> be compatible with multiple major versions if they are written to do so. Using
> a versioning scheme which changes the soname at each major release would make
> this difficult/impossible, and would increase the likelihood of loading two
> versions of XPCOM in the same process, which is obviously bad.

If it is possible for binary extensions to link against libxpcom.so and be compatible with multiple major versions, then the hypothetical SONAME doesn't need bumping as often as every major release, ne?  I was just being conservative.

(In reply to comment #2)
> FWIW, the only really interesting thing in the firefox and run-mozilla scripts
> are the -g and --debugger options.

As I understand it, the -g/--debugger mode of the script is only necessary because the script exists in the first place -- if there is no script, I can just do "gdb --args ./dist/bin/firefox file://whatever" and it'll work.  Yes/no?

> On Debian, we don't even use these scripts, and have
> our own wrapper script that only provide -g and --debugger options, on top of
> some extra sugar for dsp wrappers (esddsp, artsdsp, aoss, etc.), 

I am completely unfamiliar with dsp wrappers.  Is there something we could be doing internal to the executable that would obviate the need for them?

> and a check to
> prevent users to shoot themselves in the foot when running through sudo (yes,
> there are people doing that and they end up thrashing their profile as a
> result)

The thought of people running firefox with root privileges fills me with nameless dread, but it's their funeral, I suppose.  This appears to be that check: 

if [ "${SUDO_USER}" ] && [ "${SUDO_USER}" != "${USER}" ]; then
    SUDO_HOME=`getent passwd ${SUDO_USER} | cut -f6 -d:`
    if [ "${SUDO_HOME}" = "${HOME}" ]; then
        echo "You shouldn't really run Iceweasel through sudo WITHOUT the -H opt
ion." >&2
        echo "Continuing as if you used the -H option." >&2
        HOME=`getent passwd ${USER} | cut -f6 -d:`
        if [ -z "${HOME}" ]; then
            echo "Could not find the correct home directory. Please use the -H o
ption of sudo." >&2
        fi
    fi
fi

Do you think it makes sense to merge that into early app startup?  Wouldn't be difficult.
Comment 13 Mike Hommey [:glandium] 2010-04-05 23:43:51 PDT
(In reply to comment #12)
> FWIW I just did some experiments with a program that lets you change the rpath
> after the fact (it's called 'patchelf' and it's a little buggy, but not too
> much for experiments) and it appears that we would need
> -rpath=$ORIGIN:$ORIGIN/.. for every .so in components/, as well as
> -rpath=$ORIGIN for the main executable.  I'm not sure.  The difference between
> DT_RPATH and DT_RUNPATH (which I have not been able to find clear documentation
> on) may be coming into play here.

You shouldn't need any rpath on components.

> Nonetheless, I'm all for this approach.  I would like to know what the
> implications for a firefox-atop-xulrunner build are, though, since that seems
> to be how the distros do it.

Much less of a problem, since the standalone glue doesn't need rpath at all.

> (In reply to comment #2)
> As I understand it, the -g/--debugger mode of the script is only necessary
> because the script exists in the first place -- if there is no script, I can
> just do "gdb --args ./dist/bin/firefox file://whatever" and it'll work. 
> Yes/no?

Yes

> > On Debian, we don't even use these scripts, and have
> > our own wrapper script that only provide -g and --debugger options, on top of
> > some extra sugar for dsp wrappers (esddsp, artsdsp, aoss, etc.), 
> 
> I am completely unfamiliar with dsp wrappers.  Is there something we could be
> doing internal to the executable that would obviate the need for them?

dsp wrappers redirect /dev/dsp to the corresponding audio daemon (esd, artsd, pulseaudio, etc.) or in the case of aoss, redirects it on alsa output. I'm not entirely sure this is necessary nowadays for firefox itself, but I guess this is still the case for the main contender: the flash plugin. One of the problems firefox has, and which got worse with the <video> support is that it has several ways it can interact with audio hardware: through libcanberra or libesd from nsISound, through alsa from libsoundfish, and through whatever a plugin might use.

> The thought of people running firefox with root privileges fills me with
> nameless dread, but it's their funeral, I suppose.  This appears to be that
> check: 
> 
> if [ "${SUDO_USER}" ] && [ "${SUDO_USER}" != "${USER}" ]; then
>     SUDO_HOME=`getent passwd ${SUDO_USER} | cut -f6 -d:`
>     if [ "${SUDO_HOME}" = "${HOME}" ]; then
>         echo "You shouldn't really run Iceweasel through sudo WITHOUT the -H
> opt
> ion." >&2
>         echo "Continuing as if you used the -H option." >&2
>         HOME=`getent passwd ${USER} | cut -f6 -d:`
>         if [ -z "${HOME}" ]; then
>             echo "Could not find the correct home directory. Please use the -H
> o
> ption of sudo." >&2
>         fi
>     fi
> fi
> 
> Do you think it makes sense to merge that into early app startup?  Wouldn't be
> difficult.

I think it would be better if that wasn't necessary at all, and if firefox wasn't destroying profiles when it doesn't quite have the right to write it, but has enough rights to destroy it. Though since the switch to places-based bookmarks, it might have gotten better (the worst case being the loss of bookmarks, previously)
Comment 14 Ted Mielczarek [:ted.mielczarek] 2010-04-06 05:58:32 PDT
I'll throw together a candidate patch and we can see what it does.
Comment 15 Ted Mielczarek [:ted.mielczarek] 2010-08-17 12:09:17 PDT
Created attachment 466732 [details] [diff] [review]
Ditch wrapper script, use rpath=$ORIGIN

This should be most of what we need. I guess we'll need a similar fix for Solaris. I know rpath=$ORIGIN is supported there, but I don't know the specifics. I have no idea what BSDs support in this regard.
Comment 16 Mike Hommey [:glandium] 2010-08-17 12:20:15 PDT
It's overkill to a) have everything copied over to the dist directory and b) have all binaries and library include that rpath.

IMHO, it would be better to have that rpath added to the firefox binary and probably libxul.so, too. But only these two.
Comment 17 Ted Mielczarek [:ted.mielczarek] 2010-08-17 12:36:19 PDT
Oops, I intended to do this only |ifdef PROGRAM| so that only binaries would get it. Guess I forgot to add that to the patch.
Comment 18 Mike Hommey [:glandium] 2010-08-17 12:40:06 PDT
IIRC how rpath works, |ifdef PROGRAM| won't be enough, because libxul.so needs to have an rpath too ; except if you link all the libraries to the firefox binary.
Comment 19 Ted Mielczarek [:ted.mielczarek] 2010-08-17 12:45:05 PDT
We're shipping Fennec on Maemo with this exact configuration, and it works fine.
Comment 20 Mike Hommey [:glandium] 2010-08-17 12:57:35 PDT
Confirmed, I just checked, and it works with rpath on the program only. I must have been confused with something else.

Note, however, that the change in js/src would make the js shell also have an rpath, which may not be entirely expected, especially on standalone builds of the js engine.
Comment 21 Ted Mielczarek [:ted.mielczarek] 2010-08-17 13:35:55 PDT
I don't think rpath=$ORIGIN is any worse than anything else. (And the JS shell is more of a testing tool than anything end-users would really use.) We keep config/rules.mk and js/src/config/rules.mk in sync just to preserve our sanity.
Comment 22 Zack Weinberg (:zwol) 2010-08-17 14:11:40 PDT
I don't know if it's *supposed* to be this way, but I've had to use run-mozilla.sh to invoke the js shell in the past, because it wanted some library from dist/lib that it wasn't getting.

Possibly related, the shell pulls in a bunch of NSPR symbols, even though (I was under the impression that) js/ avoided using NSPR:

$ nm --dynamic ./js | grep ' U PR_'
                 U PR_AtomicDecrement
                 U PR_AtomicIncrement
                 U PR_AtomicSet
                 U PR_CreateThread
                 U PR_DestroyCondVar
                 U PR_DestroyLock
                 U PR_FindFunctionSymbol
                 U PR_FindSymbol
                 U PR_GetCurrentThread
                 U PR_IntervalNow
                 U PR_IntervalToMilliseconds
                 U PR_JoinThread
                 U PR_LoadLibraryWithFlags
                 U PR_Lock
                 U PR_NewCondVar
                 U PR_NewLock
                 U PR_NewThreadPrivateIndex
                 U PR_NotifyAllCondVar
                 U PR_NotifyCondVar
                 U PR_SetThreadPrivate
                 U PR_TicksPerSecond
                 U PR_UnloadLibrary
                 U PR_Unlock
                 U PR_WaitCondVar
Comment 23 Ted Mielczarek [:ted.mielczarek] 2010-08-17 17:14:00 PDT
Yeah, we compile spidermonkey as threadsafe, which pulls in some NSPR. (Which is probably the thing from dist/bin it wanted.)
Comment 24 Ted Mielczarek [:ted.mielczarek] 2010-09-02 06:08:07 PDT
Created attachment 471472 [details] [diff] [review]
get rid of firefox shell script
Comment 25 Mike Hommey [:glandium] 2010-09-02 06:34:48 PDT
Just a small note on the binary name.

One of the reasons I'm still providing a firefox-bin executable is that historically, on unix, because of the startup script, some desktop environments would pick firefox-bin as the name of the application they need to restart (in desktop session management, most of the time). I never checked if that was still a concern, but if that is, a symlink could be helpful.
Comment 26 Mike Hommey [:glandium] 2010-09-02 06:36:50 PDT
(In reply to comment #24)
> Created attachment 471472 [details] [diff] [review]
> get rid of firefox shell script

You also need an update in browser/installer/package-manifest.in
Comment 27 Ted Mielczarek [:ted.mielczarek] 2010-09-02 06:44:37 PDT
Yeah, this is just a rebased patch. I was kind of just testing my bzexport extension and had this patch handy. :)
Comment 28 Karl Tomlinson (:karlt) 2010-09-02 14:43:27 PDT
(In reply to comment #25)
> One of the reasons I'm still providing a firefox-bin executable is that
> historically, on unix, because of the startup script, some desktop environments
> would pick firefox-bin as the name of the application they need to restart (in
> desktop session management, most of the time).

gnome_client_set_restart_command tells the session manager the name of the binary, but I guess there could be an issue on systems that don't have libgnomeui.  (There's probably a less GNOME way to do that.)
Comment 29 Zack Weinberg (:zwol) 2010-09-02 16:10:28 PDT
I've been sort of thinking of tackling a GTK+3 update project.  Innn my copious free time.
Comment 30 Ginn Chen 2010-09-02 20:43:35 PDT
Can we run dist/bin/firefox-bin with RPATH on Linux?
It doesn't work on Solaris because RPATH=$ORIGIN doesn't follow symbolic links.
So we have
http://mxr.mozilla.org/mozilla-central/source/build/unix/run-mozilla.sh#300

I agree with Mike Hommey, we should at least leave firefox-bin as a symbolic link.
Comment 31 Ted Mielczarek [:ted.mielczarek] 2010-09-03 06:04:26 PDT
Ginn: no, it can't be a symbolic link. That's why I added the NSDISTMODE = copy in this patch:
https://bugzilla.mozilla.org/attachment.cgi?id=471472&action=diff#a/config/config.mk_sec1

Adding a firefox-bin symlink should be possible. It might be a little tricky because during packaging we explicitly dereference symlinks, but I can probably work something out.
Comment 32 gml4410 2010-09-15 09:41:13 PDT
FWIW:  (since this got mentioned in #573958)

The dynamic loader on Linux (at least) also understands $LIB, which expands to lib or lib64 according to whether it's used in a32- or 64-bi executable.  But, as I write this, I realize that is of more use to LD_LIBRARY_PATH setting (for a script that may call 32- or 64-bit binaries along the way) than an RPATH setting (which is obviously in a fixed-binary setting - that of the executable containing it).

>> The difference between DT_RPATH and DT_RUNPATH (which I have not
>> been able to find clear documentation on)

My understanding of the difference is based on how Solaris used to document it.  One of them, RPATH, was used before LD_LIBRARY_PATH (and hence could not be overridden by it) and the other, RUNPATH, was used after.  At one point (last century) I'm sure I had Solaris executables with *only* RPATH set, that was non-overridable.  Can't reproduce it now, though.  Still have the RPATH-only executables, but LD_LIBRARY_PATH overrides it.

See:

  http://www.mail-archive.com/nix-dev@cs.uu.nl/msg02578.html
Comment 33 Mike Hommey [:glandium] 2010-10-01 02:58:16 PDT
I just remembered an issue we hit a while ago with the wrapper script gone: other gecko application with the wrapper script broke the application without.

For example, you have both thunderbird and firefox. You run thunderbird with the wrapper script and firefox without. This means thunderbird sets MOZILLA_FIVE_HOME. When firefox is not already opened and the user opens an url link from thunderbird with firefox, firefox starts with MOZILLA_FIVE_HOME set and tries to use it.

I don't know if that still applies, but I thought it was worth mentioning.
Comment 34 gml4410 2010-10-01 05:24:29 PDT
Ahhh - another example of global names for app-specific values causing problems (LD_LIBRARY_PATH being the Prime Example).
If Firefox is built to start without a script why would it look for MOZILLA_FIVE_HOME at all?  And if it wants to provide a variable for debug/test then it should use something specific - such as MOZILLA_FIREFOX_HOME.
(and Thunderbird could use MOZILLA_THUNDERBIRD_HOME - you can easily set two variables to have the same value, but you can't set one variable to two different values at the same time).
Comment 35 Mike Hommey [:glandium] 2011-03-17 04:21:00 PDT
Created attachment 519866 [details] [diff] [review]
PoC using the standalone glue

This is a proof of concept of replacing the firefox binary with one using the standalone glue to load libxul.so. This method has two advantages: we don't need to rely on rpath, and it allows to more easily hook things such as bug 632404.

This patch works on Mac and Linux, though on Mac tests fail because of a script doing "find . -maxdepth 4 -type f -name firefox-bin", which fails because firefox-bin, while kept for compatibility, is a symbolic link.

It however fails to link on Windows with the following message:
nsBrowserApp.obj : error LNK2019: unresolved external symbol "void __cdecl NS_SetHasLoadedNewDLLs(void)" (?NS_SetHasLoadedNewDLLs@@YAXXZ) referenced in function "long __stdcall patched_LdrLoadDll(unsigned short *,unsigned long *,struct _UNICODE_STRING *,void * *)" (?patched_LdrLoadDll@@YGJPAGPAKPAU_UNICODE_STRING@@PAPAX@Z)
nsBrowserApp.obj : error LNK2019: unresolved external symbol _NS_SetDllDirectory referenced in function _wmain
firefox.exe : fatal error LNK1120: 2 unresolved externals

This is related to the DLL blacklisting. I wonder if we could initialize it in XRE_main instead of wmain.

Last but not least, I do think the GetBinaryPath function, that is mostly a copy of XRE_GetBinaryPath from toolkit/xre/nsAppRunner.cpp should be put as a helper function with a proper API in the XPCOM standalone glue.
Comment 36 Mike Hommey [:glandium] 2011-03-17 04:59:47 PDT
I'm also getting a small leak during reftest:
nsTraceRefcntImpl::DumpStatistics: 1181 entries
TEST-UNEXPECTED-FAIL | automationutils.processLeakLog() | leaked 396 bytes during test execution
TEST-UNEXPECTED-FAIL | automationutils.processLeakLog() | leaked 3 instances of nsLocalFile with size 124 bytes each (372 bytes total)
TEST-UNEXPECTED-FAIL | automationutils.processLeakLog() | leaked 3 instances of nsStringBuffer with size 8 bytes each (24 bytes total)
Comment 37 Mike Hommey [:glandium] 2011-03-17 10:11:30 PDT
Created attachment 519935 [details] [diff] [review]
PoC using the standalone glue, v2

Same one, but against the build-system branch, and with the leaks fixed.
Comment 38 Mike Hommey [:glandium] 2011-03-30 06:05:46 PDT
Created attachment 522981 [details] [diff] [review]
part 1 - Move DLL blocklist in XRE, and  inline NS_SetDllDirectory

This solves the linking problems from comment 35, while keeping compatibility with seamonkey, that still relies on XRE_WANT_DLL_BLOCKLIST. Seamonkey would need the same change as what is done to browser/app/nsBrowserApp.cpp, and once this is done, some cleanup to toolkit/xre/nsWindowsDllBlocklist.cpp and toolkit/xre/nsWindowsWMain.cpp can occur (I'll attach the cleanup patch for reference, but that will have to be a followup bug, I guess)
Comment 39 Mike Hommey [:glandium] 2011-03-30 06:08:51 PDT
Created attachment 522982 [details] [diff] [review]
part 2 - Load the XUL library with the standalone glue and get rid of the startup shell script

I do understand that we want to get rid of the glue and the xpcom stub, but the standalone glue currently is the most convenient way to do this. I think we can still remove the glue later, by creating a better API for this dynamic loading type of embedding, which really doesn't need to be very elaborated, but can be a separate bug.
Comment 40 Mike Hommey [:glandium] 2011-03-30 06:09:25 PDT
Created attachment 522983 [details] [diff] [review]
cleanup patch, getting rid of XRE_WANT_DLL_BLOCKLIST
Comment 41 Benjamin Smedberg [:bsmedberg] 2011-05-09 12:16:31 PDT
Comment on attachment 522982 [details] [diff] [review]
part 2 - Load the XUL library with the standalone glue and get rid of the startup shell script

On linux the binary will still be firefox-bin, but on mac it will be firefox?

I really don't like copying the binary-path code like that. Can we move it into a .h-inl file which is included from the couple different places it is needed, or maybe move it into the glue code?
Comment 42 Mike Hommey [:glandium] 2011-05-11 07:11:42 PDT
(In reply to comment #41)
> Comment on attachment 522982 [details] [diff] [review] [review]
> part 2 - Load the XUL library with the standalone glue and get rid of the
> startup shell script
> 
> On linux the binary will still be firefox-bin, but on mac it will be firefox?

It's firefox everywhere, with a compatibility symbolic link kept on linux and mac.

> I really don't like copying the binary-path code like that. Can we move it
> into a .h-inl file which is included from the couple different places it is
> needed, or maybe move it into the glue code?

The problem is that the version in toolkit returns a nsILocalFile while the version in this patch returns a char *. It can't return a nsILocalFile because the glue is not initialized yet, since the path is to be determined exactly to initialize the glue...
Comment 43 Benjamin Smedberg [:bsmedberg] 2011-05-11 07:26:39 PDT
> The problem is that the version in toolkit returns a nsILocalFile while the
> version in this patch returns a char *. It can't return a nsILocalFile
> because the glue is not initialized yet, since the path is to be determined
> exactly to initialize the glue...

Make an intermediate .h-inl function which fills a char* buffer and wrap XRE_GetBinaryPath around it...
Comment 44 Mike Hommey [:glandium] 2011-05-16 01:17:11 PDT
Created attachment 532579 [details] [diff] [review]
part 0.5 - Move environment sanitization in wmain

This moves the code from bug 642469 in wmain instead of doing it on
the first NS_SetDllDirectory call.
Comment 45 Mike Hommey [:glandium] 2011-05-16 01:18:24 PDT
Created attachment 532580 [details] [diff] [review]
part 1 - Move DLL blocklist in XRE, and  inline NS_SetDllDirectory

Refreshed for context changes in toolkit/xre/nsWindowsDllBlocklist.cpp
Comment 46 Mike Hommey [:glandium] 2011-05-16 01:23:23 PDT
Note that we'll need to revert bug 627591 first, but this will be replaced with bug 632404 landing at the same time.
Comment 47 Mike Hommey [:glandium] 2011-05-16 09:49:03 PDT
Created attachment 532674 [details] [diff] [review]
part 1.5 - Move XRE_GetBinaryPath code in a dedicated class

It happens that nothing sets XRE_BINARY_PATH anymore, so I also
removed that part.
Comment 48 Mike Hommey [:glandium] 2011-05-16 09:51:10 PDT
Created attachment 532675 [details] [diff] [review]
part 2 - Load the XUL library with the standalone glue and get rid of the startup shell script

The difference with the previous version is that it relies on part 1.5 instead
of using its own copy of GetBinaryPath.
Comment 49 Benjamin Smedberg [:bsmedberg] 2011-05-16 10:03:50 PDT
Comment on attachment 532579 [details] [diff] [review]
part 0.5 - Move environment sanitization in wmain

I want ehsan to look at this also.
Comment 50 :Ehsan Akhgari 2011-05-17 08:16:10 PDT
Comment on attachment 532579 [details] [diff] [review]
part 0.5 - Move environment sanitization in wmain

Under the new setup, how is wmain executed?  (I need to know who calls it, and which processes does it get run in).

Also, what is the purpose of this patch?  Is it just refactoring stuff?
Comment 52 Mike Hommey [:glandium] 2011-05-17 08:22:04 PDT
wmain is still the first function executed when the program starts (modulo static initializers). That doesn't change. AIUI, all processes run it.
Comment 53 Mike Hommey [:glandium] 2011-05-17 08:30:13 PDT
(In reply to comment #52)
> wmain is still the first function executed when the program starts (modulo
> static initializers). That doesn't change. AIUI, all processes run it.

browser/app/nsBrowserApp.cpp:#include "nsWindowsWMain.cpp"
ipc/app/MozillaRuntimeMain.cpp:#include "nsWindowsWMain.cpp"
ipc/ipdl/test/cxx/app/TestIPDL.cpp:#include "nsWindowsWMain.cpp"
mobile/app/nsBrowserApp.cpp:#include "nsWindowsWMain.cpp"
xulrunner/app/nsXULRunnerApp.cpp:#include "nsWindowsWMain.cpp"
xulrunner/stub/nsXULStub.cpp:#include "nsWindowsWMain.cpp"

this means both the main process and plugin-container run it.
Comment 54 Benjamin Smedberg [:bsmedberg] 2011-05-17 08:55:24 PDT
wmain is the app entry point on Windows: it takes unicode arguments (WCHAR) and converts them to UTF8 so that we can be fully unicode-aware on Windows.
Comment 55 :Ehsan Akhgari 2011-05-17 14:30:01 PDT
(In reply to comment #54)
> wmain is the app entry point on Windows: it takes unicode arguments (WCHAR)
> and converts them to UTF8 so that we can be fully unicode-aware on Windows.

Yeah, I was just checking to make sure that we're not changing that by passing something else besides -ENTRY:wmainCRTStarup to the linker with these patches.
Comment 56 :Ehsan Akhgari 2011-05-17 14:31:59 PDT
Comment on attachment 532579 [details] [diff] [review]
part 0.5 - Move environment sanitization in wmain

We don't want NS_SetDllDirectory or SanitizeEnvironmentVariables to be called as plugin process startup <http://mxr.mozilla.org/mozilla-central/source/ipc/app/MozillaRuntimeMain.cpp#52>.  But we _do_ want SanitizeEnvironmentVariables to be called the first time NS_SetDllDirectory is called.  Which makes this patch wrong.
Comment 57 :Ehsan Akhgari 2011-05-17 14:36:39 PDT
Also, it seems to me that these patches will make NS_SetDllDirectory to be called at plugin process startup, which will break some plugins...
Comment 58 Mike Hommey [:glandium] 2011-05-17 14:43:12 PDT
(In reply to comment #56)
> Comment on attachment 532579 [details] [diff] [review] [review]
> part 0.5 - Move environment sanitization in wmain
> 
> We don't want NS_SetDllDirectory or SanitizeEnvironmentVariables to be
> called as plugin process startup
> <http://mxr.mozilla.org/mozilla-central/source/ipc/app/MozillaRuntimeMain.
> cpp#52>.  But we _do_ want SanitizeEnvironmentVariables to be called the
> first time NS_SetDllDirectory is called.  Which makes this patch wrong.

Mmmmm... How exactly does the following not do what you want?

#ifndef XRE_DONT_PROTECT_DLL_LOAD
  SanitizeEnvironmentVariables();
  mozilla::NS_SetDllDirectory(L"");
#endif
Comment 59 Mike Hommey [:glandium] 2011-05-17 22:40:03 PDT
FWIW, this is a try run including the full stack (this bug + bug 632404 + bug 657297)
http://tbpl.mozilla.org/?tree=Try&rev=8919b890936e
https://ftp.mozilla.org/pub/mozilla.org/firefox/try-builds/mh@glandium.org-8919b890936e/
Comment 60 :Ehsan Akhgari 2011-05-18 11:28:38 PDT
As I said to glandium on IRC, I think we should move SanitizeEnvironmentVariables calls to the callers of NS_SetDllDirectory.
Comment 61 Mike Hommey [:glandium] 2011-05-18 11:46:08 PDT
Created attachment 533349 [details] [diff] [review]
part 1 - Move DLL blocklist in XRE, and inline NS_SetDllDirectory and environment sanitization, which now needs to be called manually

This merges part 0.5 and part 1, because that was going to be messy.
This implements what we discussed on IRC.
Comment 62 Mike Hommey [:glandium] 2011-05-18 11:51:38 PDT
Created attachment 533352 [details] [diff] [review]
part 2 - Load the XUL library with the standalone glue and get rid of the startup shell script

Refreshed
Comment 63 Mike Hommey [:glandium] 2011-05-18 11:58:46 PDT
New Try push for the new patch queue.
http://tbpl.mozilla.org/?tree=Try&rev=72cc4fe65a0d
Comment 65 Marcia Knous [:marcia - use ni] 2011-05-19 16:56:20 PDT
http://ftp.mozilla.org/pub/mozilla.org/firefox/try-builds/mh@glandium.org-72cc4fe65a0d/ is the link to the build in case anyone testing needs it.
Comment 67 juan becerra [:juanb] 2011-05-20 01:24:35 PDT
It took a little while, but I made sure to test all the bugs Ehsan mentioned in the private comment, with the exception of the last one which I couldn't figure out how to check. I found no regressions. Hopefully that's sufficient.
Comment 68 juan becerra [:juanb] 2011-05-20 01:31:00 PDT
Just to be complete, I used the try builds in comment #65 on the OSs in the bugs were reported on.
Comment 71 Mike Hommey [:glandium] 2011-05-20 08:49:31 PDT
I tested the remaining one with success (that is, the try build does the right thing).
Comment 72 Mike Hommey [:glandium] 2011-05-20 10:02:47 PDT
Comment on attachment 532674 [details] [diff] [review]
part 1.5 - Move XRE_GetBinaryPath code in a dedicated class

Tentatively try someone else. Mike or Ted, would you have some time to review this? I'd like to land it this week-end.
Comment 73 Mike Shaver (:shaver -- probably not reading bugmail closely) 2011-05-20 10:16:25 PDT
Comment on attachment 533352 [details] [diff] [review]
part 2 - Load the XUL library with the standalone glue and get rid of the startup shell script

Not qualified to do this one, sorry. Kyle maybe? Taras?
Comment 74 Mike Hommey [:glandium] 2011-05-20 11:57:00 PDT
Created attachment 534048 [details] [diff] [review]
part 1.5 - Move XRE_GetBinaryPath code in a dedicated class.

Updated with review comments from IRC
Comment 75 (dormant account) 2011-05-20 12:26:38 PDT
Comment on attachment 533352 [details] [diff] [review]
part 2 - Load the XUL library with the standalone glue and get rid of the startup shell script

r+ for the C++ bits. This is mostly a trickier platform-independent equivalent of what I did with windows linker switches in bug 627591.
Comment 76 Mike Hommey [:glandium] 2011-05-20 12:31:42 PDT
Created attachment 534067 [details] [diff] [review]
part 1.5 - Move XRE_GetBinaryPath code in a dedicated class.

Brown paper bag fix
Comment 77 (dormant account) 2011-05-20 12:39:12 PDT
Comment on attachment 534067 [details] [diff] [review]
part 1.5 - Move XRE_GetBinaryPath code in a dedicated class.

Looks good
Comment 78 Mike Hommey [:glandium] 2011-05-20 12:41:40 PDT
Comment on attachment 533352 [details] [diff] [review]
part 2 - Load the XUL library with the standalone glue and get rid of the startup shell script

for the build bits.
Comment 79 Ted Mielczarek [:ted.mielczarek] 2011-05-21 09:31:49 PDT
Comment on attachment 533352 [details] [diff] [review]
part 2 - Load the XUL library with the standalone glue and get rid of the startup shell script

Review of attachment 533352 [details] [diff] [review]:
-----------------------------------------------------------------

::: browser/app/Makefile.in
@@ +94,1 @@
>  PROGRAM = $(MOZ_APP_NAME)$(BIN_SUFFIX)

I love this so much. Does this need to be fixed separately for other apps? I know I already fixed this for Fennec on Linux a while back, but presumably we ought to fix Thunderbird and maybe XULRunner.

While you're here you should change:
http://mxr.mozilla.org/mozilla-central/source/build/binary-location.mk#46
and probably:
http://mxr.mozilla.org/mozilla-central/source/toolkit/mozapps/installer/linux/rpm/mozilla.desktop#48

@@ -104,2 @@
>  
> -ifdef BUILD_STATIC_LIBS

This code is all getting removed in bug 648911, but that is a huge patch and probably already bitrotted anyway. :-/

@@ +185,5 @@
>  
>  ifneq (,$(filter-out OS2 WINNT,$(OS_ARCH)))
>  
> +libs:: 
> +	ln -sf $(MOZ_APP_NAME)$(BIN_SUFFIX) $(DIST)/bin/$(MOZ_APP_NAME)-bin$(BIN_SUFFIX)

On Mac, does this survive as a symlink when we do all the silly rsync'ing to the app bundle?

::: xpinstall/packager/Packager.pm
@@ +317,5 @@
>        }
>      }
>      unlink("$destpath$destname$destsuffix") if ( -e "$destpath$destname$destsuffix");
> +    # If source is a symbolic link pointing in the same directory, create a
> +    # symbolic link

So awful, but nice hack.
Comment 80 Mike Hommey [:glandium] 2011-05-21 09:47:26 PDT
(In reply to comment #79)
> Comment on attachment 533352 [details] [diff] [review] [review]
> part 2 - Load the XUL library with the standalone glue and get rid of the
> startup shell script
> 
> Review of attachment 533352 [details] [diff] [review] [review]:
> -----------------------------------------------------------------
> 
> ::: browser/app/Makefile.in
> @@ +94,1 @@
> >  PROGRAM = $(MOZ_APP_NAME)$(BIN_SUFFIX)
> 
> I love this so much. Does this need to be fixed separately for other apps? I
> know I already fixed this for Fennec on Linux a while back, but presumably
> we ought to fix Thunderbird and maybe XULRunner.

I was planning to file a followup bug to have a generic app launcher, actually. There's no reason why we need to continue to duplicate that initialization code in mobile/app, xulrunner/app, suite/app, mail/app, and other places.

> While you're here you should change:
> http://mxr.mozilla.org/mozilla-central/source/build/binary-location.mk#46
> and probably:
> http://mxr.mozilla.org/mozilla-central/source/toolkit/mozapps/installer/
> linux/rpm/mozilla.desktop#48

That wouldn't be safe until we fix other applications, I believe.

> @@ -104,2 @@
> >  
> > -ifdef BUILD_STATIC_LIBS
> 
> This code is all getting removed in bug 648911, but that is a huge patch and
> probably already bitrotted anyway. :-/
> 
> @@ +185,5 @@
> >  
> >  ifneq (,$(filter-out OS2 WINNT,$(OS_ARCH)))
> >  
> > +libs:: 
> > +	ln -sf $(MOZ_APP_NAME)$(BIN_SUFFIX) $(DIST)/bin/$(MOZ_APP_NAME)-bin$(BIN_SUFFIX)
> 
> On Mac, does this survive as a symlink when we do all the silly rsync'ing to
> the app bundle?

It looks like so, as it is a symlink in the .dmg archive.
Comment 81 Mike Hommey [:glandium] 2011-05-21 11:07:11 PDT
(In reply to comment #80)
> It looks like so, as it is a symlink in the .dmg archive.

Sadly, the updater doesn't seem to support symbolic links and simply removes the file :(
I think the best short term option would be to copy firefox to firefox-bin (it's only a few dozens kilobytes).
Comment 82 Ted Mielczarek [:ted.mielczarek] 2011-05-21 17:45:21 PDT
(In reply to comment #80)
> > While you're here you should change:
> > http://mxr.mozilla.org/mozilla-central/source/build/binary-location.mk#46
> > and probably:
> > http://mxr.mozilla.org/mozilla-central/source/toolkit/mozapps/installer/
> > linux/rpm/mozilla.desktop#48
> 
> That wouldn't be safe until we fix other applications, I believe.

Okay, make sure you remember this bit in the followup.

(In reply to comment #81)
> (In reply to comment #80)
> > It looks like so, as it is a symlink in the .dmg archive.
> 
> Sadly, the updater doesn't seem to support symbolic links and simply removes
> the file :(
> I think the best short term option would be to copy firefox to firefox-bin
> (it's only a few dozens kilobytes).

Okay, I can deal with that.
Comment 83 Mike Hommey [:glandium] 2011-05-21 23:16:58 PDT
Created attachment 534275 [details] [diff] [review]
part 2 - Load the XUL library with the standalone glue and get rid of the startup shell script.

As I'll be landing (added a comment discussed with Taras on irc, and replaced ln -sf with cp -p)
Comment 85 Mike Hommey [:glandium] 2011-05-22 08:54:09 PDT
Backed out because of leak test bustage
http://hg.mozilla.org/mozilla-central/rev/aa83abd4fd01
Comment 86 Mike Hommey [:glandium] 2011-05-22 23:39:27 PDT
So, what went wrong is that while most stuff work, a few of our leak reporting is happening with atexit(). Except that at the time the process exits, the function that was registered is nowhere, since libxul is unloaded. This is why, e.g. we don't unload gconf in the system pref service.

Interestingly, this only affected linux debug builds.

The chromium code comes with an AtExitManager which would allow to avoid this problem.
Comment 87 Mike Hommey [:glandium] 2011-05-23 02:20:03 PDT
(In reply to comment #86)
> Interestingly, this only affected linux debug builds.

And this is as such because on top of the atexit problem, the malloc/free/... wrappers from nsTraceMalloc.c are not used.
Comment 88 Mike Hommey [:glandium] 2011-05-23 02:45:16 PDT
And to be more specific, only trace malloc is broken. Other leak tests were still working (xpcom bloat log, string stats...)
Comment 89 Mike Hommey [:glandium] 2011-05-23 12:09:00 PDT
Created attachment 534508 [details] [diff] [review]
part 1 - Move DLL blocklist in XRE, and inline NS_SetDllDirectory and environment sanitization, which now needs to be called manually.

Refreshed after bug 648911
Comment 90 Mike Hommey [:glandium] 2011-05-23 12:09:45 PDT
Created attachment 534509 [details] [diff] [review]
part 1.5 - Move XRE_GetBinaryPath code in a dedicated class.

Refreshed after bug 648911
Comment 91 Mike Hommey [:glandium] 2011-05-23 12:10:44 PDT
Created attachment 534510 [details] [diff] [review]
part 2 - Load the XUL library with the standalone glue and get rid of the startup shell script.

Refreshed after bug 648911
Comment 92 Mike Hommey [:glandium] 2011-06-14 22:34:48 PDT
Created attachment 539437 [details] [diff] [review]
part 2 - Load the XUL library with the standalone glue and get rid of the startup shell script.

Refreshed (context changes)
Comment 94 Matt Brubeck (:mbrubeck) 2011-06-15 08:49:56 PDT
I merged this from m-i to m-c, but it will be backed out shortly while we investigate Talos regressions:
http://hg.mozilla.org/mozilla-central/rev/7b390fa15157
http://hg.mozilla.org/mozilla-central/rev/46974e5278ff
http://hg.mozilla.org/mozilla-central/rev/4a319872b760
Comment 97 Ginn Chen 2011-06-20 02:09:01 PDT
After this fix, I have to export NSDISTMODE=copy to compile firefox, otherwise it won't work.

See comment #31.
Comment 98 Mike Hommey [:glandium] 2011-06-20 02:13:29 PDT
(In reply to comment #97)
> After this fix, I have to export NSDISTMODE=copy to compile firefox,
> otherwise it won't work.
> 
> See comment #31.

This is unlikely to be the same thing as comment #31. Please file a followup bug with the error you get when you build without NSDISTMODE=copy.
Comment 99 Boris Zbarsky [:bz] 2011-06-23 14:56:12 PDT
This looks like it broke running with -g, unfortunately.  Was that an expected outcome?
Comment 100 Mike Hommey [:glandium] 2011-06-23 16:31:24 PDT
Yes, you either run it through run-mozilla.sh, or, like any other program, just run with gdb --args. We need to update the wiki.
Comment 101 Boris Zbarsky [:bz] 2011-06-23 19:33:30 PDT
Both of those solutions fail if you have a script that sets up the right profile and runs firefox with the given args (and you need to do profile setup because otherwise we stomp all over ourselves when multiple versions of firefox are involved).   You used to be able to pass -g to the script, it would pass all its args to firefox, and things would Just Work.  Now it's way more complicated, since the -g needs to be passed to run-mozilla.sh but the other args need to be passed to firefox...

Given that this is a startup performance improvement, would it make any sense to allow _debug_ builds to continue using the old setup?  It's not like we're worried about startup perf there.
Comment 102 Zack Weinberg (:zwol) 2011-06-23 20:16:40 PDT
We could stick code at the very beginning of main (or possibly XRE_main, not sure) that scans argv for "-g" and "-d" and if it finds them, splices them out and execs the debugger.
Comment 103 Mike Hommey [:glandium] 2011-06-23 20:38:58 PDT
(In reply to comment #102)
> We could stick code at the very beginning of main (or possibly XRE_main, not
> sure) that scans argv for "-g" and "-d" and if it finds them, splices them
> out and execs the debugger.

That'd be main(), but I'm not a big fan of keeping that legacy option. Maybe we should bring this up on dev-platform?
Comment 104 Benjamin Smedberg [:bsmedberg] 2011-06-24 07:35:50 PDT
Created attachment 541679 [details]
rungdb.sh

This script will run gdb like -g does:

rungdb.sh dist/bin/firefox -profile ~/test-profile -no-remote
Comment 105 Mike Hommey [:glandium] 2011-06-24 18:42:53 PDT
(In reply to comment #104)
> Created attachment 541679 [details]
> rungdb.sh
> 
> This script will run gdb like -g does:
> 
> rungdb.sh dist/bin/firefox -profile ~/test-profile -no-remote

gdb --args dist/bin/firefox -profile ~/test-profile -no-remote
works just as well ;)

But bz's usecase is more subtle than that...

Note You need to log in before you can comment on or make changes to this bug.