Last Comment Bug 632404 - Preload Firefox libraries at startup
: Preload Firefox libraries at startup
Status: RESOLVED FIXED
:
Product: Firefox
Classification: Client Software
Component: Build Config (show other bugs)
: unspecified
: All Linux
: -- normal with 4 votes (vote)
: Firefox 7
Assigned To: Mike Hommey [:glandium]
:
Mentors:
http://glandium.org/blog/?p=1719
: 554421 (view as bug list)
Depends on: 552864 853212 975634
Blocks: 659028 627591 657297 665816 673900
  Show dependency treegraph
 
Reported: 2011-02-08 08:04 PST by Mike Hommey [:glandium]
Modified: 2014-02-21 17:17 PST (History)
32 users (show)
See Also:
Crash Signature:
(edit)
QA Whiteboard:
Iteration: ---
Points: ---
Has Regression Range: ---
Has STR: ---


Attachments
"Preload" *.so before launching firefox-bin (396 bytes, patch)
2011-02-08 08:04 PST, Mike Hommey [:glandium]
taras.mozilla: review+
Details | Diff | Review
ELF preloading (3.02 KB, text/plain)
2011-02-10 02:42 PST, Mike Hommey [:glandium]
taras.mozilla: review-
Details
WIP for Linux and OSX (7.84 KB, text/plain)
2011-02-15 10:37 PST, Mike Hommey [:glandium]
no flags Details
Preload dependent libraries at startup on Mac and Linux (21.86 KB, patch)
2011-02-22 11:43 PST, Mike Hommey [:glandium]
taras.mozilla: review-
Details | Diff | Review
Preload dependent libraries at startup on Mac and Linux, v2 (21.86 KB, patch)
2011-03-03 01:48 PST, Mike Hommey [:glandium]
no flags Details | Diff | Review
Preload dependent libraries at startup on Mac and Linux, v3 (7.42 KB, patch)
2011-03-17 05:06 PDT, Mike Hommey [:glandium]
taras.mozilla: review+
Details | Diff | Review
Preload dependent libraries at startup on Mac and Linux, v4 (7.33 KB, patch)
2011-03-30 06:11 PDT, Mike Hommey [:glandium]
no flags Details | Diff | Review
Preload dependent libraries at startup on Mac and Linux, v4.1 (7.32 KB, patch)
2011-03-30 06:22 PDT, Mike Hommey [:glandium]
taras.mozilla: review+
Details | Diff | Review
Preload dependent libraries at startup on Mac and Linux, v4.2 (8.91 KB, patch)
2011-03-31 09:56 PDT, Mike Hommey [:glandium]
mh+mozilla: review+
Details | Diff | Review
Preload dependent libraries at startup. (15.00 KB, patch)
2011-05-16 01:30 PDT, Mike Hommey [:glandium]
benjamin: review+
Details | Diff | Review
Preload dependent libraries at startup. (14.96 KB, patch)
2011-06-14 22:36 PDT, Mike Hommey [:glandium]
no flags Details | Diff | Review
Preload dependent libraries at startup. (14.95 KB, patch)
2011-06-16 00:46 PDT, Mike Hommey [:glandium]
no flags Details | Diff | Review

Description Mike Hommey [:glandium] 2011-02-08 08:04:52 PST
Created attachment 510621 [details] [diff] [review]
"Preload" *.so before launching firefox-bin

This is roughly the same idea as bug 627591, except that it's for UNIX. Something as simple as that wouldn't work on OSX, since the libraries on OSX are now just too big for that to be a win because they are universal binaries.

This gives a 20%+ startup time improvement on a rather slow disk by today's standards. We can expect even more improvement on faster disks.

Anyways, I am about to blog about quantified benefits, and why it just works so well. But I wanted to point the patch from the blog post, so I needed to open the bug first.
Comment 1 Mike Hommey [:glandium] 2011-02-08 10:44:11 PST
Corresponding blog post is http://glandium.org/blog/?p=1719

Note that we could hook that in run-mozilla.sh instead of mozilla.in, but it feels more right (less wrong, even) to do that in mozilla.in, since run-mozilla.sh is used for other things.
Comment 2 Benjamin Smedberg [:bsmedberg] 2011-02-08 11:59:27 PST
Comment on attachment 510621 [details] [diff] [review]
"Preload" *.so before launching firefox-bin

This feels so wrong.
Comment 3 (dormant account) 2011-02-08 12:00:54 PST
Comment on attachment 510621 [details] [diff] [review]
"Preload" *.so before launching firefox-bin

It is morally wrong, but until we reorganize our binary better, this is good stopgap
Comment 4 Mike Hommey [:glandium] 2011-02-08 13:10:20 PST
Comment on attachment 510621 [details] [diff] [review]
"Preload" *.so before launching firefox-bin

This is an ugly hack, but is also very simple, and it ends up being pretty efficient because of how much the kernel actually reads from the binaries, since bug 603370 couldn't make it in time.

FYI, a try run gives a ts of 502.11 on linux x86, which, looking at the last numbers on m-c, doesn't seem like a regression.
Comment 5 David Baron :dbaron: ⌚️UTC-7 (review requests must explain patch) 2011-02-09 15:40:53 PST
Comment on attachment 510621 [details] [diff] [review]
"Preload" *.so before launching firefox-bin

a2.0=dbaron

(Will this hurt startup time on debug builds?  If so, is there an easy way to disable it for debug builds?)
Comment 6 Mike Hommey [:glandium] 2011-02-10 00:22:06 PST
Good point, it will hurt builds with debugging symbols, that make the libxul.so in the order of hundreds of megabytes. Unfortunately, there are many ways the build can include debugging symbols :(
Comment 7 Mike Hommey [:glandium] 2011-02-10 02:42:41 PST
Created attachment 511329 [details]
ELF preloading

Replacing cat with this little program makes it slightly faster because calling readahead() once is faster than calling read() several, without counting the avoidance of memory copies. It also has the advantage of being aware of the elf headers and will thus avoid preloading debugging sections, which are not included in PT_LOAD headers.

I don't know where to stick this in the tree, though.
Comment 8 Landry Breuil (:gaston) 2011-02-10 03:01:00 PST
That code won't build on OpenBSD as-is as we don't have elf.h, the closest i'd find which defines those types is libelf/sys_elf.h from libelf package. And it still doesn't build, because posix_fadvise() doesn't exist.

I appreciate the fact that you want to replace cat *.so>/dev/null by a smarter thing, but please make it portable ! :)
Comment 9 Mike Hommey [:glandium] 2011-02-10 03:05:34 PST
(In reply to comment #8)
> That code won't build on OpenBSD as-is as we don't have elf.h, the closest i'd
> find which defines those types is libelf/sys_elf.h from libelf package. And it
> still doesn't build, because posix_fadvise() doesn't exist.

What do you have on openbsd that could replace it ?
Comment 10 Mike Hommey [:glandium] 2011-02-10 03:15:33 PST
Note that I was thinking about making it build on Linux only for now, as this would need testing on other platforms to know if that works.

The benefits are well known on Linux, because of the toolchain limitations, and we know for sure that there is not going to be a downside. That can't be said from platforms with a different toolchain.
Comment 11 Mike Hommey [:glandium] 2011-02-10 03:16:34 PST
(I wouldn't even guarantee that cat *.so > /dev/null would work as well on other platforms)
Comment 12 Landry Breuil (:gaston) 2011-02-10 04:07:05 PST
I can't gather startup times as your startup.xpi extension (from your website or https://bugzilla.mozilla.org/show_bug.cgi?id=593743) makes firefox stuck when starting, but 'cat *.so > /dev/null just before run_mozilla.sh call works here. I won't judge if it makes things better or not, question of feeling.

As for a replacement for posix_madvise(), as it's apparently supposed to be portable it's our fault if we don't have the corresponding empty stub in libc.
Comment 13 Mike Hommey [:glandium] 2011-02-10 04:21:08 PST
(In reply to comment #12)
> I can't gather startup times as your startup.xpi extension (from your website
> or https://bugzilla.mozilla.org/show_bug.cgi?id=593743) makes firefox stuck
> when starting

That doesn't make much sense, the extension doesn't do anything until you go to about:startup. But feel free to ping me on irc if you want some help debugging that.

> but 'cat *.so > /dev/null just before run_mozilla.sh call works
> here. I won't judge if it makes things better or not, question of feeling.
>
> As for a replacement for posix_madvise(), as it's apparently supposed to be
> portable it's our fault if we don't have the corresponding empty stub in libc.

That's fadvise, not madvise, and apparently, a lot of unices don't have it.
Comment 14 Landry Breuil (:gaston) 2011-02-10 04:56:44 PST
(In reply to comment #13)
> (In reply to comment #12)
> > I can't gather startup times as your startup.xpi extension (from your website
> > or https://bugzilla.mozilla.org/show_bug.cgi?id=593743) makes firefox stuck
> > when starting
> 
> That doesn't make much sense, the extension doesn't do anything until you go to
> about:startup. But feel free to ping me on irc if you want some help debugging
> that.

Oh, nevermind, the CalculateProcessCreationTimestamp() code in nsAppStartup.cpp is conditional for linux|android, win and macosx. On OpenBSD, it just returns NaN, and your extensions works on a fresh profile. I'll try to come with the missing code....
Comment 15 Landry Breuil (:gaston) 2011-02-10 07:38:37 PST
Posted the missing code in https://bugzilla.mozilla.org/show_bug.cgi?id=633193

Without the cat *.so (btw, we need cat $dist_bin/*.so.* , as libs have a so version here)
main	124
sessionRestored	1146
firstPaint	1345

main	126
sessionRestored	955
firstPaint	1150

with it:
main	140
sessionRestored	960
firstPaint	1147

main	136
sessionRestored	975
firstPaint	1177

So not a big difference..
Comment 16 Mike Hommey [:glandium] 2011-02-10 07:44:06 PST
(In reply to comment #15)
> Posted the missing code in https://bugzilla.mozilla.org/show_bug.cgi?id=633193
> 
> Without the cat *.so (btw, we need cat $dist_bin/*.so.* , as libs have a so
> version here)

(does that come from a change of yours or is that how the libs are named on *bsd ?)

> main    124
> sessionRestored    1146
> firstPaint    1345
> 
> main    126
> sessionRestored    955
> firstPaint    1150
> 
> with it:
> main    140
> sessionRestored    960
> firstPaint    1147
> 
> main    136
> sessionRestored    975
> firstPaint    1177
> 
> So not a big difference..

Seeing your "main" numbers, I'd say you are testing warn startup, not cold startup.
Comment 17 Mike Hommey [:glandium] 2011-02-10 07:44:21 PST
(or you are using a ssd)
Comment 18 Landry Breuil (:gaston) 2011-02-10 08:05:44 PST
(In reply to comment #16)
> (In reply to comment #15)
> > Posted the missing code in https://bugzilla.mozilla.org/show_bug.cgi?id=633193
> > 
> > Without the cat *.so (btw, we need cat $dist_bin/*.so.* , as libs have a so
> > version here)
> 
> (does that come from a change of yours or is that how the libs are named on
> *bsd ?)

That's how our ports infrastructure deal with shared libraries, each of them have a so_version set by the port Makefile.

And i have patches to the build infrastructure that makes it take that environment variable into account:
patches/patch-configure_in:+    DLL_SUFFIX=".so.${SO_VERSION}"
patches/patch-js_src_configure_in:+    DLL_SUFFIX=".so.${SO_VERSION}"

> 
> > main    124
> > sessionRestored    1146
> > firstPaint    1345
> > 
> > main    126
> > sessionRestored    955
> > firstPaint    1150
> > 
> > with it:
> > main    140
> > sessionRestored    960
> > firstPaint    1147
> > 
> > main    136
> > sessionRestored    975
> > firstPaint    1177
> > 
> > So not a big difference..
> 
> Seeing your "main" numbers, I'd say you are testing warn startup, not cold
> startup.

cold startup as in 'kill X session' ? I only had one instance of ffx running for each test, closed and reopened several times. And no, it's not on a ssd, but on fast sata disks with ahci.
Comment 19 Mike Hommey [:glandium] 2011-02-10 08:17:42 PST
(In reply to comment #18)
> (In reply to comment #16)
> > (In reply to comment #15)
> > > Posted the missing code in https://bugzilla.mozilla.org/show_bug.cgi?id=633193
> > > 
> > > Without the cat *.so (btw, we need cat $dist_bin/*.so.* , as libs have a so
> > > version here)
> > 
> > (does that come from a change of yours or is that how the libs are named on
> > *bsd ?)
> 
> That's how our ports infrastructure deal with shared libraries, each of them
> have a so_version set by the port Makefile.
> 
> And i have patches to the build infrastructure that makes it take that
> environment variable into account:
> patches/patch-configure_in:+    DLL_SUFFIX=".so.${SO_VERSION}"
> patches/patch-js_src_configure_in:+    DLL_SUFFIX=".so.${SO_VERSION}"

Then I guess you'll have to adjust the mozilla.in file yourself.

> > > main    124
> > > sessionRestored    1146
> > > firstPaint    1345
> > > 
> > > main    126
> > > sessionRestored    955
> > > firstPaint    1150
> > > 
> > > with it:
> > > main    140
> > > sessionRestored    960
> > > firstPaint    1147
> > > 
> > > main    136
> > > sessionRestored    975
> > > firstPaint    1177
> > > 
> > > So not a big difference..
> > 
> > Seeing your "main" numbers, I'd say you are testing warn startup, not cold
> > startup.
> 
> cold startup as in 'kill X session' ? I only had one instance of ffx running
> for each test, closed and reopened several times. And no, it's not on a ssd,
> but on fast sata disks with ahci.

cold startup as in 'kernel page cache flushed from anything related to firefox', also known as 'reboot'
Comment 20 (dormant account) 2011-02-11 09:44:10 PST
Comment on attachment 511329 [details]
ELF preloading

make this a library
Comment 21 Mike Hommey [:glandium] 2011-02-15 10:37:06 PST
Created attachment 512524 [details]
WIP for Linux and OSX
Comment 22 Mike Hommey [:glandium] 2011-02-16 08:21:26 PST
(In reply to comment #20)
> Comment on attachment 511329 [details]
> ELF preloading
> 
> make this a library

So, it turns out this doesn't work, neither on Linux, nor on OSX, both for roughly the same reason: the dynamic linker loads and relocates all libraries before running init functions, which makes some sense, since these functions may be calling functions from other libraries (though that could be better than it currently is).

Which means: we need a wrapper on all platforms.
Comment 23 Mike Hommey [:glandium] 2011-02-22 11:43:45 PST
Created attachment 514237 [details] [diff] [review]
Preload dependent libraries at startup on Mac and Linux

This implements the whole thing as a wrapper binary for Mac and Linux.

On Mac, this provokes a slight ts regression, especially on x86_64, but this is the price to pay on a system that doesn't have a readahead() system call.
http://perf.snarkfest.net/compare-talos/index.html?oldRevs=657c2a92ee2b&newRev=b11f384dd664&tests=ts&submit=true

With some more tweaks, it might be possible to regress a bit less. (FWIW, madvise doesn't help at all (on the contrary), and it might be possible to avoid a bunch more page faults, which seem to be rather killing on mac, contrary to linux)

This takes care to load only the relevant parts of the binaries, and I verified with dtrace that most of these parts we load were already actually loaded by force of page faulting at random places in the binaries. (only could verify on x86, though)
Comment 24 (dormant account) 2011-02-22 17:46:15 PST
Comment on attachment 514237 [details] [diff] [review]
Preload dependent libraries at startup on Mac and Linux

>+ifneq (,$(filter Linux Darwin,$(OS_TARGET)))
>+PROGRAM = $(MOZ_APP_NAME)-bin-real$(BIN_SUFFIX)


>+#ifdef MOZ_PRELOAD
>+@BINPATH@/@MOZ_APP_NAME@-bin-real
>+#endif
> @BINPATH@/@MOZ_APP_NAME@-bin
> @BINPATH@/@MOZ_APP_NAME@
> #endif

use -app or -main or something other than -real.

Don't need to copy

>+void
>+run(char *argv[], char *env[])
>+{
>+#ifdef XP_MACOSX
>+    cpu_type_t cpu_type = CPU_TYPE;
>+    size_t count;
>+    posix_spawnattr_t spawnattr;
>+    if (posix_spawnattr_init(&spawnattr) != 0)
>+        return;
>+
>+    if ((posix_spawnattr_setflags(&spawnattr, POSIX_SPAWN_SETEXEC) == 0) &&
>+        (posix_spawnattr_setbinpref_np(&spawnattr, 1, &cpu_type, &count) == 0))
>+        posix_spawnp(NULL, argv[0], NULL, &spawnattr, argv, env);
>+
>+    posix_spawnattr_destroy(&spawnattr);
>+#else
>+    execv(argv[0], argv);
>+#endif
>+}

i'd do
#ifdef XP_MACOSX
...
#else 
#define run execv

run() should be static

>+            return 255;

all of these 255s should be constants

>+    /* Partly stolen from nsXPCOMGlue.cpp */
>+    f = fopen(filePath, "r");
>+    if (f) {

if the  if(!f) return ... error pattern here.


>+    struct mach_header *mh = (struct mach_header *)base;
>+    if (mh->magic != MH_MAGIC)
>+        goto cleanup;
>+
>+    char *cmd = &base[sizeof(struct mach_header)];
>+    uint32_t ncmds = mh->ncmds;
>+    off_t end = 0;
>+    for (; ncmds; ncmds--) {
>+        struct segment_command *sh = (struct segment_command *)cmd;
>+        if (sh->cmd != LC_SEGMENT)
>+            continue;
>+        if (end < sh->fileoff + sh->filesize)
>+            end = sh->fileoff + sh->filesize;
>+        cmd += sh->cmdsize;
>+    }
>+    if (end > 0) {
>+        int i = 0;
>+        volatile char *tmp = base;
>+        if (end >= 32768)
>+            for (; i < 32768 - ((base - buf) % 32768); i += 4096)
>+                tmp[i];
>+        for (; i < end; i += 32768)
>+            tmp[i];

use a constant for 32768, 4096. % is bad style for binary numbers, I think you want &

I'm surprised the compiler doesn't optimize away tmp[i], even with volatile in there.
>+void
>+preload(const char *file)
>+{
>+    char buf[BUFSIZE];
>+    Elf_Ehdr *ehdr = (Elf_Ehdr *)buf;

Shouldn't this be a union?

>+    Elf_Phdr *phdr;
>+    Elf_Off end = 0;
>+    int phnum;
>+    int fd = open(file, O_RDONLY);
>+    if (fd < 0)
>+        return;
>+    if ((read(fd, buf, BUFSIZE) <= 0) ||
>+        (memcmp(buf, ELFMAG, 4))) {
>+        close(fd);
>+        return;
>+    }
>+    if ((ehdr->e_ident[EI_CLASS] != ELFCLASS) ||
>+        (ehdr->e_phoff + ehdr->e_phentsize * ehdr->e_phnum >= BUFSIZE)) {
>+        close(fd);
>+        return;
>+    }

In theory you could combine those two ifs :)

How much IO is caused by scanning the elf header?

>+
>+#ifndef PRELOAD_H
>+#define PRELOAD_H
>+#include <limits.h>
>+#ifdef XP_MACOSX
>+#include <mach/machine.h>
>+
>+#if defined(__i386__)
>+#define CPU_TYPE CPU_TYPE_X86
>+#elif defined(__x86_64__)
>+#define CPU_TYPE CPU_TYPE_X86_64
>+#elif defined(__ppc__)
>+#define CPU_TYPE CPU_TYPE_POWERPC
>+#elif defined(__ppc64__)
>+#define CPU_TYPE CPU_TYPE_POWERPC64
>+#else
>+#error Unsupported CPU type
>+#endif
>+
>+#endif /* XP_MACOSX */

At least part of above is redundant. PPC is obsolete on darwin.

>+
>+/* Stolen from nsXPCOMPrivate.h */
>+#ifndef MAXPATHLEN
>+#ifdef PATH_MAX
>+#define MAXPATHLEN PATH_MAX
>+#elif defined(_MAX_PATH)
>+#define MAXPATHLEN _MAX_PATH
>+#else
>+#define MAXPATHLEN 1024
>+#endif
>+#endif

I'm not sure about this. Seems like you should just be able to rely on MAXPATHLEN since the code is already linux-specific.

>+extern "C" void
>+preload(const char *file);
>+
> static void
> ReadDependentCB(const char *aDependentLib)
> {
>+#ifdef LINUX
>+    preload(aDependentLib);
>+#endif
>     void *libHandle = dlopen(aDependentLib, RTLD_GLOBAL | RTLD_LAZY);
>     if (!libHandle)
>         return;
> 
>     AppendDependentLib(libHandle);
> }


> 
> nsresult
>diff --git a/xpcom/glue/standalone/nsGlueLinkingOSX.cpp b/xpcom/glue/standalone/nsGlueLinkingOSX.cpp
>--- a/xpcom/glue/standalone/nsGlueLinkingOSX.cpp
>+++ b/xpcom/glue/standalone/nsGlueLinkingOSX.cpp
>@@ -42,19 +42,23 @@
> #include <mach-o/dyld.h>
> #include <sys/param.h>
> #include <stdlib.h>
> #include <string.h>
> #include <stdio.h>
> 
> static const mach_header* sXULLibImage;
> 
>+extern "C" void
>+preload(const char *file);
>+
> static void
> ReadDependentCB(const char *aDependentLib)
> {
>+    preload(aDependentLib);
>     (void) NSAddImage(aDependentLib,
>                       NSADDIMAGE_OPTION_RETURN_ON_ERROR |
>                       NSADDIMAGE_OPTION_MATCH_FILENAME_BY_INSTALLNAME);
> }
> 
> static void*
> LookupSymbol(const mach_header* aLib, const char* aSymbolName)
> {

I assume this is for .so files loaded by firefox. Is this really a win? Most of the .so files are either small or mostly-unused. I'd like more data on this change.
Comment 25 Mike Hommey [:glandium] 2011-02-23 00:04:03 PST
(In reply to comment #24)
> Comment on attachment 514237 [details] [diff] [review]
> Preload dependent libraries at startup on Mac and Linux
> 
> >+ifneq (,$(filter Linux Darwin,$(OS_TARGET)))
> >+PROGRAM = $(MOZ_APP_NAME)-bin-real$(BIN_SUFFIX)
> 
> 
> >+#ifdef MOZ_PRELOAD
> >+@BINPATH@/@MOZ_APP_NAME@-bin-real
> >+#endif
> > @BINPATH@/@MOZ_APP_NAME@-bin
> > @BINPATH@/@MOZ_APP_NAME@
> > #endif
> 
> use -app or -main or something other than -real.
> 
> Don't need to copy

What do you mean?

> use a constant for 32768, 4096. % is bad style for binary numbers, I think you
> want &

I used % 32768 because it's more readable than & ~(32768 - 1) (which, using a constant, would be mandatory, except if using 2 constants, in which case we'd rely on anyone modifying one to modify both at the same time)

> How much IO is caused by scanning the elf header?

One readahead, so 128K, but they are part of what's going to be read anyways. They're obviously not read twice.

> 
> >+
> >+#ifndef PRELOAD_H
> >+#define PRELOAD_H
> >+#include <limits.h>
> >+#ifdef XP_MACOSX
> >+#include <mach/machine.h>
> >+
> >+#if defined(__i386__)
> >+#define CPU_TYPE CPU_TYPE_X86
> >+#elif defined(__x86_64__)
> >+#define CPU_TYPE CPU_TYPE_X86_64
> >+#elif defined(__ppc__)
> >+#define CPU_TYPE CPU_TYPE_POWERPC
> >+#elif defined(__ppc64__)
> >+#define CPU_TYPE CPU_TYPE_POWERPC64
> >+#else
> >+#error Unsupported CPU type
> >+#endif
> >+
> >+#endif /* XP_MACOSX */
> 
> At least part of above is redundant. PPC is obsolete on darwin.

How redundant ? Note that some people are supporting OSX PPC, 4 lines is not going to bloat us while making their life easier.

> >+
> >+/* Stolen from nsXPCOMPrivate.h */
> >+#ifndef MAXPATHLEN
> >+#ifdef PATH_MAX
> >+#define MAXPATHLEN PATH_MAX
> >+#elif defined(_MAX_PATH)
> >+#define MAXPATHLEN _MAX_PATH
> >+#else
> >+#define MAXPATHLEN 1024
> >+#endif
> >+#endif
> 
> I'm not sure about this. Seems like you should just be able to rely on
> MAXPATHLEN since the code is already linux-specific.

MAXPATHLEN is used in a non-linux specific part of the code. This could be moved to main.c, though, since it's only used there.

> >+extern "C" void
> >+preload(const char *file);
> >+
> > static void
> > ReadDependentCB(const char *aDependentLib)
> > {
> >+#ifdef LINUX
> >+    preload(aDependentLib);
> >+#endif
> >     void *libHandle = dlopen(aDependentLib, RTLD_GLOBAL | RTLD_LAZY);
> >     if (!libHandle)
> >         return;
> > 
> >     AppendDependentLib(libHandle);
> > }
> 
> 
> > 
> > nsresult
> >diff --git a/xpcom/glue/standalone/nsGlueLinkingOSX.cpp b/xpcom/glue/standalone/nsGlueLinkingOSX.cpp
> >--- a/xpcom/glue/standalone/nsGlueLinkingOSX.cpp
> >+++ b/xpcom/glue/standalone/nsGlueLinkingOSX.cpp
> >@@ -42,19 +42,23 @@
> > #include <mach-o/dyld.h>
> > #include <sys/param.h>
> > #include <stdlib.h>
> > #include <string.h>
> > #include <stdio.h>
> > 
> > static const mach_header* sXULLibImage;
> > 
> >+extern "C" void
> >+preload(const char *file);
> >+
> > static void
> > ReadDependentCB(const char *aDependentLib)
> > {
> >+    preload(aDependentLib);
> >     (void) NSAddImage(aDependentLib,
> >                       NSADDIMAGE_OPTION_RETURN_ON_ERROR |
> >                       NSADDIMAGE_OPTION_MATCH_FILENAME_BY_INSTALLNAME);
> > }
> > 
> > static void*
> > LookupSymbol(const mach_header* aLib, const char* aSymbolName)
> > {
> 
> I assume this is for .so files loaded by firefox. Is this really a win? Most of
> the .so files are either small or mostly-unused. I'd like more data on this
> change.

This does the same as the firefox-bin wrapper, but for the xulrunner stub (used on firefox-on-xulrunner setups). On the long term, we should switch to the standalone glue in the firefox binary, and remove the wrapper. I have a WIP for that, but that wouldn't be accepted at this time of the release process (asked bsmedberg last week).
Comment 26 Mike Hommey [:glandium] 2011-02-23 00:26:00 PST
> >+    /* Partly stolen from nsXPCOMGlue.cpp */
> >+    f = fopen(filePath, "r");
> >+    if (f) {
> 
> if the  if(!f) return ... error pattern here.

The point here was to keep launching firefox even if we fail to open the dependentlibs.list file. Doing if(!f) return will prevent that. All errors before that are going to prevent running firefox, because they are related to finding the binary directory, which, if we don't, means we can't run the firefox binary.
Comment 27 Mike Hommey [:glandium] 2011-02-23 08:38:20 PST
we're doomed, for the ts regression. On my machine, here is what I see:
plain firefox: 1400ms
firefox + preload: 1450ms
firefox + useless preload: 1440ms

Useless preload being a wrapper that does nothing more than launching firefox (and timing its main() until before run() shows that it spends 132µs in main(), far from the 40ms regression it adds.

FWIW, half of the time spent in the preload() function is spent on mmap() and munmap(), so timing the preloading loop to cut if a certain amount is loaded in less than a certain amount of time is only ever going to cut 9% of the regression.

If you want, I can push to try to see if the same can be observed on the talos boxes.
Comment 28 Mike Beltzner [:beltzner, not reading bugmail] 2011-02-23 18:32:16 PST
Comment on attachment 510621 [details] [diff] [review]
"Preload" *.so before launching firefox-bin

(bookkeeping)
Comment 29 Mike Hommey [:glandium] 2011-03-03 01:48:04 PST
Created attachment 516545 [details] [diff] [review]
Preload dependent libraries at startup on Mac and Linux, v2

This version uses the F_RDADVISE fcntl on mac, and makes things actually faster. It turns out the sequential page faulting hack was not doing much good because of page faults slowness.

This is however not a final version, as the final version will be using the standalone glue instead of a wrapper.
Comment 30 Mike Hommey [:glandium] 2011-03-17 05:06:25 PDT
Created attachment 519874 [details] [diff] [review]
Preload dependent libraries at startup on Mac and Linux, v3

This version relies on the binary using the standalone glue, for which I attached a preliminary patch in bug 552864. Note that the preload functions could be directly embedded in the nsGlueLinking*.cpp files, instead of having a separate file.
Comment 31 (dormant account) 2011-03-21 10:32:21 PDT
Comment on attachment 519874 [details] [diff] [review]
Preload dependent libraries at startup on Mac and Linux, v3

>+#define BUFSIZE 4096

Optimal nit, I would prefer if this code was C++ and bufsize was a local const int in preload. I see no benefit in restricting us to C here. Other than that looks good.
Comment 32 Mike Hommey [:glandium] 2011-03-30 01:28:37 PDT
*** Bug 554421 has been marked as a duplicate of this bug. ***
Comment 33 Mike Hommey [:glandium] 2011-03-30 06:11:57 PDT
Created attachment 522984 [details] [diff] [review]
Preload dependent libraries at startup on Mac and Linux, v4

This turns the whole thing into C++, and embeds the code in the nsGlueLinking*.cpp files instead of having a separate file.
Comment 34 Mike Hommey [:glandium] 2011-03-30 06:22:38 PDT
Created attachment 522986 [details] [diff] [review]
Preload dependent libraries at startup on Mac and Linux, v4.1

Wasn't the right patch. This is.
Comment 35 (dormant account) 2011-03-30 12:24:50 PDT
Comment on attachment 522986 [details] [diff] [review]
Preload dependent libraries at startup on Mac and Linux, v4.1

>diff --git a/xpcom/glue/standalone/nsGlueLinkingDlopen.cpp b/xpcom/glue/standalone/nsGlueLinkingDlopen.cpp
>--- a/xpcom/glue/standalone/nsGlueLinkingDlopen.cpp
>+++ b/xpcom/glue/standalone/nsGlueLinkingDlopen.cpp
>@@ -16,16 +16,17 @@
>  *
>  * The Initial Developer of the Original Code is
>  * Benjamin Smedberg <benjamin@smedbergs.us>
>  *
>  * Portions created by the Initial Developer are Copyright (C) 2005
>  * the Initial Developer. All Rights Reserved.
>  *
>  * Contributor(s):
>+ *   Mike Hommey <mh@glandium.org>
>  *
>  * Alternatively, the contents of this file may be used under the terms of
>  * either the GNU General Public License Version 2 or later (the "GPL"), or
>  * the GNU Lesser General Public License Version 2.1 or later (the "LGPL"),
>  * in which case the provisions of the GPL or the LGPL are applicable instead
>  * of those above. If you wish to allow use of your version of this file only
>  * under the terms of either the GPL or the LGPL, and not to allow others to
>  * use your version of this file under the terms of the MPL, indicate your
>@@ -34,16 +35,24 @@
>  * the provisions above, a recipient may use your version of this file under
>  * the terms of any one of the MPL, the GPL or the LGPL.
>  *
>  * ***** END LICENSE BLOCK ***** */
> 
> #include "nsGlueLinking.h"
> #include "nsXPCOMGlue.h"
> 
>+#ifdef LINUX
>+#define _GNU_SOURCE 
>+#include <fcntl.h>
>+#include <unistd.h>
>+#include <elf.h>
>+#include <limits.h>
>+#endif
>+
> #include <dlfcn.h>
> #include <stdlib.h>
> #include <string.h>
> #include <stdio.h>
> 
> #if defined(SUNOS4) || defined(NEXTSTEP) || \
>     (defined(OPENBSD) || defined(NETBSD)) && !defined(__ELF__)
> #define LEADING_UNDERSCORE "_"
>@@ -68,19 +77,68 @@ AppendDependentLib(void *libHandle)
>         return;
> 
>     d->next = sTop;
>     d->libHandle = libHandle;
> 
>     sTop = d;
> }
> 
>+#ifdef LINUX
>+static const unsigned int bufsize = 4096;
>+
>+#ifdef HAVE_64BIT_OS
>+#define Elf_Ehdr Elf64_Ehdr
>+#define Elf_Phdr Elf64_Phdr
>+#define ELFCLASS ELFCLASS64
>+#define Elf_Off Elf64_Off
>+#else
>+#define Elf_Ehdr Elf32_Ehdr
>+#define Elf_Phdr Elf32_Phdr
>+#define ELFCLASS ELFCLASS32
>+#define Elf_Off Elf32_Off
>+#endif

Minor nit. These should be typedefs.

Minor nit. Please comment both preloads a little so that non-elf/mach-nerds can follow along for future maintenance. 

>+    char *cmd = &base[sizeof(struct mach_header)];
>+    uint32_t ncmds = mh->ncmds;
>+    off_t end = 0;
>+    for (; ncmds; ncmds--) {

Nit. cmd might as well be declared within for() loop.

>+        fcntl(buf.getFd(), F_RDADVISE, &ra);

This definitely needs a comment, mac fcntl are not obvious :)

r+ with these addressed. I like how clean this patch turned out.
Comment 36 Mike Hommey [:glandium] 2011-03-31 09:56:07 PDT
Created attachment 523339 [details] [diff] [review]
Preload dependent libraries at startup on Mac and Linux, v4.2

Same, with review comments addressed.
Comment 37 Mike Hommey [:glandium] 2011-05-16 01:30:50 PDT
Created attachment 532581 [details] [diff] [review]
Preload dependent libraries at startup.

This is mostly the same as the previous iterations, but changes a few
things:
- This adds a XPCOMGlueEnablePreload function to the xpcom standalone glue
  and changes the ReadDependentCB callbacks accordingly.
- This adds windows preloading from bug 627591.
- This also fixes a conflict between mach_header_64 and mach_header used
  later in the OSX implementation.
Comment 38 Benjamin Smedberg [:bsmedberg] 2011-05-17 11:01:09 PDT
Comment on attachment 532581 [details] [diff] [review]
Preload dependent libraries at startup.

I'm not sure why you have a separate XPCOMGlueEnablePreload instead of merely adding a bool parameter to XPCOMGlueStartup, but r=me either way.
Comment 39 Mike Hommey [:glandium] 2011-05-21 23:26:43 PDT
http://hg.mozilla.org/mozilla-central/rev/5488eda7538f
Comment 40 Mike Hommey [:glandium] 2011-05-22 08:54:25 PDT
Backed out because of leak test bustage
http://hg.mozilla.org/mozilla-central/rev/aa83abd4fd01
Comment 41 Mike Hommey [:glandium] 2011-06-14 22:36:17 PDT
Created attachment 539438 [details] [diff] [review]
Preload dependent libraries at startup.

Refreshed to apply after bug 658995
Comment 43 Matt Brubeck (:mbrubeck) 2011-06-15 08:49:13 PDT
I merged this from m-i to m-c, but it will be backed out shortly while we investigate Talos regressions:
http://hg.mozilla.org/mozilla-central/rev/e92f98e8a335
Comment 44 Mike Hommey [:glandium] 2011-06-16 00:46:31 PDT
Created attachment 539740 [details] [diff] [review]
Preload dependent libraries at startup.

Refreshed against the last version of bug 658995 part 1 (only context changes)
Comment 46 Philipp von Weitershausen [:philikon] 2011-06-17 07:38:02 PDT
http://hg.mozilla.org/mozilla-central/rev/cc18551d5cc3
Comment 47 Joe Wilson 2011-06-19 04:51:53 PDT
will this work for portable Fx?
Comment 48 Girish Sharma [:Optimizer] 2011-07-03 12:38:23 PDT
Was it ever backed out ? ,or is it still in the latest Nightlies?. I am not seeing any improved startup scores . Back in May when bug 627591 landed for a while , my startup time lowered down to around 900ms , but now they are around 2700ms . I have the about:startup extension so I can confirm the values
Comment 49 Mike Hommey [:glandium] 2011-07-03 22:59:27 PDT
Can you install the telemetry extension[1], go to about:telemetry and check the histograms for ProcessIoCounters*

1. http://people.mozilla.org/~tglek/telemetry/ping.telemetry.xpi
Comment 50 Mike Hommey [:glandium] 2011-07-03 23:08:42 PDT
(In reply to comment #49)
> Can you install the telemetry extension[1], go to about:telemetry and check
> the histograms for ProcessIoCounters*

*GLUESTARTUP*, that is.
Comment 51 Girish Sharma [:Optimizer] 2011-07-03 23:23:16 PDT
Thank you for your response, though the problem was solved as I tracked the source . It was Stumble Upon addon that was causing a 2 second delay at startup , now my Firefox starts in less than 800ms. 17 addons still enabled.

Mozilla developers should probably add Stumble upon addon in the https://addons.mozilla.org/en-US/firefox/performance/ list, up high .

Thanks you for your response.

Note You need to log in before you can comment on or make changes to this bug.