Closed Bug 632404 Opened 13 years ago Closed 13 years ago

Preload Firefox libraries at startup

Categories

(Firefox Build System :: General, defect)

All
Linux
defect
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED
mozilla7

People

(Reporter: glandium, Assigned: glandium)

References

()

Details

Attachments

(1 file, 11 obsolete files)

14.95 KB, patch
Details | Diff | Splinter Review
This is roughly the same idea as bug 627591, except that it's for UNIX. Something as simple as that wouldn't work on OSX, since the libraries on OSX are now just too big for that to be a win because they are universal binaries.

This gives a 20%+ startup time improvement on a rather slow disk by today's standards. We can expect even more improvement on faster disks.

Anyways, I am about to blog about quantified benefits, and why it just works so well. But I wanted to point the patch from the blog post, so I needed to open the bug first.
Corresponding blog post is http://glandium.org/blog/?p=1719

Note that we could hook that in run-mozilla.sh instead of mozilla.in, but it feels more right (less wrong, even) to do that in mozilla.in, since run-mozilla.sh is used for other things.
Attachment #510621 - Flags: review?(benjamin)
Comment on attachment 510621 [details] [diff] [review]
"Preload" *.so before launching firefox-bin

This feels so wrong.
Attachment #510621 - Flags: review?(benjamin) → review?(tglek)
Comment on attachment 510621 [details] [diff] [review]
"Preload" *.so before launching firefox-bin

It is morally wrong, but until we reorganize our binary better, this is good stopgap
Attachment #510621 - Flags: review?(tglek) → review+
Comment on attachment 510621 [details] [diff] [review]
"Preload" *.so before launching firefox-bin

This is an ugly hack, but is also very simple, and it ends up being pretty efficient because of how much the kernel actually reads from the binaries, since bug 603370 couldn't make it in time.

FYI, a try run gives a ts of 502.11 on linux x86, which, looking at the last numbers on m-c, doesn't seem like a regression.
Attachment #510621 - Flags: approval2.0?
Assignee: nobody → mh+mozilla
Status: NEW → ASSIGNED
Comment on attachment 510621 [details] [diff] [review]
"Preload" *.so before launching firefox-bin

a2.0=dbaron

(Will this hurt startup time on debug builds?  If so, is there an easy way to disable it for debug builds?)
Attachment #510621 - Flags: approval2.0? → approval2.0+
Good point, it will hurt builds with debugging symbols, that make the libxul.so in the order of hundreds of megabytes. Unfortunately, there are many ways the build can include debugging symbols :(
Attached file ELF preloading (obsolete) —
Replacing cat with this little program makes it slightly faster because calling readahead() once is faster than calling read() several, without counting the avoidance of memory copies. It also has the advantage of being aware of the elf headers and will thus avoid preloading debugging sections, which are not included in PT_LOAD headers.

I don't know where to stick this in the tree, though.
Attachment #511329 - Flags: review?(tglek)
That code won't build on OpenBSD as-is as we don't have elf.h, the closest i'd find which defines those types is libelf/sys_elf.h from libelf package. And it still doesn't build, because posix_fadvise() doesn't exist.

I appreciate the fact that you want to replace cat *.so>/dev/null by a smarter thing, but please make it portable ! :)
(In reply to comment #8)
> That code won't build on OpenBSD as-is as we don't have elf.h, the closest i'd
> find which defines those types is libelf/sys_elf.h from libelf package. And it
> still doesn't build, because posix_fadvise() doesn't exist.

What do you have on openbsd that could replace it ?
Note that I was thinking about making it build on Linux only for now, as this would need testing on other platforms to know if that works.

The benefits are well known on Linux, because of the toolchain limitations, and we know for sure that there is not going to be a downside. That can't be said from platforms with a different toolchain.
(I wouldn't even guarantee that cat *.so > /dev/null would work as well on other platforms)
I can't gather startup times as your startup.xpi extension (from your website or https://bugzilla.mozilla.org/show_bug.cgi?id=593743) makes firefox stuck when starting, but 'cat *.so > /dev/null just before run_mozilla.sh call works here. I won't judge if it makes things better or not, question of feeling.

As for a replacement for posix_madvise(), as it's apparently supposed to be portable it's our fault if we don't have the corresponding empty stub in libc.
(In reply to comment #12)
> I can't gather startup times as your startup.xpi extension (from your website
> or https://bugzilla.mozilla.org/show_bug.cgi?id=593743) makes firefox stuck
> when starting

That doesn't make much sense, the extension doesn't do anything until you go to about:startup. But feel free to ping me on irc if you want some help debugging that.

> but 'cat *.so > /dev/null just before run_mozilla.sh call works
> here. I won't judge if it makes things better or not, question of feeling.
>
> As for a replacement for posix_madvise(), as it's apparently supposed to be
> portable it's our fault if we don't have the corresponding empty stub in libc.

That's fadvise, not madvise, and apparently, a lot of unices don't have it.
(In reply to comment #13)
> (In reply to comment #12)
> > I can't gather startup times as your startup.xpi extension (from your website
> > or https://bugzilla.mozilla.org/show_bug.cgi?id=593743) makes firefox stuck
> > when starting
> 
> That doesn't make much sense, the extension doesn't do anything until you go to
> about:startup. But feel free to ping me on irc if you want some help debugging
> that.

Oh, nevermind, the CalculateProcessCreationTimestamp() code in nsAppStartup.cpp is conditional for linux|android, win and macosx. On OpenBSD, it just returns NaN, and your extensions works on a fresh profile. I'll try to come with the missing code....
Posted the missing code in https://bugzilla.mozilla.org/show_bug.cgi?id=633193

Without the cat *.so (btw, we need cat $dist_bin/*.so.* , as libs have a so version here)
main	124
sessionRestored	1146
firstPaint	1345

main	126
sessionRestored	955
firstPaint	1150

with it:
main	140
sessionRestored	960
firstPaint	1147

main	136
sessionRestored	975
firstPaint	1177

So not a big difference..
(In reply to comment #15)
> Posted the missing code in https://bugzilla.mozilla.org/show_bug.cgi?id=633193
> 
> Without the cat *.so (btw, we need cat $dist_bin/*.so.* , as libs have a so
> version here)

(does that come from a change of yours or is that how the libs are named on *bsd ?)

> main    124
> sessionRestored    1146
> firstPaint    1345
> 
> main    126
> sessionRestored    955
> firstPaint    1150
> 
> with it:
> main    140
> sessionRestored    960
> firstPaint    1147
> 
> main    136
> sessionRestored    975
> firstPaint    1177
> 
> So not a big difference..

Seeing your "main" numbers, I'd say you are testing warn startup, not cold startup.
(or you are using a ssd)
(In reply to comment #16)
> (In reply to comment #15)
> > Posted the missing code in https://bugzilla.mozilla.org/show_bug.cgi?id=633193
> > 
> > Without the cat *.so (btw, we need cat $dist_bin/*.so.* , as libs have a so
> > version here)
> 
> (does that come from a change of yours or is that how the libs are named on
> *bsd ?)

That's how our ports infrastructure deal with shared libraries, each of them have a so_version set by the port Makefile.

And i have patches to the build infrastructure that makes it take that environment variable into account:
patches/patch-configure_in:+    DLL_SUFFIX=".so.${SO_VERSION}"
patches/patch-js_src_configure_in:+    DLL_SUFFIX=".so.${SO_VERSION}"

> 
> > main    124
> > sessionRestored    1146
> > firstPaint    1345
> > 
> > main    126
> > sessionRestored    955
> > firstPaint    1150
> > 
> > with it:
> > main    140
> > sessionRestored    960
> > firstPaint    1147
> > 
> > main    136
> > sessionRestored    975
> > firstPaint    1177
> > 
> > So not a big difference..
> 
> Seeing your "main" numbers, I'd say you are testing warn startup, not cold
> startup.

cold startup as in 'kill X session' ? I only had one instance of ffx running for each test, closed and reopened several times. And no, it's not on a ssd, but on fast sata disks with ahci.
(In reply to comment #18)
> (In reply to comment #16)
> > (In reply to comment #15)
> > > Posted the missing code in https://bugzilla.mozilla.org/show_bug.cgi?id=633193
> > > 
> > > Without the cat *.so (btw, we need cat $dist_bin/*.so.* , as libs have a so
> > > version here)
> > 
> > (does that come from a change of yours or is that how the libs are named on
> > *bsd ?)
> 
> That's how our ports infrastructure deal with shared libraries, each of them
> have a so_version set by the port Makefile.
> 
> And i have patches to the build infrastructure that makes it take that
> environment variable into account:
> patches/patch-configure_in:+    DLL_SUFFIX=".so.${SO_VERSION}"
> patches/patch-js_src_configure_in:+    DLL_SUFFIX=".so.${SO_VERSION}"

Then I guess you'll have to adjust the mozilla.in file yourself.

> > > main    124
> > > sessionRestored    1146
> > > firstPaint    1345
> > > 
> > > main    126
> > > sessionRestored    955
> > > firstPaint    1150
> > > 
> > > with it:
> > > main    140
> > > sessionRestored    960
> > > firstPaint    1147
> > > 
> > > main    136
> > > sessionRestored    975
> > > firstPaint    1177
> > > 
> > > So not a big difference..
> > 
> > Seeing your "main" numbers, I'd say you are testing warn startup, not cold
> > startup.
> 
> cold startup as in 'kill X session' ? I only had one instance of ffx running
> for each test, closed and reopened several times. And no, it's not on a ssd,
> but on fast sata disks with ahci.

cold startup as in 'kernel page cache flushed from anything related to firefox', also known as 'reboot'
Attachment #511329 - Attachment mime type: text/x-csrc → text/plain
Comment on attachment 511329 [details]
ELF preloading

make this a library
Attachment #511329 - Flags: review?(tglek) → review-
Attached file WIP for Linux and OSX (obsolete) —
Attachment #511329 - Attachment is obsolete: true
(In reply to comment #20)
> Comment on attachment 511329 [details]
> ELF preloading
> 
> make this a library

So, it turns out this doesn't work, neither on Linux, nor on OSX, both for roughly the same reason: the dynamic linker loads and relocates all libraries before running init functions, which makes some sense, since these functions may be calling functions from other libraries (though that could be better than it currently is).

Which means: we need a wrapper on all platforms.
Attachment #512524 - Attachment mime type: text/x-csrc → text/plain
This implements the whole thing as a wrapper binary for Mac and Linux.

On Mac, this provokes a slight ts regression, especially on x86_64, but this is the price to pay on a system that doesn't have a readahead() system call.
http://perf.snarkfest.net/compare-talos/index.html?oldRevs=657c2a92ee2b&newRev=b11f384dd664&tests=ts&submit=true

With some more tweaks, it might be possible to regress a bit less. (FWIW, madvise doesn't help at all (on the contrary), and it might be possible to avoid a bunch more page faults, which seem to be rather killing on mac, contrary to linux)

This takes care to load only the relevant parts of the binaries, and I verified with dtrace that most of these parts we load were already actually loaded by force of page faulting at random places in the binaries. (only could verify on x86, though)
Attachment #510621 - Attachment is obsolete: true
Attachment #512524 - Attachment is obsolete: true
Attachment #514237 - Flags: review?(tglek)
Attachment #514237 - Flags: review?(ted.mielczarek)
Comment on attachment 514237 [details] [diff] [review]
Preload dependent libraries at startup on Mac and Linux

>+ifneq (,$(filter Linux Darwin,$(OS_TARGET)))
>+PROGRAM = $(MOZ_APP_NAME)-bin-real$(BIN_SUFFIX)


>+#ifdef MOZ_PRELOAD
>+@BINPATH@/@MOZ_APP_NAME@-bin-real
>+#endif
> @BINPATH@/@MOZ_APP_NAME@-bin
> @BINPATH@/@MOZ_APP_NAME@
> #endif

use -app or -main or something other than -real.

Don't need to copy

>+void
>+run(char *argv[], char *env[])
>+{
>+#ifdef XP_MACOSX
>+    cpu_type_t cpu_type = CPU_TYPE;
>+    size_t count;
>+    posix_spawnattr_t spawnattr;
>+    if (posix_spawnattr_init(&spawnattr) != 0)
>+        return;
>+
>+    if ((posix_spawnattr_setflags(&spawnattr, POSIX_SPAWN_SETEXEC) == 0) &&
>+        (posix_spawnattr_setbinpref_np(&spawnattr, 1, &cpu_type, &count) == 0))
>+        posix_spawnp(NULL, argv[0], NULL, &spawnattr, argv, env);
>+
>+    posix_spawnattr_destroy(&spawnattr);
>+#else
>+    execv(argv[0], argv);
>+#endif
>+}

i'd do
#ifdef XP_MACOSX
...
#else 
#define run execv

run() should be static

>+            return 255;

all of these 255s should be constants

>+    /* Partly stolen from nsXPCOMGlue.cpp */
>+    f = fopen(filePath, "r");
>+    if (f) {

if the  if(!f) return ... error pattern here.


>+    struct mach_header *mh = (struct mach_header *)base;
>+    if (mh->magic != MH_MAGIC)
>+        goto cleanup;
>+
>+    char *cmd = &base[sizeof(struct mach_header)];
>+    uint32_t ncmds = mh->ncmds;
>+    off_t end = 0;
>+    for (; ncmds; ncmds--) {
>+        struct segment_command *sh = (struct segment_command *)cmd;
>+        if (sh->cmd != LC_SEGMENT)
>+            continue;
>+        if (end < sh->fileoff + sh->filesize)
>+            end = sh->fileoff + sh->filesize;
>+        cmd += sh->cmdsize;
>+    }
>+    if (end > 0) {
>+        int i = 0;
>+        volatile char *tmp = base;
>+        if (end >= 32768)
>+            for (; i < 32768 - ((base - buf) % 32768); i += 4096)
>+                tmp[i];
>+        for (; i < end; i += 32768)
>+            tmp[i];

use a constant for 32768, 4096. % is bad style for binary numbers, I think you want &

I'm surprised the compiler doesn't optimize away tmp[i], even with volatile in there.
>+void
>+preload(const char *file)
>+{
>+    char buf[BUFSIZE];
>+    Elf_Ehdr *ehdr = (Elf_Ehdr *)buf;

Shouldn't this be a union?

>+    Elf_Phdr *phdr;
>+    Elf_Off end = 0;
>+    int phnum;
>+    int fd = open(file, O_RDONLY);
>+    if (fd < 0)
>+        return;
>+    if ((read(fd, buf, BUFSIZE) <= 0) ||
>+        (memcmp(buf, ELFMAG, 4))) {
>+        close(fd);
>+        return;
>+    }
>+    if ((ehdr->e_ident[EI_CLASS] != ELFCLASS) ||
>+        (ehdr->e_phoff + ehdr->e_phentsize * ehdr->e_phnum >= BUFSIZE)) {
>+        close(fd);
>+        return;
>+    }

In theory you could combine those two ifs :)

How much IO is caused by scanning the elf header?

>+
>+#ifndef PRELOAD_H
>+#define PRELOAD_H
>+#include <limits.h>
>+#ifdef XP_MACOSX
>+#include <mach/machine.h>
>+
>+#if defined(__i386__)
>+#define CPU_TYPE CPU_TYPE_X86
>+#elif defined(__x86_64__)
>+#define CPU_TYPE CPU_TYPE_X86_64
>+#elif defined(__ppc__)
>+#define CPU_TYPE CPU_TYPE_POWERPC
>+#elif defined(__ppc64__)
>+#define CPU_TYPE CPU_TYPE_POWERPC64
>+#else
>+#error Unsupported CPU type
>+#endif
>+
>+#endif /* XP_MACOSX */

At least part of above is redundant. PPC is obsolete on darwin.

>+
>+/* Stolen from nsXPCOMPrivate.h */
>+#ifndef MAXPATHLEN
>+#ifdef PATH_MAX
>+#define MAXPATHLEN PATH_MAX
>+#elif defined(_MAX_PATH)
>+#define MAXPATHLEN _MAX_PATH
>+#else
>+#define MAXPATHLEN 1024
>+#endif
>+#endif

I'm not sure about this. Seems like you should just be able to rely on MAXPATHLEN since the code is already linux-specific.

>+extern "C" void
>+preload(const char *file);
>+
> static void
> ReadDependentCB(const char *aDependentLib)
> {
>+#ifdef LINUX
>+    preload(aDependentLib);
>+#endif
>     void *libHandle = dlopen(aDependentLib, RTLD_GLOBAL | RTLD_LAZY);
>     if (!libHandle)
>         return;
> 
>     AppendDependentLib(libHandle);
> }


> 
> nsresult
>diff --git a/xpcom/glue/standalone/nsGlueLinkingOSX.cpp b/xpcom/glue/standalone/nsGlueLinkingOSX.cpp
>--- a/xpcom/glue/standalone/nsGlueLinkingOSX.cpp
>+++ b/xpcom/glue/standalone/nsGlueLinkingOSX.cpp
>@@ -42,19 +42,23 @@
> #include <mach-o/dyld.h>
> #include <sys/param.h>
> #include <stdlib.h>
> #include <string.h>
> #include <stdio.h>
> 
> static const mach_header* sXULLibImage;
> 
>+extern "C" void
>+preload(const char *file);
>+
> static void
> ReadDependentCB(const char *aDependentLib)
> {
>+    preload(aDependentLib);
>     (void) NSAddImage(aDependentLib,
>                       NSADDIMAGE_OPTION_RETURN_ON_ERROR |
>                       NSADDIMAGE_OPTION_MATCH_FILENAME_BY_INSTALLNAME);
> }
> 
> static void*
> LookupSymbol(const mach_header* aLib, const char* aSymbolName)
> {

I assume this is for .so files loaded by firefox. Is this really a win? Most of the .so files are either small or mostly-unused. I'd like more data on this change.
Attachment #514237 - Flags: review?(tglek) → review-
(In reply to comment #24)
> Comment on attachment 514237 [details] [diff] [review]
> Preload dependent libraries at startup on Mac and Linux
> 
> >+ifneq (,$(filter Linux Darwin,$(OS_TARGET)))
> >+PROGRAM = $(MOZ_APP_NAME)-bin-real$(BIN_SUFFIX)
> 
> 
> >+#ifdef MOZ_PRELOAD
> >+@BINPATH@/@MOZ_APP_NAME@-bin-real
> >+#endif
> > @BINPATH@/@MOZ_APP_NAME@-bin
> > @BINPATH@/@MOZ_APP_NAME@
> > #endif
> 
> use -app or -main or something other than -real.
> 
> Don't need to copy

What do you mean?

> use a constant for 32768, 4096. % is bad style for binary numbers, I think you
> want &

I used % 32768 because it's more readable than & ~(32768 - 1) (which, using a constant, would be mandatory, except if using 2 constants, in which case we'd rely on anyone modifying one to modify both at the same time)

> How much IO is caused by scanning the elf header?

One readahead, so 128K, but they are part of what's going to be read anyways. They're obviously not read twice.

> 
> >+
> >+#ifndef PRELOAD_H
> >+#define PRELOAD_H
> >+#include <limits.h>
> >+#ifdef XP_MACOSX
> >+#include <mach/machine.h>
> >+
> >+#if defined(__i386__)
> >+#define CPU_TYPE CPU_TYPE_X86
> >+#elif defined(__x86_64__)
> >+#define CPU_TYPE CPU_TYPE_X86_64
> >+#elif defined(__ppc__)
> >+#define CPU_TYPE CPU_TYPE_POWERPC
> >+#elif defined(__ppc64__)
> >+#define CPU_TYPE CPU_TYPE_POWERPC64
> >+#else
> >+#error Unsupported CPU type
> >+#endif
> >+
> >+#endif /* XP_MACOSX */
> 
> At least part of above is redundant. PPC is obsolete on darwin.

How redundant ? Note that some people are supporting OSX PPC, 4 lines is not going to bloat us while making their life easier.

> >+
> >+/* Stolen from nsXPCOMPrivate.h */
> >+#ifndef MAXPATHLEN
> >+#ifdef PATH_MAX
> >+#define MAXPATHLEN PATH_MAX
> >+#elif defined(_MAX_PATH)
> >+#define MAXPATHLEN _MAX_PATH
> >+#else
> >+#define MAXPATHLEN 1024
> >+#endif
> >+#endif
> 
> I'm not sure about this. Seems like you should just be able to rely on
> MAXPATHLEN since the code is already linux-specific.

MAXPATHLEN is used in a non-linux specific part of the code. This could be moved to main.c, though, since it's only used there.

> >+extern "C" void
> >+preload(const char *file);
> >+
> > static void
> > ReadDependentCB(const char *aDependentLib)
> > {
> >+#ifdef LINUX
> >+    preload(aDependentLib);
> >+#endif
> >     void *libHandle = dlopen(aDependentLib, RTLD_GLOBAL | RTLD_LAZY);
> >     if (!libHandle)
> >         return;
> > 
> >     AppendDependentLib(libHandle);
> > }
> 
> 
> > 
> > nsresult
> >diff --git a/xpcom/glue/standalone/nsGlueLinkingOSX.cpp b/xpcom/glue/standalone/nsGlueLinkingOSX.cpp
> >--- a/xpcom/glue/standalone/nsGlueLinkingOSX.cpp
> >+++ b/xpcom/glue/standalone/nsGlueLinkingOSX.cpp
> >@@ -42,19 +42,23 @@
> > #include <mach-o/dyld.h>
> > #include <sys/param.h>
> > #include <stdlib.h>
> > #include <string.h>
> > #include <stdio.h>
> > 
> > static const mach_header* sXULLibImage;
> > 
> >+extern "C" void
> >+preload(const char *file);
> >+
> > static void
> > ReadDependentCB(const char *aDependentLib)
> > {
> >+    preload(aDependentLib);
> >     (void) NSAddImage(aDependentLib,
> >                       NSADDIMAGE_OPTION_RETURN_ON_ERROR |
> >                       NSADDIMAGE_OPTION_MATCH_FILENAME_BY_INSTALLNAME);
> > }
> > 
> > static void*
> > LookupSymbol(const mach_header* aLib, const char* aSymbolName)
> > {
> 
> I assume this is for .so files loaded by firefox. Is this really a win? Most of
> the .so files are either small or mostly-unused. I'd like more data on this
> change.

This does the same as the firefox-bin wrapper, but for the xulrunner stub (used on firefox-on-xulrunner setups). On the long term, we should switch to the standalone glue in the firefox binary, and remove the wrapper. I have a WIP for that, but that wouldn't be accepted at this time of the release process (asked bsmedberg last week).
> >+    /* Partly stolen from nsXPCOMGlue.cpp */
> >+    f = fopen(filePath, "r");
> >+    if (f) {
> 
> if the  if(!f) return ... error pattern here.

The point here was to keep launching firefox even if we fail to open the dependentlibs.list file. Doing if(!f) return will prevent that. All errors before that are going to prevent running firefox, because they are related to finding the binary directory, which, if we don't, means we can't run the firefox binary.
we're doomed, for the ts regression. On my machine, here is what I see:
plain firefox: 1400ms
firefox + preload: 1450ms
firefox + useless preload: 1440ms

Useless preload being a wrapper that does nothing more than launching firefox (and timing its main() until before run() shows that it spends 132µs in main(), far from the 40ms regression it adds.

FWIW, half of the time spent in the preload() function is spent on mmap() and munmap(), so timing the preloading loop to cut if a certain amount is loaded in less than a certain amount of time is only ever going to cut 9% of the regression.

If you want, I can push to try to see if the same can be observed on the talos boxes.
Comment on attachment 510621 [details] [diff] [review]
"Preload" *.so before launching firefox-bin

(bookkeeping)
Attachment #510621 - Flags: approval2.0+
Attachment #514237 - Flags: review?(ted.mielczarek)
This version uses the F_RDADVISE fcntl on mac, and makes things actually faster. It turns out the sequential page faulting hack was not doing much good because of page faults slowness.

This is however not a final version, as the final version will be using the standalone glue instead of a wrapper.
Attachment #514237 - Attachment is obsolete: true
This version relies on the binary using the standalone glue, for which I attached a preliminary patch in bug 552864. Note that the preload functions could be directly embedded in the nsGlueLinking*.cpp files, instead of having a separate file.
Attachment #516545 - Attachment is obsolete: true
Attachment #519874 - Flags: review?(tglek)
Comment on attachment 519874 [details] [diff] [review]
Preload dependent libraries at startup on Mac and Linux, v3

>+#define BUFSIZE 4096

Optimal nit, I would prefer if this code was C++ and bufsize was a local const int in preload. I see no benefit in restricting us to C here. Other than that looks good.
Attachment #519874 - Flags: review?(tglek) → review+
Depends on: 552864
Blocks: 627591
This turns the whole thing into C++, and embeds the code in the nsGlueLinking*.cpp files instead of having a separate file.
Attachment #519874 - Attachment is obsolete: true
Attachment #522984 - Flags: review?(tglek)
Wasn't the right patch. This is.
Attachment #522984 - Attachment is obsolete: true
Attachment #522986 - Flags: review?(tglek)
Attachment #522984 - Flags: review?(tglek)
Comment on attachment 522986 [details] [diff] [review]
Preload dependent libraries at startup on Mac and Linux, v4.1

>diff --git a/xpcom/glue/standalone/nsGlueLinkingDlopen.cpp b/xpcom/glue/standalone/nsGlueLinkingDlopen.cpp
>--- a/xpcom/glue/standalone/nsGlueLinkingDlopen.cpp
>+++ b/xpcom/glue/standalone/nsGlueLinkingDlopen.cpp
>@@ -16,16 +16,17 @@
>  *
>  * The Initial Developer of the Original Code is
>  * Benjamin Smedberg <benjamin@smedbergs.us>
>  *
>  * Portions created by the Initial Developer are Copyright (C) 2005
>  * the Initial Developer. All Rights Reserved.
>  *
>  * Contributor(s):
>+ *   Mike Hommey <mh@glandium.org>
>  *
>  * Alternatively, the contents of this file may be used under the terms of
>  * either the GNU General Public License Version 2 or later (the "GPL"), or
>  * the GNU Lesser General Public License Version 2.1 or later (the "LGPL"),
>  * in which case the provisions of the GPL or the LGPL are applicable instead
>  * of those above. If you wish to allow use of your version of this file only
>  * under the terms of either the GPL or the LGPL, and not to allow others to
>  * use your version of this file under the terms of the MPL, indicate your
>@@ -34,16 +35,24 @@
>  * the provisions above, a recipient may use your version of this file under
>  * the terms of any one of the MPL, the GPL or the LGPL.
>  *
>  * ***** END LICENSE BLOCK ***** */
> 
> #include "nsGlueLinking.h"
> #include "nsXPCOMGlue.h"
> 
>+#ifdef LINUX
>+#define _GNU_SOURCE 
>+#include <fcntl.h>
>+#include <unistd.h>
>+#include <elf.h>
>+#include <limits.h>
>+#endif
>+
> #include <dlfcn.h>
> #include <stdlib.h>
> #include <string.h>
> #include <stdio.h>
> 
> #if defined(SUNOS4) || defined(NEXTSTEP) || \
>     (defined(OPENBSD) || defined(NETBSD)) && !defined(__ELF__)
> #define LEADING_UNDERSCORE "_"
>@@ -68,19 +77,68 @@ AppendDependentLib(void *libHandle)
>         return;
> 
>     d->next = sTop;
>     d->libHandle = libHandle;
> 
>     sTop = d;
> }
> 
>+#ifdef LINUX
>+static const unsigned int bufsize = 4096;
>+
>+#ifdef HAVE_64BIT_OS
>+#define Elf_Ehdr Elf64_Ehdr
>+#define Elf_Phdr Elf64_Phdr
>+#define ELFCLASS ELFCLASS64
>+#define Elf_Off Elf64_Off
>+#else
>+#define Elf_Ehdr Elf32_Ehdr
>+#define Elf_Phdr Elf32_Phdr
>+#define ELFCLASS ELFCLASS32
>+#define Elf_Off Elf32_Off
>+#endif

Minor nit. These should be typedefs.

Minor nit. Please comment both preloads a little so that non-elf/mach-nerds can follow along for future maintenance. 

>+    char *cmd = &base[sizeof(struct mach_header)];
>+    uint32_t ncmds = mh->ncmds;
>+    off_t end = 0;
>+    for (; ncmds; ncmds--) {

Nit. cmd might as well be declared within for() loop.

>+        fcntl(buf.getFd(), F_RDADVISE, &ra);

This definitely needs a comment, mac fcntl are not obvious :)

r+ with these addressed. I like how clean this patch turned out.
Attachment #522986 - Flags: review?(tglek) → review+
Same, with review comments addressed.
Attachment #522986 - Attachment is obsolete: true
Attachment #523339 - Flags: review+
Blocks: 657297
Attachment #523339 - Attachment filename: diff → bug632404
This is mostly the same as the previous iterations, but changes a few
things:
- This adds a XPCOMGlueEnablePreload function to the xpcom standalone glue
  and changes the ReadDependentCB callbacks accordingly.
- This adds windows preloading from bug 627591.
- This also fixes a conflict between mach_header_64 and mach_header used
  later in the OSX implementation.
Attachment #532581 - Flags: review?(benjamin)
Attachment #523339 - Attachment is obsolete: true
Comment on attachment 532581 [details] [diff] [review]
Preload dependent libraries at startup.

I'm not sure why you have a separate XPCOMGlueEnablePreload instead of merely adding a bool parameter to XPCOMGlueStartup, but r=me either way.
Attachment #532581 - Flags: review?(benjamin) → review+
http://hg.mozilla.org/mozilla-central/rev/5488eda7538f
Status: ASSIGNED → RESOLVED
Closed: 13 years ago
Resolution: --- → FIXED
Target Milestone: --- → Firefox 6
Backed out because of leak test bustage
http://hg.mozilla.org/mozilla-central/rev/aa83abd4fd01
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Blocks: 659028
Refreshed to apply after bug 658995
Attachment #532581 - Attachment is obsolete: true
http://hg.mozilla.org/integration/mozilla-inbound/rev/e92f98e8a335
Whiteboard: [inbound]
Target Milestone: Firefox 6 → ---
I merged this from m-i to m-c, but it will be backed out shortly while we investigate Talos regressions:
http://hg.mozilla.org/mozilla-central/rev/e92f98e8a335
Refreshed against the last version of bug 658995 part 1 (only context changes)
Attachment #539438 - Attachment is obsolete: true
http://hg.mozilla.org/mozilla-central/rev/cc18551d5cc3
Status: REOPENED → RESOLVED
Closed: 13 years ago13 years ago
Resolution: --- → FIXED
Whiteboard: [inbound]
Target Milestone: --- → Firefox 7
will this work for portable Fx?
Blocks: 665816
Was it ever backed out ? ,or is it still in the latest Nightlies?. I am not seeing any improved startup scores . Back in May when bug 627591 landed for a while , my startup time lowered down to around 900ms , but now they are around 2700ms . I have the about:startup extension so I can confirm the values
Can you install the telemetry extension[1], go to about:telemetry and check the histograms for ProcessIoCounters*

1. http://people.mozilla.org/~tglek/telemetry/ping.telemetry.xpi
(In reply to comment #49)
> Can you install the telemetry extension[1], go to about:telemetry and check
> the histograms for ProcessIoCounters*

*GLUESTARTUP*, that is.
Thank you for your response, though the problem was solved as I tracked the source . It was Stumble Upon addon that was causing a 2 second delay at startup , now my Firefox starts in less than 800ms. 17 addons still enabled.

Mozilla developers should probably add Stumble upon addon in the https://addons.mozilla.org/en-US/firefox/performance/ list, up high .

Thanks you for your response.
Blocks: 673900
Depends on: 853212
Depends on: 975634
Component: Build Config → General
Product: Firefox → Firefox Build System
Target Milestone: Firefox 7 → mozilla7
You need to log in before you can comment on or make changes to this bug.