Last Comment Bug 525013 - Implement a more static build configuration of Firefox
: Implement a more static build configuration of Firefox
Status: RESOLVED WONTFIX
[ts][2010Q1]
:
Product: Core
Classification: Components
Component: Build Config (show other bugs)
: Trunk
: x86 Linux
: P1 normal with 5 votes (vote)
: ---
Assigned To: Ted Mielczarek [:ted.mielczarek]
:
: Gregory Szorc [:gps]
Mentors:
Depends on: 534471 532531 532763 532765 532769 532771 537090 537709 538129 538241 545889 553635
Blocks: 447581 514083 531406 542015 542126 542567
  Show dependency treegraph
 
Reported: 2009-10-28 11:04 PDT by (dormant account)
Modified: 2011-01-04 11:46 PST (History)
40 users (show)
See Also:
Crash Signature:
(edit)
QA Whiteboard:
Iteration: ---
Points: ---
Has Regression Range: ---
Has STR: ---


Attachments
This patch allows for building firefox with a static XUL and packaging it as DMG on the Mac (2.87 KB, patch)
2009-11-12 12:22 PST, Joel Reymont (:joelr)
no flags Details | Diff | Splinter Review
Static XUL, sqlite3, mozjs and the NSPR libs. (4.09 KB, patch)
2009-11-13 15:29 PST, Joel Reymont (:joelr)
no flags Details | Diff | Splinter Review
A completely static build of Firefox. (18.62 KB, patch)
2009-11-17 13:55 PST, Joel Reymont (:joelr)
no flags Details | Diff | Splinter Review
Static build. Brings up my tabs and crashes with a segfault (20.15 KB, patch)
2009-11-18 14:20 PST, Joel Reymont (:joelr)
no flags Details | Diff | Splinter Review
No separate XUL framework as it's contents went into firefox-bin (2.48 KB, patch)
2009-11-23 13:15 PST, Joel Reymont (:joelr)
no flags Details | Diff | Splinter Review
Dynamic plds4, plc4, nspr4, crypto, smime3, ssl3, nss3, nssutil3, SSL works (7.24 KB, patch)
2009-11-25 04:12 PST, Joel Reymont (:joelr)
no flags Details | Diff | Splinter Review
What I use to build a static x86-64 Firefox on Snow Leopard (734 bytes, text/plain)
2009-12-01 11:02 PST, Joel Reymont (:joelr)
no flags Details
Remove -gfull and -dead_strip (658 bytes, text/plain)
2009-12-01 15:55 PST, Joel Reymont (:joelr)
no flags Details
Static build. Now works on Mac and Linux. (12.06 KB, patch)
2009-12-08 09:44 PST, Joel Reymont (:joelr)
no flags Details | Diff | Splinter Review
Should require no mozconfig (12.83 KB, patch)
2009-12-14 10:18 PST, Joel Reymont (:joelr)
no flags Details | Diff | Splinter Review
static build (12.02 KB, patch)
2009-12-15 13:00 PST, Joel Reymont (:joelr)
no flags Details | Diff | Splinter Review
Builds on Linux and Mac (except 10.5.2) on the try server. (13.23 KB, patch)
2009-12-16 05:47 PST, Joel Reymont (:joelr)
no flags Details | Diff | Splinter Review
static build (13.23 KB, patch)
2009-12-18 10:27 PST, Joel Reymont (:joelr)
no flags Details | Diff | Splinter Review
Builds libxpcom as a dynamic library to avoid breaking components (4.94 KB, patch)
2009-12-18 10:29 PST, Joel Reymont (:joelr)
no flags Details | Diff | Splinter Review
Talos errors on Linux (16.23 KB, patch)
2009-12-22 14:38 PST, Joel Reymont (:joelr)
no flags Details | Diff | Splinter Review
Still fails on Maemo (25.74 KB, patch)
2009-12-23 09:03 PST, Joel Reymont (:joelr)
no flags Details | Diff | Splinter Review
static build (133.09 KB, patch)
2010-01-03 05:30 PST, Joel Reymont (:joelr)
no flags Details | Diff | Splinter Review
static build, tests pass, xpcshell disabled (133.20 KB, patch)
2010-01-04 00:21 PST, Joel Reymont (:joelr)
no flags Details | Diff | Splinter Review
static build (138.20 KB, patch)
2010-01-05 07:35 PST, Joel Reymont (:joelr)
no flags Details | Diff | Splinter Review
static build (73.68 KB, patch)
2010-01-07 10:59 PST, Joel Reymont (:joelr)
no flags Details | Diff | Splinter Review
Raw data I used for statistics in comment #83 (1.69 KB, application/octet-stream)
2010-01-12 14:44 PST, Joel Reymont (:joelr)
no flags Details
really basic diagnostic on statistics from comment #83 (22.09 KB, application/pdf)
2010-01-13 15:53 PST, Zack Weinberg (:zwol)
no flags Details
script to do diagnosis (1.66 KB, text/plain)
2010-01-13 16:00 PST, Zack Weinberg (:zwol)
no flags Details
visualization of statistical model (3.94 KB, application/pdf)
2010-01-14 15:46 PST, Zack Weinberg (:zwol)
no flags Details
Static build, 12/09/09 sources (505 bytes, text/plain)
2010-01-15 02:11 PST, Joel Reymont (:joelr)
no flags Details
analysis script for fully static vs "working components" configuration (2.32 KB, text/plain)
2010-01-15 12:35 PST, Zack Weinberg (:zwol)
no flags Details
static libxul build (17.78 KB, patch)
2010-01-15 12:36 PST, Joel Reymont (:joelr)
no flags Details | Diff | Splinter Review
chart 1: predictions vs data (36.78 KB, application/pdf)
2010-01-15 12:39 PST, Zack Weinberg (:zwol)
no flags Details
chart 2: q-q plot of residuals (11.32 KB, application/pdf)
2010-01-15 12:48 PST, Zack Weinberg (:zwol)
no flags Details
static libxul build (17.81 KB, patch)
2010-01-18 07:43 PST, Joel Reymont (:joelr)
no flags Details | Diff | Splinter Review
raw timings data on snow leopard (12.11 KB, application/octet-stream)
2010-01-20 05:44 PST, Joel Reymont (:joelr)
no flags Details
raw timings data on snow leopard, now with results (18.67 KB, application/octet-stream)
2010-01-20 06:04 PST, Joel Reymont (:joelr)
no flags Details
startup performance chart, most instructive! (198.59 KB, application/pdf)
2010-01-20 08:43 PST, Joel Reymont (:joelr)
no flags Details
diagnostic plot of data from comment 108 (59.61 KB, application/pdf)
2010-01-20 13:27 PST, Zack Weinberg (:zwol)
no flags Details
overall prediction with data (68.48 KB, application/pdf)
2010-01-20 13:29 PST, Zack Weinberg (:zwol)
no flags Details
prediction with effect of 'disk' removed (69.91 KB, application/pdf)
2010-01-20 13:42 PST, Zack Weinberg (:zwol)
no flags Details
R scripts and raw data (216.72 KB, application/zip)
2010-01-20 13:45 PST, Zack Weinberg (:zwol)
no flags Details
Linux raw data, 7200rpm, 5400rpm and 4200rpm over USB2 (3.85 KB, application/x-zip-compressed)
2010-01-21 08:41 PST, Joel Reymont (:joelr)
no flags Details
startup performance chart for Linux, even more instructive than Mac OSX! (201.72 KB, application/pdf)
2010-01-21 08:47 PST, Joel Reymont (:joelr)
no flags Details
Linux raw data, now with SSD (2.83 KB, application/x-compressed-tar)
2010-01-22 09:20 PST, Joel Reymont (:joelr)
no flags Details
Startup performance chart for Linux, now with SSD (68.63 KB, application/pdf)
2010-01-22 09:22 PST, Joel Reymont (:joelr)
no flags Details
Startup performance chart for Linux with the effects of disk removed (69.43 KB, application/pdf)
2010-01-27 12:59 PST, Zack Weinberg (:zwol)
no flags Details
complete analysis of comment 115 data (259.27 KB, application/x-gzip)
2010-01-27 13:06 PST, Zack Weinberg (:zwol)
no flags Details
static libxul build with windows support (19.97 KB, patch)
2010-02-03 07:40 PST, Joel Reymont (:joelr)
no flags Details | Diff | Splinter Review
Builds on Linux, Maemo, Mac and Windows (24.12 KB, patch)
2010-02-05 06:16 PST, Joel Reymont (:joelr)
no flags Details | Diff | Splinter Review
Builds everywhere (24.51 KB, patch)
2010-02-08 09:50 PST, Joel Reymont (:joelr)
ted: review-
Details | Diff | Splinter Review
static libxul build with ted's review changes (28.77 KB, patch)
2010-02-17 05:47 PST, Joel Reymont (:joelr)
ted: review-
Details | Diff | Splinter Review
updated patch (23.31 KB, patch)
2010-03-17 10:52 PDT, Ted Mielczarek [:ted.mielczarek]
no flags Details | Diff | Splinter Review
updated patch (22.79 KB, patch)
2010-03-22 10:49 PDT, Ted Mielczarek [:ted.mielczarek]
no flags Details | Diff | Splinter Review
updated patch (24.00 KB, patch)
2010-03-26 14:53 PDT, Ted Mielczarek [:ted.mielczarek]
no flags Details | Diff | Splinter Review
Rebased patch (for posterity) (23.04 KB, patch)
2010-04-26 13:08 PDT, Ted Mielczarek [:ted.mielczarek]
no flags Details | Diff | Splinter Review

Description (dormant account) 2009-10-28 11:04:16 PDT
Currently firefox-bin is a library that loads a large number of our and system libraries.
It would be very helpful to produce a build where every firefox library is linked into firefox-bin(even at expense of disabling stuff like plugins/etc for now) to test improvement in startup time.
Comment 1 Benjamin Smedberg [:bsmedberg] 2009-10-28 11:10:08 PDT
NSPR is a small-ish problem: it creates static versions of the NSPR libs, so I think we can use it.

NSS is a huge problem: not only does it not create static libs, but it relies on having various DLLs so that it can do weird checksum stuff and do other weird dynamic-loading things.

We had a static configuration in the past, but it still had xpcom.dll dynamic so that extensions would work. This meant that there were many relocations in the binary itself and libxul was actually faster. If you want to completely break the possibility of XPCOM extensions we could fold that into the staticness and gain some additional wins.

But as an "let's investigate this" bug I think it's fairly low-priority until we decide that we can actually break binary XPCOM components, so I don't think we should work on it soon.
Comment 2 (dormant account) 2009-10-28 11:16:06 PDT
(In reply to comment #1)
> We had a static configuration in the past, but it still had xpcom.dll dynamic
> so that extensions would work. This meant that there were many relocations in
> the binary itself and libxul was actually faster. If you want to completely
> break the possibility of XPCOM extensions we could fold that into the
> staticness and gain some additional wins.

Yes. That's the idea

> 
> But as an "let's investigate this" bug I think it's fairly low-priority until
> we decide that we can actually break binary XPCOM components, so I don't think
> we should work on it soon.

I would prefer to investigate this soon to have a better idea of how much of our performance is attributed to our library layout. Assuming this is a win, we could try some hacks to approximate a proper layout.
Comment 3 Joel Reymont (:joelr) 2009-11-12 12:22:12 PST
Created attachment 412038 [details] [diff] [review]
This patch allows for building firefox with a static XUL and packaging it as DMG on the Mac

Not much of an improvement in start-up speed. Still need to take care of libmozjs, libxpcom, libxpcom_core, libplds4, libplc4, libnspr4, libsmime3, libssl3, libnss3, libnssutil3 and libmozsqlite3.
Comment 4 Joel Reymont (:joelr) 2009-11-13 15:27:35 PST
Any suggestions on how to resolve this?

ld: duplicate symbol _freedtoa in ../../dist/lib/libnspr4.a(prdtoa.o) and ../../dist/lib/libjs_static.a(jsdtoa.o)
Comment 5 Joel Reymont (:joelr) 2009-11-13 15:29:58 PST
Created attachment 412317 [details] [diff] [review]
Static XUL, sqlite3, mozjs and the NSPR libs.

There's a duplicate symbol conflict, though.
Comment 6 Brian Crowder 2009-11-13 15:36:02 PST
Those functions should be declared static, are they not?
Comment 7 Joel Reymont (:joelr) 2009-11-16 07:58:04 PST
Not in js/src/dtoa.c that jsdtoa.cpp includes.
Comment 8 Brian Crowder 2009-11-16 09:59:41 PST
Fixing that should fix your linkage.
Comment 9 Joel Reymont (:joelr) 2009-11-17 13:55:36 PST
Created attachment 412929 [details] [diff] [review]
A completely static build of Firefox.

Does not run as of yet, e.g.

./firefox-bin -no-remote -foreground -P 2
fcntl(F_SETLK) failed. errno = 35
WARNING: NS_ENSURE_TRUE(compMgr) failed: file nsComponentManagerUtils.cpp, line 90
WARNING: NS_ENSURE_TRUE(compMgr) failed: file nsComponentManagerUtils.cpp, line 90
WARNING: NS_ENSURE_TRUE(compMgr) failed: file nsComponentManagerUtils.cpp, line 90
*** Registering components in: Apprunner
*** Registering components in: Apprunner
WARNING: NS_ENSURE_SUCCESS(rv, ((nsresult) 0x80004005L)) failed with result 0x8000FFFF: file ../../../mozilla-central/toolkit/xre/nsAppRunner.cpp, line 1749
nsStringStats
 => mAllocCount:            853
 => mReallocCount:            2
 => mFreeCount:             853
 => mShareCount:              0
 => mAdoptCount:            104
 => mAdoptFreeCount:        104
Comment 10 Joel Reymont (:joelr) 2009-11-17 14:02:40 PST
My apologies, I had another instance of Firefox running. This is where I am right now:

WARNING: NS_ENSURE_SUCCESS(rv, 1) failed with result 0x8000FFFF: file ../../../mozilla-central/toolkit/xre/nsAppRunner.cpp, line 3285

From there I go to line 3597 and exit.
Comment 11 Joel Reymont (:joelr) 2009-11-18 14:20:43 PST
Created attachment 413187 [details] [diff] [review]
Static build. Brings up my tabs and crashes with a segfault
Comment 12 Joel Reymont (:joelr) 2009-11-18 14:25:47 PST
Lo and behold, it didn't crash this time. 

Here's the debug output: 

http://pastie.org/705016
Comment 13 Joel Reymont (:joelr) 2009-11-23 12:45:48 PST
I decided to redo the patch again, measuring every step of the way. At this step I don't have the XUL framework (Mac) since it's now part of firefox-bin. Using Vlad's startup timing approach I have the following.

Dynamic XUL:

1) 3260, 2189, 2002
2) 3144, 1969, 1954, 2250, 2193

Static XUL:

1) 3392, 1905, 1896
2) 3053, 1911, 2068, 1944, 1909   

The gain is inconclusive.
Comment 14 Joel Reymont (:joelr) 2009-11-23 12:49:00 PST
1) and 2) refer to two separate runs. The higher first number is the component registration path. Also, I'm running off of a USB thumb drive since file system caches are cleared when the drive is ejected.
Comment 15 Joel Reymont (:joelr) 2009-11-23 12:51:38 PST
My writing is suffering this evening. I ran the dynamic XUL build 3 times (1), followed by 3 runs of the static XUL build (1), followed by 5 runs of the dynamic XUL build (2), followed by 5 runs of the static build (2).
Comment 16 Joel Reymont (:joelr) 2009-11-23 13:07:47 PST
A huge bummer with a slow USB HD, e.g.

Dynamic XUL:

5029, 3749, 3750, 3731, 3822 

Static XUL:

6473, 4142, 4098, 3792, 4213
Comment 17 Joel Reymont (:joelr) 2009-11-23 13:15:13 PST
Created attachment 414094 [details] [diff] [review]
No separate XUL framework as it's contents went into firefox-bin

I'm starting from scratch here with a new series of patches.
Comment 18 (dormant account) 2009-11-23 13:17:48 PST
Drew,
Could you please test Joel's patch in your environment?
Comment 19 Joel Reymont (:joelr) 2009-11-24 15:16:58 PST
A more static Firefox:

5204, 3281, 3363, 3309, 3269 

where dynamic libraries are: crypto, smime3, ssl3, nss3, nssutil3

vs a less static Firefox from comment #16

5483, 3813, 3928, 4084, 4013 

where the dynamic libraries are: mozjs, xpcom, xpcom_core, plds4, plc4, nspr4, crypto, smime3, ssl3, nss3, nssutil3, mozsqlite3

vs a regular dynamic build:

5464, 3857, 3818, 3764, 3790

All of the above are running from a slow (IDE) USB HD.

One caveat is that NSS seems to be broken since the bugzilla tabs do not come up. Will post the patch in a few minutes.
Comment 20 Joel Reymont (:joelr) 2009-11-24 15:37:19 PST
I think we can discard the results in comment #16 as a fluke of sorts. The mostly static (apart from NSS) build looks to be 400-500ms faster than the regular dynamic one. 

I'll tackle reordering symbols for improved locality of reference tomorrow. Fortunately, I can use gprof now that the build is mostly static.
Comment 21 Joel Reymont (:joelr) 2009-11-25 04:10:15 PST
5896, 3612, 3410, 3364, 3368 

with dynamic plds4, plc4, nspr4, crypto, smime3, ssl3, nss3, nssutil3

SSL works fine in this case.
Comment 22 Joel Reymont (:joelr) 2009-11-25 04:12:24 PST
Created attachment 414487 [details] [diff] [review]
Dynamic plds4, plc4, nspr4, crypto, smime3, ssl3, nss3, nssutil3, SSL works

This patch corresponds to comment #21
Comment 23 Joel Reymont (:joelr) 2009-11-25 08:28:04 PST
My timings are for a 64-bit Firefox build running on a unibody MacBook Pro, 2.93Ghz Core 2 Duo, 4Gb of 1067 Mhz DDR3 memory, Mac OSX 10.6.2 (Snow Leopard) running in 64-bit mode. 

I have a slow IDE HD in a USB enclosure but don't remember the HD speed which could be 7200 or 5400 rpm. Does it even matter over USB2?
Comment 24 Brian Crowder 2009-11-25 08:30:30 PST
Is the dynamic nspr + plc + plds necessary for SSL to work?
Comment 25 Joel Reymont (:joelr) 2009-11-25 08:43:32 PST
Someone on #developers (timeless_mbp) insisted that you cannot have a static NSPR since it just plain won't work. It worked yesterday although SSL was broken.

You cannot have a static NSS since it insists on loading things dynamically and I was told that's a must for FIPS... or something like that. 

Having a dynamic nspr + plc + plds avoids having duplicate code in NSS dylibs and firefox-bin, I suppose.
Comment 26 Brian Crowder 2009-11-25 09:20:15 PST
Have you spoken with the NSPR or NSS developers themselves?
Comment 27 Joel Reymont (:joelr) 2009-11-25 09:22:00 PST
I don't know who they are, really.
Comment 28 Benjamin Smedberg [:bsmedberg] 2009-11-25 09:23:46 PST
How much effort do we really want to spend on this investigation bug? We know that making NSS work statically will be both technically and socially difficult. Is the current configuration insufficient for measuring what we want to measure?
Comment 29 Joel Reymont (:joelr) 2009-11-25 09:28:16 PST
The current configuration should be sufficient, although we may want to make static nspr + plc + plds. 

I have proven that a (mostly) static Firefox starts 300-500ms faster. 

I'm not trying to apply code reordering to this build of Firefox and time startup performance for each reordering. Perhaps I need to file a separate bug, dependent on this one.
Comment 30 Benjamin Smedberg [:bsmedberg] 2009-11-25 09:32:52 PST
300-500ms is what percent improvement?
Comment 31 Joel Reymont (:joelr) 2009-11-25 09:36:39 PST
Benjamin, please see comment 13, comment 16, comment 19 and comment 21.
Comment 32 Brian Crowder 2009-11-25 09:39:06 PST
This could be way more than 500ms for Firefox or Fennec (if we can apply similar knowledge to xulrunner apps) on mobile devices, too.  I haven't tested it yet, however.
Comment 33 Benjamin Smedberg [:bsmedberg] 2009-11-25 09:56:28 PST
Sure. I'm more interested in what decision point we're trying to get to. The basic tradeoff, unless I'm mistaken, is that we can gain some significant amount of startup time (and perhaps runtime) in exchange for breaking the possibility of having binary components. That may be a good tradeoff, but what more numbers do we need to have before actually presenting that data and deciding, and will numbers from static NSPR/NSS change the decision?
Comment 34 (dormant account) 2009-11-25 09:59:45 PST
(In reply to comment #33)
> Sure. I'm more interested in what decision point we're trying to get to. The
> basic tradeoff, unless I'm mistaken, is that we can gain some significant
> amount of startup time (and perhaps runtime) in exchange for breaking the
> possibility of having binary components. That may be a good tradeoff, but what
> more numbers do we need to have before actually presenting that data and
> deciding, and will numbers from static NSPR/NSS change the decision?

I don't think this is about banning binary components. It may involve breaking backwards compatibility with binary components. I think we have a chance to do that on 64bit mac builds and mobile.
Comment 35 Brian Crowder 2009-11-25 10:00:36 PST
Oh, I see what you mean.  Well, this decision would be a lot simpler if js-ctypes were working everywhere, right?  I don't have a good idea of what components we might be losing there, though (for example, I -think- Fennec's phone support qualifies as just such a binary component, but I could be wrong).
Comment 36 Joel Reymont (:joelr) 2009-11-25 10:05:10 PST
The js-ctypes bug was fixed recently on Maemo. 

We still need to try reordering the code in Firefox to see how that affects startup performance [1]. gprof produces the following kinds of reorderings

1) Ordering based on a “closest is best” analysis of the profiling call graph. Calls that call each other frequently are placed close together.

2) Routines sorted by the number of calls made to each routine, largest numbers first.

3) Routines sorted by the order in which they are called.

4) Routines sorted by the amount of CPU time spent, largest times first.

gprof only works on static code, though, so the more static we make the build the better it is for gprof.

There are issues with my use of gprof on Snow Leopard and I'm trying to work these out.

[1] http://developer.apple.com/mac/library/documentation/Performance/Conceptual/CodeFootprint/Articles/ImprovingLocality.html#//apple_ref/doc/uid/20001862-103710
Comment 37 Joel Reymont (:joelr) 2009-12-01 11:02:01 PST
Created attachment 415438 [details]
What I use to build a static x86-64 Firefox on Snow Leopard
Comment 38 Benjamin Smedberg [:bsmedberg] 2009-12-01 11:05:19 PST
Binary components require linking against a shared library for their symbols (libxpcom.so currently). Unless we were to do something really obscure with a table of function pointers, I think a static build implies not supporting binary XPCOM components in their current form.
Comment 39 Vladimir Vukicevic [:vlad] [:vladv] 2009-12-01 11:28:42 PST
(In reply to comment #29)
> I have proven that a (mostly) static Firefox starts 300-500ms faster. 

... on Mac?  Or cross platform?  Mac seems to have the biggest hit for dynamic libs out of our three main platforms, so this might not be anywhere near as big of a win for Windows or Linux.
Comment 40 Joel Reymont (:joelr) 2009-12-01 11:31:18 PST
I'm strictly Mac, I don't test anywhere else.
Comment 41 (dormant account) 2009-12-01 11:33:16 PST
(In reply to comment #39)
> (In reply to comment #29)
> > I have proven that a (mostly) static Firefox starts 300-500ms faster. 
> 
> ... on Mac?  Or cross platform?  Mac seems to have the biggest hit for dynamic
> libs out of our three main platforms, so this might not be anywhere near as big
> of a win for Windows or Linux.

I plan to test this on Linux in the near future. On Linux > half of our io is due to libraries. Combining these and laying them out properly should be a significant win on linux too.
Comment 42 Joel Reymont (:joelr) 2009-12-01 15:55:25 PST
Created attachment 415512 [details]
Remove -gfull and -dead_strip
Comment 43 Joel Reymont (:joelr) 2009-12-08 09:44:04 PST
Created attachment 416581 [details] [diff] [review]
Static build. Now works on Mac and Linux.
Comment 44 Joel Reymont (:joelr) 2009-12-08 09:49:30 PST
It looks like I missed a -whole lot- of system dynamic libraries on Linux (X11, Xt, etc.). Working on that...
Comment 45 Joel Reymont (:joelr) 2009-12-08 10:12:22 PST
Taras says there's no need to link against static versions of system libs so the patch should be complete.
Comment 46 Benjamin Smedberg [:bsmedberg] 2009-12-09 13:59:48 PST
From jimm's comment last week, I think we can actually do this in a way which doesn't break existing XPCOM components: have a libxpcom.so (xpcom.dll) which does basically the same thing it does today, but instead of forwarding calls to libxul like it does currently, import symbols directly from the firefox.exe (xulrunner.exe?) binary. I believe we can make this work on all three platforms, though in slightly different ways on each.
Comment 47 Joel Reymont (:joelr) 2009-12-09 15:15:27 PST
5547, 3852, 4048, 4204, 4008, 4039 - dynamic, not compressed
4177, 3052, 3138, 3112, 3093, 3263 - dynamic, compressed

4746, 3659, 3532, 3446, 3416, 3486 - static, not compressed
4019, 2716, 2849, 2679, 2671, 2769 - static, compressed

The compressed bit refers to bug 514083.

Also, I'm wisely using a clean profile now. The numbers posted in previous comments unwisely included add-ons, e.g. Delicious Bookmarks.
Comment 48 Joel Reymont (:joelr) 2009-12-11 07:42:38 PST
Re: comment 46:

I don't know how to build libxpcom.so that imports from firefox-bin since firefox-bin wants -lxpcom at build time.
Comment 49 Benjamin Smedberg [:bsmedberg] 2009-12-12 18:55:49 PST
You have to do it the other way around, of course: make firefox-bin not want -lxpcom.
Comment 50 Joel Reymont (:joelr) 2009-12-14 03:15:00 PST
I honestly don't know how to make firefox-bin not want xpcom. What is firefox-bin now is basically XUL and I always thought XUL was components and thus required xpcom. If we unbundle XUL from firefox-bin then we are back to square one and a dynamic build.
Comment 51 Benjamin Smedberg [:bsmedberg] 2009-12-14 06:24:34 PST
Did you try it? XUL is components, but it doesn't need the symbols in libxpcom because those are just forwarder symbols which enter back into libxul.
Comment 52 Joel Reymont (:joelr) 2009-12-14 10:18:36 PST
Created attachment 417502 [details] [diff] [review]
Should require no mozconfig
Comment 53 Zack Weinberg (:zwol) 2009-12-15 11:40:23 PST
This strikes me as a complete nonstarter on Linux due to distro requirements.  They already hate our habit of pulling system libraries into our source tree (note: *all* third party code in our tree is system libraries on Linux) -- static linkage is Right Out.
Comment 54 Joel Reymont (:joelr) 2009-12-15 11:44:37 PST
Zack, what exactly is a non-starter? I do not plan to statically link against system libraries, we have plenty of libraries within the Firefox proper to statically link against!
Comment 55 Ted Mielczarek [:ted.mielczarek] 2009-12-15 11:45:46 PST
Joel: he means that Linux distros ship with --enable-system-{everything}. They also ship their Firefox on top of a system XULRunner build.

Zack: as long as their builds continue to work, I think that's fine. We don't have to ship in the same configuration as they do (we're not currently).
Comment 56 (dormant account) 2009-12-15 11:47:30 PST
(In reply to comment #54)
> Zack, what exactly is a non-starter? I do not plan to statically link against
> system libraries, we have plenty of libraries within the Firefox proper to
> statically link against!

a) Distros already ship in a different configuring to our linux nightlies
b) It's not as bad as it may seem. Distros ship xulrunner configurations, so this bug just means a fat xulrunner-bin. 
c) This will break anybody trying to embed gecko with distro packages.
Comment 57 Joel Reymont (:joelr) 2009-12-15 11:49:57 PST
All platforms apart from the Mac are fine as far as I'm concerned. Mac is the platform that clearly benefits from a static build so I'll be happy to leave other platforms as they are ;-).
Comment 58 Zack Weinberg (:zwol) 2009-12-15 11:58:28 PST
It's my contention that we should be trying to minimize the differences between the typical distro configuration and the binaries we distribute, thus this seems like a step in the wrong direction on Linux.
Comment 59 (dormant account) 2009-12-15 12:03:46 PST
(In reply to comment #58)
> It's my contention that we should be trying to minimize the differences between
> the typical distro configuration and the binaries we distribute, thus this
> seems like a step in the wrong direction on Linux.

I think given the choice, we should optimize firefox towards better user experience rather than towards better distro packager experience. Taking a big perf hit due to packaging traditions is absurd. If anything, I'd rather work with distros on changing their packaging strategy.
Comment 60 Brendan Eich [:brendan] 2009-12-15 12:21:07 PST
What Taras said.

zwol: Firefox (and in some ways Mozilla [SeaMonkey, the suite] before it) has defied Linux distro convenience and other botches that have not served Linux or open source tied to it well. We aren't going to stop now, although if there is no need to differ then of course we'll not go out of our way to make distros' lives harder.

/be
Comment 61 Zack Weinberg (:zwol) 2009-12-15 12:29:12 PST
(In reply to comment #59)
> 
> I think given the choice, we should optimize firefox towards better user
> experience rather than towards better distro packager experience.

Well, abstractly, yes...

> Taking a big
> perf hit due to packaging traditions is absurd. If anything, I'd rather work
> with distros on changing their packaging strategy.

... and I definitely think we should work more closely with the Linux distros, but the thing is, their  use of --enable-system-{everything} is not a tradition, or a botch, or a convenience; it's a hard constraint imposed by their need to do security updates to all those libraries without checking their entire package collection for bundled copies.
Comment 62 Joel Reymont (:joelr) 2009-12-15 13:00:56 PST
Created attachment 417757 [details] [diff] [review]
static build

Incorporates feedback from the try server.
Comment 63 Brendan Eich [:brendan] 2009-12-15 21:20:17 PST
(In reply to comment #61)
> > Taking a big
> > perf hit due to packaging traditions is absurd. If anything, I'd rather work
> > with distros on changing their packaging strategy.
> 
> ... and I definitely think we should work more closely with the Linux distros,
> but the thing is, their  use of --enable-system-{everything} is not a
> tradition, or a botch, or a convenience; it's a hard constraint imposed by
> their need to do security updates to all those libraries without checking their
> entire package collection for bundled copies.

The botches I was referring to have to do with the wrong linkage visibility default, and the inordinate cost of dynamic libraries on Linux compared to Windows.

Linux distros have to mind their security updates, but so do we -- and we have to speed up startup and other benchmarks against fierce browser competition. Forgive me for putting Linux distros a distant second to Firefox here.

/be
Comment 64 Joel Reymont (:joelr) 2009-12-16 03:09:28 PST
Brendan, what is the -right- linkage visibility default for Linux?
Comment 65 Joel Reymont (:joelr) 2009-12-16 05:47:33 PST
Created attachment 417915 [details] [diff] [review]
Builds on Linux and Mac (except 10.5.2) on the try server.
Comment 66 Brendan Eich [:brendan] 2009-12-16 11:23:26 PST
(In reply to comment #64)
> Brendan, what is the -right- linkage visibility default for Linux?

See bug 227537.

/be
Comment 67 Joel Reymont (:joelr) 2009-12-17 09:11:15 PST
I'm trying to make firefox-bin not want -lxpcom. My understanding is that xpcom forwards to private APIs (_P_) in firefox-bin at the moment. I'm trying to build components that go into firefox-bin with MOZILLA_INTERNAL_API = 1. 

It doesn't seem to work, though:

In file included from ../../../../../mozilla-central/browser/components/migration/src/nsProfileMigrator.cpp:60:
../../../../dist/include/nsStringAPI.h:47:2: error: #error nsStringAPI.h is only usable from non-MOZILLA_INTERNAL_API code!

What is the workaround?
Comment 68 Joel Reymont (:joelr) 2009-12-17 09:20:57 PST
To word it differently...

Is replacing all #include "nsStringAPI.h" with "nStringGlue.h" a proper workaround?
Comment 69 Benjamin Smedberg [:bsmedberg] 2009-12-17 09:25:35 PST
Yes, that's what nsStringGlue is for.
Comment 70 Joel Reymont (:joelr) 2009-12-18 10:27:58 PST
Created attachment 418393 [details] [diff] [review]
static build
Comment 71 Joel Reymont (:joelr) 2009-12-18 10:29:14 PST
Created attachment 418394 [details] [diff] [review]
Builds libxpcom as a dynamic library to avoid breaking components
Comment 72 Joel Reymont (:joelr) 2009-12-22 14:38:39 PST
Created attachment 418913 [details] [diff] [review]
Talos errors on Linux
Comment 73 Joel Reymont (:joelr) 2009-12-23 09:03:14 PST
Created attachment 419016 [details] [diff] [review]
Still fails on Maemo
Comment 74 Joel Reymont (:joelr) 2009-12-28 14:19:09 PST
What tests should -not- be run with the static build?
Comment 75 Joel Reymont (:joelr) 2010-01-03 05:30:16 PST
Created attachment 419820 [details] [diff] [review]
static build

Tests pass on Snow Leopard and Linux.

Try Server runs out of space when running Linux tests.
Comment 76 Joel Reymont (:joelr) 2010-01-04 00:21:58 PST
Created attachment 419869 [details] [diff] [review]
static build, tests pass, xpcshell disabled

No Maemo (xulrunner) yet.

Build fails on Mac OSX 10.5.2 since there's no gcc -force_load.

Build and tests fail on Linux Try Server since it runs out of disk space.
Comment 77 Joel Reymont (:joelr) 2010-01-05 07:35:32 PST
Created attachment 420078 [details] [diff] [review]
static build

xulrunner builds on my machine but non on the try server (maemo).
Comment 78 Joel Reymont (:joelr) 2010-01-07 10:59:03 PST
Created attachment 420574 [details] [diff] [review]
static build

Builds and checks on Mac, Linux and Maemo. No xpcshell still.
Comment 79 Joel Reymont (:joelr) 2010-01-08 12:56:08 PST
Here are the latest timings from the static build measured against the regular dynamic one. I built and measured on the 'normal user' Mozilla laptops that arrived recently. Needless to say, these timings are completely f..d up! 

10.5.8 build, run on 32-bit, 2Ghz Core Duo, 2Gb 667Mhz DDR2, 1.5Gbit SATA, HD 7200rpm

4292, 2694, 2590, 2527, 2480, 2880 - dynamic
4000, 2528, 2980, 2518, 2535, 2544 - static

10.5.2 build, 32-bit, 2.33Ghz Core2 Duo, 2Gb 667Mhz DDR2, 1.5Gbit SATA, HD 5400rpm

6495, 4479, 5043, 4503, 4381, 4167 - dynamic
5948, 5311, 4614, 4534, 4322, 4144 - static

Save as right above but running on 10.6.2

4910, 3200, 3285, 3200, 3149, 3054 - dynamic
4884, 3025, 3013, 3244, 3638, 3061 - static
Comment 80 Joel Reymont (:joelr) 2010-01-08 13:09:59 PST
I haven't run the old/original patch on my newly acquired slow laptops so I don't know if -that- performance gain still holds. The difference between what I was doing then and now is that 

1) components work now and tests pass
2) xpcom is routed through a dynamic xpcom dylib to the firefox body

Components didn't work previously and there was no 'routing' since xpcom was a static library, right within firefox.

I'll dig up the old patch and see if the performance gains hold. Obviously, they are not there with the current fully working build.
Comment 81 Joel Reymont (:joelr) 2010-01-11 00:30:16 PST
Here I include the timings of a the old static/dynamic builds (as of 12/09/09) that I had lying around, as well as the timings using a simple static patch that breaks components (comment #70). The latter is the patch that gave me fast static startup.

Notice that there's not much of a difference between dynamic, static and comment #70 timings. Firefox startup must have gotten much faster due to all the patches we have landed in December.

This is proven by the fact that the 12/09/09 static and dynamic builds of Firefox are SO MUCH slower than the recent builds. Skip to the last set of timings to see this.

Here are my conclusions:

1) A static build of Firefox is moot at this point. Making and landing one is a HUGE hassle and the Windows static build is yet to be done. 

2) A static build may make code layout optimization easier but this is in no way a given. We may be able to optimize code layout with dynamic build.

3) Since a dynamic build startup is slow on Linux and Maemo, this is where we should focus our future optimization efforts.

---

10.5.8, 32-bit, 2Ghz Core Duo, 2Gb 667Mhz DDR2, 1.5Gbit SATA, HD 7200rpm

4292, 2694, 2590, 2527, 2480, 2880 - dynamic
4000, 2528, 2980, 2518, 2535, 2544 - static
4564, 2621, 2270, 2349, 2227, 2314 - bug 525013, comment #70

10.5.2, 32-bit, 2.33Ghz Core2 Duo, 2Gb 667Mhz DDR2, 1.5Gbit SATA, HD 5400rpm

6495, 4479, 5043, 4503, 4381, 4167 - dynamic
5948, 5311, 4614, 4534, 4322, 4144 - static
7919, 4413, 4204, 4465, 3605, 4238 - bug 525013, comment #70

Same as right above but 10.6.2

6103, 3977, 3891, 3954, 3743, 3791 - old dynamic, 12/09/09 build
4910, 3200, 3285, 3200, 3149, 3054 - dynamic
5286, 4039, 3594, 3863, 3588, 3552 - old static, 12/09/09 build
4884, 3025, 3013, 3244, 3638, 3061 - static
5203, 3181, 3644, 3185, 3172, 3248 - bug 525013, comment #70
Comment 82 Joel Reymont (:joelr) 2010-01-11 14:56:46 PST
One other thing I would do here is roll back to the last commit on or before 12/09/09 and build on 10.5.x, just in case the 10.6.1 to 10.6.2 upgrade is the issue. 

Timing on Linux makes sense, if only to disprove my theory that the speed up came from code improvements.
Comment 83 Joel Reymont (:joelr) 2010-01-12 14:23:36 PST
10.6.2, 64-bit, 2.93Ghz Core2 Duo, 4Gb 1067Mhz DDR3, 3bit SATA, external USB HD

This is the configuration I used to test all builds prior to comment #79.

Here, I used 100 runs per build (A-G) using Vlad's timing approach, modified to include the stop.js code and exit the browser after each run. window.close() does not exit the browser on the Mac, unfortunately.

A, B, C build everything statically, with patch from comment #70 or very similar. I didn't get fully working components until late in December. Current sources are fastest. 

These builds break component compatibility because everything (xpcom and browsercomps included) is static and built into firefox-bin. There's no overhead with libxpcom.dylib redirecting to firefox-bin.

D is a static build with fully working components, built using current sources. It lags the above (A, B, C) because it's build with dynamic libxpcom and libbrowsercomps, where libxpcom routes to functions in firefox-bin. XPCOM routing adds overhead, thus a slower build.

D is almost as fast as C where C does much less work, i.e. no XPCOM routing to firefox-bin. I think our code must have gotten much faster to compensate for the XPCOM overhead.

E is our regular dynamic build, one I built on 12/09/09. F is the same but built today from 12/09/09 sources. They are fairly close. 

G is our stock build (dynamic) from current sources and is the slowest.

---

A: static, built with patch from comment #70, current sources, 41.9mb firefox-bin
mean = 3507.87, stdev = 146.3946 

B: static, 12/09/09 build I have around, 41.8mb ff-bin
mean = 3686.03, stdev = 118.4819 

C: static, comment #70, 12/09/09 sources, 41.7mb ff-bin
mean = 3736.85, stdev = 161.0536 

D: static, current sources, 41.5Mb firefox-bin
mean = 3740.24, stdev = 156.8986 

E: stock/dynamic, 12/09/09 build, 40.2Mb XUL + dylibs 
mean = 3908.21, stdev = 124.6394 

F: stock/dynamic, 12/09/09 sources, 36.9Mb XUL + dylibs
mean = 4039.61, stdev = 120.5084 

G: stock/dynamic, current sources, 37.1Mb XUL + dylibs
mean = 4065.44, stdev = 127.2844
Comment 84 Joel Reymont (:joelr) 2010-01-12 14:44:35 PST
Created attachment 421337 [details]
Raw data I used for statistics in comment #83

A: C70.time, static, built with patch from comment #70, current sources, 41.9mb firefox-bin
B: Static1209.time, static, 12/09/09 build I have around, 41.8mb ff-bin
C: C70Tip1209.time, static, comment #70, 12/09/09 sources, 41.7mb ff-bin
D: Static.time, static, current sources, 41.5Mb firefox-bin
E: Stock1209.time, stock/dynamic, 12/09/09 build, 40.2Mb XUL + dylibs 
F: StockTip1209.time, stock/dynamic, 12/09/09 sources, 36.9Mb XUL + dylibs
G: Stock.time, stock/dynamic, current sources, 37.1Mb XUL + dylibs
Comment 85 (dormant account) 2010-01-13 14:01:28 PST
So there is a 6% difference between A & D, is that a result of linking in browser components statically?
Comment 86 Zack Weinberg (:zwol) 2010-01-13 14:09:37 PST
I'm a little confused by comment 84.  Is this a correct breakdown?

file           sources   compiled  configuration
----           -------   --------  -------------
C70            current   now       fully static
Static1209     12/09/09  12/09/09  fully static
C70Tip1209     12/09/09  now       fully static
Static         current   now       mostly static
Stock1209      12/09/09  12/09/09  dynamic
StockTip1209   12/09/09  now       dynamic
Stock          current   now       dynamic

If I understand it correctly, there's a gap in your experimental matrix: I'd like to see results for sources from 12/09/09, compiled now, with the "mostly static" patches (that is, dynamic libxpcom and libbrowsercomps, otherwise static).  If that's not easy to do, though, don't worry about it.
Comment 87 Joel Reymont (:joelr) 2010-01-13 15:10:04 PST
Re: comment #85, it's linking in browser components statically and not having a dynamic libxpcom and forwards APIs to firefox-bin.

Re: comment #86, I'll see what I can do.
Comment 88 Zack Weinberg (:zwol) 2010-01-13 15:53:40 PST
Created attachment 421532 [details]
really basic diagnostic on statistics from comment #83

I did a simple diagnostic on the raw data joelr posted.  With data of this form, one should always take a quick look at the observation density function, because if it doesn't conform to a normal distribution, almost all the "standard" statistical tools cannot be trusted.

Each plot in the attached PDF corresponds to one of Joel's conditions A through G; the gray filled curve is a density estimate from the data, and the red line shows the theoretical normal distribution.  The vertical lines along the x-axis are "rug" plots showing each individual data point, to make it easier to spot outliers.

You can immediately see that many of these data sets have negative kurtosis (that is, they are more tightly clustered than the normal distribution); this is not usually a problem.  The left skew with secondary peaks to the right of the mode in every data set, however, _is_ a problem; it indicates that it is unsafe to use the mean, or any statistical test based on the mean, on this data.
Comment 89 Zack Weinberg (:zwol) 2010-01-13 16:00:07 PST
Created attachment 421534 [details]
script to do diagnosis

In case anyone is curious, here's the R script to generate the PDF I posted in the last comment.  You need the nonstandard 'reshape' and 'ggplot2' packages, and it assumes that the tarball from attachment 421337 [details] has been unpacked in a subdirectory named 'data'.
Comment 90 Joel Reymont (:joelr) 2010-01-13 16:06:10 PST
Zack, I'm no statistics maven so I don't understand the implication of what you are saying. I just ran 7 builds of Firefox 100 times each. I want to summarize those runs into a single number. Are you saying I cannot average? What should I do then?
Comment 91 Zack Weinberg (:zwol) 2010-01-13 16:55:52 PST
I'm saying the mean (what people usually mean by "average") should be not used with this data set, yes.

For summaries, I recommend using the median and half the interquartile distance instead of the mean and the standard deviation.  Those are much more robust against skew and outliers.

This also means you cannot safely use any "parametric" statistical test like ANOVA.  (You couldn't trust a two-sample test like t or chi-square with this *experimental design* anyway, regardless of the data.)  There are nonparametric alternatives; I'm looking into them, but it'll be a little more time.
Comment 92 Zack Weinberg (:zwol) 2010-01-13 16:56:30 PST
I really, really need to know whether the breakdown table in comment 86 is correct, by the way.
Comment 93 Zack Weinberg (:zwol) 2010-01-13 17:23:29 PST
To be clear, the problem with the mean is just that it's very sensitive to skew and outliers; for instance, if I strip the outliers from all the data sets, the means shift leftward by 0.02 to 0.1 sigmas, and it's not consistent among sets.  That's enough to throw a comparison off.
Comment 94 Joel Reymont (:joelr) 2010-01-13 23:14:13 PST
I believe the table in comment #86 to be correct. Anxiously waiting for the properly statistical summary matching comment #83 ;-).
Comment 95 Zack Weinberg (:zwol) 2010-01-14 15:38:49 PST
The data set joelr produced is what statistics texts call an
incomplete two-factor experiment.  There are two independent
variables: the baseline source, and the static-ness of the build.
Each of these has three 'levels' -- three possible cases.  Please
forgive the bad ASCII table:

    Source--> Tip, built now   12/09 built now   12/09 built 12/09
Config       \--------------   ---------------   -----------------
| Dynamic    |  Stock           StockTip1209       Stock1209
| Full static|  C70             C70Tip1209         Static1209
V Part static|  Static            XXX                XXX

It's an _incomplete_ experiment because we don't have data for two of
the combinations (the ones marked XXX in the table above).  We can
never get data for the "12/09 built 12/09" case in the "part static"
configuration, unless someone's got a time machine; but we could at
least get the "12/09 built now, partly static" case, which is one of
the things I asked for in comment 86.

The standard thing to do with a two-factor experiment is ANOVA.  ANOVA
attempts to fit a set of linear equations to the data, with one term
for each of the rows, one term for each of the columns, one term for
each individual cell taken as a whole, and finally an "error" term
specific to each observation.  It tests three hypotheses at once:

  * No "main effect" of the baseline source -- the differences among
    the three columns are due to chance.

  * No "main effect" of the build configuration -- the differences
    among the three rows are due to chance.

  * No "interaction" between baseline and configuration -- this is a
    little harder to explain without math, but the idea is that which
    row you're in doesn't change the effect of the column, or vice
    versa.  (Due to the missing data, we can't get this one.)

Yesterday I said that ANOVA could not be used for this data because
the samples aren't normally distributed, but it's actually more
important that the sample variances aren't equal.  This is invisible
in my graphs, but obvious from comment 83 - just glance at the
standard deviations, they're not even close to equal.  Fortunately,
there's a drop-in alternative: "analysis of robust deviance", which
uses the same linear equations but tries to minimize a different
"predictor" than ANOVA.  R lets you do this almost as easily as plain
ANOVA; you just have to install another optional library ("robust")
and use slightly different functions.

We have good reason to believe that the two independent variables in
this experiment interact, which is a problem, because we can't include
the interaction term in our linear equations without the data from the
missing cells.  What I did instead is discard the 'part static'
configuration, leaving a 3x2 complete experiment, and fit the
appropriate model to it.

> anova(model)
| Terms added sequentially (first to last)
|
|               Chisq Df RobustF     Pr(F)
| (Intercept)          1
| source               2   15.53 5.918e-05 ***
| config               1  732.51 < 2.2e-16 ***
| source:config        2  178.48 < 2.2e-16 ***

This tells us that both our choice of baseline source (old-build,
old-source-new-build, or new-source) and the configuration (dynamic or
static) have significant effects on startup time, and also that
they're not independent: one affects the other.

Next we look at the results - note that I've cleaned this up a bit:

> summary(model)
| Call: lmRob(formula = time ~ source * config, data = timings.t)

It prints back out what I told it to do in the first place.
"source*config" means to consider source, config, and their
interaction.

| Residuals:
|     Min       1Q   Median       3Q      Max
| -356.21   -76.84    -4.15    81.81  1646.54

"Residuals" are the components of the original data that could not be
explained by the model.  In other words, it takes each original data
point and subtracts off its prediction for that combination of
indepedent variables.  These look quite large, but keep in mind that
the variation within cells in the original data is enormous; we can't
hope to explain much of it with this model.

| Coefficients:
|                             Value     Std. Error t value   Pr(>|t|)
| (Intercept)                 4065.1015   12.6421   321.5516    0.0000
| sourceoldScurB               -36.1217   17.8595    -2.0225    0.0436
| sourceoldSoldB              -169.6398   17.8886    -9.4831    0.0000
| configstatic                -570.8953   18.0133   -31.6930    0.0000
| sourceoldScurB:configstatic  254.7560   25.4431    10.0128    0.0000
| sourceoldSoldB:configstatic  353.8470   25.3904    13.9362    0.0000

You should ignore the t-value and Pr(>|t|) columns on this; they don't
mean anything straightforward or particularly useful.  The interesting
part is the Value column.  The first row is the predicted startup time
for the cell it took as the default condition; by examining the other
rows we can deduce that this must be source=curScurB,config=dynamic;
i.e. "Stock".  Subsequent rows tell us the predicted *change* in
startup times due to the independent variables *in isolation*, and
then the interactions.  Contra comment 81, we seem to have gotten
*slower* at starting up since 12/09, but most of that appears to be
due to the build, not the source (compiler issue, maybe?  10.6 vs 10.5?)
The effect of a static build is quite large (nearly 600 ms faster) but
it doesn't help as much with the older builds - again, this
contradicts comment 81.

The std.err column can be thought of as a weak confidence interval for
the prediction.  There are ways to get better confidence intervals but
I don't have time to dig them up right now; I've already spent nearly
all day on this :)

| Residual standard error: 118 on 600 degrees of freedom
| Multiple R-Squared: 0.609
|
| Test for Bias:
|              statistic p-value
| M-estimate   -4.68           1
| LS-estimate 135.34           0

These are just sanity checks.  R-squared is a goodness-of-fit
statistic; 0 would mean no overall correlation between the independent
variables and the data, 1 would mean perfect correlation.  60% is
pretty darn good for data like this.

The "test for bias" is comparing the robust analysis ("M-estimate") to
conventional ANOVA ("LS-estimate"; LS stands for least squares); it
claims that conventional ANOVA would be way off on this data set.

----

I'd draw two conclusions from this: first, we have confirmed that the
"comment 70" static build configuration is significantly faster than
the stock configuration; second, current sources may in fact be
*slower* than they were a month ago.  Without knowing more about what
changed between "old source, old build" and "old source, new build" I
don't want to say anything about what might be going on there.

To know whether the "fully working components" partially dynamic build
is worthwhile, I need to see the results for it with the old sources;
or else we could just forget about the old sources and do a three-way
comparison with only one independent variable (dynamic / working
components / fully static).
Comment 96 Zack Weinberg (:zwol) 2010-01-14 15:46:31 PST
Created attachment 421717 [details]
visualization of statistical model

This chart might be of interest - it's a visualization of the model I described in the previous comment.  The box plots are computed directly from the raw data.  The solid lines show the cell predictions made by a robust linear model (aka "robust ANOVA"), and the dotted lines show the predictions of a standard least-squares linear model with the same structure.  You can see that the robust model pretty much nails the median value for all cells, whereas the standard model is consistently biased higher -- this is exactly what one would expect for least squares on skew-left data with high outliers.

Note that *for this data set* it doesn't really matter that much, because the cell differences are so large; we could probably have gotten away with regular ANOVA.  (The central limit theorem is a friend to us all.) However, I tend to reach for the robust and/or nonparametric statistics regardless, because it means I don't have to worry about it :)
Comment 97 Joel Reymont (:joelr) 2010-01-15 02:11:52 PST
Created attachment 421794 [details]
Static build, 12/09/09 sources

Missing data for comment #86
Comment 98 Zack Weinberg (:zwol) 2010-01-15 12:35:56 PST
Created attachment 421891 [details]
analysis script for fully static vs "working components" configuration

Thanks, Joel! Now I can look at the differences between the fully-static and "working components" configurations.  The attached script does the whole thing. It spits out two graphs which I'll attach separately, plus a bunch of printouts.

This analysis considers only the "built now" data, since we still don't have the "built 12/09" data for the working-components configuration (nor will we ever have it).  Thus, we again have a 3x2 experimental design, but now it's the configuration that has three possible levels.  The model looks just the same, though.

> model <- lmRob(time ~ source * config, data=timings.f)
> anova(model)

Terms added sequentially (first to last)

              Chisq Df RobustF   Pr(F)    
(Intercept)          1                    
source               1      41 8.1e-11 ***
config               2     642 < 2e-16 ***
source:config        2      99 < 2e-16 ***

Again, we have highly significant effects of both the source (december or current) and the configuration (dynamic, partial, static), and an interaction between them.

The full summary is printed by the script, but i'm not going to go over the whole thing, only the coefficient table (note that I have manually converted
it from scientific notation to normal; also, I probably should have rounded the numbers more aggressively).

Coefficients:
                               Value  Std. Error  t value   Pr(>|t|)
(Intercept)                   4065.5        13.7   295.7    0.
sourcedecember                 -35.7        19.5    -1.83   0.0677
configpartial                 -337.2        19.5   -17.3    0.
configstatic                  -568.9        19.6   -29.0    0.
sourcedecember:configpartial   233.1        27.8     8.38   4.44e-16
sourcedecember:configstatic    251.1        27.8     9.03   0.

The baseline timing (labeled "(Intercept)") is for current sources in the dynamic configuration.  Switching to December sources speeds things up slightly, consistent with yesterday's results.  The partial-static configuration, applied to current sources, speeds things up by 340ms, and the fully-static configuration speeds things up by 570ms.  December sources are *slower* in either the partial-static or full-static configuration, by roughly the same amount.

I picked five pairwise comparisons to do on top of this analysis.  When you do this you have to be careful to avoid an inflated risk of false positives -- if you set your confidence level at 95% and do 20 tests, at least one of them should appear to be true just by chance.  There are systematic ways of avoiding this, but for a simple setup like this one, we can just pick a tighter confidence level, like 99% or 99.9%.

There are several excellent "nonparametric" tests for pairwise comparison, so we don't have to worry about our sample distributions nearly as much as we did with the ANOVA.  I used the Wilcoxon rank-sum test (aka Mann-Whitney U), which is a good default choice.

1) Is there a significant performance difference between the December code and the current code, in the stock (dynamic) configuration?

> with(timings.c, wilcox.test(current.dynamic, december.dynamic)
data:  current.dynamic and december.dynamic 
W = 5832, p-value = 0.04231
alternative hypothesis: true location shift is not equal to 0

The p-value is just barely below the usual 95% confidence threshold.  Since we are doing multiple comparisons, I'd be inclined to be pickier, and say that we cannot prove a performance difference.

2) Which configurations are significantly different from each other? Remember that the robust ANOVA only tells us that _some_ configuration change makes a difference, not which one it is.  To answer this we need to compare both "dynamic" and "static" to "partial", for both levels of "source".  We don't need to compare "dynamic" directly to "static", because we know that "partial" lies between them, so if "partial" is significantly separated from both, they must be significantly separated from each other.

I've reformatted the test results into a table:

                           W      Pr(>W)
Current  dynamic/partial   9540   < 2.2e-16
         partial/static    8456   < 2.2e-16
December dynamic/partial   6636     6.4e-05
	 partial/static    8715   < 2.2e-16

Even with a multiple-comparisons correction to our confidence threshold, those look conclusively separated -- the part-static configuration is definitely faster than the full-dynamic configuration, and the full-static configuration is faster than either of them, regardless of the age of the source.
Comment 99 Joel Reymont (:joelr) 2010-01-15 12:36:35 PST
Created attachment 421893 [details] [diff] [review]
static libxul build

Should work on Mac, Linux and Maemo.
Comment 100 Zack Weinberg (:zwol) 2010-01-15 12:39:41 PST
Created attachment 421895 [details]
chart 1: predictions vs data

This graph (warning: it's very tall) plots the modeled predictions for each condition over a jittered dotplot of the raw data.  It visually illustrates what I said in the previous comment: dynamic configurations are about the same from December to now, the part-static configuration speeds things up, the full-static configuration speeds it up even more, and current sources are faster than December in either static config.

I find it very strange that the dynamic configuration doesn't show a performance difference between December and now.  That might be worth further investigation (debugging, not statistics).
Comment 101 Joel Reymont (:joelr) 2010-01-15 12:44:28 PST
> I find it very strange that the dynamic configuration doesn't show a
performance difference between December and now.  That might be worth further
investigation (debugging, not statistics).

Have we landed Mac optimizations in December?

Does Talos show an improvement?

This is all on the Mac, I think we should run the same batch of tests on Linux and Maemo, current sources only, static with working components vs. stock dynamic.

I don't think it's worth considering the fully static version that breaks components. Unless >500ms make it work our while.
Comment 102 Zack Weinberg (:zwol) 2010-01-15 12:48:57 PST
Created attachment 421896 [details]
chart 2: q-q plot of residuals

This is what's called a "quantile-quantile" plot of all the residuals, regardless of test condition.  If the data were normally distributed within each cell, all the purple dots would lie exactly on the black line.  It's not, so they don't.

I'm throwing this in because it shows some interesting structure.  All the negative residuals are pretty close to the line, but the positive residuals are on a steeper curve, with at least two kinks in it.  I'd speculate that not all the test cycles are getting the same benefit from some/all of the computer's caches, because falling out of this or that cache tends to put you on a different performance _curve_, hence the kinks.
Comment 103 Zack Weinberg (:zwol) 2010-01-15 12:51:33 PST
(In reply to comment #101)
> > I find it very strange that the dynamic configuration doesn't show a
> > performance difference between December and now.
> 
> Have we landed Mac optimizations in December?
> Does Talos show an improvement?

Dunno.

> This is all on the Mac, I think we should run the same batch of tests on Linux
> and Maemo, current sources only, static with working components vs. stock
> dynamic.

Yes.

On Linux it would also be interesting to compare stock-dynamic to stock-dynamic with the 'prelink' utility run over every .so loaded into the runtime image (both from our code and from the system).

> I don't think it's worth considering the fully static version that breaks
> components. Unless >500ms make it work our while.

Concur.
Comment 104 Zack Weinberg (:zwol) 2010-01-15 12:54:24 PST
(In reply to comment #103)
> 
> On Linux it would also be interesting to compare stock-dynamic to stock-dynamic
> with the 'prelink' utility run over every .so loaded into the runtime image
> (both from our code and from the system).

Come to think of it, Maemo is also an ELF-based OS, isn't it?  If so, the optimization that 'prelink' does should apply there too.
Comment 105 Joel Reymont (:joelr) 2010-01-18 07:43:08 PST
Created attachment 422208 [details] [diff] [review]
static libxul build

Should build on Mac OSX 10.5.2, 10.6.x, Linux and Maemo (xulrunner).

Maemo build fails on the try server because tests are enabled and Mac OSX 10.5.2 fails on the try server for some strange reason.
Comment 106 Joel Reymont (:joelr) 2010-01-20 05:44:22 PST
Created attachment 422526 [details]
raw timings data on snow leopard

Included are the timings for stock, static and libxul static builds running on Snow Leopard and 4200rpm external USB, 5400rpm and 7200rpm disks.
Comment 107 Joel Reymont (:joelr) 2010-01-20 05:46:49 PST
5400rpm is paired with a 2Ghz CoreDuo
7200rpm and 4200rpm are for a 2.33Ghz Core2Duo 

I included timings for 2.93Ghz Core2Duo and SSD as well but I can't fit them into the analysis for some reasons. R is giving me errors.
Comment 108 Joel Reymont (:joelr) 2010-01-20 06:04:37 PST
Created attachment 422528 [details]
raw timings data on snow leopard, now with results

Same as comment #106 and comment #107 but now with results for those of you who don't have R installed.
Comment 109 Joel Reymont (:joelr) 2010-01-20 08:43:53 PST
Created attachment 422547 [details]
startup performance chart, most instructive!

This chart shows how our startup performance changes depending on hard drive speed and build type. 

Note how the 100 timings for each build type (shown in color) cluster more and more tightly depending on disk speed.

Note also that a static build consistently blows both the stock build and the libxul static build out of the water!
Comment 110 Zack Weinberg (:zwol) 2010-01-20 11:34:09 PST
The errors with the SSD data included were due to a bug in lmRob(), for which I managed to find a fix on the internets. Here's what I get. Note that I reorganized the data table and the model a bit.

> model <- lmRob(time ~ disk*build, data=timings.f)

I put 'disk' first in the model because we know from Joel's chart that that's the biggest effect. This doesn't change the predictions, but it does change what we get out of anova() --

> anova(model)
| Terms added sequentially (first to last)
|
|             Chisq Df RobustF     Pr(F)
| (Intercept)        1
| disk               3  4925.8 < 2.2e-16 ***
| build              2   208.1 < 2.2e-16 ***
| disk:build         6   312.7 < 2.2e-16 ***

Compare:

> anova(lmRob(time ~ build*disk, data=timings.f))
|             Chisq Df RobustF  Pr(F)
| (Intercept)        1
| build              2     1.1 0.2774
| disk               3  4933.0 <2e-16 ***
| build:disk         6   312.7 <2e-16 ***

It fails to find a significant effect for 'build' unless we subtract out the effects of 'disk' first.

Let's have a look at the coefficients. Yes, I edited the output a whole bunch; I don't know any way of persuading R not to spit out a bunch of unreadable numbers in scientific notation. Sorry.

> summary(model)
[...]
|                      Value      Std. Error t value    Pr(>|t|)
| (Intercept)           4075.      8.103     502.8     tiny

The 'value' on this row is the estimated time for a stock build to start up on the slowest disk (4200rpm), in milliseconds.

| disk5400             -1627.      9.925    -164.0     tiny
| disk7200             -2250.      9.772    -230.3     tiny
| diskssd              -3303.      9.741    -339.1     tiny

The values in these rows are the estimated *change* in time, relative to the first row, for the stock build to start up on one of these three faster disks. No surprises here - startup is I/O bound, faster disks are faster, and SSD is faster yet.

| buildlibxul          -   5.049  10.79     -  0.4681  0.6398
| buildstatic          - 236.8    11.24     - 21.06    tiny

Again, estimated change in time, relative to the first row, for one of these two modified builds to start up on the slowest disk. The "libxul" configuration is not notably faster, the "static" configuration is about 230ms faster overall. (Let me remind everyone that the "t value" and "Pr(>|t|)" columns do not tell us anything especially useful - but whether or not an improvement of 5ms is *significant*, I hope we can all agree that it's *peanuts*.)

| disk5400:buildlibxul   121.0    13.45        8.996   tiny
| disk7200:buildlibxul -  63.20   13.21     -  4.784   1.938e-06
| diskssd:buildlibxul  -  10.06   13.19     -  0.7624  0.4460

These are additional adjustments for the libxul configuration on the faster disks. Note the lack of a consistent trend. It's *slower* than stock on the 5400rpm disk, but faster again (by not much) on the 7200rpm and SSD machines.

| disk5400:buildstatic   216.7    13.81       15.69    tiny
| disk7200:buildstatic   120.3    13.58        8.861   tiny
| diskssd:buildstatic    197.8    13.53       14.62    tiny

These numbers mimic the pattern of the libxul configuration, but with different offsets.  That means it's not just the libxul configuration bouncing around zero due to noise; there's actually some reason why this pattern happens.  I'll come back to this later.

Without doing any graphs or specific comparisons, at this point I think we can draw three conclusions:

1) Disk performance matters a lot.
2) The "libxul" configuration does not provide enough of an improvement to bother with.
3) The "static" configuration *may* be worth bothering with, but only on machines with slow disks.  Can we get an estimate of the proportion of our userbase on such machines?

Visualizations and code to follow.
Comment 111 Zack Weinberg (:zwol) 2010-01-20 13:27:21 PST
Created attachment 422606 [details]
diagnostic plot of data from comment 108

I generated three plots from the data and I'm going to go through them one by one.

First off, this is the same diagnostic plot I did for the previous set of data.  This time, the panels are all labeled and sorted!  Each column corresponds to a test disk, and each row to a build configuration.

The interesting thing about this chart is what happens as we go left to right -- the Gaussian variation drops, but we still get outliers of about the same relative magnitude.  What I think this means is, the big source of Gaussian variation in these tests is mechanical variation in hard drive behavior.  This will naturally shrink as total access time goes down (i.e. as the disk spins faster) and it makes sense that the SSD configuration has far less of that than the others.  However, even in the SSD configuration we get a scatter of points to the right of the main peak, and in the others there are significant second modes to the data.  That, I'm gonna blame on the operating system not being perfectly idle when the tests are running.
Comment 112 Zack Weinberg (:zwol) 2010-01-20 13:29:17 PST
Created attachment 422607 [details]
overall prediction with data

This is the same as Joel's "startup performance chart" except that it includes the SSD data, since I was able to fix lmRob() to handle it.

The major trend here, due to the disk, is so strong that I don't think we can draw any conclusions from this graph other than "disk matters".  Moving on...
Comment 113 Zack Weinberg (:zwol) 2010-01-20 13:42:04 PST
Created attachment 422614 [details]
prediction with effect of 'disk' removed

The thing to do when you've got a trend that interferes with seeing anything else is, of course, to subtract off that trend.  That's this graph.  Again, the points are individual observations (two extreme outliers are not shown on this scale, being above the vertical maximum) and the lines are the model's predictions.  The gray areas are a (feeble) measure of uncertainty in said predictions.

You will notice that there is something odd going on here: the "dynamic" build is right on top of the "libxul" build for the 4200rpm disk, but is right on top of the "static" build for the 5400rpm disk, and then goes back to being the slowest for the 7200rpm disk!  Meantime, the "libxul" and "static" builds show more or less the same trends relative to each other, but both of them are *worse* on the 5400rpm system than the 4200rpm system.

I think what's going on here is that two effects are conflated.  First, there's this nice predictable reduction in the difference between the three builds as the disk gets faster.  That's to be expected, and that's why I said before that the "static" configuration might be worth it for very slow disks but not otherwise.

The other thing that we're seeing is the effect of different CPUs, which is conflated with the disk effect.  Joel, correct me if I'm wrong on this:

4200rpm disk -- 2.33Ghz CPU
5400rpm      -- 2.00Ghz
7200rpm      -- 2.33Ghz
SSD          -- 2.93Ghz

Now, it can't be quite as simple as "the slower CPU makes everything slower" because the dynamic build actually got *faster* on that platform, but that could be something wacky, like different cache or read-ahead behavior.

----

Only on the very slowest disk does the performance difference approach 200ms, and I'd argue that for a change this invasive, we ought to be getting at least 200ms -- maybe even more -- for it to be worth it.
Comment 114 Zack Weinberg (:zwol) 2010-01-20 13:45:24 PST
Created attachment 422615 [details]
R scripts and raw data

Here's a zip file with all the scripting and data.  Note: R doesn't know how to do internal compression of PDFs, so the script shells out to the 'qpdf' utility to do that.  You can just comment out all the system() lines if you don't have that program; you will then get larger PDFs with "-u" suffixes on their names.
Comment 115 Joel Reymont (:joelr) 2010-01-21 08:41:52 PST
Created attachment 422766 [details]
Linux raw data, 7200rpm, 5400rpm and 4200rpm over USB2

This matches the Mac OSX ran of yesterday.
Comment 116 Joel Reymont (:joelr) 2010-01-21 08:47:41 PST
Created attachment 422767 [details]
startup performance chart for Linux, even more instructive than Mac OSX!

Linux needs our help!!!
Comment 117 Zack Weinberg (:zwol) 2010-01-21 10:48:44 PST
I don't have time to do much on this today, but it's very interesting that the "libxul" configuration seems to be so much faster than *either* stock or static on Linux, when it did hardly any good on OSX.

Joel, could you explain in detail, but without reference to the innards of our build system, exactly what the "libxul" configuration is and how it differs from the "static" configuration?  I don't understand the build system, so I can't figure it out from your patches.
Comment 118 Joel Reymont (:joelr) 2010-01-21 12:47:03 PST
It's clear that the way to speed up Mac startup is to buy a faster disk.

I think we can get Linux to be as fast as Mac with a Mac-style 'dyld shared library cache'. The number of system libraries we link against on Linux is huuuuuge!

I'll time startup on Linux/SSD tonight and will compare the static and libxul static patches tomorrow. I honestly don't know what the difference is and I'm surprised myself. Static libxul is a 'lighter' patch for sure.
Comment 119 Joel Reymont (:joelr) 2010-01-22 09:20:45 PST
Created attachment 423004 [details]
Linux raw data, now with SSD
Comment 120 Joel Reymont (:joelr) 2010-01-22 09:22:39 PST
Created attachment 423007 [details]
Startup performance chart for Linux, now with SSD

All I can say is... holy cow!
Comment 121 Zack Weinberg (:zwol) 2010-01-22 09:40:52 PST
Which filesystem do you have on those Linux installs?
Comment 122 Joel Reymont (:joelr) 2010-01-22 10:50:31 PST
Zack, I think it's Ext4, the one Ubuntu 9.10 selects by default.

Why do you ask?

Also, can you perform your excellent and rigorous statistical analysis on Linux data when you have a chance? I'm sure everyone will appreciate it, specially me!
Comment 123 Zack Weinberg (:zwol) 2010-01-22 11:27:52 PST
(In reply to comment #122)
> Zack, I think it's Ext4, the one Ubuntu 9.10 selects by default.
> Why do you ask?

Could you check for sure?  (just run /bin/mount from the command line with no arguments and post the result)

I ask because filesystem differences seem like a credible explanation for why the performance is so much slower on Linux than OSX, but only with rotating disks.  I don't think a difference measured in *seconds* can be explained by extra seeks due to additional shared libraries, or anything like that; but if the OS is introducing a bunch of pauses to flush the filesystem log, for instance, that could do it.

(Concrete request: if it turns out to be ext2 or ext3 that you're using, repeat the tests with ext4 or btrfs.  Yes, I realize this involves reinstallation.  Sorry.)

> Also, can you perform your excellent and rigorous statistical analysis on Linux
> data when you have a chance? I'm sure everyone will appreciate it, specially
> me!

Am head down on stuff I should have done last month for the CSS parser; may not get to it till Monday.
Comment 124 Joel Reymont (:joelr) 2010-01-22 11:58:27 PST
I'm using Ext4, the Ubuntu 9.10 default. 

I would attribute the differences to the -large- number of dynamic libraries we are linking against on Linux and the lack of a Snow Leopard-style dyld dynamic library cache. The latter is a single file that has all the system frameworks (libraries), is "pre-linked" and gets mapped into application memory space.

I did a lot of work on figuring out what happens during startup on Mac OSX. Perhaps I should redo that on Linux. I think we should thoroughly investigate things that we don't understand. Taras is driving this bug, though, so he should probably say whether that's something I should tackle next.
Comment 125 Joel Reymont (:joelr) 2010-01-25 23:20:31 PST
Zack, to answer the questions you had, the stock builds do have libxul enabled so you can safely do the Linux analysis now ;-).
Comment 126 Joel Reymont (:joelr) 2010-01-25 23:25:56 PST
Comment on attachment 422208 [details] [diff] [review]
static libxul build

Kicking off the review process...
Comment 127 Zack Weinberg (:zwol) 2010-01-27 12:26:40 PST
Finally got 'round to statistics.  Using basically the same script as for the OSX analysis...

> model <- lmRob(time ~ disk * build, data=timings)
> print(anova(model))
|
| Terms added sequentially (first to last)
|
|             Chisq Df RobustF  Pr(F)    
| (Intercept)        1                   
| disk               3    3647 <2e-16 ***
| build              2    1420 <2e-16 ***
| disk:build         6     982 <2e-16 ***

No real surprise here: both independent variables are highly significant, and so is their interaction.  Let's have a look at the coefficients... (massively reformatted as usual)

| Coefficients:
|                      Value      Std. Error
| (Intercept)           6953.      7.297
| disk5400               263.1    10.46
| disk7200             -1206.     10.43
| diskssd              -5319.     10.23
| buildlibxul          - 762.5    10.47
| buildstatic          - 127.1    10.54
| disk5400:buildlibxul - 202.4    14.85
| disk7200:buildlibxul    62.56   14.74
| diskssd:buildlibxul    457.9    14.55
| disk5400:buildstatic -   1.607  14.95
| disk7200:buildstatic     1.754  15.05
| diskssd:buildstatic     69.69   14.68

Here again, as on OSX, we see the phenomenon that the 5400rpm disk is actually *slower* across the board than the 4200rpm disk.  That's probably the slower CPU making itself known.  The 7200rpm disk is faster as expected.  The SSD is a whole lot faster, and again that's going to be partially disk and partially CPU.

The interactions are a little confusing on this chart, but I think what they mean is, the static configurations don't make as much of a difference with the faster disks, across the board.  That's not surprising.

I'm not going to do any pairwise comparisons here because we are getting into enough experimental cells that we really need something like Tukey's HSD method, and R doesn't let me do that on an lmRob model.  Also, the interesting comparison would be "is libxul-static consistently faster than dynamic" and you don't need any spiffy statistics to say yes, it's obvious from the chart -- there's almost no overlap in the data sets at all.
Comment 128 Zack Weinberg (:zwol) 2010-01-27 12:59:53 PST
Created attachment 423839 [details]
Startup performance chart for Linux with the effects of disk removed

Joel already posted the overview chart I would have generated, so here's a chart with the main trend (the speed improvement due to faster disk) taken out.  I did this slightly differently than last time.  The "dynamic" build is the reference point, so the line for that is intentionally a straight horizontal line at 0.  More negative is better.

If there were no interaction between build configuration and "disk" at all, the other two trend lines would also be horizontal.  They're not.  The other more-or-less natural thing to expect would be that a faster disk makes this change less worthwhile; in that case, the other two trend lines would monotonically approach zero from below as we moved rightward.  There's some of that here (most obvious in the SSD config) but not as much as you might think.

What jumps out at me here is that the "libxul" trend line is in almost the same place for the 4200rpm and 7200rpm configurations, but displaced *downward* (i.e. *more* perf win) for the 5400rpm config, and radically upward for the SSD config.  I think this is a CPU effect: recall that the 4200 and 7200rpm machines both have 2.33Ghz CPUs (or maybe they're even the same computer?), whereas the 5400rpm machine has a 2.00Ghz CPU and the SSD machine has a 2.93Ghz CPU.  If we are seeing, for instance, the effect of radically reduced overhead due to most of the code no longer being compiled position-independently, it makes sense that that would be more of a win on slower CPUs.

(The "static" trend line actually has the same shape, but you can't see it in the graph because the difference is only on the order of 2ms.)
Comment 129 Zack Weinberg (:zwol) 2010-01-27 13:06:29 PST
Created attachment 423841 [details]
complete analysis of comment 115 data

This tarball has all the data from comment 115 plus the revised analysis scripts and the full set of charts (including the diagnostic ones I didn't bother posting).

One thing I want to call out is that Linux is much better than OSX at not disturbing the results with other system activity -- the variances are much lower across the board.  There are still some nasty (>6sigma) outliers though.
Comment 130 Justin Dolske [:Dolske] 2010-01-27 17:11:44 PST
(In reply to comment #127)

> Here again, as on OSX, we see the phenomenon that the 5400rpm disk is actually
> *slower* across the board than the 4200rpm disk.

What's the capacity of these two drives?

Waiting for a needed sector to rotate around to the head is obviously a function of RPM alone, but data transfer rate is a function of both RPM and bit density... A 1GB 7200rpm drive would be much slower in that respect than a 1TB 5400rpm drive. How relevant this is depends on the I/O mix of the test. :)
Comment 131 Joel Reymont (:joelr) 2010-01-27 23:17:54 PST
4200rpm - 80gb
5400rpm - 120gb
7200rpm - 200gb
SSD     - 256gb
Comment 132 Joel Reymont (:joelr) 2010-02-03 07:40:26 PST
Created attachment 424991 [details] [diff] [review]
static libxul build with windows support

Linux and Maemo fail on the try server, I'll fix this shortly.
Comment 133 Joel Reymont (:joelr) 2010-02-05 06:16:32 PST
Created attachment 425458 [details] [diff] [review]
Builds on Linux, Maemo, Mac and Windows

WinMo fails on the try server, missing pthread.h

Maemo fails on the try server because it insists on building tests.
Comment 134 Joel Reymont (:joelr) 2010-02-05 06:25:34 PST
I take it back, the pthreads issue on WinMo is a configuration bit I have to fix.

I don't know why Maemo on the try server overrides my --disable-tests setting.
Comment 135 Joel Reymont (:joelr) 2010-02-08 09:50:46 PST
Created attachment 425823 [details] [diff] [review]
Builds everywhere
Comment 136 Ted Mielczarek [:ted.mielczarek] 2010-02-11 06:38:21 PST
Comment on attachment 425823 [details] [diff] [review]
Builds everywhere

>diff --git a/browser/app/Makefile.in b/browser/app/Makefile.in
>index 8e04667..31f9002 100644
>--- a/browser/app/Makefile.in
>+++ b/browser/app/Makefile.in
>@@ -103,6 +103,9 @@ else
> STATIC_COMPONENTS_LINKER_PATH = -L$(DEPTH)/staticlib
> endif
> LIBS += $(DEPTH)/toolkit/xre/$(LIB_PREFIX)xulapp_s.$(LIB_SUFFIX)
>+ifeq (cocoa,$(MOZ_WIDGET_TOOLKIT))
>+LIBS += -framework SystemConfiguration -framework OpenGL
>+endif

We probably want these in OS_LIBS, not LIBS, as per bug 543079.

>+ifndef BUILD_STATIC_LIBS
> ifdef MOZ_ENABLE_LIBXUL
> APP_XPCOM_LIBS = $(XPCOM_GLUE_LDOPTS)
> else
> MOZILLA_INTERNAL_API = 1
> APP_XPCOM_LIBS = $(XPCOM_LIBS)
> endif
>+else
>+MOZILLA_INTERNAL_API = 1
>+APP_XPCOM_LIBS =
>+endif

You don't need to set APP_XPCOM_LIBS to an empty value, unset variables expand to nothing in make.

>+ifdef BUILD_STATIC_LIBS
>+# dependent libraries
>+ifneq (,$(MOZ_ENABLE_GTK2))
>+SHARED_LIBRARY_LIBS += \
>+  $(DEPTH)/embedding/browser/gtk/src/$(LIB_PREFIX)gtkembedmoz.$(LIB_SUFFIX)
>+DEFINES += -DMOZ_ENABLE_GTK2
>+endif

Why are you using SHARED_LIBRARY_LIBS here? This file links a PROGRAM, not a SHARED_LIBRARY.

>+
>+SHARED_LIBRARY_LIBS += \
>+	$(DEPTH)/toolkit/xre/$(LIB_PREFIX)xulapp_s.$(LIB_SUFFIX) \
>+	$(NULL)

Also, FYI, our current style for Makefiles is to indent line continuations by two spaces, so the hunk before this is correct, but this one is wrong. (I realize you probably copy/pasted them both, but please fix them while you're here.)

> ifdef BUILD_STATIC_LIBS
> include $(topsrcdir)/config/static-config.mk
> 
>+STATIC_EXTRA_LIBS += $(DEPTH)/dist/lib/$(LIB_PREFIX)mozz.$(LIB_SUFFIX)
>+STATIC_CPPSRCS =

Why are you unsetting this variable instead of just changing its definition in static-config.mk?

>+
>+ifeq ($(OS_ARCH), Darwin)
>+MOZ_OPTIMIZE_LDFLAGS =
>+endif

What is this change for?

>+
>+ifdef MOZ_ENABLE_DBUS
>+LIBS += $(MOZ_DBUS_GLIB_LIBS)
>+endif
>+
>+ifdef MOZ_PLATFORM_HILDON
>+LIBS += $(LIBHILDONMIME_LIBS) $(LIBHILDONFM_LIBS)
>+endif 
>+
>+ifdef MOZ_ENABLE_LIBCONIC
>+LIBS += $(LIBCONIC_LIBS)
>+endif
>+
>+ifdef NS_OSSO
>+LIBS += $(LIBOSSO_LIBS)
>+endif

bug 543976 consolidated these defines into MOZ_PLATFORM_MAEMO and MOZ_PLATFORM_MAEMO_LIBS.

>+
>+ifdef MOZ_ENABLE_DBUS
>+LIBS += $(MOZ_DBUS_GLIB_LIBS)
>+endif

You already have this block above.

>+
>+ifeq (,$(filter WINNT WINCE,$(OS_ARCH)))
>+FORCE_LOAD_LIBS= \
>+	$(DIST)/lib/$(LIB_PREFIX)xul.$(LIB_SUFFIX) \
>+	$(DIST)/lib/$(LIB_PREFIX)xpcom_core.$(LIB_SUFFIX) \
>+	$(NULL)
>+else
>+LIBS += \
>+	/OPT:NOREF \
>+	$(DIST)/lib/$(LIB_PREFIX)xul.$(LIB_SUFFIX) \
>+	$(DIST)/lib/$(LIB_PREFIX)xpcom_core.$(LIB_SUFFIX) \
>+	/OPT:REF \
>+	$(NULL)

This block can't possibly work, can it? MSYS does path translation on anything starting with a /, so it should mangle those /OPT. Visual C++ accepts all its options as -FOO as well, so you should be able to use -OPT:NOREF.

>+ifeq ($(OS_ARCH),Linux)
>+LDFLAGS += -rdynamic
>+endif

Does this wind up exporting only our non-hidden symbols? I can't find any documentation on that.

>diff --git a/browser/components/build/Makefile.in b/browser/components/build/Makefile.in
>index 4538754..72d53d0 100644
>--- a/browser/components/build/Makefile.in
>+++ b/browser/components/build/Makefile.in
>@@ -86,4 +86,19 @@ EXTRA_DSO_LDOPTS += $(LIBXUL_DIST)/lib/$(LIB_PREFIX)thebes.$(IMPORT_LIB_SUFFIX)
> endif
> endif
> 
>+ifdef BUILD_STATIC_LIBS
>+ifeq ($(OS_ARCH), Darwin)
>+LIBXUL_LIBS=$(XPCOM_FROZEN_LDOPTS) -lobjc
>+XPCOM_LIBS=$(XPCOM_FROZEN_LDOPTS) -lobjc
>+endif
>+endif

-lobjc winds up in LDFLAGS in bug 543081, I don't think you need it here.

>+
>+ifdef BUILD_STATIC_LIBS
>+ifeq ($(OS_ARCH),WINNT)
>+EXTRA_DSO_LDOPTS += \
>+	$(DEPTH)/$(MOZ_BUILD_APP)/app/$(LIB_PREFIX)$(MOZ_APP_NAME).$(IMPORT_LIB_SUFFIX)
>+endif
>+endif

You can just hardcode browser/app here as the directory, since this is in browser/components.

>diff --git a/browser/confvars.sh b/browser/confvars.sh
>index 56f2986..243d554 100755
>--- a/browser/confvars.sh
>+++ b/browser/confvars.sh
>@@ -41,7 +41,10 @@ MOZ_UPDATER=1
> MOZ_PHOENIX=1
> 
> MOZ_ENABLE_LIBXUL=1
>-MOZ_STATIC_BUILD_UNSUPPORTED=1
>+BUILD_STATIC_LIBS=1
>+MOZ_IPC=
>+ENABLE_TESTS=

As previously discussed, you can't disable tests here. We want to keep the default setting to be --enable-tests. Also, what is that MOZ_IPC setting doing there? We clearly need to be able to build with IPC, since OOPP is on by default.

I'm also not happy with having MOZ_ENABLE_LIBXUL set in a static build. It's going to be confusing if we can't effectively distinguish between static and libxul builds. If you need some things to be shared between the two build configurations, I'd suggest you invent a new variable and replace whatever MOZ_ENABLE_LIBXUL instances you're relying on with it. (And then make sure that both --enable-libxul and --enable-static set it in configure.)


>diff --git a/config/config.mk b/config/config.mk
>index 1499cc4..f29c74c 100644
>--- a/config/config.mk
>+++ b/config/config.mk
>@@ -349,6 +349,7 @@ endif
> # Disable PIC if necessary
> #
> 
>+ifndef BUILD_STATIC_LIBS
> ifndef _ENABLE_PIC
> DSO_CFLAGS=
> ifeq ($(OS_ARCH)_$(HAVE_GCC3_ABI),Darwin_1)
>@@ -357,6 +358,7 @@ else
> DSO_PIC_CFLAGS=
> endif
> endif
>+endif

What is this change for?

Also, just FYI, if you make changes to config/{config,rules}.mk, you need to make the same changes in js/src/config/{config,rules}.mk. You can just copy the files over wholesale, they're intended to always be copies of each other. 

>diff --git a/config/static-config.mk b/config/static-config.mk
>index eba4edf..d66b59f 100644
>--- a/config/static-config.mk
>+++ b/config/static-config.mk
>@@ -117,3 +117,11 @@ STATIC_EXTRA_LIBS += $(call EXPAND_LIBNAME,odm cfg)
> endif
> 
> LOCAL_INCLUDES += -I$(topsrcdir)/config
>+
>+# 
>+# Force whole loading of static libraries
>+#
>+
>+FORCE_LOAD_CONTENTS = $(addsuffix .contents, $(FORCE_LOAD_LIBS))
>+GARBAGE += $(FORCE_LOAD_CONTENTS)

You want GARBAGE_DIRS, since this is a dir.

>+EXTRA_DEPS += $(FORCE_LOAD_CONTENTS)
>diff --git a/config/static-rules.mk b/config/static-rules.mk
>index f4842db..d66310a 100644
>--- a/config/static-rules.mk
>+++ b/config/static-rules.mk
>@@ -23,3 +23,23 @@ endif
> ifeq (cocoa,$(MOZ_WIDGET_TOOLKIT))
> LIBS	+= -framework QuickTime -framework IOKit -lcrypto
> endif
>+
>+# 
>+# Force whole loading of static libraries
>+#
>+
>+$(DIST)/lib/$(LIB_PREFIX)%.$(LIB_SUFFIX).contents: $(DIST)/lib/$(LIB_PREFIX)%.$(LIB_SUFFIX)
>+	rm -fr $@
>+	mkdir $@
>+	(cd $@ && $(AR_EXTRACT) ../$<)

You don't need the parens, each line is executed in its own subshell.

>+
>+ifneq (,$(filter OSF1 BSD_OS FreeBSD NetBSD OpenBSD SunOS Darwin,$(OS_ARCH)))
>+CLEANUP1	:= | egrep -v '(________64ELEL_|__.SYMDEF)'
>+CLEANUP2	:= rm -f ________64ELEL_ __.SYMDEF
>+else
>+CLEANUP2	:= true
>+endif
>+
>+FORCE_LOAD_OBJS := $(foreach lib, $(FORCE_LOAD_LIBS), $(addprefix $(lib).contents/, $(shell $(AR_LIST) $(lib) $(CLEANUP1))))
>+GARBAGE += $(FORCE_LOAD_OBJS)
>+LIBS += $(FORCE_LOAD_OBJS)

This is mostly copied from that dtrace section in rules.mk, right? Is there any way you can share that machinery rather than duplicating most of it?

>diff --git a/configure.in b/configure.in
>index 0c7d471..8daeadc 100644
>--- a/configure.in
>+++ b/configure.in
>@@ -7478,10 +7478,6 @@ if test -n "$MOZ_STATIC_BUILD_UNSUPPORTED" -a -n "$BUILD_STATIC_LIBS"; then
> 	AC_MSG_ERROR([--enable-static is not supported for building $MOZ_APP_NAME. You probably want --enable-libxul.])
> fi
> 
>-if test -n "$MOZ_ENABLE_LIBXUL" -a -n "$BUILD_STATIC_LIBS"; then
>-	AC_MSG_ERROR([--enable-libxul is not compatible with --enable-static])
>-fi

As I said before, I'm not comfortable with having MOZ_ENABLE_LIBXUL defined for a static build, it's really confusing.

>diff --git a/db/sqlite3/src/Makefile.in b/db/sqlite3/src/Makefile.in
>index 08e6f0f..4db8518 100644
>--- a/db/sqlite3/src/Makefile.in
>+++ b/db/sqlite3/src/Makefile.in
>@@ -48,7 +48,11 @@ include $(DEPTH)/config/autoconf.mk
> MODULE           = sqlite3
> LIBRARY_NAME     = mozsqlite3
> SHORT_LIBNAME    = mozsqlt3
>+ifndef BUILD_STATIC_LIBS
> FORCE_SHARED_LIB = 1
>+else
>+DIST_INSTALL = 1
>+endif

Why are we dist installing this lib? Surely we don't want people to link to our sqlite static lib.

>diff --git a/modules/libpr0n/decoders/icon/Makefile.in b/modules/libpr0n/decoders/icon/Makefile.in
>index 2cb4a48..22b82ea 100644
>--- a/modules/libpr0n/decoders/icon/Makefile.in
>+++ b/modules/libpr0n/decoders/icon/Makefile.in
>@@ -56,6 +56,14 @@ LIBXUL_LIBRARY = 1
> EXPORT_LIBRARY = 1
> endif
> 
>+ifdef BUILD_STATIC_LIBS
>+USE_STATIC_LIBS = 1
>+FORCE_STATIC_LIB = 1
>+FORCE_SHARED_LIB =
>+EXPORT_LIBRARY = 1
>+MOZILLA_INTERNAL_API = 1
>+endif

It seems like you ought to just ifdef the setting of FORCE_SHARED_LIB up above instead of resetting it to empty here.

>diff --git a/modules/plugin/Makefile.in b/modules/plugin/Makefile.in
>index aef5287..3050e1c 100644
>--- a/modules/plugin/Makefile.in
>+++ b/modules/plugin/Makefile.in
>@@ -49,6 +49,7 @@ DIRS		= base/public
> ifdef MOZ_PLUGINS
> DIRS		+= base/src 
> 
>+ifndef BUILD_STATIC_LIBS
> ifeq ($(MOZ_WIDGET_TOOLKIT),gtk2)
> TOOL_DIRS += sdk/samples/unixprinting
> endif

I assume the unixprinting plugin doesn't build in your static config? Is that fixable? (It'd be fine to push off to a followup bug, if so.)

>diff --git a/postxpcom/postxpcom-tiers.mk b/postxpcom/postxpcom-tiers.mk
>new file mode 100644
>index 0000000..10ba9ae
>--- /dev/null
>+++ b/postxpcom/postxpcom-tiers.mk

First, creating new top-level directories in the repo without asking is generally frowned upon. Second, it feels like this could all live in toolkit-tiers.mk or something.

>+TIERS += postxpcom 
>+
>+# tier "postxpcom" - everything that depends on the xpcom shared lib
>+
>+ifdef BUILD_STATIC_LIBS
>+tier_postxpcom_dirs = xpcom/stub 
>+
>+ifneq ($(MOZ_BUILD_APP), xulrunner)
>+tier_postxpcom_dirs += $(MOZ_BUILD_APP)/components
>+endif

I don't think you should be doing anything based on MOZ_BUILD_APP here. All the apps have their own build.mk, that should be responsible for adding these directories (browser/build.mk, xulrunner/build.mk).

>diff --git a/toolkit/components/alerts/src/mac/Makefile.in b/toolkit/components/alerts/src/mac/Makefile.in
>index fb524dc..8a3b76c 100644
>--- a/toolkit/components/alerts/src/mac/Makefile.in
>+++ b/toolkit/components/alerts/src/mac/Makefile.in
>@@ -44,8 +44,9 @@ include $(DEPTH)/config/autoconf.mk
> MODULE = alerts
> LIBRARY_NAME = alerts_s
> IS_COMPONENT = 1
>+ifndef BUILD_STATIC_LIBS
> FORCE_SHARED_LIB = 1
>-
>+endif

I'm pretty sure this really does want to stay as a shared lib. It's a shared lib because it links to growl, and we don't require growl, so we want it to be able to fail to load if growl is missing.

>diff --git a/toolkit/toolkit-tiers.mk b/toolkit/toolkit-tiers.mk
>index 012f915..34b59ed 100644
>--- a/toolkit/toolkit-tiers.mk
>+++ b/toolkit/toolkit-tiers.mk
>@@ -228,11 +228,9 @@ tier_toolkit_dirs += embedding/browser/gtk
> endif
> endif
> 
>-ifndef BUILD_STATIC_LIBS
> tier_toolkit_dirs += toolkit/library
>-endif
> 
>-ifdef MOZ_ENABLE_LIBXUL
>+ifndef BUILD_STATIC_LIBS
> tier_toolkit_dirs += xpcom/stub
> endif
> 
>@@ -240,6 +238,7 @@ ifdef NS_TRACE_MALLOC
> tier_toolkit_dirs += tools/trace-malloc
> endif
> 
>+ifndef BUILD_STATIC_LIBS
> ifdef MOZ_ENABLE_GNOME_COMPONENT
> tier_toolkit_dirs    += toolkit/system/gnome
> endif
>@@ -250,6 +249,7 @@ ifdef MOZ_ENABLE_DBUS
> tier_toolkit_dirs    += toolkit/system/dbus
> endif
> endif
>+endif

Basically right here is where I would stick all that tier_postxpcom stuff, I think.

> 
>diff --git a/xpcom/build/Makefile.in b/xpcom/build/Makefile.in
>index be26a29..f6b0416 100644
>--- a/xpcom/build/Makefile.in
>+++ b/xpcom/build/Makefile.in
>@@ -51,9 +51,8 @@ LIBRARY_NAME	= xpcom_core
> SHORT_LIBNAME	= xpcomcor
> LIBXUL_LIBRARY = 1
> 
>-# This is only a static library in libxul builds
>-ifdef MOZ_ENABLE_LIBXUL
>-EXPORT_LIBRARY = 1
>+ifdef BUILD_STATIC_LIBS
>+DIST_INSTALL = 1
> endif

Is this going to break the libxul configuration? We shouldn't do that, if so.

>diff --git a/xpcom/stub/Makefile.in b/xpcom/stub/Makefile.in
>index cea7973..065e420 100644
>--- a/xpcom/stub/Makefile.in
>+++ b/xpcom/stub/Makefile.in
>@@ -74,12 +74,24 @@ FORCE_SHARED_LIB = 1
> 
> EXTRA_DSO_LDOPTS = $(LIBS_DIR)
> 
>+# Don't want -Wl,-z,defs
>+
>+ifdef BUILD_STATIC_LIBS
>+ifeq ($(OS_TARGET),Linux)
>+DSO_LDOPTS = -shared
>+endif
>+endif
>+
> DEPENDENT_LIBS_LIST += \
> 	$(LIB_PREFIX)nspr4$(DLL_SUFFIX) \
> 	$(LIB_PREFIX)plc4$(DLL_SUFFIX) \
> 	$(LIB_PREFIX)plds4$(DLL_SUFFIX) \
> 	$(NULL)
> 
>+ifneq (_, $(BUILD_STATIC_LIBS))
>+USE_STATIC_LIBS = 1
>+endif
>+

This looks confused. Surely you just want ifdef BUILD_STATIC_LIBS here.

>diff --git a/xulrunner/app/Makefile.in b/xulrunner/app/Makefile.in
>index d166ed6..83254dc 100644
>--- a/xulrunner/app/Makefile.in
>+++ b/xulrunner/app/Makefile.in

I don't know that we've ever supported a "static xulrunner" configuration, so I'm not sure what to think about this. It might defeat the purpose of xulrunner. Brad Lassey has a patch for Fennec that makes it build more like Firefox (with a real binary, not just a xulrunner stub), so if this was to support mobile maybe we can just skip it and he can copy what you've done to Firefox.


>diff --git a/xulrunner/confvars.sh b/xulrunner/confvars.sh
>index 90bcb5b..18d41bc 100755
>--- a/xulrunner/confvars.sh
>+++ b/xulrunner/confvars.sh
>@@ -41,9 +41,13 @@ MOZ_APP_DISPLAYNAME=XULRunner
> MOZ_UPDATER=1
> MOZ_XULRUNNER=1
> MOZ_ENABLE_LIBXUL=1
>-MOZ_STATIC_BUILD_UNSUPPORTED=1
>+MOZ_STATIC_BUILD_UNSUPPORTED=
>+BUILD_STATIC_LIBS=1
> MOZ_APP_VERSION=$MOZILLA_VERSION
>-MOZ_JAVAXPCOM=1
>+MOZ_JAVAXPCOM=
>+MOZ_IPC=
>+ENABLE_TESTS=

Same notes here as with browser/confvars.sh.

I also think bsmedberg should give this a once-over once you've fixed all the things I've commented on. I'm not as familiar with all the intricacies of XPCOM linking, as well as the actual libxul build config.
Comment 137 Ted Mielczarek [:ted.mielczarek] 2010-02-11 07:34:25 PST
After discussion, I accept that the MOZ_ENABLE_LIBXUL stuff is ok. I'm not wild about it, but I guess it's the path of least resistance.
Comment 138 Joel Reymont (:joelr) 2010-02-11 07:43:20 PST
Comment on attachment 420078 [details] [diff] [review]
static build

Ted, 

This is a non-libxul static build. Could you please review and comment on the changes that apply to the building of tests?

No need to repeat what you already said on the other patch, I'll take care of those items.

My concern is that test-related changes are super-hairy here and I would like to make sure I redo these properly for the libxul-static patch.
Comment 139 Joel Reymont (:joelr) 2010-02-11 10:52:16 PST
Ted,

The standard tiers are base, nspr, js, xpcom, zlib, necko, external, gecko, toolkit, app, testharness. You are proposing to merge my postxpcom tier, the one at the very end of -everything- into toolkit but this won't work.

In my libxul static setup firefox (app tier) dumps the symbols that are needed by the xpcom shared library which is relied upon by browsercomps. Then there are all the tests that want xpcom as well. 

The toolkit tier is before the app tier but the code that you want in toolkit needs to be after app.

I toyed with adding the bits from my postxpcom tier to default:: in m-c/Makefile.in and so on but this gets tedious fast and stops working quickly. The most straightforward, easiest and least error-prone way I found is to use the postxpcom tier.
Comment 140 Ted Mielczarek [:ted.mielczarek] 2010-02-11 11:18:22 PST
Then let's do one of two things:
a) Put the entire postxpcom tier into toolkit-tiers.mk, but require the app's build.mk to stick postxpcom into TIERS or
b) Put postxpcom.mk next to toolkit-tiers.mk (maybe rename it to static-postxpcom-tiers.mk or something), and require that the app's build.mk to include it.

Do those sound workable?
Comment 141 Joel Reymont (:joelr) 2010-02-11 13:08:58 PST
I picked the 2nd door. Will I get a goat? Will let you know!
Comment 142 Joel Reymont (:joelr) 2010-02-17 05:47:38 PST
Created attachment 427313 [details] [diff] [review]
static libxul build with ted's review changes

Use this custom .mozconfig with the try server:

ac_add_options --disable-tests
ac_add_options --disable-ipc

Building binary tests is a can of worms (Ted, how is the review?) and IPC builds mozilla-runtime which needs to be moved into firefox (bug 545889).
Comment 143 Ted Mielczarek [:ted.mielczarek] 2010-02-17 11:52:05 PST
Comment on attachment 420078 [details] [diff] [review]
static build

The endgame here is to make tests link against the static EXE for the symbols they need, rather than statically linking all of libxul, is that right? It's a little difficult to review this just because of the other bits of your patch from your previous approach don't match what you're currently doing.

>diff --git a/config/test-config.mk b/config/test-config.mk
>new file mode 100644
>index 0000000..9b8445f
>--- /dev/null
>+++ b/config/test-config.mk
>@@ -0,0 +1,114 @@
>+ifndef BUILD_STATIC_LIBS
>+LIBS += $(XPCOM_LIBS)
>+endif
>+
>+LIBS  += $(NSPR_LIBS)
>+
>+ifdef BUILD_STATIC_LIBS
>+MOZILLA_INTERNAL_API = 1
>+endif
>+
>+ifdef BUILD_STATIC_LIBS
>+include $(topsrcdir)/config/config.mk
>+include $(topsrcdir)/config/static-config.mk
>+
>+STATIC_COMPONENT_LIST_UNUSED =  \

What's this list for? (If it's really unused, you should probably just remove it.)

>+
>+STATIC_COMPONENT_LIST = \

Surely we don't want a hardcoded list here, do we? Currently rules.mk puts module names into a file, and static-config.mk pulls them back out to build this list:
http://mxr.mozilla.org/mozilla-central/source/config/static-config.mk#55

Is there a reason you can't use that machinery?

>diff --git a/content/base/test/Makefile.in b/content/base/test/Makefile.in
>index b6c8195..0c24f6b 100644
>--- a/content/base/test/Makefile.in
>+++ b/content/base/test/Makefile.in
>@@ -60,6 +60,7 @@ XPCSHELL_TESTS = \
>                unit \
>                $(NULL)
> 
>+include $(topsrcdir)/config/test-config.mk
> include $(topsrcdir)/config/rules.mk

It would probably be nicer if rules.mk included this file ifdef CPP_UNIT_TESTS, so individual makefiles don't have to remember to include it. (It seems fine to manually include it in cases where makefiles are using SIMPLE_PROGRAMS directly.)

>diff --git a/js/src/xpconnect/tests/Makefile.in b/js/src/xpconnect/tests/Makefile.in
>index e8079fe..3e8f54c 100644
>--- a/js/src/xpconnect/tests/Makefile.in
>+++ b/js/src/xpconnect/tests/Makefile.in
>@@ -57,13 +57,10 @@ XPCSHELL_TESTS = unit
> 
> CPPSRCS		= TestXPC.cpp
> 
>-LIBS		= \
>-		$(DIST)/lib/$(LIB_PREFIX)xpcomglue_s.$(LIB_SUFFIX) \
>-		$(LIBS_DIR) \
>-		$(MOZ_JS_LIBS) \
>-		$(MOZ_COMPONENT_LIBS) \
>-		$(NULL)
>-
>+include $(topsrcdir)/config/test-config.mk
>+ifdef BUILD_STATIC_LIBS
>+LIBS += -ljs_static
>+endif

This is only one test, so if this stays as-is it's not a deal breaker, but in the end this should wind up linking to the static binary, right?

>diff --git a/toolkit/crashreporter/Makefile.in b/toolkit/crashreporter/Makefile.in
>index 7f5f789..00ead4d 100644
>--- a/toolkit/crashreporter/Makefile.in
>+++ b/toolkit/crashreporter/Makefile.in
>@@ -110,7 +110,9 @@ CPPSRCS = \
> FORCE_STATIC_LIB = 1
> 
> ifdef ENABLE_TESTS
>+ifndef BUILD_STATIC_LIBS
> TOOL_DIRS = test
> endif
>+endif

Presumably you couldn't get the linkage to work here (since testcrasher wants to link against libxul)? Do you have a plan to fix this? (It just wants that so it can expose some method from libxul that's not scriptable to a test harness.)

>diff --git a/xpcom/Makefile.in b/xpcom/Makefile.in
>index 5e3e09d..8e01d48 100644
>--- a/xpcom/Makefile.in
>+++ b/xpcom/Makefile.in
>@@ -75,11 +75,16 @@ endif
> endif
> 
> ifdef ENABLE_TESTS
>+ifdef BUILD_STATIC_LIBS
>+# xpcom/tests needs libxpctest
>+TOOL_DIRS += $(DEPTH)/js/src/xpconnect/tests/components
>+endif

This is...awkward. Is there a better way you can get this ordering right?

I'm not r+ing this, since I think that would confuse things, but I hope that feedback is useful.
Comment 144 Joel Reymont (:joelr) 2010-02-17 12:01:37 PST
Ted, thanks for the review! To answer your end-game question, it can't work that way and binaries -do- need to link in all of libxul. That's my understanding anyway, others may correct me. It works for plugins, etc. to link against the symbols in the main exe because they are loaded into that exe.
Comment 145 Ted Mielczarek [:ted.mielczarek] 2010-02-17 14:04:49 PST
Gotcha, thanks. bsmedberg: given that, do we really want to try to support these binary tests in a static build? The impact of that many libxul links is unlikely to make any developer or tinderbox happy.
Comment 146 Joel Reymont (:joelr) 2010-02-17 14:08:26 PST
Dietrich can weigh in here but the idea is to do this in two steps:

1) Make sure all binary tests build and run, and 

2) Add a switch to disable binary tests when not needed
Comment 147 Joel Reymont (:joelr) 2010-02-17 14:09:24 PST
Ted, any comments on the patch that includes your review changes?
Comment 148 Dietrich Ayala (:dietrich) 2010-02-19 10:54:31 PST
I defer to ted/bsmedberg on the binary tests. Maybe we could have releng set up daily test runs for the binary tests, or something like that.
Comment 149 Mike Hommey [:glandium] 2010-02-19 12:51:51 PST
Stupid question: on what kind of environment are the linux tests done ? Is that running ffx on a raw X session, or running in a "full desktop" like gnome ?
Comment 150 Joel Reymont (:joelr) 2010-02-19 12:54:10 PST
Stock Ubuntu with the full Gnome desktop.
Comment 151 Mike Hommey [:glandium] 2010-02-19 12:54:37 PST
Also, for linux, you should try these: http://lwn.net/Articles/192624/
Comment 152 Mike Hommey [:glandium] 2010-02-20 05:41:56 PST
Something else that could be beneficial is the use of -Bsymbolic, though I'm unsure what kind of side effects it can have on the vtables used in libxul.
Comment 153 Ted Mielczarek [:ted.mielczarek] 2010-03-04 08:00:40 PST
Comment on attachment 427313 [details] [diff] [review]
static libxul build with ted's review changes

Ok, sorry, it took me a bit to get back to this.

>diff --git a/browser/app/Makefile.in b/browser/app/Makefile.in
>index 8e04667..b408887 100644
>--- a/browser/app/Makefile.in
>+++ b/browser/app/Makefile.in
>@@ -103,6 +103,9 @@ else
>+ifdef BUILD_STATIC_LIBS
>+
>+SHARED_LIBRARY_LIBS += \
>+  $(DEPTH)/toolkit/xre/$(LIB_PREFIX)xulapp_s.$(LIB_SUFFIX) \
>+  $(NULL)
>+
>+SHARED_LIBRARY_LIBS += \
>+  $(foreach component,$(COMPONENT_LIBS),$(DEPTH)/staticlib/components/$(LIB_PREFIX)$(component).$(LIB_SUFFIX)) \
>+  $(foreach lib,$(STATIC_LIBS),$(DEPTH)/staticlib/$(LIB_PREFIX)$(lib).$(LIB_SUFFIX)) \
>+  $(NULL)
>+endif

You never answered my question in the previous review about why you're using SHARED_LIBRARY_LIBS here.

>+
> # This switches $(INSTALL) to copy mode, like $(SYSINSTALL), so things that
> # shouldn't get 755 perms need $(IFLAGS1) for either way of calling nsinstall.
> NSDISTMODE = copy
>@@ -171,6 +192,58 @@ endif
> ifdef BUILD_STATIC_LIBS
> include $(topsrcdir)/config/static-config.mk
> 
>+STATIC_EXTRA_LIBS += $(DEPTH)/dist/lib/$(LIB_PREFIX)mozz.$(LIB_SUFFIX)
>+# Don't want nsStaticComponents.cpp
>+STATIC_CPPSRCS =

I still think you should just edit static-config.mk and not set this ifdef BUILD_STATIC_LIBS.


>diff --git a/browser/confvars.sh b/browser/confvars.sh
>index 56f2986..1e5edf2 100755
>--- a/browser/confvars.sh
>+++ b/browser/confvars.sh
>@@ -41,7 +41,8 @@ MOZ_UPDATER=1
> MOZ_PHOENIX=1
> 
> MOZ_ENABLE_LIBXUL=1
>-MOZ_STATIC_BUILD_UNSUPPORTED=1
>+BUILD_STATIC_LIBS=1
>+

Please leave the BUILD_STATIC_LIBS line out of your final patch. We'll have to keep libxul as the default build until we sort out the IPC linkage issues. (It's fine to remove the MOZ_STATIC_BUILD_UNSUPPORTED.)


>diff --git a/config/config.mk b/config/config.mk
>index 1499cc4..2c0c1db 100644
>--- a/config/config.mk
>+++ b/config/config.mk
>@@ -349,6 +349,7 @@ endif
> # Disable PIC if necessary
> #
> 
>+ifndef BUILD_STATIC_LIBS
> ifndef _ENABLE_PIC
> DSO_CFLAGS=
> ifeq ($(OS_ARCH)_$(HAVE_GCC3_ABI),Darwin_1)
>@@ -357,6 +358,7 @@ else
> DSO_PIC_CFLAGS=
> endif
> endif
>+endif

I don't think you explained this change.

>diff --git a/config/rules.mk b/config/rules.mk
>index c6b6d74..5baf56b 100644
>--- a/config/rules.mk
>+++ b/config/rules.mk
>@@ -196,6 +196,10 @@ check-one:
> 
> endif # XPCSHELL_TESTS
> 
>+ifdef BUILD_STATIC_LIBS
>+CPP_UNIT_TESTS=
>+endif
>+
> ifdef CPP_UNIT_TESTS


Could you just instead wrap an ifndef BUILD_STATIC_LIBS around that ifdef CPP_UNIT_TESTS block? That way we'd just skip it in static builds.

> 
> # Compile the tests to $(DIST)/bin.  Make lots of niceties available by default
>@@ -204,7 +208,11 @@ ifdef CPP_UNIT_TESTS
> CPPSRCS += $(CPP_UNIT_TESTS)
> SIMPLE_PROGRAMS += $(CPP_UNIT_TESTS:.cpp=$(BIN_SUFFIX))
> INCLUDES += -I$(DIST)/include/testing
>-LIBS += $(XPCOM_GLUE_LDOPTS) $(NSPR_LIBS)
>+ifndef BUILD_STATIC_LIBS
>+LIBS += $(XPCOM_GLUE_LDOPTS)
>+endif
>+LIBS += $(NSPR_LIBS)

Since you're not compiling these in your static build, surely you don't need these changes?

>diff --git a/configure.in b/configure.in
>index 18cfcce..2680e91 100644
>--- a/configure.in
>+++ b/configure.in
>@@ -2206,8 +2206,8 @@ case "$target" in
>         IMPORT_LIB_SUFFIX=lib
>         MKSHLIB='$(LD) -NOLOGO -DLL -OUT:$@ -PDB:$(LINK_PDBFILE) $(DSO_LDOPTS)'
>         MKCSHLIB='$(LD) -NOLOGO -DLL -OUT:$@ -PDB:$(LINK_PDBFILE) $(DSO_LDOPTS)'
>-        MKSHLIB_FORCE_ALL=
>-        MKSHLIB_UNFORCE_ALL=
>+    	MKSHLIB_FORCE_ALL='-OPT:NOREF'
>+    	MKSHLIB_UNFORCE_ALL='-OPT:REF'

I think your indentation is a little wonky here.

>diff --git a/toolkit/components/alerts/Makefile.in b/toolkit/components/alerts/Makefile.in
>index e7d651b..8f07e7e 100644
>--- a/toolkit/components/alerts/Makefile.in
>+++ b/toolkit/components/alerts/Makefile.in
>@@ -50,12 +50,14 @@ DIRS += src
> endif
> 
> ifneq (,$(filter cocoa, $(MOZ_WIDGET_TOOLKIT)))
>+ifndef BUILD_STATIC_LIBS
> # src/mac/growl needs to be first for linking to work!
> DIRS += \
>   src/mac/growl \
>   src/mac \
>   $(NULL)
> endif
>+endif

This could probably stand a comment saying that it's in tier postxpcom for static builds.

>diff --git a/toolkit/static-postxpcom-tiers.mk b/toolkit/static-postxpcom-tiers.mk
>new file mode 100644
>index 0000000..d5cc658
>--- /dev/null
>+++ b/toolkit/static-postxpcom-tiers.mk
>@@ -0,0 +1,83 @@
>+# ***** BEGIN LICENSE BLOCK *****
>+# Version: MPL 1.1/GPL 2.0/LGPL 2.1
>+#
>+# The contents of this file are subject to the Mozilla Public License Version
>+# 1.1 (the "License"); you may not use this file except in compliance with
>+# the License. You may obtain a copy of the License at
>+# http://www.mozilla.org/MPL/
>+#
>+# Software distributed under the License is distributed on an "AS IS" basis,
>+# WITHOUT WARRANTY OF ANY KIND, either express or implied. See the License
>+# for the specific language governing rights and limitations under the
>+# License.
>+#
>+# The Original Code is the Mozilla build system.
>+#
>+# The Initial Developer of the Original Code is
>+# the Mozilla Foundation <http://www.mozilla.org/>.
>+# Portions created by the Initial Developer are Copyright (C) 2006

Fix the copyright date, please.

>+# the Initial Developer. All Rights Reserved.
>+#
>+# Contributor(s):
>+#   Benjamin Smedberg <benjamin@smedbergs.us> (Initial Code)

I think you get to take the blame here. :)

>+ifeq ($(MOZ_WIDGET_TOOLKIT),gtk2)
>+tier_postxpcom_dirs += 	modules/plugin/sdk/samples/unixprinting
>+endif

There's an extraneous tab here.

>+ifneq (,$(filter cocoa, $(MOZ_WIDGET_TOOLKIT)))

This can just be ifeq(cocoa, $(MOZ_WIDGET_TOOLKIT))

>diff --git a/toolkit/toolkit-tiers.mk b/toolkit/toolkit-tiers.mk
>index 012f915..34b59ed 100644
>--- a/toolkit/toolkit-tiers.mk
>+++ b/toolkit/toolkit-tiers.mk
>@@ -228,11 +228,9 @@ tier_toolkit_dirs += embedding/browser/gtk
> endif
> endif
> 
>-ifndef BUILD_STATIC_LIBS
> tier_toolkit_dirs += toolkit/library
>-endif
> 
>-ifdef MOZ_ENABLE_LIBXUL
>+ifndef BUILD_STATIC_LIBS
> tier_toolkit_dirs += xpcom/stub
> endif

Will this break a --disable-libxul --disable-static build?

>diff --git a/xpcom/build/Makefile.in b/xpcom/build/Makefile.in
>index e4438ba..00df425 100644
>--- a/xpcom/build/Makefile.in
>+++ b/xpcom/build/Makefile.in
>@@ -51,9 +51,8 @@ LIBRARY_NAME	= xpcom_core
> SHORT_LIBNAME	= xpcomcor
> LIBXUL_LIBRARY = 1
> 
>-# This is only a static library in libxul builds
>-ifdef MOZ_ENABLE_LIBXUL
>-EXPORT_LIBRARY = 1
>+ifdef BUILD_STATIC_LIBS
>+DIST_INSTALL = 1
> endif

Doesn't this change how things work in an --enable-libxul --disable-static build?

>diff --git a/xpcom/stub/Makefile.in b/xpcom/stub/Makefile.in
>index cea7973..d81aa12 100644
>--- a/xpcom/stub/Makefile.in
>+++ b/xpcom/stub/Makefile.in
>@@ -116,6 +132,16 @@ endif
> 
> endif
> 
>+ifdef BUILD_STATIC_LIBS
>+ifeq ($(OS_TARGET),Darwin)
>+EXTRA_DSO_LDOPTS += -undefined dynamic_lookup
>+endif
>+ifneq (, $(filter WINNT WINCE, $(OS_ARCH)))
>+EXTRA_DSO_LIBS += \
>+	$(DEPTH)/$(MOZ_BUILD_APP)/app/$(LIB_PREFIX)$(MOZ_APP_NAME)

Two space indent here, not a tab.

>diff --git a/xulrunner/Makefile.in b/xulrunner/Makefile.in
>index 46406ba..e040f5c 100644
>--- a/xulrunner/Makefile.in
>+++ b/xulrunner/Makefile.in

I don't think we should deal with xulrunner at all in this patch. Fennec is moving to a Firefox-like build (not building xulrunner, bug 543060), so they can just port these changes to work there. It doesn't seem likely that anyone is going to want to use a static xulrunner otherwise. Just drop the rest of this patch.

So r- for a few things that need to be addressed there, but this is pretty close. If you can fix those up and resubmit I'd expect to r+ the next version. Given Jim's comment in bug 532765 comment 21, it sounds like we can actually get binary tests working on Windows, at least. We can do that in a followup.

Sorry this review took so long, it takes me a bit to get into the right mindset to review patches of this size...
Comment 154 (dormant account) 2010-03-08 12:53:27 PST
(In reply to comment #104)

> Come to think of it, Maemo is also an ELF-based OS, isn't it?  If so, the
> optimization that 'prelink' does should apply there too.

As far as I can tell prelink doesn't apply to us as it fails on firefox binaries(due to some far-off dependency) on every single Linux distribution I've tried.
Comment 155 Dietrich Ayala (:dietrich) 2010-03-08 20:53:51 PST
(In reply to comment #154)
> As far as I can tell prelink doesn't apply to us as it fails on firefox
> binaries(due to some far-off dependency) on every single Linux distribution
> I've tried.

Is there a way to find out what that dependency is?
Comment 156 Dietrich Ayala (:dietrich) 2010-03-12 16:15:42 PST
Joel's unavailable for a while, so reassigning to Ted who said he could take it from here.
Comment 157 Justin Wood (:Callek) 2010-03-15 21:37:42 PDT
(In reply to comment #156)
> Joel's unavailable for a while, so reassigning to Ted who said he could take it
> from here.

To be clear, as after reading ALL 156 comments here I did not see an indication that  this bug morphed from "Investigate..." to "Make static build the preferred and possible state" (or whatever).

So, to ask the vague and deliberate question, "What is the state of this bug and its patch(es) as it relates to m-c and Firefox future?"
Comment 158 Dietrich Ayala (:dietrich) 2010-03-15 22:07:28 PDT
The patch will land on m-c, as a non-default option. Whether we turn it on for release builds or not depends on how the testing bears out.
Comment 159 Ted Mielczarek [:ted.mielczarek] 2010-03-17 10:52:34 PDT
Created attachment 433097 [details] [diff] [review]
updated patch

This is an updated version of Joel's most recent patch. I had to fix a few things to make it compile in the default (libxul) configuration again, and I cleaned up the things from my review comments. It builds and runs in the static configuration on Linux, but I think the component linkage is broken, because I'm missing icons, which suggests that some Gnome service is missing.

I also made libmozalloc be statically linked into the binary (from bug 441324), cjones said that made sense.
Comment 160 Ted Mielczarek [:ted.mielczarek] 2010-03-19 10:34:48 PDT
I had to fix some libimgicon bits, I'll have a new patch shortly (maybe not today). I split those fixes out into bug 553635, since it turns out we don't actually need that as a separate sharedlib even in libxul builds anymore.
Comment 161 Ted Mielczarek [:ted.mielczarek] 2010-03-22 10:49:43 PDT
Created attachment 433976 [details] [diff] [review]
updated patch

Ok, this builds and runs in both the --enable-static and the default libxul config. (I had to fix some build bustage in the libxul config.) My expectation is that we'll land this, then sort out the remaining issues before switching our nightly builds to this configuration.

I've pushed it to try for a sanity check, but I think this is ready to go. bsmedberg: can you give this a once-over?
Comment 162 Ted Mielczarek [:ted.mielczarek] 2010-03-26 14:53:23 PDT
Created attachment 435281 [details] [diff] [review]
updated patch

This fixes a build issue on Windows with --enable-static. The browser components weren't linking right because the Thebes APIs weren't exported in static builds, and we use them for aero peek preview stuff.
Comment 163 (dormant account) 2010-04-20 14:24:17 PDT
Would there be a problem with changing the static firefox approach to instead be a firefox-bin executable + a single giant libxul library? It seems like that gets us many of the benefits of a single executable, but gives us some room to mess with the dynamic linker(ie set madvise flags, use large pages, etc).
Comment 164 :Ehsan Akhgari 2010-04-20 14:31:17 PDT
Yes, we would also need that on Windows if we're going to preload our large binary using more efficient mechanisms.
Comment 165 Ted Mielczarek [:ted.mielczarek] 2010-04-26 11:43:03 PDT
If that's what you really want, then we can throw away 90% of this patch.
Comment 166 (dormant account) 2010-04-26 11:45:38 PDT
(In reply to comment #165)
> If that's what you really want, then we can throw away 90% of this patch.

Ok. To clarify, I'd like to fold in all of the libraries in dist/bin/ into one.
Comment 167 Ted Mielczarek [:ted.mielczarek] 2010-04-26 13:02:32 PDT
Yeah, this patch does a little bit of that. The biggest remaining offender is NSPR/NSS, but there's a patch in bug 534471 for that.

Seriously, if you've decided you don't want this, let's WONTFIX this bug and file a new bug for "fold all remaining shared libs into libxul". This patch has a lot of complexity, and if we don't need it let's not spend any more time on it.
Comment 168 (dormant account) 2010-04-26 13:03:37 PDT
(In reply to comment #167)
> Yeah, this patch does a little bit of that. The biggest remaining offender is
> NSPR/NSS, but there's a patch in bug 534471 for that.

Cool.

> 
> Seriously, if you've decided you don't want this, let's WONTFIX this bug and
> file a new bug for "fold all remaining shared libs into libxul". This patch has
> a lot of complexity, and if we don't need it let's not spend any more time on
> it.
Comment 169 Ted Mielczarek [:ted.mielczarek] 2010-04-26 13:08:36 PDT
Created attachment 441580 [details] [diff] [review]
Rebased patch (for posterity)

This is rebased on top of changeset 2968d19b0165, just for posterity.
Comment 170 Mike Hommey [:glandium] 2010-04-26 23:53:53 PDT
(In reply to comment #167)
> Yeah, this patch does a little bit of that. The biggest remaining offender is
> NSPR/NSS, but there's a patch in bug 534471 for that.

The problem with that is that it would make it harder for weave to work.
http://blog.mozilla.com/dolske/2010/04/25/weave-crypto-javascript/
Comment 171 Ted Mielczarek [:ted.mielczarek] 2010-04-27 05:53:51 PDT
I don't think it's that big of a deal. We can just export the necessary symbols from libxul.
Comment 172 Dietrich Ayala (:dietrich) 2010-04-27 09:07:09 PDT
For clarity and posterity, see here for why this was WONTFIXed:

https://bugzilla.mozilla.org/show_bug.cgi?id=561842#c0

Note You need to log in before you can comment on or make changes to this bug.