Last Comment Bug 279839 - optimize js component loading
: optimize js component loading
Status: RESOLVED FIXED
: fixed1.8.1, perf
Product: Core
Classification: Components
Component: XPConnect (show other bugs)
: Trunk
: All All
: -- normal with 1 vote (vote)
: ---
Assigned To: Brian Ryner (not reading)
: Phil Schwartau
Mentors:
Depends on: 312238 313262 313268 313610 313612 313614 321985
Blocks:
  Show dependency treegraph
 
Reported: 2005-01-25 19:18 PST by Brian Ryner (not reading)
Modified: 2009-07-15 09:09 PDT (History)
19 users (show)
See Also:
Crash Signature:
(edit)
QA Whiteboard:
Iteration: ---
Points: ---
Has Regression Range: ---
Has STR: ---


Attachments
use memory-mapped files for compiling js components (5.55 KB, patch)
2005-10-02 23:31 PDT, Brian Ryner (not reading)
shaver: review+
jst: superreview+
Details | Diff | Review
fastload js components (58.74 KB, patch)
2005-10-11 13:37 PDT, Brian Ryner (not reading)
shaver: review+
brendan: superreview+
Details | Diff | Review
fastload js components (diff -w) (55.02 KB, patch)
2005-10-11 13:37 PDT, Brian Ryner (not reading)
no flags Details | Diff | Review
combined branch patch (57.08 KB, patch)
2006-05-15 11:00 PDT, Brian Ryner (not reading)
shaver: approval‑branch‑1.8.1+
Details | Diff | Review

Description Brian Ryner (not reading) 2005-01-25 19:18:15 PST
Recent profiling data suggests we're spending a fair amount of firefox startup
time (on Mac at least) loading and compiling nsExtensionManager.js.  At a
minimum, we can optimize the load by reading the whole file at once into a
buffer; ideally we should be able to fastload js components.
Comment 1 Brian Ryner (not reading) 2005-10-02 23:31:02 PDT
Created attachment 198274 [details] [diff] [review]
use memory-mapped files for compiling js components

This seems to help quite a bit, 20% or so according to Quantify.  I had to
leave the old codepath because BeOS and OS/2 don't support PR_MemMap (those
were the only ones I could find from looking through NSPR -- and runtime
detection seems quite inefficient).  I also didn't bother dealing with files
larger than 4GB -- if we have one of those in the components directory, I think
we're going to have problems.
Comment 2 Brian Ryner (not reading) 2005-10-02 23:37:38 PDT
Oh, also... I'd originally implemented this by just malloc'ing a buffer and 
reading the file into it.  That gives a healthy speedup as well, but using 
mmap seems to be a bit faster.
Comment 3 Mike Shaver (:shaver -- probably not reading bugmail closely) 2005-10-03 11:19:17 PDT
Comment on attachment 198274 [details] [diff] [review]
use memory-mapped files for compiling js components

woo! r=shaver
Comment 4 Johnny Stenback (:jst, jst@mozilla.com) 2005-10-03 14:52:55 PDT
Comment on attachment 198274 [details] [diff] [review]
use memory-mapped files for compiling js components

sr=jst
Comment 5 Brian Ryner (not reading) 2005-10-11 13:37:05 PDT
Created attachment 199193 [details] [diff] [review]
fastload js components

This patch uses the FastLoadService to fastload js components.	Interesting
bits:

- It pushes and pops the stream state on the FastLoadService so that it doesn't
interfere with ongoing XUL fastloads.  I couldn't really come up with a good
model for sharing the fastload file with XUL, and I don't think it would be a
big advantage anyway.  I also looked at changing the FLS into something
non-singleton so that swapping out the stream pointers wouldn't be necessary,
but the FastLoadPtr stuff made that rather hairy, and I don't think it's a big
deal.

- There's lots of cleanup of the JSComponentLoader here that's not 100% related
to fastload.  I quickly got tired of dealing with the goto-based code to handle
errors so I cleaned it up in favor of stack-based objects to clean up state,
and also improved the propagation of errors significantly.

- Because JS components can be loaded before the profile-change notification,
the ProfD/ProfLD directory service key won't give access to the profile
directory when we need it.  I created a new key that's implemented by
toolkit/xre, per bsmedberg's advise, and use that, falling back to the other
key if it fails.  The upshot is that for apps that don't implement "ProfLDS",
but implement "ProfD" or "ProfLD", the fastload file may not be written out
during component registration, but instead will be written on the next startup.
 This should fail gracefully for apps that don't use a profile directory, too.

- Yes, the nsIBinaryInputStream change is related; I ran across this bogus
header dependency while working on this, and implemented this fix that Darin
suggested.
Comment 6 Brian Ryner (not reading) 2005-10-11 13:37:50 PDT
Created attachment 199194 [details] [diff] [review]
fastload js components (diff -w)
Comment 7 Brendan Eich [:brendan] 2005-10-11 18:27:39 PDT
Comment on attachment 199193 [details] [diff] [review]
fastload js components

Cool, skimmed and it looks good.  May read more closely in a bit.  What kind of
speedup does this give on top of the mmap one?

/be
Comment 8 Brian Ryner (not reading) 2005-10-11 19:35:56 PDT
Seems to be about 4% on Windows Firefox.  As I mentioned to you on irc, most of
the time spent in GlobalForLocation with this patch is directly attributable to
JS_XDRScript, and a huge chunk of that comes from memory allocations in
JS_XDRString.
Comment 9 Brendan Eich [:brendan] 2005-10-11 21:36:13 PDT
(In reply to comment #8)
> Seems to be about 4% on Windows Firefox.  As I mentioned to you on irc, most of
> the time spent in GlobalForLocation with this patch is directly attributable to
> JS_XDRScript, and a huge chunk of that comes from memory allocations in
> JS_XDRString.

That was by email (at least, the message I read was), and I suggested a few
things to look into:
- Fixing bug 104170, bug 107907, and bug 195010.
- Profiling to see whether the JS_malloc layering was costly (prolly not).
- The rt->gcLock cost serializing JSString allocation.
Any profiling detail under JS_XDRString?

/be
Comment 10 Brian Ryner (not reading) 2005-10-12 13:42:44 PDT
(In reply to comment #9)
> Any profiling detail under JS_XDRString?

Here's the distribution of time, from Quantify:

JS_malloc       55.87%
  malloc          98.35%

JS_NewUCString  34.91%
  js_NewString    98.46%
    js_NewGCThing   96.87%
      PR_Unlock       49.06%
      PR_Lock         36.39%
      gc_new_arena     1.97%
      js_GetGCThingFlags 1.19%
      js_PushLocalRoot 0.03%
  
JS_XDRUint32     1.61%
mem_raw          1.01%
Comment 11 Brendan Eich [:brendan] 2005-10-12 15:37:17 PDT
So bryner and I talked on IRC and there are two ideas:

- bug 312238 (bryner filed just now, thanks!) is for optimizing away the
rt->gcLock cost, to make JS_New*String cheap.

- the malloc (JS_malloc is trivial layering) cost can be amortized by allocating
two pools of string storage at once per JS component: one for the top-level
script strings that are used only by that script, the second for the strings
used by functions declared, stated, or expressed in that script.  It may not pay
to have both, if top-level scripts use few or no strings exclusively.

/be
Comment 12 Mike Shaver (:shaver -- probably not reading bugmail closely) 2005-10-19 13:48:18 PDT
Comment on attachment 199193 [details] [diff] [review]
fastload js components

r=shaver, thanks to bryner for patience, cleanup, and IRC clues.
Comment 13 Brendan Eich [:brendan] 2005-10-19 22:56:32 PDT
Comment on attachment 199193 [details] [diff] [review]
fastload js components

Still looks good, but how about a followup bug (don't give it to me, but do cc:
me on it -- it'll be good intern fodder) to share a bunch of code, at least:

ReadScriptFromStream
WriteScriptToStream
mozJSComponentLoader::StartFastLoad

with the XUL content fastload code (and uplift its "fastload session"
management with the righteous timer thing).

/be
Comment 14 Frederic Bezies 2005-10-21 07:39:18 PDT
Sorry to spam, but is this bug related to both bug 313268 (which kill Chatzilla)
and bug 313262 (which kills restart of Thunderbird) ?
Comment 15 Sergei Dolgov 2005-10-22 10:32:32 PDT
BeOS build bustage:
ComponentLoader.cpp: In method `nsresult
mozJSComponentLoader::StartFastLoad(nsIFastLoadService *)':
/mozbuild/home/mozbone/2005-10-22-05-trunk/mozilla/js/src/xpconnect/loader/mozJSComponentLoader.cpp:1234:
assuming & on overloaded member function 
Comment 16 Sergei Dolgov 2005-10-22 10:49:57 PDT
If this matters, compiler version on BeOS is gcc 2.95.3
Comment 17 Sergei Dolgov 2005-10-22 10:55:28 PDT
Is it meant to be:
rv = mFastLoadTimer->InitWithFuncCallback(&(mozJSComponentLoader::CloseFastLoad),
???
In this form it compiles, though, I'm unsure if this is proper.
Comment 18 Sergei Dolgov 2005-10-22 14:17:53 PDT
version 1.111 
of mozJSComponentLoader.cpp with
rv = mFastLoadTimer->InitWithFuncCallback(CloseFastLoad,

compiles successfully at BeOS.

So "fix" for btek bustage broke BeOS building.
Comment 19 Brian Ryner (not reading) 2005-10-22 15:56:20 PDT
(In reply to comment #17)
> Is it meant to be:
> rv = mFastLoadTimer->InitWithFuncCallback(&(mozJSComponentLoader::CloseFastLoad),
> ???
> In this form it compiles, though, I'm unsure if this is proper.

I checked in this fix.  Thanks.
Comment 20 Matt 2005-10-23 21:18:38 PDT
I think this screws up Adblock+ randomly.  Is this a problem on the patch's part or something wrong with the Adblock+ code itself?

Mozilla/5.0 (X11; U; Linux i686; en-GB; rv:1.9a1) Gecko/20051023 Firefox/1.6a1 ID:2005102313
Comment 21 Boris Zbarsky [:bz] 2005-10-23 22:07:10 PDT
Since this landed, the XUL filepicker has often not come up the first time it should after startup.  Also, I've been having issues with random components (eg the JS context stack iterator) not being created properly until I clobber...
Comment 22 Igor Bukanov 2005-12-09 12:08:48 PST
(In reply to comment #11)
> So bryner and I talked on IRC and there are two ideas:
> 
> - bug 312238 (bryner filed just now, thanks!) is for optimizing away the
> rt->gcLock cost, to make JS_New*String cheap.
> 
> - the malloc (JS_malloc is trivial layering) cost can be amortized by allocating
> two pools of string storage at once per JS component: one for the top-level
> script strings that are used only by that script, the second for the strings
> used by functions declared, stated, or expressed in that script.  It may not pay
> to have both, if top-level scripts use few or no strings exclusively.
> 

But what about storing all the string characters in a special pool while loading   the script, then creating one single JSString that contains chars from all the strings, then creating all necessary JSString instances as dependancies of the main string and then patching script structures to use the dependant strings. This would reduce malloc costs to just few allocations and all the dependancy strings can be allocated with single GC lock.

Comments?
Comment 23 Brian Ryner (not reading) 2006-02-27 22:21:06 PST
Marking as fixed, we should track additional speedups separately.
Comment 24 Brian Ryner (not reading) 2006-05-15 11:00:14 PDT
Created attachment 222063 [details] [diff] [review]
combined branch patch

This includes the memory mapped I/O patch, the fastload patch, and all follow-up fixes.
Comment 25 Mike Shaver (:shaver -- probably not reading bugmail closely) 2006-05-16 00:42:00 PDT
Comment on attachment 222063 [details] [diff] [review]
combined branch patch

Heck yeah.  181=shaver, thanks for driving this.
Comment 26 Brian Ryner (not reading) 2006-05-16 11:30:06 PDT
checked in on branch, thanks shaver.
Comment 27 Myk Melez [:myk] [@mykmelez] 2006-05-22 17:44:44 PDT
As expected, Ts numbers for branch tinderbox show a significant win for this patch on the branch:

atlantia
  before: 1669, 1658, 1665
   after: 1803, 1602, 1608

sparky
  before: 3386, 3385
   after: 3025, 3022, 3015

pacifica
  before:  718,  734,  716
   after:  640,  641,  640
Comment 28 Steve Snyder 2006-06-04 09:08:52 PDT
Would this patch have any benefit for Seamonkey?  It doesn't use nsExtensionManager.js, but does read several other JS files at startup.


Comment 29 Dietrich Ayala (:dietrich) 2009-07-15 09:09:21 PDT
(In reply to comment #22)
> (In reply to comment #11)
> > - the malloc (JS_malloc is trivial layering) cost can be amortized by allocating
> > two pools of string storage at once per JS component: one for the top-level
> > script strings that are used only by that script, the second for the strings
> > used by functions declared, stated, or expressed in that script.  It may not pay
> > to have both, if top-level scripts use few or no strings exclusively.
> > 
> 
> But what about storing all the string characters in a special pool while
> loading   the script, then creating one single JSString that contains chars
> from all the strings, then creating all necessary JSString instances as
> dependancies of the main string and then patching script structures to use the
> dependant strings. This would reduce malloc costs to just few allocations and
> all the dependancy strings can be allocated with single GC lock.
> 
> Comments?

Igor, Brendan: Was a followup bug ever filed for these ideas? Are they worth looking into?

Note You need to log in before you can comment on or make changes to this bug.