optimize js component loading

RESOLVED FIXED

Status

()

Core
XPConnect
RESOLVED FIXED
12 years ago
8 years ago

People

(Reporter: Brian Ryner (not reading), Assigned: Brian Ryner (not reading))

Tracking

({fixed1.8.1, perf})

Trunk
fixed1.8.1, perf
Points:
---
Dependency tree / graph

Firefox Tracking Flags

(Not tracked)

Details

Attachments

(4 attachments)

(Assignee)

Description

12 years ago
Recent profiling data suggests we're spending a fair amount of firefox startup
time (on Mac at least) loading and compiling nsExtensionManager.js.  At a
minimum, we can optimize the load by reading the whole file at once into a
buffer; ideally we should be able to fastload js components.
(Assignee)

Comment 1

12 years ago
Created attachment 198274 [details] [diff] [review]
use memory-mapped files for compiling js components

This seems to help quite a bit, 20% or so according to Quantify.  I had to
leave the old codepath because BeOS and OS/2 don't support PR_MemMap (those
were the only ones I could find from looking through NSPR -- and runtime
detection seems quite inefficient).  I also didn't bother dealing with files
larger than 4GB -- if we have one of those in the components directory, I think
we're going to have problems.
Assignee: shaver → bryner
Status: NEW → ASSIGNED
Attachment #198274 - Flags: superreview?(jst)
Attachment #198274 - Flags: review?(shaver)
(Assignee)

Comment 2

12 years ago
Oh, also... I'd originally implemented this by just malloc'ing a buffer and 
reading the file into it.  That gives a healthy speedup as well, but using 
mmap seems to be a bit faster.
Comment on attachment 198274 [details] [diff] [review]
use memory-mapped files for compiling js components

woo! r=shaver
Attachment #198274 - Flags: review?(shaver) → review+
Comment on attachment 198274 [details] [diff] [review]
use memory-mapped files for compiling js components

sr=jst
Attachment #198274 - Flags: superreview?(jst) → superreview+
(Assignee)

Comment 5

12 years ago
Created attachment 199193 [details] [diff] [review]
fastload js components

This patch uses the FastLoadService to fastload js components.	Interesting
bits:

- It pushes and pops the stream state on the FastLoadService so that it doesn't
interfere with ongoing XUL fastloads.  I couldn't really come up with a good
model for sharing the fastload file with XUL, and I don't think it would be a
big advantage anyway.  I also looked at changing the FLS into something
non-singleton so that swapping out the stream pointers wouldn't be necessary,
but the FastLoadPtr stuff made that rather hairy, and I don't think it's a big
deal.

- There's lots of cleanup of the JSComponentLoader here that's not 100% related
to fastload.  I quickly got tired of dealing with the goto-based code to handle
errors so I cleaned it up in favor of stack-based objects to clean up state,
and also improved the propagation of errors significantly.

- Because JS components can be loaded before the profile-change notification,
the ProfD/ProfLD directory service key won't give access to the profile
directory when we need it.  I created a new key that's implemented by
toolkit/xre, per bsmedberg's advise, and use that, falling back to the other
key if it fails.  The upshot is that for apps that don't implement "ProfLDS",
but implement "ProfD" or "ProfLD", the fastload file may not be written out
during component registration, but instead will be written on the next startup.
 This should fail gracefully for apps that don't use a profile directory, too.

- Yes, the nsIBinaryInputStream change is related; I ran across this bogus
header dependency while working on this, and implemented this fix that Darin
suggested.
Attachment #199193 - Flags: superreview?(brendan)
Attachment #199193 - Flags: review?(shaver)
(Assignee)

Comment 6

12 years ago
Created attachment 199194 [details] [diff] [review]
fastload js components (diff -w)
Comment on attachment 199193 [details] [diff] [review]
fastload js components

Cool, skimmed and it looks good.  May read more closely in a bit.  What kind of
speedup does this give on top of the mmap one?

/be
(Assignee)

Comment 8

12 years ago
Seems to be about 4% on Windows Firefox.  As I mentioned to you on irc, most of
the time spent in GlobalForLocation with this patch is directly attributable to
JS_XDRScript, and a huge chunk of that comes from memory allocations in
JS_XDRString.
(In reply to comment #8)
> Seems to be about 4% on Windows Firefox.  As I mentioned to you on irc, most of
> the time spent in GlobalForLocation with this patch is directly attributable to
> JS_XDRScript, and a huge chunk of that comes from memory allocations in
> JS_XDRString.

That was by email (at least, the message I read was), and I suggested a few
things to look into:
- Fixing bug 104170, bug 107907, and bug 195010.
- Profiling to see whether the JS_malloc layering was costly (prolly not).
- The rt->gcLock cost serializing JSString allocation.
Any profiling detail under JS_XDRString?

/be
(Assignee)

Comment 10

12 years ago
(In reply to comment #9)
> Any profiling detail under JS_XDRString?

Here's the distribution of time, from Quantify:

JS_malloc       55.87%
  malloc          98.35%

JS_NewUCString  34.91%
  js_NewString    98.46%
    js_NewGCThing   96.87%
      PR_Unlock       49.06%
      PR_Lock         36.39%
      gc_new_arena     1.97%
      js_GetGCThingFlags 1.19%
      js_PushLocalRoot 0.03%
  
JS_XDRUint32     1.61%
mem_raw          1.01%
(Assignee)

Updated

12 years ago
Depends on: 312238
So bryner and I talked on IRC and there are two ideas:

- bug 312238 (bryner filed just now, thanks!) is for optimizing away the
rt->gcLock cost, to make JS_New*String cheap.

- the malloc (JS_malloc is trivial layering) cost can be amortized by allocating
two pools of string storage at once per JS component: one for the top-level
script strings that are used only by that script, the second for the strings
used by functions declared, stated, or expressed in that script.  It may not pay
to have both, if top-level scripts use few or no strings exclusively.

/be
Comment on attachment 199193 [details] [diff] [review]
fastload js components

r=shaver, thanks to bryner for patience, cleanup, and IRC clues.
Attachment #199193 - Flags: review?(shaver) → review+
Comment on attachment 199193 [details] [diff] [review]
fastload js components

Still looks good, but how about a followup bug (don't give it to me, but do cc:
me on it -- it'll be good intern fodder) to share a bunch of code, at least:

ReadScriptFromStream
WriteScriptToStream
mozJSComponentLoader::StartFastLoad

with the XUL content fastload code (and uplift its "fastload session"
management with the righteous timer thing).

/be
Attachment #199193 - Flags: superreview?(brendan) → superreview+

Comment 14

12 years ago
Sorry to spam, but is this bug related to both bug 313268 (which kill Chatzilla)
and bug 313262 (which kills restart of Thunderbird) ?

Updated

12 years ago
Depends on: 313268

Comment 15

12 years ago
BeOS build bustage:
ComponentLoader.cpp: In method `nsresult
mozJSComponentLoader::StartFastLoad(nsIFastLoadService *)':
/mozbuild/home/mozbone/2005-10-22-05-trunk/mozilla/js/src/xpconnect/loader/mozJSComponentLoader.cpp:1234:
assuming & on overloaded member function 

Comment 16

12 years ago
If this matters, compiler version on BeOS is gcc 2.95.3

Comment 17

12 years ago
Is it meant to be:
rv = mFastLoadTimer->InitWithFuncCallback(&(mozJSComponentLoader::CloseFastLoad),
???
In this form it compiles, though, I'm unsure if this is proper.

Comment 18

12 years ago
version 1.111 
of mozJSComponentLoader.cpp with
rv = mFastLoadTimer->InitWithFuncCallback(CloseFastLoad,

compiles successfully at BeOS.

So "fix" for btek bustage broke BeOS building.
(Assignee)

Comment 19

12 years ago
(In reply to comment #17)
> Is it meant to be:
> rv = mFastLoadTimer->InitWithFuncCallback(&(mozJSComponentLoader::CloseFastLoad),
> ???
> In this form it compiles, though, I'm unsure if this is proper.

I checked in this fix.  Thanks.

Comment 20

12 years ago
I think this screws up Adblock+ randomly.  Is this a problem on the patch's part or something wrong with the Adblock+ code itself?

Mozilla/5.0 (X11; U; Linux i686; en-GB; rv:1.9a1) Gecko/20051023 Firefox/1.6a1 ID:2005102313
Since this landed, the XUL filepicker has often not come up the first time it should after startup.  Also, I've been having issues with random components (eg the JS context stack iterator) not being created properly until I clobber...
Depends on: 313262
Depends on: 313610
Depends on: 313612
Depends on: 313614

Comment 22

12 years ago
(In reply to comment #11)
> So bryner and I talked on IRC and there are two ideas:
> 
> - bug 312238 (bryner filed just now, thanks!) is for optimizing away the
> rt->gcLock cost, to make JS_New*String cheap.
> 
> - the malloc (JS_malloc is trivial layering) cost can be amortized by allocating
> two pools of string storage at once per JS component: one for the top-level
> script strings that are used only by that script, the second for the strings
> used by functions declared, stated, or expressed in that script.  It may not pay
> to have both, if top-level scripts use few or no strings exclusively.
> 

But what about storing all the string characters in a special pool while loading   the script, then creating one single JSString that contains chars from all the strings, then creating all necessary JSString instances as dependancies of the main string and then patching script structures to use the dependant strings. This would reduce malloc costs to just few allocations and all the dependancy strings can be allocated with single GC lock.

Comments?

Updated

11 years ago
Depends on: 321985
(Assignee)

Comment 23

11 years ago
Marking as fixed, we should track additional speedups separately.
Status: ASSIGNED → RESOLVED
Last Resolved: 11 years ago
Resolution: --- → FIXED
(Assignee)

Comment 24

11 years ago
Created attachment 222063 [details] [diff] [review]
combined branch patch

This includes the memory mapped I/O patch, the fastload patch, and all follow-up fixes.
Attachment #222063 - Flags: approval-branch-1.8.1?(shaver)
Comment on attachment 222063 [details] [diff] [review]
combined branch patch

Heck yeah.  181=shaver, thanks for driving this.
Attachment #222063 - Flags: approval-branch-1.8.1?(shaver) → approval-branch-1.8.1+
(Assignee)

Comment 26

11 years ago
checked in on branch, thanks shaver.
Keywords: fixed1.8.1
As expected, Ts numbers for branch tinderbox show a significant win for this patch on the branch:

atlantia
  before: 1669, 1658, 1665
   after: 1803, 1602, 1608

sparky
  before: 3386, 3385
   after: 3025, 3022, 3015

pacifica
  before:  718,  734,  716
   after:  640,  641,  640

Comment 28

11 years ago
Would this patch have any benefit for Seamonkey?  It doesn't use nsExtensionManager.js, but does read several other JS files at startup.


(In reply to comment #22)
> (In reply to comment #11)
> > - the malloc (JS_malloc is trivial layering) cost can be amortized by allocating
> > two pools of string storage at once per JS component: one for the top-level
> > script strings that are used only by that script, the second for the strings
> > used by functions declared, stated, or expressed in that script.  It may not pay
> > to have both, if top-level scripts use few or no strings exclusively.
> > 
> 
> But what about storing all the string characters in a special pool while
> loading   the script, then creating one single JSString that contains chars
> from all the strings, then creating all necessary JSString instances as
> dependancies of the main string and then patching script structures to use the
> dependant strings. This would reduce malloc costs to just few allocations and
> all the dependancy strings can be allocated with single GC lock.
> 
> Comments?

Igor, Brendan: Was a followup bug ever filed for these ideas? Are they worth looking into?
You need to log in before you can comment on or make changes to this bug.