Bugzilla

Comment 1

•

20 years ago

This is not a JS engine bug, SpiderMonkey knows nothing about an HTTP-conforming
cache, Necko's or some other embedding's.

Trying DOM, but perhaps DOM level 0 is better.

/be

Component: JavaScript Engine → DOM

timeless

Updated

•

20 years ago

Assignee: general → general

QA Contact: general → ian

Comment 3

•

16 years ago

brendan, ideally we'd be able to cache the compiled script... except that we need to be able to change the principals if the script is loaded from a different page.  I suppose we could force the cache to be per-domain...

This came up today with gmail: about 20% of the gmail pageload time is under js_CompileScript.

Comment 4

•

16 years ago

How much of this is under eval? See bug 454184.

/be

Comment 5

•

16 years ago

Also, I hope the code being compiled is only ever loaded from one origin. Explicit load balancing hostname-1 through hostname-n.foo.com is so 1994. Erik, is gmail loading from a fixed origin?

/be

Comment 6

•

16 years ago

None.  It's directly under JS_EvaluateUCScriptForPrincipals.  I'd say it's an inline script, but it might have an @src and just have gotten prefetched...

Comment 7

•

16 years ago

A couple of notes.

JS API exists to precompile and (re-)execute scripts, used by XUL and XBL. The DOM could use this, but it would lose some JSOPTION_COMPILE_N_GO optimizations, and it would need to index the cache by a key that includes the origin or other ingredients of the principals compiled into the script.

We could add new API to "re-link" against different global objects and principals to relieve the cache from having to worry about this, at some cost in re-execute performance (probably small).

Making the cache data-associative rather than name-associative would be good for all those Ajax library copies (uncrunched, or crunched the same way) that are not yet on the Google or Yahoo! CDN. The idea is to hash the script as it loads, save the hash and any other hints in the compiled-script cache, and if it reloads from anywhere with the same hash and other hings (need SHA-1 or better, not MD5), use the cached bytecode.

Sites do have to reload the same script when they cross a domain boundary, I'm told (metaweb.com pals). So this helps intentional replication across origins as well as all the organic .js file copy propagation in the field. With the latter kind of copying, we don't know (at least I know of no studies) how often the copy is mutated or re-crunched.

All of this is do-able and would make a good *not*-my-first-bug project for some game hacker.

/be

Comment 8

•

16 years ago

> The DOM could use this, but it would lose some JSOPTION_COMPILE_N_GO
> optimizations

Details?  Would doing this only for scripts over a certain size make sense?  Or are the optimizations not constant-time wins?

Hashing by data would be a win in terms of not having to hook into the necko cache and such.  Just get the data from necko, however it gets it, then look in or private cache.  I assume said private cache could be fastloaded if we wanted to persist it across browser invocations...  That said, would fastloading a script (from disk, with attendant disk access) be any win over compiling an in-memory string if you already have the latter?  It's not obvious.  Could use data!

We do have SHA-1 via NSS, of course.  The only question there is performance, again.

Erik Arvidsson

Reporter

Comment 9

•

16 years ago

(In reply to comment #5)
> Also, I hope the code being compiled is only ever loaded from one origin.
> Explicit load balancing hostname-1 through hostname-n.foo.com is so 1994. Erik,
> is gmail loading from a fixed origin?
> 
> /be

Gmail loads all of its JS from the same domain. Gmail is a bit of a special case though. We load all the js through an html file in an iframe to be able to do the loading progress bar. If we could get a 20% speedup by using a single js file again we would of course look into other ways of providing user feedback during page load.

Phil Ringnalda (:philor)

Updated

•

15 years ago

Assignee: general → nobody

QA Contact: ian → general

Chris Leary [:cdleary] (not checking bugmail)

Updated

•

15 years ago

Alias: js-ccache

Summary: Cache parsed javascripts → Cache compiled javascripts

Comment 11

•

15 years ago

sayrer reference the following article: http://www.yuiblog.com/blog/2007/01/04/performance-research-part-2/

Mike Shaver (:shaver emeritus)

Comment 12

•

14 years ago

Attached patch Adds the ability to attach a binary blob of data as metadata to a cache entry. (obsolete) — Details — Splinter Review

As it is now, the cache metadata facility cannot be used to store serialized bytecode as it requires the both the key and value to be null terminated strings. This patch changes the value so that it is a (length, blob) pair instead of a null terminated string. To avoid API breakage, the existing get and set functions are kept as expecting a null terminated value, and the new functionality is exposed through the Get...Len and Set...Len functions.

This is a pre-requisite for being able to attach XDR serialized bytecode to cache entries via the existing metadata facility.

Comment 13

•

14 years ago

API breakage seems permissible here, since it's mechanically easy to work through and is less likely to result in data loss.

Mike Shaver (:shaver emeritus)

Comment 14

•

14 years ago

Note that we could also store base64-encoded xdr if we want to keep it null-free.  We do this for other binary blobs we need to stick in cache metadata (e.g. security info).  But that has obvious performance costs; I'm just as happy to have a proper blob api.

Comment 15

•

14 years ago

Make it a proper API! Be bold! Live the adventure! (Get me to sr it.)

Updated

•

14 years ago

Attachment #454605 - Flags: review?(michal.novotny)

Comment 16

•

14 years ago

Attached patch metadata patch v2 (obsolete) — Details — Splinter Review

The metadata visitor was broken as-is because it was expecting all values to be null terminated strings. I've added a Len suffixed version of the visitor function, as I suspect that a user would want to handle the display of strings and binary data differently.

Attachment #454605 - Attachment is obsolete: true

Attachment #455709 - Flags: review?

Attachment #454605 - Flags: review?(michal.novotny)

Updated

•

14 years ago

Attachment #455709 - Flags: review? → review?(michal.novotny)

Comment 17

•

14 years ago

Attached patch xpcom io streams v1 (obsolete) — Details — Splinter Review

Two new xpcom io streams needed to be added that would stream into and out of a buffer.

Serializing a script that has principals via XDR causes the XDR code to expect there to be a nsIObject{Input,Output}Stream as the xdr->userdata. As I want all of the serialized data to be dumped to a buffer in memory, this required writing some custom nsI{Input,Output}Streams.

Attachment #460722 - Flags: feedback?

Updated

•

14 years ago

Attachment #460722 - Flags: feedback? → feedback?(jwalden+bmo)

Kyle Huey (Exited; not receiving bugmail, old account, do not use)

Comment 18

•

14 years ago

Attached patch js bytecode caching v1 (obsolete) — Details — Splinter Review

This is the current status of the caching compiled js. It is no where near landable quality, contains some form of debug logging about every 3 lines, and I haven't remotely bothered to apply style clean-ups to it. That said, if anyone looks and sees massive issues please be loud about it.

For reasons of simplicity, it currently only handles external js files (anything that matches "http.*\.js"). If anyone has a good idea how to uniquely identify a piece of script within a webpage based on any of the information that's available in the parameters, let me know.

It does work though, and I can generate perf numbers for anyone that would be interested.
As a random example, loading jquery off of mozilla.org resulted in:
  No caching of bytecode: 74ms (this is how it is now)
  With caching of bytecode:
    not in cache:
        compile: 45ms
        execute: 40ms
        saving: 6ms
      Total: 91ms
    in cache:
        loading: 15ms
        running: 40ms
      Total: 55ms

The one-time cost of cashing is noticeable on something like gmail, so I wouldn't recommended it to always be enabled. On the other hand, the effect of having much of gmail cached is equally noticeable, especially in terms of how long it takes gmail to get past its initial loading bar screen.

Updated

•

14 years ago

Assignee: nobody → amiller

Comment 19

•

14 years ago

> Two new xpcom io streams needed to be added 

We already have an input stream that reads from a byte buffer: nsIStringInputStream (admittedly not all that well-named).  Was there a reason not to use it?

For the output stream, you could create a new class (though it would need to be a lot more careful than the current one about various things, actually implement the stream API, etc), or you could use a pipe, pass the output end to the serializer and then ReadSegments the result into your string, doing only one allocation based on the (now known) length.  Either way would work, I think.

Comment 20

•

14 years ago

Another question: 45 + 40 > 74, right?  Is that due to lack of COMPILE_N_GO optimizations, or something else?

Mike Shaver (:shaver emeritus)

Comment 21

•

14 years ago

(In reply to comment #19)
> We already have an input stream that reads from a byte buffer:
> nsIStringInputStream (admittedly not all that well-named).  Was there a reason
> not to use it?

Not really... I was unaware of its existence and I didn't look into it in my quick glance over xpcom/io/ due to its name.


(In reply to comment #20)
> Another question: 45 + 40 > 74, right?  Is that due to lack of COMPILE_N_GO
> optimizations, or something else?

Yes. The main difference in the code is that COMPILE_N_GO is missing. A few people have mentioned the possibility of a 'linking' phase that binds a script to a global object so that we can get the COMPILE_N_GO optimizations back, but I'm not aware of that code existing yet.

Jeff Walden [:Waldo]

Comment 22

•

14 years ago

Comment on attachment 460722 [details] [diff] [review]
xpcom io streams v1

I think bz is probably right about reusing nsStringInputStream here.  The older class certainly does more, but it's about the right fit for what's being done here (its not-quite-Panglossian naming aside).

A few brief comments on what you did write, for edification:

>diff -r a4a0c84f62d3 xpcom/io/nsIMemoryInputStream.idl

>+ * Portions created by the Initial Developer are Copyright (C) 2001

There's a license boilerplate page on mozilla.org somewhere, easily findable, makes it harder to avoid mistakes like the above copyright year.


>+[scriptable, uuid(64356aa1-bfb1-4241-bfe6-c5d1520df1c0)]
>+interface nsIMemoryInputStream : nsIInputStream
>+{
>+    [noscript] void setBuffer(in string value, in PRUint32 len);
>+};

Interfaces containing only unscriptable methods generally shouldn't be scriptable themselves.  :-)


>diff -r a4a0c84f62d3 xpcom/io/nsMemoryStream.cpp

>+nsMemoryOutputStream::nsMemoryOutputStream()
>+{
>+    mBuffer = (char*)malloc( STARTING_BUFFER_SIZE );

When writing these sorts of things for XPCOM, you need to be aware of allocator/deallocator mismatch issues -- try to allocate something with Mozilla's malloc implementation [jemalloc] and free it with msvcrt80.dll!free and things will go wrong.  NS_alloc and NS_free (I think there's NS_realloc, but I'm not sure) are the proper XPCOM idioms for allocating/freeing raw memory.


>+    if (aCount + mBufferLen > mMaxSize)
>+    {
>+        //FIXME: If (aCount+mBufferLen) has the highest bit set, then mMaxSize will overflow when
>+        //       trying to find the next power-of-two that's larger. As this would require >2GB
>+        //       of data to be sent into the output stream, I'm not particularly worried, but
>+        //       should I be accounting for it possibly happening anyway?

I consider paranoia a virtue when working on code whose input may be (even partially) determined by malicious third parties.  Unlikely, near-impossible, sure, but it's still an assumption waiting to be violated.


>+    nsresult rv = aWriter(this, aClosure, mBuffer, 0, toConsume, _retval);
>+    //FIXME: if the writer failed, should we still count any bytes it might have consumed,
>+    //       or should we skip over these next two lines?
>+    mBuffer += *_retval;
>+    mBufferLeft -= *_retval;
>+    return rv;

Out parameters are garbage when the method being called fails, so you'd skip in cases like this.

Attachment #460722 - Flags: feedback?(jwalden+bmo)

Comment 23

•

14 years ago

(In reply to comment #22)
> >+[scriptable, uuid(64356aa1-bfb1-4241-bfe6-c5d1520df1c0)]
> >+interface nsIMemoryInputStream : nsIInputStream
> >+{
> >+    [noscript] void setBuffer(in string value, in PRUint32 len);
> >+};
> 
> Interfaces containing only unscriptable methods generally shouldn't be
> scriptable themselves.  :-)

Disagree: we may still want script to be able to pass them around, even if they don't have any more methods in the most-derived interface than in the parent.  For streams it's pretty common for script to create-and-hand-off, without calling methods on them directly.

Jeff Walden [:Waldo]

Comment 24

•

14 years ago

Perhaps.  I still think it stands as a general rule, with streams being a reasonable exception.

Comment 25

•

14 years ago

Attached patch metadata patch v3 — Details — Splinter Review

Fixed a bug where an out-param wasn't being assigned to.

Attachment #455709 - Attachment is obsolete: true

Attachment #463373 - Flags: review?

Attachment #455709 - Flags: review?(michal.novotny)

Updated

•

14 years ago

Attachment #463373 - Flags: review? → review?(michal.novotny)

Comment 26

•

14 years ago

Attached patch js bytecode caching v2 — Details — Splinter Review

Many updates: (If you're a tl;dr'er, just read the first sentence of each bullet.)

* All review comments and known issues have been fixed. The entire xpcom io streams patch has died. RIP. I've tried to do all of the necessary cleanups and error checking. As usual, if you see any issues in the code, let me know.

* The code should now detect if the cached entry has been updated and update the corresponding compiled version so that the two stay in sync. However, there's a slight issue. Whenever we update the metadata, the Last Modified time on the cache entry gets updated. What I'd like to do is attach the time that the time that the cache entry was last updated to the cache entry as metadata. So if I record the modification time, and set it as a metadata on the cache entry, this will cause the Last Modified time on the cache entry to be changed. This produces a chicken-and-the-egg situation. My current solution is to record the last modified time as the cache entry's last modified time plus a small constant so that it will always be slightly larger (and thus newer) than the real last modified time. This feels VERY hack-ish, but I don't know of a better way to do it. Suggestions welcome.

* COMPILE_N_GO is on for the initial run. For giggles, I enabled it while compiling the script. XDR'ing still works just fine, and the XDR'd script can be loaded and run again (although presumably without the COMPILE_N_GO optimizations applied). I compared the time of parsing and running with COMPILE_N_GO enabled and disabled, and couldn't find any performance difference (at least on the order of milliseconds).

* XDR'd scripts can be massive. For your average script, the XDR'd equivalent is on the order of 1x to 2x larger. However, for your large, industrial-grade, minified js files, it hurts badly. A 55k minified js file bloats up to 240k when XDR'd. A 125k js file pulled in off of facebook's cdn bloated up to 750k when XDR'd. It's still unknown what effects this will have on the current network cache code, as none of the metadata code was written with the expectation that it would be heavily used or contain large amounts of data. (The only current uses of the metadata are holding if the http was a GET or POST request and holding onto the http headers. This sums to less than 500 bytes.)

* I can no longer reproduce getting ~74ms with all of my code disabled. I've gone back through and re-done all of my perf measurements. Apparently I shouldn't be comparing perf data I measured from the beginning of my internship with perf data I'm getting with my patches applied to a current m-c. A run from current m-c tip results in EvaluateString() taking 83ms to compile and run jQuery (this is _with_ COMPILE_N_GO enabled).
That said, this means that the overhead of caching bytecode is fairly minimal. The main overhead is saving the bytecode, which is done _after_ we've run the script. Hopefully this should keep the amount of user-noticeable page load time differences to a minimum. The current code seems to be able to execute multiple loaded scripts in parallel (I've seen printf's from different runs of the function overlapping), so I'm staying hopeful that the extra time shouldn't be that bad. If anyone knows a way to measure page load time in a way that's more concrete than "it seems faster/slower", please speak up.

Attachment #460722 - Attachment is obsolete: true

Attachment #460731 - Attachment is obsolete: true