Closed Bug 883154 Opened 11 years ago Closed 11 years ago

Add runtime wide cache for compiled lazy scripts

Categories

(Core :: JavaScript Engine, defect)

Platform: x86 macOS
Priority: Not set
Severity: normal

Tracking

Status: RESOLVED FIXED
Target Milestone: mozilla24

People

(Reporter: bhackett1024, Assigned: bhackett1024)

Attachments

(1 file, 1 obsolete file)

Folks accessing the same site repeatedly via multiple tabs or navigation will often load the same scripts over and over.  It would be good to avoid repeatedly reparsing these.  With bug 678037 we don't parse these scripts until their functions first execute, so a good way to avoid redundant parsing would be to check when compiling lazy scripts.  This is preferable to checking for redundant scripts when the source is first encountered, as a LazyScript still uses less memory than a JSScript which shares its bytecode with other scripts.

Outline:

- Keep a table in the runtime of all scripts which were compiled from lazy scripts.
- When compiling a lazy script with no inner functions, do a table lookup for a script with the same source location (filename, lineno, column, source begin/end offsets), with a memcmp for an exact match if found (sketched below).
- On an exact match, clone the found script into the new compartment/zone.
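
For illustration, a minimal sketch of what such a lookup key could look like (type and field names here are hypothetical, not the patch's actual types):

  #include <cstdint>
  #include <cstring>

  // Hypothetical sketch of the cache key: a hit requires matching source
  // coordinates plus a byte-for-byte comparison of the function's source.
  struct LazyScriptKey {
      const char *filename;
      uint32_t lineno;
      uint32_t column;
      uint32_t begin;   // offsets of the function within its source
      uint32_t end;

      bool matches(const LazyScriptKey &other,
                   const char16_t *chars, const char16_t *otherChars) const {
          if (lineno != other.lineno || column != other.column ||
              begin != other.begin || end != other.end ||
              strcmp(filename, other.filename) != 0)
              return false;
          // memcmp for the exact match described above.
          return memcmp(chars + begin, otherChars + begin,
                        (end - begin) * sizeof(char16_t)) == 0;
      }
  };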

This is especially important for SS (SunSpider).  After bug 678037 we now spend much of the parse time on SS (and other benchmarks, though SS runs so short that it shows up the most there) while timers are running, so the parse time is visible in the benchmark result.  SS runs the same scripts over and over, and will benefit from caching lazy compilations.
Attached patch patch (ce43d28276e4) (obsolete) — Splinter Review
This adds a cache to the runtime per the above outline.  The lazy script cache is fixed size, with newer entries evicting older ones, similar to the property cache.  Unlike the property cache, each entry can go in one of several buckets, and the bucket to evict from is chosen via LRU.

This design was necessary to get the cache behaving well on the sites I tested: the SS benchmark and techcrunch (hitting the like/etc. overlays will load lots of similar inline frames).
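
A hedged sketch of this multi-bucket LRU scheme (illustrative only; the patch's actual table differs in detail, and HashPolicy here is an assumed interface providing Lookup, hash(), and match()):

  #include <cstddef>
  #include <cstdint>

  // Fixed-capacity cache: an entry may live in any of NumHashes candidate
  // buckets; insertion evicts the least recently used candidate.
  template <typename T, typename HashPolicy, size_t Capacity, size_t NumHashes>
  class FixedSizeCacheSketch {
      T entries[Capacity] = {};         // T assumed default-constructible
      uint64_t lastUse[Capacity] = {};  // LRU timestamp per bucket
      uint64_t clock = 0;

    public:
      bool lookup(const typename HashPolicy::Lookup &l, T *out) {
          for (size_t i = 0; i < NumHashes; i++) {
              size_t b = HashPolicy::hash(l, i) % Capacity;
              if (HashPolicy::match(entries[b], l)) {
                  lastUse[b] = ++clock;
                  *out = entries[b];
                  return true;
              }
          }
          return false;
      }

      void insert(const typename HashPolicy::Lookup &l, const T &entry) {
          // Choose the least recently used of the candidate buckets.
          size_t victim = HashPolicy::hash(l, 0) % Capacity;
          for (size_t i = 1; i < NumHashes; i++) {
              size_t b = HashPolicy::hash(l, i) % Capacity;
              if (lastUse[b] < lastUse[victim])
                  victim = b;
          }
          entries[victim] = entry;
          lastUse[victim] = ++clock;
      }
  };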

On a techcrunch session, for leaf lazy functions I get 1336 cache hits and 1733 misses, with 1527 different scripts being compiled.  So the cache is able to pick up the duplicates pretty well.

On SS, I get 1187 hits and 1165 misses, even though there are only 206 different scripts being compiled.  So the cache is much less effective than on techcrunch.  The problem here is that we usually GC between loads of the frames in the SS harness, and these GCs often clean out most entries in the cache (which are weak references).  It seems this could be fixed by making these entries strong references during these GCs.  Doing that is distasteful and I'm hoping that this patch by itself will be enough.  In bug 678037 comment 69 I measured lazy parsing time as being 1.7% of SS's run time in the shell, so this patch should cut that to less than 1%.
Attachment #762903 - Flags: review?(luke)
Doing that will also avoid consuming extra memory in future refactoring of Gaia applications.  To reclaim memory, they are looking into using iframes to isolate each panel even if they have to re-load the same scripts.
(In reply to Nicolas B. Pierron [:nbp] from comment #2)
> Doing that will also avoid consuming extra memory in future refactoring of
> Gaia applications.  To reclaim memory, they are looking into using iframes
> to isolate each panel even if they have to re-load the same scripts.

This patch won't help with memory usage; it just shortcuts things so that we can directly clone scripts instead of reparsing when reloading the same scripts (which should still help this iframe case).  The runtime-wide bytecode sharing we do and lazy parsing should help with memory usage a good deal in this iframe case.
Could you use a plain HashMap instead of adding a custom hash table?  We can measure, but it seems like neither the memory usage of the map nor the sweep time would be prohibitive if the table just kept every entry, without clobbering, and that would avoid the ad hoc heuristics needed to make SS get a hit every time (modulo GC throwing away scripts).

Another thought, based on what you are saying at the end of comment 1: the script cache could hold strong refs, but only for a fixed amount of time (or, to be slightly more deterministic, a fixed number of GCs).  Perhaps that would even give us a net win on SS?
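
A rough sketch of that idea (all names hypothetical; nothing like this is in the patch):

  #include <cstdint>

  struct JSScript;  // stand-in forward declaration for the sketch

  // Each entry remembers the GC number at insertion.  During sweeping,
  // entries younger than kStrongForGCs are traced as strong roots; older
  // ones stay weak and are dropped if their script died.
  struct CacheEntry {
      JSScript *script;
      uint64_t insertedAtGC;
  };

  static const uint64_t kStrongForGCs = 2;  // assumed tuning constant

  static bool keepStrong(const CacheEntry &e, uint64_t currentGC) {
      return currentGC - e.insertedAtGC < kStrongForGCs;
  }
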
(In reply to Luke Wagner [:luke] from comment #4)
> Could you use a plain HashMap instead of adding a custom hash table?
> We can measure, but it seems like neither the memory usage of the map nor
> the sweep time would be prohibitive if the table just kept every entry,
> without clobbering, and that would avoid the ad hoc heuristics needed to
> make SS get a hit every time (modulo GC throwing away scripts).

Well, Bill should have a better feel for this, but since the table is runtime-wide we need to sweep the entire thing non-incrementally on the main thread after every GC, even if the GC only works in one or a few zones.  For a user with many tabs open, the sweep time could be significant.  This patch is low risk and will preserve the memory gains of lazy parsing while giving a measurable improvement on real sites.  Anyhow:

- Not much is needed to make sure SS lookups hit, since its working set is small.  The reason we don't get more hits is that GCs clean out the tables, which won't improve by using a HashMap.

- It would be nice if we had a library for fixed-capacity caches, which are already used elsewhere in the engine (the soon-to-be-removed property cache, the new-object cache).  Adding that is not something that belongs in this bug, though.

- I don't feel that LRU and double hashing are particularly ad hoc techniques; they are generally useful for fixed-capacity caches.

> Another thought, based on what you are saying at the end of comment 1 is
> that we could have the script cache could have strong refs, but only for a
> fixed amount of time (or, to be slightly more deterministic, a fixed number
> of GCs).  Perhaps that would even give us a net win on SS?

I don't think we'll be able to see an improvement on SS vs. the pre-678037 state, but if we had a way of dealing with the scripts being removed from the cache when the SS harness runs then I think we could cut the regression to close to zero.
Per IRC discussion, this spins off the fixed-size hash table into a separate class, and removes scripts from the lazy script cache during finalization rather than in a separate phase.
Assignee: general → bhackett1024
Attachment #762903 - Attachment is obsolete: true
Attachment #762903 - Flags: review?(luke)
Attachment #764119 - Flags: review?(wmccloskey)
Comment on attachment 764119 [details] [diff] [review]
add FixedSizeHashSet utility class (ce43d28276e4)

Review of attachment 764119 [details] [diff] [review]:
-----------------------------------------------------------------

::: js/src/ds/FixedSizeHash.h
@@ +18,5 @@
> + * managed via LRU. For tables with a relatively small size, using different
> + * hashes increases utilization and makes it less likely that entries will keep
> + * evicting each other due to wanting to use the same bucket.
> + *
> + * T indicates the type of tree elements, HashPolicy must have the following

tree?

@@ +51,5 @@
> +    {
> +        JS_STATIC_ASSERT(Capacity > 0);
> +
> +        if (mozilla::IsPod<T>::value)
> +            mozilla::PodArrayZero(entries);

I believe that you can write the ctor this way:
  FixedSizeHashSet()
    : entries(),
      lastOperations(),
      numOperations(0)

This will use value initialization for the arrays. For non-POD types, the default constructor will be called. For POD-types, the array will get zero-initialized. Then you can avoid the PodArrayZero calls.
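
A small standalone illustration of the distinction (hypothetical types):

  #include <cstddef>

  struct Pod { int a; };                         // POD: value-init zeroes it
  struct NonPod { NonPod() : x(1) {} int x; };   // non-POD: default ctor runs

  template <typename T, size_t N>
  struct Table {
      T entries[N];
      Table() : entries() {}  // value-initializes the whole array
  };

  // Table<Pod, 4>().entries[0].a == 0; Table<NonPod, 4>().entries[0].x == 1.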

@@ +58,5 @@
> +        JS_ASSERT(HashPolicy::isCleared(entries[0]));
> +    }
> +
> +    bool lookup(const Lookup &lookup, T *pentry)
> +    {

The braces should be on the same line as the method name.

::: js/src/jit-test/tests/basic/lazyparse.js
@@ +34,5 @@
>      (function(x) { return !(x) })(0/0)
>    }
>  }
> +
> +testCatch(15);

Is this new test code relevant to the patch?

::: js/src/jsfun.cpp
@@ +1064,5 @@
>          fun->flags |= INTERPRETED;
>  
> +        JSScript *script;
> +
> +        if ((script = lazy->maybeScript())) {

Can you just initialize script in the definition?

@@ +1090,5 @@
> +        }
> +
> +        if (script) {
> +            // Trigger a barrier when reading entries from the cache.
> +            JSScript::readBarrier(script);

I'm pretty sure this is insufficient. It covers the case where we resurrect a dead script during marking, but not during sweeping. We could have a situation like this:
  1. marking finishes, |script| wasn't marked, begin sweeping scripts incrementally
  2. find |script| in the cache and clone it, which takes a reference to some of the objects stored inside it, which may also be dead
  3. finalize |script| and remove it from the table

I think we should just add an IsIncrementalGCInProgress test to the !lazy->numInnerFunctions() test. That's a bit conservative, since we're only concerned about the sweep phase. But I'd rather not expose mark versus sweep outside the GC. Also, if you do this, the read barrier should be unnecessary.
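
A sketch of the suggested guard (surrounding names are from the patch context; the exact IsIncrementalGCInProgress signature and namespace are assumed):

  if (!lazy->numInnerFunctions() && !JS::IsIncrementalGCInProgress(rt)) {
      // Safe to consult the lazy script cache and clone on a hit; with
      // this guard the explicit read barrier becomes unnecessary.
  }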

@@ -1093,5 @@
>          const jschar *lazyStart = chars + lazy->begin();
>          size_t lazyLength = lazy->end() - lazy->begin();
>  
> -        if (!frontend::CompileLazyFunction(cx, fun, lazy, lazyStart, lazyLength)) {
> -            fun->initLazyScript(lazy);

What was this line for?

::: js/src/jsgc.cpp
@@ +86,5 @@
>  using namespace js;
>  using namespace js::gc;
>  
>  using mozilla::ArrayEnd;
> +using mozilla::ArrayLength;

What's this for?
Attachment #764119 - Flags: review?(wmccloskey) → review+
https://hg.mozilla.org/mozilla-central/rev/a0dfe6abef73
Status: NEW → RESOLVED
Closed: 11 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla25
Did this land in 25 or 24?
It does seem that the target milestone is wrong!  I see revision a0dfe6abef73 on aurora.
This landed on mozilla-central whilst it was still mozilla24; however, mcMerge references the Bugzilla target milestone field ordering, which was updated pre-emptively before the merge had occurred.
Target Milestone: mozilla25 → mozilla24