Closed Bug 454184 Opened 16 years ago Closed 15 years ago

Implement eval caching

Categories

(Core :: JavaScript Engine, enhancement, P2)

enhancement

Tracking

()

RESOLVED FIXED
mozilla1.9.2a1

People

(Reporter: jeresig, Assigned: brendan)

References

Details

(Keywords: perf, testcase, Whiteboard: fixed1.9.1 fixed-in-tracemonkey)

Attachments

(3 files, 12 obsolete files)

227 bytes, text/plain
Details
627 bytes, text/plain
Details
25.96 KB, patch
brendan
: review+
Details | Diff | Splinter Review
Both Chrome and WebKit are now starting to cache evals (WebKit is doing it for eval's that are less than 256 characters long).

More details:
 - http://code.google.com/p/v8/source/detail?r=173
 - https://bugs.webkit.org/show_bug.cgi?id=20718

The WebKit bug is especially interesting:

"The date-format-tofte test could benefit from an eval code cache, whereby code
for the same string eval'd in the same context is retained so that repeated
evals can skip parsing and compilation. Bleeding-edge V8 includes a similar
idea.

It turns out that real sites benefit from this as well. Google spreadsheets,
GMail, Slate and Digg all get hits on this cache, between 2 and 100 depending
on the site. I also tested that in the miss case there is no measurable
slowdown.

The attached patch is a 1.052x speedup on SunSpider relative to trunk."
Blocks: 117611
Oh - and they only cache up to 64 entries at a time. This will definitely speed up SunSpider - and theoretically some web sites - but it's still probably a matter of "writing to the test."
Currently the compiled eval code inherently depends on the scope chain so wit caching it would not be possible to reuse the eval compiled, say, in one frame, with the eval invoked with the same string in another frame.

It would be interesting to see if making short eval code sharable would offset the drawbacks like less opportunities to optimize the eval code for the geven context.
I wonder if those scope-chain dependencies can be captured in the shape of the scope chain head.
(In reply to comment #3)
> I wonder if those scope-chain dependencies can be captured in the shape of the
> scope chain head.

Sure they can, but my hypothesis is that reusing eval scripts for different scope shapes but the same string would make things faster.
I'd be a _little_ surprised if we saw same-string evals in many different contexts (and it sounds from the webkit bug description like they don't try to reuse across contexts either), but I guess it's worth measuring.
Severity: normal → enhancement
Keywords: perf
Attached patch couldn't sleep (obsolete) — Splinter Review
We had the storage for the eval cache already, via scriptsToGC and u.nextToGC added to preserve property cache validity last August. This saves 5491 out of 5500 eval compilations (5491 needless recompilations) for date-format-tofte.js.

/be
Assignee: general → brendan
Status: NEW → ASSIGNED
Attachment #358721 - Flags: review?(gal)
Make hay while the sun doth shine, to quote from G&S ("Yeoman", I think).

/be
Flags: wanted1.9.1?
Priority: -- → P2
Target Milestone: --- → mozilla1.9.1b3
Attached patch awake now (obsolete) — Splinter Review
Attachment #358721 - Attachment is obsolete: true
Attachment #358722 - Flags: review?(gal)
Attachment #358721 - Flags: review?(gal)
Comment on attachment 358722 [details] [diff] [review]
awake now

>+#ifdef DEBUG_brendaneich
>+# define JS_EVAL_CACHE_METERING 1
>+#endif

Sure you want to commit this?

>+#ifdef JS_EVAL_CACHE_METERING
>+# define JS_DECLARE_EVAL_CACHE_METERING struct { uint32 probe, hit; } evalCacheMeter;
>+#else
>+# define JS_DECLARE_EVAL_CACHE_METERING /* nothing */
>+#endif

In general I wouldn't mind all (not just this one) the metering on in DEBUG mode, but thats for another patch.

>+/* Number of scriptToGC to search for the eval cache . */
>+#ifndef JS_EVAL_CACHE_LIMIT
>+# define JS_EVAL_CACHE_LIMIT    16
>+#endif

Is the 16 gut feeling or is it the sweet spot based on metering SS?

>     tcflags = TCF_COMPILE_N_GO;
>     if (caller)
>         tcflags |= TCF_PUT_STATIC_DEPTH(caller->script->staticDepth + 1);
>-    script = js_CompileScript(cx, scopeobj, caller, principals, tcflags,
>-                              JSSTRING_CHARS(str), JSSTRING_LENGTH(str),
>-                              NULL, file, line);
>+
>+    script = NULL;
>+#if JS_EVAL_CACHE_LIMIT > 0
>+    if (caller->fun) {
>+        uintN count = 0;
>+        JSScript **scriptp = &JS_SCRIPTS_TO_GC(cx);
>+
>+        EVAL_CACHE_METER(probe);
>+        while ((script = *scriptp) != NULL) {
>+            JSFunction *fun;
>+            JS_GET_SCRIPT_FUNCTION(script, 0, fun);
>+            if (fun == caller->fun) {
>+                JSString *src = ATOM_TO_STRING(script->atomMap.vector[0]);
>+                if (src == str || !js_CompareStrings(src, str)) {

Isn't js_EqualStrings the right choice here?

Otherwise looks awesome. Please don't commit until I have cleaned up the tree (relanding patches atm).
Attachment #358722 - Flags: review?(gal) → review+
Comment on attachment 358722 [details] [diff] [review]
awake now

Good points, I'll use #ifdef DEBUG to enable metering, and js_EqualStrings.

But this is buggy on a couple of points, now that I'm caffeinated. New patch shortly.

The 16 is an upper bound on linear search, and more than enough for date-format-tofte.js -- only 9 misses reported out of 5500 evals, so 10 is enough for that. But with a self-organizing list (move the hit to the head of the scriptsToGC list) I think 16 is better.

/be
Attachment #358722 - Attachment is obsolete: true
Attachment #358722 - Flags: review+
Er, hits already go to the head of the list. So that's good.

/be
Attached file test showing two bugs (obsolete) —
First bug: JS_GET_SCRIPT_FUNCTION can't be called blindly. Fixed by adding JSSF_SAVED_CALLER_FUN to flag those scripts that save caller->fun as the 0th element in the script's object table, testing it before JS_GET_SCRIPT_FUNCTION.

Second bug: as Igor noted, the memoized scope chain means the script can't be reused for a different scope chain. Shape matching is not enough since we need the call object (or any future flat closure) remembered the first call to g's environment, not the second call's. Working on a way to index by scope as well as source now.

/be
Attached file fixed test, shows only bug 1 (obsolete) —
And I've patched bug 1. Trying to break along lines of hypothesized bug 2, but it's not breaking (yet).

/be
Attachment #358746 - Attachment is obsolete: true
Attachment #358751 - Attachment is patch: false
Need to mutate a binding in the immediate environment of the eval, not the grand-caller's activation.

/be
Attachment #358751 - Attachment is obsolete: true
Attached patch proposed fix (obsolete) — Splinter Review
Looking for mrbkap's approval, gal too but he already liked the general idea ;-).

/be
Attachment #358774 - Flags: review?(mrbkap)
Summary: Implement eval Caching → Implement eval caching
Comment on attachment 358774 [details] [diff] [review]
proposed fix

>+/* Number of scriptToGC to search for the eval cache . */

Fixed this to read

/* Number of potentially reusable scriptsToGC to search for the eval cache. */

Results on date-format-tofte.js:

(gdb) p rt.evalCacheMeter 
$1 = {
  probe = 5500, 
  hit = 5491, 
  step = 35970, 
  noscope = 5491
}

For some reason the patch seems to have regressed this test:

*-* Testcase js1_5/extensions/regress-376052.js failed:
Bug Number 376052
STATUS: javascript.options.anonfunfix to allow function (){} expressions
Failure messages were:
 FAILED! [reported from test()] javascript.options.anonfunfix to allow function (){} expressions: 3 : Expected value 'SyntaxError: syntax error', Actual value 'No Error' 

-*- exit code 0, exit signal 0.
-*- Writing output to results-2009-01-25-163551-smdebug.html.
-#- Wrote failures to 'results-2009-01-25-163551-smdebug-failures.txt'.
-#- Wrote results to 'results-2009-01-25-163551-smdebug.html'.
-#- 1 test(s) failed

Looking into it as time allows today, will wrap up by tonight.

/be
No time and less trust in my machine than I need to benchmark -- Andreas, David, feel free to do so and post numbers.

/be
Andreas and the Davids, I guess ;-).

/be
Attached patch fixed option/version checking (obsolete) — Splinter Review
Certain options affect compiler decision-making. Need these to reflect into high bits in the script and context version field. Had it right for XML but not for the ANONFUNFIX option. That was caught by the testsuite, bless its heart.

/be
Attachment #358774 - Attachment is obsolete: true
Attachment #358793 - Flags: review?(mrbkap)
Attachment #358774 - Flags: review?(mrbkap)
A gmail/bugzilla/slashdot session, not very lengthy, yielded:

  evalCacheMeter = {
    probe = 3671776391, 
    hit = 3671776276, 
    step = 3671777782, 
    noscope = 3671776276
  }
}
(gdb) p 3671776276.0/3671776391
$3 = 0.99999996868000995
(gdb) p 3671777782 - 3671776391
$4 = 1391
(gdb) p 3671777782.0/3671776391
$5 = 1.0000003788357057
(gdb) p 3671776391 - 3671776276
$6 = 115

Seven nines of hit rate, almost no searching past the self-reorganizing MRU list head, only 115 out of 3 billion+ had a particular scopeobj dependency. Cool! But maybe I should change the meter struct to use uint64. Thoughts?

/be
(In reply to comment #19)
> Created an attachment (id=358793) [details]
> fixed option/version checking

Why it is not necessary to check for upvar references when comparing the cached scripts? They also introduce scope dependencies AFAICS.
(In reply to comment #21)
> (In reply to comment #19)
> > Created an attachment (id=358793) [details] [details]
> > fixed option/version checking
> 
> Why it is not necessary to check for upvar references when comparing the cached
> scripts? They also introduce scope dependencies AFAICS.

Checking source string and scope chain head are enough, since upvars are simply identifier expressions within the source, resolved to immutable lexical binding definitions against the scope chain.

/be
/be
(In reply to comment #22)
> Checking source string and scope chain head are enough, since upvars are simply
> identifier expressions within the source, resolved to immutable lexical binding
> definitions against the scope chain.

The cache hit is also predicated on caller->fun == fun (the saved caller fun in the cached script).

Comment 20 and more testing shows an absurd population of eval call scenarios, skewed toward simple expressions with no literal function or regexp scope chain dependencies, only identifiers resolved via JSOP_NAME or (in the case of eval from a function with lexical bindings) JSOP_{GET,CALL}UPVAR.

eval overuse seems inevitable, although we could add strict warnings about eval calls that should just use computed property name access instead. For the rest, we need to make eval fast in order to stay competitive.

/be
(In reply to comment #22)
> Checking source string and scope chain head are enough

Right, upvar is only used when eval's enclosing scope is the function variable object. Since no variables can be deleted from it, reusing the evaluated script the second time is safe even if comes with upvars.
(In reply to comment #23)
> Comment 20 and more testing shows an absurd population of eval call scenarios,
> skewed toward simple expressions with no literal function or regexp scope chain
> dependencies, only identifiers resolved via JSOP_NAME or (in the case of eval
> from a function with lexical bindings) JSOP_{GET,CALL}UPVAR.

Hm, maybe it would make sence to add a special support for eval scripts that just contain name or property access?
How embarrassing -- I forgot that js_SetContextThread has to zero anything that should not be free-pattern-filled in DEBUG builds. The shell of course has the runtime as the JS_CACHE_LOCUS(cx), so it didn't suffer this bug (JS_NewRuntime zeroes rt using memset).

Anyway, here is the browser session with brief gmail navigation, some tab preview ctrl-tab'ing, /., and a bugzilla page:

eval cache meter (0x13b6d000):
probe     102
hit       27
step      754
noscope   27
hit ratio 26.4706%
avg steps 7.39216

I fixed the step-limiting logic to count every step, not just those that matched saved-caller-fun-flag, version, and principal; and bumped JS_EVAL_CACHE_LIMIT to 64:

eval cache meter (0x13b6a000):
probe     99
hit       30
step      1120
noscope   30
hit ratio 30.303%
avg steps 11.3131

The same session, plus one web-hosted SunSpider run:

eval cache meter (0x13b6d000):
probe     27616
hit       27488
step      181359
noscope   27488
hit ratio 99.5365%
avg steps 6.56717

Hate to add hashing to the MRU cache, it will complicate the code. Not sure 64 is the right number -- what's the distribution of eval'ed source? What's the distribution of how long such code takes to compile in SpiderMonkey? OTOH I'm not too worried about spending too much time in the cache probe loop on miss, even with a miss rate of 70%.

Comments welcome from all and sundry,

/be
Attachment #358793 - Attachment is obsolete: true
Attachment #358886 - Flags: review?(mrbkap)
Attachment #358793 - Flags: review?(mrbkap)
BTW, bugzilla interdiff works, in spite of whinage implying it might be lying. So all the adjacent patches in this bug can be interdiff'ed.

/be
(In reply to comment #20)
> A gmail/bugzilla/slashdot session, not very lengthy, yielded:
> 
>   evalCacheMeter = {
>     probe = 3671776391, 
>     hit = 3671776276, 
>     step = 3671777782, 
>     noscope = 3671776276
>   }
> }

If "not very lengthy" means "30 mins", then that's...a _LOT_ of eval.  2 million a second?
(In reply to comment #28)
> (In reply to comment #20)
> > A gmail/bugzilla/slashdot session, not very lengthy, yielded:
> > 
> >   evalCacheMeter = {
> >     probe = 3671776391, 
> >     hit = 3671776276, 
> >     step = 3671777782, 
> >     noscope = 3671776276
> >   }
> > }
> 
> If "not very lengthy" means "30 mins", then that's...a _LOT_ of eval.  2
> million a second?

Not thinking clearly (had to put the cat down this weekend :-(), obviously. But look at those numbers in hex:

DADADC87
DADADC14
DADAE1F6
DADADC14

Dead Arena, AKA JS_FREE_PATTERN. All fixed in latest patch. Only issue left is whether the cost of linear search with MRU move to head of list hurts. Trying to measure that now.

/be
Without the patch, I get:

worst case cache hit time 3181
before 9068544, after 32768, break 02000000
cache miss time 3119

With the patch:

worst case cache hit time 1334
before 9068544, after 32768, break 02000000
cache miss time 4690

Of course, no one evals a string containing a number in [0,64], but this test is trying to bound the cost of the linear search. It looks bad enough (41% worse) that I'm going to play with smaller cache size and maybe hashing.

/be
FWIW, webkit seems to use a 64-entry cache (per-what? I think per-script, but I haven't chased the refs in their source to figure out for sure) and their custom hashmap, with a maximum cached size of 256 characters.

http://trac.webkit.org/browser/trunk/JavaScriptCore/bytecode/EvalCodeCache.h etc.
Attached file updated evalbench.js
Without the new patch that's coming next:

worst case cache hit time 3253
before 9068544, after 32768, break 02000000
cache miss time 3153

With the patch, which uses an EVAL_CACHE_CHAIN_LIMIT of 4:

worst case cache hit time 438
before 9068544, after 32768, break 02000000
cache miss time 3811

Varying EVAL_CACHE_CHAIN_LIMIT between 2 and 8 can vary the cache miss time, but using 2 does not reduce the time to within noise-epsilon of the unpatched time. With 2 the slowdown is more like 10%, with 4 it's 19%. Again this is a torture test, real program sources will put the slowdown much more in the noise category.

Someone should eyeball the benchmark and patch to make sure I'm not doing something dumb. The /tmp/evalcache.stats file from a debug build looks sane:

probe     2000000
hit       999936
step      3499600
noscope   999936
hit ratio 49.9968%
avg steps 1.7498

/be
Attachment #358898 - Attachment is obsolete: true
Attached patch updated patch (obsolete) — Splinter Review
Attachment #358886 - Attachment is obsolete: true
Attachment #358930 - Flags: review?(mrbkap)
Attachment #358886 - Flags: review?(mrbkap)
Attached patch avoid recomputing EvalCacheHash (obsolete) — Splinter Review
Hoping this can go in easily.

/be
Attachment #358930 - Attachment is obsolete: true
Attachment #358946 - Flags: review?(mrbkap)
Attachment #358930 - Flags: review?(mrbkap)
Comment on attachment 358946 [details] [diff] [review]
avoid recomputing EvalCacheHash

>+#define JSVERSION_ANONFUNFIX            0x2000  /* jsapi.h, JSOPION_ANONFUNFIX */

Nit: Typo in the comment: "JSOPION".

Looks good.
Attachment #358946 - Flags: review?(mrbkap) → review+
With typo fix per mrbkap's r+.

/be
Attachment #358946 - Attachment is obsolete: true
Attachment #359001 - Flags: review+
Keywords: checkin-needed
30ms perf win on my machine
Doesn't build in DEBUG:

g++-4.2 -o jsobj.o -c -I./dist/include/system_wrappers_js -include ../config/gcc_hidden.h -DOSTYPE=\"Darwin9.6.0\" -DOSARCH=Darwin -DEXPORT_JS_API  -DJS_USE_SAFE_ARENA  -I.. -I.  -I./dist/include   -I./dist/include/js  -I/sdk/include -I..    -fPIC  -I/usr/X11/include -fno-rtti -fno-exceptions -Wall -Wpointer-arith -Woverloaded-virtual -Wsynth -Wno-ctor-dtor-privacy -Wno-non-virtual-dtor -Wcast-align -Wno-invalid-offsetof -Wno-long-long -fno-strict-aliasing -fpascal-strings -fno-common -fshort-wchar -pthread -pipe  -DDEBUG -D_DEBUG -DDEBUG_gal -DTRACING -g  -I/usr/X11/include -DMOZILLA_CLIENT -include ./mozilla-config.h -Wp,-MD,.deps/jsobj.pp ../jsobj.cpp
../jsobj.cpp: In function ‘JSBool obj_eval(JSContext*, JSObject*, uintN, jsval*, jsval*)’:
../jsobj.cpp:1472: error: jump to label ‘out’
../jsobj.cpp:1358: error:   from here
../jsobj.cpp:1376: error:   crosses initialization of ‘JSScript** bucket’
../jsobj.cpp:1472: error: jump to label ‘out’
../jsobj.cpp:1289: error:   from here
../jsobj.cpp:1376: error:   crosses initialization of ‘JSScript** bucket’
Attachment #359001 - Attachment is obsolete: true
Attachment #359002 - Flags: review?(brendan)
The warning is junk since bucket is unused after out: -- oh well. I need to upgrade my Mac gcc for this?!

/be
Attachment #359002 - Attachment is obsolete: true
Attachment #359004 - Flags: review+
Attachment #359002 - Flags: review?(brendan)
Landed in tm, if it sticks (I think it will) then on to m-c and 1.9.1 nomination of the patch (I think this bug is wanted-1.9.1+ after talking to sayrer, even if it's not marked that way yet).

http://hg.mozilla.org/tracemonkey/rev/ed312882b453

/be
Whiteboard: fixed-in-tracemonkey
Out and back in again:

http://hg.mozilla.org/tracemonkey/rev/5174fbfd612d

I passed Tp without crashing. Have to reproduce the Mac Tp crash before blaming it on this patch (it could be; just no direct evidence yet; sayrer's working on Tp crash stack backtrace reporting, which would rule).

/be
(In reply to comment #42)
> I passed Tp without crashing.

That was my debug build against mrbkap's private-label Tp server.

/be
It didn't crash a second run I started on tm either.
Doesn't crash anything -- orange was spurious, from something pulled from m-c?

Fixed in m-c:

http://hg.mozilla.org/mozilla-central/rev/6dbc129ccd19

/be
Status: ASSIGNED → RESOLVED
Closed: 15 years ago
Resolution: --- → FIXED
Attachment #359004 - Attachment description: comment the gcc lameness workaround → comment the gcc lameness workaround [Checkin: Comment 42 & 46]
Keywords: checkin-needed
Whiteboard: fixed-in-tracemonkey → [c-n: baking for 1.9.1] fixed-in-tracemonkey
Target Milestone: mozilla1.9.1b3 → mozilla1.9.2a1
http://hg.mozilla.org/releases/mozilla-1.9.1/rev/46e031334fb5
Flags: wanted1.9.1? → wanted1.9.1+
Whiteboard: [c-n: baking for 1.9.1] fixed-in-tracemonkey → fixed1.9.1 fixed-in-tracemonkey
brendan: is a passing condition for your evalbench.js that with the patch, worst case cache hit time is at least 1/2 of the cache miss time?
Keywords: testcase
We could try that. I currently see almost 4 times the hit time in the miss time. But the direct way to test is to capture /tmp/evalcache.stats and confirm these numbers:

probe     2000000
hit       999936
step      3499600
noscope   999936
hit ratio 49.9968%
avg steps 1.7498

This is not reflected as a shell object, probably it should be (using common infrastructure with other such things, chiefly jitstats).

/be
Depends on: 531037
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: