Last Comment Bug 634654 - Add RegExpPrivate cache
: Add RegExpPrivate cache
Status: RESOLVED FIXED
:
Product: Core
Classification: Components
Component: JavaScript Engine (show other bugs)
: unspecified
: All All
: -- normal with 2 votes (vote)
: mozilla10
Assigned To: Chris Leary [:cdleary] (not checking bugmail)
:
:
Mentors:
: 673189 (view as bug list)
Depends on: 696524 587288 673188 700915 701332 701387 701396 701399 702352 702426 728705
Blocks: domperf 597446 673274
  Show dependency treegraph
 
Reported: 2011-02-16 10:29 PST by Chris Leary [:cdleary] (not checking bugmail)
Modified: 2013-07-31 07:46 PDT (History)
13 users (show)
See Also:
Crash Signature:
(edit)
QA Whiteboard:
Iteration: ---
Points: ---
Has Regression Range: ---
Has STR: ---


Attachments
WIP: RegExp cache. (10.48 KB, patch)
2011-03-03 20:47 PST, Chris Leary [:cdleary] (not checking bugmail)
no flags Details | Diff | Splinter Review
WIP: Cache |RegExpPrivate|s. (10.17 KB, patch)
2011-10-08 16:20 PDT, Chris Leary [:cdleary] (not checking bugmail)
no flags Details | Diff | Splinter Review
Top offenders. (10.86 KB, text/plain)
2011-10-09 09:15 PDT, Chris Leary [:cdleary] (not checking bugmail)
no flags Details
WIP: cache with purging. (23.73 KB, patch)
2011-10-09 16:29 PDT, Chris Leary [:cdleary] (not checking bugmail)
no flags Details | Diff | Splinter Review
WIP: cache with purging. (38.18 KB, patch)
2011-10-10 18:23 PDT, Chris Leary [:cdleary] (not checking bugmail)
no flags Details | Diff | Splinter Review
Cache |RegExpPrivate|s, purge cache on global GC. (40.46 KB, patch)
2011-10-11 16:32 PDT, Chris Leary [:cdleary] (not checking bugmail)
jwalden+bmo: review+
Details | Diff | Splinter Review

Description Chris Leary [:cdleary] (not checking bugmail) 2011-02-16 10:29:35 PST
Splitting this off from the toSource cache lazy-construction patch in bug 614155.
Comment 1 Chris Leary [:cdleary] (not checking bugmail) 2011-03-03 16:49:16 PST
Instrumenting RegExp::create and visiting a bunch of high-profile sites for some informal profiling I got:

atomized regexp sources:                 473
non-atomized regexp sources:            7863

So 94% of the sources were not atomized. I'll go with the atomize-and-cache approach and time how many cycles we spend atomizing versus our regexp cache hit rate on a similar workload.
Comment 2 Chris Leary [:cdleary] (not checking bugmail) 2011-03-03 17:55:33 PST
So with the tabs: GMail, GReader, Tech Crunch, Twitter, and CAD comic I get the following stats:

RegExpCache stats:
  hits:                  2244
  flagMisses:              34
  constructs:              24
  lookups:               3979
  removals:                23
  puts:                  1759

Which is a hearty 56% hit rate. When I switch the protocol to only put cache entries that are atomized (and only do lookups for atomized strings) I get:

RegExpCache stats:
  hits:                    72
  flagMisses:               0
  constructs:              19
  lookups:                112
  removals:                 3
  puts:                    59

So atomizing the strings makes this cache useful. I'll rdtsc the atomization overhead.
Comment 3 Chris Leary [:cdleary] (not checking bugmail) 2011-03-03 18:27:18 PST
An interesting observation: when we create regexps from literals in the source text the parser creates a regexp from the jschar token buffer. The string generated from this jschar run is not atomized! I'll try to re-run the measurements with those literal-derived strings atomized and see how things change.
Comment 4 Chris Leary [:cdleary] (not checking bugmail) 2011-03-03 20:47:18 PST
Created attachment 516796 [details] [diff] [review]
WIP: RegExp cache.

RegExpCache stats:
  hits:                  1804
  flagMisses:              34
  constructs:              36
  lookups:               3572
  removals:                 4
  puts:                  1804
  nonAtoms:              2443

We get a nice hit rate among the atoms, but there is a large number of non-atoms we may be missing.

Here is a map from patterns to hits-that-we-miss-out-on:

{'#((?:[\\w\\u00c0-\\uFFFF\\-]|\\\\.)+)(?![^\\[]*\\])(?![^\\(]*\\))': 1,
 '&': 19,
 '(^(?:.|\\r|\\n)*?)#((?:[\\w\\u00c0-\\uFFFF\\-]|\\\\.)+)(?![^\\[]*\\])(?![^\\(]*\\))': 1,
 '(^(?:.|\\r|\\n)*?):((?:[\\w\\u00c0-\\uFFFF\\-]|\\\\.)+)(?:\\(([\'"]?)((?:\\([^\\)]+\\)|[^\\(\\)]*)+)\\3\\))?(?![^\\[]*\\])(?![^\\(]*\\))': 1,
 '(^(?:.|\\r|\\n)*?):(nth|eq|gt|lt|first|last|even|odd)(?:\\((\\d*)\\))?(?=[^\\-]|$)(?![^\\[]*\\])(?![^\\(]*\\))': 1,
 '(^(?:.|\\r|\\n)*?)\\.((?:[\\w\\u00c0-\\uFFFF\\-]|\\\\.)+)(?![^\\[]*\\])(?![^\\(]*\\))': 1,
 '(^(?:.|\\r|\\n)*?)\\[name=[\'"]*((?:[\\w\\u00c0-\\uFFFF\\-]|\\\\.)+)[\'"]*\\](?![^\\[]*\\])(?![^\\(]*\\))': 1,
 '(^(?:.|\\r|\\n)*?)^((?:[\\w\\u00c0-\\uFFFF\\*\\-]|\\\\.)+)(?![^\\[]*\\])(?![^\\(]*\\))': 1,
 '(^|.)facebook.(com)([^.]*)$': 19,
 '(^|\\.)(\\.|$)': 421,
 '(^|\\s)addthis_counter(\\s|$)': 1,
 '(^|\\s)hidden_elem(?:\\s|$)': 1,
 ':((?:[\\w\\u00c0-\\uFFFF\\-]|\\\\.)+)(?:\\(([\'"]?)((?:\\([^\\)]+\\)|[^\\(\\)]*)+)\\2\\))?(?![^\\[]*\\])(?![^\\(]*\\))': 1,
 ':(nth|eq|gt|lt|first|last|even|odd)(?:\\((\\d*)\\))?(?=[^-]|$)(?![^\\[]*\\])(?![^\\(]*\\))': 1,
 ':(nth|eq|gt|lt|first|last|even|odd)(?:\\((\\d*)\\))?(?=[^\\-]|$)(?![^\\[]*\\])(?![^\\(]*\\))': 1,
 ':(only|nth|last|first)-child(?:\\((even|odd|[\\dn+-]*)\\))?(?![^\\[]*\\])(?![^\\(]*\\))': 1,
 '\\%\\{tweets\\}': 20,
 '\\%\\{url\\}': 41,
 '\\+': 39,
 '\\.((?:[\\w\\u00c0-\\uFFFF\\-]|\\\\.)+)(?![^\\[]*\\])(?![^\\(]*\\))': 1,
 '\\[name=[\'"]*((?:[\\w\\u00c0-\\uFFFF\\-]|\\\\.)+)[\'"]*\\](?![^\\[]*\\])(?![^\\(]*\\))': 1,
 '\\bgbg0l\\b': 20,
 '\\bgbgt\\b': 23,
 '\\bgbgtd\\b': 20,
 '\\bgbkc\\b': 2,
 '\\bgbmt\\b': 20,
 '\\bgbzt\\b': 29,
 '\\s*[,;]$': 1,
 '^((?:[\\w\\u00c0-\\uFFFF\\*\\-]|\\\\.)+)(?![^\\[]*\\])(?![^\\(]*\\))': 1,
 '^(?:([^:\\/?#.]+):)?(?:\\/\\/(?:([^\\/?#]*)@)?([\\w\\d\\-\\u0100-\\uffff.%]*)(?::([0-9]+))?)?([^?#]+)?(?:\\?([^#]*))?(?:#(.*))?$': 2,
 '^([\\[\\]])$': 119,
 '^,': 20,
 '^Yahoo Application State Plugin$': 1,
 '^[\\s]+|[\\s]+$': 2}

What's interesting about this set is that we can get 796 of the 834 potential non-atom hits if we only atomize strings of length 16-or-less. That may be a decent compromise to avoid atomization overheads.
Comment 5 Chris Leary [:cdleary] (not checking bugmail) 2011-10-08 16:20:54 PDT
Created attachment 565763 [details] [diff] [review]
WIP: Cache |RegExpPrivate|s.

Puts a RegExpPrivate cache on the ThreadData for JSAtom sources only. Hurray immutability!

I want to do a little bit of browsing-in-the-name-of-science before I put it up for review. Particularly, I want to see how many non-atom sources we "could have hit on if we cached those" and how much memory overhead that produces.
Comment 6 Chris Leary [:cdleary] (not checking bugmail) 2011-10-08 16:30:39 PDT
Oh, and I still have to add a purge-to-size mechanism. (That's how we emulate HashTable shrink-to-size these days.) The number of entries will drop to zero on GC later due to bug 673189.
Comment 7 Chris Leary [:cdleary] (not checking bugmail) 2011-10-09 09:15:16 PDT
Created attachment 565810 [details]
Top offenders.

My benchmark was to run an already-atomized-only cache and instrument RegExpPrivate creation. I was a lot more scientific this time, and it had very interesting results.

Opening the following 6 sites from session restore: Twitter, GReader, GMail, TechCrunch, Facebook, and Delicious, the hit rate was miserable.

22112 events
62 hits, 22050 misses (0.28 % hit ratio)

As the text file shows, we get nice contributions from GMail, Twitter's pheonix-core thing and even our own nsSessionStore. The nsSessionStore is due to not atomizing regexp sources out of XDR, which (NOTE REVIEWER! :-) I will fix in this bug.

All of these top offenders were uncacheable due to being non-atoms. If we atomize all of these patterns that are less than 16 chars. If we atomize strings we see with <= 16 chars we should hit a nice speed/usage tradeoff at a 98.3% hit rate for this workload.

Upcoming science: time the aggregate atomization overhead, dump code sizes as well so we can know how much duplication we're talking about eliminating.
Comment 8 Chris Leary [:cdleary] (not checking bugmail) 2011-10-09 09:25:32 PDT
Also, (for verification) inspecting the GMail js file by hand I see 31 instances where RegExp is called directly (despite minification), at least in a handful of cases with tiny literal strings.
Comment 9 Emanuel Hoogeveen [:ehoogeveen] 2011-10-09 09:51:45 PDT
This is very interesting data. Is the bug title correct, assuming that TM refers to TraceMonkey, that this bug is for TM only? IIUC TM is disabled when TI is enabled, and is set to go away entirely when IM is ready.
Comment 10 AWAY Tom Schuster [:evilpie] 2011-10-09 10:16:45 PDT
(In reply to Emanuel Hoogeveen from comment #9)
> This is very interesting data. Is the bug title correct, assuming that TM
> refers to TraceMonkey, that this bug is for TM only? IIUC TM is disabled
> when TI is enabled, and is set to go away entirely when IM is ready.

Looking at the code this patch is not TraceMonkey related. Putting TM in front of bugs was some weird thing being done back in these days.
Comment 11 Chris Leary [:cdleary] (not checking bugmail) 2011-10-09 12:38:45 PDT
I made the chars we read out of the token stream and the strings we XDR decode into atoms and we look good hit rate wise:

21787 events
20526 hits, 1261 misses (94.21% hit ratio)

kind        | count
------------+------
cold        | 648  
hit         | 20526
eviction    | 22   
uncacheable | 591

Since we were be creating fresh JSStrings for those token-stream-read creations the atomizing should have added very little overhead, but I'll make an opt build to time it just to be sure.

Without any purging we keep a ~2.5K hash table around to cache all the values. Half the RegExpPrivates that exist are there because they're non-atom sources, so I still need more memory usage and execution time data -- maybe atomizing more things would be worth it to reduce the duplication.
Comment 12 Chris Leary [:cdleary] (not checking bugmail) 2011-10-09 14:08:29 PDT
Duplicate code savings: sum: 183,668B; avg per RegExpPrivate: 895.94; stddev: 3,087.15

At least in my benchmark the code savings is negligible. We can always atomize more stuff later if we find that's a win in the real world, and just caching the naturally atomized stuff is better than what we have now. Will have a patch with purging soon.
Comment 13 Chris Leary [:cdleary] (not checking bugmail) 2011-10-09 16:29:52 PDT
Created attachment 565841 [details] [diff] [review]
WIP: cache with purging.

This is close to done, but I have to fix a few things that are bothering me:

1) I'm pretty sure this assertion method doesn't work with background sweeping. I *could* deallocate the cache pre-mark and assume refcounting correctness, but the debug refcounting assertions are nice. I either have to figure out a way to do it in JS_THREADSAFE that doesn't make your eyes bleed, or only perform the assertions when !JS_THREADSAFE (which seems undesirable).
2) The string regexp methods are violating the RegExpPrivate boundary quite heavily in order to avoid creating a RegExpObject. This leads to more annoying problems in the refcount assertions where I have to say, "check that the string regexp methods are appropriate messing around with RegExpPrivates behind our back," which is gross.
3) Switch the RegExpPrivates from taking JSContext parameters to JSRuntime parameters now that they live in the ThreadData.
4) There's too many scary verbs on the RegExp data structures... purge, finalize, reset, decref. If I can't consolidate any of those verbs they at least need a big comment at the top of RegExpObject.h to say how they all work together with the ThreadData cache.
Comment 14 Chris Leary [:cdleary] (not checking bugmail) 2011-10-09 16:31:13 PDT
(Oh, and it passes shell tests in that form.)
Comment 15 Chris Leary [:cdleary] (not checking bugmail) 2011-10-10 11:43:48 PDT
(In reply to Chris Leary [:cdleary] from comment #13)
> 1) I'm pretty sure this assertion method doesn't work with background
> sweeping.

billm pointed out that it does because we don't background finalize anything with clasp->finalize.
Comment 16 Chris Leary [:cdleary] (not checking bugmail) 2011-10-10 18:23:09 PDT
Created attachment 566104 [details] [diff] [review]
WIP: cache with purging.

Looks like this is passing jsreftests. Interesting note to self: you can't legitimately free up memory from a thread-data level cache when you're doing compartmental GCs.

Pushed to try.
Comment 17 Chris Leary [:cdleary] (not checking bugmail) 2011-10-11 10:01:22 PDT
Looks good on try. Will post for review this afternoon.
Comment 18 Chris Leary [:cdleary] (not checking bugmail) 2011-10-11 16:32:24 PDT
Created attachment 566392 [details] [diff] [review]
Cache |RegExpPrivate|s, purge cache on global GC.

Depends on laziness patch from bug 673188.
Comment 19 Chris Leary [:cdleary] (not checking bugmail) 2011-10-11 16:33:45 PDT
(Sorry luke, didn't qref the removal of that one fprintf chunk!)
Comment 20 Chris Leary [:cdleary] (not checking bugmail) 2011-10-12 11:47:47 PDT
Comment on attachment 566392 [details] [diff] [review]
Cache |RegExpPrivate|s, purge cache on global GC.

Waldo said he would be willing to take this review (and his review queue is less loaded).
Comment 21 Jeff Walden [:Waldo] (remove +bmo to email) 2011-10-26 13:54:40 PDT
Comment on attachment 566392 [details] [diff] [review]
Cache |RegExpPrivate|s, purge cache on global GC.

Review of attachment 566392 [details] [diff] [review]:
-----------------------------------------------------------------

The RegExpObjectBuilder idiom feels weird to me.  Maybe it's the name that feels strange.  I don't know.  My concerns about it are not coherent enough to suggest something better.

::: js/src/jscntxt.cpp
@@ +163,5 @@
> +    RegExpPrivateCache *newCache = rt->new_<RegExpPrivateCache>(rt);
> +
> +    if (!newCache || !newCache->init()) {
> +        if (newCache)
> +            rt->delete_<RegExpPrivateCache>(newCache);

If delete_ is null-safe I'd remove the null-check.

@@ +367,5 @@
> +#ifdef JS_THREADSAFE
> +    for (JSThread::Map::Enum e(cx->runtime->threads);
> +         !e.empty();
> +         e.popFront()) {
> +        JSThread *thread = e.front().value;

{ on new line.

::: js/src/jscntxt.h
@@ +1255,5 @@
>       */
>      js::GCHelperThread *gcBackgroundFree;
>  #endif
>  
> +    js::ThreadData *threadData() { return JS_THREAD_DATA(this); }

File a bug to remove JS_THREAD_DATA in favor of this?  I've wanted to do this for a long time, never was touching close enough code to actually do it.

::: js/src/jsstr.cpp
@@ +1266,5 @@
>       */
>      int32 match() const { return match_; }
>  };
>  
>  class RegExpPair

An overview comment for why this class exists would be nice.  It's not entirely obvious to me, at a skim of its uses.

@@ +1280,5 @@
>    public:
> +    explicit RegExpPair(JSContext *cx) : cx(cx), rep_(cx), markedOutstanding(false) {}
> +
> +    ~RegExpPair() {
> +        if (!!markedOutstanding)

Wouldn't operator bool be worth adding to DebugOnly as well?

::: js/src/vm/RegExpObject.cpp
@@ +104,5 @@
> +    if (!reobj_->init(cx, rep->getSource(), rep->getFlags())) {
> +        rep->decref(cx);
> +        return NULL;
> +    }
> +    reobj_->setPrivate(rep.get());

If this is a RegExpObject, shouldn't we have a method to do this, that takes in an AIR<>, rather than do it all manually here?

The free-floating decrefs here (and increfs in callers) are not happy-making.  I assume the semantics here are that the caller passes ownership of his reference to this function, which is then responsible for decrementing it at the appropriate time (either now, in case of error, or later when the reobj dies)?  I suspect based on the name this is what WebKit's PassRefPtr (?) templates are for; maybe we should have something similar.

@@ +151,5 @@
> +        return build(other->getSource(), newFlags);
> +    }
> +
> +    RegExpPrivate *toShare = other->getPrivate();
> +    if (toShare) {

if (RegExpPrivate *...)

::: js/src/vm/RegExpObject.h
@@ +81,5 @@
>       * so this function is really meant for object creation during code
>       * execution, as opposed to during something like XDR.
>       */
> +    static inline RegExpObject *create(JSContext *cx, RegExpStatics *res, const jschar *chars,
> +                                       size_t length, RegExpFlag flags, TokenStream *ts);

If you're renaming arguments, ts here should be tokenStream.

That said, why are you removing the line break between return type and methodname?  I find the other alignment more readable given how many different arguments this function takes.
Comment 22 Chris Leary [:cdleary] (not checking bugmail) 2011-10-26 15:06:56 PDT
Thanks for the review Waldo! Just going to finish investigating bug 696524 before I land this so we don't muddle up the results.
Comment 23 Chris Leary [:cdleary] (not checking bugmail) 2011-11-07 15:59:25 PST
https://hg.mozilla.org/integration/mozilla-inbound/rev/366d80e91816

The small V8 regression was due to some regexp compilation being duplicated in v8-regexp. The cache largely fixes that issue, but it still gets cleared out and recompiles regexps about 6 times during v8-regexp. This at least gets our regexp score back where it was, but there's followup work we can do by only purging on a GC_SHRINK (thereby not clearing out compiled regexps during v8-regexp). That'll be left to the v8-regexp investigation bug.
Comment 24 Ed Morley [:emorley] 2011-11-08 01:31:58 PST
https://hg.mozilla.org/mozilla-central/rev/366d80e91816
Comment 25 Chris Leary [:cdleary] (not checking bugmail) 2011-11-29 15:36:43 PST
*** Bug 673189 has been marked as a duplicate of this bug. ***

Note You need to log in before you can comment on or make changes to this bug.