Closed Bug 634654 Opened 14 years ago Closed 13 years ago

Add RegExpPrivate cache

Categories

(Core :: JavaScript Engine, defect)

defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla10

People

(Reporter: cdleary, Assigned: cdleary)

References

(Blocks 1 open bug)

Details

Attachments

(2 files, 4 obsolete files)

Splitting this off from the toSource cache lazy-construction patch in bug 614155.
Blocks: domperf, 597446
Depends on: 587288
Instrumenting RegExp::create and visiting a bunch of high-profile sites for some informal profiling I got: atomized regexp sources: 473 non-atomized regexp sources: 7863 So 94% of the sources were not atomized. I'll go with the atomize-and-cache approach and time how many cycles we spend atomizing versus our regexp cache hit rate on a similar workload.
So with the tabs: GMail, GReader, Tech Crunch, Twitter, and CAD comic I get the following stats: RegExpCache stats: hits: 2244 flagMisses: 34 constructs: 24 lookups: 3979 removals: 23 puts: 1759 Which is a hearty 56% hit rate. When I switch the protocol to only put cache entries that are atomized (and only do lookups for atomized strings) I get: RegExpCache stats: hits: 72 flagMisses: 0 constructs: 19 lookups: 112 removals: 3 puts: 59 So atomizing the strings makes this cache useful. I'll rdtsc the atomization overhead.
An interesting observation: when we create regexps from literals in the source text the parser creates a regexp from the jschar token buffer. The string generated from this jschar run is not atomized! I'll try to re-run the measurements with those literal-derived strings atomized and see how things change.
Attached patch WIP: RegExp cache. (obsolete) — Splinter Review
RegExpCache stats: hits: 1804 flagMisses: 34 constructs: 36 lookups: 3572 removals: 4 puts: 1804 nonAtoms: 2443 We get a nice hit rate among the atoms, but there is a large number of non-atoms we may be missing. Here is a map from patterns to hits-that-we-miss-out-on: {'#((?:[\\w\\u00c0-\\uFFFF\\-]|\\\\.)+)(?![^\\[]*\\])(?![^\\(]*\\))': 1, '&': 19, '(^(?:.|\\r|\\n)*?)#((?:[\\w\\u00c0-\\uFFFF\\-]|\\\\.)+)(?![^\\[]*\\])(?![^\\(]*\\))': 1, '(^(?:.|\\r|\\n)*?):((?:[\\w\\u00c0-\\uFFFF\\-]|\\\\.)+)(?:\\(([\'"]?)((?:\\([^\\)]+\\)|[^\\(\\)]*)+)\\3\\))?(?![^\\[]*\\])(?![^\\(]*\\))': 1, '(^(?:.|\\r|\\n)*?):(nth|eq|gt|lt|first|last|even|odd)(?:\\((\\d*)\\))?(?=[^\\-]|$)(?![^\\[]*\\])(?![^\\(]*\\))': 1, '(^(?:.|\\r|\\n)*?)\\.((?:[\\w\\u00c0-\\uFFFF\\-]|\\\\.)+)(?![^\\[]*\\])(?![^\\(]*\\))': 1, '(^(?:.|\\r|\\n)*?)\\[name=[\'"]*((?:[\\w\\u00c0-\\uFFFF\\-]|\\\\.)+)[\'"]*\\](?![^\\[]*\\])(?![^\\(]*\\))': 1, '(^(?:.|\\r|\\n)*?)^((?:[\\w\\u00c0-\\uFFFF\\*\\-]|\\\\.)+)(?![^\\[]*\\])(?![^\\(]*\\))': 1, '(^|.)facebook.(com)([^.]*)$': 19, '(^|\\.)(\\.|$)': 421, '(^|\\s)addthis_counter(\\s|$)': 1, '(^|\\s)hidden_elem(?:\\s|$)': 1, ':((?:[\\w\\u00c0-\\uFFFF\\-]|\\\\.)+)(?:\\(([\'"]?)((?:\\([^\\)]+\\)|[^\\(\\)]*)+)\\2\\))?(?![^\\[]*\\])(?![^\\(]*\\))': 1, ':(nth|eq|gt|lt|first|last|even|odd)(?:\\((\\d*)\\))?(?=[^-]|$)(?![^\\[]*\\])(?![^\\(]*\\))': 1, ':(nth|eq|gt|lt|first|last|even|odd)(?:\\((\\d*)\\))?(?=[^\\-]|$)(?![^\\[]*\\])(?![^\\(]*\\))': 1, ':(only|nth|last|first)-child(?:\\((even|odd|[\\dn+-]*)\\))?(?![^\\[]*\\])(?![^\\(]*\\))': 1, '\\%\\{tweets\\}': 20, '\\%\\{url\\}': 41, '\\+': 39, '\\.((?:[\\w\\u00c0-\\uFFFF\\-]|\\\\.)+)(?![^\\[]*\\])(?![^\\(]*\\))': 1, '\\[name=[\'"]*((?:[\\w\\u00c0-\\uFFFF\\-]|\\\\.)+)[\'"]*\\](?![^\\[]*\\])(?![^\\(]*\\))': 1, '\\bgbg0l\\b': 20, '\\bgbgt\\b': 23, '\\bgbgtd\\b': 20, '\\bgbkc\\b': 2, '\\bgbmt\\b': 20, '\\bgbzt\\b': 29, '\\s*[,;]$': 1, '^((?:[\\w\\u00c0-\\uFFFF\\*\\-]|\\\\.)+)(?![^\\[]*\\])(?![^\\(]*\\))': 1, '^(?:([^:\\/?#.]+):)?(?:\\/\\/(?:([^\\/?#]*)@)?([\\w\\d\\-\\u0100-\\uffff.%]*)(?::([0-9]+))?)?([^?#]+)?(?:\\?([^#]*))?(?:#(.*))?$': 2, '^([\\[\\]])$': 119, '^,': 20, '^Yahoo Application State Plugin$': 1, '^[\\s]+|[\\s]+$': 2} What's interesting about this set is that we can get 796 of the 834 potential non-atom hits if we only atomize strings of length 16-or-less. That may be a decent compromise to avoid atomization overheads.
Attached patch WIP: Cache |RegExpPrivate|s. (obsolete) — Splinter Review
Puts a RegExpPrivate cache on the ThreadData for JSAtom sources only. Hurray immutability! I want to do a little bit of browsing-in-the-name-of-science before I put it up for review. Particularly, I want to see how many non-atom sources we "could have hit on if we cached those" and how much memory overhead that produces.
Attachment #516796 - Attachment is obsolete: true
Oh, and I still have to add a purge-to-size mechanism. (That's how we emulate HashTable shrink-to-size these days.) The number of entries will drop to zero on GC later due to bug 673189.
Attached file Top offenders.
My benchmark was to run an already-atomized-only cache and instrument RegExpPrivate creation. I was a lot more scientific this time, and it had very interesting results. Opening the following 6 sites from session restore: Twitter, GReader, GMail, TechCrunch, Facebook, and Delicious, the hit rate was miserable. 22112 events 62 hits, 22050 misses (0.28 % hit ratio) As the text file shows, we get nice contributions from GMail, Twitter's pheonix-core thing and even our own nsSessionStore. The nsSessionStore is due to not atomizing regexp sources out of XDR, which (NOTE REVIEWER! :-) I will fix in this bug. All of these top offenders were uncacheable due to being non-atoms. If we atomize all of these patterns that are less than 16 chars. If we atomize strings we see with <= 16 chars we should hit a nice speed/usage tradeoff at a 98.3% hit rate for this workload. Upcoming science: time the aggregate atomization overhead, dump code sizes as well so we can know how much duplication we're talking about eliminating.
Also, (for verification) inspecting the GMail js file by hand I see 31 instances where RegExp is called directly (despite minification), at least in a handful of cases with tiny literal strings.
This is very interesting data. Is the bug title correct, assuming that TM refers to TraceMonkey, that this bug is for TM only? IIUC TM is disabled when TI is enabled, and is set to go away entirely when IM is ready.
(In reply to Emanuel Hoogeveen from comment #9) > This is very interesting data. Is the bug title correct, assuming that TM > refers to TraceMonkey, that this bug is for TM only? IIUC TM is disabled > when TI is enabled, and is set to go away entirely when IM is ready. Looking at the code this patch is not TraceMonkey related. Putting TM in front of bugs was some weird thing being done back in these days.
Summary: TM: add regexp cache → Add RegExpPrivate cache
I made the chars we read out of the token stream and the strings we XDR decode into atoms and we look good hit rate wise: 21787 events 20526 hits, 1261 misses (94.21% hit ratio) kind | count ------------+------ cold | 648 hit | 20526 eviction | 22 uncacheable | 591 Since we were be creating fresh JSStrings for those token-stream-read creations the atomizing should have added very little overhead, but I'll make an opt build to time it just to be sure. Without any purging we keep a ~2.5K hash table around to cache all the values. Half the RegExpPrivates that exist are there because they're non-atom sources, so I still need more memory usage and execution time data -- maybe atomizing more things would be worth it to reduce the duplication.
Duplicate code savings: sum: 183,668B; avg per RegExpPrivate: 895.94; stddev: 3,087.15 At least in my benchmark the code savings is negligible. We can always atomize more stuff later if we find that's a win in the real world, and just caching the naturally atomized stuff is better than what we have now. Will have a patch with purging soon.
Attached patch WIP: cache with purging. (obsolete) — Splinter Review
This is close to done, but I have to fix a few things that are bothering me: 1) I'm pretty sure this assertion method doesn't work with background sweeping. I *could* deallocate the cache pre-mark and assume refcounting correctness, but the debug refcounting assertions are nice. I either have to figure out a way to do it in JS_THREADSAFE that doesn't make your eyes bleed, or only perform the assertions when !JS_THREADSAFE (which seems undesirable). 2) The string regexp methods are violating the RegExpPrivate boundary quite heavily in order to avoid creating a RegExpObject. This leads to more annoying problems in the refcount assertions where I have to say, "check that the string regexp methods are appropriate messing around with RegExpPrivates behind our back," which is gross. 3) Switch the RegExpPrivates from taking JSContext parameters to JSRuntime parameters now that they live in the ThreadData. 4) There's too many scary verbs on the RegExp data structures... purge, finalize, reset, decref. If I can't consolidate any of those verbs they at least need a big comment at the top of RegExpObject.h to say how they all work together with the ThreadData cache.
Attachment #565763 - Attachment is obsolete: true
(Oh, and it passes shell tests in that form.)
(In reply to Chris Leary [:cdleary] from comment #13) > 1) I'm pretty sure this assertion method doesn't work with background > sweeping. billm pointed out that it does because we don't background finalize anything with clasp->finalize.
Attached patch WIP: cache with purging. (obsolete) — Splinter Review
Looks like this is passing jsreftests. Interesting note to self: you can't legitimately free up memory from a thread-data level cache when you're doing compartmental GCs. Pushed to try.
Attachment #565841 - Attachment is obsolete: true
Looks good on try. Will post for review this afternoon.
Depends on laziness patch from bug 673188.
Attachment #566104 - Attachment is obsolete: true
Attachment #566392 - Flags: review?(luke)
(Sorry luke, didn't qref the removal of that one fprintf chunk!)
Depends on: 673188
Comment on attachment 566392 [details] [diff] [review] Cache |RegExpPrivate|s, purge cache on global GC. Waldo said he would be willing to take this review (and his review queue is less loaded).
Attachment #566392 - Flags: review?(luke) → review?(jwalden+bmo)
Comment on attachment 566392 [details] [diff] [review] Cache |RegExpPrivate|s, purge cache on global GC. Review of attachment 566392 [details] [diff] [review]: ----------------------------------------------------------------- The RegExpObjectBuilder idiom feels weird to me. Maybe it's the name that feels strange. I don't know. My concerns about it are not coherent enough to suggest something better. ::: js/src/jscntxt.cpp @@ +163,5 @@ > + RegExpPrivateCache *newCache = rt->new_<RegExpPrivateCache>(rt); > + > + if (!newCache || !newCache->init()) { > + if (newCache) > + rt->delete_<RegExpPrivateCache>(newCache); If delete_ is null-safe I'd remove the null-check. @@ +367,5 @@ > +#ifdef JS_THREADSAFE > + for (JSThread::Map::Enum e(cx->runtime->threads); > + !e.empty(); > + e.popFront()) { > + JSThread *thread = e.front().value; { on new line. ::: js/src/jscntxt.h @@ +1255,5 @@ > */ > js::GCHelperThread *gcBackgroundFree; > #endif > > + js::ThreadData *threadData() { return JS_THREAD_DATA(this); } File a bug to remove JS_THREAD_DATA in favor of this? I've wanted to do this for a long time, never was touching close enough code to actually do it. ::: js/src/jsstr.cpp @@ +1266,5 @@ > */ > int32 match() const { return match_; } > }; > > class RegExpPair An overview comment for why this class exists would be nice. It's not entirely obvious to me, at a skim of its uses. @@ +1280,5 @@ > public: > + explicit RegExpPair(JSContext *cx) : cx(cx), rep_(cx), markedOutstanding(false) {} > + > + ~RegExpPair() { > + if (!!markedOutstanding) Wouldn't operator bool be worth adding to DebugOnly as well? ::: js/src/vm/RegExpObject.cpp @@ +104,5 @@ > + if (!reobj_->init(cx, rep->getSource(), rep->getFlags())) { > + rep->decref(cx); > + return NULL; > + } > + reobj_->setPrivate(rep.get()); If this is a RegExpObject, shouldn't we have a method to do this, that takes in an AIR<>, rather than do it all manually here? The free-floating decrefs here (and increfs in callers) are not happy-making. I assume the semantics here are that the caller passes ownership of his reference to this function, which is then responsible for decrementing it at the appropriate time (either now, in case of error, or later when the reobj dies)? I suspect based on the name this is what WebKit's PassRefPtr (?) templates are for; maybe we should have something similar. @@ +151,5 @@ > + return build(other->getSource(), newFlags); > + } > + > + RegExpPrivate *toShare = other->getPrivate(); > + if (toShare) { if (RegExpPrivate *...) ::: js/src/vm/RegExpObject.h @@ +81,5 @@ > * so this function is really meant for object creation during code > * execution, as opposed to during something like XDR. > */ > + static inline RegExpObject *create(JSContext *cx, RegExpStatics *res, const jschar *chars, > + size_t length, RegExpFlag flags, TokenStream *ts); If you're renaming arguments, ts here should be tokenStream. That said, why are you removing the line break between return type and methodname? I find the other alignment more readable given how many different arguments this function takes.
Attachment #566392 - Flags: review?(jwalden+bmo) → review+
Thanks for the review Waldo! Just going to finish investigating bug 696524 before I land this so we don't muddle up the results.
Depends on: 696524
https://hg.mozilla.org/integration/mozilla-inbound/rev/366d80e91816 The small V8 regression was due to some regexp compilation being duplicated in v8-regexp. The cache largely fixes that issue, but it still gets cleared out and recompiles regexps about 6 times during v8-regexp. This at least gets our regexp score back where it was, but there's followup work we can do by only purging on a GC_SHRINK (thereby not clearing out compiled regexps during v8-regexp). That'll be left to the v8-regexp investigation bug.
Target Milestone: --- → mozilla10
Status: ASSIGNED → RESOLVED
Closed: 13 years ago
Resolution: --- → FIXED
Depends on: 700915
Depends on: 701387
Depends on: 702352
Depends on: 701399
Depends on: 701396
Depends on: 701332
Depends on: 728705
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: