It (1) checks the whole range for case equivalent characters and (2) adds a singleton range for each of these. We often get here with a range containing > 50000 chars. We should skip ranges of "uninteresting" chars and we should try to add a single range instead of many singleton ones. The V8 code this was forked from is a lot smarter. Maybe we can import that.
(In reply to Jan de Mooij [:jandem] from comment #0) > (2) adds a singleton range for each of these. Actually it does try to combine with existing ranges, so this part seems okay.
I added some logging and it seems there are a few large ranges that have no case equivalent chars: [11566, 42559] (30994 chars) [43968, 65312] (21345 chars) It might make sense to skip these.
Btw I think we're also calling this multiple times for the same ranges so we may be doing something stupid higher up the stack. I'll investigate more this week.
OK so for whatever reason previously in bug 1374014 I wasn't seeing a backtrace of where these are coming from! And now I do: setTimeout_timer _matchStringToFieldName getInfo getFormInfo collectFormFields identifyAutofillFields doIdentifyAutofillFields/< setTimeout_timer Here are the regexes in question: https://searchfox.org/mozilla-central/source/browser/extensions/formautofill/content/heuristicsRegexp.js
(Also it seems like we run this code for each field name we find, which looks to be the name of each form control we find on the page: https://searchfox.org/mozilla-central/rev/7cc377ce3f0a569c5c9b362d589705cd4fecc0ac/browser/extensions/formautofill/FormAutofillHeuristics.jsm#101. This may be extremely expensive... If we can't do something about this in SpiderMonkey we should probably file a front-end bug about it.)
OK so the problem here is that some of these chrome regexps use the "u" and "i" flags, and the ranges we add for '.' is where we spend a lot of (unnecessary) time. The patch optimizes this as follows: (1) In MakeCaseIndependent we can skip character classes that contain everything except whitespace and surrogates, as AddCaseEquivalents is a no-op in this case. This is similar to the cc->is_standard check there. (2) In AddCaseEquivalents we can skip surrogate character ranges. The V8 code upstream has a similar check. This improves the micro-benchmark below from >= 10 ms to 0 ms. var re = /f...................................../ui; var t = new Date; print("foo".match(re)); print(new Date - t);
Assignee: nobody → jdemooij
Status: NEW → ASSIGNED
Attachment #8881328 - Flags: review?(arai.unmht)
Comment on attachment 8881328 [details] [diff] [review] Patch Review of attachment 8881328 [details] [diff] [review]: ----------------------------------------------------------------- Thanks, the change looks good, except that js/src/vm/make_unicode.py needs corresponding modification. ::: js/src/irregexp/RegExpCharacters.cpp @@ +140,5 @@ > + 0x2028, 0x2029 + 1, // LINE SEPARATOR..PARAGRAPH SEPARATOR > + 0xD800, 0xDFFF + 1, // <Lead Surrogate Min>..<Trail Surrogate Max> > + 0xFFFF + 1 > +}; > +const int js::irregexp::kLineTerminatorAndSurrogateRangeCount = 9; this file is automatically generated by make_unicode.py, and this needs to be handled there, next to the following line: https://dxr.mozilla.org/mozilla-central/rev/53477d584130945864c4491632f88da437353356/js/src/vm/make_unicode.py#1427 > character_range('LineTerminator', line_terminator)
Attachment #8881328 - Flags: review?(arai.unmht) → feedback+
Good catch, arai. I undid my RegExpCharacters.cpp changes and ran make_unicode.py (which generated exactly the same file) :)
Comment on attachment 8881464 [details] [diff] [review] Patch v2 Review of attachment 8881464 [details] [diff] [review]: ----------------------------------------------------------------- Great!
Attachment #8881464 - Flags: review?(arai.unmht) → review+
Pushed by firstname.lastname@example.org: https://hg.mozilla.org/integration/mozilla-inbound/rev/d084672cc9fc Speed up TextNode::MakeCaseIndependent for the unicode '.' character class ranges. r=arai
Status: ASSIGNED → RESOLVED
Last Resolved: 2 years ago
status-firefox56: --- → fixed
Resolution: --- → FIXED
Target Milestone: --- → mozilla56
You need to log in before you can comment on or make changes to this bug.