I did a short survey of the use of regexps on the web and I found that as many as 2/3 of regexp objects created are never executed. Clearly, compiling them is wasted effort. More importantly, with the regexp->native compiler (see bug 461050), currently it must compile the regexp as both native and bytecode, because the native code can be nuked at any time (whenever the nanojit fragment cache fills up). This has several disadvantages: (a) if the cache is not nuked, compiling the bytecode version is wasted effort, (b) the current eager design does not allow recompilation of the native version when it would be advantageous, and (c) compiling to native eagerly increases the chance that the nanojit cache will be filled and the compilation effort wasted. A better strategy: - on creation, don't compile anything - to run a regexp, first see if there is a bytecode version. if it exists, then the regexp can't be compiled to native, so use that - otherwise, look for a native version and use if it exists. - otherwise, try to compile a native version and use that - if native compilation fails, compile a bytecode version and use that.
This comes with a spec revision, too. Delaying parsing seems like a bad idea because then error messages for invalid regexps are also delayed. So, instead I am just doing the native compilation lazily. Because native compilation costs about 10x as much as parsing and bytecode compilation, this gets most of the benefit. I do have to reparse, because ASTs are annoying to keep around (if kept, they couldn't be freed using an arena mark). This patch is regression-suite-equivalent with tip tracemonkey. I also tested with a stress test that I know causes OOM conditions.
Attachment #346995 - Flags: review?(gal)
Holding off on pushing because of tree bustage and Win32 hasn't finished building yet to prove it's OK now. And also not wanting to break the tree myself just before leaving tonight. Anyone should feel free to push this if they want.
Pushed as changeset 21548:dbb2a6559cf5.
Backed out. sayrer: gal, looks like it crashes on clarin.com test data http://hg.mozilla.org/tracemonkey/rev/d82a3643ef43
Based on Dave's measurements, we need this along with the regexp JIT -- we don't want to take the time and space hits of the JIT up front and for regexps that are never executed. /be
Here's a revised patch. The design is similar but it has been updated for all the other changes we've made since then. And those include a bunch of design improvements (esp. using the regexp source as key in fragment cache) so it should be all around better. On SunSpider it doesn't have any particularly noticeable effect on total time. It probably does speed things up by an ms or two but it's hard to detect improvements that small. I've tested it with the regression suite and with MochiTest (including a patch that runs the bytecode engine on every regexp processed natively and compares the results). I consider it basically done so I flagged for review but I recommend not pushing it until I get back so I can fix any problems promptly.
Comment on attachment 349529 [details] [diff] [review] Revised patch The lazy compilation doesn't produce a speedup, but using the string as key does (> 35ms).
Attachment #349529 - Flags: review?(gal) → review+
Since lazy compilation doesn't produce a speedup, I propose that this not block 1.9.1. We'll be doing lots more work on regexp compilation in 1.9.2, I have no doubt.
Flags: blocking1.9.1+ → blocking1.9.1-
Lazy compilation could save significant memory, though. Measure before minusing. /be
Flags: blocking1.9.1- → blocking1.9.1?
For now it won't save any allocated memory, because nanojit uses a preallocated fixed-sized set of pages for regexps. Lazy compilation might decrease working set size or cause nanojit flushes to happen less often, but that's harder to measure and less obviously important.
minusing per comment 10
Flags: blocking1.9.1? → blocking1.9.1-
These bugs are all part of a search I made for js bugs that are getting lost in transit: http://tinyurl.com/jsDeadEndBugs They all have a review+'ed, non-obsoleted patch and are not marked fixed-in-tracemonkey or checkin-needed but have not seen any activity in 300 days. Some of these got lost simply because the assignee/patch provider never requested a checkin, or just because they were forgotten about.
Now that we're using YARR, this all seems moot. Please reopen if I'm wrong.
Status: NEW → RESOLVED
Last Resolved: 9 years ago
Resolution: --- → WONTFIX
You need to log in before you can comment on or make changes to this bug.