Closed Bug 461050 Opened 11 years ago Closed 11 years ago

Implement regular expressions by compiling to native code


(Core :: JavaScript Engine, enhancement)

Not set





(Reporter: dmandelin, Assigned: dmandelin)


(Depends on 1 open bug)



(3 files, 4 obsolete files)

The idea is to compile SpiderMonkey regular expression ASTs to nanojit LIR, and then let nanojit compile to native code. This will enable compilation of regular expressions on all nanojit platforms.
Depends on: 460892
Depends on: 461065
Depends on: 461067
Here's a sample of regular expressions matched by loading a few popular sites. Notably, I found no uses of \B (non-(word-boundary)) or backreferences. Also, negative lookahead assertions are only used twice. I also notice that a fair number of the patterns are flat strings, which could be optimized quite a bit without even doing any fancy compilation stuff. But for long strings, a native code matcher will still be faster (assuming Boyer-Moore isn't even faster than that).
Depends on: 461834
Attached patch Patch for fixed RE (obsolete) — Splinter Review
This patch implements the fixed regex from SunSpider /agggtaaa|tttaccct/i. Just to make things weird, that regex gets run when the input regex is /a/. The main point is that doing this with nanojit seems to actually work, and perf is about the same as what WREC gets. (Which I think is more evidence for the idea that WREC's perf on regexp-dna test cases is memory-bound.)
Another thing about the test patch is that it horribly abuses the 'skipped' field in the REGlobalData struct by using it to store the 'anchor start' during the matching process. This seems more efficient to me, because this way only one value (the anchor start) needs to be incremented in the anchoring loop, while in the basic design, both 'skipped' (difference between anchor start and search start) and 'cp2' (anchor start) need to be updated on each iteration. Does anyone know if there is some reason the 'skipped' field is required to be the way it is, or if it is OK to replace it with 'anchor start'?
Depends on: 462412
No longer depends on: 460892
No longer depends on: 461065
Depends on: 454301
Attached patch v2, runs regexp-dna.js (obsolete) — Splinter Review
Basic RE compiler that handles empty, flat, char class, and alternation. Has known bugs related to case sensitivity of flat chars, possibly others. Still need deallocation.

Runs regexp-dna.js within 10ms of WebKit.
Attachment #345209 - Attachment is obsolete: true
Depends on: 462208
Attached patch fixed a couple bugs (obsolete) — Splinter Review
Attachment #345586 - Attachment is obsolete: true
Reviewable patch for tracemonkey. This gives identical results on the regression test suite but runs regexp-dna much faster. Note that there is a code freeze tomorrow.

Sadly, it doesn't run a whole lot more than what's in regexp-dna. E.g., I don't have boundary assertions done yet, or even the . operator. Those are easy to put in, what's harder is overcoming the same limit WREC has, handling parens. It appears that the fundamental problem with the current design is the lack of a way to indicate where to go next when the current part of the regex has finished matching. I tried various simple schemes but none of them worked. I think the easiest thing that actually does work is to pass to each code generation function (e.g., compileAlt) a "continuation" argument as in the spec. The argument is actually a little object that describes how to compile what comes next. These objects can potentially be nested, and with capturing there end up being various kinds of them, so complexity grows. There may be a simpler scheme but none of the dumber ways of doing it seemed to work.
Attachment #345640 - Attachment is obsolete: true
Attachment #346172 - Flags: review?(danderson)
Attachment #346172 - Flags: review?(brendan)
I forgot to say in comment 6 that I haven't yet built a browser with this, that's my next step. But with the code freeze I wanted to request review tonight.
Comment on attachment 346172 [details] [diff] [review]
Patch, "passes" regression test suite & speeds up regexp-dna.js

Looks good on the nanojit side, some small fixes/comments:

>+ * FIXME  Should consolidate with tracing structures or eliminate the need
>+ *        for a guard record at the loop edge. */

I've got a patch in my queue that factors out common recording tasks like setting up and using filters/guards, should help here when it's done.

>+        pos = lir->ins2(LIR_add, pos, lir->insImm(2));

Use LIR_piadd when adding to a pointer, for 64-bit portability.  

>+        /* FIXME  When available in nanojit, use LIR that can generate
>+         *        indexed load instructions instead of this workaround. */
>+        LIns *bitmap = lir->insLoad(LIR_ld, lir->insImm((int32_t) charSet), offsetof(RECharSet, u.bits));

I'll refresh the patch in bug 444682 very soon.  For now could you #if defined NANOJIT_64BIT and #error (or insImmq)?

>+        pos = lir->ins2(LIR_add, pos, lir->insImm(2));

LIR_piadd again.

r=me with 64-bit portability nits
Attachment #346172 - Flags: review?(danderson) → review+
Oops, no need for #ifdefs - just use insImmPtr instead of insImm
Depends on: 444682
Attachment #346172 - Flags: review?(gal)
Attachment #346172 - Flags: review?(gal) → review+
Attachment #346172 - Flags: review?(brendan)
Attached patch OOM handling for regexps (obsolete) — Splinter Review
This was part of the original patch, I have no idea how I lost it from my mq so that it didn't get checked in. The main test of this is the regression suite tests regress-367561-0(1|3).js.
Attachment #346391 - Flags: review?(gal)
Attachment #346391 - Attachment is obsolete: true
Attachment #346393 - Flags: review?(gal)
Attachment #346391 - Flags: review?(gal)
Attachment #346393 - Flags: review?(gal) → review+
Comment on attachment 346172 [details] [diff] [review]
Patch, "passes" regression test suite & speeds up regexp-dna.js

Neat patch, nice work. Nits only here, r=me on checkin to fix style.

>+    /* Fragmento for the regular expression compiler. This is logically
>+     * a distinct compiler but needs to be managed in exactly the same
>+     * way as the real tracing Fragmento. */

Pervasive third style for major comment should change to use original BSD-Unix style:

     * Fragmento for the regular expression compiler. This is logically
     * a distinct compiler but needs to be managed in exactly the same
     * way as the real tracing Fragmento.

or else (when in gal's Rome ;-) Harbison&Steele style:

    /* Fragmento for the regular expression compiler. This is logically
       a distinct compiler but needs to be managed in exactly the same
       way as the real tracing Fragmento. */

Two major comment styles is more than enough :-P.

>+struct LoopGuardRecord {
>+    void *jmp;
>+    GuardRecord* next;
>+    SideExit* exit;
>+    GuardRecord* guards;
>+    Fragment *from;
>+    Fragment *target;

Usually we align members.

Also the declarator mode (* above, but & later) sometimes cuddles the type, sometimes the declarator name. Please use C++ style and cuddle the type (which means one declarator per declaration, of course).

>+/* dpm -- need to delete all the new'd stuff here. */

FIXME: bug nnnnnn comment is best, with bug nnnnnn on file.

>+class RegExpNativeCompiler {
>+ private:
>+    JSRegExp      *re;   /* Careful: not fully initialized */
>+    CompilerState *cs;   /* RegExp to compile */
>+    Fragment      *fragment;
>+    LirWriter     *lir;
>+    LIns          *state;
>+    LIns          *gdata;
>+    LIns          *cpend;

Definitely want C++ style (declarator mode cuddles type).

>+    JSBool isCaseInsensitive() const {
>+        return cs->flags & JSREG_FOLD;
>+    }

Not a bad style for short methods, but the { should go on its own line in general, and definitely for long methods. These could even be one-liners...

>+    void targetCurrentPoint(LIns *ins) {
>+        ins->target(lir->ins0(LIR_label));
>+    }

... while this one...

>+    void targetCurrentPoint(LInsList &fails) {
>+        LIns *fail = lir->ins0(LIR_label);
>+        for (size_t i = 0; i < fails.size(); ++i) {
>+            fails[i]->target(fail);
>+        }
>+        fails.clear();
>+    }

... and definitely this one...

>+    JSBool compileFlatSingleChar(RENode *node, LIns *pos, LInsList &fails) {
>+        /* Fast case-insensitive test for ASCII letters: convert text
>+         * char to lower case by bit-or-ing in 32 and compare. */
>+        JSBool useFastCI = JS_FALSE;
>+        jschar ch = node->u.flat.chr; /* char to test for */
>+        jschar ch2 = ch;              /* 2nd char to test for if ci */
[many lines snipped]
>+        pos = lir->ins2(LIR_add, pos, lir->insImm(2));
>+        return compileNode(node->next, pos, fails);
>+    }

Should put { on its own line, indented to same column as closing }.

Note * cuddling declarator name in formal parameters above. But enough nits -- great work, again!

Comment on attachment 346393 [details] [diff] [review]
Forgot to re-add the error check for assembly

>     JSBool compileNode(RENode *node, LIns *pos, LInsList &fails) {
>+        if (fragment->lirbuf->outOmem()) return JS_FALSE;

Forgot to nit this: prevailing style, which has debug-breakpoint-ability substance, wants return on its own line, outside of magic local macros we will ignore here (which must stay very local, or fall under Taras's gaze).

Depends on: 463258
Depends on: 463260
Depends on: 463545
I just tried unrolling the anchoring loop 2x and then 4x to see if that would improve perf on regexp-dna. But it didn't. The reason we expected it to is that the outer loop has to store and reload the position each time through because of the design of nanojit. I'm not sure why it didn't help, but I imagine it's either because the store and reload wasn't on the critical path, or because there was too much register pressure and equivalent spills were generated.

But maybe it's not that important: when we include the word-by-word compare, time taken on my machine for the main regexp part of regexp-dna is 24ms, compared with 32ms for WebKit. We're probably running near maximum efficiency on that test. WebKit is still a hair faster on that test, but it appears their advantage is all in the calls to replace. One of those uses a regexp (which they compile, but we don't yet) and the other a flat string (which they special-case, but we don't yet). So I think I'll work on those a bit.
This was checked in on 2008/11/04. Should this bug be resolved?
Up to dmandelin. I see unfixed dependencies but they could block a new metabug, as this one by its summary does indeed seem fixed.

I consider it solved as originally intended. The remaining deps all seem to represent small enhancements.
Closed: 11 years ago
Resolution: --- → FIXED
You need to log in before you can comment on or make changes to this bug.