406271 - JavaScript Regular Expression Extremely Slow

Reporter

Description

•

17 years ago

User-Agent:       Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.10) Gecko/20071115 Firefox/2.0.0.10
Build Identifier: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.10) Gecko/20071115 Firefox/2.0.0.10

I have Javascript code used to highlight text.  When the JavaScript executes over a large document it is extremely slow.
URL: http://hro.abc-clio.com/regextest/sample.asp
In my test the sample took about 80 seconds to execute.
There is an alert at the beginning and end of the script to show the user when highlighting begins and ends.  
When I tested the same code in Internet Explorer it completed in about 14 seconds.
There is now "slow script" notice printed so the user does not have the option to stop execution of this script.

Reproducible: Always

Steps to Reproduce:
1.Load Web Page
2.Watch for "start" and "done" alert to observer time taken.
3.Load the same page in Internet Explorer and watch how quick it is
Actual Results:  
Execution of the script is extremely slow, about 8-10x slower than IE.

Expected Results:  
Complete in a similar time to internet explorer.
I would expect the typical, this script is unresponsive error, where I can select to continue exection, stop, or debug options, to appear sometime during the execution of this slow script.

Mats Palmgren (inactive)

Comment 1

•

17 years ago

Firefox trunk 2007113004 on Linux completes in 6 seconds for me.

Andy, please test a trunk build and report your results:
http://ftp.mozilla.org/pub/mozilla.org/firefox/nightly/latest-trunk/

FYI, Firefox 2.x only takes security and crash fixes (and regressions from
those fixes) at this point.

Jesse Ruderman

Updated

•

17 years ago

Assignee: nobody → general

Component: General → JavaScript Engine

Keywords: perf

Product: Firefox → Core

QA Contact: general → general

Version: unspecified → 1.8 Branch

Andy Reagan

Reporter

Comment 2

•

17 years ago

I downloaded the latest trunk as suggested and the sample code completed in 41 seconds.

The user-agent from the about for the trunk is: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9b2pre) Gecko/2007120305 Minefield/3.0b2pre

I tested on a co-workers machine on the generally released version of FireFox and he had the same results.

Bob Clary [:bc] (inactive)

Comment 3

•

17 years ago

I get a 404 not found on the url. Is there another location for the test script?

Andy Reagan

Reporter

Comment 4

•

17 years ago

Sorry about that.  I had a server fail and the site switched to a backup server that didn't have the example.  The server is back up so the sample is available.  I also put the sample on the backup server in case I get another unexpected fail-over.

Mats Palmgren (inactive)

Comment 5

•

17 years ago

My test results on Windows XPSP2:
Firefox trunk 2007120505  6 seconds
Opera 9.21                2 seconds
IE7                       2 seconds

RFLazaro

Comment 6

•

16 years ago

Does this still happen with the recent nightlys?

Also, link is not accessible.

Michael Haufe

Comment 7

•

16 years ago

I can confirm that Regular Expressions are slower than expected.

FireFox: 3.1b2 (20081201)
OS: Windows Vista SP 0
1 GB RAM, 1.66GHZ Dual Core

While developing a tokenizer I came across the following results with 500 lines of JavaScript:

Chrome 1.0.154.43:    22ms
IE7:                 359ms
FireFox 3.1b2:     1,083ms
Safari 3.2.1:      4,573ms
Opera 9.63:        6,260ms

URL to test with: http://thenewobjective.com/temp/FxBugReport/JSParse.html

Just paste about 500 or so lines of JS into the first textarea and hit "Tokenize".

Hope this helps

Michael Haufe

Comment 8

•

16 years ago

Sorry, looks like I was wrong on those numbers and the cause. Here's a reduced test with results:

--------------------------------------------
onload = function(){
   var str = "qwdqw !@%^&*hrty4567756223!dewbghgN ER  .<.TYu5  ";
   var re = /\w+/
   var m;
   var d1 = new Date()
   for(var i = 0;i < 1000000;i++){
      m = re.exec(str)
   }
   alert(new Date() - d1 + "ms")
}
--------------------------------------------

Chrome 1.0.154.43:    1,173ms
Safari 3.2.1:         1,450ms
Opera 9.63:           2,683ms
FireFox 3.1b2:        5,326ms
IE7:                  7,987ms

Zibi Braniecki [:zbraniecki][:gandalf]

Comment 9

•

16 years ago

confirming.

On MBP, I got:

Minefield today nightly: 4200ms
recent webkit nightly: 459ms

Status: UNCONFIRMED → NEW

Ever confirmed: true

Blake Kaplan (:mrbkap) (inactive)

Comment 10

•

16 years ago

This testcase points out two problems: we don't trace RegExp.prototype.exec, and we don't native-compile any of the naked classes.

Brendan Eich [:brendan]

Comment 11

•

16 years ago

Dave, quick fix to the regexp compilation issue possible?

Traceable native for exec should be a no-brainer, zero-risk to boot -- separate bug blocking this one seems best.

Competitive issue, not a blocker by some criteria but close to it in my book.

/be

Flags: wanted1.9.1?

OS: Windows XP → All

QA Contact: general → dmandelin

Hardware: x86 → All

David Mandelin [:dmandelin]

Comment 12

•

16 years ago

If I read everything right the regex is /\w+/, and in that case a cheesy quick fix is possible. I had hoped to switch to a new compiler structure before doing quantifiers, but it could be done either way.

Robert Sayre

Updated

•

16 years ago

Flags: wanted1.9.1? → wanted1.9.1+

Michael Haufe

Comment 13

•

15 years ago

Are all of these the same fundamental issue?

bug 406271
bug 480759
bug 459783

Boris Zbarsky [:bzbarsky]

Updated

•

15 years ago

Flags: blocking1.9.2?

Michael Haufe

Comment 14

•

15 years ago

Updated numbers from my last post:

Chrome 2	 409ms
Safari 4b	 429ms
Opera 10a	1228ms
Fx 3.5b4	1306ms
IE8		2794ms

Robert Sayre

Updated

•

15 years ago

Blocks: 503107

Luke Wagner [:luke]

Assignee

Comment 15

•

15 years ago

Attached patch WIP v.1 (obsolete) — Details — Splinter Review

Not quite ready for review, but this patch passes the js tests and my unit tests.  First, I added compilation for \w \W \s \S \d \D and '.'.  Next, to /\w+/ at all, it is necessary to determine that no backtracking is needed when greedily consuming the quantifier body.  So, /a*/ /a*b/ /\w*#/ /.*\n/ and /\d*\s/ don't need backtracking, while /a*a/ /\w*a/ /\w*\d/ and /.*s/ do need backtracking and thus can't be compiled.

A big part of this patch is enumerating the characters that can be consumed along a path, representing this set efficiently, and testing set intersection.  Once this was done, I was able to generalize quantifiers and alternatives to support nesting (as long as they use non-capturing parenthesis).

Current limitations:
 - '.' includes almost everything, so the "no-backtracking" limitation will abort compilation on practically every regexp where .* isn't last.  (e.g., /.*star.*/ in bug 503107 comment 1)
 - user-defined character classes don't have an obvious efficient way to implement enumeration/intersection, so regexps like /(?:[a-c]|[d-f])x/ will conservatively assume [a-c] and [d-f] intersect and fail compilation.
 - only *, +, and ? quantifiers are supported because general {n,k} requires a stack of loop counters.  (It would be easy to support {n,}, with a fixed upper bound on n, through "inlining")
 - only greedy quantifiers are supported
 - no capturing parenthesis

Comments welcome

Assignee: general → lw

Status: NEW → ASSIGNED

Luke Wagner [:luke]

Assignee

Comment 16

•

15 years ago

Attached file Performance test — Details

Performance comparison:

Without JITing (trunk): 2624ms
With JITing (patch):    111ms

Jeff Walden [:Waldo]

Updated

•

15 years ago

QA Contact: dmandelin → general

Luke Wagner [:luke]

Assignee

Comment 17

•

15 years ago

Attached patch WIP v.2 (obsolete) — Details — Splinter Review

Bug fixes and performance tweaks.

SunSpider performance improved:
dna:         1.126x as fast    39.4ms +/- 1.3%    35.0ms +/- 0.0%
tagcloud:    1.013x as fast    95.3ms +/- 0.8%    94.1ms +/- 0.7%
unpack-code: 1.050x as fast    115.2ms +/- 1.2%   109.7ms +/- 0.9%

I'm not sure how much of this is regexps being compiled now that weren't before and how much is the tweaks to what gets passed in and gets kept live.

Attachment #387485 - Attachment is obsolete: true

Luke Wagner [:luke]

Assignee

Comment 18

•

15 years ago

Attached patch v.4 (obsolete) — Details — Splinter Review

Attachment #387577 - Attachment is obsolete: true

Attachment #387988 - Flags: review?(dmandelin)

David Mandelin [:dmandelin]

Comment 19

•

15 years ago

Comment on attachment 387988 [details] [diff] [review]
v.4

I thought it was kind of unfortunate to require all the character set testing code, since it would be temporary until we get backtracking, but then I realized it could continue to be useful in other algorithms or optimizations, so that seems fine.

Also, very cool C++ usage in general. Mostly I just have a bunch of nits about names and comments.

>+/*
>+ * Return whether the given subexpression has a valid path consuming no
>+ * character (epsilon edge).  'next' indicates whether the algorithm should
>+ * consider the next expression in concatenation order.  The conservative answer
>+ * is 'true'.
>+ */
>+static bool epsPath(RENode *node, bool next = true)

There are some ambiguities in this comment (what does 'consider' mean? and 'conservative answer to what?' (context suggests 'next', which is of course wrong)). I recommend a revision like

/*
 * Return true if the given subexpression may match the empty string. The conservative
 * answer is |true|. If |rest| is true, then the subexpression is considered to be the
 * rest of the regexp, starting from node. Otherwise, the subexpression is considered to
 * be |node| in isolation.
 */
static bool mayMatchEmpty(RENode *node, bool rest = true)

But I don't know if |rest| is a particularly great name. 

Note that I also recommend a rename to |mayMatchEmpty|, because mention of epsilon edges may be confusing or distracting in this context (see also below).

>-    JSBool targetCurrentPoint(LIns* ins) 
>-    {
>-        if (fragment->lirbuf->outOMem()) 
>-            return JS_FALSE;
>-        ins->setTarget(lir->ins0(LIR_label)); 
>-        return JS_TRUE;
>-    }
>+    void targetCurrentPoint(LIns *ins)
>+    {
>+        ins->setTarget(lir->ins0(LIR_label));
>+    }

I noticed you removed the OOM checks from these items. I remember having to add them to fix some bug. Are you sure that the LIR buffer cannot be OOM before these are called? If it is, I think you can end up trying to patch random stuff and going very bad.

>+    /* Factor out common code to index js_alnum. */
>+    LIns *compileIndexTable(LIns *chr, const bool *tbl)

When I saw this I thought I would be reading a function that builds an index table. I recommend renaming to something like |compileTableRead|.

>+    {
>+        if (sizeof(bool) != 1) {
>+            LIns *sizeLog2 = lir->insImm(StaticLog2<sizeof(bool)>::result);

I like the template thing very much.

>+    LIns *compileAlt(RENode *node, LIns *pos, bool atEnd, LInsList &fails)
>+    {
>+        RENode *leftRe = (RENode *)node->kid, *rightRe = (RENode *)node->u.kid2;
>+
>+        /*
>+         * If the RE continues after the alternative, we need to ensure that no
>+         * backtracking is required.  Recursive calls to compileNode will fail
>+         * on capturing parens, so we guard here on non-deterministic branches
>+         * ("left branch succeded, continue to next or try other branch?") in
>+         * the underlying NFA.
>+         */

*Please* avoid the terms "NFA" and "nondeterministic" in the context of JS regular expression matching. In my experience, they open up endless confusing discussions because those terms are strongly overloaded in this context. (In particular, people tend to start thinking about automata theory constructs that we are totally not using here.) Also, they are mostly incorrect. Perl- and ECMA-style regular expression matching is not nondeterministic and is not implementable by NFAs. It is a deterministic backtracking search, which is a PDA in automata theory terms. So please rewrite all those comments.

>+        if (!atEnd) {
>+            /* Catch non-deterministic branch between left and right */

The part that really needs explanation is why we test this condition. I think you want something like: 

              // We don't support backtracking yet, so we are correct only if
              // for any given text string there is only one way to match this alt.

By and large, though, the code itself in this section is pretty clear.

>+        /*
>+         * This might be the only LIR_live in Mozilla, so I will explain its
>+         * sinister semantics.  LIR_lives must appear immediately following a
>+         * backwards jump and describe what is live immediately at the *target*
>+         * of the back-edge.  Thus, these instructions answer the question "what
>+         * is live at the top of the loop?", which makes sense, because the
>+         * backwards scan has not yet seen the top of the loop and needs this
>+         * information to continue working backwards up the inside of the loop.
>+         *
>+         * Here, 'cpend' and 'state' get defined before the loop, and used
>+         * inside, so they are live at 'loopTop'.  While 'iterBegin' is used
>+         * after the loop, making it live in on loop exit, it gets defined after
>+         * 'loopTop', which "kills" its liveness.
>+         */
>+        lir->ins1(LIR_live, state);
>+        lir->ins1(LIR_live, cpend);
>+

Thanks for figuring this out. I assume variables only need to be listed this way if they are only live because of a read in the loop body, and that if they are read after the loop, these usually aren't needed. Or is that not true?

Luke Wagner [:luke]

Assignee

Comment 20

•

15 years ago

Attached patch v.5 (obsolete) — Details — Splinter Review

(In reply to comment #19)

Thanks!

> I noticed you removed the OOM checks from these items. I remember having to add
> them to fix some bug. Are you sure that the LIR buffer cannot be OOM before
> these are called? If it is, I think you can end up trying to patch random stuff
> and going very bad.

I talked to Andreas a bit about the OOM situation in nanojit before writing this and, if I understand correctly, the idea is that nanojit OOMs early so that there is enough memory "in reserve" to satisfy another bunch of instructions (e.g., enough to compile a normal JSOP).  Thus, you only have to check OOM if you have a loop or recursion or other particularly long codegen.  This is why, if you search jstracer.cpp, the only outOMem calls are in loops and pinch points like monitorRecording.  In jsregexp, the only loops/recursion pass through compileNode, so that's where I put the main OOM check, with special cases in compileFlat and compileClass.

> Thanks for figuring this out. I assume variables only need to be listed this
> way if they are only live because of a read in the loop body, and that if they
> are read after the loop, these usually aren't needed. Or is that not true?

That's what I initially thought too, but the resulting assembly begged to differ quite forcefully ;).  The basic issue is that, since the regalloc works backwards, when it reaches a backwards edge, it is *completely* reliant on the LIR_live insns to tell it what is now live.  So even if you have a program:

1  x = ...
2  top:
3    ... no use/def of x
4    jt after
5    ... no use/def of x
6  goto top
7  after:
8  ... = x

if you don't make x LIR_live at 6, the regalloc will assume x is not live until is sees 4 (and pulls in the live state from 7).  However, between 6 and 4, the regalloc will not know x is live and insn 5 is free to clobber x's storage.

Attachment #387988 - Attachment is obsolete: true

Attachment #388232 - Flags: review?(dmandelin)

Attachment #387988 - Flags: review?(dmandelin)

Brendan Eich [:brendan]

Comment 21

•

15 years ago

Comment on attachment 388232 [details] [diff] [review]
v.5

>+/*
>+ * This table allows efficient testing for \w during a trace.  ECMA-262
>+ * 15.10.2.6 defines \w to be [0-9A-Z_a-z].  The index must be < 128.

Is JS_ISWORD redundant now? I see it still used in the patch, but it uses the ctype.h lookup table.

>+        if (JS7_ISDEC(*p)) {
>+            if (classes & Digit)
>+                return false;
>+        }
>+        else if (JS_ISWORD(*p)) {

Nit here and throughout: cuddle else with preceding } (dmr, K&R original style).

>+            if (classes & OtherAlnum)
>+                return false;
>+        }
>+        else if (RE_IS_LINE_TERM(*p)) {
>+            if (classes & LineTerms)
>+                return false;
>+        }
>+        else if (JS_ISSPACE(*p)) {
>+            if (classes & OtherSpace)
>+                return false;
>+        }
>+        else {
>+            if (classes & Other)
>+                return false;
>+        }
>+    }
>+    return true;
>+}
>+
>+/*
>+ * Predicate version of the STL's set_intersection.  Assumes both ranges are
>+ * sorted and thus runs in linear time.
>+ *
>+ * FIXME: This is a reusable algorithm, perhaps it should be put somewhere.
>+ */
>+template <class InputIterator1, class InputIterator2>
>+bool set_disjoint(InputIterator1 p1, InputIterator1 end1,
>+                  InputIterator2 p2, InputIterator2 end2)

Super-nit, hate to pick it but we've kept consistent even in recent C++ code: type on preceding line, function declarator starts in new line at column 1.

>+{
>+    if (p1 == end1 || p2 == end2)
>+        return true;
>+    while (true) {
>+        if (*p1 < *p2) {
>+            ++p1;
>+            if (p1 == end1)
>+                return true;
>+        } else if (*p2 < *p1) {
>+            ++p2;
>+            if (p2 == end2)
>+                return true;
>+        }
>+        else {
>+            return false;
>+        }
>+    }
>+}

Seems better to avoid the appearance of falling off the end of the function, by putting the *p1 == *p2 test in the loop condition instead of uselessly-faux-iloopy true, and return false; before closing brace of function.

>+    switch (node->op) {
>+        case REOP_EMPTY:  return true;
>+        case REOP_FLAT:   return false;
>+        case REOP_CLASS:  return false;
>+        case REOP_ALNUM:  return false;
>+        case REOP_ALT:    return (mayMatchEmpty((RENode *)node->kid) ||
>+                                  mayMatchEmpty((RENode *)node->u.kid2)) &&
>+                                 (!next || mayMatchEmpty(node->next));
>+        case REOP_QUANT:  return (node->u.range.min == 0 ||
>+                                  mayMatchEmpty((RENode *)node->kid)) &&
>+                                 (!next || mayMatchEmpty(node->next));
>+        default:          return true;

Prevailing style half-indents labels (case, default and goto target) like the next major hunk:

>+    switch (node->op) {
>+      /* Record as bitflags. */
>+      case REOP_DOT:       set.addClass(CharSet::Dot);     return true;

Nice to see more idiomatic high-class C++ style code over time!

/be

Luke Wagner [:luke]

Assignee

Comment 22

•

15 years ago

(In reply to comment #21)

Thanks; don't worry, I'll get this style eventually :)

> (From update of attachment 388232 [details] [diff] [review])
> >+/*
> >+ * This table allows efficient testing for \w during a trace.  ECMA-262
> >+ * 15.10.2.6 defines \w to be [0-9A-Z_a-z].  The index must be < 128.
> 
> Is JS_ISWORD redundant now? I see it still used in the patch, but it uses the
> ctype.h lookup table.

Perhaps I should place "extern js_alnum" in jsstr.h, define js_alnum in jsstr.cpp, and change #define JS_ISWORD to use js_alnum instead of of std::isalnum?

David Mandelin [:dmandelin]

Comment 23

•

15 years ago

Comment on attachment 388232 [details] [diff] [review]
v.5

On the OOM checking, I know that initially, I did not check OOM in the branch patch functions (and returned void, just like you did), but found it necessary to avoid some crashes. Nanojit OOMs have changed since then, so maybe it's OK. I should look up those bugs and see if they have any good test cases.

Attachment #388232 - Flags: review?(dmandelin) → review+

Luke Wagner [:luke]

Assignee

Comment 24

•

15 years ago

Attached patch v.6 (obsolete) — Details — Splinter Review

Other than the stylistic changes, this patch moves js_alnum from jsregexp.cpp to js_Alnum in jsstr.h/jsstr.cpp and redefines JS_ISWORD to use js_Alnum.  In a  micro-benchmark (gcc -O3) of this new table lookup vs. std::isalnum, the table lookup is 10x faster because (1) its inlined and (2) isalnum doesn't know we only care about [0,127] and so it does locale-ish things.

Attachment #388232 - Attachment is obsolete: true

Luke Wagner [:luke]

Assignee

Comment 25

•

15 years ago

Attached patch rebased (obsolete) — Details — Splinter Review

Attachment #388294 - Attachment is obsolete: true

Luke Wagner [:luke]

Assignee

Comment 26

•

15 years ago

Attached patch lower-cased js_alnum — Details — Splinter Review

Attachment #389022 - Attachment is obsolete: true

David Mandelin [:dmandelin]

Comment 27

•

15 years ago

Attachment 389036 [details] [diff] pushed to TM as b837948c1daf.

Bob Clary [:bc] (inactive)

Updated

•

15 years ago

Depends on: 505083

Robert Sayre

Comment 28

•

15 years ago

http://hg.mozilla.org/mozilla-central/rev/b837948c1daf

Status: ASSIGNED → RESOLVED

Closed: 15 years ago

Resolution: --- → FIXED

Jesse Ruderman

Updated

•

15 years ago

Depends on: 505400

Ed Lee :Mardak

Updated

•

15 years ago

Blocks: 506201

Robert Sayre

Updated

•

15 years ago

Flags: blocking1.9.2? → wanted1.9.2+

WIP v.1 15 years ago Luke Wagner [:luke] 53.47 KB, patch		Details \| Diff \| Splinter Review
Performance test 15 years ago Luke Wagner [:luke] 360 bytes, application/javascript		Details
WIP v.2 15 years ago Luke Wagner [:luke] 55.13 KB, patch		Details \| Diff \| Splinter Review
v.4 15 years ago Luke Wagner [:luke] 54.99 KB, patch		Details \| Diff \| Splinter Review
v.5 15 years ago Luke Wagner [:luke] 55.31 KB, patch	dmandelin : review+	Details \| Diff \| Splinter Review
v.6 15 years ago Luke Wagner [:luke] 58.04 KB, patch		Details \| Diff \| Splinter Review
rebased 15 years ago Luke Wagner [:luke] 57.10 KB, patch		Details \| Diff \| Splinter Review
lower-cased js_alnum 15 years ago Luke Wagner [:luke] 57.10 KB, patch		Details \| Diff \| Splinter Review