Closed Bug 938020 Opened 11 years ago Closed 8 years ago

Regex runtime depends on input size when it shouldn't

Categories

(Core :: JavaScript Engine, defect)

7 Branch
x86
All
defect
Not set
normal

Tracking

()

RESOLVED WORKSFORME

People

(Reporter: peter.isza, Unassigned)

References

Details

Attachments

(1 file)

Attached file regex_test.html
User Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.7; rv:25.0) Gecko/20100101 Firefox/25.0 (Beta/Release)
Build ID: 20131025151332

Steps to reproduce:

This bug makes JISON very slow, which makes compiling CoffeeScript on Firefox very slow. Slower than IE.

Jsfiddle: http://jsfiddle.net/eP2TR/10/.

I have run the following regular expressions on strings that contain only letter 'a':
  1. /^b+/
  2. /^(b)+/
  3. /^(?:b+)/
  4. /^(?:(?:b)+)/

Obviously, there should be no matches. The regex engine should quit after reading the first character, so the running time shouldn't depend on what's after the first letter in the string.

Safari 6 also contains the same bug (I guess it is using the same sh*tty regex compiler), while Chrome, IE10 and Opera don't cont.


Actual results:

In cases #2 and #4 the running times depended (linearly) on the size of the input.


Expected results:

The regex engine should have quit after seeing the first character of the input.
Priority: -- → P3
Regression window for cases #2 and #4(m-c)
Good:
http://hg.mozilla.org/mozilla-central/rev/b69d30cc0b24
Mozilla/5.0 (Windows NT 6.1; WOW64; rv:7.0a1) Gecko/20110606 Firefox/7.0a1 ID:20110606110214
Bad:
http://hg.mozilla.org/mozilla-central/rev/3589f8cefd83
Mozilla/5.0 (Windows NT 6.1; WOW64; rv:7.0a1) Gecko/20110606 Firefox/7.0a1 ID:20110606132754
Pushlog:
http://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=b69d30cc0b24&tochange=3589f8cefd83

Regressed by:
	cc36a234d0d6	David Mandelin — Bug 625600: Update Yarr import to WebKit rev 86639, r=cdleary,dvander
Blocks: 625600
Status: UNCONFIRMED → NEW
Component: JavaScript Engine: JIT → JavaScript Engine
Ever confirmed: true
OS: Mac OS X → All
Version: 25 Branch → 7 Branch
Hmm.  Cases 2 and 4 run in the Yarr interpreter, I would think, due to https://bugs.webkit.org/show_bug.cgi?id=122891

That said, it's odd that the Yarr interpreter would take linear time on those.

Peter, just to make sure, are the regexps in comment 0 representative of the ones JISON is using, in that fixing that case will actually help JISON?
Priority: P3 → --
Flags: needinfo?(luke)
Hi Boris,

Yes, we parse JavaScript code with jison (http://almafa.org), and that's how we discovered the bug. We didn't try it with CoffeeScript though, but we're pretty sure the bug slows that down too.

By the way I thought the regexp engine used a finite automaton to parse the string.
I'm not familiar with the details of how Yarr works.  I know some regexps get compiled into optimized jitcode that does the match and some run in effectively a regexp interpreter.  How that last is implemented, I don't know.
I disavow any knowledge of this; I'm not sure who maintains regexps now...
Flags: needinfo?(luke)
Is Yarr a separate project? Should I file a bug there?
I have reported the bug here as well: https://bugs.webkit.org/show_bug.cgi?id=124319#c0.
Scrolling in CodeMirror is also extremely slow due to this bug when the window is big and syntax highlighting is turned on.
We don't use yarr anymore, we have irregexp instead. The jsfiddle in the first comment shows ~constant time for all the cases, so I suspect that as the result of the exec() call isn't used, it's dce'd by Ion and we are effectively measuring an empty loop.

If I change the initial code to do more runs and remember what the last result of exec() is (and print it to make sure it's not dce'd), I can see that we still have ~constant time. So closing as worksforme, feel free to open a new bug if you think the contrary.

http://jsfiddle.net/eP2TR/11/
Status: NEW → RESOLVED
Closed: 8 years ago
Resolution: --- → WORKSFORME
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: