Closed
Bug 938020
Opened 11 years ago
Closed 8 years ago
Regex runtime depends on input size when it shouldn't
Categories
(Core :: JavaScript Engine, defect)
Tracking
()
RESOLVED
WORKSFORME
People
(Reporter: peter.isza, Unassigned)
References
Details
Attachments
(1 file)
605 bytes,
text/html
|
Details |
User Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.7; rv:25.0) Gecko/20100101 Firefox/25.0 (Beta/Release) Build ID: 20131025151332 Steps to reproduce: This bug makes JISON very slow, which makes compiling CoffeeScript on Firefox very slow. Slower than IE. Jsfiddle: http://jsfiddle.net/eP2TR/10/. I have run the following regular expressions on strings that contain only letter 'a': 1. /^b+/ 2. /^(b)+/ 3. /^(?:b+)/ 4. /^(?:(?:b)+)/ Obviously, there should be no matches. The regex engine should quit after reading the first character, so the running time shouldn't depend on what's after the first letter in the string. Safari 6 also contains the same bug (I guess it is using the same sh*tty regex compiler), while Chrome, IE10 and Opera don't cont. Actual results: In cases #2 and #4 the running times depended (linearly) on the size of the input. Expected results: The regex engine should have quit after seeing the first character of the input.
Reporter | ||
Updated•11 years ago
|
Priority: -- → P3
Comment 1•11 years ago
|
||
Regression window for cases #2 and #4(m-c) Good: http://hg.mozilla.org/mozilla-central/rev/b69d30cc0b24 Mozilla/5.0 (Windows NT 6.1; WOW64; rv:7.0a1) Gecko/20110606 Firefox/7.0a1 ID:20110606110214 Bad: http://hg.mozilla.org/mozilla-central/rev/3589f8cefd83 Mozilla/5.0 (Windows NT 6.1; WOW64; rv:7.0a1) Gecko/20110606 Firefox/7.0a1 ID:20110606132754 Pushlog: http://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=b69d30cc0b24&tochange=3589f8cefd83 Regressed by: cc36a234d0d6 David Mandelin — Bug 625600: Update Yarr import to WebKit rev 86639, r=cdleary,dvander
Blocks: 625600
Status: UNCONFIRMED → NEW
Component: JavaScript Engine: JIT → JavaScript Engine
Ever confirmed: true
OS: Mac OS X → All
Version: 25 Branch → 7 Branch
Comment 2•11 years ago
|
||
Hmm. Cases 2 and 4 run in the Yarr interpreter, I would think, due to https://bugs.webkit.org/show_bug.cgi?id=122891 That said, it's odd that the Yarr interpreter would take linear time on those. Peter, just to make sure, are the regexps in comment 0 representative of the ones JISON is using, in that fixing that case will actually help JISON?
Priority: P3 → --
Updated•11 years ago
|
Flags: needinfo?(luke)
Reporter | ||
Comment 3•11 years ago
|
||
Hi Boris, Yes, we parse JavaScript code with jison (http://almafa.org), and that's how we discovered the bug. We didn't try it with CoffeeScript though, but we're pretty sure the bug slows that down too. By the way I thought the regexp engine used a finite automaton to parse the string.
Comment 4•11 years ago
|
||
I'm not familiar with the details of how Yarr works. I know some regexps get compiled into optimized jitcode that does the match and some run in effectively a regexp interpreter. How that last is implemented, I don't know.
Comment 5•11 years ago
|
||
I disavow any knowledge of this; I'm not sure who maintains regexps now...
Flags: needinfo?(luke)
Reporter | ||
Comment 6•11 years ago
|
||
Is Yarr a separate project? Should I file a bug there?
Comment 7•11 years ago
|
||
Yeah. The right place to file Yarr bugs is probably https://bugs.webkit.org/enter_bug.cgi?product=WebKit&component=JavaScriptCore
Reporter | ||
Comment 8•11 years ago
|
||
I have reported the bug here as well: https://bugs.webkit.org/show_bug.cgi?id=124319#c0.
Reporter | ||
Comment 9•11 years ago
|
||
Scrolling in CodeMirror is also extremely slow due to this bug when the window is big and syntax highlighting is turned on.
Comment 10•8 years ago
|
||
We don't use yarr anymore, we have irregexp instead. The jsfiddle in the first comment shows ~constant time for all the cases, so I suspect that as the result of the exec() call isn't used, it's dce'd by Ion and we are effectively measuring an empty loop. If I change the initial code to do more runs and remember what the last result of exec() is (and print it to make sure it's not dce'd), I can see that we still have ~constant time. So closing as worksforme, feel free to open a new bug if you think the contrary. http://jsfiddle.net/eP2TR/11/
Status: NEW → RESOLVED
Closed: 8 years ago
Resolution: --- → WORKSFORME
You need to log in
before you can comment on or make changes to this bug.
Description
•