Closed
Bug 1408452
Opened 7 years ago
Closed 7 years ago
/\x/ and /\u/ produce incorrect regular expressions
Categories
(Core :: JavaScript Engine, defect)
Core
JavaScript Engine
Tracking
()
RESOLVED
FIXED
mozilla58
People
(Reporter: zirakertan, Assigned: zirakertan)
Details
Attachments
(2 files, 3 obsolete files)
645 bytes,
patch
|
arai
:
review+
|
Details | Diff | Splinter Review |
954 bytes,
patch
|
arai
:
review+
|
Details | Diff | Splinter Review |
User Agent: Mozilla/5.0 (X11; Linux x86_64; rv:56.0) Gecko/20100101 Firefox/56.0 Build ID: 20171009141542 Steps to reproduce: /\x/.test('x'); // false /\x/.test('xx'); // true Does not trigger when there's any character at all after the escape: /\x!/.test('x!'); // true (if you don't care about the debugging, Ctrl-F "solution") After a long debugging session, it came down to RegExpParser.cpp: https://dxr.mozilla.org/mozilla-central/source/js/src/irregexp/RegExpParser.cpp#1755-1765. The problem stems from the "backtracing" done in ParseHexEscape, and it's a subtle problem at that (at least for my tastes). In the beginning of the `case 'x'` label in ParseDisjunction, our state looks something like: current() = 0x5c ('\') position() = 0x...3368 ("\\x") next_pos_ = 0x...3369 ("x") has_more = true After the Advance(2) call in line 1756, we've supposedly stepped over both characters, giving us: current() = random memory position() = 0x...3369 ('x') next_pos_ = end_ = 0x...3360 has_more = false The important bit is that even though we've "stepped over" both characters, `position()` still returns `"x"` and not `end_`. Now inside of `ParseHexEscape`, line 371 saves our current position and when `Reset` is called on line 378, `has_more` is flipped back to `true`. We leave the function with the following state: current() = 0x78 ('x') position() = 0x...3369 ("x") next_pos_ = end_ = 0x...3360 has_more_ = true Outside of the function in `ParseDisjunction`:1763 adds 'x' to the regexp, giving us /x/. That's good. However, the `while` loop in `ParseDisjunction` continues since `has_more` is `true`. Now we're pointing at 'x' again which is consumed as a literal character, giving us the effective regexp /xx/. Yikes. This also happens for \u: /\u/.test('uu') === true There're several solutions to this sort of problem. It's a classic combination of "state management is hard" and "parsing is hard", giving us a bit of a headache. One simple and obvious solution is to add `has_more_` checks to `ParseX`, for instance: 1 file changed, 4 insertions(+) js/src/irregexp/RegExpParser.cpp | 4 ++++ modified js/src/irregexp/RegExpParser.cpp @@ -368,6 +368,10 @@ template <typename CharT> bool RegExpParser<CharT>::ParseHexEscape(int length, widechar* value) { + if (!has_more_) { + return false; + } + const CharT* start = position(); uint32_t val = 0; bool done = false; That works but it doesn't solve the underlying problem: Advance does not correctly set next_pos_ when has_more_ is set to false. Those are two pieces of state which should be linked but they are divorced in this specific case. A more thorough solution could be: modified js/src/irregexp/RegExpParser.cpp @@ -328,6 +328,7 @@ RegExpParser<CharT>::Advance() next_pos_++; } else { current_ = kEndMarker; + next_pos_ = end_ + 1; has_more_ = false; } } That also seems to be the approach Chromium took: https://cs.chromium.org/chromium/src/v8/src/regexp/regexp-parser.cc?l=96&rcl=094a7c93dcdcd921de3883ba4674b7e1a0feffbe
Test case on test262: https://github.com/tc39/test262/pull/1274
Updated•7 years ago
|
Assignee: nobody → zirakertan
Status: UNCONFIRMED → ASSIGNED
Ever confirmed: true
Comment 3•7 years ago
|
||
Comment on attachment 8918374 [details] [diff] [review] More appropriate solution thanks :) this looks good. can you export this as a patch file and attach it here, so that it contains commit message and author information?
Attachment #8918374 -
Flags: feedback+
Attached. Hopefully it's the correct format, as I am but a git plebian I followed the instructions in https://developer.mozilla.org/en-US/docs/Mercurial/Using_Mercurial#I%27m_all_used_to_Git_but_how_can_I_provide_Mercurial-ready_patches
Attachment #8918367 -
Attachment is obsolete: true
Attachment #8918374 -
Attachment is obsolete: true
Updated•7 years ago
|
status-firefox56:
--- → affected
status-firefox57:
--- → affected
status-firefox58:
--- → affected
status-firefox-esr52:
--- → affected
Comment 5•7 years ago
|
||
Comment on attachment 8918484 [details] [diff] [review] 0001-Fix-RegExpParser-position-tracking.patch Review of attachment 8918484 [details] [diff] [review]: ----------------------------------------------------------------- Thanks again :D I've triggered try run (automated testing) https://treeherder.mozilla.org/#/jobs?repo=try&revision=a41c28ea1b79b2d4a0fa9e0134fa30a22966807e while waiting for the try, can you fix the 1st line of the commit message to "Bug 1408452 - Fix RegExpParser position tracking" ? also, once your test262 patch gets merged there, can you create another patch to add the test under js/src/tests/test262/local ? with appending `reportCompare(0, 0);` at the end of file, like the following file: https://dxr.mozilla.org/mozilla-central/rev/196dadb2fe500e75c6fbddcac78106648676cf10/js/src/tests/test262/built-ins/RegExp/unicode_restricted_brackets.js#33 we're updating js/src/tests/test262 directory periodically, and js/src/tests/test262/local is the directory for adding tests temporary.
Attachment #8918484 -
Flags: review+
Comment 6•7 years ago
|
||
the test gets merged. zirak, can you prepare a patch for testcase, and also update the commit message of the 1st patch?
Flags: needinfo?(zirakertan)
Attachment #8918484 -
Attachment is obsolete: true
Flags: needinfo?(zirakertan)
Comment 8•7 years ago
|
||
Comment on attachment 8919928 [details] [diff] [review] 0001-Fix-RegExpParser-position-tracking.patch Review of attachment 8919928 [details] [diff] [review]: ----------------------------------------------------------------- Thank you :D
Attachment #8919928 -
Flags: review+
Comment 10•7 years ago
|
||
Comment on attachment 8919934 [details] [diff] [review] 0002-Bug-1408452-Add-test262-case.patch Review of attachment 8919934 [details] [diff] [review]: ----------------------------------------------------------------- Yes, thanks again!
Attachment #8919934 -
Flags: review+
Assignee | ||
Comment 12•7 years ago
|
||
Huzzah! What happens now?
Comment 13•7 years ago
|
||
these patches will be checked in shortly :)
Comment 14•7 years ago
|
||
Pushed by ryanvm@gmail.com: https://hg.mozilla.org/integration/mozilla-inbound/rev/e3faec071445 Fix RegExpParser position tracking. r=arai https://hg.mozilla.org/integration/mozilla-inbound/rev/76fa19a2eef0 Add test262 test case. r=arai
Keywords: checkin-needed
Comment 15•7 years ago
|
||
bugherder |
https://hg.mozilla.org/mozilla-central/rev/e3faec071445 https://hg.mozilla.org/mozilla-central/rev/76fa19a2eef0
Status: ASSIGNED → RESOLVED
Closed: 7 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla58
Comment 16•7 years ago
|
||
Not a new issue, doesn't sound worth uplifting.
You need to log in
before you can comment on or make changes to this bug.
Description
•