Closed Bug 392378 Opened 12 years ago Closed 5 years ago

Javascript Regular Expression Non-participating Capture Groups are inaccurate in edge cases.


(Core :: JavaScript Engine, defect)

Not set





(Reporter: kris.kowal, Unassigned)




(Whiteboard: DUPEME)


(2 files, 1 obsolete file)

4.58 KB, patch
: review+
Details | Diff | Splinter Review
10.36 KB, application/x-javascript
User-Agent:       Mozilla/5.0 (Macintosh; U; Intel Mac OS X; en) AppleWebKit/522.11 (KHTML, like Gecko) Version/3.0.2 Safari/522.12
Build Identifier: Firefox

Javascript Regular Expression Non-participating Capture Groups are inaccurate in edge cases.  Steven Levithan wrote up a cross-browser test matrix comparing browser behaviors for edge case regular expressions to his interpretation of the ECMA specified behavior.  For details see his blog post:

Steve Levithan notes, in particular, that these cases fail for Firefox:

Input: "y".split(/(x)?\1y/)
Oracle:  ["", undefined, ""]
Actual: ["", "", ""]

Input: "y".split(/(x)?y/)
Oracle:  ["", undefined, ""]
Actual: ["", "", ""]

Input: "y".replace(/(x)?\1y/, function($0, $1){ return String($1); })
Oracle: "undefined"
Actual: ""

Input: "y".replace(/(x)?y/, function($0, $1){ return String($1); })
Oracle: "undefined"
Actual: ""

Input: "y".replace(/(x)?y/, function($0, $1){ return $1; })
Oracle: "undefined"
Actual: ""

Reproducible: Always
Totally sounds like a known bug crowder and mrbkap are tracking, for sure ;-).

Ever confirmed: true
Whiteboard: DUPEME
Assignee: general → crowder
Duplicate of this bug: 389717
Attached patch patchSplinter Review
Actually no RegExp bug, but bugs in String.prototype.replace and String.prototype.split.

Patch passes the test suite.
Assignee: crowder → x00000000
Attachment #313880 - Flags: review?(brendan)
MochiTests on this would be nice add'l info.  I'm going to steal the review from Brendan on this, though, in the hopes that he won't mind.  :)
Comment on attachment 313880 [details] [diff] [review]

Looks good to me, thanks.  Will land conditional upon a1.9+ and mochitest results.  If you haven't got time to run the mochitests, please let me know.
Attachment #313880 - Flags: review?(brendan)
Attachment #313880 - Flags: review+
Attachment #313880 - Flags: approval1.9?
I don't know how to run the mochitests yet.
Assuming you have a Mozilla tree and Firefox built, just run:

python $objdir/_tests/testing/mochitest/ --autorun

...where $objdir is the value for MOZ_OBJDIR, if you set it, and otherwise is /probably/ just your source directory (but I'm not entirely certain, to be honest).
Thanks, Jeff, that's easy and also doesn't take too long (20 minutes). Mochitests passed.

There are (at least) 2 more instances in the code where an empty string for non-matched parens is returned: The $n replacements in String.prototype.replace and the global RegExp.$n properties.

In String.prototype.replace that behavior is accordingly to the spec, and the global RegExp properties are non-standard. Konqueror and Opera have also an empty string there (don't know about IE; Konqueror doesn't have the properties until the first use of a RegExp).

BTW, RegExp.$n goes only up to RegExp.$9. Opera has also only the first 9 paren matches as global properties, but Konqueror even has RegExp["$1.8e19"] (limit is slightly less than 2^64).
Wanted for the next time -- the day before RC1 freeze isn't the time to be taking these non-blocker non-regression wins, delicious though they look.
Flags: wanted-next+
Comment on attachment 313880 [details] [diff] [review]

Per discussion with shaver, deferring until post-1.9 (like, 1.9.1)
Attachment #313880 - Flags: approval1.9?
Comment on attachment 313880 [details] [diff] [review]

a-: non-regression non-blocker to repair edge case.  dot next will enjoy this much more.
I should also add that it would be great to get that edge-case test set into a suite somewhere.
bclary:  Would be great to get this known-failure into js/tests, if you can lend a hand.
Flags: in-testsuite?
crowder, Sure. I have a major update to go in tonight. I'll do this one after that lands.
This test fails also for bug 427733 and bug 427823, which I found while writing it. It doesn't test the split case.

tests/ecma_2/String/replace-001.js seems to be obsolete; String.prototype.replace wasn't in ECMAScript 2 final.

Code like this isn't exactly an edge case:

              function (x, p) { return p != null ? +p + 1 : "???" });

But you can easily work around it by writing

              function (x, p) { return p ? +p + 1 : "???" });

instead (that's what I do anyways), and most cases where you would need to compare explicitly against null like

  str.replace(/\s+($)?/g, function (x, p) { return p == null ? " " : "" });

don't work per spec (they do in Perl).
Attached file 3 more tests
Attachment #314767 - Attachment is obsolete: true
x00000000:  Thanks for the tests here!
/cvsroot/mozilla/js/tests/ecma_3/String/regress-392378.js,v  <--  regress-392378.js
initial revision: 1.1

Checking in ecma_3/String/;
/cvsroot/mozilla/js/tests/ecma_3/String/,v  <--
initial revision: 1.1

/cvsroot/mozilla/js/tests/public-failures.txt,v  <--  public-failures.txt
new revision: 1.65; previous revision: 1.64

Flags: in-testsuite?
Flags: in-testsuite+
Flags: in-litmus-
Duplicate of this bug: 435990
back into the pool.
Assignee: x00000000 → general
Marking as duplicate of bug 369778. That one doesn't have a patch, but it is very slightly newer and has current-era discussion going on.
Closed: 5 years ago
Resolution: --- → DUPLICATE
Duplicate of bug: 369778
You need to log in before you can comment on or make changes to this bug.