Closed
Bug 505083
Opened 15 years ago
Closed 13 years ago
ecma_3_1/RegExp/regress-305064.js - Is ZERO WIDTH SPACE
Categories
(Core :: JavaScript Engine, defect)
RESOLVED
INVALID
People
(Reporter: bc, Unassigned)
References
Details
(Keywords: regression, testcase)
ecma_3_1/RegExp/regress-305064.js with jit only

Is ZERO WIDTH SPACE (category Cf) a space
reason: Expected value 'true', Actual value 'false'

regression changeset: 30362:b837948c1daf
user: Luke Wagner <lw@mozilla.com>
date: Thu Jul 16 17:17:35 2009 -0700
summary: Bug 406271: add quantifier support for regexp->native compiler, r=dmandelin
Flags: in-testsuite+
Comment 1•15 years ago
Perhaps I am missing something in the spec, but 15.10.2.12 says that \s matches WhiteSpace (7.2) and LineTerminator (7.3). This list includes the Unicode category Zs (space separator). \u200B (zero width space) is not among any of these. Looking at http://www.fileformat.info/info/unicode/char/200b/index.htm, the Java Character.isSpaceChar() and Character.isWhitespace() properties are both true, which is probably why the interpreter returns true. So either (1) I'm misunderstanding the spec, (2) the spec has an omission, or (3) the test and interpreter are wrong. What do you think?
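A minimal sketch of the check under discussion, for anyone who wants to reproduce it in a JS shell. The helper name `matchesWhitespaceClass` and the sample list are illustrative assumptions and are not taken from regress-305064.js. On an engine that follows the reading of 15.10.2.12 above, every sample should report true except U+200B.

```javascript
// Hypothetical helper (not part of the test file): reports whether the
// regexp class \s matches a single-character string.
function matchesWhitespaceClass(ch) {
  return /^\s$/.test(ch);
}

// Sample characters chosen for illustration only.
const samples = {
  'U+0020 SPACE (Zs)': '\u0020',
  'U+00A0 NO-BREAK SPACE (Zs)': '\u00A0',
  'U+3000 IDEOGRAPHIC SPACE (Zs)': '\u3000',
  'U+2028 LINE SEPARATOR (LineTerminator)': '\u2028',
  'U+200B ZERO WIDTH SPACE (Cf)': '\u200B',
};

// Print one line per sample: the label followed by true or false.
for (const [name, ch] of Object.entries(samples)) {
  console.log(name, matchesWhitespaceClass(ch));
}
```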
Comment 2•15 years ago
Unicode 5.1 section 6.2 supports Luke's understanding of the spec:

--
One exceptional “space” character is U+200B zero width space. This character, although called a “space” in its name, does not actually have any width or visible glyph in display. It functions primarily to indicate word boundaries in writing systems that do not actually use orthographic spaces to separate words in text. It is given the General Category [gc=Cf] and is treated as a format control character, rather than as a space character, in implementations. Further discussion of U+200B zero width space, as well as other zero-width characters with special properties, can be found in Section 16.2, Layout Controls.
--

But Python 3 does what the regression test expects, treating U+200B as a whitespace. Any Web compatibility or Unicode experts here?
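The General Category quoted above can be checked directly from script. This is only a sketch: it assumes an engine with ES2018 Unicode property escapes (which did not exist when this bug was filed), and `spaceOrFormatCategory` is a hypothetical name introduced for illustration.

```javascript
// Hypothetical helper: classifies a single character as Zs (space separator),
// Cf (format control), or neither, using Unicode property escapes.
// Assumes ES2018 support for \p{...} with the u flag.
function spaceOrFormatCategory(ch) {
  if (/^\p{Zs}$/u.test(ch)) return 'Zs';
  if (/^\p{Cf}$/u.test(ch)) return 'Cf';
  return 'other';
}

console.log(spaceOrFormatCategory('\u0020')); // ordinary space
console.log(spaceOrFormatCategory('\u200B')); // zero width space
```

If the engine's Unicode data matches the text quoted above, U+200B is reported as Cf rather than Zs, which is why it falls outside the WhiteSpace production.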
Comment 3•13 years ago
Not a TM bug. Still occurs with a current JS shell. FWIW, d8 shows the same behavior.
Summary: TM: ecma_3_1/RegExp/regress-305064.js - Is ZERO WIDTH SPACE → ecma_3_1/RegExp/regress-305064.js - Is ZERO WIDTH SPACE
Comment 4•13 years ago
Everyone who's commented here seems to agree the test is buggy, and I will independently continue that trend. Moreover, at least one other engine agrees with SpiderMonkey and with our interpretation of the spec. I think that's enough to call this bug (and test) invalid.
Status: NEW → RESOLVED
Closed: 13 years ago
Resolution: --- → INVALID