Closed
Bug 78156
Opened 24 years ago
Closed 23 years ago
m flag of regular expression does not work with $
Categories
(Core :: JavaScript Engine, defect)
Tracking
()
VERIFIED
FIXED
People
(Reporter: georg.maass, Assigned: rogerl)
Details
(Keywords: testcase, Whiteboard: [Don't try testcase in NN4.7; m flags not recognized there])
Attachments
(2 files, 1 obsolete file)
2.56 KB,
text/html
|
Details | |
1.67 KB,
patch
|
brendan
:
review+
brendan
:
superreview+
|
Details | Diff | Splinter Review |
Mozilla/5.0 (Windows; U; WinNT4.0; en-US; 0.8.1) Gecko/20010319
If the regular expression contains a $ to find an item at the line end, then the
m flag is ignored. With ^ instead of $ to find the item at line beginnings it
works well
// Problem
// -------
string='hghh\njasdhgflö<s bp98>szf 3434
8dxfnv8yvksjgk\r\n8ewrabduzase87bxdnbsj4711\r\nksdjkjertk789h';
a=new RegExp('\\d$','m');
document.write('<h2>Regulärer Ausdruck</h2><pre>'+a.toString()+'<\/pre>');
document.write('<h2>Ergebnis
regExp.test(Teststring)<\/h2><pre>'+a.test(string)+'<\/pre>');
document.write('<h2>Ergebnis
Teststring.search(regExp)<\/h2><pre>'+string.search(a)+'<\/pre>');
// No problem
// -----------
a=new RegExp('^\\d','m');
document.write('<h2>Regulärer Ausdruck</h2><pre>'+a.toString()+'<\/pre>');
document.write('<h2>Ergebnis
regExp.test(Teststring)<\/h2><pre>'+a.test(string)+'<\/pre>');
document.write('<h2>Ergebnis
Teststring.search(regExp)<\/h2><pre>'+string.search(a)+'<\/pre>');
Comment 1•24 years ago
|
||
Updated•24 years ago
|
Status: UNCONFIRMED → NEW
Ever confirmed: true
OS: Windows NT → All
Updated•24 years ago
|
Whiteboard: [Note: don't try testcase in NN4.7; it doesn't have m flag]
Comment 2•24 years ago
|
||
Testcase added to JS test suite:
/js/tests/ecma_3/RegExp/regress-78156.js
Comment 3•24 years ago
|
||
NOTE: the testcase is failing in Rhino as well as in Spidermonkey -
Whiteboard: [Note: don't try testcase in NN4.7; it doesn't have m flag] → [Don't try testcase in NN4.7; m flags not recognized there]
Comment 4•24 years ago
|
||
To run this testcase directly in the JS or Rhino shell, do this:
[/d/JS_TRUNK/mozilla/js/src/WINNT4.0_DBG.OBJ] ./js
js> load('../../tests/ecma_3/shell.js')
js> load('../../tests/ecma_3/RegExp/shell.js')
js> load('../../tests/ecma_3/RegExp/regress-78156.js')
Comment 5•24 years ago
|
||
In Perl:
$string = "aaa\n789\r\nccc";
$pattern = /^\d/m; --> Perl match == '7'
$pattern = /\d$/m; --> Perl match == null
IF WE TAKE THE '\r' OUT OF THE STRING:
$string = "aaa\n789\nccc";
$pattern = /^\d/m; --> Perl match == '7'
$pattern = /\d$/m; --> Perl match == '9'
IF WE HAVE ONLY '\r' IN THE STRING AND NOT '\n'
$string = "aaa\r789\rccc";
$pattern = /^\d/m; --> Perl match == null
$pattern = /\d$/m; --> Perl match == null
Comment 6•24 years ago
|
||
Same thing in the JS shell: if you remove all '\r' from the teststring,
and use ONLY '\n' to mark a newline, the testcase passes...
Thus, the issue is "How should JS treat '\r' or '\r\n' in multiline RegExps?"
Comment 7•24 years ago
|
||
From ECMA Edition 3:
7.3 Line Terminators
Like white space characters, line terminator characters are used to
improve source text readability and to separate tokens (indivisible
lexical units) from each other. However, unlike white space characters,
line terminators have some influence over the behaviour of the syntactic
grammar. In general, line terminators may occur between any two tokens,
but there are a few places where they are forbidden by the syntactic
grammar. A line terminator cannot occur within any token, not even a
string. Line terminators also affect the process of automatic semicolon
insertion (section 7.8.5).
The following characters are considered to be line terminators:
Code Point Value Name Formal Name
\u000A Line Feed <LF>
\u000D Carriage Return <CR>
\u2028 Line separator <LS>
\u2029 Paragraph separator <PS>
Syntax
LineTerminator ::
<LF>
<CR>
<LS>
<PS>
15.10 RegExp (Regular Expression Objects)
15.10.2.6 Assertion
The production Assertion :: ^ evaluates by returning an internal AssertionTester
closure that takes a State argument x and performs the following:
1. Let e be x's endIndex.
2. If e is zero, return true.
3. If Multiline is false , return false .
4. If the character Input[e1] is one of the line terminator characters
<LF>, <CR>, <LS>, or <PS>, return true.
5. Return false .
The production Assertion :: $ evaluates by returning an internal AssertionTester
closure that takes a State argument x and performs the following:
1. Let e be x's endIndex.
2. If e is equal to InputLength, return true.
3. If multiline is false , return false .
4. If the character Input[e] is one of the line terminator characters
<LF>, <CR>, <LS>, or <PS>, return true.
5. Return false.
Comment 9•23 years ago
|
||
Rhino fix verified on WinNT. Testcase passes in both rhino, rhinoi shells -
Comment 10•23 years ago
|
||
Hmm, so this still needs to be fixed in the browser?
("abcdef\rghijf").split(/$/m) returns an array with just one element for me.
This breaks Venkman on Mac because of
http://lxr.mozilla.org/mozilla/source/extensions/venkman/resources/content/venkman-outliners.js#436.
Rob, on Mac your sourceText array contains only one line for the scripts because
the regexp /$/m doesn't work for Mac line endings :/.
http://lxr.mozilla.org/mozilla/source/js/src/jsregexp.c#1940 seems suspect to
me. Same for line 1953 in that file.
Comment 11•23 years ago
|
||
Comment 12•23 years ago
|
||
Well, here is a proposed patch. It matches $ as \n, \r, or \r\n, regardless of
platform. The remaining problem is that these match characters are not removed
fromt the result array.
Assignee | ||
Comment 13•23 years ago
|
||
Assignee | ||
Comment 14•23 years ago
|
||
Comment on attachment 48472 [details] [diff] [review]
patch round one; jsregexp.c changes
Have to obsolete the rginda patch as there's no ECMA support for multiline line separators.
Attachment #48472 -
Attachment is obsolete: true
Comment 15•23 years ago
|
||
cc'ing reviewers -
Comment 16•23 years ago
|
||
r=khanson
Comment 17•23 years ago
|
||
Comment on attachment 53661 [details] [diff] [review]
Added support for all Unicode line terminators
sr=brendan@mozilla.org and noting khanson's r= here via the patch manager.
rogerl, you should mail drivers for a= to get this into 0.9.6.
/be
Attachment #53661 -
Flags: superreview+
Attachment #53661 -
Flags: review+
Assignee | ||
Comment 18•23 years ago
|
||
Fix checked in.
Status: ASSIGNED → RESOLVED
Closed: 23 years ago
Resolution: --- → FIXED
Comment 19•23 years ago
|
||
Verified FIXED.
The standalone JS testcase now passes in both the debug and optimized
JS shells on WinNT, Linux, and Mac9.1. The HTML testcase above also
passes in Mozilla trunk builds 20011206xx on all three platforms.
If you run the HTML testcase, note the discrepancy in behavior between
String.prototype.search() and String.prototype.match(). Per the ECMA
spec, the former ignores global properties of the regexp it is given.
(ECMA-262 Edition 3 Final, Section 15.5.4.12)
That is why you see multiple matches from Teststring.match(),
but only one from Teststring.search() The multiline flag we have set
in the test regexp is honored by the one method, but ignored by the other.
This is according to the ECMA spec.
Status: RESOLVED → VERIFIED
Updated•20 years ago
|
Flags: testcase+
You need to log in
before you can comment on or make changes to this bug.
Description
•