Closed Bug 362582 Opened 18 years ago Closed 5 years ago

Incorrect toString for regular expression with null character or line break

Categories

(Core :: JavaScript Engine, defect)

defect
Not set
minor

Tracking

()

RESOLVED WORKSFORME
mozilla1.9

People

(Reporter: jruderman, Unassigned)

References

Details

(Keywords: testcase)

Attachments

(1 file)

js> eval("/a\x00b/g")                         
/a

js> eval("function() { return /a\x00b/g }")    
function () {
    return /a;
}
Assignee: general → crowder
Status: NEW → ASSIGNED
Crowder says this is essentially same bug:

js> new RegExp("\n") 
/
/
Same bug, in the sense that we should be treating non-printable characters specially in toString/toSource, yeah.
Summary: Incorrect toString for regular expression with null character → Incorrect toString for regular expression with null character or line break
Attached patch first stabSplinter Review
I'm more throwing this up as a first stab at an idea.  I think the idea has merit, but probably needs some help.

The "\n" case is new correct with my patch:
js> new RegExp("a\n")
/a\n/

The "\x00" case is better, but not right (maybe worse!?  Round-trip would compile, but will not have equivalent semantics):
js> new RegExp("a\x00b")
/a\0b/
Attachment #300153 - Flags: review?(brendan)
Comment on attachment 300153 [details] [diff] [review]
first stab

Throwing review from busy Brendan to mrbkap
Attachment #300153 - Flags: review?(brendan) → review?(mrbkap)
Comment on attachment 300153 [details] [diff] [review]
first stab

Seems reasonable.
Attachment #300153 - Flags: review?(mrbkap) → review+
Comment on attachment 300153 [details] [diff] [review]
first stab

Pretty easy, relatively harmless fix for what I think is a rarely-used type of reflection.
Attachment #300153 - Flags: approval1.9?
Comment on attachment 300153 [details] [diff] [review]
first stab

OK.  I'm approving this, but... really.. we should be working on blockers!!

I'm serious.
Attachment #300153 - Flags: approval1.9? → approval1.9+
I landed this for rc1.

Checking in js/src/jsregexp.c;
/cvsroot/mozilla/js/src/jsregexp.c,v  <--  jsregexp.c
new revision: 3.197; previous revision: 3.196
done
Status: ASSIGNED → RESOLVED
Closed: 16 years ago
OS: Mac OS X → All
Hardware: Macintosh → All
Resolution: --- → FIXED
Target Milestone: --- → mozilla1.9
I believe this regressed the following tests

ecma_2/RegExp/properties-001.js
/[\\D]{1,5}[\\ -][\\d]/gi.toString() expected: /[\D]{1,5}[\ -][\d]/gi actual: /[\\D]{1,5}[\\ -][\\d]/gi reason: wrong value

ecma_2/RegExp/properties-001.js
/[\\S]?$/i.toString() expected: /[\S]?$/i actual: /[\\S]?$/i reason: wrong value

ecma_2/RegExp/properties-001.js
/[\\d]{5}/g.toString() expected: /[\d]{5}/g actual: /[\\d]{5}/g reason: wrong value

ecma_2/RegExp/properties-001.js
/\\097/gi.toString() expected: /\097/gi actual: /\\097/gi reason: wrong value

ecma_2/RegExp/properties-001.js
/\\u0051/im.toString() expected: /\u0051/im actual: /\\u0051/im reason: wrong value

ecma_2/RegExp/properties-001.js
/\\x45/gm.toString() expected: /\x45/gm actual: /\\x45/gm reason: wrong value

ecma_2/RegExp/properties-001.js
/^([a-z]*)[^\\w\\s\\f\NL\CR]+/m.toString() expected: /^([a-z]*)[^\w\s\fNLCR]+/m actual: /^([a-z]*)[^\\w\\s\\f\NL\CR]+/m reason: wrong value

ecma_2/RegExp/properties-002.js
/\\*{0,80}/m.toString() expected: /\*{0,80}/m actual: /\\*{0,80}/m reason: wrong value

ecma_2/RegExp/properties-002.js
/\\B/.toString() expected: /\B/ actual: /\\B/ reason: wrong value

ecma_2/RegExp/properties-002.js
/\\cA?/g.toString() expected: /\cA?/g actual: /\\cA?/g reason: wrong value

ecma_2/RegExp/properties-002.js
/\\w*/i.toString() expected: /\w*/i actual: /\\w*/i reason: wrong value

js1_2/regexp/RegExp_object.js
PHONE_pattern + ' is the string' expected: /\(?(\d{3})\)?-?(\d{3})-(\d{4})/ is the string actual: /\\(?(\\d{3})\\)?-?(\\d{3})-(\\d{4})/ is the string reason: wrong value

js1_2/regexp/RegExp_object.js
String(PHONE_pattern.toString()) expected: /\(?(\d{3})\)?-?(\d{3})-(\d{4})/ actual: /\\(?(\\d{3})\\)?-?(\\d{3})-(\\d{4})/ reason: wrong value

js1_5/Regress/regress-248444.js
toString/toSource of RegExp should escape slashes Section 1 of test -  expected: /[^\/]+$/ actual: /[^\\/]+$/ reason: Expected value '/[^\/]+$/', Actual value '/[^\\/]+$/'

js1_5/Regress/regress-248444.js
toString/toSource of RegExp should escape slashes Section 2 of test -  expected: /[^\/]+$/ actual: /[^\\/]+$/ reason: Expected value '/[^\/]+$/', Actual value '/[^\\/]+$/'

js1_5/Regress/regress-248444.js
toString/toSource of RegExp should escape slashes Section 3 of test -  expected: /[^\/]+$/ actual: /[^\\/]+$/ reason: Expected value '/[^\/]+$/', Actual value '/[^\\/]+$/'

js1_5/Regress/regress-248444.js
toString/toSource of RegExp should escape slashes Section 4 of test -  expected: /[^\/]+$/ actual: /[^\\/]+$/ reason: Expected value '/[^\/]+$/', Actual value '/[^\\/]+$/'

js1_5/Regress/regress-248444.js
toString/toSource of RegExp should escape slashes Section 5 of test -  expected: /[^\/]+$/ actual: /[^\\/]+$/ reason: Expected value '/[^\/]+$/', Actual value '/[^\\/]+$/'

js1_5/extensions/regress-96284-001.js
 expected: Expected exit 0 actual: Actual exit 3, signal 0 reason: ./js1_5/extensions/regress-96284-001.js:149: SyntaxError: unterminated parenthetical: ./js1_5/extensions/regress-96284-001.js:149: /ad;(lf)kj(2309\\/\\/)\\/\\// ./js1_5/extensions/regress-96284-001.js:149: ^

js1_5/extensions/regress-96284-002.js
 expected: Expected exit 0 actual: Actual exit 3, signal 0 reason: ./js1_5/extensions/regress-96284-002.js:149: SyntaxError: unterminated parenthetical: ./js1_5/extensions/regress-96284-002.js:149: /ad;(lf)kj(2309\\/\\/)\\/\\// ./js1_5/extensions/regress-96284-002.js:149: ^
Let's back it out, then.  I never put checkin-needed on it because I was mildly suspicious of the patch (though I don't think I encountered test regressions, hmm).  What is the policy on this?  Do patches with r+/a+ get landed regardless?
reverted:
jsregexp.c:3.198
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Comment on attachment 300153 [details] [diff] [review]
first stab

Clearing flags to prevent confusion.  Sorry for the mayhem here.
Attachment #300153 - Flags: review+
Attachment #300153 - Flags: approval1.9+
Mass unowning JS bugs...  I'm not likely to be able to move these forward.
Assignee: crowder → general
The \n is still a problem, but we match Chrome. The first issue with \x00 is some problem with the shell, works okay in the browser. (null byte is dropped)
Assignee: general → nobody
No longer seems to happen on m-c rev d02d14a3dd6e, tested with --enable-debug, some of which return different results from the old comments above:

js> eval("/a\x00b/g")
/a
js> eval("function() { return /a\x00b/g }")
typein:1:8 SyntaxError: function statement requires a name:
typein:1:8 function() { return /ab/g }
typein:1:8 ........^
Stack:
  @typein:7:1
js> new RegExp("\n")
/\n/
js> new RegExp("a\n")
/a\n/
js> new RegExp("a\x00b")
/a
js>

Ted, do you think this sounds correct? Shall we resolve this WFM?
Flags: needinfo?(tcampbell)
|eval("function() { return /a\x00b/g }")| should now be |eval("(function() { return /a\x00b/g })")|. (This test case is old!)

Another testcase that gives problematic results in shell: |eval("(function() { return '\x00'; })")|.

Remaining bugs:
1) The shell still stops outputting regex at null when dumping. This is due to shell calling ToSource and then printing the string directly without sanitizing. This is a awkward because we want newlines to render.
2) In browser, the devtools pretty-printer shows unicode null box thing which doesn't seem ideal.

Overall, these are minor ergonomic fixes in the shell at best, and |toString()| computes the expected string in script so this feels silly to keep tracking.
Status: REOPENED → RESOLVED
Closed: 16 years ago5 years ago
Flags: needinfo?(tcampbell)
Resolution: --- → WORKSFORME
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: