Closed
Bug 98306
Opened 23 years ago
Closed 23 years ago
JS regexp code crashes in ParseAtom for script using Regexp() [@ParseAtom]
Categories
(Core :: JavaScript Engine, defect)
Tracking
()
VERIFIED
FIXED
People
(Reporter: jrgmorrison, Assigned: rogerl)
References
()
Details
(Keywords: crash, js1.5)
Crash Data
Attachments
(3 files)
2.27 KB,
patch
|
Details | Diff | Splinter Review | |
894 bytes,
patch
|
rogerl
:
review+
rogerl
:
superreview+
|
Details | Diff | Splinter Review |
3.39 KB,
text/plain
|
Details |
Overview Description:
JS parser crashes in ParseAtom for script using Regexp()
Steps to Reproduce:
1) go to
http://msdn.microsoft.com/library/default.asp?url=/workshop/author/dhtml/refere
nce/dhtmlrefs.asp
2) click on the item labelled 'DHTML Properties' in the left hand frame
3) go back a page, then forward a page.
Test case with just a simple html document and the offending JS file
is as http://jrgm/bugs/ParseAtomCrash/foo.html (and foo.js).
Actual Results: You should crash. If not reload a couple of times.
Expected Results: Bliss.
Reproducibility: 100% after some number of tries.
Build Date & Platform Bug Found: build pulled 8/2 on win2k
and confirmed in 8/29 win2k comm. verification build.
Additional Information:
Stack trace:
ParseAtom(CompilerState * 0x00bfeb29) line 919 + 8 bytes
ParseQuantAtom(CompilerState * 0x00bfeb17) line 657 + 13 bytes
ParseItem(CompilerState * 0x032d5ce0) line 634 + 6 bytes
ParseRegExp(CompilerState * 0x030c88c0) line 497
js_NewRegExpObject(JSContext * 0x030c88c0, JSTokenStream * 0x031dab08, unsigned
short * 0x0316a760, unsigned int 8, unsigned int 0) line 2846 + 15 bytes
js_GetToken(JSContext * 0x00c041cc, JSTokenStream * 0x030c88c0) line 1152 + 27
bytes
js_MatchToken(JSContext * 0x030c88c0, JSTokenStream * 0x031dab08, int 8) line
1278 + 13 bytes
ArgumentList(JSContext * 0x00000000, JSTokenStream * 0x00000000, JSTreeContext
* 0x00001312, JSParseNode * 0x00b7d010) line 2534
0306e010()
030c88c0()
00b7d200()
00cc7c88()
00ce0b48()
... and these weird symbols continue on for 100's of frames
Reporter | ||
Comment 1•23 years ago
|
||
Reporter | ||
Comment 2•23 years ago
|
||
Seems funny that this would crash in that routine (which hasn't changed much
for some time). But, also seems like this is a pretty high profile place to
crash for web developers (reading MS online docs) so we should probably fix
this sooner as opposed to later (yes, it's ironic; fix the mozilla browser
so people can read docs on IE, but ...).
Keywords: nsbranch
Comment 3•23 years ago
|
||
Testcase added to JS test suite:
mozilla/js/tests/ecma_3/RegExp/regress-98306.js
If you want to run this directly in the JS shell, independently of
the test driver, do e.g.:
[D:/JS_TRUNK/mozilla/js/src/WINNT4.0_DBG.OBJ] ./js
js> load('D:/JS_TRUNK/mozilla/js/tests/ecma_3/shell.js')
js> load('D:/JS_TRUNK/mozilla/js/tests/ecma_3/RegExp/shell.js')
js> load('D:/JS_TRUNK/mozilla/js/tests/ecma_3/RegExp/regress-98306.js')
As jrgm notes, you don't crash 100% of the time, but pretty nearly ...
Comment 4•23 years ago
|
||
The testcase can be reduced to a single line:
"Hello".match(/[/]/);
Remember, if you key this directly in the JS shell, you may need to
execute it a few times before the crash occurs. Happens on Linux, too.
OS : Win2K --> All.
As I understand it this is incorrect syntax, because "/" is a special
character and therefore must be escaped in the middle of the RegExp.
Yet this is what the Microsoft site fails to do:
this.userAgent.match(/Mozilla[/].*(95[/]NT|95|NT|98|3.1).*Opera.*(\d+)\.(\d+)/)
On the other hand, we shouldn't crash on this!
Comment 5•23 years ago
|
||
By the way, when you execute the one-line testcase and DON'T crash,
this is what you see in the JS shell:
js> "Hello".match(/[/]/);
1: missing ) after argument list:
1: "Hello".match(/[/]/);
1: .................^
Comment 6•23 years ago
|
||
Note: in Rhino (both the interpreted and compiled shells), I "never"
crash on either the one-line testcase or the larger testcase, no matter
how many times I execute them. I "always" get the error message instead:
js> "Hello".match(/[/]/);
js: "<stdin>", line 17: uncaught JavaScript exception:
SyntaxError: missing ) after argument list (<stdin>; line 17)
js: "Hello".match(/[/]/);
js: .................^
Comment 7•23 years ago
|
||
I just realized, there is an ambibuity in the regular expression /[/]/.
I had been regarding it as:
"/" is an element of a character set
But "]" is a special character, so the parser might view the whole thing as:
"]" is being escaped
Comment 8•23 years ago
|
||
'ambibuity' : my best one yet
Comment 9•23 years ago
|
||
Note: this alone causes SpiderMonkey to crash!
js> var re = /[/
Whereas Rhino says:
js> var re = /[/
js: uncaught JavaScript exception: SyntaxError: Unterminated character class [
Comment 10•23 years ago
|
||
More tests in SpiderMonkey:
Here's the original regexp; takes a few tries before crashing:
js> var re = /[/]/
8: missing ; before statement:
8: var re = /[/]/
8: ............^
js> var re = /[/]/
9: missing ; before statement:
9: var re = /[/]/
9: ............^
js> var re = /[/]/
CRASH !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
Now add a letter in the character set:
js> var re = /[/a]/
1: invalid flag after regular expression:
1: var re = /[/a]/
1: ............^
(STABLE; DOESN'T SEEM TO CRASH)
Now try two backslashes in the character set. Crashes after awhile:
js> var re = /[//]/
16: syntax error:
16: var re = /[//]/
16: .............^
js> var re = /[//]/
17: syntax error:
17: var re = /[//]/
17: .............^
js> var re = /[//]/
CRASH !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
Comment 11•23 years ago
|
||
Compare: Rhino gives the same three error messages as in the entry above,
but remains stable. I can't ever get it to crash on any of these -
Reporter | ||
Comment 12•23 years ago
|
||
So, spidermonkey is not trapping the bogus syntax (although to some degree
this is a parseable syntax if you lookahead to see whether the second '/'
is or isn't the last '/' in the expression. [But I'm not volunteering to
do that, as that probably is more of a world of hurt to do than I can even
imagine :-]).
And when it botches the parse, that sets the pointer a-walking randomly
through other people's heap allocations (Purify reports 680 ABR - Array Bounds
Reads for the simple case of `var re = /ab[/]cd/i;'). Here is the first ABR
with details:
[E] ABR: Array bounds read in ParseAtom {1 occurrence}
Reading 2 bytes from 0x0989a020 (2 bytes at 0x0989a020 illegal)
Address 0x0989a020 is 1 byte past the end of a 8 byte block at 0x0989a018
Address 0x0989a020 points to a malloc'd block in heap 0x02d90000
Thread ID: 0x50c
Error location
ParseAtom [jsregexp.c:910]
if (!ren)
return NULL;
=> while ((c = *++cp) != ']') {
if (cp == state->cpend) {
js_ReportCompileErrorNumber(state->context,
state->tokenStream,
NULL,
??? [ip=0x0989a01c]
??? [ip=0x099ba028]
RelExpr [jsparse.c:2300]
tc->flags &= ~TCF_IN_FOR_INIT;
#endif /* JS_HAS_IN_OPERATOR */
=> pn = ShiftExpr(cx, ts, tc);
while (pn &&
(js_MatchToken(cx, ts, TOK_RELOP)
#if JS_HAS_IN_OPERATOR
EqExpr [jsparse.c:2275]
BitAndExpr [jsparse.c:2263]
BitXorExpr [jsparse.c:2250]
BitOrExpr [jsparse.c:2237]
AndExpr [jsparse.c:2226]
OrExpr [jsparse.c:2215]
Allocation location
malloc [msvcrt.DLL]
JS_malloc [jsapi.c:1409]
if (nbytes == 0)
nbytes = 1;
cx->runtime->gcMallocBytes += nbytes;
=> p = malloc(nbytes);
if (!p)
JS_ReportOutOfMemory(cx);
return p;
js_NewRegExpObject [jsregexp.c:2843]
JSObject *obj;
JSRegExp *re;
=> str = js_NewStringCopyN(cx, chars, length, 0);
if (!str)
return NULL;
re = js_NewRegExp(cx, ts, str, flags, JS_FALSE);
js_GetToken [jsscan.c:1152]
RelExpr [jsparse.c:2300]
EqExpr [jsparse.c:2275]
BitAndExpr [jsparse.c:2263]
BitXorExpr [jsparse.c:2250]
BitOrExpr [jsparse.c:2237]
AndExpr [jsparse.c:2226]
(Note: I don't know why the two stack frames preceding ParseAtom are
" ??? [ip=0x0989a01c]" and " ??? [ip=0x099ba028]". That's what Purify
coughed up.].
Comment 13•23 years ago
|
||
Forget my idea at 2001-09-05 13:09 above - I had my slashes mixed up.
Of course it's the BACKslash that escapes...
Comment 14•23 years ago
|
||
*** Bug 99377 has been marked as a duplicate of this bug. ***
Comment 15•23 years ago
|
||
See the duplicate bug 99377 for a look at jsregexp.c line: 910
Reporter | ||
Comment 16•23 years ago
|
||
I just want to reiterate that this must be fixed for emojo, imho. Way too
common a location for developers (i.e., msdn).
Assignee | ||
Comment 17•23 years ago
|
||
Assignee | ||
Comment 18•23 years ago
|
||
This patch fixes the crash. But I notice that we don't handle [] & [^] per ECMA,
I guess that should be a separate bug.
Status: NEW → ASSIGNED
Comment 19•23 years ago
|
||
BTW, Perl accepts the regexp /[/]/
Here are a couple examples:
regexp /[/]/
string "Hello"
NO MATCH
regexp /[/]/
string "Hello/there"
MATCH!!! /
Comment 20•23 years ago
|
||
*** Bug 99667 has been marked as a duplicate of this bug. ***
Comment 21•23 years ago
|
||
cc'ing potential reviewers -
Comment 23•23 years ago
|
||
r=jst
Comment 24•23 years ago
|
||
Bug 100199 has been filed for the [], [^] issue -
Updated•23 years ago
|
Summary: JS parser crashes in ParseAtom for script using Regexp() → JS regexp code crashes in ParseAtom for script using Regexp()
Assignee | ||
Comment 25•23 years ago
|
||
Fix checked in.
Status: ASSIGNED → RESOLVED
Closed: 23 years ago
Resolution: --- → FIXED
Comment 26•23 years ago
|
||
On the basis of the JS shell testcase,
mozilla/js/tests/ecma_3/RegExp/regress-98306.js,
ready to mark VERIFIED. Testcase now passes on WinNT, Linux, and Mac.
But I will wait until tomorrow's builds come out and test the msdn URL
before wrapping this one up -
Reporter | ||
Comment 27•23 years ago
|
||
I'd like to see this on the nsbranch.
(Bob the developer goes to MSDN to check some docs.
Netscape crashes. "Nice browser! Netscape sucks.").
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Comment 28•23 years ago
|
||
I just realized something else: what the patch has done is prevent JS
from crashing on the regular expression /[/]/. But it still produces a
syntax error:
js> "Hello/there".match(/[/]/);
3: unterminated character class [:
3: "Hello/there".match(/[/]/);
3: ....................^
Contrast this with the Perl example above:
regexp /[/]/
string "Hello/there"
MATCH!!! /
So - it's good that we don't crash on this, but shouldn't we accept
this regexp as a valid one? Or as John pointed out above, would that
be asking too much of this fix? Compare what happens in the JS shell
when you use the RegExp constructor instead of a literal RegExp:
js> var re = new RegExp("[/]");
js> "Hello/there".match(re);
/
Reporter | ||
Comment 29•23 years ago
|
||
Which perl is working? This fails for me, with "unmatched [] in regexp".
print "match: $1, $2\n"
if "one/two" =~ /(.*)[/](.*)/;
But, at any rate, let's take the crash fix, and leave the rest of this to
some other bug. (It borders on ambiguous, and the alternate constructor
syntax allows this re to be written. What the author was thinking when
they wrote the test in that page was, well, I don't know).
Can I get an nsbranch+ for the crash fix, at a high profile collection of
pages? (I doubt we can 'evangelize' this one away :-]).
Comment 30•23 years ago
|
||
> Which perl is working?
my $pattern = "(.*)[/](.*)";
print "match: $1, $2\n"
if "one/two" =~ /$pattern/;
OUTPUT
match: one, two
This approach is similar to the RegExp constructor syntax in JS,
in that the pattern string is parsed separately from the // operator.
Comment 31•23 years ago
|
||
> So - it's good that we don't crash on this, but shouldn't we accept
> this regexp as a valid one?
I believe I can answer this from a reading of the ECMA-262 Final spec.
It says that "/" IS permitted as a "pattern character" (Chapter 15.10.1)
but is specifically NOT permitted in RegExp literals (Chapter 7.8.5).
Hence "/" may be provided to the RegExp function or constructor as part
of a pattern, but may not appear (unescaped) as part of a RegExp literal.
Thus, we seem to be acting exactly according to spec.
I will attach the documentation below. A similar example is the following:
RegExp("") is permitted, but the literal // is not (it's the comment sign).
Comment 32•23 years ago
|
||
Comment 33•23 years ago
|
||
SUMMARY
"/" allowed in RegExp patterns "/" allowed unescaped in RegExp literals
Perl 5 YES NO
ECMA-262 YES NO
JS1.5 YES NO
IE4.7 YES YES
To see the IE4.7 behavior, try this:
javascript: var re = /[/]/;
var str = "Hello/there";
var res = str.match(re);
alert(res);
OUTPUT:
/
Comment 34•23 years ago
|
||
NOTE: SpiderMonkey, Rhino are also handling things like this correctly:
js> var re = /[/a]/
1: invalid flag after regular expression
1: var re = /[/a]/
1: ............^
js> var re = /[/g]/
8: unterminated character class [:
8: var re = /[/g]/
Comment 35•23 years ago
|
||
Verification in browser: using trunk builds 20010919xx on WinNT, Linux, Mac9.1
No problem loading the MSDN site, reloading it, going back and forth, etc.
Similarly, no problem with jrgm's foo.html testcase. I never crashed.
Assignee | ||
Updated•23 years ago
|
Attachment #49232 -
Flags: superreview+
Attachment #49232 -
Flags: review+
Assignee | ||
Comment 36•23 years ago
|
||
Pulled and built branch engine, ran tests fine.
Low risk fix - the loop is simply re-structured to detect the end condition
earlier, before the increment, no other changes.
Assignee | ||
Comment 37•23 years ago
|
||
My boss 'thesteve' says this is a nsbranch+ nominee.
Assignee | ||
Updated•23 years ago
|
Status: REOPENED → RESOLVED
Closed: 23 years ago → 23 years ago
Resolution: --- → FIXED
Assignee | ||
Comment 38•23 years ago
|
||
Fix checked in to branch.
Comment 39•23 years ago
|
||
Bug 101070 has been filed on the evangelism issue -
Comment 40•23 years ago
|
||
Verified fixed on 0.9.4 branch. Using Mozilla branch binaries 20010923xx
on WinNT, Mac 9.1, and Linux. I verified the msdn URL given in this bug,
plus the two msdn URLS given in the duped bugs:
bug 99377
bug 99667
Also verified against jrgm's "foo.html" testcase.
Also verified with trunk builds 20010923xx.
Marking VERIFIED FIXED -
Status: RESOLVED → VERIFIED
Comment 41•23 years ago
|
||
Adding the signature to the summary for talkback tracking. Good job guys!
Summary: JS regexp code crashes in ParseAtom for script using Regexp() → JS regexp code crashes in ParseAtom for script using Regexp() [@ParseAtom]
Reporter | ||
Comment 42•23 years ago
|
||
*** Bug 103559 has been marked as a duplicate of this bug. ***
Updated•20 years ago
|
Flags: testcase+
Updated•14 years ago
|
Crash Signature: [@ParseAtom]
You need to log in
before you can comment on or make changes to this bug.
Description
•