Closed Bug 39438 Opened 25 years ago Closed 25 years ago

Malformed RegExp literals cause entire JavaScript file to not be parsed

Categories

(Core :: JavaScript Engine, defect, P3)

x86
Windows NT
defect

Tracking

()

RESOLVED FIXED

People

(Reporter: aw, Assigned: rogerl)

Details

(Keywords: js1.5, Whiteboard: [nsbeta3-])

Attachments

(3 files)

From Bugzilla Helper: User-Agent: Mozilla/4.0 (compatible; MSIE 5.01; Windows NT) BuildID: 2000051609 A malformed RegExp literal does not trigger any error message, and causes the entire JavaScript file to not be parsed. Reproducible: Always Steps to Reproduce: 1. Run the attached file using xpcshell: xpcshell bad-re.js 2. Edit the file and comment out the reBad literal, and uncomment the reCorrect literal. 3. Run the edited file in xpcshell again. Actual Results: Step 1. The bad RegExp caused the file to not be parsed - none of the dump() statements executed etc. Step 3. The file runs fine when the RegExp is corrected. Expected Results: The bad RegExp should have failed to parse and generated some kind of error. This also occurs when a JavaScript file with a bad RegExp is referenced from a XUL source file.
setting status to NEW
Status: UNCONFIRMED → NEW
Ever confirmed: true
Sorry for the delay - taking over QA for the JavaScript Engine group - Confirming bug. You can use the reporter's test cases in the ordinary JS Shell by replacing the function "dump" with the function "print".
Status: NEW → ASSIGNED
The problem here is that after the RE has been lexed it gets sent to the RegExp parser. This detects and reports the error through the usual ReportErrorNumber() mechanism - i.e. sets an execption in the context. The RegExp parser also returns NULL which triggers an abend through the language parser, ultimately reaching the CompileScript level as a JS_FALSE value. At this point the API just returns and the error report never sees the light of day. How about if we add js_ReportUncaughtException() after the NewRegExp call in the scanner? Or does that seem too limited a fix?
Shouldn't any report-uncaught-exception magic happen at the highest level, in the API entry point code? It does seem odd to me that JS_CompileUCScriptForPrincipals (the common subroutine of all the compile-script API entry points) sets a null error reporter around the CompileTokenStream call, then does nothing to recover a report from any uncaught exception. McCabe, what's the deal? /be
Keywords: nsbeta3
Keywords: js1.5
Anyone have a clue? I still think some code layer near the API entry point should recover an uncaught exception and report it. In fact, I would say that the code I cited previously, which intentionally sets a null error reporter, should then worry about uncaught exceptions. But I crave mccabe's insight. /be
I think this is important, but well formed code is not affected. Marking [nsbeta3-].
Whiteboard: [nsbeta3-]
The problem might be that the regexp parse failure is reported via js_ReportUncaughtException instead of js_ReportCompileErrorNumber (as with unterminated regexp, a few lines above.) js_ReportCompileErrorNumber checks cx->interplevel to decide whether to directly call the JS error reporter if this is a toplevel compile, or to package the error as an exception. In the case of a toplevel compile, we wanto to call the embedding-supplied error reporter directly. Regexp compilation should either do something equivalent, or communicate the error to the scanner in such a way that the scanner can call js_ReportCompileErrorNumber.
I get it: all the JS_Compile* entry points report compile-time errors promptly only if the underlying code goes through js_ReportCompileErrorNumber at top level (cx->interpLevel == 0), else they turn those errors into exceptions. Right? My patch attempts to preserve that behavior with minimal change. McCabe's suggestion of propagating information back from the regexp parser to jsscan.c, so that js_GetToken can call js_ReportCompileErrorNumber, sounds better, because it could give an error indicator with source context, pointing into the middle of a regexp literal at the offending sub-token. But that sounds like work! /be
Brendan - Might make sense to land your intermediate-fix. I'm guessing that passing ts through to the regexp compiler where possible (and adding it to a bunch of parameter lists) won't be as hard as all that, but it might be too pervasive for these times. This'd also require teaching js_ReportCompileErrorNumber about a null ts.
Umm, oops. I just realized that I left in my experimental fix for this when I was checking in the --> as line comment change. Would you be so kind as to blow it off (line 1146) when you patch?
Attaching diffs that modify js_ReportCompileError to tolerate a null tokenstream, and add a tokenstream (possibly null) to the RegExp compiler state. regexp compile error calls are switched to js_ReportCompileError. I've tested in js, and the js_ReportCompileError distinction between compiletime and execution time seems to work. Diffs are against JS_150_RC2 branch.
Righteous, mccabe! ra=brendan@mozilla.org, get it in. /be
Fix checked in.
Status: ASSIGNED → RESOLVED
Closed: 25 years ago
Resolution: --- → FIXED
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: