JS RegExp failing on certain disjunction + character class patterns

VERIFIED FIXED

Status

()

Core
JavaScript Engine
VERIFIED FIXED
15 years ago
15 years ago

People

(Reporter: Oliver Klee, Unassigned)

Tracking

({js1.5, regression})

Trunk
x86
Linux
js1.5, regression
Points:
---

Firefox Tracking Flags

(Not tracked)

Details

(Whiteboard: [no such issue in Rhino ], URL)

Attachments

(3 attachments, 2 obsolete attachments)

(Reporter)

Description

15 years ago
User-Agent:       Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.6a) Gecko/20030729
Build Identifier: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.6a) Gecko/20031031

JSLint at http://www.crockford.com/javascript/jslint.html so far has worked
fine, but now it always reports "Unclosed string" for non-empty string literals.
Probably this is a regression in the regexp code.

Reproducible: Always

Steps to Reproduce:
1. Load URL.
2. Enter the following code:
   var test = "hallo";
3. Press the "jslint" button.

Actual Results:  
JSlint displays the following:

(1) : error at character 13: Unclosed string: hallo";

Expected Results:  
JSlint displays the following:

ok


Global:  test

This works fine in Opera 7.21, Konqueror 3.1 IE 6.0.2 and Mozilla 1.5.
(Reporter)

Updated

15 years ago
Keywords: regression
Summary: JSLint fails to recognize string → JSLint fails to recognize string
(Reporter)

Updated

15 years ago
Summary: JSLint fails to recognize string → JSLint fails to recognize string delimiters

Comment 1

15 years ago
I bet that this is fallout from bug 85721 (which also fixed bug 123437). See 
also bug 190685.

Comment 2

15 years ago
This is most likely a regexp problem.

Comment 3

15 years ago
The problem is caused by this:

  /^((\\[^\x00-\x1f]|[^\x00-\x1f"\\])*)"/.exec( "hallo\";" );

Smallest testcase I found is this:

  /a|[^b]b/.exec( "xb" );

Note that

  /a|[^b]{1}b/.exec( "xb" );

works as expected.

Comment 4

15 years ago
The problem exists if all of the following are true:
- a disjunction has 2 or more alternatives
- a character class is the first term of an alternative
- it has no quantifier
- the alternative is one of the 2 last alternatives

So, this fails:
  /a|[x]b/.exec( "xb" );
  /[x]b|a/.exec( "xb" );
  /([x]b|a)/.exec( "xb" );
  /([x]b|a)|a/.exec( "xb" );

But this works:
  /^[x]b|a/.exec( "xb" );
  /([x]b)|a/.exec( "xb" );
  /()[x]b|a/.exec( "xb" );
  /x[b]|a/.exec( "xb" );
  /[x]{1}b|a/.exec( "xb" );
  /[x]b|a|a/.exec( "xb" );

Comment 5

15 years ago
This works too:
  /[x]b|[a]/.exec( "xb" );
  /[x]b|a+/.exec( "xb" );
  /[x]b|a{1}/.exec( "xb" );
  /[x]b|(a)/.exec( "xb" );
  /[x]b|()a/.exec( "xb" );
  /[x]b|^a/.exec( "xb" );

Comment 6

15 years ago
Zack: thanks, as always. You've saved us a lot of time!
Cc'ing Brendan; will add regression testcase later today -
Keywords: js1.5
Summary: JSLint fails to recognize string delimiters → JS RegExp failing on certain disjunction + character class patterns
I wish rogerl were around to help.  Phil, do you know of any way to reach him?

/be

Comment 8

15 years ago
Yes, I will try to get in touch with Roger today -

Comment 9

15 years ago
ALTPREREQ2 seems to be broken. It seems to be generated only if the character
class has no quantifier, and seems to fail. But I don't understand the code enough
to say why.

BTW, the simplest testcase is
  /[x]|x/.exec( "x" );

Comment 10

15 years ago
Created attachment 134771 [details] [diff] [review]
patch (effects mostly unknown)

By comparing with REOP_ALTPREREQ I guess the code for REOP_ALTPREREQ2 was meant

like in the patch. But since REOP_ALTPREREQ2 is almost never generated, nobody
saw the error in testcases.
(Reporter)

Comment 11

15 years ago
Requesting 1.6b blocker.
Flags: blocking1.6b?

Comment 12

15 years ago
Created attachment 134785 [details] [diff] [review]
patch working with inverted character classes too
Attachment #134771 - Attachment is obsolete: true

Comment 13

15 years ago
Created attachment 134786 [details] [diff] [review]
extended patch, trying to create more REOP_ALTPREREQ2

Simple approach to account for REOP_QUANTs. May or may not work. Untested.

Comment 14

15 years ago
Created attachment 134799 [details] [diff] [review]
working extended patch

This patch passes the test suite.

Besides fixing the bug it makes the prerequisites useful for character classes,

because they will be created for classes with quantifiers too (unless they have

a minimum of zero), and most real-world character classes have quantifiers.

The benchmark in attachment 99877 [details] of bug 85721 executes about 3% faster, but
there is a penalty at compile time, making attachment 116384 [details] about 3% slower.

Updated

15 years ago
Attachment #134786 - Attachment is obsolete: true

Comment 15

15 years ago
Testcase added to JS testsuite:

      mozilla/js/tests/ecma_3/RegExp/regress-224676.js

Currently passing in the Rhino shell; failing in SpiderMonkey on 
Sections 1-6, 20, and 22.

Comment 16

15 years ago
Created attachment 134817 [details] [diff] [review]
alternative patch, removing prerequisites altogether

Prerequisites in the current form aren't that useful. They apply only to the
last two alternatives and only if both begin with a flat or character class
(and have no quantifier without my previous patch). Maybe it's better to remove
the code bloat.

A IMHO better implementation would give /each/ alternative a prerequisite
whenever possible to give the opportunity for a short-circuit.

Brendan: Are you still planning to rewrite the regexp code again? If so, then
it's not worth spending much thoughts on such relatively small improvements.
Otherwise I may try to make a better patch.

Has anybody good regexp benchmarks? It would be interesting to compare this
patch with the previous patch. attachment 99877 [details] is about 7% slower with this
patch, so prerequisites aren't completely useless.

Updated

15 years ago
Depends on: 225289
Fixed by patch for bug 225289, which was quite similar to this bug's working
extended patch, although developed independently!

/be
Status: NEW → RESOLVED
Last Resolved: 15 years ago
Resolution: --- → FIXED

Comment 18

15 years ago
Verified FIXED. 

The above testcase now passes in the SpiderMonkey shell.
Note, it already did pass in the Rhino shell -
Status: RESOLVED → VERIFIED
Whiteboard: [no such issue in Rhino ]

Updated

15 years ago
Flags: blocking1.6b?
You need to log in before you can comment on or make changes to this bug.