Closed Bug 225289 Opened 21 years ago Closed 21 years ago

Regexp pattern /a|[^a]/ not matching correctly

Categories

(Core :: JavaScript Engine, defect)

x86
All
defect
Not set
normal

Tracking

()

VERIFIED FIXED

People

(Reporter: pschwartau, Assigned: brendan)

References

Details

(Keywords: js1.5, regression, Whiteboard: [no such issue in Rhino ])

Attachments

(2 files)

Reported by Rob Mueller:

--- 
It seems the recent changes checked in on Oct/22 (bug 85721)
have introduced a bug into the regexp engine in Mozilla.
 
This is affecting our website (http://www.fastmail.fm). A description of
the problem is here: http://forums.mozillazine.org/viewtopic.php?t=31354

An example that reproduces the problem:

<head><script>
function test() {
  var s = "test";
  var re = /((?:a|[^a])*)/g;
  re.lastIndex = 0;
  for (var i = 0; i < 2; i++) {
    var matches = re.exec(s);
    alert("matches=" + matches.join(":") + 
          ", index=" + matches.index + 
          ", lastIndex=" + re.lastIndex);
  }
}
</script></head>


Clearly (?:a|[^a]) is really the same as anything, so ((?:a|[^a])*) 
should match any string, however it fails to match in the latest 
nightly builds. I tested this and it works correctly in older versions 
of Mozilla, IE and Opera.
---
Testcase added to JS testsuite:

      mozilla/js/tests/ecma_3/RegExp/regress-225289.js

Let me know if there are any errors -
Keywords: js1.5, regression
zack-weg@gmx.de wrote via email:

> This is probably a problem with optimizing [a] and [^a] to point to the same
> character class and failing to remember different states for both. Have no time
> to look into it in more detail until friday. Please file a bug and cc me.

More tomorrow.

/be
Assignee: general → brendan
Priority: -- → P1
Target Milestone: --- → mozilla1.6beta
Blocks: 223650
It turns out another bug was filed on this, bug 223650:
"Using webmail ... causes Firebird to not respond ...."

I have made the current bug a dependency for bug 223650,
and left that bug open. That way, the contributors there
can check that http://www.myfastmail.com works properly
once we fix this bug.

Upgrading the severity here to critical, for parity -
Severity: normal → critical
Blocks: 225068
Attached patch proposed fixSplinter Review
ALTPREREQ2 was pretty busted.  I hope someone's around to review.

/be
Attachment #135541 - Flags: superreview?(shaver)
Attachment #135541 - Flags: review?(zack-weg)
Phil, I see this from mozilla/js/tests/ecma_3/RegExp/regress-225289.js when
testing the patch (attachment 135542 [details] [diff] [review]):

FAILED!: [reported from test()] Section 10 of test -
FAILED!: [reported from test()] regexp = /((?:a|[^a])*)/g
FAILED!: [reported from test()] string = 'a'
FAILED!: [reported from test()] ERROR !!! regexp failed to give expected match
array:
FAILED!: [reported from test()] Expect: ["a", "a"]
FAILED!: [reported from test()] Actual: ["a", ""]

Per ECMA 15.5.4.10 second bullet, matching 'a' against /((?:a|[^a])*)/g should
result in an array with two elements, the first from the first element of the
result of RegExp.prototype.exec at lastIndex 0, the second from the first el't
of the result of RegExp.prototype.exec at lastIndex 1.  The second match result
is "", because there is no "a" at index 1.

There are similar errors in the test for similar global regexps.

/be
Marking blocking1.6b+.  Phil, are there other regressions from the regexp
rewrite that should block 1.6beta?

/be
Status: NEW → ASSIGNED
Flags: blocking1.6b+
The other regressions of the regexp rewrite are:

bug 224676  JS RegExp failing on certain disjunction + character class patterns
bug 225343  Regexp pattern /[A]/i not matching 'a'

It looks like your patch for the current bug fixes bug 224676, too.
(At least it makes the testcase for that bug pass).


Thank you for pointing out the errors in the testcase for this bug.
I have corrected them, but the patch still fails on Section 11:

FAILED!: [reported from test()] Section 11 of test -
FAILED!: [reported from test()] regexp = /((?:a|[^a])*)/g
FAILED!: [reported from test()] string = ''
FAILED!: [reported from test()] ERROR !!! match arrays have different lengths:
FAILED!: [reported from test()] Expect: ["", , ]
FAILED!: [reported from test()] Actual: [""]


I think this section is correct. There is one capturing parens in this
regular expression, and per ECMA it must hold something: see bug 123437,  
"regexp backreferences /(a)? etc./ must hold |undefined| if not used".

I believe it must hold |undefined| (and not the empty string)
per bug 209919 comment 14 and the discussion there.
Severity: critical → normal
Priority: P1 → --
Target Milestone: mozilla1.6beta → ---
Phil: a global regexp such as /((?:a|[^a])*)/g, when passed to
String.prototype.match, does not result in an array containing extra elements
from capturing parens.  See ECMA-262 Edition 3 15.5.4.10, second bullet.

I'm checking in.  I should have noticed bug 224676 before, but it looks like
zack came up with a similar patch, then a patch to remove PREREQ* opcodes
altogether, then questioned that because of a 7% slowdown.  I'll close 224676
now.  I can't wait for reviews much longer, because I'm off to Comdex and 1.6b
freezes on Tuesday night.

/be
Blocks: 224676
Status: ASSIGNED → RESOLVED
Closed: 21 years ago
Resolution: --- → FIXED
Testcase repaired. I keep forgetting the distinctions between
re.exec(str) and str.match(re) when the global flag is set!

Marking Verified FIXED. The testcase now passes in both the 
debug and optimized JS shell. Note, it also passes in Rhino.
Status: RESOLVED → VERIFIED
Whiteboard: [no such issue in Rhino ]
Comment on attachment 135541 [details] [diff] [review]
proposed fix

clearing sr request - setting r=r=zack-web@gmx.de based on bug checkin comment
Attachment #135541 - Flags: superreview?(shaver)
Attachment #135541 - Flags: review?(zack-weg)
Attachment #135541 - Flags: review+
Flags: testcase+
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: