Regular expression "(A)?(A.*)" picks 'A' twice

VERIFIED FIXED in mozilla0.9.8

Status

()

Core
JavaScript Engine
VERIFIED FIXED
17 years ago
13 years ago

People

(Reporter: Koike Kazuhiko, Assigned: rogerl (gone))

Tracking

Trunk
mozilla0.9.8
x86
All
Points:
---
Bug Flags:
in-testsuite +

Firefox Tracking Flags

(Not tracked)

Details

(Whiteboard: [fixed in Rhino 2001-08-27])

Attachments

(3 attachments, 3 obsolete attachments)

(Reporter)

Description

17 years ago
Regular expression "(A)?(A.*)" picks 'A' twice.

var str = "ABCDEF";
var regexp = /^(A)?(A.*)$/;
regexp.exec(str);

Actual Result: RegExp.$1 = 'A'; RegExp.$2 = 'ABCDEF';
Expected Result: RegExp.$1 = ''; RegExp.$2 = 'ABCDEF';
(Reporter)

Comment 1

17 years ago
Created attachment 39625 [details]
Testcase

Comment 2

17 years ago
Created attachment 39675 [details]
Testcase corrected to run on Bugzilla

Comment 3

17 years ago
Using Mozilla 2001062113 WinNT.

The corrected testcase has the test function called from button onClick()
rather than from form onSubmit(). This is necessary to run the testcase
on the Bugzilla server. 

It is very interesting to run the testcase with the test strings 
"AAA", then "AA",  then "A": 

regexp = /^(A)?(A.*)$/;      
str = "AAA"

NN4.7 $1=A, $2=AA
Moz:  $1=A, $2=AA


str = "AA"

NN4.7 $1=A, $2=A
Moz:  $1=A, $2=A


str = "A"

NN4.7 $1= , $2=A
Moz:  $1=A, $2=A


It is only when there is a single "A" that things go wrong in Moz - 

Comment 4

17 years ago
Here is how Perl handles this:

pattern  /(A)?(A.*)/

string   AAA
$1=A
$2=AA

string   AA
$1=A
$2=A

string   A
$1=
$2=A

Comment 5

17 years ago
Certainly Mozilla (and JS1.5) are not handling the string "A" correctly.


But I'm puzzled the way NN4.7, (and IE4.7), and Perl handle it. 
Since the regexp is /^(A)?(A.*)$/, and we do a greedy match from 
left to right, why do they assign the match to the SECOND subexpression, 
and not the FIRST? 

i.e. for the string "A", shouldn't we expect $1=A, $2=

???????????????????????????????????????????????????????

Comment 6

17 years ago
Possible answer: NN4.7, IE4.7 and Perl are not being "greedy".

"No match" is a successful outcome for the ? operator, since it means 
"match 0 or 1 time". So if you're not being greedy, you can match 0 times
and move on to the next subexpression to the right...

Comment 7

17 years ago
Note NN4.7, IE4.7 and Perl ARE being "greedy" when the string is "AAA" or "AA". 


pattern  /(A)?(A.*)/

string   AAA
$1=A
$2=AA

string   AA
$1=A
$2=A


They "consume" as much as they can (1 character) in the first 
subexpression, and give only the remainder to the second one.

But when the string is "A", they do not consume as much as they
can in the first subexpression. The first subexpression accepts
less: a 0-character match, to let the second subexpression match. 

"Altruistic" behavior to allow other parts of the pattern to match ...

Comment 8

17 years ago
Testcase added to JS test suite:

              js/tests/ecma_3/RegExp/regress-87231.js

Note: depends on latest version of:

              js/tests/ecma_3/RegExp/shell.js

Comment 9

17 years ago
Note: this testcase is also failing in Rhino...

Updated

17 years ago
Whiteboard: [need fix in Rhino, too]
(Assignee)

Comment 10

17 years ago
Rhino fix checked in.
Status: NEW → ASSIGNED

Comment 11

17 years ago
Rhino fix verified on WinNT. Testcase passes in both rhino, rhinoi shells -

Updated

17 years ago
Whiteboard: [need fix in Rhino, too] → [fixed in Rhino 2001-08-27]
(Assignee)

Comment 12

16 years ago
Created attachment 53792 [details] [diff] [review]
Reset paren contents after failed greedy child attempt

Comment 13

16 years ago
cc'ing reviewers -
(Assignee)

Comment 14

16 years ago
Created attachment 60740 [details] [diff] [review]
Here's the patch applied to 3.49

Seems like there were some problems applying the earlier patch
(Assignee)

Comment 15

16 years ago
Created attachment 60742 [details] [diff] [review]
Here's the patch applied to 3.49

Once again, with context.
Attachment #60740 - Attachment is obsolete: true
(Assignee)

Comment 16

16 years ago
Created attachment 62276 [details] [diff] [review]
Patched patch

I messed up shifting to later revision, trying again,
Attachment #53792 - Attachment is obsolete: true
Attachment #60742 - Attachment is obsolete: true

Comment 17

16 years ago
Comment on attachment 62276 [details] [diff] [review]
Patched patch

r=khanson
Attachment #62276 - Flags: review+
Comment on attachment 62276 [details] [diff] [review]
Patched patch

sr=brendan@mozilla.org
Attachment #62276 - Flags: superreview+
Another patch with r= and sr=, but not checked in, and it's now < 4 hours till
0.9.8 freezes.  What's the hold-up?

/be
(Assignee)

Comment 20

16 years ago
Fix checked in.
Status: ASSIGNED → RESOLVED
Last Resolved: 16 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla0.9.8

Comment 21

16 years ago
Marking Verified FIXED. The above testcase now passes in the
debug and optimized JS shell on Linux, WinNT, and Mac9.1.
Status: RESOLVED → VERIFIED

Updated

13 years ago
Flags: testcase+
You need to log in before you can comment on or make changes to this bug.