Closed Bug 443590 Opened 16 years ago Closed 9 years ago

String replace with regular expression with lazy quantifier and capturing parentheses gives unexpected result

Categories

(Core :: JavaScript Engine, defect)

x86
Windows XP
defect
Not set
normal

RESOLVED INVALID

People

(Reporter: meelluc, Unassigned)


User-Agent:       Mozilla/5.0 (Windows; U; Windows NT 5.1; nl; rv:1.9) Gecko/2008052906 Firefox/3.0
Build Identifier: Mozilla/5.0 (Windows; U; Windows NT 5.1; nl; rv:1.9) Gecko/2008052906 Firefox/3.0

If I replace a string using a regular expression with a lazy quantifier and capturing parentheses, the result is not what I expect.

The HTML/JavaScript below reproduces the error. The outcome of the code should be:
XXXX {:bbb:} {:ccc:}
but is:
XXXX {:ccc:}


<HTML>
<HEAD>
<SCRIPT type="text/javascript">
var text = "{:aaa:} {:bbb:} {:ccc:}";
var re = new RegExp("{:aaa(.*?)?:}","i");
var replaced = text.replace(re,"XXXX");
alert (replaced);

</SCRIPT>
</HEAD>
<BODY>
Regular expressions test in Firefox 3
</BODY>
</HTML>

Reproducible: Always

Steps to Reproduce:
1. Make an HTML file containing the HTML shown above.

2. Open this file in Firefox 3 or Firefox 2.
3. The result should be: XXXX {:bbb:} {:ccc:}
Actual Results:
XXXX {:ccc:}

Expected Results:
XXXX {:bbb:} {:ccc:}

The regular expression can easily be rewritten in another form, but a conditional lazy quantifier should work as expected. So it is a bug, but a minor one.
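For illustration, here is a minimal sketch of one possible rewrite. The report does not say which form the reporter had in mind, so the `rewritten` pattern below (the same pattern without the outer `?`) is an assumption, as are the variable names.

```javascript
const text = "{:aaa:} {:bbb:} {:ccc:}";

// Pattern from the report: an optional group wrapped around a lazy quantifier.
const original = /{:aaa(.*?)?:}/i;

// Assumed rewrite: drop the outer "?". The lazy group can already match
// the empty string, so the optional marker adds nothing for this input.
const rewritten = /{:aaa(.*?):}/i;

const originalResult = text.replace(original, "XXXX");
const rewrittenResult = text.replace(rewritten, "XXXX");

console.log(originalResult);  // XXXX {:ccc:}
console.log(rewrittenResult); // XXXX {:bbb:} {:ccc:}
```

With the rewritten pattern the match stops at the first ":}", which is the result the report expected.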
Assignee: nobody → general
Component: General → JavaScript Engine
Product: Firefox → Core
QA Contact: general → general
Looks like bug 359651 to me
Depends on: 359651
Assignee: general → nobody
IE11, Chrome 45 and Nightly give the same result: "XXXX {:ccc:}"
As bug 359651 got resolved as invalid, I'll do the same here.
Status: UNCONFIRMED → RESOLVED
Closed: 9 years ago
Resolution: --- → INVALID
I just checked the specification, and returning "XXXX {:ccc:}" is correct per ES2015.

Evaluating the term `(.*?)?` calls RepeatMatcher [1] with:
  RepeatMatcher(m=Matcher(.*?), c=Continuation(Matcher(:)), min=0, max=1, greedy=true, x.endIndex=5)

Following RepeatMatcher step 9, the matcher for the term `.*?` is called. That leads to another invocation of RepeatMatcher with:
  RepeatMatcher(m=Matcher(.), c=Continuation(Repeat:Matcher(.*?)), min=0, max=Infinity, greedy=false, x.endIndex=5)

This non-greedy invocation enters step 8.a, which calls the continuation of the first RepeatMatcher. The continuation returns the "failure" token (step 2.1) because the x and y states still have the same endIndex, i.e. the group would have matched the empty string. That leads to step 8.c of the second RepeatMatcher, where the matcher for the term `.` consumes the ":" character at index 5 of the input string. From this point on the `.` matcher keeps consuming characters until it reaches the ":}" characters of "{:bbb:}". The ":}" characters are matched by `:}` and the pattern evaluation stops.
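The walk-through above can be checked by inspecting the match directly. This is only a small sketch; the variable names are made up for illustration.

```javascript
const text = "{:aaa:} {:bbb:} {:ccc:}";
const match = /{:aaa(.*?)?:}/i.exec(text);

const start = match.index; // where the overall match begins
const whole = match[0];    // the full matched substring
const captured = match[1]; // what the group (.*?) consumed

console.log(start);    // 0
console.log(whole);    // {:aaa:} {:bbb:}
console.log(captured); // :} {:bbb
```

The capture starts with the ":" at index 5, so the group did not match the empty string, and the trailing `:}` of the pattern is matched by the end of "{:bbb:}".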


[1] http://tc39.github.io/ecma262/#sec-runtime-semantics-repeatmatcher-abstract-operation