Closed Bug 313591 Opened 19 years ago Closed 19 years ago

Instantiated regexp object not reset after reconstruction in different scope chain of same function call

Categories

(Core :: JavaScript Engine, defect)

x86
Windows XP
defect
Not set
normal

RESOLVED DUPLICATE of bug 237111

People

(Reporter: crisp, Unassigned)

Details

User-Agent:       Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8b3) Gecko/20050712 Firefox/1.0+
Build Identifier: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8b3) Gecko/20050712 Firefox/1.0+

In the following example, the second time the function do_something is called (so the regexp object is instantiated in a different scope chain of this function), lastIndex is not reset, and the result is not the first match in the passed string:

var string = 'abc';
alert(do_something(string)); // a
alert(do_something(string)); // b
function do_something(string)
{
	var re = /\w/g;
	alert(re.lastIndex); // 0 in first call, 1 on second call
	return (re.exec(string))[0];
}

The following situations however work fine:

Within the same scope, only re-instantiating the regexp object:
var string = 'abc';
var re = /\w/g;
alert(re.lastIndex); // 0
alert((re.exec(string))[0]); // 'a'
re = /\w/g;
alert(re.lastIndex); // 0
alert((re.exec(string))[0]); // 'a'

When using a different function:
var string = 'abc';
alert(do_something(string)); // a
alert(do_something2(string)); // a
function do_something(string)
{
	var re = /\w/g;
	alert(re.lastIndex); // 0
	return (re.exec(string))[0];
}
function do_something2(string)
{
	var re = /\w/g;
	alert(re.lastIndex); // 0
	return (re.exec(string))[0];
}

Reproducible: Always
This is a duplicate of bug 237111.
OK, I understand that a regular expression literal is compiled during scanning, and that during execution every evaluation of that same literal is a reference to the same compiled object.
What I couldn't quite find in the specification is if the lastIndex should be reset to 0 upon such evaluation. Imho it should.
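
The sharing described above can be imitated in any engine by keeping one regexp object outside the function. This is only a sketch: SHARED_RE and firstWordChar are made-up names, and the outer variable is an assumption standing in for the single object the engine creates for the literal. Engines that follow later editions of the spec create a fresh object each time a literal is evaluated, so the original testcase may not show the effect there.

```javascript
// Assumption: SHARED_RE plays the role of the one compiled object that
// every evaluation of the literal refers to.
var SHARED_RE = /\w/g;

function firstWordChar(text) {
  var re = SHARED_RE;            // same object on every call
  var before = re.lastIndex;     // state left over from the previous call
  var match = re.exec(text);     // a global regexp starts matching at lastIndex
  return { lastIndexBefore: before, match: match[0] };
}

var first = firstWordChar('abc');   // starts at index 0
var second = firstWordChar('abc');  // starts where the first call stopped
```

With the shared object, the second call starts matching at index 1 and so returns the second character rather than the first.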

Workaround anyway is easy: either reset the lastIndex yourself to 0 after evaluation, or don't use a regexp literal but use the constructor.
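
Both workarounds might look roughly like this. The function names are invented for illustration, and the null check is an addition that the original testcase does not have.

```javascript
// Workaround 1: keep the literal, but reset lastIndex before using it.
function firstMatchReset(text) {
  var re = /\w/g;
  re.lastIndex = 0;                 // harmless on a fresh object, essential on a shared one
  var m = re.exec(text);
  return m ? m[0] : null;
}

// Workaround 2: build a new object with the constructor on every call.
function firstMatchConstructed(text) {
  var re = new RegExp('\\w', 'g');  // backslash must be doubled inside the string
  var m = re.exec(text);
  return m ? m[0] : null;
}

var results = [
  firstMatchReset('abc'),
  firstMatchReset('abc'),
  firstMatchConstructed('abc'),
  firstMatchConstructed('abc')
];
```

The second form shows the quoting cost of the constructor: every backslash in the pattern has to be escaped again for the string literal.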

*** This bug has been marked as a duplicate of 237111 ***
Status: UNCONFIRMED → RESOLVED
Closed: 19 years ago
Resolution: --- → DUPLICATE
(In reply to comment #2)
> What I couldn't quite find in the specification is if the lastIndex should be
> reset to 0 upon such evaluation.

That's because there are no such words in the spec.

> Workaround anyway is easy: either reset the lastIndex yourself to 0 after
> evaluation, or don't use a regexp literal but use the constructor.

We've talked about this in ECMA TG1.  One proposal to make the latter easier to write is to allow:

  var re = new /(\d+)\s*/g;

where the new acts as usual, calling the [[Construct]] internal method of the regexp literal singleton, which clones that regexp each time the new expression is evaluated.

We are not going to make an incompatible change, or violate the spec, but by such extensions, we hope to make the problems people have (this is a big one) easier to work around.  Comments welcome.

/be
(In reply to comment #4)
> We've talked about this in ECMA TG1.  One proposal to make the latter
> easier to write is to allow:
>   var re = new /(\d+)\s*/g;
> where the new acts as usual, calling the [[Construct]] internal method of
> the regexp literal singleton, which clones that regexp each time the new
> expression is evaluated.

According to the spec, |new RegExp(regexpObject)| clones
given regexp object, except for its |lastIndex| property.
http://bclary.com/2004/11/07/ecma-262.html#a-15.10.4.1

The following code should alert "false/true/true/true/true/false".
Fx-trunk and IE6 produce correct results. Can anyone test on Opera?

var R1 = /(\d+)\s*/g;
R1.lastIndex = 1;
var R2 = new RegExp(R1);
alert([
  R1 === R2,
  R1.source == R2.source,
  R1.global == R2.global,
  R1.ignoreCase == R2.ignoreCase,
  R1.multiline == R2.multiline,
  R1.lastIndex == R2.lastIndex
].join("/"));
(In reply to comment #5)
> (In reply to comment #4)
> > We've talked about this in ECMA TG1.  One proposal to make the latter
> > easier to write is to allow:
> >   var re = new /(\d+)\s*/g;
> > where the new acts as usual, calling the [[Construct]] internal method of
> > the regexp literal singleton, which clones that regexp each time the new
> > expression is evaluated.
> 
> According to the spec, |new RegExp(regexpObject)| clones
> given regexp object, except for its |lastIndex| property.

Yes, that's good, but it's still verbose, ugly or awkward or impossible given quoting requirements, and it doesn't let the compiler create the regexp.

The idea with 'var re = new /(\d+)\s*/g' is that each individual regexp object (however it was created, whether from a literal or a new RegExp expression) is also a constructor that clones itself.

/be
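
The `new /.../g` form above is a proposal and is not valid syntax in the engines discussed in this bug. A rough approximation of the "regexp clones itself" idea, using only the `new RegExp(regexpObject)` behaviour cited from the spec in comment #5, could look like the sketch below. The helper name cloneRegExp is hypothetical.

```javascript
// Hypothetical helper; not part of any engine or of the TG1 proposal.
function cloneRegExp(re) {
  return new RegExp(re);   // same pattern and flags, lastIndex starts at 0
}

var template = /(\d+)\s*/g;
template.exec('12 34');              // matches "12 " and advances template.lastIndex

var copy = cloneRegExp(template);
var copyStart = copy.lastIndex;      // recorded before the copy is used
var firstNumber = copy.exec('12 34')[1];
```

Unlike the proposed syntax, this still needs a named helper or an explicit `new RegExp(...)` at each use, but the pattern itself can stay a literal.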
(In reply to comment #5)
opera 8.5 sez: false/true/true/true/true/false