Closed Bug 367388 Opened 14 years ago Closed 13 years ago

Javascript test() method fails on regexp

Categories

(Firefox :: General, defect)

x86
Windows XP
defect
Not set
normal

Tracking

()

RESOLVED DUPLICATE of bug 98409

People

(Reporter: mailing-list, Unassigned)

References

Details

User-Agent:       Mozilla/5.0 (Windows; U; Windows NT 5.1; it; rv:1.8.1.1) Gecko/20061204 Firefox/2.0.0.1
Build Identifier: Mozilla/5.0 (Windows; U; Windows NT 5.1; it; rv:1.8.1.1) Gecko/20061204 Firefox/2.0.0.1

The test() method applied to a regular expression seems to fail.
I used a regexp that checks whether a string consists only of words that are at least 3 characters long.



Reproducible: Always

Steps to Reproduce:
To reproduce the error, simply try the code below.

[code]
<script type="text/javascript">
function foobar(){
    var ricerca = document.getElementById('str').value; 
    var re=/^(\s?[\S]{3,}\s?)*$/g
    
    if (!(re.test(ricerca))) { 
        alert ("it fails"); 
    } 
    else {
        alert ("it works"); 
    }
    document.getElementById('str').focus(); 
}
</script>
<input type="text" id="str" value="test string">
<input type="button" value="test" onclick="foobar()">
[/code]

Actual Results:  
On the 1st test, the method test() works correctly.
On the 2nd it fails.
On the 3rd it works.
On the 4th it fails.
... and so on.

Expected Results:  
It must not fail.
It must only alert "it works".

If I replace the line
 -> re.test(ricerca)

with this
 -> ricerca.match(re)

the example works as expected.
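
A minimal sketch of why swapping in match() hides the symptom, assuming an engine that follows the ECMAScript rules for String.prototype.match: with a global regexp, match() sets lastIndex to 0 before it searches, whereas test() resumes from wherever the previous call stopped. The pattern and variable names here are illustrative and not taken from the report.

```javascript
var re = /abc/g;

var first = re.test("abcabc");   // search starts at index 0 and finds "abc"
var afterTest = re.lastIndex;    // position just past that first match

var all = "abcabc".match(re);    // match() resets lastIndex to 0, then collects every match
var afterMatch = re.lastIndex;   // the final, failing search leaves lastIndex at 0
```

So a match() call does not depend on state left behind by earlier calls, while consecutive test() calls on the same global regexp do.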
I can reproduce the bug under Ubuntu Linux and Windows XP.

The bug only occurs for regexps with the 'g' flag.

Here is a simple example:

t = /abc/g; // g is important
t.test("abc"); // true
t.test("abcabc"); // true
t.test("abcabc"); // false

Maybe it works like this: in g mode the regex object remembers the position of the last match and starts its next search there.
I used Firefox 2.0.0.3
Did some testing and came up with the following results:
It seems that after doing a global match with the 'g' flag, rhino does not properly reset the lastIndex property of the corresponding RegExp object.
I was poking around a little with basically the following script:

<script type="text/javascript">
function foobar(){
    var s1 = "foo's da ****!";
//    var s1 = "foo";
    var s2 = "bar is the ****";
//    var s2 = "bar";
    var re1, re2;
    var res1 = new Array(100);
    var res2 = new Array(100);
    document.writeln("Matches '/foo/' and '/bar/' against the test strings 100x<br>");

    for (var i = 0; i < 100; i++) {
        re1 = /foo/g;
        re2 = /bar/g;
        res1[i] = /foo/g;
        res2[i] = /bar/g;
//        res1[0].lastIndex=-1;
//        res2[0].lastIndex=-1;
//        re1.lastIndex=-1;
//        re2.lastIndex=-1;

        // Note: the conditions below are negated, so "matches!" is printed
        // when test() returns false and "doesn't match!" when it returns true.
        if (i%2==0){
            if (!(res1[i].test(s1))) {
//            if (!(re1.test(s1))) {
                document.writeln(i+":foo matches!<br>");
            }
            else {
                document.writeln(i+":foo doesn't match!<br>");
            }
        }else{
            if (!(res2[i].test(s2))) {
//            if (!(re2.test(s2))) {
                document.writeln(i+":bar matches!<br>");
            }
            else {
                document.writeln(i+":bar doesn't match!<br>");
            }
        }
        document.writeln(i+":r1 Lidx="+res1[i].lastIndex+"<br>");
        document.writeln(i+":r2 Lidx="+res2[i].lastIndex+"<br>");
    }
}
</script>

<input type="button" value="test" onclick="foobar()">


If you use the same RE object twice with the global flag, the lastIndex property is set to the end of the last match and is not reset.
Intuitively it should be reset, because the global flag should only affect the current match, not any future ones.
This causes the next match with the same pattern to fail and reset the property, so that the match after that succeeds.
Resetting the lastIndex property by hand to -1 or matching an empty string also achieves this.
See here: http://lists.evolt.org/archive/Week-of-Mon-20050829/175309.html
Not using the global flag does not help either; it just causes every match to fail.

It also seems that RE objects with the same pattern refer to the same singleton object, so re-creating the object every time does not help; moreover, the lastIndex property can be reset on any object in the array.
This generally applies to all Mozilla browsers but not to Opera or IE.
Neither of those honors that script at all, and they never match.
I believe I have just come across the same bug with the following sample code:

var re = new RegExp('([^asd])', 'g');
alert(re.test('t')); // returns true
alert(re.test('t')); // returns false

Granted, in the above example the 'g' is redundant but I find this unexpected behaviour highly undesirable. Surely the test method of the same RegExp object should always return the same result when provided with the same string?

Using Firefox 2.0.0.11
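
One way to get the same answer for the same string every time is to reset lastIndex before each call. The following is only a sketch: testFromStart is a hypothetical helper name, not part of any library or of the browser, and dropping the 'g' flag altogether would serve equally well when only a yes/no answer is needed.

```javascript
// Hypothetical helper: always search from the beginning of the string,
// regardless of what earlier calls left in re.lastIndex.
function testFromStart(re, str) {
    re.lastIndex = 0;
    return re.test(str);
}

var re = new RegExp('([^asd])', 'g');
var results = [
    testFromStart(re, 't'),
    testFromStart(re, 't'),
    testFromStart(re, 't')
];
```

With the reset in place, every entry in results should be the same, because no call depends on state left behind by the previous one.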
This is all working as intended, if I understand the bug reports correctly.

When a RegExp has the "g" flag, it doesn't reset the lastIndex value to zero when it starts a new match, so it starts from that position. It resets lastIndex to zero when it fails to match.
(ECMAScript 3rd ed., section 15.10.6.2, steps 3-6 and 10-11).
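
The behaviour described above can be observed directly by recording lastIndex around each call. This is an illustrative trace that reuses the /abc/g example from earlier in this bug; the trace array and its field names are made up for the sketch.

```javascript
var re = /abc/g;
var inputs = ["abc", "abcabc", "abcabc", "abcabc"];
var trace = [];

for (var i = 0; i < inputs.length; i++) {
    var startedAt = re.lastIndex;      // where this search will begin
    var matched = re.test(inputs[i]);  // a success advances lastIndex, a failure resets it to 0
    trace.push({
        startedAt: startedAt,
        matched: matched,
        lastIndexAfter: re.lastIndex
    });
}
```

The third call starts at index 6, finds nothing there, and resets lastIndex to zero, which is why the fourth call succeeds again.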

RegExp literals create RegExp objects at parse time, so each literal corresponds to only one RegExp object, no matter how many times it's evaluated.
(ECMAScript 3.ed. section 7.8.5)
The singleton RegExp object per literal is considered a bug in ES3 now, based on experience evident as testimony in bugzilla (this bug and many like it), and on the exceptional evaluation model (other "literals" in JS are evaluated each time to make a new object).

Also, mutable objects should not be shared singletons when expressed literally, since the programmer can't tell where all the mutations occur in any program where references to the literal regexp escape. Programmers can make this hazard for themselves if they like even with a fixed regexp literal evaluation design, using global variables or shared heap references -- but the language should not do it by default.

So, ECMA-262 Edition 3.1, which is available in draft form now (see http://wiki.ecmascript.org/doku.php?id=es3.1:es3.1_proposal_working_draft), fixes the design flaw in ES3 by making each evaluation of a regexp literal create a new RegExp instance.
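
A small sketch of what that change means in practice, assuming an engine that implements the revised literal semantics (ES5 or later); under the ES3 rules described above, both calls would return the same object. The function name makeRe is illustrative only.

```javascript
// Each call evaluates the same regexp literal once.
function makeRe() {
    return /abc/g;
}

var a = makeRe();
var b = makeRe();

var distinct = (a !== b);   // separate RegExp instances under the revised semantics

a.test("abc");              // advances a.lastIndex only
var independent = (a.lastIndex === 3 && b.lastIndex === 0);
```

Because the two objects no longer share lastIndex, state left behind by one evaluation of the literal cannot leak into another.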

Technically, this bug report is a dup of bug 98409. We could morph it into a bug to track ES3.1, but I think it's better to mark it a dup. When we are close to having ES3.1 done (by next spring, probably sooner in the case at hand) we should file a new bug asking to track its change to regexp literal evaluation.

Cc'ing sayrer -- Rob, do you know of a tracking bug for the 3.1 change (I didn't see one at a glance)? Please file one if needed.

/be
Status: UNCONFIRMED → RESOLVED
Closed: 13 years ago
Resolution: --- → DUPLICATE
Duplicate of bug: regexpliteralflaw
Blocks: es5
(In reply to comment #6)
> Cc'ing sayrer -- Rob, do you know of a tracking bug for the 3.1 change (I
> didn't see one at a glance)? Please file one if needed.

ES3.1 tracking: bug 445494.
No longer blocks: es5
Blocks: es5