Closed Bug 237111 Opened 21 years ago Closed 19 years ago

g parameter in RegExp causes alternating true/false result on same string

Categories

(Core :: JavaScript Engine, defect)

x86
Windows 2000
defect
Not set
normal

RESOLVED DUPLICATE of bug 98409

People

(Reporter: joachim.kathmann, Unassigned)

Attachments

(1 file)

User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.0; de-AT; rv:1.6) Gecko/20040113
Build Identifier: Mozilla/5.0 (Windows; U; Windows NT 5.0; de-AT; rv:1.6) Gecko/20040113

If a regular expression ends with the g parameter (global search), then the first time you execute the test() or exec() method on a given string, the result is correct (e.g. true). Using the same expression on the same string a second time returns the opposite result (e.g. false). A third time the result is the same as the first time, and so on.

Reproducible: Always

Steps to Reproduce:
1. Save the following source code as an HTML file:

<html>
<head>
<title>BugTest</title>
</head>
<script language="JavaScript">
function BugTest(sString){
  var expr=/[a-z]+/g;
  return expr.test(sString);
}
</script>
<body>
<form id="test" name="test">
<input type="button" id="btn" name="btn" value="Click Me" onClick="alert(BugTest('teststring'));">
</form>
</body>
</html>

2. Open the file in Mozilla and click the button
3. First result is true
4. Click again
5. Second result is false
6. Click again
7. Third result is true
8. Remove g from expr and save the page
9. Reload the page in the browser
10. Every click returns true

Actual Results:
The result of the RegExp method test() alternates with every click on the button.

Expected Results:
The result should be the same every time.

This description also applies to the gi parameter combination. On Internet Explorer 5.5 the result is always correct whether you use g or not.
Attached file Testcase
Attaching reporter's testcase
Hm, this is bad. Alternating answers from a regexp. Confirming with Mozilla 1.7a under WinXP. Removing the g parameter indeed solves the problem. Bug 165353 and bug 209919 sound related, but both are supposed to have been fixed by bug 85721, so I don't think they are duplicates.
Status: UNCONFIRMED → NEW
Ever confirmed: true
Actually, JS regexps are not supposed to always return the same thing. They store state, so the answer will depend on what you have done with the regexp up to now. In particular, the algorithm for RegExp.prototype.exec in the ECMA-262 spec (http://www.ecma-international.org/publications/files/ecma-st/ECMA-262.pdf) gives us the following:

First invocation:
1. Let S = "teststring"
2. Let length = 10
3. Let lastIndex = 0
4. Let i = 0
5. Does nothing since the /g option is used.
6. Does nothing since 0 <= i <= length
7. Call to [[Match]] succeeds. Go to step 10.
10. Let e = 10
11. Set lastIndex = 10
12. Return match

Second invocation:
1. Let S = "teststring"
2. Let length = 10
3. Let lastIndex = 10
4. Let i = 10
5. Does nothing since the /g option is used.
6. Does nothing since 0 <= i <= length
7. Call to [[Match]] fails, since there is nothing to match at position i. Go to step 8.
8. Let i = 11.
9. Go to step 6.
6'. i > length, so set lastIndex to 0 and return null.

Note that we returned null, and the next time through we will start matching at the beginning of the string again. Hence the alternating behavior in the testcase.

The point is that the /g option allows you to test the regexp against the string multiple times, each time starting where the preceding match ended, until no more matches are left. When that happens, null is returned to indicate no more matches and the regexp is reset to the beginning of the string again.

So it looks like the problem is that IE has a bug in its implementation of /g (and this bug is rather well-known if you look at the other /g-related bugs in bugzilla).
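The two invocations traced above can be observed directly by reading lastIndex around each call. This is only a sketch: it keeps the regexp in a single shared variable so that every call is guaranteed to hit the same object, and the `results` array is a name introduced here for illustration.

```javascript
// One shared RegExp object with the g flag; lastIndex persists between calls.
const expr = /[a-z]+/g;
const s = "teststring";

const results = [];
for (let call = 1; call <= 4; call++) {
  const before = expr.lastIndex;   // where this call will start matching
  const matched = expr.test(s);    // same as expr.exec(s) !== null
  results.push({ call, before, matched, after: expr.lastIndex });
}

console.log(results);
// call 1: before 0,  matched true,  after 10
// call 2: before 10, matched false, after 0
// call 3: before 0,  matched true,  after 10
// call 4: before 10, matched false, after 0
```

A failed match resets lastIndex to 0, which is why the pattern repeats with a period of two.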
Boris, thanks for this nice piece of education. I do indeed see the alternating pattern occurring:

First run:
expr.global = true
expr.lastIndex = 0
match returns true

Second run:
expr.global = true
expr.lastIndex = 10
match returns false

etc.

What first of all surprises me is that the state is carried over to the next instance of a regexp. Apparently it is stored in the machine and not in the instance. However, not having thoroughly studied the specs yet (it's kinda big), I'll accept this blindly.

I'm however somewhat confused about a statement on page 101 of the specs:

15.5.4.10 String.prototype.match (regexp)
If regexp is not an object whose [[Class]] property is "RegExp", it is replaced with the result of the expression new RegExp(regexp). Let string denote the result of converting the this value to a string. Then do one of the following:
• If regexp.global is false: Return the result obtained by invoking RegExp.prototype.exec (see 15.10.6.2) on regexp with string as parameter.
• If regexp.global is true: Set the regexp.lastIndex property to 0 and invoke RegExp.prototype.exec repeatedly until there is no match. If there is a match with an empty string (in other words, if the value of regexp.lastIndex is left unchanged), increment regexp.lastIndex by 1. Let n be the number of matches. The value returned is an array with the length property set to n and properties 0 through n–1 corresponding to the first elements of the results of all matching invocations of RegExp.prototype.exec.

The spec states that "If regexp.global is true: Set the regexp.lastIndex property to 0 and invoke RegExp.prototype.exec repeatedly until there is no match." In the testcase expr.global always returns true, which would mean, if I understand it correctly, that the lastIndex property should always be set to 0. In your second run you say lastIndex = 10. Is there an explanation for this? (I just hope I'm not looking at the wrong function here. At least I think I'm not.)
Like I said, I've only taken a very short look at it so far so don't shoot me if I got it all wrong ;)
Hm, I guess I was looking at the wrong function. I should be looking at RegExp.prototype.test as defined on page 145 and RegExp.prototype.exec as defined on page 144, right? And the one I mentioned earlier doesn't apply here, or does it?
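A small sketch of the distinction being asked about: the "set lastIndex to 0" step belongs to String.prototype.match, not to RegExp.prototype.test or exec. The variable names below are chosen for illustration only.

```javascript
const re = /[a-z]+/g;
const text = "teststring";

re.test(text);                        // first test() succeeds...
const staleIndex = re.lastIndex;      // ...and leaves lastIndex at 10

// match() with a global regexp resets lastIndex to 0 before it starts,
// so it finds the match even though lastIndex was 10.
const allMatches = text.match(re);    // ["teststring"]
const indexAfterMatch = re.lastIndex; // 0 once match() has run out of matches

// exec() and test() do no such reset: they start from whatever lastIndex holds.
re.lastIndex = 10;
const execFromEnd = re.exec(text);    // null, nothing left to match at index 10

console.log(staleIndex, allMatches, indexAfterMatch, execFromEnd);
```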
Funny, if RegExp.test in the Netscape JavaScript documentation is implemented by RegExp.prototype.test, and RegExp.exec from the Netscape docs is implemented by RegExp.prototype.exec, then why does the Netscape JavaScript documentation say the following about RegExp.test:

test
Executes the search for a match between a regular expression and a specified string. Returns true or false.
Method of: RegExp
Implemented in: JavaScript 1.2, NES 3.0
Syntax: regexp.test([str])
Parameters
regexp: The name of the regular expression. It can be a variable name or a literal.
str: The string against which to match the regular expression. If omitted, the value of RegExp.input is used.
Description
When you want to know whether a pattern is found in a string use the test method (similar to the String.search method); for more information (but slower execution) use the exec method (similar to the String.match method).

How can exec be slower than test if test is implemented as RegExp.prototype.exec(string) != null ?
(In reply to comment #4)
> What first of all surprises me is that the state is carried on to the next
> instance of a regexpr.

That's why I cced brendan and left the bug open. I'm not sure whether having a var declared like that should create a new regexp instance every time through the function or not (it apparently does not in Mozilla). I _think_ regexp literals are evaluated and converted to objects at compile time, though...

(In reply to comment #5)
> Hm, I guess I was looking at the wrong function. I should be looking at
> RegExp.prototype.test as defined on page 145 and RegExp.prototype.exec as
> defined on page 144, right?

Yes. My apologies for not including the spec section number in my comment; I meant to and forgot.

As for the NS javascript docs, I don't really know enough about the JS engine to know what's up there. Brendan?
Thanks for your reply.

> How can exec be slower than test if test is implemented as
> RegExp.prototype.exec(string) != null ?

This makes me wonder: you can always bail out once you have the first match. The answer will be correct the first time, but does that leave the machine in the same state? Perhaps also a nice one for Brendan to answer.
There are a lot of questions here best answered by developer docs, JS books, and the ECMA spec. They don't constitute a bug. See ECMA-262 Edition 3 7.8.5, first paragraph, which stipulates that a regexp literal creates a RegExp object once per source literal, when the program is scanned, and evaluating that scanned reference results in a reference to the same single object. test is faster than exec because exec constructs and returns a match array on match; test does not. The Netscape JS docs have bugs, but again, those should not be reported here in bugzilla.mozilla.org. Take IE bugs to Microsoft, of course. /be
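To illustrate the test/exec difference described above, here is a minimal sketch. The input string and variable names are made up for the example; it shows only what each method returns and that both advance the same lastIndex, not anything about relative speed.

```javascript
const pattern = /[a-z]+/g;
const input = "foo bar";

// exec() builds and returns a match array (with index and input properties).
const viaExec = pattern.exec(input);      // ["foo"], viaExec.index === 0
const indexAfterExec = pattern.lastIndex; // 3

// test() returns only a boolean, but it continues from the same lastIndex.
const viaTest = pattern.test(input);      // true, this time matching "bar"
const indexAfterTest = pattern.lastIndex; // 7

console.log(viaExec[0], viaExec.index, indexAfterExec, viaTest, indexAfterTest);
```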
Status: NEW → RESOLVED
Closed: 21 years ago
Resolution: --- → INVALID
*** Bug 238723 has been marked as a duplicate of this bug. ***
*** Bug 245376 has been marked as a duplicate of this bug. ***
*** Bug 303554 has been marked as a duplicate of this bug. ***
*** Bug 313591 has been marked as a duplicate of this bug. ***
*** Bug 331504 has been marked as a duplicate of this bug. ***
I know this is closed, but I don't think the answer that it's not a bug is correct:

"See ECMA-262 Edition 3 7.8.5, first paragraph, which stipulates that a regexp literal creates a RegExp object once per source literal, when the program is scanned, and evaluating that scanned reference results in a reference to the same single object."

"If regexp.global is true: Set the regexp.lastIndex property to 0"

This implies that the functionality of not creating a new object each time is correct, but that the index should be reset on the existing object every time you "declare" it.

I originally noticed this problem with regex.match. Rene Pronk quoted from the regex.match documentation, but I highly doubt this specific functionality would be different from method to method.
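For code that wants the same answer on every call, two workarounds follow from the discussion so far: reset lastIndex by hand before each test(), or drop the g flag as in steps 8 to 10 of the original report. The function names below are hypothetical, and this is a sketch rather than a recommendation from the thread.

```javascript
// Workaround 1: keep the g flag but reset the stored position before each use.
const globalExpr = /[a-z]+/g;
function testWithReset(s) {
  globalExpr.lastIndex = 0;
  return globalExpr.test(s);
}

// Workaround 2: without the g flag, test() always starts at index 0.
const plainExpr = /[a-z]+/;
function testWithoutG(s) {
  return plainExpr.test(s);
}

const resetResults = [1, 2, 3].map(() => testWithReset("teststring"));
const plainResults = [1, 2, 3].map(() => testWithoutG("teststring"));
console.log(resetResults, plainResults); // [true, true, true] [true, true, true]
```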
ECMA TG1 views the singleton per literal design decision as a mistake, and believes that fixing it incompatibly, to create a new RegExp object on each evaluation of the literal, will only make code work as intended, not break anything. This is on our list for Edition 4 incompatible bug-fix changes. Bugzilla is not the place to track ECMA stuff. Isn't this bug a dup of a much older bug? /be
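The difference between one object per literal and a new object per evaluation can be sketched without depending on how a particular engine treats literals. Below, the shared case uses a single variable and the fresh case calls new RegExp explicitly on every call; both function names are assumptions made for this example.

```javascript
// Singleton behaviour: every call reuses one RegExp object and its lastIndex.
const sharedExpr = /[a-z]+/g;
function testShared(s) {
  return sharedExpr.test(s);
}

// Per-evaluation behaviour, emulated explicitly: a new object each call,
// so lastIndex always starts at 0.
function testFresh(s) {
  return new RegExp("[a-z]+", "g").test(s);
}

const sharedResults = [1, 2, 3].map(() => testShared("teststring"));
const freshResults = [1, 2, 3].map(() => testFresh("teststring"));
console.log(sharedResults); // [true, false, true]
console.log(freshResults);  // [true, true, true]
```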
Whiteboard: DUPEME
yep.
Status: RESOLVED → REOPENED
Resolution: INVALID → ---
*** This bug has been marked as a duplicate of 98409 ***
Status: REOPENED → RESOLVED
Closed: 21 years ago → 19 years ago
Resolution: --- → DUPLICATE
Whiteboard: DUPEME