Closed
Bug 237111
Opened 20 years ago
Closed 18 years ago
g parameter in RegExp causes alternating true/false result on same string
Categories
(Core :: JavaScript Engine, defect)
People
(Reporter: joachim.kathmann, Unassigned)
Attachments
(1 file)
361 bytes,
text/html
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.0; de-AT; rv:1.6) Gecko/20040113
Build Identifier: Mozilla/5.0 (Windows; U; Windows NT 5.0; de-AT; rv:1.6) Gecko/20040113

If a regular expression ends with the g parameter for global search, then the first time you execute the test() or exec() method on a given string, the result is correct (e.g. true). Using the same expression on the same string a second time returns the opposite result (e.g. false). The third time, the result is the same as the first time, and so on.

Reproducible: Always

Steps to Reproduce:
1. Save the following source code as an HTML file:

   <html>
   <head>
   <title>BugTest</title>
   </head>
   <script language="JavaScript">
   function BugTest(sString){
     var expr=/[a-z]+/g;
     return expr.test(sString);
   }
   </script>
   <body>
   <form id="test" name="test">
   <input type="button" id="btn" name="btn" value="Click Me" onClick="alert(BugTest('teststring'));">
   </form>
   </body>
   </html>

2. Open the file in Mozilla and click the button
3. First result is true
4. Click again
5. Second result is false
6. Click again
7. Third result is true
8. Remove g from expr and save the page
9. Reload the page in the browser
10. Every click returns true

Actual Results:
The result of the RegExp method test() alternates with every click on the button.

Expected Results:
The result should be the same every time.

This description also applies to the gi parameter combination. On Internet Explorer 5.5 the result is always correct whether you use g or not.
Comment 1•20 years ago
Attaching the reporter's testcase
Comment 2•20 years ago
Hm, this is bad. Alternating answers from a regexp. Confirming with Mozilla 1.7a under WinXP. Removing the g parameter indeed solves the problem. Bug 165353 and bug 209919 sound related, but both are supposed to be fixed by bug 85721, so I don't think they are duplicates.
Status: UNCONFIRMED → NEW
Ever confirmed: true
Comment 3•20 years ago
Actually, JS regexps are not supposed to always return the same thing. They store state, so the answer will depend on what you have done with the regexp up to now. In particular, the algorithm in the ECMA-262 spec (http://www.ecma-international.org/publications/files/ecma-st/ECMA-262.pdf) gives us the following:

First invocation:

1. Let S = "teststring"
2. Let length = 10
3. Let lastIndex = 0
4. Let i = 0
5. Does nothing since "g" option is used.
6. Does nothing since 0 <= i <= length
7. Call to [[Match]] succeeds. Go to step 10.
10. Let e = 10
11. Set lastIndex = 10
12. Return match

Second invocation:

1. Let S = "teststring"
2. Let length = 10
3. Let lastIndex = 10
4. Let i = 10
5. Does nothing since /g option is used.
6. Does nothing since 0 <= i <= length
7. Call to [[Match]] fails, since there is nothing to match at position i. Go to step 8.
8. Let i = 11.
9. Go to step 6.
6'. i > length, so set lastIndex to 0 and return null.

Note that we returned null and the next time through we will start matching at the beginning of the string again. Hence the alternating behavior in the testcase.

The point is that the /g option allows you to test the regexp against the string multiple times, each time starting with the preceding match until no more matches are left. When that happens, null is returned to indicate no more matches and the regexp is reset to the beginning of the string again.

So it looks like the problem is that IE has a bug in its implementation of /g (and this bug is rather well-known if you look at the other /g-related bugs in bugzilla).
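The two invocations walked through above can be observed directly by logging lastIndex around each call. This is a minimal sketch, not part of the original testcase; the names sharedExpr, subject and results are made up for illustration, and a single RegExp object is reused on purpose so that its state carries over between calls.

```javascript
// One RegExp object with the g flag, tested repeatedly against the same string.
const sharedExpr = /[a-z]+/g;
const subject = "teststring";

const results = [];
for (let i = 0; i < 4; i++) {
  const before = sharedExpr.lastIndex;      // where this call starts matching
  const matched = sharedExpr.test(subject); // advances or resets lastIndex
  results.push({ before, matched, after: sharedExpr.lastIndex });
}

// Calls 1 and 3 start at 0 and match; calls 2 and 4 start at 10, fail,
// and reset lastIndex to 0.
console.log(results);
```

The log should show lastIndex going 0 → 10 on a successful call and 10 → 0 on a failed one, which is the alternation the reporter saw.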
Comment 4•20 years ago
Boris, thanks for this nice piece of education. I indeed do see the alternating pattern occurring:

First run:
expr.global = true
expr.lastIndex = 0
match returns true

Second run:
expr.global = true
expr.lastIndex = 10
match returns false

etc.

What first of all surprises me is that the state is carried on to the next instance of a regexp. Apparently it is stored in the machine and not in the instance. However, not having thoroughly studied the specs yet (it's kinda big) I'll accept this blindly.

I'm however somewhat confused about a statement on page 101 of the specs:

15.5.4.10 String.prototype.match (regexp)

If regexp is not an object whose [[Class]] property is "RegExp", it is replaced with the result of the expression new RegExp(regexp). Let string denote the result of converting the this value to a string. Then do one of the following:

• If regexp.global is false: Return the result obtained by invoking RegExp.prototype.exec (see 15.10.6.2) on regexp with string as parameter.
• If regexp.global is true: Set the regexp.lastIndex property to 0 and invoke RegExp.prototype.exec repeatedly until there is no match. If there is a match with an empty string (in other words, if the value of regexp.lastIndex is left unchanged), increment regexp.lastIndex by 1. Let n be the number of matches. The value returned is an array with the length property set to n and properties 0 through n–1 corresponding to the first elements of the results of all matching invocations of RegExp.prototype.exec.

The spec states that "If regexp.global is true: Set the regexp.lastIndex property to 0 and invoke RegExp.prototype.exec repeatedly until there is no match."

In the testcase expr.global always returns true which would mean, if I understand it correctly, that the lastIndex property should always be set to 0. In your second run you say the lastIndex = 10. Is there an explanation for this? (I just hope I'm not looking at the wrong function here. At least I think I'm not.)

Like I said, I've only taken a very short look at it so far so don't shoot me if I got it all wrong ;)
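The difference between the section quoted above and the testcase can be sketched as follows: String.prototype.match resets lastIndex itself when the regexp is global, whereas RegExp.prototype.exec (and therefore test) starts from whatever lastIndex currently holds. The names re and text are hypothetical, and lastIndex is set by hand only to simulate state left over from an earlier call.

```javascript
const re = /[a-z]+/g;
const text = "abc def";

// Simulate state left behind by an earlier exec()/test() call.
re.lastIndex = 4;
const fromExec = re.exec(text);   // starts searching at index 4
const afterExec = re.lastIndex;   // end of the match that was found

re.lastIndex = 4;
const fromMatch = text.match(re); // match() first sets lastIndex to 0
const afterMatch = re.lastIndex;  // reset once no further match is found

console.log(fromExec[0], afterExec);  // "def" 7
console.log(fromMatch, afterMatch);   // [ "abc", "def" ] 0
```

So the "set lastIndex to 0" wording applies to match on a string, not to test or exec on the regexp, which is why the testcase still alternates.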
Comment 5•20 years ago
Hm, I guess I was looking at the wrong function. I should be looking at RegExp.prototype.test as defined on page 145 and RegExp.prototype.exec as defined on page 144, right? And the one I mentioned earlier doesn't apply here, or does it?
Comment 6•20 years ago
Funny, if RegExp.test in the Netscape JavaScript documentation is implemented by RegExp.prototype.test and RegExp.exec from the Netscape docs is implemented by RegExp.prototype.exec, then why does the Netscape JavaScript documentation say the following about RegExp.test:

test
Executes the search for a match between a regular expression and a specified string. Returns true or false.

Method of: RegExp
Implemented in: JavaScript 1.2, NES 3.0

Syntax: regexp.test([str])

Parameters
regexp: The name of the regular expression. It can be a variable name or a literal.
str: The string against which to match the regular expression. If omitted, the value of RegExp.input is used.

Description
When you want to know whether a pattern is found in a string use the test method (similar to the String.search method); for more information (but slower execution) use the exec method (similar to the String.match method).

How can exec be slower than test if test is implemented as RegExp.prototype.exec(string) != null ?
Comment 7•20 years ago
(In reply to comment #4)
> What first of all surprises me is that the state is carried on to the next
> instance of a regexpr.

That's why I cced brendan and left the bug open. I'm not sure whether having a var declared like that should create a new regexp instance every time through the function or not (it apparently does not in Mozilla). I _think_ regexp literals are evaluated and converted to objects at compile time, though...

(In reply to comment #5)
> Hm, I guess I was looking at the wrong function. I should be looking at
> RegExp.prototype.test as defined on page 145 and RegExp.prototype.exec as
> defined on page 144, right?

Yes. My apologies for not including the spec section number in my comment; I meant to and forgot.

As for the NS javascript docs, I don't really know enough about the JS engine to know what's up there. Brendan?
Comment 8•20 years ago
Thanks for your reply.
> How can exec be slower than test if test is implemented as
> RegExp.prototype.exec(string) != null ?
This makes me wonder: You can always bail out once you have the first match. The
answer will be correct the first time but does that leave the machine in the
same state? Perhaps also a nice one for Brendan to answer.
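The question about state can be checked with a small sketch: run test() on one global regexp and exec() on a second, identical one, then compare lastIndex. The variable names are made up for illustration, and two separate RegExp objects are used so neither call disturbs the other.

```javascript
const input = "teststring";
const viaTest = /[a-z]+/g;
const viaExec = /[a-z]+/g;

const testResult = viaTest.test(input); // boolean only
const execResult = viaExec.exec(input); // match array (or null)

// Both calls leave the regexp in the same state; they differ only in
// what they hand back to the caller.
console.log(testResult, viaTest.lastIndex);    // true 10
console.log(execResult[0], viaExec.lastIndex); // "teststring" 10
```

In this run both objects end with lastIndex at 10, so test() does not leave the regexp in a different state from exec().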
Comment 9•20 years ago
There are a lot of questions here best answered by developer docs, JS books, and the ECMA spec. They don't constitute a bug.

See ECMA-262 Edition 3 7.8.5, first paragraph, which stipulates that a regexp literal creates a RegExp object once per source literal, when the program is scanned, and evaluating that scanned reference results in a reference to the same single object.

test is faster than exec because exec constructs and returns a match array on match; test does not.

The Netscape JS docs have bugs, but again, those should not be reported here in bugzilla.mozilla.org. Take IE bugs to Microsoft, of course.

/be
Status: NEW → RESOLVED
Closed: 20 years ago
Resolution: --- → INVALID
Comment 10•20 years ago
*** Bug 238723 has been marked as a duplicate of this bug. ***
Comment 11•20 years ago
*** Bug 245376 has been marked as a duplicate of this bug. ***
Comment 12•19 years ago
*** Bug 303554 has been marked as a duplicate of this bug. ***
Comment 13•19 years ago
*** Bug 313591 has been marked as a duplicate of this bug. ***
Comment 14•18 years ago
*** Bug 331504 has been marked as a duplicate of this bug. ***
Comment 15•18 years ago
I know this is closed, but I don't think the answer that it's not a bug is correct:

"See ECMA-262 Edition 3 7.8.5, first paragraph, which stipulates that a regexp literal creates a RegExp object once per source literal, when the program is scanned, and evaluating that scanned reference results in a reference to the same single object."

"If regexp.global is true: Set the regexp.lastIndex property to 0"

This implies that the functionality of not creating a new object each time is correct, but that the index should be reset on the existing object every time you "declare" it.

I originally noticed this problem with regex.match. Rene Pronk quoted from the regex.match documentation, but I highly doubt this specific functionality would be different from method to method.
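For code that has to reuse one global regexp object, the reset described above can be done by hand. This is only a sketch of a common workaround, not something the spec or the engine does automatically for test(); the names reusedExpr and stableTest are hypothetical.

```javascript
// A single global regexp that is deliberately reused across calls.
const reusedExpr = /[a-z]+/g;

function stableTest(str) {
  reusedExpr.lastIndex = 0; // discard any state left by a previous call
  return reusedExpr.test(str);
}

const stable = [
  stableTest("teststring"),
  stableTest("teststring"),
  stableTest("teststring"),
];
console.log(stable); // [ true, true, true ]
```

Dropping the g flag, as in step 8 of the report, has the same effect when only a yes/no answer is needed.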
Comment 16•18 years ago
ECMA TG1 views the singleton per literal design decision as a mistake, and believes that fixing it incompatibly, to create a new RegExp object on each evaluation of the literal, will only make code work as intended, not break anything. This is on our list for Edition 4 incompatible bug-fix changes. Bugzilla is not the place to track ECMA stuff. Isn't this bug a dup of a much older bug? /be
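The two designs can be compared side by side in a sketch. It assumes an engine that follows ECMAScript 5 or later, where a regexp literal produces a new RegExp object each time it is evaluated; under the Edition 3 singleton rule described in this bug, the first function would alternate as well. The function names are made up, and perEvaluationTest mirrors the reporter's BugTest.

```javascript
// Literal evaluated inside the function: a fresh object per call in
// ES5+ engines (assumption stated above), so lastIndex always starts at 0.
function perEvaluationTest(str) {
  const expr = /[a-z]+/g;
  return expr.test(str);
}

// One object created up front and shared by every call: this reproduces
// the singleton behavior, so lastIndex carries over between calls.
const singletonExpr = /[a-z]+/g;
function singletonTest(str) {
  return singletonExpr.test(str);
}

const fresh = [1, 2, 3].map(() => perEvaluationTest("teststring"));
const shared = [1, 2, 3].map(() => singletonTest("teststring"));

console.log(fresh);  // [ true, true, true ] in ES5+ engines
console.log(shared); // [ true, false, true ]
```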
Whiteboard: DUPEME
Comment 18•18 years ago
*** This bug has been marked as a duplicate of 98409 ***
Status: REOPENED → RESOLVED
Closed: 20 years ago → 18 years ago
Resolution: --- → DUPLICATE