Closed Bug 692441 Opened 13 years ago Closed 13 years ago

unexpected results with regular expression under Firefox 7

Categories

(Core :: JavaScript Engine, defect)

7 Branch
x86
Windows 7
defect
Not set
normal

RESOLVED INVALID

People

(Reporter: u427052, Unassigned)

User Agent: Mozilla/5.0 (Windows NT 6.1; rv:7.0.1) Gecko/20100101 Firefox/7.0.1
Build ID: 20110928134238

Steps to reproduce:

Running this code in Firefox 7:

var myregexp = /((UID (\w+)|RFC822.SIZE (\w+)|FLAGS \((.*?)\))[\s)]+){3}/g;
var str = "* 1 FETCH (UID 9 RFC822.SIZE 724 FLAGS (\\Seen))";

alert(myregexp.exec(str));


Actual results:

Result under Firefox >= 7:

UID 9 RFC822.SIZE 724 FLAGS (\Seen))
,FLAGS (\Seen))
,FLAGS (\Seen),,,\Seen


Expected results:

Result under Firefox <= 6.0.2:

UID 9 RFC822.SIZE 724 FLAGS (\Seen))
,FLAGS (\Seen))
,FLAGS (\Seen),9,724,\Seen
I see the "Actual results" also in Chrome, Safari, and Opera....
So you think it's normal that there is a difference in results between Firefox 6 and Firefox 7?
Firefox 7 updated the Yarr RegExp engine (the same one used in WebKit, aka Chrome and Safari), so it's not entirely surprising that behavior changed between 6 and 7. The fact that Opera, with a different RegExp implementation, also shows the same Actual Results suggests that the Firefox 7 results are more consistent with what other browsers do, though.

I'm no expert in RegExp, so I can't say whether the behavior in 6 or 7 is more correct, however. I'm curious, though, whether current Aurora or Nightly builds show the same result or not. A patch in bug 683838 landed recently that fixed a known regression from the Yarr upgrade. Can you try a current Aurora or Nightly build to see if things work as expected again? If they do, the good news is that the patch landed for Firefox 8 as well. If not, I will have to defer back to the RegExp gods :-)
http://www.mozilla.org/en-US/firefox/channel/
Test with Aurora (9.0a2): same result
Test with Nightly (10.0a1) : same result

Shame ...

I hope the RegExp gods will be indulgent with us :)
Chrome likewise doesn't use Yarr, so comment 1 says that there are 3 independent regular expression implementations (Yarr, and whatever Carakan and V8 use) that give the same behavior.

But yes, what this bug needs is for someone who understands the ES regexp spec to look at it...

Given the duplicate, I wonder where this regular expression is coming from!
"Given the duplicate, I wonder where this regular expression is coming from!"

This regular expression comes from the same person and the same add-on...
According to the tests I've done, the pipe (|) seems to be the problem.
Thanks for filing, but the old results were incorrect. On each iteration of the outer parens (with the {3} quantifier on them) the inner parens get cleared out at the start. If the match is successful, the results stick.

So, in this case, on the last iteration the inner parens get cleared and then only the FLAGS result is matched, so you don't get the UID or SIZE results from prior iterations because they were cleared out at the beginning of the iteration.

Spec citation is ECMAScript 5, section 15.10.2.5, RepeatMatcher step 4: "For every integer k that satisfies parenIndex < k and k ≤ parenIndex+parenCount, set cap[k] to undefined."
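
The clearing described above can be checked directly against the reporter's pattern. This is only a sketch: the backslash in the test string is escaped here on the assumption that the original input really contained "\Seen", and the captureStatus helper is introduced purely for illustration. Groups 3 (UID) and 4 (RFC822.SIZE) come back undefined because the third iteration of the {3} group matches only the FLAGS alternative.

```javascript
// Pattern from the report. Group numbers: 1 = outer repeated group,
// 2 = the alternation, 3 = UID, 4 = RFC822.SIZE, 5 = FLAGS.
var myregexp = /((UID (\w+)|RFC822.SIZE (\w+)|FLAGS \((.*?)\))[\s)]+){3}/g;

// Assumption: the backslash is escaped so the string contains "\Seen".
var str = "* 1 FETCH (UID 9 RFC822.SIZE 724 FLAGS (\\Seen))";

var result = myregexp.exec(str);

// Print each capture with its index so undefined slots are visible
// (alert() on an array would show them as empty strings).
var captureStatus = result.map(function (value, index) {
  return index + ": " + (value === undefined ? "undefined" : JSON.stringify(value));
});
console.log(captureStatus.join("\n"));
```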

As a simpler example, take:

/(?:first (\d) |second (\d) |third (\d) ){3}/.exec("first 1 second 2 third 3 ")
["first 1 second 2 third 3 ", (void 0), (void 0), "3"]
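
For code that relied on the old behaviour, one possible workaround is to drop the {3} quantifier and collect the captures one item at a time with a global regular expression. The following is a minimal sketch, not something proposed in this bug: the parseFetch function and the uid/size/flags property names are hypothetical, and the dot in RFC822.SIZE is escaped here, which the reporter's pattern did not do.

```javascript
// Hypothetical helper: gathers UID, RFC822.SIZE and FLAGS from a FETCH
// response line, in whatever order they appear.
function parseFetch(line) {
  // One item per match; no outer quantifier, so no capture is reset
  // by a later iteration.
  var item = /(?:UID (\w+)|RFC822\.SIZE (\w+)|FLAGS \((.*?)\))[\s)]+/g;
  var fields = {};
  var m;
  while ((m = item.exec(line)) !== null) {
    // Exactly one of the three groups participates in each match.
    if (m[1] !== undefined) fields.uid = m[1];
    if (m[2] !== undefined) fields.size = m[2];
    if (m[3] !== undefined) fields.flags = m[3];
  }
  return fields;
}

var parsed = parseFetch("* 1 FETCH (UID 9 RFC822.SIZE 724 FLAGS (\\Seen))");
console.log(parsed.uid, parsed.size, parsed.flags);
```

Because each exec call matches a single item, the values stay available after the loop regardless of which alternative matched last.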
Status: UNCONFIRMED → RESOLVED
Closed: 13 years ago
Resolution: --- → INVALID