Closed Bug 1053944 Opened 10 years ago Closed 10 years ago

RegExp.$N changed behavior since FF 33

Categories

(Core :: JavaScript: Standard Library, defect)

34 Branch
x86_64
Linux
defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla34
Tracking Status
firefox33 --- fixed
firefox34 --- fixed

People

(Reporter: samth, Assigned: till)

References

()

Details

(Keywords: dev-doc-complete, Whiteboard: [DocArea=JS])

Attachments

(1 file)

To reproduce:

- Visit http://mirror.racket-lang.org/releases/6.1/doc/ 
- Typed "f" in the search box and hit enter
- In FF 33 (and Chrome), you'll go to http://mirror.racket-lang.org/releases/6.1/doc/search/index.html?q=f but in FF 34 you'll go to http://mirror.racket-lang.org/releases/6.1/doc/search/index.html?q=fundefined
- Note message in console about "RegExp.$2 is undefined"

I haven't yet figured out what behavior exactly changed, but I'm working on it.
This was bug 369778.

Not sure whether this is something we should, or shouldn't, fix, honestly.  Having different behavior for the case fixed there, and for $N, seems undesirable to me.
First, since this breaks a large number of pages that have worked for years, I really hope that this gets fixed.

Second, Chrome appears to behave the same as old FF:

> re = /(?:(ABC)|(123)){2}/;
/(?:(ABC)|(123)){2}/
> var t = re.exec("ABC123");
> var u = re.exec("123ABC");
> t
["ABC123", undefined, "123"]
> u
["123ABC", "ABC", undefined]
So we have this:

  var page_query_string =
    (location.href.search(/\?([^#]+)(?:#|$)/) >= 0) && RegExp.$1;

It's not clear to me why this is using location.href this way, versus the far simpler

  var page_query_string = location.search.substring(1);
(In reply to Sam Tobin-Hochstadt [:samth] from comment #2)
> Second, Chrome appears to behave the same as old FF:
> 
> > re = /(?:(ABC)|(123)){2}/;
> /(?:(ABC)|(123)){2}/
> > var t = re.exec("ABC123");
> > var u = re.exec("123ABC");
> > t
> ["ABC123", undefined, "123"]
> > u
> ["123ABC", "ABC", undefined]

No, that's the new way.  I believe it's what every JS engine except SpiderMonkey has done basically forever.  And that's the way that's standard, that ECMAScript has specified since time immemorial.

The only difference appears to be in the behavior of the non-standard $N properties of RegExp.
Ah, ok, I misunderstood the first comment in bug 369778.

I'm happy to believe that this is old and bad code, and I'm planning to rewrite it at some point (I didn't write it in the first place). 

However, there's a bunch of archived content (old documentation) which uses this code and I'll probably have to tell people to use Chrome for it. Which is annoying, of course, but not the end of the world.
It's possible we'd change RegExp.$N back to the old behavior, to be sure.  But I'd like more evidence than a single site of this being a pervasive problem.  And I suspect this line is probably the same in all the archived versions, such that a modification to it would be easy to make if necessary.
Here's the change to the code I ended up with: https://github.com/plt/racket/commit/51254f06267f

Not too bad, but not trivial.
Whether RegExp.$\d is problematic entirely depends on whether \d refers to a potentially-unmatchable component of the most recently used regular expression.  Without citing any evidence, I suspect it's not the common case for capturing groups in a regular expression to be potentially zero-quantified, or nested within another group that is potentially zero-quantified.  So a mere search for RegExp.$\d unfortunately doesn't say much one way or another.
As a (perhaps final) hope for this to change, I'll mention that in addition to a bunch of extant pages, this code is also present in shipping software and documentation pages that are on users local computers, which we can't change to avoid this incompatibility. So for all of those pages, using another browser is likely to be the only solution.
I'm pretty sure we'll have to change this: trying to book an Air France flight, I found that their form validation code breaks because of this change. As an example of what they're doing, check lines 101 to 106 of http://www.airfrance.com/DE/common/js/pceAfValidation.js.

I cannot really imagine that this isn't pretty common, and making people unable to book flights is a bit awkward.
Given that we've encountered quite a bit of breakage, some in high-profile websites, I think we should change this back. The downside is a loss in conceptual purity, but then it's for a feature that is far away from bein conceptually pure in its entirety.

The attached patch changes just the static getters, nothing else. Passes tests (with the single test requiring changes changed). Try-servering here:
https://treeherder.mozilla.org/ui/#/jobs?repo=try&revision=97875b185e4e
Attachment #8477361 - Flags: review?(jwalden+bmo)
Assignee: nobody → till
Status: NEW → ASSIGNED
Comment on attachment 8477361 [details] [diff] [review]
Change RegExp. getters to return '' instead of undefined for non-matched groups

Review of attachment 8477361 [details] [diff] [review]:
-----------------------------------------------------------------

Sigh.  I hate the web, I hate our non-standard extensions, I hate life.  :-(
Attachment #8477361 - Flags: review?(jwalden+bmo) → review+
https://hg.mozilla.org/mozilla-central/rev/df3b50b23153
Status: ASSIGNED → RESOLVED
Closed: 10 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla34
Keywords: dev-doc-needed
Whiteboard: [DocArea=JS]
Comment on attachment 8477361 [details] [diff] [review]
Change RegExp. getters to return '' instead of undefined for non-matched groups

Approval Request Comment
[Feature/regressing bug #]: bug 369778
[User impact if declined]: various site breakage. E.g., booking flights on airfrance.com is impossible.
[Describe test coverage new/current, TBPL]: tested by jstests
[Risks and why]: very low, reverts to previous behavior
[String/UUID change made/needed]: none
Attachment #8477361 - Flags: approval-mozilla-aurora?
Attachment #8477361 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
Flags: in-testsuite+
You need to log in before you can comment on or make changes to this bug.