Closed Bug 773687 Opened 11 years ago Closed 8 years ago

RegExp flag /y (sticky) causes ^ anchor to apply to start of match attempt rather than entire target string


(Core :: JavaScript Engine, defect)

Not set



Tracking Status
firefox44 --- fixed


(Reporter: steves_list, Assigned: arai)


(Blocks 1 open bug)


(Keywords: dev-doc-complete, Whiteboard: [js:p2][DocArea=JS])


(3 files)

Repro steps for Firefox 13.0.1:

var r = /^r/gy;
r.lastIndex = 2;

Result: ['r']

Expected: null

If you remove the /y flag, you do get the correct result (null).

Because flag /y is (as-yet) nonstandard, I suppose I'm arguing for what the correct result should be, based on intuitive semantics and what would actually be useful.

Note that XRegExp <> uses flag /y behind the scenes for various perf optimizations, and shims it via its XRegExp.exec function via an optional 'sticky' argument (which uses native /y in browsers that support it, i.e., Firefox 3+). The described, incorrect behavior is therefore causing bugs in XRegExp.
Maybe I'm missing something, but isn't the exactly the semantics /y was designed for?  For example, if you were using regular expressions to parse header lines in an HTTP request, you could use a /y regex to extract each header line, without having to mess with creating substrings of the entire HTTP request.  If /y doesn't do what's demonstrated here, what's it supposed to do, and what behavior does it actually change?
For the use case you described, you should probably use the /m flag combined with ^ and /y, in which you would get the behavior you expect.

Flag /y should add an *implicit* non-/m ^ anchor that is applied to the beginning of the match attempt. However, an *explicit* (user-specified) ^ in the RegExp pattern should match only at the beginning of the string, or at the beginning of a line if used with /m.
Consider something like 'baaba'.match(/a|^b/gy). This should return ['b','a','a']. However, currently Firefox returns ['b','a','a','b','a'].
Really should have a report on file instead of this, or upstream of this.

Ever confirmed: true
I've added a new bug at
Whiteboard: [js:p2]
But there are no way to test if a string "startsWith" a RegExp at position "position".
The current behaviour of FireFox with "y" flag allows this.
This pages tells:
"This allows the match-only-at-start capabilities of the character "^" to effectively be used at any location in a string by changing the value of the lastIndex property."
Also this page contains an example, where "g" flag should be used instead of "y".
I cannot think, when "y" can be usefull. But why not to extend "startsWith" to work with RegExpes at least?
(In reply to 4esn0k from comment #6)
> I cannot think, when "y" can be usefull.

I'm not sure I understand. The part you quite, and the way you quote it, sounds like it does what you want. Why can it then never be useful?

> But why not to extend "startsWith" to work with RegExpes at least?

The exact behavior of the "sticky" flag is now specified in ES6[1]. If you think that something is seriously wrong with the way it works, you should file a bug against the specification, as mentioned by brendan in comment 4.

The same applies if you want to have other functions extended, String.prototype.startsWith[2] in this case. Note that the specification of that function already contains a note about potentially supporting Regexps in the future.

sorry, i missed, that i need to use /y flag without "^" in pattern.
Assignee: general → nobody
The draft ES6 spec now lays down the rule: ... Even when the y flag is used with a pattern, ^ always matches only at the beginning of Input, or (if Multiline is true) at the beginning of a line.

But FF gets this wrong (tested in FF 32 on MacOSX):

var re = /^foo/y
re.lastIndex = 2

This tests true, but should be false.
(In reply to Erik Corry from comment #9)
> But FF gets this wrong (tested in FF 32 on MacOSX):

Thanks for testing. It still does in current Nightly.
Blocks: es6
OS: Windows 7 → All
Hardware: x86_64 → All
Version: 13 Branch → unspecified
Blocks: 887016
Changed following:
  * In irregexp::CompilePattern, do not prepend nodes for ".*?" pattern when sticky flag is set,
    so pattern match starts directly from specified offset
  * remove hack for sticky flag in following:
    * In RegExpShared::compile do not enclose pattern with "^(?:)"
    * In RegExpShared::execute, do not modify the start of input text, and related other parameters, results
  * In RegExpShared::execute, call HasSubstringAt instead of StringFindPattern if sticky flag is set and pattern is simple string

Green on try run:

Currently ^, \b, and \B assertions are tested.  Do you know any other case that sticky flag bahaves differently between current SpiderMonkey and the spec?
Assignee: nobody → arai.unmht
Attachment #8663358 - Flags: review?(till)
Thank you Erik :)
I tested with latest revision:
and looks like the test itself has 2 bugs in lastIndex after match (or, maybe the spec changed after that?).
Except them, patched SpiderMonkey passed it, and I think the test in my patch covers almost cases :)
wow, it was old repository :P

anyway it's not changed:
Comment on attachment 8663358 [details] [diff] [review]
Fix assertion pattern in RegExp with sticky flag.

Review of attachment 8663358 [details] [diff] [review]:

Very nice, thank you. r=me with feedback addressed.

::: js/src/jsstr.h
@@ +218,5 @@
>  extern int
>  StringFindPattern(JSLinearString* text, JSLinearString* pat, size_t start);
> +/* Return true if the string contains a pattern at start. */

s/start/|start|/ to make clear that the value of that parameter is meant, not the start of the string.

::: js/src/tests/ecma_6/RegExp/sticky.js
@@ +2,5 @@
> +var summary = 'sticky flag should not break assertion behavior.';
> +
> +print(BUGNUMBER + ": " + summary);
> +
> +function test(re, text, results) {

I think I would call |results| |expectations|, and the |ret| member |matches|. And rename |m| to |match|. However, all that is utterly unimportant, so feel free to ignore.

@@ +3,5 @@
> +
> +print(BUGNUMBER + ": " + summary);
> +
> +function test(re, text, results) {
> +  // sanity check for test data itself

Nit: full-sentence comments should start with an upper-case letter and end with a "."

@@ +18,5 @@
> +    } else {
> +      assertEq(re.lastIndex, result.lastIndex);
> +      assertEq(m !== null, true);
> +      assertEq(m.length, result.ret.length);
> +      for (var j = 0; j < result.ret.length; j++) {

Nit: no braces needed for single-line body.

::: js/src/vm/RegExpObject.cpp
@@ +666,5 @@
>      if (canStringMatch) {
>          MOZ_ASSERT(pairCount() == 1);
> +        size_t sourceLength = source->length();
> +        if (sticky()) {
> +            if (sourceLength + start < sourceLength || sourceLength + start > length)

I don't think the first part of this condition can ever hold: |start| is unsigned, and this boils down to `start < 0`. Simply removing it should be fine, IINM.

@@ +753,5 @@
>          result = irregexp::InterpretCode(cx, byteCode, chars, start, length, matches);
>      }
> +    if (result == RegExpRunStatus_Success && matches)
> +        matches->checkAgainst(length);

I think `MatchPairs::displace` is dead after this removal. If I'm right, please also remove the definition.
Attachment #8663358 - Flags: review?(till) → review+
(In reply to Tooru Fujisawa [:arai] from comment #11)
> Currently ^, \b, and \B assertions are tested.  Do you know any other case
> that sticky flag bahaves differently between current SpiderMonkey and the
> spec?

Forgot to comment on this: no, I can't think of more cases either.

And thanks, Erik, for the tests!
Comment on attachment 8663358 [details] [diff] [review]
Fix assertion pattern in RegExp with sticky flag.

Review of attachment 8663358 [details] [diff] [review]:

Good point, I think V8 has a bug there (and the test) where it is not writing back lastIndex for non-global sticky regexps.
Closed: 8 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla44
Keywords: dev-doc-needed
Whiteboard: [js:p2] → [js:p2][DocArea=JS]
Flags: needinfo?(supriyantomaftuh)
Flags: needinfo?(supriyantomaftuh)
You need to log in before you can comment on or make changes to this bug.