Implement ES4's /y option on regular expressions ("sticky" -- start at lastIndex) (js1_7/regexp/yflag.js)

VERIFIED FIXED in mozilla1.9alpha3

Status

()

enhancement
VERIFIED FIXED
13 years ago
12 years ago

People

(Reporter: Pike, Assigned: mrbkap)

Tracking

({dev-doc-complete})

Trunk
mozilla1.9alpha3
Points:
---
Bug Flags:
in-testsuite +

Firefox Tracking Flags

(Not tracked)

Details

()

Attachments

(1 attachment, 1 obsolete attachment)

Writing a parser in js for my pet file format, I use regular expressions to do the analysis of the input string.

It'd be nice if .test and .exec would actually provide a start (and maybe even an end) offset, because right now, I'd either have to provide some ^[.\n]{n} to skip the first n chars I already parsed, or chop of each parsed string fragment at the start of the string, both of which look happily inefficient.

That'd be similar to python's match and search params, http://docs.python.org/lib/re-objects.html.

Do we need to have a .match equivalent to specify what '^foo' is supposed to match in that case? I'm currently using the chopping off alternative, with a BOL for each regexp, using .match with an offset would be the API I really need for parsing.
http://developer.mozilla.org/en/docs/Core_JavaScript_1.5_Reference:Global_Objects:RegExp#Properties

See |lastIndex|, and note that it's present when the RegExp includes the /g flag.
The ES4/JS2 regexp work includes a /y flag ("sticky") that starts at lastIndex and either matches there without searching forward, or fails.

Given /y, I don't see a need for anything more. What am I missin?

/be
And yes, the /y flag is exactly what I need, together with lastIndex.

I didn't manage to find http://developer.mozilla.org/es4/proposals/extend_regexps.html before, too. I recall that we don't intend to enhance JS for 1.9, right?

I'm kinda torn if I should see myself writing the l20n parser in C/C++ or keep it in js, so if we could support /y early, that'd make it easier to not port the parser to C/C++ and still evaluate the performance impact of it. At least without having obvious computational complexity bugs in it.
We should do /y and the other ES4 regexp extensions in 1.9.  Anyone up for it?

/be
The Perl regular expression tests use @- (array of captures starting indices) and @+ (array of captures ending indices). Something similar would be useful in that context.
Posted patch Implement /y, v1 (obsolete) — Splinter Review
I looked, and it seemed like /y was pretty easy to implement quickly, so here's a first go at it.

Some of the other extensions (namely /x) are going to require a bunch more work.
Assignee: general → mrbkap
Status: NEW → ASSIGNED
Attachment #256762 - Flags: review?(brendan)
Comment on attachment 256762 [details] [diff] [review]
Implement /y, v1

Looks ok at a glance, I'm kind of jetlagged still. But why do you mix all metaphors (NOSEARCH, anchored) except the new, mnemonic one (stickY)?

Brian should like this too, or in place of my r+ (I'll look again later today).

/be
Attachment #256762 - Flags: review?(crowder)
Comment on attachment 256762 [details] [diff] [review]
Implement /y, v1

This looks good to me.
Attachment #256762 - Flags: review?(crowder) → review+
I should add that I agree with brendan's feedback on the use of "nosearch" and "anchored".  StickY is better.
(In reply to comment #7)
> (From update of attachment 256762 [details] [diff] [review])
> Looks ok at a glance, I'm kind of jetlagged still. But why do you mix all
> metaphors (NOSEARCH, anchored) except the new, mnemonic one (stickY)?

At first, I wasn't thinking and used "anchored", then I saw on the page you linked to that the property that described this feature was called "noSearch". I now understand that the developer.mozilla.org page is out-of-date. New patch in a jiffy.
This patch doesn't mix its metaphors quite so much.
Attachment #256762 - Attachment is obsolete: true
Attachment #256802 - Flags: superreview?(crowder)
Attachment #256802 - Flags: review?(brendan)
Attachment #256762 - Flags: review?(brendan)
Comment on attachment 256802 [details] [diff] [review]
Implement /y, v2

>-MSG_DEF(JSMSG_NO_INPUT,                59, 3, JSEXN_SYNTAXERR, "no input for /{0}/{1}{2}")
>+MSG_DEF(JSMSG_NO_INPUT,                59, 4, JSEXN_SYNTAXERR, "no input for /{0}/{1}{2}{3}")

Note that 4 parameters aren't enough. I had to change the 4 to a 5 and add a {4} to the end of the error string, but I don't think that's worth a new patch. Currently, we don't append the multiline flag to this error message (which is what confused me).
Attachment #256802 - Flags: superreview?(crowder) → superreview+
I checked this into the trunk (with crowder's review). I'll address any comments that brendan has ex-post facto.
Status: ASSIGNED → RESOLVED
Closed: 13 years ago
Resolution: --- → FIXED
Summary: add start offset to RegExp test and exec methods → Implement ES4's /y option on regular expressions ("sticky" -- start at lastIndex)
Keywords: dev-doc-needed
Target Milestone: --- → mozilla1.9alpha3
Comment on attachment 256802 [details] [diff] [review]
Implement /y, v2

Looks ok to me. Tests in the suite yet?

/be
Attachment #256802 - Flags: review?(brendan) → review+
(In reply to comment #14)

not yet.
/cvsroot/mozilla/js/tests/js1_7/regexp/yflag.js,v  <--  yflag.js
initial revision: 1.1

/cvsroot/mozilla/js/tests/js1_7/regexp/shell.js,v  <--  shell.js
initial revision: 1.1

/cvsroot/mozilla/js/tests/js1_7/regexp/browser.js,v  <--  browser.js
initial revision: 1.1
Flags: in-testsuite+
verified fixed 1.9.0 20070320 win/mac*/linux
Status: RESOLVED → VERIFIED
Summary: Implement ES4's /y option on regular expressions ("sticky" -- start at lastIndex) → Implement ES4's /y option on regular expressions ("sticky" -- start at lastIndex) (js1_7/regexp/yflag.js)
(In reply to comment #18)
http://developer.mozilla.org/en/docs/Core_JavaScript_1.5_Reference:Global_Objects:RegExp

I added an example of how to use /y and expanded the flag's documentation, because I couldn't understand how it worked just from reading it, even after partially following its implementation/functionality here.  It'd be nice if people could double-check that my example is a reasonably idiomatic use case.  :-)


http://developer.mozilla.org/en/docs/Core_JavaScript_1.5_Guide:Regular_Expressions

Maybe I'm blind, but I only see g/i/m on this page at a glance, and a search for 'flags' on the page doesn't mention it either.  I don't really have time to write the necessary prose here, or I'd fix it myself.
You need to log in before you can comment on or make changes to this bug.