Closed Bug 576823 Opened 14 years ago Closed 14 years ago

YARR incorrectly matches later in string with alternate and sticky flag

Categories

(Core :: JavaScript Engine, defect)

x86
macOS
defect
Not set
normal

Tracking

()

RESOLVED FIXED

People

(Reporter: jruderman, Assigned: cdleary)

References

Details

(Keywords: testcase, Whiteboard: fixed-in-tracemonkey)

Attachments

(1 file, 1 obsolete file)

/A|B/y.exec("CB")

YARR: ["B"]
Old: null

I think YARR is correct and the old engine is incorrect.  But I'd like a fix for the old engine so we can do more thorough compare-fuzzing to find bugs in YARR.
Since lastIndex is 0, the old engine is correct. I'll take a look.
Assignee: general → cdleary
Status: NEW → ASSIGNED
Ahh, I missed the part where /y means not only "start at lastIndex" but also "don't match forward".  (As if there were a ^ at the beginning of the regexp and you had sliced the string.)
Summary: Old regexp engine fails to match alternate with sticky flag → YARR incorrectly matches later in string with alternate and sticky flag
Whoops, rookie mistake -- the caret was getting stuck to the LHS of the disjunction. Need to wrap the rest of the expression in (?:) to keep things balanced across disjunctions. This fixes it:

$ ./js -e 'print(/A|B/y.exec("CB"))'
null
Attachment #470057 - Flags: review?(lw)
If we are expecting fakey's chars() to be read, then it seems like building a rope would be strictly slower than doing an apriori concatenation with a Vector<jschar> and js_NewStringFromCharBuffer.  Actually, from a cursory glance over compileHelper, it seems like you might be able to even put the JSString header itself on the stack and manually initialize it.  (This trick is done elsewhere; see js_Atomize.)  The key property is that JSString does not leak into the wild.
(In reply to comment #4)

Alrighty -- I don't think that sticky is really performance critical, but that sounds like an interesting trick to know anyway. :-)
Switched to JSCharBuffer. I may be misunderstanding, but I don't think that the inline trick you're suggesting will work, sine the source gets sucked into the resulting compiled regexp object -- it has to live on the heap. (That way, you can just print out the pattern for a given regexp object without worrying about decompilation.)
Attachment #470057 - Attachment is obsolete: true
Attachment #472013 - Flags: review?(lw)
Attachment #470057 - Flags: review?(lw)
Comment on attachment 472013 [details] [diff] [review]
Fix sticky patterns with disjunctions.

>+    static const jschar prefix[] = {'^', '(', '?', ':'};
>+    static const size_t prefixLen = sizeof(prefix) / sizeof(prefix[0]);
>+    static const jschar postfix[] = {')'};
>+    static const size_t postfixLen = sizeof(postfix) / sizeof(postfix[0]);
>+
>+    JSCharBuffer cb(cx);
>+    if (!cb.reserve(prefixLen + source->length() + postfixLen))
>+        return false;

can just use JS_ARRAY_LENGTH in-line instead.

>+    AutoStringRooter asr(cx, fakeySource);
>     return compileHelper(cx, *fakeySource);

I'm not sure what compilerHelper does with fakeySource, but conservative gc might let you avoid the rooter.
Attachment #472013 - Flags: review?(lw) → review+
http://hg.mozilla.org/mozilla-central/rev/60af58b42567
Status: ASSIGNED → RESOLVED
Closed: 14 years ago
Resolution: --- → FIXED
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: