Closed Bug 1050649 Opened 9 years ago Closed 9 years ago

IonMonkey: Implement RegExpReplace recover instruction

Categories

(Core :: JavaScript Engine: JIT, defect)

defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla35

People

(Reporter: aetf, Assigned: aetf, Mentored)

References

(Blocks 1 open bug)

Details

Attachments

(1 file, 5 obsolete files)

User Agent: Mozilla/5.0 (X11; Linux x86_64; rv:34.0) Gecko/20100101 Firefox/34.0 (Beta/Release)
Build ID: 20140723030202




Expected results:

Implement RegExpReplace in js/src/jit/Recover.cpp.
See Bug 1003801 comment 0 for explanation.
Blocks: 1003801
OS: Linux → All
Hardware: x86_64 → All
Version: Trunk → unspecified
I would like to work on this bug.
Assignee: nobody → 7437103
Mentor: nicolas.b.pierron
Status: UNCONFIRMED → ASSIGNED
Ever confirmed: true
Attached patch bug-105049.patch (obsolete) — Splinter Review
First try. I've read the code and if I'm right, this one doesn't have a side-effect. So quite a simple one.
Attachment #8472995 - Flags: review?(nicolas.b.pierron)
Comment on attachment 8472995 [details] [diff] [review]
bug-105049.patch

Review of attachment 8472995 [details] [diff] [review]:
-----------------------------------------------------------------

This patch sounds good, but since the last time, I became an expert at making crazy test cases with regexps :P


var uceFault_regexp_g_replace = eval(uneval(uceFault).replace('uceFault', 'uceFault_regexp_g_replace'))
function rregexp_g_replace(i) {
    var re = new RegExp("str\\d+" + (i % 10), "g");
    re.test("str00123456789");
    assertEq(re.lastIndex == 0, false);

    var res = "str00123456789".replace(re, "abc");

    // replace will always zero the lastIndex field, even if it was not zero before.
    assertEq(re.lastIndex == 0, true);

    if (uceFault_regexp_g_replace(i) || uceFault_regexp_g_replace(i))
        assertEq(res, "abc");
    return i;
}


So, I agree, that if the reset of the lastIndex was done out-side the str_replace_regexp_raw, then its only side-effect would not be visible.
This might be something to experiment as a follow-up patch ;)
Attachment #8472995 - Flags: review?(nicolas.b.pierron) → feedback+
(In reply to Nicolas B. Pierron [:nbp] {N/A 22-29/08} from comment #3)
I've added the test case as well as a literal version because I'm using the same logic in canRecoverOnBailout.

> bool canRecoverOnBailout() const {
>     // RegExpReplace will zero the lastIndex field when global flag is set.
>     // So we can only remove this if it's non-global or lastIndex is zero before execution.
>     if (pattern()->isRegExp()) {
>         RegExpObject* re = pattern()->toRegExp()->source();
>         return !re->global() || re->getLastIndex().isInt32(0);
>     }
>     return false;
> }

But then the literal test case failed. In gdb, it seems that re->getLastIndex() always returns a zero value. I'm not sure if this has something to do with lazy update and have no idea where to look at...

> So, I agree, that if the reset of the lastIndex was done out-side the
> str_replace_regexp_raw, then its only side-effect would not be visible.
> This might be something to experiment as a follow-up patch ;)

The reset is actually done in StrReplaceRegexpRemove and StrReplaceRegExp. While it's obvious in StrReplaceRegExp, I can't figure out how things are done in StrReplaceRegexpRemove. Is updateLazily that does the reset?
Attached patch What I have done now (obsolete) — Splinter Review
I have been busy for a lot of things, and have not had time to dive into the code yet. Here is what I have done now.
Attachment #8472995 - Attachment is obsolete: true
Well, I still can't get correct lastIndex in canRecoverOnBailout, so here is only the basic detection that test if the global flag is set.
Further optimization would be to check if lastIndex is zero when global flag is set, in which case the side-effect is non-visible too.

By the way, is it necessary to add test cases for other flags?
Attachment #8477962 - Attachment is obsolete: true
Attachment #8481888 - Flags: review?(nicolas.b.pierron)
(In reply to Aetf from comment #6)
> Further optimization would be to check if lastIndex is zero when global flag
> is set, in which case the side-effect is non-visible too.

Or to separate the part which reset the lastIndex from the part which does the replace.  We could add an extra MIR Instruction only for resetting the lastIndex property before doing the replace.

> By the way, is it necessary to add test cases for other flags?

Knowing that we do not want to change the behaviour, and that we might accidentally do so. I think it makes sense to check against other flags as well.
Comment on attachment 8481888 [details] [diff] [review]
Implement recover instruction for RegExpReplace

Review of attachment 8481888 [details] [diff] [review]:
-----------------------------------------------------------------

::: js/src/jit/MIR.h
@@ +6209,5 @@
> +    bool canRecoverOnBailout() const {
> +        // RegExpReplace will zero the lastIndex field when global flag is set.
> +        // So we can only remove this if it's non-global.
> +        if (pattern()->isRegExp()) {
> +            return !pattern()->toRegExp()->source()->global();

Is there any reasons to not use needUpdateLastIndex() ?
(In reply to Nicolas B. Pierron [:nbp] from comment #7)
> We could add an extra MIR Instruction only for resetting the
> lastIndex property before doing the replace.

Well, this could be in a second patch. I'll look at this later, but I would appreciate any information about how to add a MIR Instruction.

> Knowing that we do not want to change the behaviour, and that we might
> accidentally do so. I think it makes sense to check against other flags as
> well.

Test cases added.

(In reply to Nicolas B. Pierron [:nbp] from comment #8)
> Is there any reasons to not use needUpdateLastIndex() ?

Yes, replace only reset lastIndex with global flag set, but not when sticky is set.
(In reply to Aetf from comment #9)
> (In reply to Nicolas B. Pierron [:nbp] from comment #7)
> > We could add an extra MIR Instruction only for resetting the
> > lastIndex property before doing the replace.
> 
> Well, this could be in a second patch. I'll look at this later, but I would
> appreciate any information about how to add a MIR Instruction.

Adding a new MIR instruction implies adding the MInstruction for it, in which case it would be similar to what we have with MSetInitializedLength, except that instead of manipulating an element vector we would be manipulating the regexp object.

Basically you will have to modify MIR.{h,cpp}, Lowering.{h,cpp}, and the equivalent LIR instruction in LIR-Common.h and the visit function use to generate the code for it in CodeGenerator.{h,cpp}.  And probably add a few accessor function to locate the offset of the lastIndex slot on the RegExp object.

And I guess you might have to make a special case for it in the code which flagging the MRegRexpReplace as RecoveredOnBailout in IonAnalysis.cpp, such as you can replace it by this instruction, while flagging the RegExpReplace as recovered on bailout.

> (In reply to Nicolas B. Pierron [:nbp] from comment #8)
> > Is there any reasons to not use needUpdateLastIndex() ?
> 
> Yes, replace only reset lastIndex with global flag set, but not when sticky
> is set.

ok, I thought we always did so when using MRegExpReplace.
Add test cases with all flags.
Attachment #8481888 - Attachment is obsolete: true
Attachment #8481888 - Flags: review?(nicolas.b.pierron)
Attachment #8484220 - Flags: review?(nicolas.b.pierron)
Comment on attachment 8484220 [details] [diff] [review]
Implement recover instruction for RegExpReplace

Review of attachment 8484220 [details] [diff] [review]:
-----------------------------------------------------------------

This looks good :)

::: js/src/jit/MIR.h
@@ +6448,5 @@
> +        // RegExpReplace will zero the lastIndex field when global flag is set.
> +        // So we can only remove this if it's non-global.
> +        if (pattern()->isRegExp()) {
> +            return !pattern()->toRegExp()->source()->global();
> +        }

style-nit: remove the curly brackets.
Attachment #8484220 - Flags: review?(nicolas.b.pierron) → feedback+
removed the curly brackets.
Attachment #8484220 - Attachment is obsolete: true
Attachment #8485521 - Flags: review?(nicolas.b.pierron)
sorry... uploaded a wrong version
Attachment #8485521 - Attachment is obsolete: true
Attachment #8485521 - Flags: review?(nicolas.b.pierron)
Attachment #8485523 - Flags: review?(nicolas.b.pierron)
I'm working on adding new MIR instruction for reset the lastIndex of regexp object. A few questions:

* not sure how to get offset of lastIndex, which is stored in slots, while most other offsetOfXXX are dealing with a field of ObjectImpl. Use getSlotAddress()?

* I've found MSetPropertyPolymorphicV, does that do the same thing?
(In reply to Aetf from comment #15)
> I'm working on adding new MIR instruction for reset the lastIndex of regexp
> object. A few questions:
> 
> * not sure how to get offset of lastIndex, which is stored in slots, while
> most other offsetOfXXX are dealing with a field of ObjectImpl. Use
> getSlotAddress()?

RegExp objects have some reserved slots[1].  A reserved slot is just a slot that we handle by its index inside the object.  We have different kind of slots, we have fixed slots and dynamic slots.  dynamic slots are used when the number of property on the object become larger than a specific threshold (I think this is 12 now).

The good new, is that you probably do not have to use a special offsetOf, but that you can use a MStoreFixedSlot [2] MIR Instruction.

[1] http://dxr.mozilla.org/mozilla-central/source/js/src/vm/RegExpObject.h#337
[2] http://dxr.mozilla.org/mozilla-central/source/js/src/jit/MIR.h?from=MStoreFixedSlot&case=true#8244
Comment on attachment 8485523 [details] [diff] [review]
Implement recover instruction for RegExpReplace

Review of attachment 8485523 [details] [diff] [review]:
-----------------------------------------------------------------

This sounds good. :)
I will send this patch to the try server.
Attachment #8485523 - Flags: review?(nicolas.b.pierron) → review+
(In reply to Aetf from comment #15)
> I'm working on adding new MIR instruction for reset the lastIndex of regexp
> object.

As the current patch will land as soon as the tree is open, do you mind opening a new bug to continue on this modification?

Thus this other bug (feature) would be easier to track.  This will avoid mixing 2 patches which can land separately in different time-frames ;)
Flags: needinfo?(7437103)
https://hg.mozilla.org/mozilla-central/rev/18c0b7cb6a16
Status: ASSIGNED → RESOLVED
Closed: 9 years ago
Flags: in-testsuite+
Resolution: --- → FIXED
Target Milestone: --- → mozilla35
(In reply to Nicolas B. Pierron [:nbp] from comment #20)
> As the current patch will land as soon as the tree is open, do you mind
> opening a new bug to continue on this modification?
> 

Sure.
Flags: needinfo?(7437103)
You need to log in before you can comment on or make changes to this bug.