Closed
Bug 1038592
Opened 10 years ago
Closed 10 years ago
IonMonkey: Implement RegExpTest Recover Instruction
Categories
(Core :: JavaScript Engine: JIT, defect)
Core
JavaScript Engine: JIT
Tracking
()
RESOLVED
FIXED
mozilla34
People
(Reporter: nbp, Assigned: aetf, Mentored)
References
(Blocks 1 open bug)
Details
(Whiteboard: [good first bug][lang=c++])
Attachments
(2 files, 5 obsolete files)
2.50 KB,
patch
|
nbp
:
review+
bhackett1024
:
feedback+
|
Details | Diff | Splinter Review |
8.66 KB,
patch
|
nbp
:
review+
|
Details | Diff | Splinter Review |
Implement RRegExpTest in js/src/jit/Recover.cpp.
See Bug 1003801 comment 0 for explanation.
Comment 1•10 years ago
|
||
I would like to work on this bug.
Reporter | ||
Comment 2•10 years ago
|
||
(In reply to Paali Tandia from comment #1)
> I would like to work on this bug.
You already work on 3 recover instruction, maybe you might want to try other part of the JavaScript engine[1] and leave these recover instruction for other new people who are looking for good first bugs?
Is there any other bug which interest you?
[1] https://bugzilla.mozilla.org/buglist.cgi?cmdtype=dorem&remaction=run&namedcmd=JS%20Mentored%20Bugs&sharer_id=422187
Comment 3•10 years ago
|
||
No problem! Thank you for the link.
I would like to work on this bug, but I need some help now.
To my understand, the following things should be done:
* Add writeRecoverData and canRecoverOnBailout method in MRegExpTest class in MIR.h
* Implement MRegExpTest::writeRecoverData in Recover.cpp
* Add RRegExpTest class in Recover.h
* Implement RRegExpTest class in Recover.cpp
* Add tests
However, I have no idea where to find the function that actually does the work. I suppose it to be something in js/src/irregexp but I'm not sure. Here is what I've already written:
::: js/src/jit/Recover.cpp
@@ -885,6 +885,32 @@
> +bool
> +RRegExpTest::recover(JSContext *cx, SnapshotIterator &iter) const
> +{
> + RootedString string(cx, iter.read().toString());
> + RootedObject regexp(cx, &iter.read().toObject());
> + RootedValue result(cx);
> +
> + // actually do regexptest and check return value
> +
> + iter.storeInstructionResult(result);
> + return true;
> +}
Reporter | ||
Comment 5•10 years ago
|
||
(In reply to Aetf from comment #4)
> I would like to work on this bug, but I need some help now.
Welcome,
So far your summary sounds correct.
> However, I have no idea where to find the function that actually does the
> work. I suppose it to be something in js/src/irregexp but I'm not sure. Here
> is what I've already written:
Have a look at the code that we generate with visitRegExpTest [1]. This should help you figure out what to put in the recover function.
[1] http://dxr.mozilla.org/mozilla-central/search?q=visitRegExpTest&case=true
Assignee: nobody → 7437103
(In reply to Nicolas B. Pierron [:nbp] from comment #5)
> Have a look at the code that we generate with visitRegExpTest [1]. This
> should help you figure out what to put in the recover function.
>
> [1] http://dxr.mozilla.org/mozilla-central/search?q=visitRegExpTest&case=true
Thank you for the link. I know what to do now :-)
OK, here is the patch. Please let me know if there is any thing wrong.
Reporter | ||
Comment 8•10 years ago
|
||
Comment on attachment 8462999 [details] [diff] [review]
regexptest.patch
Review of attachment 8462999 [details] [diff] [review]:
-----------------------------------------------------------------
This patch looks good, except that the regexp.prototype.test function might have a side-effect on the RegExp object in case of sticky regular expressions. The side-effect is that the lastIndex of the RegExp object is mutated and is observable. So the following test case should fail with this patch because the canRecoverOnBailout always returns true.
[1] https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/RegExp
::: js/src/jit-test/tests/ion/dce-with-rinstructions.js
@@ +572,5 @@
> + if (uceFault_regexp_test(i) || uceFault_regexp_test(i)) {
> + assertEq(res, true);
> + }
> + return i;
> +}
Add the following test case too, and modify your code to handle it properly.
var uceFault_regexp_sticky_test = eval(uneval(uceFault).replace('uceFault', 'uceFault_regexp_sticky_test'))
function rregexp_sticky_test(i) {
var re = new RegExp("str\\d+" + (i % 10), "y");
var res = re.test("str0123456789");
if (uceFault_regexp_sticky_test(i) || uceFault_regexp_sticky_test(i)) {
assertEq(res, true);
}
assertEq(re.lastIndex == 0, false);
return i;
}
::: js/src/jit/Recover.cpp
@@ +903,5 @@
> + bool result;
> +
> + if (!js::regexp_test_raw(cx, regexp, string, &result)) {
> + return false;
> + }
style-nit: no curly braces for single lines.
Attachment #8462999 -
Flags: feedback+
Sorry... I forgot to set the flag.
Attachment #8463299 -
Flags: review?(nicolas.b.pierron)
Assignee | ||
Comment 10•10 years ago
|
||
Well, When I checked the last time, there was no new comment. So I thought it may because the flag wasn't set correctly. Can the duplicated patch be deleted?
Anyway, I will work on the sticky case now. One more question, is the canRecoverOnBailout method that should be modified? I'm still studying about the bailout, and have not gained a clear view about the procedure.
Reporter | ||
Updated•10 years ago
|
Attachment #8463299 -
Attachment is obsolete: true
Attachment #8463299 -
Flags: review?(nicolas.b.pierron)
Reporter | ||
Comment 11•10 years ago
|
||
(In reply to Aetf from comment #10)
> Anyway, I will work on the sticky case now. One more question, is the
> canRecoverOnBailout method that should be modified? I'm still studying about
> the bailout, and have not gained a clear view about the procedure.
Have you read the blog post in Bug 1003801 comment 12, does it help?
Assignee | ||
Comment 12•10 years ago
|
||
(In reply to Nicolas B. Pierron [:nbp] from comment #11)
> Have you read the blog post in Bug 1003801 comment 12, does it help?
Yes, I've read that and now I think I finally get the point.
If canRecoverOnBailout is true and hasLiveDefUses is false, the instruction will be removed and won't be executed unless in a bailout. However, hasLiveDefUses failed to capture all possible uses. The regexptest may have a side-effect on lastIndex which will cause implicit uses. Thus in case hasLiveDefUse is false but later lastIndex is used, the generated code is incorrect.
Not a solution yet but I'm working on two possible modifications
* modify canRecoverOnBailout to return false when its regexp is a sticky one
* modify the way instruction uses is added making it capture implicit uses on lastIndex.
And some other problems.
(In reply to Nicolas B. Pierron [:nbp] from comment #8)
> var re = new RegExp("str\\d+" + (i % 10), "y");
> var res = re.test("str0123456789");
> if (uceFault_regexp_sticky_test(i) || uceFault_regexp_sticky_test(i)) {
> assertEq(res, true);
> }
> assertEq(re.lastIndex == 0, false);
This test case will fail when i%10 == 0 because /str\d+0/ won't match "str0123456789", and thus re.lastIndex equals 0. Did you mean something like "str00123456789" that will be matched regardless of the value of i?
offthread compile also seems influencing the result.
with --ion-offthread-compile=on
* rregexp_test -> passed
* rregexp_sticky_test(modified) -> passed (unexpected)
with --ion-offthread-compile=off
* rregexp_test -> segment fault (unexpected)
* rregexp_sticky_test(modified) -> failed at i == 20 (expected)
Reporter | ||
Comment 13•10 years ago
|
||
(In reply to Aetf from comment #12)
> Not a solution yet but I'm working on two possible modifications
> * modify canRecoverOnBailout to return false when its regexp is a sticky one
This sounds like the easiest for now, so focus on this one ;)
> * modify the way instruction uses is added making it capture implicit uses
> on lastIndex.
I don't think this is a good solution, as we want to add the ability to remove branches which are infrequently used. In which case, this would be a blocker.
> And some other problems.
>
> (In reply to Nicolas B. Pierron [:nbp] from comment #8)
> > var re = new RegExp("str\\d+" + (i % 10), "y");
> > var res = re.test("str0123456789");
> > if (uceFault_regexp_sticky_test(i) || uceFault_regexp_sticky_test(i)) {
> > assertEq(res, true);
> > }
> > assertEq(re.lastIndex == 0, false);
>
> This test case will fail when i%10 == 0 because /str\d+0/ won't match
> "str0123456789", and thus re.lastIndex equals 0. Did you mean something like
> "str00123456789" that will be matched regardless of the value of i?
oh, or change the \\d+ by \\d*.
> offthread compile also seems influencing the result.
> with --ion-offthread-compile=on
> * rregexp_test -> passed
> * rregexp_sticky_test(modified) -> passed (unexpected)
This is not "unexpected" but non-deterministic, as this depends on how fast IonMonkey compiler thread is able to compile compared to the remaining time for the execution. This is why we are mostly running test with "--ion-offthread-compile=off", to ensure that Ion's result is available before the end of the run.
Assignee | ||
Comment 14•10 years ago
|
||
Still can't find a way to detect the sticky flag precisely in every case. Obtaining the regexp object in MInstruction seems impossible, since the instruction have not be actually executed yet.
So here is a more conventional solution. canRecoverOnBailout only returns true if we can make sure that sticky flag is not set.
> bool canRecoverOnBailout() const {
> bool canRecover = false;
> if (regexp()->op() == Op_RegExp) {
> MRegExp *mre = (MRegExp*) regexp();
> canRecover = !mre->source()->sticky();
> }
> return canRecover;
> }
This should pass the sticky test case as well as this additional test case in which the regexp object is created through a literal notation.
> function rregexp_sticky_literal_test(i) {
> var re = function() { return /str\d*0/y; } ()
> var res = re.test("str00123456789");
> if (uceFault_regexp_sticky_test(i) || uceFault_regexp_sticky_test(i)) {
> assertEq(res, true);
> }
>
> assertEq(re.lastIndex == 0, false);
> return i;
> }
Attachment #8462999 -
Attachment is obsolete: true
Attachment #8466672 -
Flags: review?(nicolas.b.pierron)
Reporter | ||
Comment 15•10 years ago
|
||
Comment on attachment 8466672 [details] [diff] [review]
regexptest-v2.patch
Review of attachment 8466672 [details] [diff] [review]:
-----------------------------------------------------------------
This sounds like a good start for recovering MRegExpTest. :)
Still few a few comments.
::: js/src/jit-test/tests/ion/dce-with-rinstructions.js
@@ +588,5 @@
> +}
> +
> +var uceFault_regexp_sticky_literal_test = eval(uneval(uceFault).replace('uceFault', 'uceFault_regexp_sticky_literal_test'))
> +function rregexp_sticky_literal_test(i) {
> + var re = function() { return /str\d*0/y; } ()
If the goal is to test with a literal, why having a function wrapping it?
::: js/src/jit/MIR.h
@@ +5601,5 @@
> + bool writeRecoverData(CompactBufferWriter &writer) const;
> + bool canRecoverOnBailout() const {
> + bool canRecover = false;
> + if (regexp()->op() == Op_RegExp) {
> + MRegExp *mre = (MRegExp*) regexp();
use isRegExp() and toRegExp()
Attachment #8466672 -
Flags: review?(nicolas.b.pierron) → feedback+
Assignee | ||
Comment 16•10 years ago
|
||
(In reply to Nicolas B. Pierron [:nbp] from comment #15)
> If the goal is to test with a literal, why having a function wrapping it?
Oh I wrote that to see if regexptest can still get a constant regexp even if the literal has been passed around somehow. And yes, wrapping function isn't necessary. Removed.
> use isRegExp() and toRegExp()
Modified.
Attachment #8466672 -
Attachment is obsolete: true
Attachment #8467110 -
Flags: review?(nicolas.b.pierron)
Reporter | ||
Comment 17•10 years ago
|
||
Comment on attachment 8467110 [details] [diff] [review]
regexptest-v2.patch
Review of attachment 8467110 [details] [diff] [review]:
-----------------------------------------------------------------
Ok, One last iteration and the next one would be good ;)
::: js/src/jit-test/tests/ion/dce-with-rinstructions.js
@@ +590,5 @@
> +var uceFault_regexp_sticky_literal_test = eval(uneval(uceFault).replace('uceFault', 'uceFault_regexp_sticky_literal_test'))
> +function rregexp_sticky_literal_test(i) {
> + var re = /str\d*0/y;
> + var res = re.test("str00123456789");
> + if (uceFault_regexp_sticky_test(i) || uceFault_regexp_sticky_test(i)) {
s/uceFault_regexp_sticky_test/uceFault_regexp_sticky_literal_test/g
::: js/src/jit/MIR.h
@@ +5603,5 @@
> + bool canRecover = false;
> + if (regexp()->isRegExp()) {
> + MRegExp *mre = regexp()->toRegExp();
> + canRecover = !mre->source()->sticky();
> + }
nit:
if (regexp()->isRegExp())
return !regexp()->toRegExp()->source()->sticky();
return false;
There is no need for mre, nor the extra boolean.
::: js/src/jit/Recover.cpp
@@ +906,5 @@
> + return false;
> +
> + RootedValue rootedres(cx);
> + rootedres.setBoolean(result);
> + iter.storeInstructionResult(rootedres);
style-nit: s/rootedres/res/
Rooted is already mentioned as part of the variable definition.
Attachment #8467110 -
Flags: review?(nicolas.b.pierron) → feedback+
Assignee | ||
Comment 18•10 years ago
|
||
(In reply to Nicolas B. Pierron [:nbp] from comment #17)
> s/uceFault_regexp_sticky_test/uceFault_regexp_sticky_literal_test/g
Oops, I should be more careful ;P
> There is no need for mre, nor the extra boolean.
Modified.
> style-nit: s/rootedres/res/
>
> Rooted is already mentioned as part of the variable definition.
I tried to make the raw result and the rooted value be more distinguishable. Now I actually make the following modification which is more consistent to other instructions' naming convension (result and resultObject).
::: js/src/jit/Recover.cpp
@@ +904,8 @@
> + bool resultBool;
> +
> + if (!js::regexp_test_raw(cx, regexp, string, &resultBool))
> + return false;
> +
> + RootedValue result(cx);
> + result.setBoolean(resultBool);
> + iter.storeInstructionResult(result);
Attachment #8467110 -
Attachment is obsolete: true
Attachment #8467619 -
Flags: review?(nicolas.b.pierron)
Reporter | ||
Comment 19•10 years ago
|
||
Comment on attachment 8467619 [details] [diff] [review]
regexptest-v2.patch
Review of attachment 8467619 [details] [diff] [review]:
-----------------------------------------------------------------
Awesome work, but I am sorry for this last iteration.
I recall that the global flag was also used for setting a non-null lastIndex, so we also need to disable the recover instruction if the regexp has a global flag. The code also suggests the same thing, that the lastIndex might be non-zero if we have either a global or a sticky flag. Hopefully there is only one location which does not zero the lastIndex and the guards are clear.
::: js/src/jit-test/tests/ion/dce-with-rinstructions.js
@@ +594,5 @@
> + if (uceFault_regexp_sticky_literal_test(i) || uceFault_regexp_sticky_literal_test(i)) {
> + assertEq(res, true);
> + }
> +
> + assertEq(re.lastIndex == 0, false);
Add similar test cases for other RegExpObject flags [1], at least to ensure that there is no more cases that we are missing, make sure the test cases are working well with --no-ion, and well with --ion-offthread-compile=off.
[1] http://dxr.mozilla.org/mozilla-central/source/js/src/vm/RegExpObject.h#355
::: js/src/jit/MIR.h
@@ +5702,5 @@
> +
> + bool writeRecoverData(CompactBufferWriter &writer) const;
> + bool canRecoverOnBailout() const {
> + if (regexp()->isRegExp())
> + return !regexp()->toRegExp()->source()->sticky();
Apparently we also need to take the global flag [2] into account. Also add a comment at both locations such that we keep this code in sync, or better, make an accessor to check if we need to update the lastIndex to a non-zero value.
[2] http://dxr.mozilla.org/mozilla-central/source/js/src/builtin/RegExp.cpp#590
Attachment #8467619 -
Flags: review?(nicolas.b.pierron) → feedback+
Assignee | ||
Comment 20•10 years ago
|
||
(In reply to Nicolas B. Pierron [:nbp] from comment #19)
> Awesome work, but I am sorry for this last iteration.
I'm happy to do more and learn more ;)
> Add similar test cases for other RegExpObject flags
I'm also thinking of changing these test cases' name to their corresponding flag to make things shorter.
s/uceFault_regexp_sticky_test/uceFault_regexp_y_test/g
s/uceFault_regexp_sticky_literal_test/uceFault_regexp_y_literal_test/g
Would this be OK?
> make an accessor to check if we need to update the lastIndex to a non-zero value.
Good idea to keep things sync, but a few question though.
* Where to add, in RegExpObject or RegExpShared or both?
* It seems that inlined steps 6, 7, 9a in RegExp.cpp [1] should also use the accessor. But I'm not sure.
[1] http://dxr.mozilla.org/mozilla-central/source/js/src/builtin/RegExp.cpp#563
BTW, I find a typo in Recover.cpp. The undef should be MATCH_OPCODES_
::: js/src/jit/Recover.cpp
@@ 40,9 @@
> # define MATCH_OPCODES_(op) \
> case Recover_##op: \
> static_assert(sizeof(R##op) <= sizeof(RInstructionStorage), \
> "Storage space is too small to decode R" #op " instruction s."); \
> new (raw->addr()) R##op(reader); \
> break;
>
> RECOVER_OPCODE_LIST(MATCH_OPCODES_)
> # undef DEFINE_OPCODES_
Should I open a new bug for this or just modify it in my patch?
Reporter | ||
Comment 21•10 years ago
|
||
(In reply to Aetf from comment #20)
> > Add similar test cases for other RegExpObject flags
>
> I'm also thinking of changing these test cases' name to their corresponding
> flag to make things shorter.
> s/uceFault_regexp_sticky_test/uceFault_regexp_y_test/g
> s/uceFault_regexp_sticky_literal_test/uceFault_regexp_y_literal_test/g
>
> Would this be OK?
Sure.
> > make an accessor to check if we need to update the lastIndex to a non-zero value.
>
> Good idea to keep things sync, but a few question though.
> * Where to add, in RegExpObject or RegExpShared or both?
I tihnk the RegExpObject looks good as, this is where are the flags.
> * It seems that inlined steps 6, 7, 9a in RegExp.cpp [1] should also use the
> accessor. But I'm not sure.
>
> [1]
> http://dxr.mozilla.org/mozilla-central/source/js/src/builtin/RegExp.cpp#563
I agree, but then the topic is too different than the original patch. So I suggest to do a second patch before this one, just to add this accessor.
> BTW, I find a typo in Recover.cpp. The undef should be MATCH_OPCODES_
>
> ::: js/src/jit/Recover.cpp
> @@ 40,9 @@
> > # define MATCH_OPCODES_(op) \
> > case Recover_##op: \
> > static_assert(sizeof(R##op) <= sizeof(RInstructionStorage), \
> > "Storage space is too small to decode R" #op " instruction s."); \
> > new (raw->addr()) R##op(reader); \
> > break;
> >
> > RECOVER_OPCODE_LIST(MATCH_OPCODES_)
> > # undef DEFINE_OPCODES_
>
> Should I open a new bug for this or just modify it in my patch?
Fell free to open a new bug ;)
Otherwise, when you will have commit access, for such typo (which has no consequences) you can just ask for a review on IRC and push the patch.
Assignee | ||
Comment 22•10 years ago
|
||
Attachment #8469087 -
Flags: review?(nicolas.b.pierron)
Attachment #8469087 -
Attachment description: [part1] Add needUpdateLastIndex accessor in RegExpObject → [part 1] Add needUpdateLastIndex accessor in RegExpObject
Assignee | ||
Comment 23•10 years ago
|
||
Done!
I've created two patches using Mercurial Queues, hope they are in right format.
Attachment #8467619 -
Attachment is obsolete: true
Attachment #8469090 -
Flags: review?(nicolas.b.pierron)
Reporter | ||
Comment 24•10 years ago
|
||
Comment on attachment 8469087 [details] [diff] [review]
[part 1] Add needUpdateLastIndex accessor in RegExpObject
Review of attachment 8469087 [details] [diff] [review]:
-----------------------------------------------------------------
This looks good to me, but I admit I would be more confident if we could add a "const" in front of the flags field of the RegExpShared. But this can be done as part of a different bug ;)
Attachment #8469087 -
Flags: review?(nicolas.b.pierron)
Attachment #8469087 -
Flags: review+
Attachment #8469087 -
Flags: feedback?(bhackett1024)
Reporter | ||
Comment 25•10 years ago
|
||
Comment on attachment 8469090 [details] [diff] [review]
[part 2] Implement RegExpTest Recover Instruction
Review of attachment 8469090 [details] [diff] [review]:
-----------------------------------------------------------------
Awesome work!
So the next step are the following, I will pull your patches and send them to the try-server. The try server is used for verifying patches before landing the changes into the code base which is used by everybody.
Congratulation :)
Would you be interested on continuing on another recover instruction, such as RStringReplace and RRegExpReplace, or you have something else in mind?
::: js/src/jit/MIR.h
@@ +5703,5 @@
> + bool writeRecoverData(CompactBufferWriter &writer) const;
> + bool canRecoverOnBailout() const {
> + /* RegExpTest has a side-effect on the regexp object's lastIndex
> + * when sticky or global flags are set.
> + * Return false unless we are sure it's not the case. */
style-nit: there is a trailing white space, I'll fix it locally before pushing your patches ;)
Attachment #8469090 -
Flags: review?(nicolas.b.pierron) → review+
Reporter | ||
Comment 26•10 years ago
|
||
I've sent the patches to our try server:
> You can view the progress of your build at the following URL:
> https://tbpl.mozilla.org/?tree=Try&rev=3db011825c9f
> Alternatively, view them on Treeherder (experimental):
> https://treeherder.mozilla.org/ui/#/jobs?repo=try&revision=3db011825c9f
Assignee | ||
Comment 27•10 years ago
|
||
(In reply to Nicolas B. Pierron [:nbp] from comment #25)
> So the next step are the following, I will pull your patches and send them
> to the try-server. The try server is used for verifying patches before
> landing the changes into the code base which is used by everybody.
Thank you :)
> Would you be interested on continuing on another recover instruction, such
> as RStringReplace and RRegExpReplace, or you have something else in mind?
I'd like to continue on another recover instruction. Actually I just finished a project on my compiler course last month. So it's interesting to see how those things are used in the real world :)
> style-nit: there is a trailing white space, I'll fix it locally before
> pushing your patches ;)
;)
Comment 28•10 years ago
|
||
Comment on attachment 8469087 [details] [diff] [review]
[part 1] Add needUpdateLastIndex accessor in RegExpObject
Review of attachment 8469087 [details] [diff] [review]:
-----------------------------------------------------------------
::: js/src/builtin/RegExp.cpp
@@ +559,5 @@
> if (!ToInteger(cx, lastIndex, &d))
> return RegExpRunStatus_Error;
>
> /* Inlined steps 6, 7, 9a with doubles to detect failure case. */
> + if ((reobj->needUpdateLastIndex()) && (d < 0 || d > length)) {
Unnecessary () around the call.
Attachment #8469087 -
Flags: feedback?(bhackett1024) → feedback+
Reporter | ||
Comment 29•10 years ago
|
||
https://hg.mozilla.org/integration/mozilla-inbound/rev/d538719acebe
https://hg.mozilla.org/integration/mozilla-inbound/rev/d591725fbf82
These are the changeset corresponding to your patches, you can see our continuous integration processing them [1]. Later today a sheriff will merge these changes in mozilla-central, and these modifications would be available in the next nightly of Firefox, and this change would be release to all Firefox users in Firefox 34 ;)
Congratulation!
[1] https://treeherder.mozilla.org/ui/#/jobs?repo=mozilla-inbound
Comment 30•10 years ago
|
||
https://hg.mozilla.org/mozilla-central/rev/d538719acebe
https://hg.mozilla.org/mozilla-central/rev/d591725fbf82
Status: NEW → RESOLVED
Closed: 10 years ago
Flags: in-testsuite+
Resolution: --- → FIXED
Target Milestone: --- → mozilla34
You need to log in
before you can comment on or make changes to this bug.
Description
•