Closed
Bug 977966
Opened 10 years ago
Closed 10 years ago
Optimize String#split('foo').join('bar') pattern (IonMonkey part)
Categories
(Core :: JavaScript Engine: JIT, defect)
Core
JavaScript Engine: JIT
Tracking
()
RESOLVED
FIXED
mozilla34
People
(Reporter: till, Assigned: vash, Mentored)
References
Details
(Whiteboard: [lang=c++][js:p2])
Attachments
(4 files, 20 obsolete files)
3.55 KB,
patch
|
nbp
:
review+
|
Details | Diff | Splinter Review |
25.46 KB,
patch
|
vash
:
review+
|
Details | Diff | Splinter Review |
9.11 KB,
patch
|
till
:
review+
|
Details | Diff | Splinter Review |
1.97 KB,
patch
|
till
:
review+
|
Details | Diff | Splinter Review |
Splitting a string on one string pattern and joining it using another is a very common practice. I wonder if it might be possible to check if both split and join are the original methods, and the argument to split is a string, and optimize the entire pattern to get rid of the temporary array creation entirely.
Comment 1•10 years ago
|
||
(In reply to Till Schneidereit [:till] (on vacation March 1st to 23rd) from comment #0) > check if both split and join are the original methods Yes, this would be in js/src/jit/MCallOptimize.cpp (we already have MStringSplit, we would need to add MArrayJoin), and then in a later optimization, we can fold the 2 into one MStringReplace instruction. Note that such optimization will only work with, str.split("foo").join("bar") but not with var x = str.split("foo"); … x.join("bar"); because we might have other instructions capturing the state of x with an MResumePoints, and this kind of optimization cannot be done before Bug 878503. So, when doing the replacement of these instructions, we would need to check that there is only one MUse of the result of the split instruction.
Whiteboard: [mentor=nbp][lang=c++]
Reporter | ||
Comment 2•10 years ago
|
||
(In reply to Nicolas B. Pierron [:nbp] from comment #1) > Note that such optimization will only work with, > > str.split("foo").join("bar") > > but not with > > var x = str.split("foo"); > … > x.join("bar"); > > because we might have other instructions capturing the state of x with an > MResumePoints, and this kind of optimization cannot be done before Bug > 878503. > > So, when doing the replacement of these instructions, we would need to check > that there is only one MUse of the result of the split instruction. Yeah, that's what I thought. Still, the exact pattern of using split and join chained together is very popular, so this should give some nice wins in real-world code.
Updated•10 years ago
|
Whiteboard: [mentor=nbp][lang=c++] → [mentor=nbp][lang=c++][js:p2]
Comment 3•10 years ago
|
||
can you assign me the bug please ?
Comment 4•10 years ago
|
||
hi Aditya: please feel free to work on this bug or ask questions. Assigning a bug to a person is just a formality. I see you have volunteered to fix four other bugs yesterday. I recommend focusing on just one first bug at a time. :)
Hi, I would like to work on this bug but I'm completely new to Mozilla development. I have completed setting up mozilla environment, so can you tell me how to go about it? Thank you.
Comment 6•10 years ago
|
||
(In reply to Debkanya Mazumder from comment #5) > Hi, I would like to work on this bug but I'm completely new to Mozilla > development. I have completed setting up mozilla environment, so can you > tell me how to go about it? Thank you. Hi Debkanya, I see that you ask to be mentored on old bugs, I suggest you focus on this one for now. This bug should be easy as soon as you understand how MStringSplit and MStringReplace are working. I suggest you to work as follow: 1/ Make a first patch to implement MArrayJoin. 2/ Make a second patch to replace MStringSplit & MArrayJoin by MStringReplace.
Assignee: general → rimjhim.mazumder17
Status: NEW → ASSIGNED
Hi Nicolas, Thank you. I intend to work on this bug for now, and I appreciate all the help since I am completely new. I will start working on the bug and get back as soon as possible.
Reporter | ||
Comment 8•10 years ago
|
||
Hey Debkanya, are you still working on this? If so, can you give us an update on where you are, or, if you're stuck, what's giving you problems? We'd gladly support you with feedback and suggestions. You can also join us on IRC (https://wiki.mozilla.org/IRC) in the #jsapi channel, where we can discuss things in realtime.
Flags: needinfo?(rimjhim.mazumder17)
I have a doubt which I'd like to be answered. Is it necessary to implement StringReplace using MArrayJoin? Can it be implemented using the following procedure? Considering all the "foo" in str is to be replaced by "/". string str="barfoobarfooulz"; string str2="foo"; found=str.find(str2); [This returns all the position of "foo" in str] str.replace(found,str2.length(),"/"); [replaces "foo" with "/"] which gives the result bar/bar/ulz. Can this MStringReplace be implemented using the above method?
Flags: needinfo?(rimjhim.mazumder17)
Comment 10•10 years ago
|
||
(In reply to Debkanya Mazumder from comment #9) > I have a doubt which I'd like to be answered. Is it necessary to implement > StringReplace using MArrayJoin? > Can it be implemented using the following procedure? > > Considering all the "foo" in str is to be replaced by "/". > > string str="barfoobarfooulz"; > string str2="foo"; > > found=str.find(str2); [This returns all the position of "foo" in str] > > str.replace(found,str2.length(),"/"); [replaces "foo" with "/"] > > which gives the result bar/bar/ulz. > > Can this MStringReplace be implemented using the above method? Hi Debkanya, you could give it a try if you want, maybe using a self-hosted implementation (i.e. written in JS, see also [1]), but the performance of our RegExp engine is such that I doubt it will be a win and would rather implement the approach suggested by nbp in comment 6. Ideally, we'd try both techniques, compare them when being run on micro-benchmarks (and / or tracked benchmarks, as shown in [2]), and would choose the best among these. In practice, implementing both of them will take much time, while implementing directly the suggested approach will surely be a win. [1] https://developer.mozilla.org/en-US/docs/SpiderMonkey/Internals/self-hosting [2] http://arewefastyet.com/
Reporter | ||
Comment 11•10 years ago
|
||
What we want here is a method that is substantially faster than what we have right now. The method you're proposing implements the split().join() pattern in terms of other language features, but, as far as I can tell, in a way that would be faster than what we have now. Note also that String#replace doesn't work the way your example would need it to work: it takes the string to search for (or a regular expression) as its first argument and the string to replace it with as the second: https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/String/replace More importantly, we already have an implementation of MStringReplace, so the task here is to identify the pattern "MStringSplit, immediately followed by MArrayJoin", and completely replace that with MStringReplace. Implementing MArrayJoin is a good first step for that. (And useful in itself, I think.)
Reporter | ||
Comment 12•10 years ago
|
||
Ah, I didn't see bbouvier's comment before posting mine. I still don't think that it makes sense to implement anything else but the technique suggested by nbp.
Comment 13•10 years ago
|
||
Okay, thank you!
Comment 14•10 years ago
|
||
Is the use of vector header file from standard library allowed when implementing MArrayJoin in MCallOptimize.cpp?
Comment 15•10 years ago
|
||
Attachment #8413341 -
Flags: review?(nicolas.b.pierron)
Comment 16•10 years ago
|
||
Comment on attachment 8413341 [details] [diff] [review] rev1 - Implemented MArrayJoin in MCallOptimize.CPP. Review of attachment 8413341 [details] [diff] [review]: ----------------------------------------------------------------- IonMonkey is using SSA form [1], where the graph is composed of MBasicBlock which contains a list of MDefinitions. One MDefinition corresponds to one assignment, and also one function. MArrayJoin is supposed to be the abstract representation of the Array.join JavaScript function: var x = arr.join("/"); // In JavaScript // In MCallOptimize.cpp MDefinition *array = …; MDefinition *str = …; […] MDefinition *join = MArrayJoin::New(alloc(), array, str); […] [1] http://en.wikipedia.org/wiki/Static_single_assignment_form ::: js/src/jit/MCallOptimize.cpp @@ +1876,5 @@ > return InliningStatus_Inlined; > } > > +class MArrayJoin > +{ Have a look at other function call optimizations on Arrays, such as MArrayPush [2], which is also applied on an Array and also take one argument. [2] http://dxr.mozilla.org/mozilla-central/source/js/src/jit/MIR.h?from=MArrayPush#6257
Attachment #8413341 -
Flags: review?(nicolas.b.pierron) → review-
Updated•10 years ago
|
Mentor: nicolas.b.pierron
Whiteboard: [mentor=nbp][lang=c++][js:p2] → [lang=c++][js:p2]
Comment 17•10 years ago
|
||
Attachment #8448693 -
Flags: review?(nicolas.b.pierron)
Comment 18•10 years ago
|
||
Comment on attachment 8448693 [details] [diff] [review] rev2 - Implement MArrayJoin Review of attachment 8448693 [details] [diff] [review]: ----------------------------------------------------------------- I cannot give a review because this patch is incomplete. There is no code to produce a MArrayJoin (MCallOptimize.cpp), and no code to lower the MIR to a LIR instruction (Lowering.cpp & LIR-Common.h) and no code to generate the code (CodeGenerator.cpp) which needs to be executed for such instruction. Still this is a good step forward compare to the previous patch :) ::: js/src/jit/LOpcodes.h @@ +133,5 @@ > _(ConcatPar) \ > _(CharCodeAt) \ > _(FromCharCode) \ > _(StringSplit) \ > + _(ArrayJoin) \ style-nit: avoid tabs. ::: js/src/jit/MCallOptimize.cpp @@ -1958,5 @@ > > -class MArrayJoin > -{ > - vector<string> str; > - string res; Do not base your changes on a bad patch. ::: js/src/jit/MIR.h @@ +4488,5 @@ > + : MBinaryInstruction(object, sep), > + typeObject_(templateObject->type()) > + { > + setResultType(MIRType_String); > + setMovable(); style-nit: avoid tabs and trailing whitespace. @@ +4517,5 @@ > + bool possiblyCalls() const { > + return true; > + } > + AliasSet getAliasSet() const { > + return AliasSet::None(); I would be surprised if this instruction does not alias anything. AliasSet::None() would mean that this instruction does not make any side effect (AliasSet::Store(…)) and that it does not depends on any side effects (AliasSet::Load(…)). Having an AliasSet::None() would mean that we could hoist MArrayJoin before a function call. function mutateArray(arr) { arr[0] = "H"; arr[1] = "e"; } function f(c) { var arr = ["B", "a", "l", "l", "o"]; mutateArray(arr); return arr.join(""); // == "Hello" } ::: js/src/jit/MOpcodes.h @@ +74,5 @@ > _(ConcatPar) \ > _(CharCodeAt) \ > _(FromCharCode) \ > _(StringSplit) \ > + _(ArrayJoin) \ style-nit: avoid tabs. ::: js/src/jit/ParallelSafetyAnalysis.cpp @@ +169,5 @@ > SAFE_OP(ConcatPar) > UNSAFE_OP(CharCodeAt) > UNSAFE_OP(FromCharCode) > UNSAFE_OP(StringSplit) > + UNSAFE_OP(ArrayJoin) style-nits: avoid tabs.
Attachment #8448693 -
Flags: review?(nicolas.b.pierron) → feedback+
Comment 19•10 years ago
|
||
How do I implement the changes without the bad patch?
Comment 20•10 years ago
|
||
(In reply to Debkanya Mazumder from comment #19) > How do I implement the changes without the bad patch? If you are using mercurial queues, you need to delete the patch [1]. [1] https://developer.mozilla.org/en-US/docs/Mercurial_Queues
Comment 21•10 years ago
|
||
Thank you!
Assignee | ||
Comment 22•10 years ago
|
||
Attachment #8459333 -
Flags: review?(nicolas.b.pierron)
Comment 23•10 years ago
|
||
Comment on attachment 8459333 [details] [diff] [review] Implement MArrayJoin. Review of attachment 8459333 [details] [diff] [review]: ----------------------------------------------------------------- ::: js/src/jit/CodeGenerator.cpp @@ +6091,5 @@ > +{ > + pushArg(ToRegister(lir->separator())); > + pushArg(ToRegister(lir->array())); > + > + return callVM(ArrayJoinInfo, lir); Can you make tests for this code and add a helper function, such as the arguments types are matching the argument type of the function. > typedef bool(*ArrayJoinFn)(JSContext *, unsigned, Value *); ToRegister(lir->array()) is not supposed to contain an unsigned. ToRegister(lir->separator()) is not supposed to contain an Value *. ::: js/src/jit/LIR-Common.h @@ +4223,5 @@ > } > }; > > +class LArrayJoin : public LCallInstructionHelper<1, 2, 1> > +{ Add a TypePolicy to ensure that the arguments are an Object and a String. ::: js/src/jit/Lowering.cpp @@ +2763,5 @@ > +LIRGenerator::visitArrayJoin(MArrayJoin *ins) > +{ > + JS_ASSERT(ins->type() == MIRType_String); > + JS_ASSERT(ins->array()->type() == MIRType_Object); > + JS_ASSERT(ins->sep()->type() == MIRType_String); As there is no type policy (see comment above), these assertions do not hold. Array.join.call("not an Array", { notAString: true });
Attachment #8459333 -
Flags: review?(nicolas.b.pierron)
Updated•10 years ago
|
Attachment #8459333 -
Flags: feedback+
Assignee | ||
Comment 24•10 years ago
|
||
Attachment #8459333 -
Attachment is obsolete: true
Attachment #8460510 -
Flags: review?(nicolas.b.pierron)
Comment 25•10 years ago
|
||
Comment on attachment 8460510 [details] [diff] [review] patch.v2.diff Review of attachment 8460510 [details] [diff] [review]: ----------------------------------------------------------------- I am ok, with the Ion part of the patch, but I am not a specialist of the Array part, Waldo might know better. In Any case, I think it would be better to just provide an _impl function used by Ion, which share the same backend, even if we have to duplicate a bit of the logic to get the length. ::: js/src/jit-test/tests/ion/bug977966.js @@ +19,5 @@ > +var uceFault_join = eval(uneval(uceFault).replace('uceFault', 'uceFault_join')) > +function join(i) { > + var x = [i, i].join("->"); > + if (uceFault_join(i) || uceFault_join(i)) { > + assertEq(x, "99->99"); You don't need ucefault for these test cases, unless you are making attempts at handling the equivalent recover instruction. ::: js/src/jsarray.cpp @@ -1044,4 @@ > { > // This method is shared by Array.prototype.join and > // Array.prototype.toLocaleString. The steps in ES5 are nearly the same, so > // the annotations in this function apply to both toLocaleString and join. Please, try to keep this comment valid. @@ -1048,5 @@ > > // Step 1 > - RootedObject obj(cx, ToObject(cx, args.thisv())); > - if (!obj) > - return false; http://www.ecma-international.org/ecma-262/5.1/#sec-15.4.4.5 @@ +1127,5 @@ > + > + > + RootedLinearString sepstr(cx); > + if (!Locale && args.hasDefined(0)) { > + JSString *s = ToString<CanGC>(cx, args[0]); This does not have the same semantic as specified in the ecma standard. The access of GetLengthProperty is observable. Add a test case which break with the current proposal, where the behaviour of ToString(separator) depends on the call of GetLengthProperty being done before. Something similar to: var lengthWasCalled = false; var obj = {"0": "", "1": ""}; Object.defineProperty(obj, "length", { get : function(){ lengthWasCalled = true; return 2; }, enumerable : true, configurable : true }); var res = Array.join.call(obj, { valueOf: function () { if (lengthWasCalled) return "good"; else return "bad"; } }) assertEq(res, "good");
Attachment #8460510 -
Flags: review?(nicolas.b.pierron) → feedback+
Assignee | ||
Comment 26•10 years ago
|
||
in array_join_impl, sep is well defined while array isn't.
Attachment #8466943 -
Flags: feedback?(nicolas.b.pierron)
Assignee | ||
Comment 27•10 years ago
|
||
Attachment #8466950 -
Flags: feedback?(nicolas.b.pierron)
Assignee | ||
Updated•10 years ago
|
Attachment #8466943 -
Attachment is obsolete: true
Attachment #8466943 -
Flags: feedback?(nicolas.b.pierron)
Assignee | ||
Updated•10 years ago
|
Attachment #8460510 -
Attachment is obsolete: true
Comment 28•10 years ago
|
||
Comment on attachment 8466950 [details] [diff] [review] patch.v4.diff Review of attachment 8466950 [details] [diff] [review]: ----------------------------------------------------------------- ::: js/src/jit/MIR.h @@ +6862,5 @@ > > +class MArrayJoin > + : public MBinaryInstruction, > + public ObjectPolicy<0>, > + public StringPolicy<1> Use a MixPolicy. @@ +6871,5 @@ > + setResultType(MIRType_String); > + } > + public: > + INSTRUCTION_HEADER(ArrayJoin) > + static MArrayJoin *New(TempAllocator &alloc, MDefinition *array, MDefinition *sep) There this no typePolicy function on the MIR Instruction, add one. This will highlight the above error also. ::: js/src/jsarray.cpp @@ +1045,5 @@ > // This method is shared by Array.prototype.join and > // Array.prototype.toLocaleString. The steps in ES5 are nearly the same, so > // the annotations in this function apply to both toLocaleString and join. > > + // Steps 1 to 6, done by called function comment: should be done by the caller (?) @@ +1205,5 @@ > return ArrayJoin<false>(cx, args); > } > > +JSString * > +js::array_join_impl(JSContext *cx, HandleValue array, HandleString sep) Move this function to VMFunction.cpp. @@ +1209,5 @@ > +js::array_join_impl(JSContext *cx, HandleValue array, HandleString sep) > +{ > + // This method is shared by Array.prototype.join and > + // Array.prototype.toLocaleString. The steps in ES5 are nearly the same, so > + // the annotations in this function apply to both toLocaleString and join. This method is not used for Array.prototype.toLocaleString.
Attachment #8466950 -
Flags: feedback?(nicolas.b.pierron) → feedback+
Assignee | ||
Comment 29•10 years ago
|
||
Attachment #8466950 -
Attachment is obsolete: true
Attachment #8467026 -
Flags: review?(nicolas.b.pierron)
Comment 30•10 years ago
|
||
Comment on attachment 8467026 [details] [diff] [review] patch.v5.diff Review of attachment 8467026 [details] [diff] [review]: ----------------------------------------------------------------- ::: js/src/jit/Lowering.cpp @@ +2766,5 @@ > + JS_ASSERT(ins->array()->type() == MIRType_Object); > + JS_ASSERT(ins->sep()->type() == MIRType_String); > + > + LArrayJoin *lir = new(alloc()) LArrayJoin(useFixed(ins->array(), CallTempReg1), > + useFixed(ins->sep(), CallTempReg2)); CallTempReg should no longer be added, use GetTempRegForIntArg instead. ::: js/src/jit/MIR.h @@ +6887,5 @@ > + bool possiblyCalls() const { > + return true; > + } > + virtual AliasSet getAliasSet() const { > + return AliasSet::None(); This is incorrect. This instruction is reading the content of an array.
Attachment #8467026 -
Flags: review?(nicolas.b.pierron) → feedback+
Assignee | ||
Comment 31•10 years ago
|
||
Attachment #8467026 -
Attachment is obsolete: true
Attachment #8467062 -
Flags: review?(nicolas.b.pierron)
Comment 32•10 years ago
|
||
Comment on attachment 8467062 [details] [diff] [review] patch.v6.diff Review of attachment 8467062 [details] [diff] [review]: ----------------------------------------------------------------- Nice work :) Upload a new version with the fix style issue and let the second part of the bug done ;) ::: js/src/jit/Lowering.cpp @@ +2765,5 @@ > + JS_ASSERT(ins->type() == MIRType_String); > + JS_ASSERT(ins->array()->type() == MIRType_Object); > + JS_ASSERT(ins->sep()->type() == MIRType_String); > + > + LArrayJoin *lir = new(alloc()) LArrayJoin(useRegisterAtStart(ins->array()), useRegisterAtStart(ins->sep())); style-nit: wrap this line in two (100 columns max).
Attachment #8467062 -
Flags: review?(nicolas.b.pierron) → review+
Assignee | ||
Comment 33•10 years ago
|
||
Forgot to change Store to Load in MIR.h. Fixed it with the style fix.
Attachment #8467062 -
Attachment is obsolete: true
Attachment #8467080 -
Flags: review?(nicolas.b.pierron)
Comment 34•10 years ago
|
||
Comment on attachment 8467080 [details] [diff] [review] patch.v7.diff Review of attachment 8467080 [details] [diff] [review]: ----------------------------------------------------------------- ::: js/src/jit/CodeGenerator.cpp @@ +10,5 @@ > #include "mozilla/Attributes.h" > #include "mozilla/DebugOnly.h" > #include "mozilla/MathAlgorithms.h" > > +#include "jsarray.h" nit: this include is no longer needed as jit::ArrayJoin is in VMFunctions.h
Attachment #8467080 -
Flags: review?(nicolas.b.pierron) → review+
Assignee | ||
Comment 35•10 years ago
|
||
Attachment #8467080 -
Attachment is obsolete: true
Attachment #8467090 -
Flags: review?(nicolas.b.pierron)
Assignee | ||
Comment 36•10 years ago
|
||
Comment on attachment 8467090 [details] [diff] [review] patch.v8.diff Review of attachment 8467090 [details] [diff] [review]: ----------------------------------------------------------------- R+ from the previous patch.
Attachment #8467090 -
Flags: review?(nicolas.b.pierron) → review+
Assignee | ||
Comment 37•10 years ago
|
||
Remove the resume point IonBuilder::inlineArrayJoin as discussed with nbp.
Attachment #8467090 -
Attachment is obsolete: true
Attachment #8467731 -
Flags: review+
Assignee | ||
Comment 38•10 years ago
|
||
Second part of the bug.
Attachment #8467746 -
Flags: review?(nicolas.b.pierron)
Comment 39•10 years ago
|
||
Comment on attachment 8467746 [details] [diff] [review] patch.p2.diff Review of attachment 8467746 [details] [diff] [review]: ----------------------------------------------------------------- This looks good, but they are still some corner cases that we want to avoid, as the array is mutable. ::: js/src/jit/MIR.cpp @@ +3242,5 @@ > > +MDefinition * > +MArrayJoin::foldsTo(TempAllocator &alloc) { > + MDefinition *arr = array(); > + if (arr->isStringSplit()) { We only want to do so if MArrayJoin is the only use of the array. function split_join_2(i) { var x = (i + "-" + i).split("-") x.push("" + i); var res = x.join("->"); assertEq(x, i + "->" + i + "->" + i); return i; } ::: js/src/jsarray.cpp @@ +1087,5 @@ > return nullptr; > } else { > CharSeparatorOp<jschar> op(c); > if (!ArrayJoinKernel<Locale>(cx, op, obj, length, sb)) > + return nullptr; This should be part of the previous patch.
Attachment #8467746 -
Flags: review?(nicolas.b.pierron) → feedback+
Assignee | ||
Comment 40•10 years ago
|
||
Move a shunk to this commit.
Attachment #8467731 -
Attachment is obsolete: true
Attachment #8468312 -
Flags: review+
Assignee | ||
Comment 41•10 years ago
|
||
Attachment #8467746 -
Attachment is obsolete: true
Attachment #8468315 -
Flags: review?(nicolas.b.pierron)
Comment 42•10 years ago
|
||
Comment on attachment 8468315 [details] [diff] [review] patch.p2.v2.diff Review of attachment 8468315 [details] [diff] [review]: ----------------------------------------------------------------- Awesome \o/ Upload the new patch with the additional test case and I will send both patches to the try server later today. ::: js/src/jit/MIR.cpp @@ +3248,5 @@ > + MDefinition *pattern = arr->toStringSplit()->separator(); > + MDefinition *replacement = sep(); > + > + return MStringReplace::New(alloc, string, pattern, replacement); > + } nit: add a comment to mention that the MStringSplit can be recovered on bailout as we are removing its last remaining use, which implies the a.split(b).join(c) is replaced by a.replace(b, c), but if the result of the split is captured by resume points then it will be executed on the bailout path. Also add a test case to test that we will recover the result of MStringSplit, such as: function resumeHere() { bailout(); } function split_join_3(i) { var x = (i + "-" + i).split("-"); resumeHere(); var res = x.join("->"); assertEq(res, i + "->" + i); return i; }
Attachment #8468315 -
Flags: review?(nicolas.b.pierron) → review+
Comment 43•10 years ago
|
||
Comment on attachment 8468315 [details] [diff] [review] patch.p2.v2.diff Review of attachment 8468315 [details] [diff] [review]: ----------------------------------------------------------------- Sorry, I missed on of the corner cases in the previous review, where the MStringSplit will not be removed. ::: js/src/jit/MIR.cpp @@ +3242,5 @@ > > +MDefinition * > +MArrayJoin::foldsTo(TempAllocator &alloc) { > + MDefinition *arr = array(); > + if (arr->isStringSplit() && arr->hasOneDefUse()) { Note: we also want to check that the string split is not captured by an observable operand of a resume point, as we do before optimizing with recover instructions. Otherwise we might end up running the String split in addition to the String replace. An example of such case would be: function trip(i) { if (i == 99) assertEq(myjoin.arguments[0][0], "" + i) } function myjoin(i, x) { trip(i); return x.join("->"); } function split_join_4(i) { var x = (i + "-" + i).split("-"); var res = myjoin(x); assertEq(res, i + "->" + i); return i; } arguments are observable, as somebody can ask the argument object which captures the result of the string split. This is limitation of the recover instructions that we would have to remove later.
Attachment #8468315 -
Flags: review+ → feedback+
Assignee | ||
Comment 44•10 years ago
|
||
Attachment #8468378 -
Flags: review?(nicolas.b.pierron)
Assignee | ||
Updated•10 years ago
|
Attachment #8468315 -
Attachment is obsolete: true
Comment 45•10 years ago
|
||
Comment on attachment 8468378 [details] [diff] [review] patch.p2.v3.diff Review of attachment 8468378 [details] [diff] [review]: ----------------------------------------------------------------- see comment 43. You might also want to report the performance improvement on a micro-benchmark dedicated to this optimization, see https://developer.mozilla.org/en-US/docs/Mozilla/Projects/SpiderMonkey/Hacking_Tips#Benchmarking_with_sub-milliseconds_%28JS_shell%29
Attachment #8468378 -
Flags: review?(nicolas.b.pierron)
Assignee | ||
Comment 46•10 years ago
|
||
Attachment #8468378 -
Attachment is obsolete: true
Attachment #8468460 -
Flags: review?(nicolas.b.pierron)
Comment 47•10 years ago
|
||
Comment on attachment 8468460 [details] [diff] [review] patch.p2.v4.diff Review of attachment 8468460 [details] [diff] [review]: ----------------------------------------------------------------- ::: js/src/jit/MIR.cpp @@ +3247,5 @@ > + // We're replacing foo.split(bar).join(baz) by > + // foo.replace(bar, baz). MStringSplit could be recovered by > + // a bailout. As we are removing its last use, and its result > + // could be captured by a resume point, this MStringSplit will > + // be executed on the bailout path. This comment does not match the condition which is above. If there is only one use, then the String split is not captured in any resume point, and only used as argument of MArrayJoin, which is fine for what this bug is aiming at. So either update the comment to reflect that MStringSplit is only by the MArrayJoin and nothing else, or use hasLiveDefUse() after flagging the current instruction as recovered on bailout (and reseting the flag after if not folded)
Attachment #8468460 -
Flags: review?(nicolas.b.pierron)
Assignee | ||
Comment 48•10 years ago
|
||
Attachment #8468460 -
Attachment is obsolete: true
Attachment #8468517 -
Flags: review?(nicolas.b.pierron)
Assignee | ||
Comment 49•10 years ago
|
||
Attachment #8468517 -
Attachment is obsolete: true
Attachment #8468517 -
Flags: review?(nicolas.b.pierron)
Attachment #8468527 -
Flags: review?(nicolas.b.pierron)
Comment 50•10 years ago
|
||
Comment on attachment 8468527 [details] [diff] [review] patch.p2.v6.diff Review of attachment 8468527 [details] [diff] [review]: ----------------------------------------------------------------- I think this is the same as the previous patch. ::: js/src/jit/MIR.cpp @@ +3245,5 @@ > + MDefinition *arr = array(); > + if (!arr->isStringSplit()) > + return this; > + > + bool wasRecoverOnBailout = arr->isRecoveredOnBailout(); here: arr->isRecoveredOnBailout() == false. @@ +3248,5 @@ > + > + bool wasRecoverOnBailout = arr->isRecoveredOnBailout(); > + arr->setRecoveredOnBailout(); > + if (!arr->hasLiveDefUses()) { > + // We're replacing foo.split(bar).join(baz) by un set the recovered on bailout flag after this condition.
Attachment #8468527 -
Flags: review?(nicolas.b.pierron)
Assignee | ||
Comment 51•10 years ago
|
||
Attachment #8468527 -
Attachment is obsolete: true
Attachment #8469081 -
Flags: review?(nicolas.b.pierron)
Assignee | ||
Comment 52•10 years ago
|
||
Attachment #8469081 -
Attachment is obsolete: true
Attachment #8469081 -
Flags: review?(nicolas.b.pierron)
Attachment #8469131 -
Flags: review?(nicolas.b.pierron)
Assignee | ||
Comment 53•10 years ago
|
||
Bench with: ======== var str = "i"; var replace = "i"; for (var i = 100; i < 100; ++i) { str += "-i"; str += "->i"; } function split_join(i) { return (str + i).split("-").join("->"); } var sum = 0; for (var i = 0; i < 10000000; ++i) { sum += split_join(i).length; } ======== Before patch.p2.v9.diff: 8592.118ms After: 5148.276ms 41% faster.
Attachment #8469131 -
Attachment is obsolete: true
Attachment #8469131 -
Flags: review?(nicolas.b.pierron)
Attachment #8469248 -
Flags: review?(nicolas.b.pierron)
Comment 54•10 years ago
|
||
Comment on attachment 8469248 [details] [diff] [review] patch.p2.v9.diff Review of attachment 8469248 [details] [diff] [review]: ----------------------------------------------------------------- Ok, this sounds good :) I'll fix the patch locally and push it to the try server. ::: js/src/jit/MIR.cpp @@ +3248,5 @@ > + return this; > + > + this->setRecoveredOnBailout(); > + if (arr->hasLiveDefUses()) { > + this->setNotRecoveredOnBailout(); style-nit: Remove "this->", as they are not needed to disambiguate anything. @@ +3261,5 @@ > + MDefinition *string = arr->toStringSplit()->string(); > + MDefinition *pattern = arr->toStringSplit()->separator(); > + MDefinition *replacement = sep(); > + > + setNotRecoveredOnBailout(); note that there is no "this->" here.
Attachment #8469248 -
Flags: review?(nicolas.b.pierron) → review+
Updated•10 years ago
|
Attachment #8413341 -
Attachment is obsolete: true
Updated•10 years ago
|
Attachment #8448693 -
Attachment is obsolete: true
Updated•10 years ago
|
Assignee: rimjhim.mazumder17 → davidmoreirafr
Assignee | ||
Comment 56•10 years ago
|
||
Attachment #8468312 -
Attachment is obsolete: true
Attachment #8469391 -
Flags: review+
Comment 58•10 years ago
|
||
https://hg.mozilla.org/integration/mozilla-inbound/rev/4168157414b5 https://hg.mozilla.org/integration/mozilla-inbound/rev/9a6455a2eb93
Comment 59•10 years ago
|
||
https://hg.mozilla.org/mozilla-central/rev/4168157414b5 https://hg.mozilla.org/mozilla-central/rev/9a6455a2eb93
Status: ASSIGNED → RESOLVED
Closed: 10 years ago
Flags: in-testsuite+
Resolution: --- → FIXED
Target Milestone: --- → mozilla34
Comment 60•10 years ago
|
||
`x.split(y).join(z)` and `x.replace(y, z)` aren't equivalent, due to replacement patterns - for instance, `"abc".replace("a", "$'")` evaluates to "bcbc". This causes the following: function repl(x, y, z) { return x.split(y).join(z); } print(repl("abc", "a", "$'")); print(repl("abc", "a", "$'")); to print $'bc, bcbc with --ion-eager --ion-offthread-compile=off. Should I open a new bug for this?
Comment 61•10 years ago
|
||
Apart from which the replacement also needs to be global, and not just affect the first occurrence: print(repl("aa", "a", "c")); print(repl("aa", "a", "c")); prints cc, ca. And behavior when the pattern string is empty is also different...
Reporter | ||
Comment 62•10 years ago
|
||
Ugh, thanks for noticing, Simon. nbp, can you back out and work on a fix with vash? Ideally, the new version should contain tests for the differences Simon noted.
Status: RESOLVED → REOPENED
Flags: needinfo?(nicolas.b.pierron)
Resolution: FIXED → ---
Comment 63•10 years ago
|
||
If the string is a flat replacement, such as split().join(), then we reset the dollarIndex to UINT32_MAX. An UINT32_MAX dollarIndex is used to indicate that no dollar signed was found in the pattern, which is the same as not interpreting any of the dollar sign in the pattern.
Attachment #8474780 -
Flags: review?(till)
Reporter | ||
Comment 64•10 years ago
|
||
Comment on attachment 8474780 [details] [diff] [review] IonMonkey: split().join() no longer interpret replacement patterns. Review of attachment 8474780 [details] [diff] [review]: ----------------------------------------------------------------- Looks good. One nit: the description should probably be something like "Make IonMonkey-optimized split().join() not interpret replacement patterns". Also, this doesn't fix the second issue Simon identified, so I assume another patch is coming?
Attachment #8474780 -
Flags: review?(till) → review+
Comment 65•10 years ago
|
||
remote: https://tbpl.mozilla.org/?tree=Try&rev=04b49485a3da remote: https://treeherder.mozilla.org/ui/#/jobs?repo=try&revision=04b49485a3da
Comment 66•10 years ago
|
||
(In reply to Till Schneidereit [:till] from comment #64) > Also, this doesn't fix the second issue Simon identified, so I assume > another patch is coming? Oh, right, I did not noticed this was a different one. The problem is that String.replace only replace the first occurence in the string and we need to replace all of them as String.split. I think this is doable but this require some tweaking to make this more with a few modifications. Where this problem becomes complex is that we can make this work by using one of the SpiderMonkey extension to string replace, the only issue being that this extension register the last matches in RegExpStatics. We need to a add a returnsCapture flag to replaceData to either avoid adding the last MatchPairs, or to discard that last one. Sadly the MatchPairs are mostly accessed throught the RegExpStatics, which means that the second approach might be easier to implement at first. Note, this is already an issue with SpiderMonkey extension of String.replace. I think I will just no-op the optimization for the time being until we fix the String.replace to use flat-regexp without mutating the RegExpStatics.
Reporter | ||
Comment 67•10 years ago
|
||
(In reply to Nicolas B. Pierron [:nbp] {N/A 22-29/08} from comment #66) > Note, this is already an issue with SpiderMonkey extension of > String.replace. I think I will just no-op the optimization for the time > being until we fix the String.replace to use flat-regexp without mutating > the RegExpStatics. That's unfortunate, but given that the uplift is on Friday and this isn't *that* important an optimization, it makes a lot of sense.
Comment 68•10 years ago
|
||
Attachment #8475232 -
Flags: review?(till)
Reporter | ||
Comment 69•10 years ago
|
||
Comment on attachment 8475232 [details] [diff] [review] Disable split().join() optimization. Review of attachment 8475232 [details] [diff] [review]: ----------------------------------------------------------------- Thanks ::: js/src/jit-test/tests/ion/bug977966.js @@ +79,5 @@ > assertEq(s.split("-").join("$`$&$'"), i + "$`$&$'" + i); > assertEq(s.replace("-", "$`$&$'"), "" + i + i + "-" + i + i); > } > > +// Check that, as opposed to String.replace we are doing a global replacement as String.split do. uber-nits: comma after String.replace, and `s/do/does` ::: js/src/jit/MIR.cpp @@ +3380,5 @@ > > MDefinition * > MArrayJoin::foldsTo(TempAllocator &alloc) { > + // :TODO: Enable this optimization after fixing Bug 977966 test cases. > + if (true) why not just an unconditional `return this`?
Attachment #8475232 -
Flags: review?(till) → review+
Comment 70•10 years ago
|
||
(In reply to Till Schneidereit [:till] from comment #69) > ::: js/src/jit/MIR.cpp > @@ +3380,5 @@ > > > > MDefinition * > > MArrayJoin::foldsTo(TempAllocator &alloc) { > > + // :TODO: Enable this optimization after fixing Bug 977966 test cases. > > + if (true) > > why not just an unconditional `return this`? This was an attempt at muting compilers, but after checking with gcc4.8 and clang3.3, I do not see any warning about unreachable code, so I will remove this if before pushing.
Flags: needinfo?(nicolas.b.pierron)
Comment 71•10 years ago
|
||
Thanks for taking care of that. (I just saw the commit fly by in the push log and thought it felt a little wrong.) ::: js/src/jit-test/tests/ion/bug977966.js @@ +79,5 @@ > assertEq(s.split("-").join("$`$&$'"), i + "$`$&$'" + i); > assertEq(s.replace("-", "$`$&$'"), "" + i + i + "-" + i + i); > } > > +// Check that, as opposed to String.replace we are doing a global replacement as String.split do. Might as well add: assertEq("".split("").join("x"), ""); assertEq("abc".split("").join("x"), "axbxc"); which was the last thing from comment 61.
Comment 72•10 years ago
|
||
(In reply to Simon Lindholm from comment #71) > assertEq("".split("").join("x"), ""); > assertEq("abc".split("").join("x"), "axbxc"); oh, right. I'll will add these test cases too. Then maybe we should take a different approach and instead of basing our optimization on str_replace, we should base it on str_split, which will build a rope instead of an array of dependent strings.
Comment 73•10 years ago
|
||
https://hg.mozilla.org/integration/mozilla-inbound/rev/d94086070578 I've push the test cases of both patches and comment 71 tests with the last patch (attachment 8475232 [details] [diff] [review]) which disables this feature for the time being. I did not push the previous patch (attachment 8474780 [details] [diff] [review]) as I am no longer sure this is the right approach anymore.
Keywords: leave-open
Comment 75•10 years ago
|
||
Hi! Can I work on this bug?
Comment 76•10 years ago
|
||
(In reply to Victor Carlquist from comment #75) > Can I work on this bug? This bug has some code associated to it, it would be better to close this one and make another one to solve the rest of the problem. Fixing the remaining issue implies moving some code around to reuse what we do with split, except that we want to construct a new string at the same time instead of constructing an array. Are you sure you still want to work on this bug? I opened Bug 1112537 for this purpose. This bug fixed the most of the Ion part of the problem, now we are lacking the JS engine part to provide a good substitute to map to.
Status: REOPENED → RESOLVED
Closed: 10 years ago → 10 years ago
Keywords: leave-open
Resolution: --- → FIXED
Summary: Optimize String#split('foo').join('bar') pattern → Optimize String#split('foo').join('bar') pattern (IonMonkey part)
Comment 78•8 years ago
|
||
bugherder |
https://hg.mozilla.org/mozilla-central/rev/f6ee0c4405c5
You need to log in
before you can comment on or make changes to this bug.
Description
•