I was recently looking at some string-creation stats on SS/V8. For one thing, ropes are winning us big time: most strings created are ropes and most ropes are not flattened. But we still create a ton of rope nodes. One way to do better is to allow n-ary rope nodes (instead of current binary), thereby allowing us to use less nodes to describe the same sequence of concats. Putting these in the FINALIZE_SHORT_STRING arena would let a single rope have up to 12 children. But when can we make n-ary rope nodes? One immediate strategy is to bring back JSOP_CONCATN except to have JSOP_CONCATN build a lazy rope instead of doing an eager concat. (Bug 581747 killed JSOP_CONCATN because the eager concat was sometimes much worse than ropes.) A later idea is that, with linear inference (which would tell us a string was un-aliased), we could mutate rope nodes in place to append children, rather than creating new nodes. This bug should also instrument a browser to get some real non-SS/V8 rope-usage measurements.
I should mention that several of the string-heavy SS/V8 benchmarks have what seem to be hot JSOP_CONCATN-able expressions. I need to put back in JSOP_CONCATN to be sure though...
One case where CONCATN would be tremendously helpful is when concatenating short strings. If you concatenate two strings and the result will fit into a JSFatInlineString then we eagerly create that result. And now that we have Latin1 strings we can fit strings of length 23 in a JSFatInlineString. So if you have an expression like this: > 'rgba(' + r + ',' + g + ',' + b + ',' + a + ')'; We'll end up unnecessarily generating 7 temporary strings! I've seen this causes significant amounts of over-allocation in both pdf.js and Shumway, e.g. see https://github.com/mozilla/shumway/pull/1595/ where I added some horrible concat3..concat9 methods to Shumway to avoid exactly this problem.
Nursery-allocating strings should help but even then allocating fewer strings is faster and more efficient. We could also do some pattern-matching in Ion but doing this in the bytecode might be simpler... The problem is that CONCATN is not easy to Ion-compile. One thing we could do is factor out the ToString operations, so for the example you gave we'd have: JSOP_STRING "rgba(" JSOP_GETLOCAL r JSOP_TOSTRING JSOP_STRING "," ... JSOP_CONCATN Then CONCATN can assume all its operands are strings but it'll still be a challenge to handle this op efficiently. We have MConcat to concatenate just two strings and it's already complicated (it uses all registers on x86)...
FWIW, I'm no longer in favor of my original proposal to add an n-ary rope or to do this pattern-matching in the parser. For one thing, as Jan said, Ion pattern-matching should be fine and I think should be able to be much more aggressive due to its ability to speculate. Also, an n-ary rope seems like it'd require malloc. One possible compromise would be to allocate ropes as FAT_INLINE_STRINGs and use the extra space to store more siblings inline (so 4 total children on 32-bit, but only 3 on 64-bit). This would complicate the (already complex) flattening algorithm, though. Btw, in the example you gave in comment 2, is that rgba string immediately fed into some CSS/DOM field that expects a string of form "rgba(r,g,b,a)"? If so, then that suggests an optimization we've discussed for a while: based on our TI knowledge of the DOM setter and local pattern-matching, we avoid creating the string altogether and just pass the rgba components directly to the C++ method. This is like ye olde JSTraceableNative design where, for every native, you could have a list of specializations that applied if the types matched. bz was interested in this.
(In reply to Luke Wagner [:luke] from comment #4) > Btw, in the example you gave in comment 2, is that rgba string immediately > fed into some CSS/DOM field that expects a string of form "rgba(r,g,b,a)"? > If so, then that suggests an optimization we've discussed for a while: based > on our TI knowledge of the DOM setter and local pattern-matching, we avoid > creating the string altogether and just pass the rgba components directly to > the C++ method. This is like ye olde JSTraceableNative design where, for > every native, you could have a list of specializations that applied if the > types matched. bz was interested in this. It is indeed for that, and such a solution would be very, very nice. Specifically, this is for the `fillStyle` and `strokeStyle` properties of CanvasRenderingContext2D, but there are probably lots of use cases where having it work on styles would be useful, too. I've been thinking about trying to propose spec changes to make those properties take uint32 values but don't know how likely that'd be to happen. Partly because it's bound to break some buggy content.
Yes, fillStyle and strokeStyle (and maybe inline style width/height) are exactly the cases where we've considered this. There's a bit of relevant discussion in bug 919992, but also we've been talking about this in general for a while...
> Btw, in the example you gave in comment 2, is that rgba string immediately > fed into some CSS/DOM field that expects a string of form "rgba(r,g,b,a)"? Almost certainly, and treating that specially sounds like a fine idea. But there are other concatenation cases that don't get fed into CSS/DOM fields, so the general idea still applies.
Yeah, maybe doing that is good enough for now. We can start with the: num + 'px' case and then do the rgba thing, I can help with the Ion part if needed. Shumway could also use str.concat("a", "b", "c"). Right now that function just concatenates but we can add a fast path to that function for the result-fits-in-fat-string case.
Status: NEW → RESOLVED
Closed: 3 years ago
Resolution: --- → WONTFIX
You need to log in before you can comment on or make changes to this bug.