Open Bug 615290 Opened 14 years ago Updated 1 year ago

avoid unnecessary flattening when rope traversal would suffice

Categories

(Core :: JavaScript Engine, defect)

defect

Tracking

()

People

(Reporter: luke, Unassigned)

References

Details

There are many places where we call chars() (which flattens ropes) that could be avoided by iterating over rope segments directly. For example, (big1 + big2).escape() should not copy big1+big2 into a buffer just to copy the whole thing again into the new escaped string. A complicating factor is that there is a potential for slowdown if you have a rope full of tiny pieces that is traversed multiple times. For example, this code would be slower if we traversed ropes instead of flattening in indexOf: s = "" for (var i = 0; i < 100; ++i) s += i + ','; for (var i = 0; i < 1000000; ++i) s.indexOf('999') One solution is to have rope nodes store the number of child nodes (which is cheap to maintain and can fit into the flag bits) and then flatten if there is a poor length:num-nodes ratio before doing any rope traversal operations.
Oops, "indexOf('99')"
Ran across this today while looking over some profiles. news.bbc.co.uk hits this problem a lot, even while the browser is idle - it seems to concat lots of temporary strings and then use them as the needle when invoking str.indexOf, i.e. somestring.indexOf("a" + "b") In my tests, over a period of a couple hours this alone created 5 megabytes of garbage and contributed to multiple collections (along with a sizable growth in memory usage) - it was the #1 allocator. The same performance caveats apply as Luke mentioned; the fact that we flatten the ropes in place helps mitigate the performance impact here at least.
Great to hear you are digging in to these things. Could you describe the characteristics of the strings you are seeing in a bit more detail? It sounds like you were saying that eager flattening was actually good for bbc; is that right?
I won't know for sure until I have JS stack information for the pages in question. If they're only ever doing the indexOf operation once, eager flattening is a pessimization, since we flatten once and then throw it away. I think this would be the case for 'abc'.indexOf('b' + 'c') but I can't say for sure that's what they're doing.
Even if indexOf is ever called once, eager flattening could be necessary since the pattern is matched in a loop. Consider: "ababababababababc".indexOf("a" + "b" + "c") (and forget for the moment that the parser will merge these literals and that concats of short strings are always done eagerly since the chars can be stored inline in the header :)
Blocks: 771737
Assignee: general → nobody
No assignee, updating the status.
Status: ASSIGNED → NEW
No assignee, updating the status.
No assignee, updating the status.
No assignee, updating the status.
Severity: normal → S3
You need to log in before you can comment on or make changes to this bug.