Speed up Function.prototype.toString

RESOLVED FIXED in Firefox 56

Status

()

enhancement
RESOLVED FIXED
2 years ago
2 years ago

People

(Reporter: Ehsan, Assigned: jandem)

Tracking

(Blocks 1 bug)

unspecified
mozilla56
Points:
---
Dependency tree / graph

Firefox Tracking Flags

(firefox56 fixed)

Details

Attachments

(2 attachments)

While profiling ember tests, I noticed that we spend a lot of time in fun_toString, and almost all of the time goes to StringBuffer::append() calls.  This is quite sad, but the underlying Vector storage here is quite inefficient.  Can we consider using something more efficient, like SegmentedVector perhaps?
Flags: needinfo?(jdemooij)
Do we save any time if we directly return the source-substring, so we don't have to go through the StringBuffer in the first place?

@@ -1038,16 +1038,24 @@ js::FunctionToString(JSContext* cx, Hand
     bool addParentheses = haveSource && !prettyPrint && (fun->isLambda() && !fun->isArrow());
 
     if (haveSource && !script->scriptSource()->hasSourceData() &&
         !JSScript::loadSource(cx, script->scriptSource(), &haveSource))
     {
         return nullptr;
     }
 
+    if (haveSource && !addParentheses) {
+        static constexpr size_t DeflateLimit = 100; // TODO: Pick a sensible limit.
+        size_t start = script->toStringStart(), end = script->toStringEnd();
+        if (end - start < DeflateLimit)
+            return script->scriptSource()->substring(cx, start, end);
+        return script->scriptSource()->substringDontDeflate(cx, start, end);
+    }
(In reply to :Ehsan Akhgari (needinfo please, extremely long backlog) from comment #0)
> […]  Can we consider using something more efficient, like
> SegmentedVector perhaps?

SegmentedVector has the issue that it allocates segments of identical sizes, instead of using the increasing size like vectors.  

This is an issue because this can be a waste of memory for small content, and this can be an issue as the allocator would appear in the profile for large content.  Increasing the allocation size of the segments would reduce the allocator cost for large content, and allow us to make smaller segment first. (maybe inlined, like vectors?)
(Assignee)

Comment 3

2 years ago
Fixing fun_toString overhead is on my list for this week. André's suggestion is great.

Offhand, SegmentedVector is not a great fit for StringBuffer because (1) we really want a data structure that supports inline storage as most strings are short and (2) StringBuffer can steal the Vector's underlying buffer and use it for the new string - if we use SegmentedVector we'd need to do another malloc + copy to finish the string.
Good points, thanks!  It probably makes sense to first address the issues in fun_toString before considering changes to the underlying data structures, but it still seems to me that we should consider having a more specialized data structure for StringBuffer, perhaps preserving features such as inline buffer stealing and exponential growth of segments but to avoid excessive copying during buffer growth before the string is finished, since I regularly see in profiles where we spend tons of times doing just that.  Hopefully a lot of that time is due to issues like what André mentioned which can be more effectively fixed in callers of course.  :-)
(Assignee)

Comment 5

2 years ago
FWIW on Ember we call fun_toString thousands of times for the closureAction function (642 chars). A 1-slot cache is sufficient to optimize this, but I'll probably go with 2 slots because that does better on Angular (fewer calls but *much* larger functions).

I think a cache for this makes sense: it also hits quite often on random websites, helps avoid allocating duplicate (often large) strings, and worst-case it's just a few pointer comparisons.

Patches for that + comment 1 tomorrow.
(Assignee)

Comment 6

2 years ago
This is what you suggested in comment 1. It improves the micro-benchmark below from 88 ms to 80 ms or so, so about a 10% improvement.

function g(x, y, z) {
    var a = [x, y, z];
    var b = ["foo", "bar", "baz"];
    //012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789
    //012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789
    //012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789    
    //012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789
    //012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789
    //01234567890123456789012345678901234567890123456789012345678901234
}
function f() {
    var t = new Date;
    for (var i = 0; i < 100000; i++) {
        s = g.toString();
    }
    print(new Date - t);
}
f();
Assignee: nobody → jdemooij
Status: NEW → ASSIGNED
Attachment #8889802 - Flags: review?(andrebargull)
(Assignee)

Comment 7

2 years ago
Just a simple purged-on-GC cache with 2 entries. It improves the micro-benchmark in the previous comment from 80 ms to 3-4 ms.

I tested this on some real-world websites and it hits quite a lot - when loading cnn.com for instance there are ~10 hits for a 40 KB (IIRC) function.

It might even be worth using a HashMap for this, but for now this will do.
Flags: needinfo?(jdemooij)
Attachment #8889803 - Flags: review?(andrebargull)
(Assignee)

Comment 8

2 years ago
Morphing this into a function toString bug.
Depends on: 1383775
Summary: Consider using SegmentedVector in StringBuffer instead of Vector → Speed up Function.prototype.toString
(Assignee)

Updated

2 years ago
Blocks: 1377343
Attachment #8889802 - Flags: review?(andrebargull) → review+
Comment on attachment 8889803 [details] [diff] [review]
Part 2 - Add FunctionToStringCache

Review of attachment 8889803 [details] [diff] [review]:
-----------------------------------------------------------------

Makes sense to me. (I assume we don't have to worry about GC moving things, because the cache is cleared every time a GC happens.)

::: js/src/gc/Zone.h
@@ +465,5 @@
>      js::ZoneGroupOrGCTaskData<js::AtomSet> atomCache_;
>  
>      // Cache storing allocated external strings. Purged on GC.
>      js::ZoneGroupOrGCTaskData<js::ExternalStringCache> externalStringCache_;
> +    js::ZoneGroupOrGCTaskData<js::FunctionToStringCache> functionToStringCache_;

There should be a newline between externalStringCache_ and functionToStringCache_, otherwise it looks like the comment belongs to both members.
Attachment #8889803 - Flags: review?(andrebargull) → review+

Comment 10

2 years ago
Pushed by jandemooij@gmail.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/3d0a9de378cc
part 1 - Avoid StringBuffer overhead in FunctionToString in the common case. r=anba
https://hg.mozilla.org/integration/mozilla-inbound/rev/80bb069a8fd5
part 2 - Add a cache for Function.prototype.toString. r=anba

Comment 11

2 years ago
bugherder
https://hg.mozilla.org/mozilla-central/rev/3d0a9de378cc
https://hg.mozilla.org/mozilla-central/rev/80bb069a8fd5
Status: ASSIGNED → RESOLVED
Last Resolved: 2 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla56
Duplicate of this bug: 1377343
You need to log in before you can comment on or make changes to this bug.