Closed Bug 1837410 Opened 3 years ago Closed 2 years ago

JSON.stringify is 2x slower than Safari on sp3

Status: RESOLVED FIXED

Milestone: 117 Branch

Tracking Flags: firefox117 (tracking: ---, status: fixed)

People: Reporter: mstange, Assigned: sfink, NeedInfo

References: Blocks 1 open bug

Whiteboard: [sp3]

Attachments (3 files)

stringify-microbenchmark.html, 3 years ago, Markus Stange [:mstange], 6.19 KB, text/html
stringify-through-input-microbenchmark.html, 3 years ago, Markus Stange [:mstange], 995 bytes, text/html
Bug 1837410 - accelerated path for JSON.stringify, 2 years ago, Steve Fink [:sfink] [:s:], 48 bytes, text/x-phabricator-request

Markus Stange [:mstange]

Reporter

Description

•

3 years ago

Attached file: stringify-microbenchmark.html

On the attached microbenchmark, I get the following scores:

Firefox Nightly: 473ms
Chrome Canary: 352ms
Safari Technology Preview: 197ms

Matching Safari's performance of JSON.stringify on this shape of JSON would improve our score on each of these three sp3 benchmarks by 4%: TodoMVC-JavaScript-ES5, TodoMVC-JavaScript-ES6, TodoMVC-JavaScript-ES6-Webpack

Jira Integration Bot

Updated

•

3 years ago

See Also: → https://mozilla-hub.atlassian.net/browse/SP3-407

Markus Stange [:mstange]

Reporter

Comment 1

•

3 years ago

Attached file: stringify-through-input-microbenchmark.html

Here's another version of the benchmark which is closer to what happens in the actual benchmark, because it embeds a string from input.value in the JSON. In Firefox, this causes us to take the utf-16 JSON serialization path.

Firefox Nightly: 492ms
Chrome Canary: 350ms
Safari Technology Preview: 194ms

Markus Stange [:mstange]

Reporter

Comment 2

•

3 years ago

Firefox Profile: https://share.firefox.dev/45Rblsp
Safari Profile: https://share.firefox.dev/3oLhZQp

JavaScriptCore has a FastStringifier:

https://searchfox.org/wubkat/rev/1ef008a7bed116ef9afa0741d99ceb2bd3d3ce05/Source/JavaScriptCore/runtime/JSONObject.cpp#659-669

// ------------------------------ FastStringifier --------------------------------

// FastStringifier does a no-side-effects stringify of the most common types of
// objects and arrays. It bails out if the serialization is any longer than a
// fixed buffer and handles only the simplest cases, including only 8-bit character
// strings. Instead of explicit checks to prevent excessive recursion and cycles,
// it counts on hitting the buffer size limit to catch those things. If it fails,
// since there is no side effect, the full general purpose Stringifier can be used
// and the only cost of the fast stringifying attempt is the time wasted.

class FastStringifier {
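
The approach described in that comment can be illustrated with a minimal sketch. This is not JavaScriptCore's code: `Value`, `FastStringifierSketch`, and `kCapacity` are hypothetical names, the value model is a toy, and strings that would need escaping simply cause a bail-out, which is an assumption made to keep the example short. The point is only that a fixed-size buffer doubles as the recursion and size guard, and that failure leaves nothing to undo.

```cpp
#include <cstddef>
#include <cstring>
#include <optional>
#include <string>
#include <string_view>
#include <vector>

// Hypothetical toy value model; not an engine's real object representation.
struct Value {
    enum class Kind { Null, Bool, Int, String, Array, Object };
    Kind kind = Kind::Null;
    bool boolean = false;
    long long integer = 0;
    std::string text;               // String payload
    std::vector<std::string> keys;  // Object only: keys[i] names children[i]
    std::vector<Value> children;    // Array elements or Object property values
};

class FastStringifierSketch {
public:
    // Returns the JSON text, or std::nullopt when the fast path gives up.
    // Only this object's scratch buffer is written, so the caller can retry
    // with a general-purpose stringifier after a failure.
    std::optional<std::string> stringify(const Value& value) {
        length_ = 0;
        failed_ = false;
        append(value);
        if (failed_)
            return std::nullopt;
        return std::string(buffer_, length_);
    }

private:
    static constexpr std::size_t kCapacity = 64;  // deliberately tiny for the demo
    char buffer_[kCapacity];
    std::size_t length_ = 0;
    bool failed_ = false;

    // Every write goes through this bounds check; overflowing means "bail out".
    void put(std::string_view s) {
        if (failed_ || s.size() > kCapacity - length_) {
            failed_ = true;
            return;
        }
        std::memcpy(buffer_ + length_, s.data(), s.size());
        length_ += s.size();
    }

    // Assumption of this sketch: only plain ASCII strings without escapes qualify.
    void putString(const std::string& s) {
        for (unsigned char c : s) {
            if (c < 0x20 || c == '"' || c == '\\' || c >= 0x80) {
                failed_ = true;
                return;
            }
        }
        put("\"");
        put(s);
        put("\"");
    }

    // Each nesting level writes at least one character before recursing, so
    // the buffer limit also bounds the recursion depth.
    void append(const Value& v) {
        if (failed_)
            return;
        switch (v.kind) {
        case Value::Kind::Null: put("null"); break;
        case Value::Kind::Bool: put(v.boolean ? "true" : "false"); break;
        case Value::Kind::Int: put(std::to_string(v.integer)); break;
        case Value::Kind::String: putString(v.text); break;
        case Value::Kind::Array:
            put("[");
            for (std::size_t i = 0; i < v.children.size() && !failed_; ++i) {
                if (i) put(",");
                append(v.children[i]);
            }
            put("]");
            break;
        case Value::Kind::Object:
            put("{");
            for (std::size_t i = 0; i < v.keys.size() && i < v.children.size() && !failed_; ++i) {
                if (i) put(",");
                putString(v.keys[i]);
                put(":");
                append(v.children[i]);
            }
            put("}");
            break;
        }
    }
};
```

A caller would try `stringify` first and run the general path only when it returns `std::nullopt`. In an engine where objects can reference each other, a cycle would keep producing output until the buffer limit trips, which is how the quoted comment says cycles are caught without an explicit check.
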

They also seem to have a more efficient way to enumerate properties:

https://searchfox.org/wubkat/rev/1ef008a7bed116ef9afa0741d99ceb2bd3d3ce05/Source/JavaScriptCore/runtime/JSONObject.cpp#1071-1077

if (UNLIKELY(!structure.canPerformFastPropertyEnumeration())) {
    recordFastPropertyEnumerationFailure(object);
    return;
}
structure.forEachProperty(m_vm, [&](const auto& entry) -> bool {
    // ...
});

In Firefox, we seem to do property enumeration and property lookup separately.
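
To make the difference concrete, here is a rough sketch contrasting the two strategies on a toy object. `ToyObject`, `serializeTwoStep`, `serializeSinglePass`, and this `forEachProperty` are hypothetical; the linear scan is only a stand-in for whatever lookup a real engine performs, and neither SpiderMonkey nor JavaScriptCore is implemented this way.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical layout: ordered (key, slot index) pairs plus a slot array.
struct ToyObject {
    std::vector<std::pair<std::string, std::size_t>> shape;
    std::vector<int> slots;
};

// Two-step strategy: enumerate the keys first, then look each key up again.
// `comparisons` counts the key comparisons spent on the repeated lookups.
std::string serializeTwoStep(const ToyObject& obj, std::size_t& comparisons) {
    std::vector<std::string> keys;
    for (const auto& entry : obj.shape)
        keys.push_back(entry.first);

    std::string out;
    for (const auto& key : keys) {
        for (const auto& entry : obj.shape) {  // stand-in for a property lookup
            ++comparisons;
            if (entry.first == key) {
                out += key + "=" + std::to_string(obj.slots[entry.second]) + ";";
                break;
            }
        }
    }
    return out;
}

// Single-pass strategy: visit each (key, value) pair exactly once.
template <typename Fn>
void forEachProperty(const ToyObject& obj, Fn&& fn) {
    for (const auto& entry : obj.shape)
        fn(entry.first, obj.slots[entry.second]);
}

std::string serializeSinglePass(const ToyObject& obj) {
    std::string out;
    forEachProperty(obj, [&](const std::string& key, int value) {
        out += key + "=" + std::to_string(value) + ";";
    });
    return out;
}
```

Both functions produce the same text; the single-pass version just never asks "where is this key?" a second time, which is the redundancy the comment above points at.
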

Jan de Mooij [:jandem]

Comment 3

•

3 years ago

(In reply to Markus Stange [:mstange] from comment #2)

> They also seem to have a more efficient way to enumerate properties:

There might be some overlap with bug 1836679. I want to look into that more after the mprotect work, and can also look into this one.

Matthew Gaudet (he/him) [:mgaudet]

Updated

•

3 years ago

Severity: -- → S3

Priority: -- → P2

Steve Fink [:sfink] [:s:]

Assignee

Comment 4

•

3 years ago

I'm prototyping this one. The preliminary result is that the fast path (which should cover the examples in this bug) is slightly over twice the speed of the slow path. The results are only preliminary because I'm still outputting the named properties in reverse order, I'm not doing the length check yet (so a cycle would make it loop forever), and I imagine there are some additional checks that I'm not doing yet.

Without fast path: 13800.40ms for 2236400000 chars, 6.170808003725295 ns/char.
With fast path: 6283.81ms for 2236400000 chars, 2.8097893884283893 ns/char.

Steve Fink [:sfink] [:s:]

Assignee

Updated

•

2 years ago

Assignee: nobody → sphink

Status: NEW → ASSIGNED

Steve Fink [:sfink] [:s:]

Assignee

Comment 5

•

2 years ago

Attached file: Bug 1837410 - accelerated path for JSON.stringify

Steve Fink [:sfink] [:s:]

Assignee

Comment 6

•

2 years ago

Had to take some time off, but this is getting closer to being real and I've pushed a timing run.

Patch contains some weird test code that I'll need to do in a different way. Passes all jit-tests and jstests, which confuses me because I've had to fix some things that I would totally expect to be tested by test262, but I'm not seeing any JSON tests there (other than for JSON module imports). Still matching up the code to the spec sections, and I need to remove some code duplication between the fast and slow paths.

Steve Fink [:sfink] [:s:]

Assignee

Comment 7

•

2 years ago

I seem to have everything working now, though adding in the additional checks seems to have slowed the fast path down substantially:

Fast: 5098.89ms for 1118200000 chars, 4.56 ns/char.
Slow: 7779.06ms for 1118200000 chars, 6.96 ns/char.

I pushed to try for Speedometer and got around a 5% improvement on 5 different Speedometer3 subtests. Actual differences: -2.75%, -4.34%, -5.20%, -6.36%, -6.88%.

When running the JS shell locally under samply, I noticed the following not so desirable overhead:

15% of the overall time is spent poisoning memory freed by jemalloc when reallocing for ExtractWellSized
14% of the overall time is in GetOwnPropertyPure, some of which is redundant since we're iterating over the own properties
23% in Quote(). Quote() is expensive for both fast and slow paths. It's a little tempting to cache a "this string does not need quoting (for JSON)" bit, though most of the allocation growing time is in Quote so it may not have that much of an effect.
3.6% is spent looking for an interfering .toJSON property, some of which could be saved with a guard bit of some sort.
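
As a rough illustration of the "does not need quoting" bit floated above, here is a sketch for 8-bit (Latin-1) strings only. `canCopyVerbatim`, `CachedString`, and `needsEscaping` are hypothetical names, not SpiderMonkey APIs. Two-byte strings would additionally have to check for lone surrogates, which JSON.stringify also escapes. As the comment notes, the caching may buy little if most of the time in Quote() is buffer growth.

```cpp
#include <string>
#include <string_view>

// Returns true if a Latin-1 string can be copied between quotes unchanged.
// JSON.stringify escapes '"', '\\', and control characters below U+0020.
bool canCopyVerbatim(std::string_view latin1) {
    for (unsigned char c : latin1) {
        if (c < 0x20 || c == '"' || c == '\\')
            return false;
    }
    return true;
}

// Hypothetical string wrapper that remembers the answer after the first scan.
struct CachedString {
    enum class QuoteState { Unknown, Verbatim, NeedsEscaping };
    std::string chars;
    QuoteState state = QuoteState::Unknown;
};

// Scans the characters at most once per string; later calls reuse the cached bit.
bool needsEscaping(CachedString& s) {
    if (s.state == CachedString::QuoteState::Unknown) {
        s.state = canCopyVerbatim(s.chars) ? CachedString::QuoteState::Verbatim
                                           : CachedString::QuoteState::NeedsEscaping;
    }
    return s.state == CachedString::QuoteState::NeedsEscaping;
}
```
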

Steve Fink [:sfink] [:s:]

Assignee

Comment 8

•

2 years ago

Getting closer to something reviewable, and I realized that I've only been looking at Linux numbers. Windows numbers are... different.

I have two pushes where Windows shows similar improvements to Linux, plus some massive 15-20% regressions on a handful of subtests. My latest push, which is mostly up to date (though it doesn't include the interrupt checks I've since added), shows across-the-board improvements that I don't trust, and no regressions.

Steve Fink [:sfink] [:s:]

Assignee

Comment 9

•

2 years ago

•

Edited

Information on the hit rates.

For awsy (tp-6), the overall hit rate is 72.4%. Most of the misses are unavoidable, unless you magically analyzed the replacer function for side effects or something, or observed everything you cared about while it was running. Of the inputs where it is theoretically possible to use the fast path, the hit rate is 99.0%. If you count calls with a replacer function argument as misses, then the hit rate is 91.8%.

Of all calls that took the slow path:

20.8% had a replacer function and so were ineligible
76.4% were primitives
0.1% were ineligible for other reasons
2.1% found an object with a callable toJSON function
0.7% bailed for other reasons (sparse indexes were the most common cause)

For just speedometer3 (plus some startup noise):

overall hit rate is 84.6%
hit rate only counting non-primitives that aren't going to run any code is 98.7%
hit rate for all non-primitives is 97.1%

Of all calls that took the slow path:

8.4% had a replacer argument
83.5% were primitives
0.7% found something with a callable toJSON property
7.4% found an INELIGIBLE_OBJECT, which could be a variety of things. Typed arrays, BigInts, non-native objects, etc.

That last category is the only one that could reasonably be reduced right now, but the effort doesn't seem worth it just to bump the hit rate by a percentage point or so.

A magical analysis that could tell whether a given function could possibly have any side effects on the data in question could boost the hit rate more, but that would only make sense if we were doing it for something else and could just reuse it here.

Pulsebot

Comment 10

•

2 years ago

Pushed by sfink@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/65e729aa1e06 accelerated path for JSON.stringify r=jandem

Narcis Beleuzu [:NarcisB]

Comment 11

•

2 years ago

Backed out for leak failures

Backout link: https://hg.mozilla.org/integration/autoland/rev/ec054fe362b6c5391b791840686832dea90688ba
Log link: https://treeherder.mozilla.org/logviewer?job_id=423600717&repo=autoland&lineNumber=3598

Flags: needinfo?(sphink)

Pulsebot

Comment 12

•

2 years ago

Pushed by sfink@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/a954796eff2d accelerated path for JSON.stringify r=jandem

Atila Butkovits

Comment 13

•

2 years ago

bugherder

https://hg.mozilla.org/mozilla-central/rev/a954796eff2d

Status: ASSIGNED → RESOLVED

Closed: 2 years ago

status-firefox117: --- → fixed

Resolution: --- → FIXED

Target Milestone: --- → 117 Branch

Ryan VanderMeulen [:RyanVM]

Updated

•

2 years ago

Regressions: 1846246

Mayank Bansal

Comment 14

•

2 years ago

Improvements:
34% on Jetstream2-json-stringify-inspector-Average
31% on Jetstream2-json-stringify-inspector-First
28% on Jetstream2-json-stringify-inspector-Geometric
20%-30% on Jetstream2-json-stringify-inspector-Worst

6% on Speedometer2-Vanilla-ES2015-Babel-Webpack-TodoMVC/CompletingAllItems/Sync and other smaller improvements in related sub-tests
5.6% on Speedometer2-Vanilla-ES2015-TodoMVC/CompletingAllItems and other smaller improvements in related sub-tests
6.6% on Speedometer2-VanillaJS-TodoMVC/CompletingAllItems/Sync and other improvements in related sub-tests

Nicolas B. Pierron [:nbp] {away}

Updated

•

2 years ago

Regressions: 1845671

Steve Fink [:sfink] [:s:]

Assignee

Updated

•

2 years ago

Regressions: 1847369

Andrew McCreight [:mccr8]

Updated

•

2 years ago

Regressions: 1847668

Nicolas Chevobbe [:nchevobbe]

Updated

•

2 years ago

Regressions: 1885819

Jan de Mooij [:jandem]

Updated

•

2 years ago

No longer regressions: 1885819
