Closed Bug 1343005 Opened 8 years ago Closed 7 years ago

Optimize `Quote` in json.cpp

Categories

(Core :: JavaScript Engine, defect)

defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla57
Performance Impact high
Tracking Status
firefox57 --- fixed

People

(Reporter: evilpies, Assigned: tcsc)

References

(Depends on 1 open bug, Blocks 1 open bug)

Details

Attachments

(2 files)

The Quote function in json.cpp is really slow and currently the biggest overhead in kraken-stringify-tinderbox (bug 1313373). There are a few things we should do: At least reserve the number of characters in the string that we are appending to the buffer. Stop using appendSubString, which makes us iterate over the data twice. Maybe optimize the string escaping with a lookup table. v8 also has an optimization where they check the reserved size in the buffer and if the new string would always fit in there even when every character is escaped they basically use infallibleAppend. I am also curious how well the MaybeOneOf<Latin1CharBuffer, TwoByteCharBuffer> in Stringbuffer is optimized or if we end up with a branch on every append!
(In reply to Tom Schuster [:evilpie] from comment #0) > I am also curious how well the MaybeOneOf<Latin1CharBuffer, > TwoByteCharBuffer> in Stringbuffer is optimized or if we end up with a > branch on every append! Was the patch I posted an improvement? It should get rid of this overhead on every append call by using the Vector directly.
Needinfo'ing evilpie for an asnwer. I assume otherwise this question will go through the cracks.
Flags: needinfo?(evilpies)
When I tried this on the Google Drive test case I didn't see any speedup. I don't quite remember the kraken results, but I don't think it was a significant win.
Flags: needinfo?(evilpies)
Blocks: 1341902
Whiteboard: [qf]
Whiteboard: [qf] → [qf:p3]
This shows up in Speedometer, see for example this profile from a run of about 100 or so of the subtests: http://bit.ly/2s077v9 Bumping to [qf:p1] as a result. I also spent some time looking at this, will file a couple of dependencies.
Whiteboard: [qf:p3] → [qf:p1]
Depends on: 1376678
Depends on: 1376688
Attached file microbenchmark
Given my mistake in bug 1376678, I'm probably too tired right now, but in my profiling of Quote(), I saw the cost of realloc/memmove to show up as too high IMO. This is the micro benchmark I was using which I extracted from the emberjs benchmark of Speedometer. It may be worth even prescanning the string to compute the exact needed buffer length and just allocate it in one go rather than rely on the underlying vector storage here, or perhaps worth experimenting with.
I made a few stabs at this but none made any measurable effect on the micro benchmark. My attempts were: 1) reserve for the quoted string + quotes and use infallible append 2) avoid making two passes over the string by appending each char immediately after determining it doesn't need escaping 3) prevent having to check the type of the StringBuffer repeatedly by having a specialized append that knows it is appending Latin1 chars to a Latin1 buffer in various combinations. I think the compiler is probably inlining and optimizing better than my local changes in Quote and StringBuffer could help.
Assignee: nobody → tchiovoloni
On my machine the attached patch is around 10% faster on the attached microbenchmark (it finishes in ~4500ms instead of ~5000ms). I also ran it against the lined kracken-stringify-tinderbox from bug 1313373, and saw a similar speed up. All only tested on my machine, YMMV, etc. I'd expect it to do better in cases where the string is longer, but it could also use more memory for those. I've added tests to exercise the new edge cases (I might have added them in the wrong place, I'm not sure). I'm not sure if I should have overflow checking on the reserve() computation, it seems like we might want/need it for 32 bit builds, but I could be wrong.
Comment on attachment 8889968 [details] Bug 1343005 - Optimize Quote in json.cpp. It looks like this is slower on windows. Since the fix might require a completely different approach, I'm clearing the review request.
Attachment #8889968 - Flags: review?(jorendorff)
Most of my comments for the above apply to this one as well, although this patch performs better than the previous (between 15% and 30% better than baseline) on all platforms. Unfortunately, the performance gains here are a little fragile, and I tried a decent amount to fix the ugliness of the code now, but this is about as good as I could get it. - The reserve + infallibleAppend route seems to confound MSVC, although MSVC2017 does a decent amount better than 2015 (even though they should have the same optimizer...). Resize and then shrinking works better (but, sensibly, only if we don't initialize the new items). - Using RangedPtr's (or a reference to the Vector) instead of raw pointers made things notably slower, sadly. - As did having the body of InfallibleQuote live inside Quote, which I would have liked to do to prevent anybody from calling it accidentally (hopefully the name is explicit enough that you probably shouldn't use it). Initially I had moved it out so that I could mark the pointers as __restrict and/or the function as __declspec(restrict) -- this *didn't* make any difference that I could see. - I also found the performance to be worse when the inner check was phrased like `c < sizeof(escapeLookup) || !escapeLookup[c]`, or (and/or) when escapeLookup was 128b instead of 256b. - Using a lookup table over a switch statement or conditionals also helped measurably, but that was true of the previous patch as well, and was one of the suggested optimizations. Relevant talos numbers for a patch that is nearly identical, except less clean (and with a couple optimizations that didn't end up making a difference removed): win32: https://treeherder.mozilla.org/perf.html#/comparesubtest?originalProject=mozilla-central&newProject=try&newRevision=08a1f7f2b6a6b75fe132c75515ac2ccef20db628&originalSignature=fcd38e3012e71ceb4e2c3adddf0ee0828cfaa2b5&newSignature=fcd38e3012e71ceb4e2c3adddf0ee0828cfaa2b5&framework=1&selectedTimeRange=172800 win64: https://treeherder.mozilla.org/perf.html#/comparesubtest?originalProject=mozilla-central&newProject=try&newRevision=08a1f7f2b6a6b75fe132c75515ac2ccef20db628&originalSignature=e1ca34204aa47e7ebd4a16b37105e102daf78cb5&newSignature=e1ca34204aa47e7ebd4a16b37105e102daf78cb5&framework=1&selectedTimeRange=172800 linux: https://treeherder.mozilla.org/perf.html#/comparesubtest?originalProject=mozilla-central&newProject=try&newRevision=08a1f7f2b6a6b75fe132c75515ac2ccef20db628&originalSignature=f7a7eb10a31d949a8a02f1cee45a5ef452889bf7&newSignature=f7a7eb10a31d949a8a02f1cee45a5ef452889bf7&framework=1&selectedTimeRange=172800 mac: https://treeherder.mozilla.org/perf.html#/comparesubtest?framework=1&newProject=try&newRevision=08a1f7f2b6a6b75fe132c75515ac2ccef20db628&newSignature=16929a4c0fd1b8860da19115860eed96ce421adc&originalProject=mozilla-central&originalSignature=16929a4c0fd1b8860da19115860eed96ce421adc&selectedTimeRange=172800
(In reply to Thom Chiovoloni [:tcsc] from comment #11) > Most of my comments for the above apply to this one as well To be clear, I mean my concerns in comment #8 around overflow checking. I also seem to have imagined that I mentioned something about increased memory usage, but I don't see that anywhere, but for posterity: the fact that this will cause higher memory usage also concerns me.
> Using RangedPtr's (or a reference to the Vector) instead of raw pointers made things notably slower, sadly. Ugh.
Comment on attachment 8889968 [details] Bug 1343005 - Optimize Quote in json.cpp. https://reviewboard.mozilla.org/r/160766/#review176914 Thanks for this work. r=me with the comments below addressed. ::: js/src/json.cpp:50 (Diff revision 2) > -IsQuoteSpecialCharacter(char16_t c) > -{ > - static_assert('\b' < ' ', "'\\b' must be treated as special below"); > - static_assert('\f' < ' ', "'\\f' must be treated as special below"); > - static_assert('\n' < ' ', "'\\n' must be treated as special below"); > - static_assert('\r' < ' ', "'\\r' must be treated as special below"); > + * Requires that the destination has enough space allocated for src after escaping > + * (that is, `2 + 6 * (srcEnd - srcBegin)` characters). > + */ > +template <typename SrcCharT, typename DstCharT> > +static DstCharT* > +InfallibleQuote(const SrcCharT* srcBegin, const SrcCharT* srcEnd, DstCharT* dstPtr) Please try looking at the generated machine code with and without `RangedPtr`. It's very odd that it wouldn't be effectively the same. :-\ ::: js/src/json.cpp:127 (Diff revision 2) > Quote(JSContext* cx, StringBuffer& sb, JSString* str) > { > JSLinearString* linear = str->ensureLinear(cx); > if (!linear) > return false; > - > + // we check if either has non-latin1 before calling ensure, so that the buffer's Style nits: Please change "we check" to "Check" with a capital letter. Also, add a blank line before this comment. (The style rule, not always observed, is to add a blank line before a comment anywhere except at the start of a block.) ::: js/src/json.cpp:135 (Diff revision 2) > - : Quote<char16_t>(sb, linear); > + if (!sb.ensureTwoByteChars()) > + return false; > + } > + if (linear->hasTwoByteChars()) { > + return Quote<char16_t>(sb.rawTwoByteBuffer(), linear); > + } Style nit: Please remove the curly braces around this one-line statement. ::: js/src/vm/StringBuffer.h:141 (Diff revision 2) > } > > + Latin1CharBuffer& rawLatin1Buffer() { > + MOZ_ASSERT(isLatin1()); > + return latin1Chars(); > + } This method isn't necessary, as latin1Chars() is implemented using MaybeOneOf::ref, which calls MaybeOneOf::as, which asserts the same thing.
Attachment #8889968 - Flags: review?(jorendorff) → review+
Comment on attachment 8889968 [details] Bug 1343005 - Optimize Quote in json.cpp. https://reviewboard.mozilla.org/r/160766/#review176914 I mentioned these in IRC, but for posterity. > Please try looking at the generated machine code with and without `RangedPtr`. It's very odd that it wouldn't be effectively the same. :-\ This turned out to be due to MSVC getting confused due to the RangedPtr being an argument (e.g. abi weirdness). Forcing the function to be inlined lets us avoid this, and gives me the same perf (on my machine) as the raw pointer approach.
Status: NEW → RESOLVED
Closed: 7 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla57
Performance Impact: --- → P1
Whiteboard: [qf:p1]
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: