Closed
Bug 1058631
Opened 10 years ago
Closed 10 years ago
Some small nursery GC performance improvements
Categories
(Core :: JavaScript: GC, defect)
Core
JavaScript: GC
Tracking
()
RESOLVED
FIXED
mozilla34
People
(Reporter: jandem, Assigned: jandem)
Details
Attachments
(2 files)
7.06 KB,
patch
|
terrence
:
review+
|
Details | Diff | Splinter Review |
2.80 KB,
patch
|
Details | Diff | Splinter Review |
I'll attach some patches that make the micro-benchmark below at least 10% faster on OS X. They also improve things on Linux and Windows, though especially on Linux the win appears to be smaller. The patches also win a few hundred points each on Octane earley-boyer. function f() { var prevO = null; for (var i=0; i<100000000; i++) { var o = {x: 1, y: 2, z: 3, prev: prevO}; if (i % 8 === 0) prevO = o; } } var t = new Date; f(); print(new Date - t);
Assignee | ||
Comment 1•10 years ago
|
||
* Clang was not inlining Nursery::moveSlotsToTenured and Nursery::moveElementsToTenured into their only caller, I added MOZ_ALWAYS_INLINE to force inlining. * Flags CrashAtUnhandlableOOM as MOZ_NORETURN, to hint to the compiler calls to it are cold and won't return. * Some minor changes to avoid a few dereferences. These are unlikely to matter much, but very straight-forward so I left them in. This patch improves the micro-benchmark in comment 0 from 1604 to 1502 ms on OS X 32-bit.
Attachment #8479080 -
Flags: review?(terrence)
Comment 2•10 years ago
|
||
Since you're micro-optimizing: PodCopy, used in the copy operations, is not clever for most slot arrays, in fact it is fairly strange: For 128 elements or more PodCopy becomes a single memcpy. For less than 128 elements it becomes a loop plus one memcpy call per element. Of course it's possible that a good C++ compiler does something clever with that.
Comment 3•10 years ago
|
||
Comment on attachment 8479080 [details] [diff] [review] Part 1 - Assorted tweaks Review of attachment 8479080 [details] [diff] [review]: ----------------------------------------------------------------- Nice!
Attachment #8479080 -
Flags: review?(terrence) → review+
Assignee | ||
Comment 4•10 years ago
|
||
Clang was emitting a call to a generic memcpy function. This patch adds an AllocKind switch statement so that each of the memcpy calls has a constant "size" argument, and the compiler can do something smarter. For instance, for the FINALIZE_OBJECT0_BACKGROUND switch-case, Clang now emits the following instead of a memcpy call: movsd (%ebx), %xmm0 movsd 8(%ebx), %xmm1 movsd %xmm1, 8(%esi) movsd %xmm0, (%esi)
Attachment #8479123 -
Flags: review?(terrence)
Assignee | ||
Comment 5•10 years ago
|
||
Part 2 also wins about 100 ms on the comment 0 micro-benchmark, 1502 -> 1408 ms on OS X 32-bit.
Assignee | ||
Comment 6•10 years ago
|
||
(In reply to Lars T Hansen [:lth] from comment #2) > Since you're micro-optimizing: PodCopy, used in the copy operations, is not > clever for most slot arrays, in fact it is fairly strange: For 128 elements > or more PodCopy becomes a single memcpy. For less than 128 elements it > becomes a loop plus one memcpy call per element. > > Of course it's possible that a good C++ compiler does something clever with > that. Yeah, I think it's assuming that C++ compilers do something clever with small constant-size memcpy but not with variable-size memcpy (or at least not all compilers).
Comment 7•10 years ago
|
||
I do know that the < 128 elements thing was based on performance measurements; Waldo probably remembers more of the details.
Assignee | ||
Comment 8•10 years ago
|
||
Comment on attachment 8479123 [details] [diff] [review] Part 2 - Specialize memcpy I see a small regression with this patch on OS X x64 and Linux x64. Not sure why it behaves differently on x64 but it's probably not worth the complexity then.
Attachment #8479123 -
Flags: review?(terrence)
Assignee | ||
Comment 9•10 years ago
|
||
https://hg.mozilla.org/integration/mozilla-inbound/rev/0556ceb562e3
https://hg.mozilla.org/mozilla-central/rev/0556ceb562e3
Status: ASSIGNED → RESOLVED
Closed: 10 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla34
You need to log in
before you can comment on or make changes to this bug.
Description
•