Closed
Bug 1020480
Opened 11 years ago
Closed 11 years ago
Micro-optimize replaceAllUsesWith
Categories
(Core :: JavaScript Engine: JIT, enhancement)
RESOLVED
FIXED
mozilla33
People
(Reporter: sunfish, Assigned: sunfish)
Details
Attachments
(1 file)
patch (2.78 KB), nbp: review+
In large compilations, replaceAllUsesWith sometimes shows up in profiles. The attached patch makes it slightly faster, admittedly at the expense of making it more complex. Instead of removing each MUse from the old use list and adding it to the new list one at a time, the patch splices the old list's contents into the new list. It still has to update each use's producer field, but it can skip all the per-node prev/next pointer updates.
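For illustration only, here is a minimal sketch of the two strategies on a generic intrusive circular doubly linked list. It is not the attached patch: the `Node`, `Use`, and `Def` types and the `replaceAllUsesSlow` / `replaceAllUsesSplice` functions are hypothetical stand-ins for the real MUse and MDefinition machinery, and the choice to splice at the front of the destination list is an assumption.

```cpp
#include <cassert>
#include <cstddef>

struct Def;

// Hypothetical intrusive list node; a lone node points at itself.
struct Node {
    Node* next;
    Node* prev;
    Node() : next(this), prev(this) {}
};

// Hypothetical stand-in for a use: a list node plus a producer back-pointer.
struct Use : Node {
    Def* producer = nullptr;
};

// Hypothetical stand-in for a definition owning a use list (with sentinel).
struct Def {
    Node uses;

    bool hasUses() const { return uses.next != &uses; }

    // Link u at the front of this definition's use list.
    void addUse(Use* u) {
        u->producer = this;
        u->next = uses.next;
        u->prev = &uses;
        uses.next->prev = u;
        uses.next = u;
    }

    // Unlink u from whatever list it is on.
    void removeUse(Use* u) {
        u->prev->next = u->next;
        u->next->prev = u->prev;
        u->next = u->prev = u;
        u->producer = nullptr;
    }

    std::size_t useCount() const {
        std::size_t n = 0;
        for (const Node* p = uses.next; p != &uses; p = p->next)
            ++n;
        return n;
    }
};

// One use at a time: every iteration rewrites several prev/next pointers.
inline void replaceAllUsesSlow(Def* from, Def* to) {
    while (from->hasUses()) {
        Use* u = static_cast<Use*>(from->uses.next);
        from->removeUse(u);
        to->addUse(u);
    }
}

// Splice: the loop only writes the producer field; the list links are
// patched once at the ends, regardless of how many uses there are.
inline void replaceAllUsesSplice(Def* from, Def* to) {
    if (from == to || !from->hasUses())
        return;

    for (Node* n = from->uses.next; n != &from->uses; n = n->next)
        static_cast<Use*>(n)->producer = to;

    Node* first = from->uses.next;
    Node* last = from->uses.prev;

    // Attach the whole chain [first, last] at the front of to's list.
    last->next = to->uses.next;
    to->uses.next->prev = last;
    first->prev = &to->uses;
    to->uses.next = first;

    // Leave the old list empty.
    from->uses.next = from->uses.prev = &from->uses;
}
```

Both functions leave `from` with an empty use list and every moved use pointing at `to`. The splice version still walks every node to update the producer field, so it does less work per node but touches the same memory.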
Attachment #8434312 -
Flags: review?(nicolas.b.pierron)
Updated•11 years ago
Assignee: nobody → sunfish
Comment 1•11 years ago
Out of curiosity, do you see any speedups on any asm.js compilation workloads?
Comment 2•11 years ago
Unfortunately, in big cases like sqlite where it matters, replaceAllUsesWith appears to be limited by cache misses traversing the linked list. Here's the hot loop, with profile data:
Before:
│60: mov %rdx,%rax
│63: mov (%rax),%rdx
42.26 │ mov 0x8(%rax),%rcx
1.57 │ mov %rdx,(%rcx)
9.19 │ mov (%rax),%rsi
0.26 │ cmp %r13,%rdx
0.26 │ mov %rcx,0x8(%rsi)
7.35 │ movq $0x0,(%rax)
1.31 │ mov 0x10(%rbx),%rcx
1.84 │ mov %rbx,0x10(%rax)
1.05 │ mov %rdi,0x8(%rax)
│ mov %rcx,(%rax)
1.31 │ mov 0x10(%rbx),%rcx
1.31 │ mov %rax,0x8(%rcx)
2.10 │ mov %rax,0x10(%rbx)
0.79 │ ↑ jne 60
After:
2.58 │60: mov %r13,0x10(%rax)
12.89 │ mov (%rax),%rax
55.59 │ cmp %rdx,%rax
│ ↑ jne 60
Less work, but roughly (within noise) the same amount of time waiting.
Comment 3•11 years ago
(In reply to Dan Gohman [:sunfish] from comment #2)
> 2.58 │60: mov %r13,0x10(%rax)
> 12.89 │ mov (%rax),%rax
> 55.59 │ cmp %rdx,%rax
> │ ↑ jne 60
What tools give you such output? I don't remember perf giving me anything as precise in terms of instruction profiling.
Updated•11 years ago
Attachment #8434312 -
Flags: review?(nicolas.b.pierron) → review+
Comment 4•11 years ago
(In reply to Nicolas B. Pierron [:nbp] from comment #3)
> (In reply to Dan Gohman [:sunfish] from comment #2)
> > 2.58 │60: mov %r13,0x10(%rax)
> > 12.89 │ mov (%rax),%rax
> > 55.59 │ cmp %rdx,%rax
> > │ ↑ jne 60
>
> What tools give you such output? I don't remember perf giving me anything
> as precise in terms of instruction profiling.
This was from Linux's perf. I disabled the source code view, since that often gets confusing when there's a lot of inlining (hello C++).
Comment 5•11 years ago
Status: NEW → RESOLVED
Closed: 11 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla33