Closed Bug 1020480 Opened 11 years ago Closed 11 years ago

Micro-optimize replaceAllUsesWith

Categories

(Core :: JavaScript Engine: JIT, enhancement)

enhancement
Not set
normal

RESOLVED FIXED
mozilla33

People

(Reporter: sunfish, Assigned: sunfish)

Attachments

(1 file)

Attached patch: rauw.patch
In large compilations, replaceAllUsesWith sometimes shows up in profiles. The attached patch makes it slightly faster, admittedly at the expense of making it more complex. Instead of removing each MUse from the old use list and adding it to the new list one at a time, this just splices the old list's contents into the new list. It still has to update the producer field of each use, but it can skip all the per-node prev/next pointer updates.
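To make the splice idea concrete, here is a minimal sketch in C++. It is not the code from rauw.patch: `Use`, `Def`, `addUse`, and the sentinel-based circular list are hypothetical stand-ins for MUse, MDefinition, and the real use-list type, and appending at the tail of the new list is an arbitrary choice made for the illustration.

```cpp
#include <cassert>

struct Def;  // hypothetical stand-in for a definition (the "producer")

// Hypothetical use node in a circular doubly linked list with a sentinel.
// A freshly constructed node links to itself.
struct Use {
    Use* next = this;
    Use* prev = this;
    Def* producer = nullptr;
};

struct Def {
    Use uses;  // sentinel node; the list is empty when it points at itself

    bool hasUses() const { return uses.next != &uses; }

    // Append one use to the tail of this definition's use list.
    void addUse(Use* u) {
        u->producer = this;
        u->prev = uses.prev;
        u->next = &uses;
        uses.prev->next = u;
        uses.prev = u;
    }

    // Move every use of this definition over to `dom`.
    void replaceAllUsesWith(Def* dom) {
        assert(dom != this);
        if (!hasUses())
            return;

        // Each use still needs its producer retargeted, so one walk over
        // the list remains. No prev/next pointers are touched here.
        for (Use* u = uses.next; u != &uses; u = u->next)
            u->producer = dom;

        // Splice the whole chain onto the tail of dom's list with a
        // constant number of pointer writes.
        Use* first = uses.next;
        Use* last = uses.prev;
        Use* tail = dom->uses.prev;
        tail->next = first;
        first->prev = tail;
        last->next = &dom->uses;
        dom->uses.prev = last;

        // Leave the old list empty.
        uses.next = &uses;
        uses.prev = &uses;
    }
};
```

The remaining loop does one store and one pointer chase per use, so in this sketch the list walk itself is the only per-node cost left.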
Attachment #8434312 - Flags: review?(nicolas.b.pierron)
Assignee: nobody → sunfish
Out of curiosity, do you see any speedups on any asm.js compilation workloads?
Unfortunately, in big cases like sqlite where it matters, replaceAllUsesWith appears to be limited by cache misses traversing the linked list. Here's the hot loop, with profile data:

Before:

       │60:   mov    %rdx,%rax
       │63:   mov    (%rax),%rdx
 42.26 │      mov    0x8(%rax),%rcx
  1.57 │      mov    %rdx,(%rcx)
  9.19 │      mov    (%rax),%rsi
  0.26 │      cmp    %r13,%rdx
  0.26 │      mov    %rcx,0x8(%rsi)
  7.35 │      movq   $0x0,(%rax)
  1.31 │      mov    0x10(%rbx),%rcx
  1.84 │      mov    %rbx,0x10(%rax)
  1.05 │      mov    %rdi,0x8(%rax)
       │      mov    %rcx,(%rax)
  1.31 │      mov    0x10(%rbx),%rcx
  1.31 │      mov    %rax,0x8(%rcx)
  2.10 │      mov    %rax,0x10(%rbx)
  0.79 │    ↑ jne    60

After:

  2.58 │60:   mov    %r13,0x10(%rax)
 12.89 │      mov    (%rax),%rax
 55.59 │      cmp    %rdx,%rax
       │    ↑ jne    60

Less work, but roughly (within noise) the same amount of time waiting.
(In reply to Dan Gohman [:sunfish] from comment #2)
>   2.58 │60:   mov    %r13,0x10(%rax)
>  12.89 │      mov    (%rax),%rax
>  55.59 │      cmp    %rdx,%rax
>        │    ↑ jne    60

What tools give you such output? I don't remember perf giving me anything as precise in terms of instruction profiling.
Attachment #8434312 - Flags: review?(nicolas.b.pierron) → review+
(In reply to Nicolas B. Pierron [:nbp] from comment #3)
> (In reply to Dan Gohman [:sunfish] from comment #2)
> >   2.58 │60:   mov    %r13,0x10(%rax)
> >  12.89 │      mov    (%rax),%rax
> >  55.59 │      cmp    %rdx,%rax
> >        │    ↑ jne    60
>
> What tools give you such output? I don't remember perf giving me anything
> as precise in terms of instruction profiling.

This was from Linux's perf. I disabled the source code view, since that often gets confusing when there's a lot of inlining (hello C++).
Status: NEW → RESOLVED
Closed: 11 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla33