Closed
Bug 591972
Opened 14 years ago
Closed 14 years ago
JM: generate inline code for JSOP_TABLESWITCH
Categories
(Core :: JavaScript Engine, defect)
Core
JavaScript Engine
Tracking
()
RESOLVED
FIXED
People
(Reporter: jandem, Assigned: jandem)
References
Details
(Whiteboard: fixed-in-tracemonkey)
Attachments
(1 file, 5 obsolete files)
14.55 KB,
patch
|
dvander
:
review+
|
Details | Diff | Splinter Review |
JSNES has *many* calls to stubs::TableSwitch. I will try to write a hackish patch to generate inline code for this and measure if it's worthwhile. This won't help SS/V8, as only deltablue has about 600 calls to stubs::TableSwitch.
Assignee | ||
Updated•14 years ago
|
Assignee: general → jandemooij
Status: NEW → ASSIGNED
Comment 1•14 years ago
|
||
Hmm... I thought I implemented this in the last JM and sstangl had ported it. sstangl, do you have a patch lying around?
The only tricky part is going to be allocating the big table buffer and relocating the addresses inside it. Might need to extend the Assembler.
Comment 3•14 years ago
|
||
I think I just used DataLabel32s and patched them up in the previous JM iteration.
Assignee | ||
Comment 4•14 years ago
|
||
Consider this microbenchmark: ---------- function f(a) { var t0 = new Date; for(var i=0; i<10000000; i++) { var b; switch(a) { case 1: b = 2; break; case 2: b = 3; break; case 3: case 4: b = 5; break; case 5: b = 6; break; case 6: b = 7; break; case 7: b = 8; break; case 8: b = 9; break; case 9: b = 10; break; } } print(new Date - t0); } f(7); ---------- We get the following times, after subtracting the empty loop times: - JM before: 161 ms. - JM after: 36 ms. - TM: 70 ms. - V8: 60 ms. (note that V8 does not generate a jump table; changing f(7) to f(0) makes V8 faster, likely O(n)) Generated asm looks like this: 0xf7fc42ea: cmpl $0xffff0001,0x74(%ebx) 0xf7fc42f1: jne 0xf7fc46d2 0xf7fc42f7: mov $0xf7fc4a33,%ecx 0xf7fc42fc: sub $0x1,%edx 0xf7fc42ff: cmp $0x9,%edx 0xf7fc4302: jae 0xf7fc43a3 0xf7fc4308: jmp *(%ecx,%edx,4) Running JSNES for a few seconds results in millions of stubcalls to JSOP_TABLESWITCH. This patch eliminates them, but performance win seems minimal (hard to see though, FPS fluctuates a lot). So, I'm not sure if this is worthwhile. Maybe after FF4 or for some unknown benchmark?
Assignee | ||
Comment 5•14 years ago
|
||
Simplify the code a bit. This is not commit-ready, it has some small hacks and a compile warning (about function-pointer to void* cast). But good enough to measure performance. Also, feel free to close this as WONTFIX.
Attachment #470767 -
Attachment is obsolete: true
Comment 6•14 years ago
|
||
It's a good idea: I think we will want to take this, after we have landed JM and sorted all that out.
I'm curious why we're so much slower right now - if other engines don't fast-path either.
Assignee | ||
Comment 8•14 years ago
|
||
(In reply to comment #7) > I'm curious why we're so much slower right now - if other engines don't > fast-path either. V8 does not generate a jump table, but it does generate a comparison for each case label (hence the O(n) behavior). TM also generates a jump table. JSC seems to do a stub call to a jumptable-stub, but I don't have a standalone JSC shell on Linux to test its performance.
Assignee | ||
Comment 9•14 years ago
|
||
Updated + cleaned up a bit. This works on x86 and x64; for ARM it still does a stub call because we need to implement masm.jump(BaseIndex) for ARM. There are already tests in jit-test/test/basic/test{Table}Switch*.js for JSOP_TABLESWITCH; so I just added some tests that were missing there.
Attachment #470770 -
Attachment is obsolete: true
Attachment #487633 -
Flags: review?(dvander)
Comment on attachment 487633 [details] [diff] [review] Patch Sorry for the delay here... nice patch! >+ for (size_t i = 0; i < jumpTableOffsets.length(); i++) { >+ ptrdiff_t offset = jumpTableOffsets[i]; uint32 here since the offset is absolute. >+ for (uint32 i = 0; i < numJumps; i++) { >+ uint32 target = GET_JUMP_OFFSET(pc); >+ if (!target) >+ target = defaultTarget; >+ ptrdiff_t offset = (originalPC + target) - script->code; also here, and >+ js::Vector<ptrdiff_t, 16> jumpTableOffsets; here. Would also be good to have a copy of the switch (3.14) test cases where the input is not a constant, that is, there's no test that takes the OOL stubs::TableSwitch path. There should also be a test with negative switch indexes. |low| and |high| can be negative. stubs::TableSwitch and the code here (which store as uint32) work because of overflow which is a little subtle. I would either change jsop_tableswitch to use signed ranges, or put a big comment explaining how it works. r=me with those addressed
Attachment #487633 -
Flags: review?(dvander) → review+
Assignee | ||
Comment 11•14 years ago
|
||
New version: s/ptrdiff_t/uint32, added another #define so we don't break compilation on ARM, added tests, changed uint32 to jsint (like tracer and interpreter)
Attachment #487633 -
Attachment is obsolete: true
Attachment #492716 -
Flags: review?(dvander)
Updated•14 years ago
|
Attachment #492716 -
Flags: review?(dvander) → review+
Updated•14 years ago
|
Keywords: checkin-needed
Assignee | ||
Comment 12•14 years ago
|
||
Anybody willing to land this? It's a measurable speedup for typical interpreter/emulator-like switch-statements, and switch-statements usually don't trace well.
blocking2.0: --- → ?
Comment 13•14 years ago
|
||
We'll land this after b8 is done. Thanks for your work.
Comment 14•14 years ago
|
||
So wait, are switches faster than this.OPCODES[op](this) ? Because my own emulator (GameBoy Color) relies on this. Does firefox optimize for an array of anonymous functions? It was hard for me to tell for some reason. Does firefox's performance vary majorly based on the case order presented in a switch (might be why I'm confused).
Assignee | ||
Comment 15•14 years ago
|
||
Rebased + make jsop_tableswitch a no-op when there are no cases (tracer does this too). The previous version handled this correctly, but this may prevent problems in the future (JumpTable.offsetIndex pointing to an invalid index is not very intuitive). Sorry for the review noise...
Attachment #492716 -
Attachment is obsolete: true
Attachment #495479 -
Flags: review?(dvander)
Assignee | ||
Comment 16•14 years ago
|
||
(In reply to comment #14) > So wait, are switches faster than this.OPCODES[op](this) ? Because my own > emulator (GameBoy Color) relies on this. With this patch, I think (need to measure to be sure) a table switch(*) is faster than an array, because we don't have to 1) check that this.OPCODES is an array 2) call a separate function. > Does firefox optimize for an array of anonymous functions? I don't think so, but I might be wrong. > Does firefox's > performance vary majorly based on the case order presented in a switch (might > be why I'm confused). About the case order, it depends on the switch statement. For a table switch the case order should not matter. But for other switch statements it does matter because we compare the value to each case label in the same order. * A table switch is a switch with "constant integer cases"; this patch optimizes this to a jump table (when highest value - lowest value <= 256 to prevent excessive memory use)
Comment on attachment 495479 [details] [diff] [review] Patch v3 >+ */ >+ if (numJumps == 0) { >+ frame.pop(); >+ return; >+ } >+ FrameEntry *fe = frame.peek(-1); r=me with a newline after the } here, and a test case that asserts the default label in an empty switch gets executed. >+ return; >+ } >+ RegisterID dataReg; >+ } else { >+ dataReg = frame.copyDataIntoReg(fe); >+ } >+ RegisterID reg = frame.allocReg(); While you're back in the area, could use newlines after } in these places too :)
Attachment #495479 -
Flags: review?(dvander) → review+
Updated•14 years ago
|
Assignee | ||
Comment 18•14 years ago
|
||
Rebased + addressed review comments. Had to use jumpMap now that nmap is gone.
Attachment #495479 -
Attachment is obsolete: true
Attachment #497813 -
Flags: review?(dvander)
Updated•14 years ago
|
Attachment #497813 -
Flags: review?(dvander) → review+
Assignee | ||
Comment 19•14 years ago
|
||
Comment on attachment 497813 [details] [diff] [review] Patch v4 Asking approval2.0. Would be nice to have this in beta 9.
Attachment #497813 -
Flags: approval2.0?
Comment 20•14 years ago
|
||
http://hg.mozilla.org/tracemonkey/rev/673ae0e2f656
Comment 21•14 years ago
|
||
Comment on attachment 497813 [details] [diff] [review] Patch v4 Pretty sure approval wasn't the problem, since nothing else landing in TM has approval.
Attachment #497813 -
Flags: approval2.0?
Comment 22•14 years ago
|
||
http://hg.mozilla.org/mozilla-central/rev/673ae0e2f656
Status: ASSIGNED → RESOLVED
Closed: 14 years ago
Resolution: --- → FIXED
You need to log in
before you can comment on or make changes to this bug.
Description
•