Closed Bug 364683 — Opened 19 years ago, Closed 14 years ago

Optimize for args0-7/vars0-7

Categories: Core :: JavaScript Engine, defect, P2
Status: RESOLVED WONTFIX
People: Reporter: brendan; Assigned: dmandelin
Some day let will dominate var, and we'll want to optimize for a range of let stack slot numbers, but it may not be as small and dense. Not a priority right now, will revisit after initial patch.
/be
Updated (Reporter) • 19 years ago

Status: NEW → ASSIGNED
Priority: -- → P2
Updated (Reporter) • 18 years ago

Target Milestone: mozilla1.9alpha1 → ---
Comment 1 (Reporter) • 17 years ago
See bug 441686 which fuses var and local slots. Won't help sparse let slots, but if let replaces var and lets tend to be top-level and few enough, then it won't hurt.
/be
Comment 2 (Assignee) • 17 years ago
Experiments with the inlining prototype suggest that this optimization by itself might reduce times by about 5%. It looks like we have about 25 opcodes to spare right now. Based on these results, the best allocation would be:
5 getvar0-getvar4
2 getargprop1-getargprop2
1 getvarprop0
5 setvar0-setvar4
6 getarg0-getarg5
3 forvar1-forvar3
bytecode    slot 0   slot 1   slot 2   slot 3   slot 4   slot 5   slot 6   slot 7
========    ======   ======   ======   ======   ======   ======   ======   ======
forvar          27       97      113       52        0        0        0        0
getarg         532      624       36       72       36       36        0        0
getvar        1231      654      639      260      104        0        0        0
setvar         740       74       98       78       78        0        0        0
varinc          38       36       26        0        0        0        0        0
getgvar          1        0        1        0        0        0        0        0
setgvar          3        0        1        0        1        1        1        1
getargpr         0      108      394        0        0        0        0        0
getvarpr       102       11        0       28        0        0        0        0
Comment 3 • 17 years ago
(In reply to comment #2)
> It looks like we have about 25
> opcodes to spare right now.

Note that since bug 441686 was fixed, the getvar/getlocal, setvar/setlocal etc. pairs are implemented with the same code in the interpreter. The pairs can be merged into single bytecodes with some straightforward work in the decompiler. That would free an extra 8 bytecodes.
Also, I plan later to merge getarg/getvar etc. into single bytecodes, freeing another 8 bytecodes.
Comment 4 • 17 years ago
(In reply to comment #2)
> Experiments with the inlining prototype suggest that this optimization by
> itself might reduce times by about 5% or so.

Have you considered a variant of these bytecodes that uses one byte rather than two for indexing?
Comment 5 (Assignee) • 17 years ago
I hadn't thought of that, but it's a good idea. Andreas and I also talked today about eliminating the need for multiple reads and byte assembly on 2- and 4-byte operands by aligning them on a boundary where they could be read with a single machine instruction.
Comment 6 • 17 years ago
On machines with cheap unaligned access (and appropriate endianness?) we could load 16- and 32-bit immediates with a single read today, I think. Might give a sense of the speed gain to be had.
Comment 7 (Reporter) • 17 years ago
No load is better than any load, aligned or not. We will liberate bytecode over time, so I say fire for maximum effect.
/be
Comment 8 • 17 years ago
(In reply to comment #6)
> On machines with cheap unaligned access (and appropriate endianness?)

Endianness is only relevant for XDR, and XDR can always patch the code when saving/loading scripts to ensure endianness neutrality in the saved images.
Updated (Reporter) • 16 years ago

Assignee: brendan → dmandelin
Comment 9 • 14 years ago
WONTFIX, now that we have JITs and try to minimize our opcode count?
Comment 10 • 14 years ago
I highly doubt that we still need this.
Status: ASSIGNED → RESOLVED
Closed: 14 years ago
Resolution: --- → WONTFIX