Closed Bug 364683 Opened 19 years ago Closed 14 years ago

Optimize for args0-7/vars0-7

Categories

(Core :: JavaScript Engine, defect, P2)

RESOLVED WONTFIX

People

(Reporter: brendan, Assigned: dmandelin)

Some day let will dominate var, and we'll want to optimize for a range of let stack slot numbers, but it may not be as small and dense. Not a priority right now, will revisit after initial patch. /be
Status: NEW → ASSIGNED
Priority: -- → P2
Target Milestone: mozilla1.9alpha1 → ---
See bug 441686 which fuses var and local slots. Won't help sparse let slots, but if let replaces var and lets tend to be top-level and few enough, then it won't hurt. /be
Experiments with the inlining prototype suggest that this optimization by itself might reduce times by about 5% or so. It looks like we have about 25 opcodes to spare right now. Based on these results the best would be:

  5 getvar0-getvar4
  2 getargprop1-getargprop2
  1 getvarprop0
  5 setvar0-setvar4
  6 getarg0-getarg5
  3 forvar1-forvar3

bytecode   slot 0   slot 1   slot 2   slot 3   slot 4   slot 5   slot 6   slot 7
========  =======  =======  =======  =======  =======  =======  =======  =======
forvar         27       97      113       52        0        0        0        0
getarg        532      624       36       72       36       36        0        0
getvar       1231      654      639      260      104        0        0        0
setvar        740       74       98       78       78        0        0        0
varinc         38       36       26        0        0        0        0        0
getgvar         1        0        1        0        0        0        0        0
setgvar         3        0        1        0        1        1        1        1
getargpr        0      108      394        0        0        0        0        0
getvarpr      102       11        0       28        0        0        0        0
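To make the proposal easier to check against the numbers, here is a minimal Python sketch that adds up how many of the counted uses in the table the proposed slot-specific opcodes would cover. The counts and ranges are copied from the comment above; the names COUNTS, PROPOSED and coverage are made up for illustration, and the sketch assumes that "getargpr" and "getvarpr" in the table are the truncated forms of getargprop and getvarprop. It says nothing about what the counts measure or about the resulting speedup.

```python
# Counts per bytecode and slot number (slot 0 .. slot 7), copied from the table above.
COUNTS = {
    "forvar":   [27, 97, 113, 52, 0, 0, 0, 0],
    "getarg":   [532, 624, 36, 72, 36, 36, 0, 0],
    "getvar":   [1231, 654, 639, 260, 104, 0, 0, 0],
    "setvar":   [740, 74, 98, 78, 78, 0, 0, 0],
    "varinc":   [38, 36, 26, 0, 0, 0, 0, 0],
    "getgvar":  [1, 0, 1, 0, 0, 0, 0, 0],
    "setgvar":  [3, 0, 1, 0, 1, 1, 1, 1],
    "getargpr": [0, 108, 394, 0, 0, 0, 0, 0],
    "getvarpr": [102, 11, 0, 28, 0, 0, 0, 0],
}

# Proposed specializations as (bytecode, first slot, last slot), from the list above.
# Assumption: getargprop/getvarprop correspond to the getargpr/getvarpr rows.
PROPOSED = [
    ("getvar", 0, 4),
    ("getargpr", 1, 2),
    ("getvarpr", 0, 0),
    ("setvar", 0, 4),
    ("getarg", 0, 5),
    ("forvar", 1, 3),
]


def coverage(counts, proposed):
    """Return (number of new opcodes, covered uses, total uses)."""
    total = sum(sum(row) for row in counts.values())
    covered = sum(sum(counts[op][lo:hi + 1]) for op, lo, hi in proposed)
    new_opcodes = sum(hi - lo + 1 for _, lo, hi in proposed)
    return new_opcodes, covered, total


new_opcodes, covered, total = coverage(COUNTS, PROPOSED)
print(f"{new_opcodes} new opcodes cover {covered} of {total} counted uses "
      f"({covered / total:.1%})")
```

Changing PROPOSED and re-running is a quick way to compare alternative choices within the opcode budget.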
(In reply to comment #2)
> It looks like we have about 25
> opcodes to spare right now.

Note that since the fix for bug 441686, the getvar/getlocal, setvar/setlocal, etc. pairs are implemented with the same code in the interpreter. The pairs can be merged into single bytecodes with some straightforward work in the decompiler. That would release an extra 8 bytecodes. Later I also plan to merge getarg/getvar etc. into single bytecodes, freeing another 8 bytecodes.
(In reply to comment #2)
> Experiments with the inlining prototype suggest that this optimization by
> itself might reduce times by about 5% or so.

Have you considered a version of the bytecodes that uses one byte rather than two for indexing?
I hadn't thought of that, but it's a good idea. Andreas and I also today talked about eliminating the need for multiple reads and byte assembly on 2- and 4-byte operands by aligning them on a boundary where they could be read with a single machine instruction.
On machines with cheap unaligned access (and appropriate endianness?) we could load 16- and 32-bit immediates with a single read today, I think. Might give a sense of the speed gain to be had.
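The difference between assembling an operand from separate byte reads and loading it in one go can be sketched as follows. This is only an illustration in Python, which cannot model machine loads or alignment. It assumes a 16-bit operand stored high byte first immediately after the opcode byte, and the function names are hypothetical. The last function shows why endianness matters for the single-read approach: a little-endian read of the same bytes gives a different value.

```python
import struct


def get_uint16_bytewise(code, pc):
    """Assemble the operand from two separate byte reads (high byte first)."""
    return (code[pc + 1] << 8) | code[pc + 2]


def get_uint16_single_read(code, pc):
    """Fetch the operand as one big-endian 16-bit unit."""
    return struct.unpack_from(">H", code, pc + 1)[0]


def get_uint16_wrong_endian(code, pc):
    """Same single read, but interpreting the bytes as little-endian."""
    return struct.unpack_from("<H", code, pc + 1)[0]


# One hypothetical instruction: opcode 0x56 followed by the operand 0x1234.
code = bytes([0x56, 0x12, 0x34])
print(hex(get_uint16_bytewise(code, 0)))
print(hex(get_uint16_single_read(code, 0)))
print(hex(get_uint16_wrong_endian(code, 0)))
```

The first two calls agree; the third returns the byte-swapped value, which is what a single native load would see on a machine whose byte order does not match the stored operand.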
No load is better than any load, aligned or not. We will liberate bytecode over time, so I say fire for maximum effect. /be
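The three encodings discussed in this thread (two-byte index, one-byte index, and slot-specific opcodes with no operand at all) can be compared with a toy interpreter. This is a hedged sketch, not SpiderMonkey code: the opcode names and numbers (GETVAR, GETVAR8, GETVAR0, GETVAR1, ADD, RETURN) and the run function are invented for illustration.

```python
# Hypothetical opcode numbering for a toy stack machine.
GETVAR, GETVAR8, GETVAR0, GETVAR1, ADD, RETURN = range(6)


def run(code, slots):
    """Interpret `code` against the variable slots and return the result."""
    stack = []
    pc = 0
    while True:
        op = code[pc]
        if op == GETVAR:
            # Generic form: two operand bytes must be read and combined.
            index = (code[pc + 1] << 8) | code[pc + 2]
            stack.append(slots[index])
            pc += 3
        elif op == GETVAR8:
            # One-byte index: a single operand read.
            stack.append(slots[code[pc + 1]])
            pc += 2
        elif op == GETVAR0:
            # Slot-specific form: the slot is implied by the opcode, no operand read.
            stack.append(slots[0])
            pc += 1
        elif op == GETVAR1:
            stack.append(slots[1])
            pc += 1
        elif op == ADD:
            right = stack.pop()
            left = stack.pop()
            stack.append(left + right)
            pc += 1
        elif op == RETURN:
            return stack.pop()
        else:
            raise ValueError(f"unknown opcode {op} at pc {pc}")


# The same program, "return var0 + var1", in the three encodings.
generic = bytes([GETVAR, 0, 0, GETVAR, 0, 1, ADD, RETURN])
one_byte = bytes([GETVAR8, 0, GETVAR8, 1, ADD, RETURN])
specialized = bytes([GETVAR0, GETVAR1, ADD, RETURN])

slots = [40, 2]
for name, program in [("generic", generic), ("one_byte", one_byte),
                      ("specialized", specialized)]:
    print(name, len(program), "bytes ->", run(program, slots))
```

All three programs compute the same value; they differ only in bytecode length and in how many operand bytes the interpreter has to read per variable access.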
(In reply to comment #6)
> On machines with cheap unaligned access (and appropriate endianness?)

Endianness is only relevant for XDR, and XDR can always patch the code when saving/loading scripts to ensure endianness neutrality in the saved images.
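A rough sketch of what such patching could look like: walk the bytecode and reverse the bytes of every multi-byte operand, so that in-memory code can use native byte order while the saved image uses a fixed one. The opcode numbers, the OPERAND_BYTES table and swap_operands are hypothetical and are not the XDR API; a real implementation would take operand widths from the engine's opcode table.

```python
# Hypothetical opcode table: opcode number -> number of operand bytes.
OPERAND_BYTES = {
    0x01: 2,  # e.g. an opcode with a 16-bit index
    0x02: 0,  # an opcode with no operand
    0x03: 4,  # e.g. an opcode with a 32-bit immediate
}


def swap_operands(code, operand_bytes=OPERAND_BYTES):
    """Return a copy of `code` with every operand's bytes reversed."""
    out = bytearray(code)
    pc = 0
    while pc < len(out):
        width = operand_bytes[out[pc]]
        start, end = pc + 1, pc + 1 + width
        out[start:end] = out[start:end][::-1]
        pc = end
    return bytes(out)


# In-memory code with little-endian operands 0x1234 and 0x12345678.
native = bytes([0x01, 0x34, 0x12, 0x02, 0x03, 0x78, 0x56, 0x34, 0x12])
saved = swap_operands(native)      # patched on save
restored = swap_operands(saved)    # patched back on load
print(saved.hex(), restored == native)
```

Because the swap is its own inverse, the same routine can serve for both saving and loading on a machine whose native order differs from the image order; on a machine where they match, no patching is needed.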
Assignee: brendan → dmandelin
Wontfix? We have JITs now and try to minimize our opcode count.
I highly doubt that we still need this.
Status: ASSIGNED → RESOLVED
Closed: 14 years ago
Resolution: --- → WONTFIX