Closed Bug 602765 Opened 14 years ago Closed 14 years ago

nanojit: in Nativei386.cpp, generate d[b + i<<s] addressing modes in asm_load64() and asm_store64()

Categories

(Core Graveyard :: Nanojit, defect)

x86
macOS
defect
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: n.nethercote, Assigned: n.nethercote)

Details

(Whiteboard: fixed-in-nanojit, fixed-in-tracemonkey, fixed-in-tamarin)

Attachments

(1 file)

A lot like bug 599247.  This will help Kraken a lot, because it will help GETELEM/SETELEM on arrays of doubles.
This patch: 
- Uses SIB addressing modes for asm_load64 and asm_store64 where possible.
- Removes SSE_LDSD because it's dead.
- Modifies the "if (value->isop(LIR_ldd)" case in asm_store64 to only apply
  for non-SSE2 machines -- as far as I can tell, for SSE2 machines the 
  behaviour in that case was the same as the behaviour in the final fallback
  case.
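
For readers unfamiliar with the d[b + i<<s] form, here is a minimal sketch of how an SSE2 double load with a SIB byte is encoded on x86. It is not the code from the patch: the helper name encodeMovsdLoadSib and the Reg enum are hypothetical and used only for illustration. It follows the standard ModRM/SIB layout for `movsd xmm, [base + index*2^s + disp]` (opcode F2 0F 10).

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical register numbering for this sketch (matches x86 encoding).
enum Reg : uint8_t { EAX = 0, ECX = 1, EDX = 2, EBX = 3, ESP = 4, EBP = 5, ESI = 6, EDI = 7 };

// Encode `movsd xmmN, [base + index*(1 << scaleLog2) + disp]`.
// Illustrative only; nanojit's real emitter is structured differently.
std::vector<uint8_t> encodeMovsdLoadSib(uint8_t xmm, Reg base, Reg index,
                                        unsigned scaleLog2, int32_t disp)
{
    assert(xmm < 8 && scaleLog2 <= 3);
    assert(index != ESP);  // index field 100b means "no index register"

    std::vector<uint8_t> out = {0xF2, 0x0F, 0x10};  // MOVSD xmm, m64

    // Choose the shortest displacement form. With mod=00, base=EBP means
    // "disp32, no base", so EBP as a base always needs an explicit disp8.
    uint8_t mod;
    if (disp == 0 && base != EBP)
        mod = 0;  // no displacement
    else if (disp >= -128 && disp <= 127)
        mod = 1;  // 8-bit displacement
    else
        mod = 2;  // 32-bit displacement

    // ModRM: rm=100b signals that a SIB byte follows.
    out.push_back(static_cast<uint8_t>((mod << 6) | (xmm << 3) | 4));
    // SIB: scale (2 bits), index (3 bits), base (3 bits).
    out.push_back(static_cast<uint8_t>((scaleLog2 << 6) | (index << 3) | base));

    if (mod == 1) {
        out.push_back(static_cast<uint8_t>(disp));
    } else if (mod == 2) {
        for (int i = 0; i < 4; i++)  // little-endian disp32
            out.push_back(static_cast<uint8_t>(static_cast<uint32_t>(disp) >> (8 * i)));
    }
    return out;
}
```

With scaleLog2 = 3 the scale is 8, the size of a double, so a single instruction covers the address arithmetic for an array-of-doubles element access instead of a separate shift and add followed by the load.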

Instruction counts for SunSpider are slightly better, for V8 barely changed (so I haven't shown them), and for Kraken much better. SunSpider first, then Kraken:

---------------------------------------------------------------
| millions of instructions executed                           |
| total                        | on-trace (may overestimate)  |
---------------------------------------------------------------
|    90.661    90.630 (------) |    46.858    46.833 (1.001x) | 3d-cube
|    41.949    40.655 (1.032x) |    25.173    23.877 (1.054x) | 3d-morph
|    96.763    96.703 (1.001x) |    42.334    42.325 (------) | 3d-raytrace
|    24.663    24.663 (------) |    12.241    12.241 (------) | access-binary-
|    95.713    95.713 (------) |    85.506    85.506 (------) | access-fannkuc
|    30.507    30.515 (------) |    17.227    17.227 (------) | access-nbody
|    36.523    36.523 (------) |    25.238    25.238 (------) | access-nsieve
|     7.426     7.426 (------) |     3.246     3.246 (------) | bitops-3bit-bi
|    36.814    36.814 (------) |    32.519    32.519 (------) | bitops-bits-in
|    15.856    15.856 (------) |    12.016    12.016 (------) | bitops-bitwise
|    40.369    40.349 (1.001x) |    35.056    35.036 (1.001x) | bitops-nsieve-
|    17.426    17.426 (------) |    13.242    13.242 (------) | controlflow-re
|    81.651    81.651 (------) |    28.986    28.986 (------) | crypto-aes
|    32.170    32.170 (------) |     4.740     4.740 (------) | crypto-md5
|    19.871    19.871 (------) |     6.385     6.385 (------) | crypto-sha1
|    71.892    71.892 (------) |    21.930    21.930 (------) | date-format-to
|    69.944    69.966 (------) |     9.722     9.722 (------) | date-format-xp
|    44.916    43.291 (1.038x) |    31.066    29.441 (1.055x) | math-cordic
|    22.772    22.775 (------) |     6.336     6.336 (------) | math-partial-s
|    22.078    21.574 (1.023x) |    13.409    12.911 (1.039x) | math-spectral-
|    49.510    49.511 (------) |    34.585    34.585 (------) | regexp-dna
|    30.047    30.048 (------) |     9.277     9.277 (------) | string-base64
|    85.976    85.977 (------) |    24.262    24.262 (------) | string-fasta
|   110.426   110.426 (------) |    17.140    17.140 (------) | string-tagclou
|   135.482   135.483 (------) |    20.808    20.808 (------) | string-unpack-
|    43.086    43.087 (------) |     8.448     8.448 (------) | string-validat
-------
|  1354.504  1351.007 (1.003x) |   587.765   584.293 (1.006x) | all

---------------------------------------------------------------
| millions of instructions executed                           |
| total                        | on-trace (may overestimate)  |
---------------------------------------------------------------
|  3299.437  3299.439 (------) |  3023.019  3023.019 (------) | ai-astar
|  3098.288  2818.770 (1.099x) |  1753.803  1474.317 (1.190x) | audio-beat-det
|  1554.794  1371.179 (1.134x) |  1345.192  1161.581 (1.158x) | audio-dft
|  3045.661  2766.562 (1.101x) |  1732.539  1453.455 (1.192x) | audio-fft
|  2782.030  2638.685 (1.054x) |  1855.819  1712.478 (1.084x) | audio-oscillat
|  9020.597  8781.105 (1.027x) |  5733.404  5493.910 (1.044x) | imaging-gaussi
|  3203.332  3146.063 (1.018x) |   885.431   828.133 (1.069x) | imaging-darkro
|  6636.983  5891.407 (1.127x) |  4732.534  3986.964 (1.187x) | imaging-desatu
|   699.861   699.861 (------) |     9.976     9.976 (------) | json-parse-fin
|   489.504   489.504 (------) |     5.926     5.926 (------) | json-stringify
|  1530.744  1530.567 (------) |   731.883   731.651 (------) | stanford-crypt
|   827.973   827.887 (------) |   373.312   373.234 (------) | stanford-crypt
|  1904.672  1904.675 (------) |  1168.008  1168.008 (------) | stanford-crypt
|   566.547   566.420 (------) |   231.819   231.693 (1.001x) | stanford-crypt
-------
| 38660.431 36732.133 (1.052x) | 23582.674 21654.352 (1.089x) | all
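
A note on reading the ratio columns: the numbers are consistent with the first count divided by the second, rounded to three decimals, with dashes shown when that rounds to 1.000. That is an inference from the figures above, not something the measuring script's output states, and the function name formatRatio is hypothetical. A minimal sketch under that assumption:

```cpp
#include <cmath>
#include <cstdio>
#include <string>

// Assumed rule (inferred from the tables, not confirmed): ratio is
// first/second, rounded to 3 decimals; a rounded 1.000 prints as dashes.
std::string formatRatio(double first, double second)
{
    long thousandths = std::lround(first / second * 1000.0);
    if (thousandths == 1000)
        return "(------)";
    char buf[32];
    std::snprintf(buf, sizeof buf, "(%ld.%03ldx)", thousandths / 1000, thousandths % 1000);
    return buf;
}
```

For example, the Kraken "all" row has 38660.431 and 36732.133 total instructions (in millions), which this rule formats as (1.052x).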


Kraken timings are about 1.03-1.05x better overall.  The results are so good because Kraken has lots of array-of-double gets and sets, and this patch really helps with them.
Attachment #482742 - Flags: review?(rreitmai)
Attachment #482742 - Flags: review?(rreitmai) → review+
Looks like this gained about 5-6 ms on SunSpider on AWFY.
http://hg.mozilla.org/mozilla-central/rev/0ec71c535878
Status: ASSIGNED → RESOLVED
Closed: 14 years ago
Resolution: --- → FIXED
http://hg.mozilla.org/tamarin-redux/rev/2ab051e82ac6
Whiteboard: fixed-in-nanojit, fixed-in-tracemonkey → fixed-in-nanojit, fixed-in-tracemonkey, fixed-in-tamarin
Product: Core → Core Graveyard