Last Comment Bug 674143 - See if there are any benefits to having the tags for js::Value be 8-bit sign extended values
: See if there are any benefits to having the tags for js::Value be 8-bit sign ...
Status: RESOLVED FIXED
[inbound]
:
Product: Core
Classification: Components
Component: JavaScript Engine (show other bugs)
: unspecified
: x86_64 Linux
: -- normal (vote)
: mozilla8
Assigned To: general
:
: Jason Orendorff [:jorendorff]
Mentors:
Depends on:
Blocks: ARMJSPerf
  Show dependency treegraph
 
Reported: 2011-07-25 18:46 PDT by Marty Rosenberg [:mjrosenb]
Modified: 2011-08-01 07:58 PDT (History)
6 users (show)
See Also:
Crash Signature:
(edit)
QA Whiteboard:
Iteration: ---
Points: ---
Has Regression Range: ---
Has STR: ---


Attachments
Shift the range of tags for js::Value's (5.56 KB, patch)
2011-07-26 15:54 PDT, Marty Rosenberg [:mjrosenb]
luke: review+
Details | Diff | Splinter Review
do the same thing as before, but work on x86 (6.46 KB, patch)
2011-07-29 10:24 PDT, Marty Rosenberg [:mjrosenb]
luke: review+
Details | Diff | Splinter Review

Description Marty Rosenberg [:mjrosenb] 2011-07-25 18:46:55 PDT
presently, to test if a js::Value holds an integer, we compare its upper 32 bits with 0xffff001, which on X86 requires a 5 byte compare instruction, and on ARM
requires 3 separate instructions (effectively, load 0xffff0000 into a temp; mask in 0x1; do comparison),
If we shift the range of tags to be 0xffffff80-0xffffffff, then all of the tags are representable as an 8 bit sign extended value, which can be tested on x86 in a single 2 byte instruction, and on arm in a single 32 byte instruction.
so it looks like it should be a win all around.
Comment 1 Luke Wagner [:luke] 2011-07-25 22:20:51 PDT
Sounds great
Comment 2 Marty Rosenberg [:mjrosenb] 2011-07-26 15:54:09 PDT
Created attachment 548615 [details] [diff] [review]
Shift the range of tags for js::Value's

x86 should immediately get all of the benefits of the new encoding, whereas on ARM, I needed to touch the macro assembler so it emits single instruction cmn (compare negative) for compares against negative numbers.

Looks like the answer is "Yes".
preliminary results are:
=============================================================================

** TOTAL **:      1.054x as fast    17495.0ms +/- 0.8%   16605.9ms +/- 0.2%     significant

=============================================================================

  v8:             1.054x as fast
    crypto:       1.040x as fast
    raytrace:     1.105x as fast
    richards:     1.008x as fast
    splay:        *1.050x as slow*
    deltablue:    1.119x as fast
    earley-boyer: 1.045x as fast
    regexp:       1.018x as fast

=============================================================================

** TOTAL **:           1.114x as fast    3813.4ms +/- 0.1%   3423.6ms +/- 0.1%     significant

=============================================================================

  3d:                  1.099x as fast
    cube:              1.098x as fast
    morph:             1.050x as fast
    raytrace:          1.122x as fast

  bitops:              1.073x as fast
    3bit-bits-in-byte: 1.059x as fast
    nsieve-bits:       1.062x as fast
    bitwise-and:       -             
    bits-in-byte:      1.109x as fast

  math:                1.24x as fast
    cordic:            1.009x as fast
    partial-sums:      1.45x as fast
    spectral-norm:     1.085x as fast

  regexp:              *1.130x as slow*
    dna:               *1.130x as slow*
  string:              1.081x as fast
    tagcloud:          1.075x as fast
    unpack-code:       1.113x as fast
    validate-input:    1.089x as fast
    base64:            *1.015x as slow*
    fasta:             1.059x as fast

  access:              1.146x as fast
    nbody:             1.26x as fast
    binary-trees:      1.066x as fast
    fannkuch:          1.149x as fast
    nsieve:            1.071x as fast

  date:                1.154x as fast
    format-tofte:      1.142x as fast
    format-xparb:      1.167x as fast

  controlflow:         1.114x as fast
    recursive:         1.114x as fast

  crypto:              1.109x as fast
    md5:               1.111x as fast
    aes:               1.112x as fast
    sha1:              1.096x as fast

I am admittedly curious about base64, dna, and splay.  The only reason I can think of for the slowdown is that we take slightly longer to generate cmp's in some cases.
Comment 3 Marty Rosenberg [:mjrosenb] 2011-07-26 16:23:32 PDT
(In reply to comment #2)
 
> x86 should immediately get all of the benefits of the new encoding, whereas
> on ARM, I needed to touch the macro assembler so it emits single instruction
> cmn (compare negative) for compares against negative numbers.
>
In case it was not obvious, the patch includes these changes; the ARM work is done 
> Looks like the answer is "Yes".
> preliminary results are:
these results are for ARM/Linux, running on the pegatron on my desk. I have not tested x86 yet, nor android/more modern ARM chips
Comment 4 Marty Rosenberg [:mjrosenb] 2011-07-27 19:06:33 PDT
Comment on attachment 548615 [details] [diff] [review]
Shift the range of tags for js::Value's

Since I don't have commit access yet, if you could push this to the try server, then to a repo, that would be incredibly useful (or if you can get someone else to do it...)
Comment 5 David Anderson [:dvander] 2011-07-27 20:33:09 PDT
Comment on attachment 548615 [details] [diff] [review]
Shift the range of tags for js::Value's

Luke, could you take a look at the tag changes?
Comment 6 Luke Wagner [:luke] 2011-07-28 13:38:53 PDT
Comment on attachment 548615 [details] [diff] [review]
Shift the range of tags for js::Value's

Looks good to me.  I'm a bit surprised this is all it took ;-).  Per your request, I pushed to try; you can watch the action unfold live:
  http://tbpl.mozilla.org/?tree=Try&rev=67a24913f7d2
Comment 7 Marty Rosenberg [:mjrosenb] 2011-07-29 10:24:22 PDT
Created attachment 549413 [details] [diff] [review]
do the same thing as before, but work on x86

*cough* I only changed the complexity of the checks on 32 bit platforms, and x86 had hard coded the offset between "here is the first byte of the compare instruction" and "here is the last byte of the branch".  Which the space saving patch changed.  It looks like this was the only hard coded offset that was changed (the other one is to check the shape of an object), so the hard coded offset of 12 has been lowered to 9.
Comment 8 Luke Wagner [:luke] 2011-07-29 10:40:13 PDT
Watch the action unfold:
 http://tbpl.mozilla.org/?tree=Try&rev=4dbde781da2f
Comment 9 Luke Wagner [:luke] 2011-07-29 11:08:17 PDT
Passing comment, you'll want to remove trailing whitespace in the patch.
Comment 10 Luke Wagner [:luke] 2011-07-29 15:46:00 PDT
Comment on attachment 549413 [details] [diff] [review]
do the same thing as before, but work on x86

That is very very green.
Comment 12 Marco Bonardo [::mak] 2011-08-01 07:58:34 PDT
http://hg.mozilla.org/mozilla-central/rev/b48fab587b84

Note You need to log in before you can comment on or make changes to this bug.