Specialize big typed arrays with a singleton type

RESOLVED FIXED in mozilla16

Status

Core
JavaScript Engine
RESOLVED FIXED
5 years ago
5 years ago

People

(Reporter: azakai, Assigned: bhackett)

Tracking

(Blocks: 1 bug)

unspecified
mozilla16

Firefox Tracking Flags

(Not tracked)

Details

(Whiteboard: [js:t])

Attachments

(1 attachment)

(Reporter)

Description

5 years ago
In compiled code from Emscripten and other compilers, memory is implemented as a big typed array. Speeding up accesses to that array would improve compiled-code benchmarks across the board. luke and bhackett say:

> we can give big typed arrays a singleton type so
> that we could further specialize get/set element
> paths to individual typed arrays.  That would allow
> us to bake in the element base and hardcode the limit in
> the bounds check.  That takes us down from ~12 ops to do
> a get/set to 3-5.
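
To illustrate the idea in the quote, here is a conceptual sketch in plain JavaScript. It only models the difference between a generic element read and one specialized to a single known array; a real JIT emits machine code, and the names `genericGet`, `specializedGet` and `MEMORY_LENGTH` are hypothetical, not taken from the engine. Indices are assumed to be int32 values.

```javascript
// Conceptual model only: a JIT emits machine code, not JavaScript.
var memory = new Int32Array(1024);

// Generic path: nothing is known about `obj` ahead of time, so every
// access checks the type, loads the length, then does the bounds check.
function genericGet(obj, index) {
  if (!(obj instanceof Int32Array)) {
    throw new TypeError("expected an Int32Array");
  }
  var length = obj.length;           // loaded from the object each time
  if ((index >>> 0) >= length) {     // unsigned compare also rejects negatives
    return undefined;
  }
  return obj[index];
}

// Specialized path: with a singleton type the compiler knows exactly
// which array is accessed, so the length (and, in machine code, the
// base address) can be treated as a constant.
var MEMORY_LENGTH = memory.length;   // stands in for the baked-in limit
function specializedGet(index) {
  if ((index >>> 0) >= MEMORY_LENGTH) {
    return undefined;
  }
  return memory[index];
}

memory[3] = 42;
console.log(genericGet(memory, 3), specializedGet(3));
```

Both functions return the same values; the point is that the specialized one has no type guard and no length load left on its path.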

Updated

5 years ago
Blocks: 710398
(Assignee)

Comment 1

5 years ago
Created attachment 631224 [details] [diff] [review]
basic patch (e758973c6ab1)

Basic patch that gives singleton types to typed arrays allocated in global code, and optimizes GETELEM/SETELEM based on them by baking in lengths and base addresses.  This doesn't do LICM based on such arrays, e.g. we will keep testing the length on accesses in a loop and will keep loading the constant base into a register.  Fixing this wouldn't be horribly complicated, but I don't know if it is necessary (IonMonkey will do better with LICM and CSE), and in any case it should be done in a separate patch.  Times:

var memory = new Int32Array(10000);
function foo() { 
  var n = 0;
  for (var i = 0; i < 10000; i++) {
    for (var j = 0; j < 10000; j++)
      memory[j];
  }
}
foo();

Before: 186
After:  111

var memory = new Int32Array(10000);
function foo() { 
  var n = 0;
  for (var i = 0; i < 10000; i++) {
    for (var j = 0; j < 10000; j++)
      memory[j] = j;
  }
}
foo();

Before: 194
After:  148

This microbenchmark will be pretty sensitive to regalloc etc. and probably not representative of behavior in general.  Wondering how this does on the tests of interest.
Assignee: general → bhackett1024
Attachment #631224 - Flags: review?(dvander)
(Reporter)

Comment 2

5 years ago
Some results:

corrections: unchanged
dlmalloc: 8% faster
fannkuch: 15% faster
fasta: unchanged
memops: 7% faster
primes: unchanged
skinning: 10% faster

So looks very nice.

However, I see no speedup on the larger benchmarks, box2d and bullet (https://github.com/kripken/misc-js-benchmarks), where I was hoping for an improvement. They do plenty of memory accesses, but also a lot of float math. Any ideas?

Comment 3

5 years ago
It would be useful to see what % of element reads/writes are seeing the singleton array type; perhaps something about the large benchmarks is confounding TI.
Attachment #631224 - Flags: review?(dvander) → review+
(Assignee)

Comment 4

5 years ago
I looked at one of the hot functions in the bullet benchmark and the type information is fine and the optimization is performed.  This is a short benchmark though (.8 seconds for me) and seems to be compiling a lot of code; the two hot functions only execute about 1 million times each.  I'm guessing that the performance gains from this patch are being drowned out by compilation costs.  What happens if you make the benchmark longer running?
(Reporter)

Comment 5

5 years ago
I made it run a lot longer, 13-14 seconds. I now get a 3% speedup. Is it surprising that the gain is so small?
(Assignee)

Comment 6

5 years ago
Not really, I guess.  We may just be running into Amdahl's law.  I think that what I mainly need is a more complete understanding of the workflows for how C code (or LLVM bytecode?) gets turned into machine code via our approach and via NaCl's approach and the avenues for moving towards the latter.  Will get on that right after I get back from China.
Whiteboard: [js:t]
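
As a rough illustration of the Amdahl's law point in Comment 6, the sketch below computes the overall speedup when only a fraction of the runtime is made faster. The function name `amdahlSpeedup` and the sample numbers are hypothetical and chosen only for illustration; they are not measurements from this bug.

```javascript
// Amdahl's law: overall speedup when a fraction `p` of the runtime
// is sped up by a factor `s` and the rest is unchanged.
function amdahlSpeedup(p, s) {
  if (p < 0 || p > 1 || s <= 0) {
    throw new RangeError("need 0 <= p <= 1 and s > 0");
  }
  return 1 / ((1 - p) + p / s);
}

// Illustrative inputs (assumed, not measured): if element accesses were
// 10% of the runtime and became 1.5x faster, the whole program would
// gain only a few percent.
var overall = amdahlSpeedup(0.1, 1.5);
console.log(((overall - 1) * 100).toFixed(1) + "% overall speedup");
```

The smaller the share of time spent in the optimized path, the less a large local win shows up in the total, which is one way a big microbenchmark gain can shrink to a few percent on a full application.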
(Assignee)

Comment 7

5 years ago
Pushed, with some tweaks.  This gives singleton types to all typed arrays and data views above a certain limit (10MB), to be more robust against other initialization patterns (e.g. Mandreel apps seem to do their initialization in a function, not in global code).

https://hg.mozilla.org/integration/mozilla-inbound/rev/e3ec1bc37d8c
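
The size-based rule described in Comment 7 can be sketched as a simple predicate. This is only a model of the heuristic as worded in the comment: the name `wouldGetSingletonType`, the constant `SINGLETON_BYTE_LIMIT`, the reading of "10MB" as 10 * 1024 * 1024 bytes, and the strict comparison are all assumptions, not the engine's actual code.

```javascript
// Assumed threshold: the comment says "10MB"; the exact byte count and
// whether the comparison is inclusive are guesses for illustration.
var SINGLETON_BYTE_LIMIT = 10 * 1024 * 1024;

// True for any typed array or DataView whose size exceeds the limit,
// regardless of where in the program it was allocated.
function wouldGetSingletonType(view) {
  return ArrayBuffer.isView(view) && view.byteLength > SINGLETON_BYTE_LIMIT;
}

// A large heap allocated inside a function (the Mandreel-style pattern)
// still qualifies, because only the size is considered.
function makeHeap() {
  return new Int32Array((16 * 1024 * 1024) / 4); // 16 MiB
}

var heap = makeHeap();
var scratch = new Float64Array(1024);            // 8 KiB

console.log(wouldGetSingletonType(heap), wouldGetSingletonType(scratch));
```

Keying on size rather than allocation site is what makes the rule robust to initialization patterns, at the cost of not specializing small arrays.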
Backed out in https://hg.mozilla.org/integration/mozilla-inbound/rev/8e830624d9ee - the debug builds all say (https://tbpl.mozilla.org/php/getParsedLog.php?id=13019803&tree=Mozilla-Inbound) "jit-test/tests/jaeger/recompile/bug651119.js: Assertion failure: !fe->isConstant(), at ../../../js/src/methodjit/FrameState.cpp:1591"
(Assignee)

Comment 9

5 years ago
Oops, last minute cleanup broke things.

https://hg.mozilla.org/integration/mozilla-inbound/rev/195ffaea56ea
https://hg.mozilla.org/mozilla-central/rev/195ffaea56ea
Status: NEW → RESOLVED
Last Resolved: 5 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla16
Depends on: 769433
Depends on: 785776