Closed
Bug 961821
Opened 10 years ago
Closed 10 years ago
Optimize writes to Typed Objects
Categories
(Core :: JavaScript Engine: JIT, defect)
Tracking
()
RESOLVED
FIXED
mozilla29
People
(Reporter: jaswanth.sreeram, Assigned: pnkfelix)
References
Details
Attachments
(5 files, 2 obsolete files)
1.64 KB,
text/html
|
Details | |
5.23 KB,
text/html
|
Details | |
2.68 KB,
application/x-javascript
|
Details | |
8.70 KB,
patch
|
Details | Diff | Splinter Review | |
6.94 KB,
patch
|
Details | Diff | Splinter Review |
User Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:29.0) Gecko/20100101 Firefox/29.0 (Beta/Release) Build ID: 20140118030204 Steps to reproduce: Open the attached test case in Nightly 29.0a1 (2014-01-18). Actual results: Copying the Typed Object into another takes ~10 seconds. Expected results: Copying the Typed Object should take less time. For comparison, copying a JS Array into another takes 70ms.
Reporter | ||
Comment 1•10 years ago
|
||
If elements in the Typed Object array are simply summed (instead of copied to another Typed Object), performance is much better - 27ms (vs. 27ms for doing the same with JS Arrays).
Reporter | ||
Comment 2•10 years ago
|
||
Here is another test case that compares performance of copying data for different typed objects. The cases compared are 1) A 2D Array type of structs 2) A 3D Array type of scalars 3) A 1D Array type of scalars 4) A 1D JS Array of scalars I see the following results on my machine (i7-3770k 4-cores, 4GB RAM): 2DStructs took: 55 ms ---- 3DArray took: 1764 ms --- 1D Array took: 801 ms --- JS 1DArray took: 5 ms
Reporter | ||
Comment 3•10 years ago
|
||
Comment 4•10 years ago
|
||
Here is a revision of the html tests as a standalone .js file that one can run in a spidermonkey js shell. I reduced the size of the arrays just to make the benchmark run faster. I also added in two extra loops, one that copies elements from binary-data to js-array (bd2js) and the other direction as well (js2bd), in order to isolate whether the costs are just in writing or are in reading and writing both. Current results: running binary data/array comparison, outer_loop: 15 len: 30720 bd2bd: 2895 bd2js: 66 js2bd: 2833 js2js: 7 So it looks like the big problem here is writing to binary-data array of int, though reading is clearly a problem too.
Comment 5•10 years ago
|
||
generalization of my previous js shell to wrap the array entries in structs (in a manner semi-analogous to attachment 8363946 [details], though I did not attempt to directly imitate that code).
running binary data/array uint8 direct compare, outer_loop: 15 len: 30720
bd2bd: 2998 bd2js: 69 js2bd: 2843 js2js: 7
running binary data/array struct uint8 compare, outer_loop: 15 len: 30720
bd2bd: 68 bd2js: 82 js2bd: 30 js2js: 7
The latter results are odd (the fact that in the struct case js2bd writing-to-bd is faster than bd2js reading-from-bd). Perhaps I have decreased the input len too much and its just noisy. Not sure.
Attachment #8365932 -
Attachment is obsolete: true
Comment 6•10 years ago
|
||
pumping up the len for attachment 8365938 [details] to match Jaswanth's original benchmark yields the following results on my macbook pro:
running binary data/array uint8 direct compare, outer_loop: 15 len: 2073600
bd2bd: 192985 bd2js: 3432 js2bd: 190676 js2js: 46
running binary data/array struct uint8 compare, outer_loop: 15 len: 2073600
bd2bd: 120 bd2js: 3385 js2bd: 51 js2js: 45
(So if anything, that just accentuates the oddity I noted in my previous comment: bd2js is somehow the slowest when one has wrapped in the struct, slower even than bd2bd, while js2bd is clearly taking the lion's share of the blame for the non-struct-wrapped case.)
Comment 7•10 years ago
|
||
(argh I just realized that the results from comment 4, comment 5 and comment 6 were all taken on a build with both --enable-optimize and --enable-debug. So while the numbers may be interesting, they do not reflect what you should see from a proper build that toggles the debug option to --disable-debug.)
Comment 8•10 years ago
|
||
Here's a re-run of the results from comment 6 with a --enable-optimize --disable-debug build of the js shell: running binary data/array uint8 direct compare, outer_loop: 15 len: 2073600 bd2bd: 6777 bd2js: 50 js2bd: 6781 js2js: 40 running binary data/array struct uint8 compare, outer_loop: 15 len: 2073600 bd2bd: 34 bd2js: 48 js2bd: 23 js2js: 39 So not surprisingly, all of the results are considerably improved by cutting out the debug overhead. Now the wrapped-in-struct js2bd and bd2bd are competitive with js2js. For the uint8 direct, js2bd continues to be a smoking gun, which is consistent with jaswanth's earlier findings.
Comment 9•10 years ago
|
||
(In reply to Jaswanth Sreeram from comment #0) > Actual results: > Copying the Typed Object into another takes ~10 seconds. Confirmed in nightly 29.0a1 (2014-01-27), win 7 x64. The results of the testcases: "Binary Data took 27522 ms....JS Arrays took 160ms" "2DStructs took: 83 ms ---- 3DArray took: 2553 ms --- 1D Array took: 1374 ms --- JS 1DArray took: 7 ms"
Status: UNCONFIRMED → NEW
Ever confirmed: true
Comment 10•10 years ago
|
||
facepalm: my work in Bug 933269 only added jit-support for *reading* from TypedObject arrays, not writing into them. So of course writing into the arrays is taking forever: its going through the interpreter.
Comment 11•10 years ago
|
||
It was not too hard to add support for writing into TypedObject arrays, assuming the patch I just hacked up is sound. (It helped a lot to be able to refer to IonBuilder::setPropTryTypedObject and IonBuilder::getElemTryTypedObject while hacking it up; see also Bug 933269 and Bug 898349.)
Comment 12•10 years ago
|
||
much better results on my microbenchmark (binary_data_loop.js) after applying this patch (and pumping up the array length to match jaswanth's original workload). Compare the below against the results from comment 8. % js binary_data_loop.js running binary data/array uint8 direct compare, outer_loop: 15 len: 2073600 bd2bd: 23 bd2js: 56 js2bd: 25 js2js: 44 running binary data/array struct uint8 compare, outer_loop: 15 len: 2073600 bd2bd: 22 bd2js: 52 js2bd: 23 js2js: 47
Comment 13•10 years ago
|
||
Comment on attachment 8366591 [details] [diff] [review] patch A v1: jit-support for writes into TypedObject arrays Niko: drop me an email if you get this and have time to do the review. Otherwise I will reassign. Thanks!
Attachment #8366591 -
Flags: review?(nmatsakis)
Updated•10 years ago
|
Updated•10 years ago
|
Blocks: harmony:typedobjects
Comment 14•10 years ago
|
||
Comment on attachment 8366591 [details] [diff] [review] patch A v1: jit-support for writes into TypedObject arrays (also, I still need to write test cases, along the lines of the ones I wrote for Bug 933269. But I figure that need not block review of patch A.)
Comment 15•10 years ago
|
||
Comment on attachment 8366591 [details] [diff] [review] patch A v1: jit-support for writes into TypedObject arrays Review of attachment 8366591 [details] [diff] [review]: ----------------------------------------------------------------- Look good. Can you add a test of some kind? Ideally using the option to force ion to trigger early, rather than using a ton of iterations.
Attachment #8366591 -
Flags: review?(nmatsakis) → review+
Comment 16•10 years ago
|
||
(In reply to Niko Matsakis [:nmatsakis] from comment #15) > Can you add a test of some kind? Ideally using the option to force ion to > trigger early, rather than using a ton of iterations. Will do
Comment 17•10 years ago
|
||
Wait, are you saying that there have been complaints about the 30000 iterations in the various tests from js/src/jit-test/tests/TypedObject/*.js ? (I vaguely recall seeing a note from shu earlier today somewhere saying something about 30,000 being a little much, but I hadn't really given it much thought until now.) Do you want me to change the other tests in that directory to also not use "large" iteration counts? (Not sure how large 30,000 really is, compared to the workloads I was running earlier today...) I was assuming there was some infrastructural reason why we were not switching on some sort of |--ion-eager| flag in the jit-tests. But I will investigate the matter more carefully now.
Comment 18•10 years ago
|
||
(In reply to Felix S. Klock II [:pnkfelix, :fklock] from comment #17) > (I vaguely recall seeing a note from shu earlier today somewhere saying > something about 30,000 being a little much, but I hadn't really given it > much thought until now.) To be clear: I *saw* the note earlier today. Shu wrote it some days ago.
Comment 19•10 years ago
|
||
Yes, I think 30,000 is too much. I have been meaning to go over and change the tests en-masse. I have been told that --ion-eager should be "good enough" as long as you call the function a few times, but I have empirically found this to be untrue: in fact, at this very moment, I am being bamboozled by a bug that only appears if we run 30000 iterations, and does not if we (e.g.) set the ion threshold to 30 and run 100 iters. That said, I don't quite understand why --ion-eager isn't good enough: I think the idea is that we should jit right away (with insufficient type info) and then re-jit again and again as more type info becomes available.
Comment 20•10 years ago
|
||
Well, I can see right now why --ion-eager would not be a good idea, at least not everywhere: my very first test that I wrote, it exposes a bug somewhere in patch A v1 (attachment 8366591 [details] [diff] [review]) that has to do with invalidation bailouts, but it only occurs after somewhere between 5000 and 6500 iterations. And if you turn on --ion-eager, the invalidation bailouts (as well as the bug) goes away. The concrete test case is this: function foo_u32() { for (var i = 0; i < 8000; i += 4) { var vec = new Vec3u32Type(); // making index non-trivially dependent on |i| to foil compiler optimization vec[(i) % 3] = i; vec[(i + 1) % 3] = i+1; vec[(i + 2) % 3] = i+2; var sum = vec[0] + vec[1] + vec[2]; if (sum != 3*i+3) { print(i) } assertEq(sum, 3*i + 3); } } foo_u32(); and a sample failing instance is this: % IONFLAGS=bailouts ../../../objdir-dbgopt-js/js/src/shell/js tests/TypedObject/jit-write-u32-into-u32-array.js [Bailouts] Took invalidation bailout! Snapshot offset: 336 5436 tests/TypedObject/jit-write-u32-into-u32-array.js:22:4 Error: Assertion failed: got 5438, expected 16311
Comment 21•10 years ago
|
||
There should have been a: var Vec3u32Type = TypedObject.uint32.array(3); in that code I cut-and-pasted, before the definition of foo_u32. Sorry, that's what I get for not using the attachment system. :(
Comment 22•10 years ago
|
||
shu helped me debug my code. I left out the line for converting the index to a byteoffset (and that also doubled as a bounds-check). This *despite* my having claimed to have used IonBuilder::getElemTryTypedObject as a template... sigh. Will upload new version of patch A (and a separate patch with many tests) soon.
Comment 23•10 years ago
|
||
nmatsakis: an update on testing strategy: shu pointed out that the reason that --ion-eager is masking my bug is that the parallel compiles are not finishing before the benchmark completes. --ion-eager --ion-parallel-compile=off is enough to reliably reproduce this bug, and presumably would then let us lower the iteration count. But we do not currently have a way to for --ion-parallel-compile=off in a jit-test (yet). (I've been playing with using |setJitCompilerOption("ion.usecount.trigger", 10);| but AFAICT that is not helping me lower the iteration count for my particular test cases. Maybe I need to refactor the code to let these usecounts matter...)
Comment 24•10 years ago
|
||
(In reply to Felix S. Klock II [:pnkfelix, :fklock] from comment #23) > (I've been playing with using |setJitCompilerOption("ion.usecount.trigger", > 10);| but AFAICT that is not helping me lower the iteration count for my > particular test cases.) I was mistaken here. Setting the usecount trigger (e.g. to a value of 30) does help lower the iteration count. For the tests I'm looking at, varying the usecount trigger doesn't let you lower the iteration count to an arbitrarily small value; i.e. I can't lower the iterations to an absurdly low number like 100. But lowering the usecount trigger to 30 let me reduce the iteration count that would reliably expose the bug from the previous 7000 to now 1000. That's nothing to sneeze at. (Of course, this is all on my local machine... so who knows if these observations generalize to hardware elsewhere and in the future. It would be better IMO to let the tests toggle --ion-parallel-compile=off as well as --ion-eager.)
Comment 25•10 years ago
|
||
fixed bug in pA v1. (one line fix; inheriting review from prior version.)
Attachment #8366591 -
Attachment is obsolete: true
Comment 26•10 years ago
|
||
the aforementioned tests. Mostly modeled after the tests for jit-supports for array reads.
Comment 27•10 years ago
|
||
here's the results of the try run of patchess A (v2) and T (v1): https://tbpl.mozilla.org/?tree=Try&rev=a6d19e83546b I'm going to push to mozlla-inbound next.
Comment 28•10 years ago
|
||
Pushed to inbound: https://hg.mozilla.org/integration/mozilla-inbound/rev/d35364fc05e7 https://hg.mozilla.org/integration/mozilla-inbound/rev/854c4cf1a9f3
Comment 29•10 years ago
|
||
https://hg.mozilla.org/mozilla-central/rev/d35364fc05e7 https://hg.mozilla.org/mozilla-central/rev/854c4cf1a9f3
Assignee: nobody → pnkfelix
Status: NEW → RESOLVED
Closed: 10 years ago
Flags: in-testsuite+
Resolution: --- → FIXED
Target Milestone: --- → mozilla29
You need to log in
before you can comment on or make changes to this bug.
Description
•