Closed
Bug 664249
Opened 13 years ago
Closed 13 years ago
Step 2 of Typed Array creation speedups
Categories
(Core :: JavaScript Engine, enhancement)
Core
JavaScript Engine
Tracking
()
RESOLVED
FIXED
mozilla8
People
(Reporter: nsm, Assigned: nsm)
References
Details
Attachments
(2 files, 13 obsolete files)
83.37 KB,
patch
|
mrbkap
:
review+
|
Details | Diff | Splinter Review |
49.80 KB,
patch
|
mrbkap
:
review+
|
Details | Diff | Splinter Review |
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.6; rv:2.0.1) Gecko/20100101 Firefox/4.0.1 Build Identifier: This set of patches applies on top of bug 656519 and introduces the following improvements. The TypedArray C++ object held by the JSObject has been dropped in favour of putting all properties in inline slots in the JSObject representing the JS TypedArray. Some changes in the TypedArray public API to hide implementation details. This includes updates to TM and JM to generate proper instructions. Creation time has reduced even more. The actual Array Buffer is still in a JSObject for the Array Buffer and not inlined for really small arrays. Another patch fixes JM to generate code for 'length' property access on TypedArray which was missing before. This is primarily responsible for faster access times when length is referenced in a loop. The uses of the API in Firefox have been fixed, and passes try server - http://tbpl.mozilla.org/?tree=Try&rev=d449d210cc8d Benchmarks: [access] http://pastebin.mozilla.org/1230046 [creation] http://pastebin.mozilla.org/1230047 [large-array-creation] http://pastebin.mozilla.org/1230049 [bocoup] http://pastebin.mozilla.org/1230050 [skinning] http://pastebin.mozilla.org/1230052 System: Darwin macironik.local 10.7.4 Darwin Kernel Version 10.7.4: Mon Apr 18 21:24:17 PDT 2011; root:xnu-1504.14.12~3/RELEASE_X86_64 x86_64 8G RAM, 2.3 Ghz quad core i7 Without patches -------------- Access (3 run mean) -------------- Plain and simple 42.0363333333 -------------- TM 2.86466666667 -------------- JM 14.9576666667 -------------- TM, JM (with loop profiling) 14.9523333333 -------------- Creation (3 run mean) -------------- Plain and simple 2.29966666667 -------------- TM 1.806 -------------- JM 1.95866666667 -------------- TM, JM (with loop profiling) 1.816 -------------- Large arrays (3 run mean) -------------- Plain and simple 5.34633333333 -------------- TM 4.51233333333 -------------- JM 5.29433333333 -------------- TM, JM (with loop profiling) 4.65266666667 -------------- Bocoup (1 run) (time in ms, smaller is better) -------------- Plain and simple Write: 9736 Read: 11742 Loop Copy: 13262 Slice Copy: 192 -------------- TM Write: 618 Read: 539 Loop Copy: 816 Slice Copy: 199 -------------- JM Write: 1279 Read: 1450 Loop Copy: 1878 Slice Copy: 200 -------------- TM, JM (with loop profiling) Write: 691 Read: 1510 Loop Copy: 840 Slice Copy: 203 -------------- Skinning (Skinned vertices per second) (1 run) (larger is better) -------------- Plain and simple 294895.676099 -------------- TM 8700845.82085 -------------- JM 1374831.25382 -------------- TM, JM (with loop profiling) 8030652.68065 With patches -------------- Access (3 run mean) -------------- Plain and simple 42.4646666667 -------------- TM 2.856 -------------- JM 7.53233333333 -------------- TM, JM (with loop profiling) 7.608 -------------- Creation (3 run mean) -------------- Plain and simple 1.37533333333 -------------- TM 0.947 -------------- JM 1.075 -------------- TM, JM (with loop profiling) 0.954333333333 -------------- Large arrays (3 run mean) -------------- Plain and simple 4.64133333333 -------------- TM 4.56033333333 -------------- JM 4.61 -------------- TM, JM (with loop profiling) 4.64733333333 -------------- Bocoup (1 run) (time in ms, smaller is better) -------------- Plain and simple Write: 8997 Read: 11066 Loop Copy: 12567 Slice Copy: 191 -------------- TM Write: 610 Read: 541 Loop Copy: 794 Slice Copy: 192 -------------- JM Write: 1237 Read: 1429 Loop Copy: 1931 Slice Copy: 197 -------------- TM, JM (with loop profiling) Write: 673 Read: 1492 Loop Copy: 818 Slice Copy: 190 -------------- Skinning (Skinned vertices per second) (1 run) (larger is better) -------------- Plain and simple 285527.916846 -------------- TM 8877042.95704 -------------- JM 1448983.59264 -------------- TM, JM (with loop profiling) 8140000.0 Reproducible: Always
Assignee | ||
Comment 1•13 years ago
|
||
Attachment #539271 -
Flags: review?(mrbkap)
Assignee | ||
Comment 2•13 years ago
|
||
Assignee | ||
Comment 3•13 years ago
|
||
Order of patch application is: 1. Apply patches from 656519 2. Apply "Store typed array properties in inline slots" 3. Apply "Fast length access in JM" 4. Apply "Fix TypedArray API changes"
Comment 4•13 years ago
|
||
Jandem, how does this affect bug 663485?
Comment 5•13 years ago
|
||
(In reply to comment #4) > Jandem, how does this affect bug 663485? Bug 663485 inlines JSOP_LENGTH on typed arrays with TI enabled, the length patch here does this in the IC, it's good to have both. The "inline slots" patch does affect bug 663485 but, looking at the JM/TM changes, rebasing will be straight-forward. Nikhil, in bug 663485 we want to optimize typed array access based on type inference. In this case we don't have an actual typed array but we know the typed array type (int8, uint16 etc). The only (minor) change outside JM is changing slotWidth() so that we can calculate the slot width based on the type, without needing an actual typed array object. Are you planning other typed array changes?
Updated•13 years ago
|
Assignee: general → nsm.nikhil
Status: UNCONFIRMED → ASSIGNED
Ever confirmed: true
Comment 6•13 years ago
|
||
Btw I don't think this will affect the performance numbers in bug 663485 much since we'll have to generate the same number of instructions. I think this is expected since bug 656519 and this bug are mostly about speeding up array *creation*. It's hard to say though without measuring.
Assignee | ||
Comment 7•13 years ago
|
||
Jandem, There are no more major planned changes in the TypedArray APIs or implementation. One thing remaining is to try and store the ArrayBuffer data in the inline slots of the TypedArray when the size is really small. This should not affect JIT or TI since the constructors will handle making the data field point to the right place.
Assignee | ||
Updated•13 years ago
|
Assignee | ||
Comment 8•13 years ago
|
||
I may have changed some things due to bug 664577, just being safe by updating
Attachment #539271 -
Attachment is obsolete: true
Attachment #539271 -
Flags: review?(mrbkap)
Assignee | ||
Comment 9•13 years ago
|
||
Attachment #539272 -
Attachment is obsolete: true
Assignee | ||
Comment 10•13 years ago
|
||
Update to use TypedArray non-inline public API
Attachment #539273 -
Attachment is obsolete: true
Assignee | ||
Updated•13 years ago
|
Attachment #539857 -
Flags: review?(mrbkap)
Assignee | ||
Updated•13 years ago
|
Attachment #539859 -
Flags: review?(mrbkap)
Assignee | ||
Updated•13 years ago
|
Attachment #539860 -
Flags: review?(mrbkap)
Assignee | ||
Comment 11•13 years ago
|
||
Attachment #539857 -
Attachment is obsolete: true
Attachment #539857 -
Flags: review?(mrbkap)
Attachment #540894 -
Flags: review?(mrbkap)
Assignee | ||
Comment 12•13 years ago
|
||
updated to apply against latest tracemonkey
Attachment #539859 -
Attachment is obsolete: true
Attachment #539859 -
Flags: review?(mrbkap)
Attachment #540896 -
Flags: review?(mrbkap)
Assignee | ||
Comment 13•13 years ago
|
||
updated to apply against latest tracemonkey. nsContentUtils.cpp has dropped structure clone and the use of typed arrays.
Attachment #539860 -
Attachment is obsolete: true
Attachment #539860 -
Flags: review?(mrbkap)
Attachment #540901 -
Flags: review?(mrbkap)
Assignee | ||
Comment 14•13 years ago
|
||
bump
Comment 15•13 years ago
|
||
Comment on attachment 540894 [details] [diff] [review] Store typed array properties in inline slots Review of attachment 540894 [details] [diff] [review]: ----------------------------------------------------------------- This looks pretty good. Why is getField necessary, though? It seems like you could use getSlot with regular JS values. ::: js/src/jstypedarray.cpp @@ +909,5 @@ > return NULL; > > + *(uint32*)getField(obj, FIELD_TYPE) = ArrayTypeID(); > + > + do { I don't think you need this do {} while(0); loop here. You can just use braces for scoping. @@ +1702,5 @@ > NULL, /* call */ \ > NULL, /* construct */ \ > NULL, /* xdrObject */ \ > NULL, /* hasInstance */ \ > + _typedArray::obj_trace, /* trace */ \ I think this backslash is no longer properly lined up (though this patch view doesn't show it). ::: js/src/jstypedarray.h @@ +173,5 @@ > static JSBool obj_getAttributes(JSContext *cx, JSObject *obj, jsid id, uintN *attrsp); > > static JSBool obj_setAttributes(JSContext *cx, JSObject *obj, jsid id, uintN *attrsp); > > + static void * getField(const JSObject *obj, int field); Nit (here and below): the * should be next to the function name. ::: js/src/jstypedarrayinlines.h @@ +51,5 @@ > return *((JSUint32*) obj->slots); > } > > inline uint8 * > ArrayBuffer::getDataOffset(JSObject *obj) { Pre-existing, but the { should be on its own line. @@ +65,5 @@ > + return (void *) base; > +} > + > +inline JSUint32 > +TypedArray::getLength(const JSObject *obj) { return *((JSUint32 *) getField(obj, FIELD_LENGTH)); } Even for these small inline functions, we should probably use the functionname(args) { return stuff; } style.
Attachment #540894 -
Flags: review?(mrbkap) → review+
Comment 16•13 years ago
|
||
Comment on attachment 540896 [details] [diff] [review] Fast length access in JM Passing this review off to dmandelin since I don't know enough JaegerMonkey stuff to do it.
Attachment #540896 -
Flags: review?(mrbkap) → review?(dmandelin)
Updated•13 years ago
|
Attachment #540901 -
Flags: review?(mrbkap) → review+
Comment 17•13 years ago
|
||
Comment on attachment 540896 [details] [diff] [review] Fast length access in JM Review of attachment 540896 [details] [diff] [review]: ----------------------------------------------------------------- r+ with nit fixed. ::: js/src/methodjit/PolyIC.cpp @@ +853,5 @@ > + masm.loadObjClass(pic.objReg, pic.shapeReg); > + Jump notArray = masm.testClass(Assembler::NotEqual, pic.shapeReg, obj->getClass()); > + > + masm.loadPtr(Address(pic.objReg, offsetof(JSObject, slots)), pic.objReg); > + masm.load32(Address(pic.objReg, sizeof(js::Value)*js::TypedArray::FIELD_LENGTH), pic.objReg); Nit: the js:: aren't needed here because of the 'using namespace js' in this file.
Attachment #540896 -
Flags: review?(dmandelin) → review+
Assignee | ||
Comment 18•13 years ago
|
||
Fixes nits, removes getField. Properties are stored in slots except pointers to buffer and data which are stored raw due to two-byte alignment requirement. (patch has details) JM improvement patch has been folded into this patch after nit fix, approved by dmandelin.
Attachment #540894 -
Attachment is obsolete: true
Attachment #540896 -
Attachment is obsolete: true
Attachment #545264 -
Flags: review?(mrbkap)
Assignee | ||
Comment 19•13 years ago
|
||
Drop fields here as well.
Attachment #540901 -
Attachment is obsolete: true
Attachment #545267 -
Flags: review?(mrbkap)
Assignee | ||
Comment 20•13 years ago
|
||
Attachment #545267 -
Attachment is obsolete: true
Attachment #545267 -
Flags: review?(mrbkap)
Attachment #545269 -
Flags: review?(mrbkap)
Updated•13 years ago
|
Attachment #545264 -
Flags: review?(mrbkap) → review+
Updated•13 years ago
|
Attachment #545269 -
Flags: review?(mrbkap) → review+
Assignee | ||
Comment 21•13 years ago
|
||
properties are stored in slots, but as slots (using jsval methods) instead of treating it as raw memory. byteOffset is not directly added but done every time. TM and JM have been modified appropriately
Attachment #545264 -
Attachment is obsolete: true
Attachment #549527 -
Flags: review?(mrbkap)
Assignee | ||
Comment 22•13 years ago
|
||
Attached the wrong patch before.
Attachment #549527 -
Attachment is obsolete: true
Attachment #549527 -
Flags: review?(mrbkap)
Attachment #549529 -
Flags: review?(mrbkap)
Assignee | ||
Comment 23•13 years ago
|
||
Attachment #545269 -
Attachment is obsolete: true
Attachment #549533 -
Flags: review?(mrbkap)
Comment 24•13 years ago
|
||
Blake, I took a look at the interdiff, and while it seems plausible as adjustments to Bill's patchwork, it's pretty clear to me that without having looked closely at the other 80K of changes I shouldn't do an interdiff review here. :-\ (Much as I might want to say good enough and do it anyway.) So I'll wait for you to get back to this on Monday, I think.
Updated•13 years ago
|
Attachment #549529 -
Flags: review?(mrbkap) → review+
Updated•13 years ago
|
Attachment #549533 -
Flags: review?(mrbkap) → review+
Comment 25•13 years ago
|
||
Backed out while investigating a webgl permaorange on mozilla-inbound.
Comment 26•13 years ago
|
||
And since things went green after the backouts, this bug does seem to be the regressor. Since backing out the two changes attributed to this bug resulted in bustage, the eventual changeset to fix this should include both this bug and bug 667047, in a single revision, looks like.
Comment 27•13 years ago
|
||
I can reproduce the two failures on my Windows box. The failures are in: http://hg.mozilla.org/integration/mozilla-inbound/file/8e4e181a9ce9/content/canvas/test/webgl/conformance/draw-elements-out-of-bounds.html on lines 83 and 87. If I turn on webgl.verbose and futz with the test, the first problem gives this verbose error: DrawElements: bound vertex attribute buffers do not have sufficient size for given indices from the bound element array That corresponds to: http://hg.mozilla.org/mozilla-central/annotate/982a5835fba1/content/canvas/src/WebGLContextGL.cpp#l1558 The second problem hits that as well, but I don't know WebGL nearly well enough to say whether the munging I did to that test file to get a usable line number from it would mean that second error message is bogus. (I copied the evalstr for line 83 into a new line inserted between 82/83, and I did the same for 87 between 86/87.)
Comment 28•13 years ago
|
||
For the first problem, that case is hit when checked_maxIndexInSubArrayPlusOne is 4 and maxAllowedCount is 3. That's because checked_maxIndexPlusOne is 4. That's because maxIndex is 3. And I'm guessing based on the name that that really should be 2. Still looking at this, fumbling my way through slightly. :-)
Comment 29•13 years ago
|
||
Okay, I think the reason for my guess was wrong, but I think I see the problem. It looks like mBoundElementArrayBuffer->FindMaxUbyteElement(), if I trace a little, is returning 3 because mBoundElementArrayBuffer->mData is pointed at the start of new Uint8Array([3, 0, 1, 2]).subarray(1), rather than one into it.
Comment 30•13 years ago
|
||
Which points to WebGLContext::BufferData_array(WebGLenum target, JSObject *wa, WebGLenum usage), which points to this: GLenum error = CheckedBufferData(target, JS_GetTypedArrayByteLength(wa), JS_GetTypedArrayData(wa), usage); Which says that JS_GetTypedArrayData is either not returning the right thing, or it's not being used correctly or something. Blake isn't sure exactly what the semantics of this stuff was supposed to be, offhand, and says he's going to figure it out soon. :-)
Comment 31•13 years ago
|
||
I think JS_GetTypedArrayData isn't returning the right thing, looking at the patches that landed here a little more closely. Blake said he'd check for certain after a dentist appointment this afternoon.
Comment 32•13 years ago
|
||
http://hg.mozilla.org/mozilla-central/rev/794fe3408d3a
Status: ASSIGNED → RESOLVED
Closed: 13 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla8
Comment 33•13 years ago
|
||
http://hg.mozilla.org/mozilla-central/rev/025d0712bfce
Comment 34•13 years ago
|
||
Part of the typed array IC changes here look broken to me. 1.32 + Address data_base(objReg, sizeof(Value) * js::TypedArray::FIELD_DATA); 1.33 + masm.loadPrivate(data_base, objReg); 1.34 + 1.35 + JSObject *tarray = js::TypedArray::getTypedArray(obj); 1.36 + int shift = js::TypedArray::slotWidth(tarray); 1.37 + 1.38 + int byteOffset = js::TypedArray::getByteOffset(tarray); 1.39 + masm.addPtr(Imm32(byteOffset), objReg); There has only been a clasp guard previously, and the byte offset of a typed array can vary across instances of the same kind of array. Why doesn't the typed array have a pointer directly to the base of its data, so that the offset doesn't need to be added every single time the array is accessed? Also, is there documentation on how typed arrays are laid out internally? I'm trying to merge this stuff into the TI branch and grubbing through the code to figure out what's going on is annoying.
Comment 35•13 years ago
|
||
This patch breaks pdf.js. We don't have a reduced test case, but something is clearly borked here. I will back out and post instructions how to reproduce the failure (its pretty easy).
Comment 36•13 years ago
|
||
Go to this page. Works with an 8/3 nightly, broken with this patch (paper isn't rendered). andreasgal.github.com/pdf.js/web/viewer.html Backing out.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Comment 37•13 years ago
|
||
Actually, I am knee deep in conflict hell with the backout so I will file a dependent bug instead.
Updated•13 years ago
|
Status: REOPENED → RESOLVED
Closed: 13 years ago → 13 years ago
Resolution: --- → FIXED
You need to log in
before you can comment on or make changes to this bug.
Description
•