Closed Bug 656519 Opened 14 years ago Closed 13 years ago

Step 1 of planned improvements to typed arrays

Categories

(Core :: JavaScript Engine, enhancement)

All
macOS
enhancement
Not set
normal

Tracking

()

RESOLVED FIXED

People

(Reporter: nsm, Assigned: nsm)

References

Details

(Whiteboard: fixed-in-tracemonkey)

Attachments

(4 files, 15 obsolete files)

29.03 KB, patch
Details | Diff | Splinter Review
3.51 KB, patch
Details | Diff | Splinter Review
937 bytes, patch
Details | Diff | Splinter Review
26.54 KB, patch
Details | Diff | Splinter Review
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.6; rv:2.0.1) Gecko/20100101 Firefox/4.0.1 Build Identifier: In reference to meta bug 650657, this patch modifies ArrayBuffers to store data in JSObject::slots when possible. TypedArray implementations including serialize/deserialize has been fixed to work for both approaches and all tests pass. There are performance improvements for small arrays, but also a slight decrease in performance for large arrays. Only initialization and access have been tested however. Patch is attached. Reproducible: Always
Since typed array sizes are fixed, could we just give small arrays that are using slots a different class from the large arrays that are not, so that we can keep access to both fast by switching on the different class in the jit and just generating different code? Large typed arrays are pretty commonly used in non-WebGL applications (canvas imagedata, binary blob handling, etc).
Status: UNCONFIRMED → NEW
Ever confirmed: true
I am sorry about not putting a comment in the patch upload, a little confusion about Bugzilla. Here it is. The updated patch now *always* uses JSObject::slots to store arrays of any length, completely discarding the use of a private ArrayBuffer instance. This gets rid of the finalizer as discussed in bug #650657. There are noticeable improvements in array creation time. mrbkap also explained how these slots are different from another kind of fixed slots which is just extra space tacked onto the end of a JSObject. Using fixed slots for small arrays and non-fixed slots for large arrays is something I am going to try now.
IMPORTANT: the current patch has bugs. A new one and test cases are on the way. Sorry about that.
This patch performs the following 1) Use Value *slots in the JSObject to store ArrayBuffer bytes NOTE: This patch still does not use reserved slots to get a fixed chunk of data for 'free'. It calloc's as required. 2) The data is stored length prefixed in slots. So the first 4/8 bytes depending on the architecture store the length of the buffer and the remaining is set to hold the data. 3) The specification[0] says nothing about setting properties on ArrayBuffers so they should behave like normal objects. Since this patch takes over the slots it defines the various callbacks and uses an internal JSObject (stored in the private field). All properties except byteLength get tacked onto it and read back from it. [this was missing from the previous patch leading to data corruption on property assignment. ] This patch introduces *BREAKING* changes (breaks the compile) since certain parts of the content code pertaining to XHR directly access ArrayBuffer implementation details which are changed by this patch. There is no longer a privately held ArrayBuffer object with byteLength and void* data members. Fixing them is easy and will be done once this patch shapes up. Only the spidermonkey tests have been run. There may be certain things I've missed about property behaviour when implementing the callbacks. Looking forward to feedback on that. Now for performance issues: I've done some basic benchmarks which follow $ uname -a Darwin host-4-135.mv.mozilla.com 10.7.3 Darwin Kernel Version 10.7.3: Sun Mar 6 13:37:56 PST 2011; root:xnu-1504.14.2~1/RELEASE_X86_64 x86_64 (8GB RAM, 2.3 Ghz Intel i7) All results are averaged over 3 runs. Interpreter flags: TM enabled (-j) --------- 1) typed-array-creation.js [1] - creates size 5 views of every type a million times each. Without patch 3.627 With patch 2.585 The reduced malloc call does make a difference in small array sizes. 2) small-view-access.js [2] - creates 4 different views of 3-5 elements and tries to access array elements a million times. Without patch 0.285s With patch 0.279s 3) typed-array-creation-large.js [3] - creates size 1024*1024 arrays of every type a 1000 times each. Without patch 4.025 With patch 4.433 I'm not really sure what causes this slowdown. The older code also used a calloc, and this code also uses a calloc. 4) bocoup-benchmark [4] - Float32Array read/write/copy/slice test - slightly modified from http://weblog.bocoup.com/javascript-typed-arrays No patch, with jit Write: 607 Read: 535 Loop Copy: 794 Slice Copy: 194 Patch, with jit Write: 615 Read: 537 Loop Copy: 796 Slice Copy: 188 5) skinning.js [5] - code from bug 622494 just modified to run on SpiderMonkey (s/alert/print) skinned-test-typed.js averaged over 3 runs no patch, with jit skinned vertices per second (bigger is better): 8777492.50749 with patch, with jit skinned vertices per second (bigger is better): 8937022.97702 -------- The improvements are not substantial except the small array allocation since the rest of the code is the same. But this now allows using reserved slots for small arrays so that memory is assigned at the end of the JSObject. That will be the next step but I would really like feedback on the patch till now. [0] http://www.khronos.org/registry/typedarray/specs/latest/ [1] http://pastebin.mozilla.org/1230047 [2] http://pastebin.mozilla.org/1230046 [3] http://pastebin.mozilla.org/1230049 [4] http://pastebin.mozilla.org/1230050 [5] http://pastebin.mozilla.org/1230052
Attachment #532333 - Attachment is obsolete: true
Attachment #533864 - Flags: review?(mrbkap)
Attachment #533864 - Flags: feedback?(gal)
This patch just augments the earlier one to reserve 16 slots and not make heap allocations if the array buffer size is small enough to fit in the slots. as of writing a js::Value is 8 bytes 16 reserved slots = 128 bytes on 64 bit: 8 bytes used for length -> 120 available for use (U)Int8Array - 120 (U)Int16Array - 60 (U)Int32/Float32Array - 30 Float64Array - 15 We could also use 8 reserved slots = 64 bytes on 64 bit: 8 bytes used for length -> 56 available for use (U)Int8Array - 56 (U)Int16Array - 28 (U)Int32/Float32Array - 14 Float64Array - 7 This seems to make a little more sense because WebGL arrays using vectors will usually be Float32Arrays with a size of 3-4 based on 2D or 3D But transformation matrices could be size 16. New benchmarks: Darwin host-6-29.mv.mozilla.com 10.7.3 Darwin Kernel Version 10.7.3: Sun Mar 6 13:37:56 PST 2011; root:xnu-1504.14.2~1/RELEASE_X86_64 x86_64 compares calloced slots vs reserved slots (initial reserve slot size 16) little-typed-array-benchmark-access - creates 4 different views of 3-5 elements and tries to access their array elements a 1000000 times each, averaged over 3 runs Heap, with jit 0.277 Heap, with jit, with JM 0.296 Reserved, with jit 0.277 Reserved, with jit, with JM 0.295 little-typed-array-benchmark-all-types - creates 1000000 size 5 type arrays of every view type - averaged over 3 runs Heap, with jit 2.559 Heap, with jit, with JM 2.57 Reserved, with jit 1.816 Reserved, with jit, with JM 1.823 Small typed array creation time has dropped significantly! \o/ little-typed-array-benchmark-large-arrays - creates 1000 size 1024*1024 arrays of each view, averaged over 3 runs Heap, with jit 4.558 Heap, with jit, with JM 4.543 Reserved, with jit 4.239 Reserved, with jit, with JM 4.353 benchmark-bocoup.js Heap, with jit Write: 595 Read: 484 Loop Copy: 775 Slice Copy: 193 Heap, with jit, with JM Write: 660 Read: 508 Loop Copy: 832 Slice Copy: 192 Reserved, with jit Write: 593 Read: 504 Loop Copy: 798 Slice Copy: 187 Reserved, with jit, with JM Write: 652 Read: 493 Loop Copy: 842 Slice Copy: 192 skinned-test-typed.js averaged over 3 runs Heap, with jit skinned vertices per second (bigger is better): 9153333 Reserved, with jit skinned vertices per second (bigger is better): 9173333 ------------ The next step is to make typed arrays store a 'buffer' internally until it is asked for explicitly.
Attachment #534092 - Flags: review?(mrbkap)
Attachment #534092 - Flags: feedback?(gal)
Comment on attachment 533864 [details] [diff] [review] Store data in slots, delegate properties to private JSObject Review of attachment 533864 [details] [diff] [review]: ----------------------------------------------------------------- This looks pretty good overall. I do have some nitpicks and small things to iron out, though. ::: js/src/jstypedarray.cpp @@ +67,4 @@ > using namespace js; > using namespace js::gc; > > +static const char *byte_length_str = "byteLength"; This probably wants to be part of the runtime's atomState. @@ +136,5 @@ > return true; > } > > +static JSObject * > +delegateObject(JSContext *cx, JSObject *obj) { Two nitpicks: static functions like that have StudyCaps names (so this should be DelegateObject). Also, functions declared at the top levels' { should go on the next line (so that vim's [[ works). @@ +138,5 @@ > > +static JSObject * > +delegateObject(JSContext *cx, JSObject *obj) { > + if (!obj->getPrivate()) { > + JSObject *delegate = JS_NewObject(cx, NULL, NULL, NULL); This should use js_NewNonFunction<WithProto::Class>. Inside the JS engine, we try to use js_* functions as much as possible to avoid the relocation overhead. @@ +139,5 @@ > +static JSObject * > +delegateObject(JSContext *cx, JSObject *obj) { > + if (!obj->getPrivate()) { > + JSObject *delegate = JS_NewObject(cx, NULL, NULL, NULL); > + JS_ASSERT(delegate); Sadly, we can't assert this. @@ +141,5 @@ > + if (!obj->getPrivate()) { > + JSObject *delegate = JS_NewObject(cx, NULL, NULL, NULL); > + JS_ASSERT(delegate); > + if (!delegate) { > + JS_ReportErrorNumber(cx, js_GetErrorMessage, NULL, JSMSG_OUT_OF_MEMORY); Unless an API specifies otherwise, it's responsible for throwing an exception. That means you can simply return false here and let js_NewObject/NewNonFunction take care of error reporting. (Also, as a side note, OUT_OF_MEMORY can be reported more succinctly as js_ReportOutOfMemory(cx)). @@ +171,5 @@ > + * The first 4/8 bytes hold the length. > + * The rest of it is a flat data store for the array buffer. > + */ > + uint32 size = nbytes; > +#ifdef JS_64BIT This code seems to exist twice (here and again down below). It should be pulled out into a function. @@ +206,5 @@ > + *objp = obj; > + return true; > + } > + > + JSObject *delegate = delegateObject(cx, obj); You're missing a null check for the OOM case. @@ +212,5 @@ > + > + // if false, there was an error, so propagate it > + // otherwise, if propp is non-null, the property > + // was found, otherwise it was not > + // found so look in the prototype chain Nit: I'd follow the surrounding style for comments here and use: /* * ... */ Also, please start comments with capital letters and use periods. @@ +1551,1 @@ > JSCLASS_HAS_CACHED_PROTO(JSProto_ArrayBuffer), I think we also need to mark that we're NON_NATIVE here. @@ +1556,4 @@ > EnumerateStub, > ResolveStub, > ConvertStub, > + FinalizeStub, I'd forgotten when I mentioned this, but you can make finalize be null, which is a little clearer. @@ +1755,4 @@ > if (!proto) > return NULL; > > + JS_ASSERT(proto); This assertion seems vacuous in light of the previous if statement.
Attachment #533864 - Flags: review?(mrbkap)
Attached patch Updated to fix review issues (obsolete) — Splinter Review
ArrayBuffer is made non-native and other fixes. Note: this update causes a failure when trying to apply the second patch (use 16 reserved slots) on top.
Attachment #533864 - Attachment is obsolete: true
Attachment #534092 - Attachment is obsolete: true
Attachment #533864 - Flags: feedback?(gal)
Attachment #534092 - Flags: review?(mrbkap)
Attachment #534092 - Flags: feedback?(gal)
No logic changes. Only updated patch so that it applies against changes to previous patch.
Attached file Fix ArrayBuffer API changes (obsolete) —
This patch fixes the change in ArrayBuffer implementation in non-Spidermonkey code that uses ArrayBuffers in C++. Since ArrayBuffer is now a 'opaque' JSObject, property access is only through the getters.
Looks great. You have to update the uuid when you change the IDL.
Attached patch Fix ArrayBuffer API changes (obsolete) — Splinter Review
update UUID in IDL file.
Attachment #535374 - Attachment is obsolete: true
Fix typo on 32 bit systems.
Attachment #534946 - Attachment is obsolete: true
Comment on attachment 534944 [details] [diff] [review] Updated to fix review issues This looks good (were you waiting for my review here?). I have two nits that I can fix before checkin (though I'm assuming you consider this ready for checkin, please do verify!). >+static inline void >+AllocateSlots(JSContext *cx, JSObject *obj, uint32 size) >+{ >+ uint32 bytes = size; >+#ifdef JS_64BIT >+ bytes += sizeof(uint64); >+ obj->slots = (js::Value *)cx->calloc_(bytes); >+#else >+ bytes += sizeof(uint32); >+ obj->slots = (js::Value *)cx->calloc_(bytes); >+#endif >+ *((uint32*)obj->slots) = size; >+} This can't ignore the return value of calloc_... So this needs to return a boolean indicating whether or not we ran out of memory. >-ArrayBuffer::freeStorage(JSContext *cx) >-{ >- if (data) { >- cx->free_(data); >-#ifdef DEBUG >- // the destructor asserts that data is 0 in debug builds >- data = NULL; >-#endif >- } >-} >- > ArrayBuffer::~ArrayBuffer() > { >- JS_ASSERT(data == NULL); >+} The comment and the diff seem to be at odds here. >+ArrayBuffer::obj_lookupProperty(JSContext *cx, JSObject *obj, jsid id, >+ JSObject **objp, JSProperty **propp) >+{ ... >+ JSBool delegateResult = delegate->lookupProperty(cx, id, objp, propp); >+ >+ /* If false, there was an error, so propagate it. >+ * Otherwise, if propp is non-null, the property >+ * was found. Otherwise it was not >+ * found so look in the prototype chain. >+ */ >+ if (!delegateResult) >+ return false; No need for delegateResult here. The rest looks awesome... We should get it in!
Attachment #534944 - Flags: review+
Comment on attachment 535382 [details] [diff] [review] Fix ArrayBuffer API changes This also looks good.
Attachment #535382 - Flags: review+
(In reply to comment #15) > Comment on attachment 534944 [details] [diff] [review] [review] > Updated to fix review issues > > This looks good (were you waiting for my review here?). I have two nits that > I can fix before checkin (though I'm assuming you consider this ready for > checkin, please do verify!). > > >+static inline void > >+AllocateSlots(JSContext *cx, JSObject *obj, uint32 size) > >+{ > >+ uint32 bytes = size; > >+#ifdef JS_64BIT > >+ bytes += sizeof(uint64); > >+ obj->slots = (js::Value *)cx->calloc_(bytes); > >+#else > >+ bytes += sizeof(uint32); > >+ obj->slots = (js::Value *)cx->calloc_(bytes); > >+#endif > >+ *((uint32*)obj->slots) = size; > >+} > > This can't ignore the return value of calloc_... So this needs to return a > boolean indicating whether or not we ran out of memory. How about using bytes += sizeof(size_t) here instead of the ugly ifdef?
(In reply to comment #17) > (In reply to comment #15) > > Comment on attachment 534944 [details] [diff] [review] [review] [review] > > Updated to fix review issues > > > > This looks good (were you waiting for my review here?). I have two nits that > > I can fix before checkin (though I'm assuming you consider this ready for > > checkin, please do verify!). > > > > >+static inline void > > >+AllocateSlots(JSContext *cx, JSObject *obj, uint32 size) > > >+{ > > > >+ uint32 bytes = size; > > >+#ifdef JS_64BIT > > >+ bytes += sizeof(uint64); > > >+ obj->slots = (js::Value *)cx->calloc_(bytes); > > >+#else > > >+ bytes += sizeof(uint32); > > >+ obj->slots = (js::Value *)cx->calloc_(bytes); > > >+#endif > > >+ *((uint32*)obj->slots) = size; > > >+} > > > > This can't ignore the return value of calloc_... So this needs to return a > > boolean indicating whether or not we ran out of memory. > > How about using bytes += sizeof(size_t) here instead of the ugly ifdef? I have just replaced this to use uint64 on all platforms, because certain assertions in other parts of the code stress byte-alignment to 8 byte boundaries.
(In reply to comment #15) > Comment on attachment 534944 [details] [diff] [review] [review] > Updated to fix review issues > > This looks good (were you waiting for my review here?). I have two nits that > I can fix before checkin (though I'm assuming you consider this ready for > checkin, please do verify!). > > >+static inline void > >+AllocateSlots(JSContext *cx, JSObject *obj, uint32 size) > >+{ > >+ uint32 bytes = size; > >+#ifdef JS_64BIT > >+ bytes += sizeof(uint64); > >+ obj->slots = (js::Value *)cx->calloc_(bytes); > >+#else > >+ bytes += sizeof(uint32); > >+ obj->slots = (js::Value *)cx->calloc_(bytes); > >+#endif > >+ *((uint32*)obj->slots) = size; > >+} > > This can't ignore the return value of calloc_... So this needs to return a > boolean indicating whether or not we ran out of memory. > > >-ArrayBuffer::freeStorage(JSContext *cx) > >-{ > >- if (data) { > >- cx->free_(data); > >-#ifdef DEBUG > >- // the destructor asserts that data is 0 in debug builds > >- data = NULL; > >-#endif > >- } > >-} > >- > > ArrayBuffer::~ArrayBuffer() > > { > >- JS_ASSERT(data == NULL); > >+} > > The comment and the diff seem to be at odds here. > > >+ArrayBuffer::obj_lookupProperty(JSContext *cx, JSObject *obj, jsid id, > >+ JSObject **objp, JSProperty **propp) > >+{ > ... > >+ JSBool delegateResult = delegate->lookupProperty(cx, id, objp, propp); > >+ > >+ /* If false, there was an error, so propagate it. > >+ * Otherwise, if propp is non-null, the property > >+ * was found. Otherwise it was not > >+ * found so look in the prototype chain. > >+ */ > >+ if (!delegateResult) > >+ return false; > > No need for delegateResult here. > > The rest looks awesome... We should get it in! There are mochikit test no. 2 failures on running these patches on the try server (http://tbpl.mozilla.org/?tree=Try&rev=f25ce3ee82e4). These ran late Friday night, I will fix them today and then have it ready for checkin.
(In reply to comment #15) > Comment on attachment 534944 [details] [diff] [review] [review] > Updated to fix review issues > > This looks good (were you waiting for my review here?). I have two nits that > I can fix before checkin (though I'm assuming you consider this ready for > checkin, please do verify!). > > >+static inline void > >+AllocateSlots(JSContext *cx, JSObject *obj, uint32 size) > >+{ > >+ uint32 bytes = size; > >+#ifdef JS_64BIT > >+ bytes += sizeof(uint64); > >+ obj->slots = (js::Value *)cx->calloc_(bytes); > >+#else > >+ bytes += sizeof(uint32); > >+ obj->slots = (js::Value *)cx->calloc_(bytes); > >+#endif > >+ *((uint32*)obj->slots) = size; > >+} > > This can't ignore the return value of calloc_... So this needs to return a > boolean indicating whether or not we ran out of memory. Fixed > > >-ArrayBuffer::freeStorage(JSContext *cx) > >-{ > >- if (data) { > >- cx->free_(data); > >-#ifdef DEBUG > >- // the destructor asserts that data is 0 in debug builds > >- data = NULL; > >-#endif > >- } > >-} > >- > > ArrayBuffer::~ArrayBuffer() > > { > >- JS_ASSERT(data == NULL); > >+} > > The comment and the diff seem to be at odds here. This code has been removed altogether. Have I missed something? > > >+ArrayBuffer::obj_lookupProperty(JSContext *cx, JSObject *obj, jsid id, > >+ JSObject **objp, JSProperty **propp) > >+{ > ... > >+ JSBool delegateResult = delegate->lookupProperty(cx, id, objp, propp); > >+ > >+ /* If false, there was an error, so propagate it. > >+ * Otherwise, if propp is non-null, the property > >+ * was found. Otherwise it was not > >+ * found so look in the prototype chain. > >+ */ > >+ if (!delegateResult) > >+ return false; > > No need for delegateResult here. > > The rest looks awesome... We should get it in! Since we are tacking properties onto the delegate, shouldn't we check it when a lookup is requested?
Attached file Updated to fix review issues (obsolete) —
This change fixes: 1. Catch a failed calloc and return false and handle it where it is used and return NULL there. 2. Fix a change in a test. The previous patch had modified a test which shouldn't have been done. This patch reverts that and ensures that ArrayBuffer.prototype cannot be serialized. The same principle as the typed array classes has been used by having a fastClass and a slowClass and setting ArrayBuffer.prototype to slowClass while ArrayBuffer::create sets actual objects to fastClass.
Attached patch Updated to fix review issues (obsolete) — Splinter Review
Updated to fix review issues This change fixes: 1. Catch a failed calloc and return false and handle it where it is used and return NULL there. 2. Fix a change in a test. The previous patch had modified a test which shouldn't have been done. This patch reverts that and ensures that ArrayBuffer.prototype cannot be serialized. The same principle as the typed array classes has been used by having a fastClass and a slowClass and setting ArrayBuffer.prototype to slowClass while ArrayBuffer::create sets actual objects to fastClass.
Attachment #534944 - Attachment is obsolete: true
Attachment #536365 - Attachment is obsolete: true
Attached patch Fix ArrayBuffer API changes (obsolete) — Splinter Review
Blake, this patch can be checked in now. Try server results are at http://tbpl.mozilla.org/?tree=Try&rev=2b32f9f1d93b Failures do not seem to be something to do with my changes.
Attachment #535382 - Attachment is obsolete: true
Attached patch Fix ArrayBuffer API changes (obsolete) — Splinter Review
Update to fix 2 conflicts in merge from mozilla-central.
Attachment #537005 - Attachment is obsolete: true
Attached patch Fix ArrayBuffer API changes (obsolete) — Splinter Review
Fix compile issues.
Attachment #539245 - Attachment is obsolete: true
Assignee: general → nsm.nikhil
Why are we adding jsobj.h includes all over content code, exactly?
r-=me, over my dead body. I worked really hard to get that **** out of the include dependencies.
What would be a way to fix this while still having the optimizations?
Ideally, we'd have a public API for manipulating ArrayBuffers and typed arrays and keep the inlined versions for use inside the engine.
> What would be a way to fix this while still having the optimizations? Put the inline stuff (which is not used by content, right?) into a jstypedarrayinlines.h that's not public and have that include jsobj.h?
Depends on: 665355
Depends on: 665356
Status: NEW → RESOLVED
Closed: 13 years ago
Resolution: --- → FIXED
Depends on: 665914
Depends on: 669005
Blocks: 677854
No longer blocks: 677854
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: