Closed
Bug 1146597
Opened 10 years ago
Closed 10 years ago
Use unboxed arrays for JSOP_NEWARRAY arrays
Categories
(Core :: JavaScript Engine, defect)
Tracking
()
RESOLVED
FIXED
mozilla40
Tracking | Status | |
---|---|---|
firefox40 | --- | fixed |
People
(Reporter: bhackett1024, Assigned: bhackett1024)
References
(Blocks 1 open bug)
Details
Attachments
(10 files, 2 obsolete files)
276.09 KB,
patch
|
Details | Diff | Splinter Review | |
80.76 KB,
patch
|
jandem
:
review+
|
Details | Diff | Splinter Review |
38.57 KB,
patch
|
jandem
:
review+
|
Details | Diff | Splinter Review |
4.86 KB,
patch
|
jandem
:
review+
|
Details | Diff | Splinter Review |
5.34 KB,
patch
|
jandem
:
review+
|
Details | Diff | Splinter Review |
61.64 KB,
patch
|
jandem
:
review+
|
Details | Diff | Splinter Review |
77.13 KB,
patch
|
jandem
:
review+
|
Details | Diff | Splinter Review |
16.47 KB,
patch
|
jandem
:
review+
|
Details | Diff | Splinter Review |
44.87 KB,
patch
|
Details | Diff | Splinter Review | |
843 bytes,
patch
|
jandem
:
review+
|
Details | Diff | Splinter Review |
Bug 1106828 describes a possible layout to use for unboxed arrays. Given the work that's happened since while doing unboxed plain objects, that layout can be improved. The target now is:
[ group elements length capacityAndInitializedLength]
Where |elements| points directly to the unboxed array data, with no header involved. The main trick here is how to compact information about the allocated capacity and the initialized length into 32 bits. I think that using the lower 26 bits for the initialized length and the upper 6 bits as an index into an array of possible capacities should work. This makes it slower to access these two quantities for an unboxed array vs. a native array, but the difference shouldn't be excessive and if it proves to be a problem for e.g. Ion compilation of code that keeps appending data to an array, we should be able to add new optimizations to keep things from slowing down.
The initial application for unboxed arrays will be plain array literals created by scripts. Later we can use unboxed arrays in other places, like JSON arrays, 'new Array()', other pc-dependent array operations like split(), and so forth. Also, initially unboxed arrays won't be able to cope with holes, though this will be relaxed later, at least for non-int32 element types. In principle we could have unboxed arrays with holes and int31 values, though I don't know if the extra complexity/cost would be worth it.
Assignee | ||
Comment 1•10 years ago
|
||
WIP which adds unboxed arrays and baseline/Ion fast paths for getelem/setelem/setelem_hole. This is almost done but fails on a few benchmarks.
Assignee: nobody → bhackett1024
Assignee | ||
Comment 2•10 years ago
|
||
Working patch. This behaves correctly on all the major benchmarks but does not yet improve anything. On microbenchmarks like:
var n = 0;
for (var i = 0; i < 1000000; i++) {
var a = [i, 2, 3, 4, 5, 6, 7, 8];
for (var j = 0; j < a.length; j++)
n += a[j];
n = 0;
}
When running without ggc using unboxed arrays (--unboxed-arrays in the shell) improves my time from 125ms to 48ms, so the improvement of a smaller memory footprint is pretty nice. There's still some weirdness here though, as for now unboxed arrays are being finalized in the foreground because background finalization more than triples my time on this benchmark (???). I'm comparing without using GGC since for now unboxed arrays are not nursery allocatable (they have a finalizer) and fixing this will require some work on ggc.
Will break this patch up for review.
Attachment #8589740 -
Attachment is obsolete: true
Assignee | ||
Comment 3•10 years ago
|
||
Add UnboxedArrayObject class and associated functionality, and update UnboxedLayout to work with unboxed arrays as well as unboxed plain objects.
Attachment #8590319 -
Flags: review?(jdemooij)
Assignee | ||
Comment 4•10 years ago
|
||
Modify JSOP_NEWARRAY in a similar fashion to what was done for JSOP_NEWOBJECT and plain objects, so that we can mark the initial objects created at an opcode as preliminary and use an unboxed representation later if desired. This also splits out a JSOP_SPREADCALLARRAY opcode, as spread calls have some requirements imposed by baseline that the object definitely be a normal dense array.
Attachment #8590322 -
Flags: review?(jdemooij)
Assignee | ||
Comment 5•10 years ago
|
||
A few macro assembler changes to support unboxed arrays.
Attachment #8590324 -
Flags: review?(jdemooij)
Assignee | ||
Comment 6•10 years ago
|
||
Handle unboxed arrays in ReceiverGuard, and handle unboxed arrays in the various routines used for compiling property accesses through prototype objects.
Attachment #8590325 -
Flags: review?(jdemooij)
Assignee | ||
Comment 7•10 years ago
|
||
Add baseline caches for unboxed array GETELEM, SETELEM, SETELEM_HOLE, and length accesses.
Attachment #8590327 -
Flags: review?(jdemooij)
Assignee | ||
Comment 8•10 years ago
|
||
There are a lot more ways that Ion can compile accesses to unboxed arrays vs. unboxed plain objects. This patch makes the mechanism used for identifying and triggering aborts during Ion compilation based on accesses to object groups with unanalyzed preliminary objects.
Attachment #8590331 -
Flags: review?(jdemooij)
Assignee | ||
Comment 9•10 years ago
|
||
Add fast paths to Ion for GETELEM, SETELEM, SETELEM_HOLE, and length accesses on unboxed arrays.
Attachment #8590332 -
Flags: review?(jdemooij)
Assignee | ||
Comment 10•10 years ago
|
||
Comment on attachment 8590331 [details] [diff] [review]
fix preliminary object abort mechanism
The content of this patch is required to fix bug 1149498, so asking for review there.
Attachment #8590331 -
Flags: review?(jdemooij)
Assignee | ||
Updated•10 years ago
|
Attachment #8590331 -
Attachment is obsolete: true
Comment 11•10 years ago
|
||
Comment on attachment 8590319 [details] [diff] [review]
basic functionality
Review of attachment 8590319 [details] [diff] [review]:
-----------------------------------------------------------------
Sorry for the delay, r=me with comments below addressed.
::: js/src/vm/UnboxedObject.cpp
@@ +496,5 @@
> + // The length shape to use for arrays is cached via a modified initial
> + // shape for array objects. Create an array now to make sure this entry
> + // is instantiated.
> + if (!NewDenseEmptyArray(cx))
> + return false;
Hm, allocating a new, unused array whenever we convert an array to native seems a bit hackish. Is there a way to avoid this or do this only when necessary, maybe |if (shape->isEmptyShape())| below?
@@ +525,5 @@
> return false;
>
> // Propagate all property types from the old group to the new group.
> + if (layout.isArray()) {
> + PropagatePropertyTypes(cx, JSID_VOID, group, nativeGroup);
PropagatePropertyTypes is fallible.
@@ +530,5 @@
> + } else {
> + for (size_t i = 0; i < layout.properties().length(); i++) {
> + const UnboxedLayout::Property& property = layout.properties()[i];
> + jsid id = NameToId(property.name);
> + PropagatePropertyTypes(cx, id, group, nativeGroup);
And here.
@@ +1014,5 @@
> +
> + size_t capacity = (GetGCKindBytes(allocKind) - offsetOfInlineElements()) / elementSize;
> + res->setCapacityIndex(exactCapacityIndex(capacity));
> + } else {
> + uint8_t* elements = cx->zone()->pod_malloc<uint8_t>(length * elementSize);
I think this can be
UniquePtr<uint8_t[], JS::FreePolicy> elements(...);
Then we can remove the js_free below and use res->elements_ = elements.release();
@@ +1036,5 @@
> +
> +bool
> +UnboxedArrayObject::setElement(ExclusiveContext* cx, size_t index, const Value& v)
> +{
> + uint8_t* p = elements() + index * elementSize();
Can we assert index < initLength here and in initElement, getElement?
@@ +1045,5 @@
> +UnboxedArrayObject::initElement(ExclusiveContext* cx, size_t index, const Value& v)
> +{
> + uint8_t* p = elements() + index * elementSize();
> + if (elementType() == JSVAL_TYPE_STRING || elementType() == JSVAL_TYPE_OBJECT)
> + *reinterpret_cast<void**>(p) = nullptr;
Is this to avoid bogus pre-barriers? This is clearer:
if (UnboxedTypeNeedsPreBarrier(elementType()))
@@ +1220,5 @@
> + MOZ_ASSERT(cap <= newCapacity);
> +
> + uint8_t* newElements;
> + if (hasInlineElements()) {
> + newElements = cx->zone()->pod_malloc<uint8_t>(newCapacity * elementSize());
This can't overflow because MaximumCapacity doesn't come close to UINT32_MAX, right?
We can static_assert MaximumCapacity < UINT32_MAX / sizeof(double)
::: js/src/vm/UnboxedObject.h
@@ +112,5 @@
> js_free(traceList_);
> }
>
> + bool isArray() {
> + return properties_.empty();
Maybe |return elementType_ != JSVAL_TYPE_MAGIC;|, for symmetry with initArray?
Attachment #8590319 -
Flags: review?(jdemooij) → review+
Comment 12•10 years ago
|
||
Comment on attachment 8590322 [details] [diff] [review]
new array opcodes
Review of attachment 8590322 [details] [diff] [review]:
-----------------------------------------------------------------
::: js/src/jit/BaselineIC.h
@@ +1739,5 @@
> friend class ICStubSpace;
>
> + HeapPtrObject templateObject_;
> +
> + ICNewArray_Fallback(JitCode* stubCode)
Make this |explicit|.
::: js/src/jit/CodeGenerator.cpp
@@ +4075,5 @@
> }
> };
>
> +typedef JSObject* (*NewArrayOperationFn)(JSContext*, HandleScript, jsbytecode*, uint32_t,
> + NewObjectKind);
Nit: fix indentation.
Attachment #8590322 -
Flags: review?(jdemooij) → review+
Comment 13•10 years ago
|
||
Comment on attachment 8590324 [details] [diff] [review]
macro assembler changes
Review of attachment 8590324 [details] [diff] [review]:
-----------------------------------------------------------------
r=me with the issue below fixed.
::: js/src/jit/MacroAssembler.cpp
@@ +1021,5 @@
> + branchKey(Assembler::BelowOrEqual, lengthAddr, index, failure);
> + jump(&done);
> + bind(&capacityIsIndex);
> + and32(Imm32(~0x3), temp);
> + rshiftPtr(Imm32(UnboxedArrayObject::CapacityShift - 2), temp);
I think this will clear the 2 lowest bits of the initialized length, but we'll shift those off anyway. Seems like we should either s/~0x3/CapacityMask/ or swap these two lines.
We should probably static_assert sizeof(UnboxedArrayObject::CapacityArray[0]) == 4 to make it more obvious what's going on.
Attachment #8590324 -
Flags: review?(jdemooij) → review+
Updated•10 years ago
|
Attachment #8590325 -
Flags: review?(jdemooij) → review+
Assignee | ||
Comment 14•10 years ago
|
||
(In reply to Jan de Mooij [:jandem] from comment #11)
> ::: js/src/vm/UnboxedObject.cpp
> @@ +496,5 @@
> > + // The length shape to use for arrays is cached via a modified initial
> > + // shape for array objects. Create an array now to make sure this entry
> > + // is instantiated.
> > + if (!NewDenseEmptyArray(cx))
> > + return false;
>
> Hm, allocating a new, unused array whenever we convert an array to native
> seems a bit hackish. Is there a way to avoid this or do this only when
> necessary, maybe |if (shape->isEmptyShape())| below?
This is a bit hackish, yeah, but this is actually only done once for each distinct group that has objects converted to natives, so trying to optimize this case more didn't seem to make much sense.
Updated•10 years ago
|
Attachment #8590327 -
Flags: review?(jdemooij) → review+
Comment 15•10 years ago
|
||
Comment on attachment 8590332 [details] [diff] [review]
ion fast paths
Review of attachment 8590332 [details] [diff] [review]:
-----------------------------------------------------------------
::: js/src/jit/VMFunctions.cpp
@@ +1111,4 @@
> {
> // This function is called from Ion code for StoreElementHole's OOL path.
> // In this case we know the object is native and we can use setDenseElement
> // instead of setDenseElementWithType.
Nit: update the comment.
Attachment #8590332 -
Flags: review?(jdemooij) → review+
Comment 16•10 years ago
|
||
Comment 17•10 years ago
|
||
It seems this patch regressed a lot of tests on AWFY
Flags: needinfo?(bhackett1024)
Assignee | ||
Comment 18•10 years ago
|
||
Flags: needinfo?(bhackett1024)
Comment 19•10 years ago
|
||
(In reply to Brian Hackett (:bhackett) from comment #18)
> Backed out:
>
> https://hg.mozilla.org/integration/mozilla-inbound/rev/15c5ec8215f7
Are you sure you did not backed out too many things?
http://arewefastyet.com/#machine=28&view=single&suite=misc&subtest=basic-strcat&start=1430321582&end=1430388445
Assignee | ||
Comment 20•10 years ago
|
||
(In reply to Nicolas B. Pierron [:nbp] from comment #19)
> Are you sure you did not backed out too many things?
>
> http://arewefastyet.com/#machine=28&view=single&suite=misc&subtest=basic-
> strcat&start=1430321582&end=1430388445
This regression seems to have fixed itself.
Assignee | ||
Comment 21•10 years ago
|
||
This patch fixes all the performance regressions on AWFY in ss/kraken/octane from the earlier patches. It also includes some unboxed array improvements so that appending elements to an array in a loop is faster with unboxed arrays than normal arrays.
Attachment #8600197 -
Flags: review?(jdemooij)
Assignee | ||
Comment 22•10 years ago
|
||
This is a patch I wrote some time back in concert with the patch just posted. It adds an analysis pass when unboxed array setelem_holes are performed to see if it can reuse the computation of the array capacity done in each setelem, by adding a loop phi that is only recomputed when a VM call is made and the array capacity might change. This improves setelem_hole loops but not that much, maybe 5%. Since with the previous patch setelem_hole loops are a lot faster on unboxed arrays anyways, I don't think the complexity in this patch is necessary for now, and just posting it for posterity in case we revisit this issue later.
Comment 23•10 years ago
|
||
Comment on attachment 8600197 [details] [diff] [review]
perf fixes
Review of attachment 8600197 [details] [diff] [review]:
-----------------------------------------------------------------
::: js/src/jit/BaselineCompiler.cpp
@@ +1830,5 @@
> if (key == JSProto_Array) {
> // Pass length in R0.
> masm.move32(Imm32(0), R0.scratchReg());
>
> + ObjectGroup* group = ObjectGroup::allocationSiteGroup(cx, script, pc, JSProto_Array);
Should we do the same for JSOP_NEWOBJECT?
::: js/src/jit/BaselineInspector.h
@@ +116,5 @@
> JSObject* getTemplateObject(jsbytecode* pc);
> JSObject* getTemplateObjectForNative(jsbytecode* pc, Native native);
> JSObject* getTemplateObjectForClassHook(jsbytecode* pc, const Class* clasp);
>
> + // Sometimes the group a template object will have is known, even if the object itself isn't.
Nit: this is more than 80 chars I think.
::: js/src/vm/TypeInference.cpp
@@ +643,5 @@
> void
> TypeSet::print(FILE* fp)
> {
> + if (!fp)
> + fp = stderr;
Nit: is this needed, considering the fp argument defaults to stderr?
Attachment #8600197 -
Flags: review?(jdemooij) → review+
Assignee | ||
Comment 24•10 years ago
|
||
(In reply to Jan de Mooij [:jandem] from comment #23)
> ::: js/src/jit/BaselineCompiler.cpp
> @@ +1830,5 @@
> > if (key == JSProto_Array) {
> > // Pass length in R0.
> > masm.move32(Imm32(0), R0.scratchReg());
> >
> > + ObjectGroup* group = ObjectGroup::allocationSiteGroup(cx, script, pc, JSProto_Array);
>
> Should we do the same for JSOP_NEWOBJECT?
Probably, though I haven't seen any tests that are affected by this. The pattern here is where an array is created once and then written/read in some long loops, and that doesn't seem to happen much with plain objects.
> ::: js/src/vm/TypeInference.cpp
> @@ +643,5 @@
> > void
> > TypeSet::print(FILE* fp)
> > {
> > + if (!fp)
> > + fp = stderr;
>
> Nit: is this needed, considering the fp argument defaults to stderr?
I added this because my lldb is unable to figure out the default argument when calling print(), and doesn't seem to understand what stderr is either.
Comment 25•10 years ago
|
||
Comment 26•10 years ago
|
||
Assignee | ||
Comment 27•10 years ago
|
||
The second push above is a followup to fix a redundant check of the runtime unboxed object/array option, which was breaking the JS_OPTION_USE_UNBOXED_OBJECTS environment variable.
Comment 28•10 years ago
|
||
https://hg.mozilla.org/mozilla-central/rev/020c6a559e3a
https://hg.mozilla.org/mozilla-central/rev/5b82ebe679e6
Status: NEW → RESOLVED
Closed: 10 years ago
status-firefox40:
--- → fixed
Resolution: --- → FIXED
Target Milestone: --- → mozilla40
Comment 29•10 years ago
|
||
Awfy detected improvements \0/, but also some regressions:
mac 32bit, no asmjs: http://arewefastyet.com/regressions/#/regression/1457055
- none
mac 64bit, no amsjs: http://arewefastyet.com/regressions/#/regression/1457080
- asmjs-ubench-skinning: 0.39% regression
mac 32bit, ion: http://arewefastyet.com/regressions/#/regression/1457050
- kraken-crypto-pbkdf2: 1.81%
mac 64bit, ion: http://arewefastyet.com/regressions/#/regression/1457069
- kraken-crypto-pbkdf2: 1.81%
- ss 1.0.1: unpack-code: 4.46%
- ss 1.0.1: tagcloud: 3.05%
- misc 0.4: bugs-608733-trace-compiler: 1.98%
- misc 0.4 : bugs-1131099-lodash2: 2.49%
- misc 0.4 : bugs-1131099-underscore: 0.3%
Was this expected? And can we do something about these regressions?
Assignee | ||
Comment 30•10 years ago
|
||
I'll take a look at the regressions sometime in the next couple days.
Flags: needinfo?(bhackett1024)
Assignee | ||
Comment 31•10 years ago
|
||
Gary and decoder, this bug added a --unboxed-arrays option to the shell which would be great for fuzzing. As with --unboxed-objects, this tests a feature that is not on by default anywhere.
Comment 32•10 years ago
|
||
Brian, coverity considers that this change can introduce a Division or modulo by zero (cid 1296731) on this line:
https://dxr.mozilla.org/mozilla-central/source/js/src/vm/UnboxedObject.cpp?from=js/src/vm/UnboxedObject.cpp#1035
If you think that is correct, I can report a new bug if you want.
Assignee | ||
Comment 33•10 years ago
|
||
(In reply to Sylvestre Ledru [:sylvestre] from comment #32)
> Brian, coverity considers that this change can introduce a Division or
> modulo by zero (cid 1296731) on this line:
> https://dxr.mozilla.org/mozilla-central/source/js/src/vm/UnboxedObject.
> cpp?from=js/src/vm/UnboxedObject.cpp#1035
>
> If you think that is correct, I can report a new bug if you want.
This is a false positive.
Comment 34•10 years ago
|
||
(In reply to Brian Hackett (:bhackett) from comment #33)
> This is a false positive.
Thanks, ignored in coverity then. Thanks!
Assignee | ||
Comment 35•10 years ago
|
||
With cachegrind and perf I don't see a difference in instruction counts on string-tagcloud or string-unpack-code, but krake-stanford-crypto-pbkdf2 does execute about 2.5% more instructions. This seems to be because of the hasUnanalyzedPreliminaryObjects() call in TryReuseArrayGroup, which calls the outlined function ObjectGroup::maybeSweep several times. This patch avoids calling maybeSweep() and seems to fix the instruction count issue.
There are a couple bigger issues here this patch doesn't address. It would be nice if the initial test in maybeSweep() was inlined, but this is annoying to do because of include file issues. Also, it would be good to optimize the involved natives (array_slice mainly I think) from Ion.
Flags: needinfo?(bhackett1024)
Attachment #8603883 -
Flags: review?(jdemooij)
Updated•10 years ago
|
Attachment #8603883 -
Flags: review?(jdemooij) → review+
Comment 36•10 years ago
|
||
You need to log in
before you can comment on or make changes to this bug.
Description
•