Closed Bug 1146597 Opened 10 years ago Closed 10 years ago

Use unboxed arrays for JSOP_NEWARRAY arrays

Tracking

()

Status:

RESOLVED FIXED

Milestone:

mozilla40

Tracking Flags:

Tracking

Status

firefox40

---

fixed

People

(Reporter: bhackett1024, Assigned: bhackett1024)

References

(Blocks 1 open bug)

Details

Attachments

(10 files, 2 obsolete files)

WIP 10 years ago Brian Hackett [Laid off!] 273.27 KB, patch		Details \| Diff \| Splinter Review
patch 10 years ago Brian Hackett [Laid off!] 276.09 KB, patch		Details \| Diff \| Splinter Review
basic functionality 10 years ago Brian Hackett [Laid off!] 80.76 KB, patch	jandem : review+	Details \| Diff \| Splinter Review
new array opcodes 10 years ago Brian Hackett [Laid off!] 38.57 KB, patch	jandem : review+	Details \| Diff \| Splinter Review
macro assembler changes 10 years ago Brian Hackett [Laid off!] 4.86 KB, patch	jandem : review+	Details \| Diff \| Splinter Review
lookups through unboxed arrays 10 years ago Brian Hackett [Laid off!] 5.34 KB, patch	jandem : review+	Details \| Diff \| Splinter Review
baseline ICs 10 years ago Brian Hackett [Laid off!] 61.64 KB, patch	jandem : review+	Details \| Diff \| Splinter Review
fix preliminary object abort mechanism 10 years ago Brian Hackett [Laid off!] 9.42 KB, patch		Details \| Diff \| Splinter Review
ion fast paths 10 years ago Brian Hackett [Laid off!] 77.13 KB, patch	jandem : review+	Details \| Diff \| Splinter Review
perf fixes 10 years ago Brian Hackett [Laid off!] 16.47 KB, patch	jandem : review+	Details \| Diff \| Splinter Review
faster unboxed array setelem_hole 10 years ago Brian Hackett [Laid off!] 44.87 KB, patch		Details \| Diff \| Splinter Review
patch 10 years ago Brian Hackett [Laid off!] 843 bytes, patch	jandem : review+	Details \| Diff \| Splinter Review

Brian Hackett [Laid off!]

Assignee

Description

•

10 years ago

Bug 1106828 describes a possible layout to use for unboxed arrays. Given the work that's happened since while doing unboxed plain objects, that layout can be improved. The target now is: [ group elements length capacityAndInitializedLength] Where |elements| points directly to the unboxed array data, with no header involved. The main trick here is how to compact information about the allocated capacity and the initialized length into 32 bits. I think that using the lower 26 bits for the initialized length and the upper 6 bits as an index into an array of possible capacities should work. This makes it slower to access these two quantities for an unboxed array vs. a native array, but the difference shouldn't be excessive and if it proves to be a problem for e.g. Ion compilation of code that keeps appending data to an array, we should be able to add new optimizations to keep things from slowing down. The initial application for unboxed arrays will be plain array literals created by scripts. Later we can use unboxed arrays in other places, like JSON arrays, 'new Array()', other pc-dependent array operations like split(), and so forth. Also, initially unboxed arrays won't be able to cope with holes, though this will be relaxed later, at least for non-int32 element types. In principle we could have unboxed arrays with holes and int31 values, though I don't know if the extra complexity/cost would be worth it.

Hannes Verschore [:h4writer]

Updated

•

10 years ago

Blocks: 1118938

Brian Hackett [Laid off!]

Assignee

Comment 1

•

10 years ago

Attached patch WIP (obsolete) — Details — Splinter Review

WIP which adds unboxed arrays and baseline/Ion fast paths for getelem/setelem/setelem_hole. This is almost done but fails on a few benchmarks.

Assignee: nobody → bhackett1024

Brian Hackett [Laid off!]

Assignee

Comment 2

•

10 years ago

Attached patch patch — Details — Splinter Review

Working patch. This behaves correctly on all the major benchmarks but does not yet improve anything. On microbenchmarks like: var n = 0; for (var i = 0; i < 1000000; i++) { var a = [i, 2, 3, 4, 5, 6, 7, 8]; for (var j = 0; j < a.length; j++) n += a[j]; n = 0; } When running without ggc using unboxed arrays (--unboxed-arrays in the shell) improves my time from 125ms to 48ms, so the improvement of a smaller memory footprint is pretty nice. There's still some weirdness here though, as for now unboxed arrays are being finalized in the foreground because background finalization more than triples my time on this benchmark (???). I'm comparing without using GGC since for now unboxed arrays are not nursery allocatable (they have a finalizer) and fixing this will require some work on ggc. Will break this patch up for review.

Attachment #8589740 - Attachment is obsolete: true

Brian Hackett [Laid off!]

Assignee

Comment 3

•

10 years ago

Attached patch basic functionality — Details — Splinter Review

Add UnboxedArrayObject class and associated functionality, and update UnboxedLayout to work with unboxed arrays as well as unboxed plain objects.

Attachment #8590319 - Flags: review?(jdemooij)

Brian Hackett [Laid off!]

Assignee

Comment 4

•

10 years ago

Attached patch new array opcodes — Details — Splinter Review

Modify JSOP_NEWARRAY in a similar fashion to what was done for JSOP_NEWOBJECT and plain objects, so that we can mark the initial objects created at an opcode as preliminary and use an unboxed representation later if desired. This also splits out a JSOP_SPREADCALLARRAY opcode, as spread calls have some requirements imposed by baseline that the object definitely be a normal dense array.

Attachment #8590322 - Flags: review?(jdemooij)

Brian Hackett [Laid off!]

Assignee

Comment 5

•

10 years ago

Attached patch macro assembler changes — Details — Splinter Review

A few macro assembler changes to support unboxed arrays.

Attachment #8590324 - Flags: review?(jdemooij)

Brian Hackett [Laid off!]

Assignee

Comment 6

•

10 years ago

Attached patch lookups through unboxed arrays — Details — Splinter Review

Handle unboxed arrays in ReceiverGuard, and handle unboxed arrays in the various routines used for compiling property accesses through prototype objects.

Attachment #8590325 - Flags: review?(jdemooij)

Brian Hackett [Laid off!]

Assignee

Comment 7

•

10 years ago

Attached patch baseline ICs — Details — Splinter Review

Add baseline caches for unboxed array GETELEM, SETELEM, SETELEM_HOLE, and length accesses.

Attachment #8590327 - Flags: review?(jdemooij)

Brian Hackett [Laid off!]

Assignee

Comment 8

•

10 years ago

Attached patch fix preliminary object abort mechanism (obsolete) — Details — Splinter Review

There are a lot more ways that Ion can compile accesses to unboxed arrays vs. unboxed plain objects. This patch makes the mechanism used for identifying and triggering aborts during Ion compilation based on accesses to object groups with unanalyzed preliminary objects.

Attachment #8590331 - Flags: review?(jdemooij)

Brian Hackett [Laid off!]

Assignee

Comment 9

•

10 years ago

Attached patch ion fast paths — Details — Splinter Review

Add fast paths to Ion for GETELEM, SETELEM, SETELEM_HOLE, and length accesses on unboxed arrays.

Attachment #8590332 - Flags: review?(jdemooij)

Brian Hackett [Laid off!]

Assignee

Comment 10

•

10 years ago

Comment on attachment 8590331 [details] [diff] [review] fix preliminary object abort mechanism The content of this patch is required to fix bug 1149498, so asking for review there.

Attachment #8590331 - Flags: review?(jdemooij)

Brian Hackett [Laid off!]

Assignee

Updated

•

10 years ago

Attachment #8590331 - Attachment is obsolete: true

Jan de Mooij [:jandem]

Comment 11

•

10 years ago

Comment on attachment 8590319 [details] [diff] [review] basic functionality Review of attachment 8590319 [details] [diff] [review]: ----------------------------------------------------------------- Sorry for the delay, r=me with comments below addressed. ::: js/src/vm/UnboxedObject.cpp @@ +496,5 @@ > + // The length shape to use for arrays is cached via a modified initial > + // shape for array objects. Create an array now to make sure this entry > + // is instantiated. > + if (!NewDenseEmptyArray(cx)) > + return false; Hm, allocating a new, unused array whenever we convert an array to native seems a bit hackish. Is there a way to avoid this or do this only when necessary, maybe |if (shape->isEmptyShape())| below? @@ +525,5 @@ > return false; > > // Propagate all property types from the old group to the new group. > + if (layout.isArray()) { > + PropagatePropertyTypes(cx, JSID_VOID, group, nativeGroup); PropagatePropertyTypes is fallible. @@ +530,5 @@ > + } else { > + for (size_t i = 0; i < layout.properties().length(); i++) { > + const UnboxedLayout::Property& property = layout.properties()[i]; > + jsid id = NameToId(property.name); > + PropagatePropertyTypes(cx, id, group, nativeGroup); And here. @@ +1014,5 @@ > + > + size_t capacity = (GetGCKindBytes(allocKind) - offsetOfInlineElements()) / elementSize; > + res->setCapacityIndex(exactCapacityIndex(capacity)); > + } else { > + uint8_t* elements = cx->zone()->pod_malloc<uint8_t>(length * elementSize); I think this can be UniquePtr<uint8_t[], JS::FreePolicy> elements(...); Then we can remove the js_free below and use res->elements_ = elements.release(); @@ +1036,5 @@ > + > +bool > +UnboxedArrayObject::setElement(ExclusiveContext* cx, size_t index, const Value& v) > +{ > + uint8_t* p = elements() + index * elementSize(); Can we assert index < initLength here and in initElement, getElement? @@ +1045,5 @@ > +UnboxedArrayObject::initElement(ExclusiveContext* cx, size_t index, const Value& v) > +{ > + uint8_t* p = elements() + index * elementSize(); > + if (elementType() == JSVAL_TYPE_STRING || elementType() == JSVAL_TYPE_OBJECT) > + *reinterpret_cast<void**>(p) = nullptr; Is this to avoid bogus pre-barriers? This is clearer: if (UnboxedTypeNeedsPreBarrier(elementType())) @@ +1220,5 @@ > + MOZ_ASSERT(cap <= newCapacity); > + > + uint8_t* newElements; > + if (hasInlineElements()) { > + newElements = cx->zone()->pod_malloc<uint8_t>(newCapacity * elementSize()); This can't overflow because MaximumCapacity doesn't come close to UINT32_MAX, right? We can static_assert MaximumCapacity < UINT32_MAX / sizeof(double) ::: js/src/vm/UnboxedObject.h @@ +112,5 @@ > js_free(traceList_); > } > > + bool isArray() { > + return properties_.empty(); Maybe |return elementType_ != JSVAL_TYPE_MAGIC;|, for symmetry with initArray?

Attachment #8590319 - Flags: review?(jdemooij) → review+

Jan de Mooij [:jandem]

Comment 12

•

10 years ago

Comment on attachment 8590322 [details] [diff] [review] new array opcodes Review of attachment 8590322 [details] [diff] [review]: ----------------------------------------------------------------- ::: js/src/jit/BaselineIC.h @@ +1739,5 @@ > friend class ICStubSpace; > > + HeapPtrObject templateObject_; > + > + ICNewArray_Fallback(JitCode* stubCode) Make this |explicit|. ::: js/src/jit/CodeGenerator.cpp @@ +4075,5 @@ > } > }; > > +typedef JSObject* (*NewArrayOperationFn)(JSContext*, HandleScript, jsbytecode*, uint32_t, > + NewObjectKind); Nit: fix indentation.

Attachment #8590322 - Flags: review?(jdemooij) → review+

Jan de Mooij [:jandem]

Comment 13

•

10 years ago

Comment on attachment 8590324 [details] [diff] [review] macro assembler changes Review of attachment 8590324 [details] [diff] [review]: ----------------------------------------------------------------- r=me with the issue below fixed. ::: js/src/jit/MacroAssembler.cpp @@ +1021,5 @@ > + branchKey(Assembler::BelowOrEqual, lengthAddr, index, failure); > + jump(&done); > + bind(&capacityIsIndex); > + and32(Imm32(~0x3), temp); > + rshiftPtr(Imm32(UnboxedArrayObject::CapacityShift - 2), temp); I think this will clear the 2 lowest bits of the initialized length, but we'll shift those off anyway. Seems like we should either s/~0x3/CapacityMask/ or swap these two lines. We should probably static_assert sizeof(UnboxedArrayObject::CapacityArray[0]) == 4 to make it more obvious what's going on.

Attachment #8590324 - Flags: review?(jdemooij) → review+

Jan de Mooij [:jandem]

Updated

•

10 years ago

Attachment #8590325 - Flags: review?(jdemooij) → review+

Brian Hackett [Laid off!]

Assignee

Comment 14

•

10 years ago

(In reply to Jan de Mooij [:jandem] from comment #11) > ::: js/src/vm/UnboxedObject.cpp > @@ +496,5 @@ > > + // The length shape to use for arrays is cached via a modified initial > > + // shape for array objects. Create an array now to make sure this entry > > + // is instantiated. > > + if (!NewDenseEmptyArray(cx)) > > + return false; > > Hm, allocating a new, unused array whenever we convert an array to native > seems a bit hackish. Is there a way to avoid this or do this only when > necessary, maybe |if (shape->isEmptyShape())| below? This is a bit hackish, yeah, but this is actually only done once for each distinct group that has objects converted to natives, so trying to optimize this case more didn't seem to make much sense.

Jan de Mooij [:jandem]

Updated

•

10 years ago

Attachment #8590327 - Flags: review?(jdemooij) → review+

Jan de Mooij [:jandem]

Comment 15

•

10 years ago

Comment on attachment 8590332 [details] [diff] [review] ion fast paths Review of attachment 8590332 [details] [diff] [review]: ----------------------------------------------------------------- ::: js/src/jit/VMFunctions.cpp @@ +1111,4 @@ > { > // This function is called from Ion code for StoreElementHole's OOL path. > // In this case we know the object is native and we can use setDenseElement > // instead of setDenseElementWithType. Nit: update the comment.

Attachment #8590332 - Flags: review?(jdemooij) → review+

Pulsebot

Comment 16

•

10 years ago

https://hg.mozilla.org/integration/mozilla-inbound/rev/02dad33cf2a0

Guilherme Lima

Comment 17

•

10 years ago

It seems this patch regressed a lot of tests on AWFY

Flags: needinfo?(bhackett1024)

Brian Hackett [Laid off!]

Assignee

Comment 18

•

10 years ago

Backed out: https://hg.mozilla.org/integration/mozilla-inbound/rev/15c5ec8215f7

Flags: needinfo?(bhackett1024)

Nicolas B. Pierron [:nbp]

Comment 19

•

10 years ago

(In reply to Brian Hackett (:bhackett) from comment #18) > Backed out: > > https://hg.mozilla.org/integration/mozilla-inbound/rev/15c5ec8215f7 Are you sure you did not backed out too many things? http://arewefastyet.com/#machine=28&view=single&suite=misc&subtest=basic-strcat&start=1430321582&end=1430388445

Brian Hackett [Laid off!]

Assignee

Comment 20

•

10 years ago

(In reply to Nicolas B. Pierron [:nbp] from comment #19) > Are you sure you did not backed out too many things? > > http://arewefastyet.com/#machine=28&view=single&suite=misc&subtest=basic- > strcat&start=1430321582&end=1430388445 This regression seems to have fixed itself.

Brian Hackett [Laid off!]

Assignee

Comment 21

•

10 years ago

Attached patch perf fixes — Details — Splinter Review

This patch fixes all the performance regressions on AWFY in ss/kraken/octane from the earlier patches. It also includes some unboxed array improvements so that appending elements to an array in a loop is faster with unboxed arrays than normal arrays.

Attachment #8600197 - Flags: review?(jdemooij)

Brian Hackett [Laid off!]

Assignee

Comment 22

•

10 years ago

Attached patch faster unboxed array setelem_hole — Details — Splinter Review

This is a patch I wrote some time back in concert with the patch just posted. It adds an analysis pass when unboxed array setelem_holes are performed to see if it can reuse the computation of the array capacity done in each setelem, by adding a loop phi that is only recomputed when a VM call is made and the array capacity might change. This improves setelem_hole loops but not that much, maybe 5%. Since with the previous patch setelem_hole loops are a lot faster on unboxed arrays anyways, I don't think the complexity in this patch is necessary for now, and just posting it for posterity in case we revisit this issue later.

Jan de Mooij [:jandem]

Comment 23

•

10 years ago

Comment on attachment 8600197 [details] [diff] [review] perf fixes Review of attachment 8600197 [details] [diff] [review]: ----------------------------------------------------------------- ::: js/src/jit/BaselineCompiler.cpp @@ +1830,5 @@ > if (key == JSProto_Array) { > // Pass length in R0. > masm.move32(Imm32(0), R0.scratchReg()); > > + ObjectGroup* group = ObjectGroup::allocationSiteGroup(cx, script, pc, JSProto_Array); Should we do the same for JSOP_NEWOBJECT? ::: js/src/jit/BaselineInspector.h @@ +116,5 @@ > JSObject* getTemplateObject(jsbytecode* pc); > JSObject* getTemplateObjectForNative(jsbytecode* pc, Native native); > JSObject* getTemplateObjectForClassHook(jsbytecode* pc, const Class* clasp); > > + // Sometimes the group a template object will have is known, even if the object itself isn't. Nit: this is more than 80 chars I think. ::: js/src/vm/TypeInference.cpp @@ +643,5 @@ > void > TypeSet::print(FILE* fp) > { > + if (!fp) > + fp = stderr; Nit: is this needed, considering the fp argument defaults to stderr?

Attachment #8600197 - Flags: review?(jdemooij) → review+

Brian Hackett [Laid off!]

Assignee

Comment 24

•

10 years ago

(In reply to Jan de Mooij [:jandem] from comment #23) > ::: js/src/jit/BaselineCompiler.cpp > @@ +1830,5 @@ > > if (key == JSProto_Array) { > > // Pass length in R0. > > masm.move32(Imm32(0), R0.scratchReg()); > > > > + ObjectGroup* group = ObjectGroup::allocationSiteGroup(cx, script, pc, JSProto_Array); > > Should we do the same for JSOP_NEWOBJECT? Probably, though I haven't seen any tests that are affected by this. The pattern here is where an array is created once and then written/read in some long loops, and that doesn't seem to happen much with plain objects. > ::: js/src/vm/TypeInference.cpp > @@ +643,5 @@ > > void > > TypeSet::print(FILE* fp) > > { > > + if (!fp) > > + fp = stderr; > > Nit: is this needed, considering the fp argument defaults to stderr? I added this because my lldb is unable to figure out the default argument when calling print(), and doesn't seem to understand what stderr is either.

Pulsebot

Comment 25

•

10 years ago

https://hg.mozilla.org/integration/mozilla-inbound/rev/020c6a559e3a

Pulsebot

Comment 26

•

10 years ago

https://hg.mozilla.org/integration/mozilla-inbound/rev/5b82ebe679e6

Brian Hackett [Laid off!]

Assignee

Comment 27

•

10 years ago

The second push above is a followup to fix a redundant check of the runtime unboxed object/array option, which was breaking the JS_OPTION_USE_UNBOXED_OBJECTS environment variable.

Carsten Book [:Tomcat]

Comment 28

•

10 years ago

https://hg.mozilla.org/mozilla-central/rev/020c6a559e3a https://hg.mozilla.org/mozilla-central/rev/5b82ebe679e6

Status: NEW → RESOLVED

Closed: 10 years ago

status-firefox40: --- → fixed

Resolution: --- → FIXED

Target Milestone: --- → mozilla40

Hannes Verschore [:h4writer]

Comment 29

•

10 years ago

Awfy detected improvements \0/, but also some regressions: mac 32bit, no asmjs: http://arewefastyet.com/regressions/#/regression/1457055 - none mac 64bit, no amsjs: http://arewefastyet.com/regressions/#/regression/1457080 - asmjs-ubench-skinning: 0.39% regression mac 32bit, ion: http://arewefastyet.com/regressions/#/regression/1457050 - kraken-crypto-pbkdf2: 1.81% mac 64bit, ion: http://arewefastyet.com/regressions/#/regression/1457069 - kraken-crypto-pbkdf2: 1.81% - ss 1.0.1: unpack-code: 4.46% - ss 1.0.1: tagcloud: 3.05% - misc 0.4: bugs-608733-trace-compiler: 1.98% - misc 0.4 : bugs-1131099-lodash2: 2.49% - misc 0.4 : bugs-1131099-underscore: 0.3% Was this expected? And can we do something about these regressions?

Brian Hackett [Laid off!]

Assignee

Comment 30

•

10 years ago

I'll take a look at the regressions sometime in the next couple days.

Flags: needinfo?(bhackett1024)

Brian Hackett [Laid off!]

Assignee

Comment 31

•

10 years ago

Gary and decoder, this bug added a --unboxed-arrays option to the shell which would be great for fuzzing. As with --unboxed-objects, this tests a feature that is not on by default anywhere.

Gary Kwong [:gkw] [:nth10sd] (NOT official MoCo now)

Updated

•

10 years ago

Depends on: 1161346

Sylvestre Ledru [:Sylvestre]

Comment 32

•

10 years ago

Brian, coverity considers that this change can introduce a Division or modulo by zero (cid 1296731) on this line: https://dxr.mozilla.org/mozilla-central/source/js/src/vm/UnboxedObject.cpp?from=js/src/vm/UnboxedObject.cpp#1035 If you think that is correct, I can report a new bug if you want.

Brian Hackett [Laid off!]

Assignee

Comment 33

•

10 years ago

(In reply to Sylvestre Ledru [:sylvestre] from comment #32) > Brian, coverity considers that this change can introduce a Division or > modulo by zero (cid 1296731) on this line: > https://dxr.mozilla.org/mozilla-central/source/js/src/vm/UnboxedObject. > cpp?from=js/src/vm/UnboxedObject.cpp#1035 > > If you think that is correct, I can report a new bug if you want. This is a false positive.

Sylvestre Ledru [:Sylvestre]

Comment 34

•

10 years ago

(In reply to Brian Hackett (:bhackett) from comment #33) > This is a false positive. Thanks, ignored in coverity then. Thanks!

Brian Hackett [Laid off!]

Assignee

Comment 35

•

10 years ago

Attached patch patch — Details — Splinter Review

With cachegrind and perf I don't see a difference in instruction counts on string-tagcloud or string-unpack-code, but krake-stanford-crypto-pbkdf2 does execute about 2.5% more instructions. This seems to be because of the hasUnanalyzedPreliminaryObjects() call in TryReuseArrayGroup, which calls the outlined function ObjectGroup::maybeSweep several times. This patch avoids calling maybeSweep() and seems to fix the instruction count issue. There are a couple bigger issues here this patch doesn't address. It would be nice if the initial test in maybeSweep() was inlined, but this is annoying to do because of include file issues. Also, it would be good to optimize the involved natives (array_slice mainly I think) from Ion.

Flags: needinfo?(bhackett1024)

Attachment #8603883 - Flags: review?(jdemooij)

Jan de Mooij [:jandem]

Updated

•

10 years ago

Attachment #8603883 - Flags: review?(jdemooij) → review+

Pulsebot

Comment 36

•

10 years ago

https://hg.mozilla.org/integration/mozilla-inbound/rev/ab0de7cdab7f

Wes Kocher (:KWierso) (Not reading bugmail; email directly if needed)

Comment 37

•

10 years ago

https://hg.mozilla.org/mozilla-central/rev/ab0de7cdab7f

Mike Hommey [:glandium]

Updated

•

10 years ago

Depends on: 1162134

Gary Kwong [:gkw] [:nth10sd] (NOT official MoCo now)

Updated

•

9 years ago

Depends on: 1193543

You need to log in before you can comment on or make changes to this bug.