Closed Bug 1494537 Opened 6 years ago Closed 6 years ago

Add IC support for adding and updating elements past the end of a sparse array

Categories

(Core :: JavaScript Engine: JIT, enhancement, P3)


Tracking


RESOLVED FIXED
mozilla65
Performance Impact high
Tracking Status
firefox65 --- fixed

People

(Reporter: bas.schouten, Assigned: djvj)

References

(Blocks 2 open bugs)

Details

Attachments

(4 files, 11 obsolete files)

61.99 KB, text/plain
1.79 KB, application/x-javascript
2.09 KB, application/x-javascript
25.38 KB, patch (djvj: review+)
Attached file stubissues.txt
While investigating bug 1494504, by instrumenting the failure points inside the stubs we found that the vast majority of the fallbacks are caused by GDocs accessing a dictionary-style sparse array (see attached log). Things could be improved considerably here if we had an optimized stub that reached directly into the dictionary object.
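For readers unfamiliar with the failure mode: a single store far past an array's dense end can flip the whole array into a sparse, dictionary-style representation. A minimal sketch (the indices and the loop are illustrative, not GDocs' actual access pattern):

    // A store far past the end leaves a huge hole, so the engine gives
    // up on a dense element buffer and represents the array as a
    // sparse, dictionary-style object instead.
    var a = [];
    a[0] = 1;
    a[1000000] = 2;     // sparse: almost every index in between is a hole

    // Element stores on a dictionary-mode array miss the dense-element
    // IC paths, so each one falls back to the generic SetElem stub.
    for (var i = 1; i < 100; i++)
        a[i] = i;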
This looks very similar to bug 1488786, but IIUC this is the SetElem rather than the GetElem side of it.
Blocks: CacheIR
This might be a dupe of Bug 1488768. The observation that it is dictionary mode is interesting.
I think this is absolutely essential to solve for GDocs performance: this is such a major part of what keystrokes do that having something absolutely optimal for these dictionary objects, for both get and set, would have a large impact. Kannan also tells me this is quite feasible for 64, so nominating.
Whiteboard: [qf]
Attached file benchmark.js
Attached is the microbenchmark I was working against, which is the problematic case as described by Kannan.
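The attached file isn't reproduced in the bug, but purely as an illustration, a microbenchmark for this case would look roughly like the following (function name, indices, and iteration counts are all hypothetical):

    // Time element adds past the end of an array that was forced
    // sparse up front -- the SetElem path this bug wants a stub for.
    function bench(iters) {
        var a = [];
        a[100000] = 0;              // force the array into sparse mode
        var t0 = Date.now();
        for (var i = 0; i < iters; i++)
            a[100001 + i] = i;      // each store adds past the current end
        return Date.now() - t0;
    }
    console.log(bench(100000) + " ms");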
Summary: Add a stub for dictionary objects → Add IC support for adding elements past the end of a sparse array
Assignee: nobody → mgaudet
Status: NEW → ASSIGNED
Whiteboard: [qf] → [qf:p1:f64]
On the benchmark I attached, my tests show about a 30% improvement vs. the no-cache case.
Attachment #9013799 - Attachment is obsolete: true
Updated partial patches:

- While the microbench case is pretty well made for the add-element case, new information from Kannan suggests this is not happening nearly as much as expected.
- As for the update-element case (https://phabricator.services.mozilla.com/D7793), so far I've seen only minor impacts on a microbenchmark. The patch as posted is very awkward; it suffers because we have a callVM that then subsequently wants to be able to jump into another IC chain. Register spilling and filling is very awkward (and in the case of Ion, AutoSaveLiveRegisters makes things worse, not better).

Kannan is going to put more analysis up in Bug 1488768, but given the preview of that analysis on IRC, my personal inclination is to abandon these patches (definitely D7793).
Ugh, I wrote these comments on the phabricator patch.. but I guess it doesn't auto-post to the bug. Super not cool. Re-posting the top-level comments:

I don't think we can abandon this patch. I suspect what's happening is that we are genericizing the Ion IC stub by hitting _this_ particular failure path too many times before we hit the optimizable dense write cases, at which point we're no longer attaching a stub for those.

This can be addressed by some hokey tuning changes in the optimizable => generic state transition logic, but that's really a very brittle way to approach the issue. Even if this stub isn't a huge performance win, it'll serve well as the "right" way to keep this IC chain in an optimizable mode until we get to the point where we are hitting the more traditionally optimizable cases (i.e. the writes into the dense locations).

I suspect we'll also need a similar stub for writes into dense locations which are more than 1 past the length, because the use case we're hitting also has a bunch of those. So even if we fix this case, we might still end up generifying because those are not handled and will be counted as attach failures, causing us to generify once again.
I ran my fallback stub analysis with this patch applied, and noticed something weird: the dense-initialized-capacity of the array being written to suddenly dropped to 6 - i.e. we don't grow the dense-initialized-capacity beyond that.

What is happening is that an early write is hitting this optimizable case, forcing dictionarification while the array is still small (without re-densification), and after that all the writes are to slot-ful indexes instead of dense indexes.

I'm starting to think the original, more straightforward approach you were taking is the right thing here. Suggestion: the "optimized" case here should just guard on any write to an index > denseInitializedLength and call into a pretty naive NativeObject::setProperty (and let it handle all of the setting and re-densification if needed).
QA Contact: sdetar
Do we know what makes the arrays sparse in the first place? Large index probably?
(In reply to Kannan Vijayan [:djvj] from comment #10)
> This can be addressed by some hokey tuning changes in the optimizable =>
> generic state transition logic, but that's really a very brittle way to
> approach the issue. Even if this stub isn't a huge performance win, it'll
> serve well as the "right" way to keep this IC chain in an optimizable mode
> until we get to a point where we are hitting the more traditionally
> optimizable cases being seen (i.e. the writes into the dense locations).

I think this is precisely the kind of case where we really want to rethink our state transition logic and potentially IC chain manipulation. If we attach a generic stub here, the generic stub will serve all requests until the GC causes us to discard stubs and we come around and try to attach a new stub. It is an 'improvement' over the existing state of affairs because we won't be in generic mode the second time around.

However, it could also be a large degradation: imagine a SETELEM sees the following sequence of indices: [102912, 0, 1, 2, 3, ..., 1024]. Even though we have a faster stub for every write after the first, if we attach the generic stub first, it will service all requests, we'll never hit the fallback stub again, and we'll never attach a faster stub after that. (Now that I write this out, Bug 1488786 seems mildly concerning.) This also ends up being related to Bug 1495519, which is about how to handle state transitions... failure seems like a good thing in this case.

> I suspect we'll also need a similar stub for writes into dense locations
> which are more than 1 past the length, because the use case we're hitting
> also has a bunch of those.

(That's Bug 1488768)
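To make the degradation scenario concrete, a small sketch using the index sequence from the comment above (the stub-attachment behavior is as described there; it is an engine implementation detail, not something the language guarantees):

    // If the very first store is the far-out-of-bounds one and the
    // fallback attaches a generic stub for it, every later store is
    // served by that generic stub.  The dense-element fast path never
    // gets a chance to attach, even though every remaining store is
    // perfectly dense.
    var a = [];
    a[102912] = 1;                 // first store: attaches the generic stub
    for (var i = 0; i <= 1024; i++)
        a[i] = i;                  // dense stores, still hitting the generic stub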
Priority: -- → P3
Attachment #9014601 - Attachment is obsolete: true
QA Contact: sdetar
Summary: Add IC support for adding elements past the end of a sparse array → Add IC support for adding and updating elements past the end of a sparse array
Assigning to Kannan, as I will be on PTO... very shortly.
Assignee: mgaudet → kvijayan
Attached patch handle-oob-array-setelem.patch (obsolete) — Splinter Review
Updated the patch to handle all cases where we are assigning to element locations outside of the dense-initialized-length bounds.
Add stub to handle out-of-dense-bounds writes to arrays
Attached patch ic-for-sparse-array-stores.patch (obsolete) — Splinter Review
Updated patch which passes jit-tests locally. Running on try here: https://treeherder.mozilla.org/#/jobs?repo=try&revision=e8f3b077726e21101a773d249df426a648169b18
Attachment #9015584 - Attachment is obsolete: true
Attachment #9016025 - Attachment is obsolete: true
Attachment #9013798 - Attachment is obsolete: true
Add CacheIR stub for out-of-capacity-bounds assignments to arrays.
Pushed by kvijayan@mozilla.com: https://hg.mozilla.org/integration/mozilla-inbound/rev/6213dd2a20f2 Add CacheIR stub for out-of-capacity-bounds assignments to arrays. r=tcampbell
Backed out changeset 6213dd2a20f2 (bug 1494537) for hazard failures on a CLOSED TREE

Backout link: https://hg.mozilla.org/integration/mozilla-inbound/rev/b8601c0c061473d7c81844816d6efdbfbc1eb53b
Push with failures: https://treeherder.mozilla.org/#/jobs?repo=mozilla-inbound&resultStatus=pending,running,success,testfailed,busted,exception&classifiedState=unclassified&revision=6213dd2a20f2c544722ff18d62863cbf6e031b8e
Log link: https://treeherder.mozilla.org/logviewer.html#?job_id=205400886&repo=mozilla-inbound&lineNumber=51024

Log snippet:
[task 2018-10-14T21:50:12.782Z] Wrote explained_hazards.tmp
[task 2018-10-14T21:50:12.782Z] Wrote unnecessary.tmp
[task 2018-10-14T21:50:12.782Z] Wrote refs.tmp
[task 2018-10-14T21:50:12.782Z] Found 24 hazards and 15 unsafe references
[task 2018-10-14T21:50:12.783Z] Running heapwrites to generate heapWriteHazards.txt
[task 2018-10-14T21:50:12.783Z] PATH="/builds/worker/workspace/sixgill/usr/bin:${PATH}" SOURCE='/builds/worker/checkouts/gecko' ANALYZED_OBJDIR='/builds/worker/workspace/obj-analyzed' LD_LIBRARY_PATH="${LD_LIBRARY_PATH}:/builds/worker/workspace/obj-haz-shell/dist/bin" XDB='/builds/worker/workspace/sixgill/usr/bin/xdb.so' /builds/worker/workspace/obj-haz-shell/dist/bin/js /builds/worker/checkouts/gecko/js/src/devtools/rootAnalysis/analyzeHeapWrites.js > heapWriteHazards.txt
[task 2018-10-14T21:50:42.178Z] + check_hazards /builds/worker/workspace/analysis
[task 2018-10-14T21:50:42.179Z] + set +e
[task 2018-10-14T21:50:42.179Z] ++ grep -c 'Function.*has unrooted.*live across GC call' /builds/worker/workspace/analysis/rootingHazards.txt
[task 2018-10-14T21:50:42.180Z] + NUM_HAZARDS=24
[task 2018-10-14T21:50:42.181Z] ++ grep -c '^Function.*takes unsafe address of unrooted' /builds/worker/workspace/analysis/refs.txt
[task 2018-10-14T21:50:42.181Z] + NUM_UNSAFE=15
[task 2018-10-14T21:50:42.182Z] ++ grep -c '^Function.* has unnecessary root' /builds/worker/workspace/analysis/unnecessary.txt
[task 2018-10-14T21:50:42.184Z] + NUM_UNNECESSARY=1188
[task 2018-10-14T21:50:42.184Z] ++ grep -c '^Dropped CFG' /builds/worker/workspace/analysis/build_xgill.log
[task 2018-10-14T21:50:42.207Z] + NUM_DROPPED=0
[task 2018-10-14T21:50:42.207Z] ++ perl -lne 'print $1 if m!found (\d+)/\d+ allowed errors!' /builds/worker/workspace/analysis/heapWriteHazards.txt
[task 2018-10-14T21:50:42.209Z] + NUM_WRITE_HAZARDS=64
[task 2018-10-14T21:50:42.209Z] + set +x
[task 2018-10-14T21:50:42.209Z] TinderboxPrint: rooting hazards<br/>24
[task 2018-10-14T21:50:42.209Z] TinderboxPrint: (unsafe references to unrooted GC pointers)<br/>15
[task 2018-10-14T21:50:42.209Z] TinderboxPrint: (unnecessary roots)<br/>1188
[task 2018-10-14T21:50:42.209Z] TinderboxPrint: heap write hazards<br/>64
[task 2018-10-14T21:50:42.209Z] TEST-UNEXPECTED-FAIL 24 rooting hazards detected
[task 2018-10-14T21:50:42.209Z] TinderboxPrint: documentation<br/><a href='https://wiki.mozilla.org/Javascript:Hazard_Builds#Diagnosing_a_rooting_hazards_failure'>static rooting hazard analysis failures</a>, visit "Inspect Task" link for hazard details
[task 2018-10-14T21:50:42.209Z] TEST-UNEXPECTED-FAIL 64 heap write hazards detected out of 0 allowed
[task 2018-10-14T21:50:42.209Z] TinderboxPrint: documentation<br/><a href='https://wiki.mozilla.org/Javascript:Hazard_Builds#Diagnosing_a_heap_write_hazard_failure'>heap write hazard analysis failures</a>, visit "Inspect Task" link for hazard details
[task 2018-10-14T21:50:42.209Z] /builds/worker/checkouts/gecko/taskcluster/scripts/builder/hazard-analysis.sh: line 176: exit_status: command not found
[task 2018-10-14T21:50:42.210Z] + onexit
[task 2018-10-14T21:50:42.210Z] + grab_artifacts /builds/worker/workspace/analysis /builds/worker/artifacts
[task 2018-10-14T21:50:42.210Z] + local analysis_dir
[task 2018-10-14T21:50:42.210Z] + analysis_dir=/builds/worker/workspace/analysis
[task 2018-10-14T21:50:42.210Z] + local artifacts
[task 2018-10-14T21:50:42.210Z] + artifacts=/builds/worker/artifacts
[task 2018-10-14T21:50:42.210Z] + cd /builds/worker/workspace/analysis
[task 2018-10-14T21:50:42.210Z] + ls -lah
[task 2018-10-14T21:50:42.213Z] total 4.6G
Flags: needinfo?(kvijayan)
Ugh.. The shape should have been a RootedShape.
Steve, I have no idea how to figure out what the rooting hazard I'm running into here is: https://treeherder.mozilla.org/#/jobs?repo=try&revision=4e8001162d348ebf8b5d353217197c87d886d1c2&selectedJob=205443615 Can you help me out? What are the steps to track the treeherder red back to the point in the source that produced it?
Flags: needinfo?(kvijayan) → needinfo?(sphink)
I'm very confused, but I'll document what I know for now. But first, tl;dr: I don't see any valid hazards here; this appears to be the hazard analysis going haywire.

The first hazard listed is

    Function '_ZN2js3jit16BaselineCompiler7compileEv$uint32 js::jit::BaselineCompiler::compile()' has unrooted 'autoEnterAnalysis' of type 'js::AutoEnterAnalysis' live across GC call '_ZN2js10CallObject20createTemplateObjectEP9JSContextN2JS6HandleIP8JSScriptEENS4_IP8JSObjectEENS_2gc11InitialHeapE$js::CallObject* js::CallObject::createTemplateObject(JSContext*, class JS::Handle<JSScript*>, class JS::Handle<JSObject*>, uint8)' at js/src/jit/BaselineCompiler.cpp:189

Cleaning that up a bit, that's

    Function 'js::jit::BaselineCompiler::compile()' has unrooted 'autoEnterAnalysis' of type 'js::AutoEnterAnalysis' live across GC call 'js::CallObject::createTemplateObject()' at js/src/jit/BaselineCompiler.cpp:189

I looked at gcTypes.txt.gz (which is normally not needed, so is hidden within hazardIntermediates.tar.xz) to see why AutoEnterAnalysis is a GC type:

    GCPointer: js::AutoEnterAnalysis
      contains field 'unboxedLayoutToCleanUp' of type class mozilla::UniquePtr<js::UnboxedLayout, JS::DeletePolicy<js::UnboxedLayout> >
      contains field 'mTuple' (with a pointer to unsafe storage) holding a struct mozilla::Pair<js::UnboxedLayout*, JS::DeletePolicy<js::UnboxedLayout> >

That seems legit -- AEA has an UnboxedLayout field in a UniquePtr. And a previous run that did not show this hazard agrees. So we have a GC pointer type held live across a GC function (unsurprisingly, createTemplateObject can indeed GC).

But AutoEnterAnalysis also suppresses GC within its RAII scope, which we can see from typeInfo.txt, under the key 'GCSuppressors':

    "js::AutoEnterAnalysis": "suppressGC : js::gc::AutoSuppressGC",

That just means it suppresses GC by having a field named 'suppressGC' of type 'AutoSuppressGC'. Anyway, that's why this shouldn't be a hazard: createTemplateObject is called while GC is suppressed. Looking at callgraph.txt, the analysis *knows* that GC is suppressed during that call:

    # traverse.py callgraph.txt
    (Cmd) resolve compile
    ...
    #104760 = uint32 js::jit::BaselineCompiler::compile() (1 callers)
    ...
    (Cmd) callee #104760
    ...
    #104774 = (SUPPRESSED) js::NamedLambdaObject* js::NamedLambdaObject::createTemplateObject(JSContext*, class JS::Handle<JSFunction*>, uint8)
    ...

Even more mysteriously, if I use the intermediate files downloaded from that try push and rerun just the last step of the analysis, it reports only 3 hazards, not the 13. And those 3 hazards look valid to me, though they're not from this push and they're either false positives (most likely) or irrelevant due to test code running in a different environment.
To be clear, I have looked at the hazard results and see no actual hazards introduced by the patch here.
Attached patch ic-for-sparse-array-stores.patch (obsolete) — Splinter Review
Updated, r+ed patch (by :tcampbell) which has no rooting hazards, yet still triggers the static-analysis rooting-hazard check.
Attachment #9016385 - Attachment is obsolete: true
Attachment #9016460 - Attachment is obsolete: true
Comment on attachment 9017220 [details] [diff] [review]
ic-for-sparse-array-stores.patch

Review of attachment 9017220 [details] [diff] [review]:
-----------------------------------------------------------------

::: js/src/jit-test/tests/cacheir/bug1494537.js
@@ +63,5 @@
> +  assertEq(mutate_array_frozen(x, count, 10), false);
> +  assertEq(y[end], count - 1);
> +}
> +
> +// Let's make sure updates to the array happen as expecte.d

nit (and this is my typo I'm pretty sure): expected.
Aha! Things are making much more sense now. I've been blind -- I just assumed that the analysis was screwing up in some bizarre way, even though the code didn't touch it. But I think the analysis is fine; it's just that this patch introduces a bug into the JS engine that *runs* the analysis.

Specifically, I have this line of code:

    if (!gcInfo && !(source in body.suppressed) && !suppressed) { ... }

It is mysteriously testing as true when running the analysis. When I run the code under the JS debugger and eval the condition, it returns false. 'body.suppressed' is an array that is built up with

    body.suppressed = [];
    for (var point in someGenerator())
        body.suppressed[point] = true;

where point is a small integer. In this case, body.suppressed[source] is true. I'm guessing that I would "fix" the problem if I changed this to:

    if (!gcInfo && !body.suppressed[source] && !suppressed)

but fortunately for you I write weird code.
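For anyone following along, the array semantics that condition relies on -- and that the broken out-of-bounds store path corrupted -- can be seen in a tiny standalone sketch (names and values illustrative):

    // An ordinary array used as a sparse map from small integers to
    // true.  `i in arr` is true exactly for the indexes that were
    // actually stored; holes report false.  A buggy past-the-end store
    // path that drops the element flips the result of the `in` test.
    var suppressed = [];
    suppressed[7] = true;          // sparse store past the initialized length
    console.log(7 in suppressed);  // true: the element was defined
    console.log(8 in suppressed);  // false: a hole, not an element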
This is both amazing and depressing. Nice work, and thank you for involuntarily debugging this patch. It seems like some additional tests need to be checked in. This has to be some weirdness with dense/sparse arrays and how we're adjusting the flags on those when we update them. Looking into it.
Flags: needinfo?(sphink)
Attached patch ic-for-sparse-array-stores.patch (obsolete) — Splinter Review
Updated patch which moves the optimized VM call up one level in the VM callpath. This passes tests: https://treeherder.mozilla.org/#/jobs?repo=try&revision=0a19ed7ca3b4aec45ee2801131926598c7de0c90

I'm actually wary of over-optimizing this and introducing brittleness that bites us in the future. This seems close enough to the previous path that we should capture most of the gains from before, with far less risk.
Attachment #9017220 - Attachment is obsolete: true
Attachment #9017666 - Flags: review?(tcampbell)
Comment on attachment 9017666 [details] [diff] [review]
ic-for-sparse-array-stores.patch

Review of attachment 9017666 [details] [diff] [review]:
-----------------------------------------------------------------

Works for me if you're still happy with the perf. We should run an experiment to determine whether the PACKED flag was the source of the error, for our own sanity.
Attachment #9017666 - Flags: review?(tcampbell) → review+
Attached patch ic-for-oob-array-stores.patch (obsolete) — Splinter Review
Ok, so I realized that if we're moving up one level anyway, there's no reason not to handle all cases of 'index >= initializedLength' instead of just 'index >= capacity'.

This has the slight potential to overlap with the 'DenseElementHole' stub in the situation where we get an assignment into an array where 'index == initializedLength', but the IC generator code tests for that case first anyway, so we'll only have this stub swallow that case in the strange situation where we _first_ see a fully out-of-dense-bounds write, and then subsequently see lots of edge-of-dense-bounds writes.

Sorry for the extra review load. Since Jan is back, it would be good to get a quick look at this from him as well.

Try run is in progress and seems to be green, as observable here: https://treeherder.mozilla.org/#/jobs?repo=try&revision=ea906022ecca3754f96e21992e00207decc743df&selectedJob=205980586
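For clarity, a minimal sketch of the two index classes being distinguished (stub names are from the comment above; the exact thresholds are engine internals):

    var a = [0, 1, 2];   // dense, initializedLength == 3
    a[3] = 3;            // index == initializedLength: appends onto the
                         // dense end; the existing DenseElementHole stub
                         // covers this, and the IC generator tries it first.
    a[50] = 50;          // index > initializedLength: leaves holes behind,
                         // so it goes to the new out-of-bounds stub (and
                         // past some internal threshold may turn sparse).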
Attachment #9017666 - Attachment is obsolete: true
Attachment #9017759 - Flags: review?(tcampbell)
Attachment #9017759 - Flags: feedback?(jdemooij)
Comment on attachment 9017759 [details] [diff] [review]
ic-for-oob-array-stores.patch

Review of attachment 9017759 [details] [diff] [review]:
-----------------------------------------------------------------

r+ during soft-freeze. Let's try this.
Attachment #9017759 - Flags: review?(tcampbell) → review+
Pushed by kvijayan@mozilla.com: https://hg.mozilla.org/integration/mozilla-inbound/rev/7a7d5508f873 Add CacheIR stub for out-of-initialized-length-bounds assignments to arrays. r=tcampbell
Status: ASSIGNED → RESOLVED
Closed: 6 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla64
Depends on: 1500285
Depends on: 1500255
Depends on: 1500254
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Target Milestone: mozilla64 → ---
Backout by csabou@mozilla.com: https://hg.mozilla.org/mozilla-central/rev/6c55991a052e Backed out changeset 7a7d5508f873 as requested by tcampbell on irc for causing crashes in bug 1500285, 1500255. a=backout
I think the backout should also be done on mozilla-beta; see the regressing changeset there: https://hg.mozilla.org/releases/mozilla-beta/rev/7a7d5508f873
That'll happen automatically. They're kept in sync until the Gecko version is bumped to 65 on Monday.
Attached patch fix-errors.patch (obsolete) — Splinter Review
Found the bug in the patch. It's really mundane, and I can't believe the other tests didn't catch it. Just the delta has been attached for review. I'll upload the merged patch before landing.
Attachment #9018901 - Flags: review?(tcampbell)
Attachment #9017759 - Flags: feedback?(jdemooij)
Comment on attachment 9018901 [details] [diff] [review]
fix-errors.patch

Review of attachment 9018901 [details] [diff] [review]:
-----------------------------------------------------------------

::: js/src/jit/IonCacheIRCompiler.cpp
@@ +2275,4 @@
>   ValueOperand val = allocator.useValueRegister(masm, reader.valOperandId());
>   bool strict = reader.readBool();
>
> + allocator.discardStack(masm);

This has been a persistent problem in implementing Ion CacheIR emitters. It's unfortunate that this wasn't caught by any tests, but it seems that we need an alternative mechanism to ensure that discardStack happens when required.

We used to have an identical pathology in the use of prepareVMCall and missing AutoSaveLiveRegisters. That was addressed by requiring prepareVMCall to take a reference to an AutoSaveLiveRegisters, as a way of maintaining the API invariants. Perhaps discardStack should return a token to be passed to prepareVMCall, hooking it in in a similar fashion.

I've previously written patches that tried to move discardStack inside prepareVMCall; however, I never finished them because it wasn't always clear why/where the appropriate place is to call discardStack, and it doesn't always line up with the prepareVMCall (IIRC).
Re :mgaudet - agreed that it would be better if there were some clean way of ensuring this sort of oversight doesn't happen. What I've noticed in the cases where "sometimes it's needed, and sometimes it's not" is that one of the cases where it's done automatically is when a failure path is added before the VM call - this forces a stack reset, which makes sense, as the next stub expects a clean stack. We could have the addition of a failure path also return a stack-reset token that is passed into the prepareForCallVM() code.
Comment on attachment 9018901 [details] [diff] [review]
fix-errors.patch

Review of attachment 9018901 [details] [diff] [review]:
-----------------------------------------------------------------

r=me. Great comments from :mgaudet here.
Attachment #9018901 - Flags: review?(tcampbell) → review+
Updating with rolled up finalized patch, including fuzzbug tests and fixes.
Attachment #9017759 - Attachment is obsolete: true
Attachment #9018901 - Attachment is obsolete: true
Attachment #9019372 - Flags: review+
Pushed by kvijayan@mozilla.com: https://hg.mozilla.org/integration/mozilla-inbound/rev/7fed4b128d9d Add CacheIR stub for out-of-capacity-bounds assignments to arrays. r=tcampbell
No longer blocks: 1496847
Depends on: 1500463
Status: REOPENED → RESOLVED
Closed: 6 years ago6 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla65
Is this something we might want to uplift to 64? Do we need to verify the results in some way in nightly?
Flags: needinfo?(kvijayan)
Hey liz, yes I would like to uplift this to 64, but it needs to be done in tandem with the patch for bug 1500255, which was committed shortly after this patch landed. How do I ensure this?
Flags: needinfo?(kvijayan) → needinfo?(lhenry)
Request approval on both and note the dependency where the form asks about it.
Flags: needinfo?(lhenry)
Depends on: 1502143
(In reply to Kannan Vijayan [:djvj] from comment #51)
> Hey liz, yes I would like to uplift this to 64, but it needs to be done in
> tandem with the patch for bug 1500255, which was committed shortly after
> this patch landed. How do I ensure this?

If we want to uplift this we should do so soon. It may be a little risky at this point; I'll leave it up to you.
Flags: needinfo?(kvijayan)
Bas: there were multiple follow-up fixes that had to land after the initial patch on this one, some well after the initial landing. I think it's good to let the patch get some time in Nightly, given the bugs in the initial landing.
Flags: needinfo?(kvijayan)
Blocks: 1488435
Performance Impact: --- → P1
Whiteboard: [qf:p1:f64]