Closed
Bug 1176406
Opened 10 years ago
Closed 10 years ago
IonMonkey: Poor type refinement of element access results during inlining
Categories
(Core :: JavaScript Engine, defect)
Tracking
()
RESOLVED
FIXED
mozilla42
| Tracking | Status | |
|---|---|---|
| firefox42 | --- | fixed |
People
(Reporter: sbauman, Assigned: sbauman)
References
Details
Attachments
(3 files, 4 obsolete files)
User Agent: Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:38.0) Gecko/20100101 Firefox/38.0
Build ID: 20150511103303
Steps to reproduce:
Run a debug shell with the example program attached.
Actual results:
Despite the fact that the calls to `myForEach` are inlined into the body of `doForEach`, type information from the surrounding context is not used to refine the result types for array accesses inlined from `myForEach`. Instead, generic code is generated using only the type information observed in the body of `myForEach`.
Expected results:
The compiler should generate optimized array accesses based on the types of `a` and `b` in `doForEach`, rather than just using the stack type set for `myForEach`.
| Assignee | ||
Updated•10 years ago
|
Flags: needinfo?(jdemooij)
Comment 1•10 years ago
|
||
I'm pretty sure that this blog post explains what's going on here:
http://rfrn.org/~shu/2013/03/20/two-reasons-functional-style-is-slow-in-spidermonkey.html
Status: UNCONFIRMED → NEW
Ever confirmed: true
| Assignee | ||
Comment 2•10 years ago
|
||
| Assignee | ||
Comment 3•10 years ago
|
||
(In reply to Till Schneidereit [:till] from comment #1)
> I'm pretty sure that this blog post explains what's going on here:
> http://rfrn.org/~shu/2013/03/20/two-reasons-functional-style-is-slow-in-
> spidermonkey.html
I used Shu's examples initially to narrow down the cause. It seems that lambda lifting is unnecessary, duplicating the forEach function is sufficient. Duplicating the function allows SpiderMonkey to
1. Inline both calls to the forEach function -- SpiderMonkey will not inline more than one level of a function.
2. Specialize the result type set of the two array index operations.
I did not use Shu's examples for this bug, since problem 2 is most obvious after hacking around problem 1. Problem 2 can be addressed by examining the index type set for the arguments to GET_ELEM operations.
| Assignee | ||
Comment 4•10 years ago
|
||
Comment on attachment 8625084 [details] [diff] [review]
possible-fix.patch
This patch attempts to refine the result type and result type set of LoadElement operations when the stack type set fails to do so. To refine the type set, this unions the element type sets for each object type flowing into the GET_ELEM operation.
| Assignee | ||
Comment 5•10 years ago
|
||
Comment on attachment 8625084 [details] [diff] [review]
possible-fix.patch
diff --git js/src/jit/IonBuilder.cpp js/src/jit/IonBuilder.cpp
index ee7ca51..c8f55ea 100644
--- js/src/jit/IonBuilder.cpp
+++ js/src/jit/IonBuilder.cpp
@@ -8512,12 +8512,12 @@ IonBuilder::jsop_getelem_dense(MDefinition* obj, MDefinition* index, JSValueType
// If we can load the element as a definite double, make sure to check that
// the array has been converted to homogenous doubles first.
TemporaryTypeSet* objTypes = obj->resultTypeSet();
+ bool inBounds = !readOutOfBounds && !needsHoleCheck;
bool loadDouble =
unboxedType == JSVAL_TYPE_MAGIC &&
barrier == BarrierKind::NoBarrier &&
loopDepth_ &&
- !readOutOfBounds &&
- !needsHoleCheck &&
+ inBounds &&
knownType == MIRType_Double &&
objTypes &&
objTypes->convertDoubleElements(constraints()) == TemporaryTypeSet::AlwaysConvertToDoubles;
@@ -8553,8 +8553,42 @@ IonBuilder::jsop_getelem_dense(MDefinition* obj, MDefinition* index, JSValueType
MOZ_ASSERT(knownType == MIRType_Value);
}
- if (knownType != MIRType_Value)
+ if (knownType != MIRType_Value) {
load->setResultType(knownType);
+ } else if (inBounds && objTypes && objTypes->getObjectCount() >= 1) {
+ // If the on stack type set set cannot provide a refinement,
+ // attempt to use the element type sets for the objectt types
+ // which may flow into the GET_ELEM operation.
+
+ LifoAlloc* lifoAlloc = alloc().lifoAlloc();
+
+ TemporaryTypeSet empty;
+ TemporaryTypeSet* types = ∅
+
+ bool refine = true;
+
+ // Compute the LUB of the element type sets for each object typea.
+ for (unsigned i = 0; i < objTypes->getObjectCount(); i++) {
+
+ // Get the HeapTypeSet associated with the index for each object.
+ TypeSet::ObjectKey* key = objTypes->getObject(i);
+ HeapTypeSetKey property = key->property(JSID_VOID);
+ HeapTypeSet* currentSet = property.maybeTypes();
+
+ // Short cut termination
+ if (!currentSet || currentSet->unknown()) {
+ refine = false;
+ break;
+ }
+
+ types = TypeSet::unionSets(types, currentSet, lifoAlloc);
+ }
+
+ if (refine && !types->unknown()) {
+ load->setResultType(types->getKnownMIRType());
+ load->setResultTypeSet(types);
+ }
+ }
current->push(load);
return pushTypeBarrier(load, types, barrier);
Comment 6•10 years ago
|
||
Comment on attachment 8625084 [details] [diff] [review]
possible-fix.patch
Hi Spenser, sorry for the delay! I'll be on PTO most of this week so I'll forward your request to Brian.
Flags: needinfo?(jdemooij)
Attachment #8625084 -
Flags: feedback?(bhackett1024)
| Assignee | ||
Comment 7•10 years ago
|
||
Attachment #8625084 -
Attachment is obsolete: true
Attachment #8625084 -
Flags: feedback?(bhackett1024)
| Assignee | ||
Updated•10 years ago
|
Attachment #8627453 -
Flags: review?(bhackett1024)
| Assignee | ||
Comment 8•10 years ago
|
||
This fixes a performance regression introduced by the original version of the patch.
Attachment #8627453 -
Attachment is obsolete: true
Attachment #8627453 -
Flags: review?(bhackett1024)
Attachment #8628020 -
Flags: review?(bhackett1024)
| Assignee | ||
Comment 9•10 years ago
|
||
Timings with and without the attached patch applied. Feel free to ignore "Patch + inline" column.
Comment 10•10 years ago
|
||
Comment on attachment 8628020 [details] [diff] [review]
possible-fix.patch
Review of attachment 8628020 [details] [diff] [review]:
-----------------------------------------------------------------
::: js/src/jit/IonBuilder.cpp
@@ +8466,5 @@
>
> +static TemporaryTypeSet*
> +ComputeHeapType(const TemporaryTypeSet* objTypes, const jsid id, LifoAlloc* lifoAlloc)
> +{
> + if (objTypes->empty())
if (objTypes->empty() || objTypes->unknownObject())
@@ +8475,5 @@
> +
> + bool refine = true;
> +
> + for (unsigned i = 0; i < objTypes->getObjectCount(); i++) {
> +
This blank line should be removed.
@@ +8476,5 @@
> + bool refine = true;
> +
> + for (unsigned i = 0; i < objTypes->getObjectCount(); i++) {
> +
> + TypeSet::ObjectKey* key = objTypes->getObject(i);
if (key->unknownProperties()) return nullptr;
@@ +8483,5 @@
> +
> + if (!currentSet || currentSet->unknown()) {
> + refine = false;
> + break;
> + }
This logic seems wrong. If an unknown property is encountered while looking through the objects, this function will end up underapproximating the possible types which can be produced by the read.
Also, somewhere in this loop freeze() should be called on the property. Otherwise the types of the properties could potentially grow in the future without invalidating the Ion code.
Attachment #8628020 -
Flags: review?(bhackett1024)
| Assignee | ||
Comment 11•10 years ago
|
||
(In reply to Brian Hackett (:bhackett) from comment #10)
> @@ +8483,5 @@
> > +
> > + if (!currentSet || currentSet->unknown()) {
> > + refine = false;
> > + break;
> > + }
>
> This logic seems wrong. If an unknown property is encountered while looking
> through the objects, this function will end up underapproximating the
> possible types which can be produced by the read.
I am unclear on what the concern is here. If this condition triggers, then
a nullptr is returned and no refinement of the result types is done, so this
should not result in anything being underapproximated. Could you clarify
what you mean here?
Comment 12•10 years ago
|
||
(In reply to Spenser Andrew Bauman from comment #11)
> I am unclear on what the concern is here. If this condition triggers, then
> a nullptr is returned and no refinement of the result types is done, so this
> should not result in anything being underapproximated. Could you clarify
> what you mean here?
Oops, I didn't follow what would happen in the last line of the function correctly. Can you remove the refine variable entirely and just return early with a nullptr on an unknown property type?
| Assignee | ||
Comment 13•10 years ago
|
||
(In reply to Brian Hackett (:bhackett) from comment #12)
> (In reply to Spenser Andrew Bauman from comment #11)
> > I am unclear on what the concern is here. If this condition triggers, then
> > a nullptr is returned and no refinement of the result types is done, so this
> > should not result in anything being underapproximated. Could you clarify
> > what you mean here?
>
> Oops, I didn't follow what would happen in the last line of the function
> correctly. Can you remove the refine variable entirely and just return
> early with a nullptr on an unknown property type?
Sure thing. I will fix that up along with your other comments.
| Assignee | ||
Comment 14•10 years ago
|
||
Attachment #8628020 -
Attachment is obsolete: true
Attachment #8628403 -
Flags: review?(bhackett1024)
Comment 15•10 years ago
|
||
Comment on attachment 8628403 [details] [diff] [review]
possible-fix.patch
Review of attachment 8628403 [details] [diff] [review]:
-----------------------------------------------------------------
::: js/src/jit/IonBuilder.cpp
@@ +8466,5 @@
>
> +TemporaryTypeSet*
> +IonBuilder::computeHeapType(const TemporaryTypeSet* objTypes, const jsid id)
> +{
> + if (objTypes->empty() || objTypes->unknownObject())
Hmm, actually, I think this should be:
if (objTypes->unknownObject() || objTypes->getObjectCount() == 0)
return nullptr;
Otherwise if objTypes has no objects then we would return a pointer to the |empty| local variable.
@@ +8475,5 @@
> + LifoAlloc* lifoAlloc = alloc().lifoAlloc();
> +
> + Vector<HeapTypeSetKey, 0, JitAllocPolicy> properties(alloc());
> + if (!properties.initCapacity(objTypes->getObjectCount()))
> + return nullptr;
It would be a bit more efficient to supply a non-zero capacity (e.g. 4) and call reserve() instead of initCapacity(). Also, |properties| can use SystemAllocPolicy instead --- using JitAllocPolicy will cause the memory for the storage vector to remain allocated until the entire compilation finishes.
Attachment #8628403 -
Flags: review?(bhackett1024) → review+
| Assignee | ||
Comment 16•10 years ago
|
||
Final edits proposed by Brian Hackett.
Attachment #8628403 -
Attachment is obsolete: true
Attachment #8628550 -
Flags: review?(bhackett1024)
Updated•10 years ago
|
Attachment #8628550 -
Flags: review?(bhackett1024) → review+
| Assignee | ||
Updated•10 years ago
|
Keywords: checkin-needed
Comment 17•10 years ago
|
||
can we get a try run here ?
Flags: needinfo?(sabauma)
Keywords: checkin-needed
| Assignee | ||
Comment 18•10 years ago
|
||
(In reply to Carsten Book [:Tomcat] from comment #17)
> can we get a try run here ?
https://treeherder.mozilla.org/#/jobs?repo=try&revision=924ee9d8ae74
Flags: needinfo?(sabauma)
| Assignee | ||
Updated•10 years ago
|
Summary: IonMonkey: Poor type refinement of element access results during inlininig → IonMonkey: Poor type refinement of element access results during inlining
| Assignee | ||
Updated•10 years ago
|
Keywords: checkin-needed
Comment 19•10 years ago
|
||
Keywords: checkin-needed
Comment 20•10 years ago
|
||
Assignee: nobody → sabauma
Status: NEW → RESOLVED
Closed: 10 years ago
status-firefox42:
--- → fixed
Resolution: --- → FIXED
Target Milestone: --- → mozilla42
Comment 21•10 years ago
|
||
This patch is responsible for a 40% improvement on http://arewefastyet.com/#machine=17&view=single&suite=dromaeo&subtest=dromaeo-3d-cube
but also for a 6% regression on http://arewefastyet.com/#machine=29&view=single&suite=dart&subtest=DeltaBlue
| Assignee | ||
Comment 22•10 years ago
|
||
(In reply to Guilherme Lima from comment #21)
> This patch is responsible for a 40% improvement on
> http://arewefastyet.com/
> #machine=17&view=single&suite=dromaeo&subtest=dromaeo-3d-cube
> but also for a 6% regression on
> http://arewefastyet.com/#machine=29&view=single&suite=dart&subtest=DeltaBlue
I am unable to replicate the regression on DeltaBlue on my machine, which averages a 3.8% improvement with this patch.
Comment 23•10 years ago
|
||
Nice! I also took a look at the dart-deltablue regression and it seems it is not a real regression, but a variation in score. So no need to look at dart-deltablue.
A regression/improvement was reported on AWFY:
- slave: Win 8 32-bit (i7-4700HQ, browser)
- mode: Ion
Regression(s)/Improvement(s):
- dromaeo: dromaeo-3d-cube: 43.81% (improvement)
Recorded range:
- http://hg.mozilla.org/integration/mozilla-inbound/pushloghtml?fromchange=89d62eb8b35d&tochange=cd949a8d064c
More details: http://arewefastyet.com/regressions/#/regression/1791834
You need to log in
before you can comment on or make changes to this bug.
Description
•