IonMonkey: Significant regression in dromaeo_dom

RESOLVED FIXED in Firefox 18

Status

()

defect
RESOLVED FIXED
7 years ago
6 years ago

People

(Reporter: dvander, Assigned: djvj)

Tracking

(Depends on 2 bugs)

unspecified
mozilla18
Points:
---
Dependency tree / graph

Firefox Tracking Flags

(firefox18+ fixed)

Details

(Whiteboard: [ion:p1:fx18] [leave open])

Attachments

(5 attachments, 13 obsolete attachments)

10.00 KB, patch
jandem
: review+
Details | Diff | Splinter Review
15.57 KB, patch
jandem
: review+
Details | Diff | Splinter Review
44.57 KB, patch
jandem
: review+
Details | Diff | Splinter Review
4.84 KB, patch
dvander
: review+
Details | Diff | Splinter Review
3.78 KB, patch
sstangl
: review+
Details | Diff | Splinter Review
When IonMonkey is enabled, our dromaeo_dom score gets about 2X worse on Talos.
Is there a link to the perf diff?  Which subtests are most relevant, or is it across the board?
(In reply to Boris Zbarsky (:bz) from comment #1)
> Is there a link to the perf diff?  Which subtests are most relevant, or is
> it across the board?

I will run the test manually very soon - I just looked at the Talos numbers in TBPL so far.
tbpl can show per-test numbers...  Was this a try push, or are you just comparing an Ion tree to m-c or something?
Running Dromaeo DOM tests @ http://dromaeo.com/?dom

With IonMonkey enabled, as individual scores, I get (higher is better):

DOM Attributes - 318
DOM Modification - 151
DOM Query - 11143
DOM Traversal - 45
Full results - http://dromaeo.com/?id=166959

With IonMonkey disabled, as individual scores, I get:

DOM Attributes - 559
DOM Modification - 210
DOM Query - 2942
DOM Traversal - 290
Full results - http://dromaeo.com/?id=166958
Err, comment #4 is messed up. Those scores are reversed (first set is w/out IonMonkey, second set is w/ IonMonkey).

The URLs are - 
IM: http://dromaeo.com/?id=177959
JM: http://dromaeo.com/?id=177958
I looked a bit into this. The main problem on the DOM attributes and Traversal tests seems to be that they are accessing DOM properties with PropertyOp getters. IIUC they will go away with the new DOM bindings, but we don't use these yet for HTML elements etc.

To fix this, we have the following options:

(1) Use inline paths similar to the ones we use for new-style getters.
(2) Add a getter IC. Could also work as a fallback for the TI-based paths.
(3) Not Ion-compile scripts that accessed PropertyOp getters (and ListBase proxies?)

Option (3) is the simplest to implement and may suffice since PropertyOp getters will slowly disappear. OTOH, we may need an IC as fallback anyway and in that case supporting PropertyOp getters too shouldn't be that much harder.
Does IM have an IC for JSNative getters?

If so, we also have option 4: Switch quickstubs (which are what I assume you're seeing) to using JSNatives.
IM does not have an IC for JSNative getters. It desperately needs one, though.

If we switch away from PropertyOp quickstubs to JSNative getters, that would be great, otherwise, we can try to do a hybrid IC to handle both...
Now that we have JSNative ICs in JM, I think switching the quickstubs is a good idea.  Or at least trying to and measuring the performance impact.
OK, so it looks like a hybrid combination of 2 and 4 is the best here. We can get rid of the PropertyOps that we're trying to kill anyway, as well as add the desperately needed accessor ICs.

I have been asked to, and intend to look into this.
That's awesome!

Note that we'll still need something for ListBase proxies... At this point inline style is using those.  Possibly you can use #3 for those initially?
For the initial land, that might be a plausible, but obviously we are going to want to implement bug 785465. As soon as we finish the general cases, we can move on to fixing various DOM access performance things.

In particular, I want to fix this and bug 765454 first.
Yeah, I was just talking initial land, if doing it that way makes landing simpler.
I agree that Option (2) seems best as a catch-all case here, and seems pretty straightforward to implement.

Any takers?
(Assignee)

Comment 15

7 years ago
I can take this but I'll need to familiarize myself a bit with the DOM side of things.
Please feel free to ping me on irc if you have questions!
(Assignee)

Comment 17

7 years ago
Turns out that changing ICs to have callouts to C++ will require some changes to the way on-stack invalidations are handled in Ion.  Figuring out how it works before getting on with this patch.
Assignee: general → kvijayan
Hmm.  Don't efaust's existing fast paths have that problem already?  Or are those not ICs?
(Assignee)

Comment 19

7 years ago
No, they don't poly IC the calls.  Efaust does cleverly identify the situation where multiple object types have the same target at the call site, but it's still a single target being called, and it's mainlined.
(Assignee)

Comment 20

7 years ago
Boris:

I was talking about this with dvander yesterday and we were wondering if implementing this optimization for monomorphic cases only would resolve the performance issue.  If that works out, we can avoid the more nitpicky changes necessary to add calls to ICs.

How difficult would it be for you to give me a patch that moves the quickstubs over to JSNatives?  We can test this out in short order.
I was actually just considering this this morning. If we just move quickstubs, it should already work with the current accessor fastpath, I'm pretty sure.
(Assignee)

Comment 22

7 years ago
There'll need to be some slight tweaking.  In |TestCommonPropFunc| we'll need to check for JSNatives, which is slightly different from retrieving getter objects, but not really.

But this is a trivial change to make.
Why do we need to care that it's a native? Let it flow into the MCall, and then if it's native, it should lower to an LCallNative, which is exactly what we wanted, right? What am I missing?
Note that this will still not solve the own property problem that is going to come up with |window.window|, once it goes to the new bindings, according to Ms2ger.
> we were wondering if implementing this optimization for monomorphic cases only would
> resolve the performance issue

It would fix the microbenchmarky parts of dromaeo.  It won't necessarily fix real-life code, and might not fix the jquery/prototype/etc parts of dromaeo (known as "dromaeo_css"), assuming there's a regression on those.

> How difficult would it be for you to give me a patch that moves the quickstubs over to
> JSNatives? 

I'd guess not hard, but also not until Tuesday or Wednesday, for obvious reasons.  ;)

Eric, What's the problem with window.window?
(Assignee)

Comment 26

7 years ago
(In reply to Eric Faust [:efaust] from comment #23)
> Why do we need to care that it's a native? Let it flow into the MCall, and
> then if it's native, it should lower to an LCallNative, which is exactly
> what we wanted, right? What am I missing?

I just mean we have to pull out the JSNative by checking for a different flag on the Shape attribute and then casting the getterOp to a JSFunction, instead of just pulling the getterValue.  Nothing major - should be a couple of lines.  All the other logic should be valid already, as you note.
(Assignee)

Comment 27

7 years ago
(In reply to Boris Zbarsky (:bz) from comment #25)
> It would fix the microbenchmarky parts of dromaeo.  It won't necessarily fix
> real-life code, and might not fix the jquery/prototype/etc parts of dromaeo
> (known as "dromaeo_css"), assuming there's a regression on those.

I'll still be working on the call ICs.  I think we have a basic game plan worked out - I just need a bit better understanding of how Ion OSI works in the details before working on it.

> I'd guess not hard, but also not until Tuesday or Wednesday, for obvious
> reasons.  ;)

Of course.  The Ion end for that is trivial so we can plug it in whenever it's ready to go.
I am told that window.window is to be implemented in the new system as an own-property getter on the instance object itself. Is this true? If so, we are going to need some form of smart IC, as the current TI-based approach will be insufficient.
Oh, Brendan's proposal?

"window" would be a JSNative getter on the global.

"window.window" would be a get off a proxy.  I suspect it'll continue to be deathly slow as it is now, though it would be nice to fix that sometime.  ;)
Note to self: that patch fails unit tests, because JSPROP_NATIVE_ACCESSORS gives the getter and setter null names so that xpc_qsThrowMethodFailed can't find a memberId and hence we get fatal asserts in ThrowCallFailed because memberName is null and memberId is void.

But I think you should be able to run dromaeo with it, at least, and get some perf numbers.  If it looks promising, I'll figure out some way to make the error reporting work.
(Assignee)

Comment 32

7 years ago
Just ran both builds (IonMonkey build vs. Mozilla-central build) across Dromaeo DOM.  Just changing the quickstubs to native doesn't seem to do anything actually.

Reference (m-c build):
Dom attrs: 755.60 runs/s
Dom modifications: 325.67 runs/s
Dom query: 17962.83 runs/s
Dom traversal: 415.38 runs/s

IonMonkey build:
Dom attrs: 768.10 runs/s
Dom modifications: 316.79 runs/s
Dom query: 6753.21 runs/s
Dom traversal: 43.12 runs/s

DOM Query seems to have improved marginally, but it's still running at about 40% of the speed of m-c build.

DOM Traversal seems to have worsened somewhat.  I'm guessing DOM traversal would be hitting polymorphic cases and the JSNative calls may actually be worse for performance on balance than PropertyOp calls, since calling JSNatives requires creation of a bunch of heavyweight objects which PropertyOps looks like it avoids.
> Just changing the quickstubs to native doesn't seem to do anything actually.

I wouldn't expect it to until there's an IC for JSNatives.  See comment 6, comment 7, comment 8.

> calling JSNatives requires creation of a bunch of heavyweight objects

Like what?  The arguments are the same, the handling of "this" is similar.  The only difference is passing the "callee", but that's already available....  Or are you talking in the particular implementation in Ion?
(Assignee)

Comment 34

7 years ago
(In reply to Boris Zbarsky (:bz) from comment #33)
> I wouldn't expect it to until there's an IC for JSNatives.  See comment 6,
> comment 7, comment 8.

I just traced the construction of the property access in IonBuilder.  We're definitely finding the JSNative for the relevant property accesses in both the query and the traverse tests.  So it turns out that the callee _is_ monomorphic in all the caes were're hitting.  Something else is going on that makes this slow.

> Like what?  The arguments are the same, the handling of "this" is similar. 
> The only difference is passing the "callee", but that's already
> available....  Or are you talking in the particular implementation in Ion?

Nah, I was just under the impression that CallArgs construction was pretty heavyweight.  Looking at the code this doesn't seem to be the case.
(Assignee)

Comment 35

7 years ago
We're spending a lot of time inside ion GetPropertyCache (the IC fallback).  Here's a listing of all the locations, scripts, lines, and count numbers of where those are happening:

http://pastebin.mozilla.org/1806517
(Assignee)

Comment 36

7 years ago
Quick check indicates that these correspond to '.length' queries.

In dom-query, it's '.length' on the result of |document.getElementsByName()|.  In dom-traverse, it's '.length' on the result of |document.body.childNodes|.
Ah.  Both of those are list proxies.  That's not hitting quickstub code at all.  Fixing that requires bug 785465, basically.
(Assignee)

Comment 38

7 years ago
Ok, bug 785464 is not the solution to this.  '.length' is not a slot access, and thus doesn't get caught by the IC (which only codes for slot accesses).

Boris:  Waldo tells me that '.length' is a JSNative getter?  Is that true?  If so, we can still do a mainline (monomorphic) VM call for this.  I should just need to adapt that code to handle ListBase-based non-native objects.  Will look at that tonight.
(Assignee)

Comment 39

7 years ago
(In reply to Kannan Vijayan [:djvj] from comment #38)
> Ok, bug 785464

D'oh.  I meant bug 785465.
(Assignee)

Comment 40

7 years ago
Yeah, I think this can be made to work mainline, as long as '.length' is a JSNative getter on the prototype.  It won't work on objects with expandos on them, but it should cover dromaeo for now until we get a more permanent solution in place (the permanent solution being ICs for VM calls).

Pseudocode checks on the object would basically go like this:

obj.isProxy() && obj.proxyHandler == ListBaseHandler && obj.numFixedSlots < ExpandoObjectSlotNumber && obj.hasNoExpando && obj.proto == typeObjectProto
.length is a JSNative getter, yes.

Why would fixing bug 785465 not work?  The whole point of that bug is to IC accesses to things like .length in this case!
Whiteboard: [ion:p1] → [ion:p1:fx18]
(Assignee)

Comment 42

7 years ago
(In reply to Boris Zbarsky (:bz) from comment #41)
> .length is a JSNative getter, yes.
> 
> Why would fixing bug 785465 not work?  The whole point of that bug is to IC
> accesses to things like .length in this case!

Ion's ICs as they stand are not built to allow for VM calls from within them.  The fundamental issue being that VM calls can cause invalidation of live on-stack jitcode, and the infrastructure for hotpatching the call sites when that happens doesn't exist yet for out-of-line code (which is where the IC VM calls would have to go).

The general solution is to enable that infrastructure, but that's a bit more involved than just handling monomorphic getters which would allow us to place the vm call in mainline code instead of OOL code.
OK, but does the ListBase thing actually do the call inside the IC?  Or does it just do the property get in the IC?  The property get is the slow part here, as I recall....
(Assignee)

Comment 44

7 years ago
The ListBase patch that I just wrote up for Ion does property gets only (and by that I mean slot accesses), which are easy because they are guaranteed not to invalidate anything.

The issue with length is that the "property get" is actually a call to a JSNative (additionally, a call which in the general case can't be guaranteed never to cause jitcode invalidation (ew, double negative)).

A big part of the slowdown right now is that we're going down the |getGeneric| codepath for that, which does the whole "walk prototypes, traverse shape tree" dance.  We can skip that dance with a few quick shape and prototype guards, then making a straight call to the appropriate JSNative.
So there are two separate things going on here:

1)  Getting the actual getter function to call.  This does not involve any VM calls and should be ICable.
2)  Calling the getter.

Is the problem just that Ion can't split up the work that way?

Doing slot anything on ListBase is pointless, because they're proxies.  They never have anything in slots, ever.

> We can skip that dance with a few quick shape and prototype guards, then making a
> straight call to the appropriate JSNative.

Yes, that's the point.  I thought that was what the ListBase stuff in JM did, but maybe I was just confused.  ;)

At the very least, you should be able to copy whatever guarding JM does.
(Assignee)

Comment 46

7 years ago
(In reply to Boris Zbarsky (:bz) from comment #45)
> So there are two separate things going on here:
> 
> 1)  Getting the actual getter function to call.  This does not involve any
> VM calls and should be ICable.
> 2)  Calling the getter.
> 
> Is the problem just that Ion can't split up the work that way?

Currently yes.  What you suggest here is actually a good idea: use an IC to just do the retrieval, and then do the actual native call mainline.  We could do this but it would mean adding a new type of IC whose job it was to retrieve getter functions, not do the IC.  We'd also need to change the getprop code to recognize GETPROP sites which are expected to reduce down to JSNative calls, and emit this IC for those sites only.
 
> Doing slot anything on ListBase is pointless, because they're proxies.  They
> never have anything in slots, ever.

Ok.
 
> Yes, that's the point.  I thought that was what the ListBase stuff in JM
> did, but maybe I was just confused.  ;)

I'm pretty sure the ListBase stuff in JM puts the call itself inside the IC.  I.e. it "does the getprop" as opposed to "gets the JSNative required to do the getprop".
 
> At the very least, you should be able to copy whatever guarding JM does.

Guarding within the IC is a bit easier since you have a reference object to work from as you generate each stub within the IC.  Guarding outside the IC shouldn't be that difficult either but I can't copy the JM code directly for that, since I won't have a reference object I can do lookups on.

I need a way to check from a TypeObject, whether the underlying class of objects being described is based on ListBase or not.  Trying to figure that out now.
(Assignee)

Comment 47

7 years ago
Ok, so telling if the objects flowing through a particular GETPROP site are actually ListBase objects is hard to do.  We'll just have to bite the bullet and implement call VM support in Ion ICs.

Comment 48

7 years ago
(In reply to Kannan Vijayan [:djvj] from comment #47)
> Ok, so telling if the objects flowing through a particular GETPROP site are
> actually ListBase objects is hard to do.  We'll just have to bite the bullet
> and implement call VM support in Ion ICs.

What bug # is that?
(Assignee)

Comment 49

7 years ago
Yeah, it's probably a good idea to have a distinct bug for that.  Just made one: Bug 790051.
Depends on: 790051
Blocks: 790386
Can anyone give me an ETA on when we'll know whether the above patch is good enough in terms of performance?  Keep in mind that comment 31 also has to be addressed before landing, but I'd sort of like to know whether the general approach is going to work at all before investing time in that...
(Assignee)

Comment 51

7 years ago
I expect to know by the end of the week.  I have this patch basically done - just need to figure out some final stack manipulation details.  I'm hoping to be testing/debugging the combined patches for this and bug 790051 by tonight.
(Assignee)

Comment 52

7 years ago
bz:

I got down the final bits I need to put in place and started evaluating my changes.  I discovered ListBase.length is not converted to a JSNative binding in the patch you supplied.

Both hasSlot() and hasGetterValue() return false for the shape for the length property.  Gonna try to figure out how to change |length| into a JSNative by reading what your patch does.  If you can get to it before me, that'd be awesome.
Oh, good catch.   Yeah, that's not a quickstub.  It just uses copy/pasted code.  The relevant code for the length getter is in js/xpconnect/src/dombindings.cpp (see length_getter).  And we'll need to fix dombindingsgen.py to generate the right autogenerated getters/setters (this part is similar to the quickstub patch, since the code was cribbed from there) and fix the code in ListBase<LC>::getPrototype in js/xpconnect/src/dombindings.cpp to pass the JSPROP_NATIVE_ACCESSORS flag.

Will we need to also change the ListBase IC in JM?

I can try to do the .length thing tomorrow, but chances are low.  Monday is much more likely.
(Assignee)

Comment 54

7 years ago
Ok I'll try to take care of myself today and tomorrow.  Thanks for the pointers.  If I'm able to get it working, I'll post here.  If not I'll ping you on Monday.
(Assignee)

Comment 55

7 years ago
(In reply to Boris Zbarsky (:bz) from comment #53)
> Will we need to also change the ListBase IC in JM?

Sorry didn't answer this earlier.  Bhackett would know for sure as I don't know if JM already handles JSNatives or whether support will need to be added.
(Assignee)

Comment 56

7 years ago
Just finished doing some cursory testing on this.  I think this does what we want.  Running mochitests on it right now, but I'm not sure if I'm missing something or not.
I'm sort of guessing the existing IC might already work, since it seems to work for CSS2Properties and I'm pretty sure those use JSNative.

As far as the patch goes, we should call the argument vp (and get rid of lots of the branching on isAttr in the process).  The |vp = &(JS_RVAL(cx, vp_));| thing is  no-op and can just go away.  argp should probably be called argv, but we presumably have a place we set that up already in the method case; we just need to do it in the attr case too.

You need to null-check the return value of JS_THIS_OBJECT; otherwise calling the getter with null as this will crash.

Thank you for doing that part!
(Assignee)

Comment 58

7 years ago
No worries.  I appreciate your help with this.

Mochitests failed an assert after doing well for a long time, on test_bug466751.xhtml, at XPCQuickStubs.cpp:226.

Let's talk on Monday about this stuff.  I want to get a clear idea of how the behaviour for these various DOM objects is implemented and what the guard logic needs to be.  If it "just works" after these changes then all the better, but I want a better understanding of what's going on.
> Mochitests failed an assert after doing well for a long time

Yep.  See comment 31.

I'm actually off on Monday, but let's talk Tuesday?  I'll be generally on IRC 9-5 Eastern, but we can set up some other time if that would be helpful.  Let me know!
Attachment #658255 - Attachment is obsolete: true
I checked on the ListBase stuff in JM, and it looks like it handles JSNatives just fine at first glance: the ListBase check is before the part where we check what sort of getter we have.
Comment on attachment 661659 [details] [diff] [review]
Convert quickstubs to JSNative, now passing tests

Review of attachment 661659 [details] [diff] [review]:
-----------------------------------------------------------------

::: js/xpconnect/src/XPCQuickStubs.cpp
@@ +132,5 @@
>                                             stringTable + ps->name_index,
> +                                           JSVAL_VOID,
> +                                           (JSPropertyOp)ps->getter,
> +                                           (JSStrictPropertyOp)ps->setter,
> +                                           flags | JSPROP_SHARED | JSPROP_NATIVE_ACCESSORS)) 

Trailing whitespace.
Attachment #661659 - Flags: review?(peterv) → review+
I'm going to spin the two "stop using JSPropertyOps" patches out into bug 792215.
Depends on: 792215
Attachment #661376 - Attachment is obsolete: true
Attachment #661659 - Attachment is obsolete: true
Blocks: 792354
(Assignee)

Comment 64

7 years ago
Progress update: almost there.  The last remaining piece is to figure out how to handle type mismatches on returns from getters.  Ion ICs may be constructed with the expectation of a specific typed output.  If the type of the getter's result doesn't match the expected type, we need to make a call out to monitor types which should invalidate the compiled script that the IC belongs to.

This should be relatively straightforward since all the work to enable OSI from ABI-calls inside ICs has already been done to be able to make the getter call in the first place.  Type monitoring should be just another ABI call guarded by a typecheck.
Oh, interesting.  I wonder whether as a followup it would make sense to expose the return type for some DOM stuff to the JIT, in cases where we have a strongly typed return....  Separate bug, obviously.
(Assignee)

Comment 66

7 years ago
We could do that, and it would allow us to eliminate the typeguard, but it's not clear how much of a perf benefit that would be - ICs already incur the cost of one or more OOL jumps (to stubcode), a bunch of guards as we find the right stub to execute, a register spill, an ABI call to the JSNative, and a register fill.  The difference of a single typeguard in the middle of all of that is likely to not have a huge impact.

I'd guess it's more trouble to implement that than the perf benefits we could expect.  dvander may have some thoughts on it..
(Assignee)

Comment 67

7 years ago
But it may make sense from a memory usage perspective, since we could avoid emitting the assembly instructions for the guard's failure case (the call to type monitor).  That's save probably 20-30 bytes per jsnative-call-ic-stub.  If there are a lot of those generated in the browser we might be able to justify it just on the slimness metric.
(Assignee)

Comment 68

7 years ago
Attaching this to report some preliminary results and numbers.  This patch adds support for calling JSNative getters to the GetPropertyCache IC in Ion.  Preliminary results look good but there is clearly work to be done.

Here are numbers (results are runs per second, higher is better).

ION DISABLED
============
DOM Attributes:     658.07
DOM Modification:   300.62  
DOM Query:          13543.12
DOM Traversal:      328.80

ION ENABLED + PROPERTYOPS => JSNATIVES
======================================
DOM Attributes:     648.35
DOM Modification:   301.06
DOM Query:          6371.12
DOM Traversal:      49.50

ION ENABLED + PROPERTYOPS => JSNATIVES + THIS PATCH
===================================================
DOM Attributes:     661.07
DOM Modification:   303.92
DOM Query:          12000.01
DOM Traversal:      59.19

I still haven't tested "ION ENABLED, NO PATCHES" because I didn't have time to do another full build today.  That should be done just to see if the propertyop => jsnative patches alone introduce any regressions.

However, the promising bit is DOM Query, which is once again in the ballpark of the results we get with Ion disabled.  We're 10% or so down there.

DOM traversal gains 25%, but we're nowhere close to the previous result there.

In DOM Query, there are only a few candidates that could be responsible for the slowdown: a getprop for nodeType on a DOM object, a getprop for the getElementById/getElementsByName/getElementsByTagName functions on |document|, or array GETELEMS into a nodelist returned by the above functions (or some combination of the lot).

In DOM Traverse, it's |document.body|, |document.body.childNodes|, array GETELEMS on a nodelist, or one of the firstChild/lastChild/nextSibling/previousSibling accessors on DOM objects.

That's really the only things it can be because that's the only things that are going on inside the loops in these benchmarks.
Thanks for measuring that!

It's interesting that the dom-attr and dom-modify numbers are basically not budging for you; maybe they just don't have that many DOM getters/setters?  That said, I just tried dom-modify with and without ion, and I see a pretty big difference.  Especially on the appendChild and insertBefore tests inside there, which makes sense because those exercise getters and setters a lot more than the other subtests of dom-modify.

Similarly, on dom-attr I'd expect the perf differences to be concentrated on the "element.property" and "element.property = value" subtests, since those are the only ones that exercise this stuff.  What do the scores on those sub-tests look like?

Let me know if you want me to do the measuring here, btw; I can pretty easily spin up 4 or 5 opt builds in an hour or so if desired...
(Assignee)

Comment 70

7 years ago
Did you try the dom-modify and dom-attr tests with your PropertyOp => JSNative patch or without it?

Without it I'd expect there to be a difference.  With it, efaust's DOM work ensures that the Ion scores are on par with the JM scores.  There may still be gains to be had for Ion in those two benches, but I'm trying to focus on the regressions in order of severity, so that means I'll be looking closely at DOM Traversal tomorrow followed by mopping up the remaining slowdown in DOM Query.

Are you around tomorrow?  I'll probably want to ask you about the implementation of a number of these property accesses and getelems (once I figure out which are the biggest slowpath culprits, which I hope little bit of printf tracing should be able to tell me).
> Did you try the dom-modify and dom-attr tests with your PropertyOp => JSNative patch or
> without it?

I haven't tried dom-attr.  I thought I had the JSNative patch applied, but I just rebuilt to make sure, and you're right: with it applied dom-modify seems to be fine.

> With it, efaust's DOM work ensures that the Ion scores are on par with the JM scores.

I thought efaust's stuff was all for the WebIDL bindings, which none of this stuff is exercising yet.  Did he also add something that works on generic JSNative getters when I wasn't looking?  I thought the ion ICs couldn't handle that sort of thing without the patch in this bug!

In any case, if we posit that "normal" JSNative getters are fine with Ion, however we got there, and if we assume that your ListBase thing makes .length on lists fast again, then the most likely reason for the remaining slowdown are the GETELEMs on the list proxy, I would think; everything else listed in comment 68 is quickstub JSNatives.  dom-traverse has a lot more of those GETELEMS than dom-query, as I recall, which would also match your numbers.

> Are you around tomorrow?

Yes, between about 9am (_maybe_ 8:30am) Pacific and 2pm Pacific.  Happy to talk then!
> I thought efaust's stuff was all for the WebIDL bindings,

OK, I talked to efaust, and got myself partially unconfused.  His summary is:

  What we did was: ask "is everyone at this callsite looking for the
  same JSFunction?" if yes, and the dom callback said we could, then we skimmed to
  the DOM fastpath stuff. In general, though, we generate direct calls to all unique
  JSFunctions, both native and scripted.

What that means is that microbenchmarks will definitely win from efaust's changes, since they will hit the "unique JSFunction" case.  That explains the behavior on dom-attr and dom-modify, where all the gets are always on the same kind of node.

On the other hand, I just tried a situation where we have different JSFunctions on different iterations of a loop, and perf even with ion seems to be fine.  Testcase is:

 <body>
   <script>
     var bod = document.body;
     var docEl = document.documentElement;
     function getter(el) {
       return Object.getOwnPropertyDescriptor(el.__proto__, "firstChild").get;
     }
     var arr = [bod, docEl, bod, docEl]
     var count = 1000000;
     var start = new Date();
     var prevChild;
     for (var i = 0; i < count; ++i) {
        var newChild = arr[i%4].firstChild;
        if (newChild == prevChild) { alert('TOTAL FAIL'); }
        prevChild = newChild;
     }
     var end = new Date();
     alert((end - start) / count * 1e6)
     alert(getter(bod) == getter(docEl))
   </script>
 </body>

I don't understand why that's fast.  ;)  And if things like this _can_ end up slow, that could affect dom-query and maybe dom-traverse.
(In reply to Boris Zbarsky (:bz) from comment #72)
>
> I don't understand why that's fast.  ;)

It's probably because we don't compile scripts containing JSOP_DEFFUN (for the "function getter() {}" in the global scope), bug 764310. You can try to move the loop into its own function and see if that's still fast. With a debug build, IONFLAGS=scripts,aborts can tell you which scripts are compiled or abort.
> It's probably because we don't compile scripts containing JSOP_DEFFUN

AHA!  Yes, this script totally shows a significant slowdown:

 <body>
  <script>
    var bod = document.body;
    var docEl = document.documentElement;
    var arr = [bod, docEl]
    var count = 1000000;
    var start = new Date();
    for (var i = 0; i < count; ++i) {
       arr[i%2].firstChild;
    }
    var end = new Date();
    alert((end - start) / count * 1e6)
  </script>

So that could be affecting dom-query.  It's almost certainly hitting the firstChild/lastChild/nextSibling/previousSibling tests of dom-traverse.

Comment 75

7 years ago
So should bug 764310 be blocking this one?
(Assignee)

Comment 77

7 years ago
So it turns out that the issue is GetElems.

The following is a tally of call counts where we take the fallback path for GetElem ICs:

  88899 KVKV: GetElementCache: http://dromaeo.com/tests/dom-query.html:79 pc=125 line=84
  88899 KVKV: GetElementCache: http://dromaeo.com/tests/dom-query.html:79 pc=190 line=86
  88899 KVKV: GetElementCache: http://dromaeo.com/tests/dom-query.html:79 pc=255 line=88
  88899 KVKV: GetElementCache: http://dromaeo.com/tests/dom-query.html:79 pc=60 line=82
 362440 KVKV: GetElementCache: http://dromaeo.com/tests/dom-query.html:44 pc=50 line=47
 374680 KVKV: GetElementCache: http://dromaeo.com/tests/dom-query.html:58 pc=50 line=61
 398040 KVKV: GetElementCache: http://dromaeo.com/tests/dom-query.html:51 pc=50 line=54
 403560 KVKV: GetElementCache: http://dromaeo.com/tests/dom-query.html:65 pc=50 line=68

 604065 KVKV: GetElementCache: http://dromaeo.com/tests/dom-attr.html:40 pc=35 line=42

 212414 KVKV: GetElementCache: http://dromaeo.com/tests/dom-traverse.html:35 pc=59 line=40
 239522 KVKV: GetElementCache: http://dromaeo.com/tests/dom-traverse.html:22 pc=59 line=27
 754976 KVKV: GetElementCache: http://dromaeo.com/tests/dom-traverse.html:66 pc=47 line=70

Most of these GetElems correspond to array-like accesses on DOM object lists, and one set of them corresponds to own-property accesses on a DOM object.  Here's the breakdown:

In DOM-Query:
  Array accesses on the result of getElementsByName and getElementsByTagName

In DOM-Traverse:
  Array accesses on |document.body.childNodes|

In DOM-Attr:
  GetElems (string property name accesses) on a particular dom object that's had those own properties previously set.

The DOM-Attr case I can't see a way to do anything about.  The property accesses are all based on different names each time, so there's no real way to optimize away a shape lookup.  But we're already at performance par with DOM-Attr so it's low priority.
(Assignee)

Comment 78

7 years ago
This code uses the patch just posted to 790051 to actually generate calls from GetPropertyCache ICs.

The approach doesn't actually use the dynamic returnAddr => SafepointIndex table that was added in that patch, but it does use the new OOL exit frame type from that patch.  It reuses the returnAddressToFp of the GetPropertyCache fallback function as it's returnAddressToFp, which will correctly be looked up to the right SafepointIndex in the case of invalidation.
Attachment #662755 - Attachment is obsolete: true
Attachment #664126 - Flags: review?(jdemooij)
> The DOM-Attr case I can't see a way to do anything about.

That's also not using integer names and not hitting any DOM stuff at all: it's a pure JS testcase, right?  I wouldn't worry about it too too much.  We beat V8 on the sets, but lose on the gets; JSC does slightly better than us on both  but it really has nothing to do with the DOM per se, afaict.
(Assignee)

Comment 80

7 years ago
Updates on GetElem: There seems to be something else going on that I'm missing.

My initial strategy was something dvander suggested: check if the JM cache for the GetElem had been disabled (which would happen for these GetElems), and if so, don't generate caches in IM, preferring instead to just emit an inline call.  This didn't work because sniffing the JM caches wasn't possible from the IM code.  The JM caches were definitely getting disabled, but the PIC was getting discarded before we got to Ion code.

Rather than figure that out, I took the easier route of just making the GetElement VM function keep track (within bytecode analysis) of whether it had ever seen a native object for that particular location.  Then Ion checks this and if it's present, chooses not to emit an IC for that GetElem.

This too, works, but the results it yields are not nearly as good as we'd want.

DOM-Query improves by about 2% (From 10668 or to 10896 on my box).  DOM-Traverse improves by about 20% (From 57 to 69).

We're looking for 20% improvements from DOM-Query (to 12346), and 550% improvement to DOM-Traverse (to 313).

I've double checked, and we've eliminated basically all the GetElementCache calls in these two benches, so it doesn't seem like these were the issue.  Tomorrow I'll try to diagnose this further.

A tentative thought is maybe we're recompiling a lot with Ion?  Will have to check it out.
I'm happy to do some profiling here too, if you tell me which all patches I need to apply...
(In reply to Kannan Vijayan [:djvj] from comment #67)
> But it may make sense from a memory usage perspective, since we could avoid
> emitting the assembly instructions for the guard's failure case (the call to
> type monitor).  That's save probably 20-30 bytes per jsnative-call-ic-stub. 
> If there are a lot of those generated in the browser we might be able to
> justify it just on the slimness metric.

Drive-by comment:  if you're talking about Ion code, it probably won't make a difference, because the amount of Ion code generated is quite small.  But if you're talking about Jaeger code, then it could be significant.
(Assignee)

Comment 83

7 years ago
(In reply to Boris Zbarsky (:bz) from comment #81)
> I'm happy to do some profiling here too, if you tell me which all patches I
> need to apply...

Your QuickStubs => JSNative patch.  The ListBase getters/setters => JSNative patch.  Followed by the patch I just posted to bug 790051, followed by the patch I posted here.  They should apply to tip cleanly, as I rebased them this morning before putting them up for review.

Thanks for the help.
OK, so looking at dom-traverse what's going on is that we're going through a slow path for the actual .firstChild and .lastChild (I only profiled those first two tests).

In particular, we seem to have jitcode calling js::GetProperty(?), which then does the whole resolve, etc, etc dance.  80% of the time is under the js::GetProperty, but only about 4% is under the actual getters.  The rest is pure overhead (e.g. over 50% is calling ion::GetPcScript, after which as I understand we just lose).

It looks like those gets are not hitting the IC for whatever reason.  Note that these are probably polymorphic at the moment, so not going to be helped by efaust's stuff per comment 72 and comment 74.
Oh, and it looks like we're still paying the AutoFlushCache cost in the ListBase case.  Not sure whether that's expected...
Comment on attachment 664126 [details] [diff] [review]
Add native getter call case for Ion ICs, which also handles ListBase proxy objects

Review of attachment 664126 [details] [diff] [review]:
-----------------------------------------------------------------

Great work, looks much nicer than the JM ICs.

Can you also modify a random getter (like the customNative one in the shell) to call JS_GC() after a while and see if that works as expected?

::: js/src/ion/IonCaches.cpp
@@ +243,5 @@
> +        JS_ASSERT(obj != holder);
> +
> +        // If there's a single jump to |failures|, we can patch the shape guard
> +        // jump directly. Otherwise, jump to the end of the stub, so there's a
> +        // common point to patch.

Comment nit: we always jump to the end of the stub here.

@@ +273,5 @@
> +            Value expandoVal = obj->getFixedSlot(GetListBaseExpandoSlot());
> +            JSObject *expando = expandoVal.isObject() ? &(expandoVal.toObject()) : NULL;
> +            JS_ASSERT_IF(expando, expando->isNative() && expando->getProto() == NULL);
> +
> +            masm.loadValue(expandoAddr, tempVal);

We don't have to load the full Value if you pass expandoAddr to branchTestObject and branchTestUndefined below. Then we only need a scratch register for extractObject and the shape check.

@@ +309,5 @@
> +        // register (and restore it afterwards). After this point, we cannot jump
> +        // directly to |stubFailure| since we may still have to pop the object register.
> +        Label prototypeFailures;
> +        if (output.hasValue()) {
> +            scratchReg = output.valueReg().scratchReg();

Is it possible for the return type to be != MIRType_Value? If not we can JS_ASSERT(output.hasValue()); and simplify this a bit.

@@ +371,5 @@
> +
> +        // Native functions have the signature:
> +        //  bool (*)(JSContext *, unsigned, Value *vp)
> +        // Where vp[0] is space for an outparam, vp[1] is |this|, and vp[2] onward
> +        // are the function arguments.

Nit: duplicate comment a few lines above this one.

@@ +408,5 @@
> +
> +        // If the result's expected type is not MIRType_Value, then generate a type guard
> +        // that calls TypeScript::Monitor on failure.
> +        JS_ASSERT(output.hasTyped() || output.hasValue());
> +        if (output.hasTyped()) {

Is this case actually possible? If we always have a Value that's used by MTypeBarrier it would simplify things a lot.

@@ +412,5 @@
> +        if (output.hasTyped()) {
> +            Label typeGuardFailure;
> +
> +            switch(output.type()) {
> +            case MIRType_Undefined:

Style nit: indent switch cases + default with 2 spaces.

@@ +431,5 @@
> +            case MIRType_String:
> +                masm.branchTestString(Assembler::NotEqual, JSReturnOperand, &typeGuardFailure);
> +                break;
> +            case MIRType_Object:
> +                masm.branchTestObject(Assembler::NotEqual, JSReturnOperand, &typeGuardFailure);

For MIRType_Object we also have to guard on the TypeObjects, see MacroAssembler::guardTypeSet in IonMacroAssembler.cpp. We could try to call this method here, but if the IC is always followed by an MTypeBarrier it would simplify this a lot.

@@ +614,5 @@
> +
> +    if (shape->hasSlot() && shape->hasDefaultGetter())
> +        return true;
> +
> +    return false;

Nit: can we "return true" at the end, and invert the previous condition? That way we can later easily add more conditions.

@@ +628,5 @@
> +        return shape->getterValue().toObject().isFunction() &&
> +               shape->getterValue().toObject().toFunction()->isNative();
> +    }
> +
> +    return false;

Same here.

@@ +659,4 @@
>          return false;
>  
> +    if (isListBase) {
> +        if (!IsCacheableGetPropCallNative(checkObj, holder, shape))

We should also return here if cache.idempotent(), since getters may be effectful. Can you also add a JS_ASSERT(!idempotent()) to attachCallGetter just to be sure? Also, don't we want to handle non-ListBase getters too?

::: js/src/ion/arm/MacroAssembler-arm.h
@@ +607,5 @@
> +        branchPtr(cond, lhs, Imm32(ptr.value), label);
> +    }
> +
> +    void branchPrivatePtr(Condition cond, Register lhs, ImmWord ptr, Label *label) {
> +        branchPtr(cond, lhs, Imm32(ptr.value), label);

Nit: pass ptr instead of Imm32(ptr.value), same for x86.

::: js/src/ion/shared/MacroAssembler-x86-shared.h
@@ +378,5 @@
> +    bool buildOOLFakeExitFrame(const Register &scratch, void *fakeReturnAddr) {
> +        uint32 descriptor = MakeFrameDescriptor(framePushed(), IonFrame_JS);
> +        mov(ImmWord((uintptr_t) fakeReturnAddr), scratch);
> +        Push(Imm32(descriptor));
> +        Push(scratch);

This can be Push(ImmWord(fakeReturnAddr));

So that you don't need the scratch argument (on 32-bit this will push the value directly and on 64-bit it uses the scratch register)
Attachment #664126 - Flags: review?(jdemooij)
If there's not always a type barrier, IonBuilder could always add a MMonitorTypes instruction if the IC ever called a getter (there's already an accessGetter analysis bit which may do what you want), and we could store this bool in the IonCache and only generate a getter stub if the flag is set. That way the output is always a Value and we don't have to monitor anything.
(In reply to Jan de Mooij [:jandem] from comment #86)
> @@ +371,5 @@
> > +
> > +        // Native functions have the signature:
> > +        //  bool (*)(JSContext *, unsigned, Value *vp)
> > +        // Where vp[0] is space for an outparam

Note that vp[0] is more than that: it stores ObjectValue(*callee) before the call.
(Assignee)

Comment 89

7 years ago
Updates on progress.

On IRC bz identified some GetProps happening in DOM-Traverse that were always taking the slowpath VM call instead of directly calling the getter.  Investigation revealed that these objects weren't getting ICs at all, and that Ion was simply defaulting to making a VM call for them.  This was because the typeset for these objects included |null|, and Ion only builds ICs for |this| values which are definitely objects.

I changed the ListBase/GetProperty patch to do the following:
  1. Catch situations where the |this| of a GetProp could possibly be an object, but the typeset also included null and/or undefined, and emit a guarded IC for these cases.
  2. Change the CallNative IC path to also generate stubs for JSNative getters on regular native objects.

With this change along with the fixes to GetElem, DOM-Traverse improves significantly, to being slightly better than Dromaeo without Ion.

dromaeo.com/?id=180495,180496

The only remaining issue is the 20% slowdown on DOM-Query.  bz looked into that quickly and had some thoughts on IRC:

2:57 < bz> ok
12:57 < bz> so
12:57 < bz> in case you care
12:57 < bz> with JM, 40% of the time on that test is under the 
            getElementsByTagName getter
12:58 < bz> 23% is under the GetElem stub
12:58 < bz> 10% under length_getter
12:59 < bz> 5% nodeType getter
12:59 < bz> 12% jitcode
12:59 < bz> the rest looks like various overhead
12:59 < bz> in comparison, with IM, I see
13:00 < bz> 40% under GetElementCache, of which only 15% is under 
            proxy_GetElement.  That used to be 17% with JM
13:00 < bz> so the overhead of the GetElement stub was about 5% of the smaller 
            time
13:00 < bz> the overhead of GetElementCache is about 25% of the larger time
13:00 < bz> so that's basically the entire difference right there
13:01 < bz> djvj: ^
13:01 < bz> djvj: in particular, 7% of time is self time in GetElementCache
13:01 < bz> djvj: in particular, plus the AutoFlushCache and TypeMonitorResult 
            bits

I'll take a look and these soon.  At this point, enough progress has been made, and the changes are heavy enough to warrant a cleanup of the patches so far and push to inbound and then do the rest incrementally.  I'm spending a bunch of overhead time rebasing these repeatedly now.
(Assignee)

Comment 90

7 years ago
So last night I cleaned up these patches and rebased them, removed the debug cruft, and tested them out, and I got this:

http://dromaeo.com/?id=180507,180508

Ion is first, No-Ion is second.

Anyway, with the regression caused by the patches to bug 792215, I can't apply these to tip anymore.  This regression NEEDS to get nailed down soon.

bz:
Can you look into the memory leak on bug 792215?  I'll work in expanding these patches to also handle PropertyOps in case we can't track the memory leak down in time.
(Assignee)

Comment 91

7 years ago
Posted patch IC JSNative getters (obsolete) — Splinter Review
Attachment #664126 - Attachment is obsolete: true
(Assignee)

Comment 92

7 years ago
Change Ion's ICs to handle GETPROPs where the input object may possibly be null or undefined.  We just prefix the GetPropCache with a fallible unbox in this case.
(Assignee)

Comment 93

7 years ago
We track GETELEMS at runtime and record if we've ever seen a non-native object at a site, we record it.  Later on, when emitting ICs, we check if the property being retreived is an int32, and in that case, skip the IC.

This may be able to be generalized a bit, but that's a tuning thing.
(Assignee)

Comment 94

7 years ago
Added PropertyOp support to the patches yesterday.  The new numbers with all patches applied get us back to par on performance on everything except DOM-Attr, where there's a 20% slowdown still.

Dromaeo comparison: http://dromaeo.com/?id=180653,180654
(left side is with ion, right side is no-ion)

Posting the patches so far up for review and working on the remaining regression as a delta.
(Assignee)

Comment 95

7 years ago
Jan: This is basically similar to the patch to add Ion exit frames for calling JSNatives from OOL stubs, from bug 790051.
Attachment #665569 - Flags: review?(jdemooij)
(Assignee)

Comment 96

7 years ago
Posted patch IC JSNative getters (obsolete) — Splinter Review
Fixed patch from previous review.

Notes: I can't guarantee that the output type is going to be MIRType_Value.  As far as I can tell we cannot be sure of that looking at IonBuilder's jsop_getprop implementation, so we have to handle this case.

All other nits and fixes should be in.
Attachment #665100 - Attachment is obsolete: true
Attachment #665570 - Flags: review?(jdemooij)
(Assignee)

Comment 97

7 years ago
This is a delta off of the "IC JSNative getters" patch that adds handling for PropertyOps.  It seems more readable to present this as a separate delta so it's clear what changed.
Attachment #665572 - Flags: review?(jdemooij)
(Assignee)

Comment 98

7 years ago
This changes jsop_getprop in IonBuilder to IC GetProps even when the input object may be a null or undefined.  In those cases, it just prefixes the IC with a fallible unbox.
Attachment #665102 - Attachment is obsolete: true
Attachment #665573 - Flags: review?(dvander)
(Assignee)

Comment 99

7 years ago
This patch changes bytecode analysis to add a single boolean field that tracks whether the site of a GetProp has ever seen a non-native object.

Later on, when considering the generation of ICs for GetElem operations, we decide to NOT create IC if:
1. The GetElem key is an integer
2. We've seen non-native |this| objects at that location.

This basically avoids inserting GetElem ICs for array-accesses on ListBase objects.
Attachment #665105 - Attachment is obsolete: true
Attachment #665574 - Flags: review?(dvander)
(Assignee)

Comment 100

7 years ago
Forgot to mark previous attachment as a patch.
Attachment #665574 - Attachment is obsolete: true
Attachment #665574 - Flags: review?(dvander)
Attachment #665576 - Flags: review?(dvander)
(Assignee)

Comment 101

7 years ago
The 20% regression in DOM-attr is entirely in the Property Set code.  ICing callouts to StrictPropertyOp setters should handle it.
Attachment #665569 - Flags: review?(jdemooij) → review+
Comment on attachment 665570 [details] [diff] [review]
IC JSNative getters

Review of attachment 665570 [details] [diff] [review]:
-----------------------------------------------------------------

Nice. I think addressing the comment below deserves another review, but otherwise this is ready. I can review an updated patch this weekend.

::: js/src/ion/IonCaches.cpp
@@ +400,5 @@
> +            JSReturnOperand);
> +
> +        // If the result's expected type is not MIRType_Value, then generate a type guard
> +        // that calls TypeScript::Monitor on failure.
> +        JS_ASSERT(output.hasTyped() || output.hasValue());

Thinking about this more, it seems much simpler to do what JM does: in IonBuilder check if accessGetter is set and in that case always add a MonitorTypes instruction. Then propagate that flag to the IonCache and only generate getter stubs if it's set. The advantages are that (1) the result type is always MIRType_Value, simplifying regalloc inside this stub and (2) we don't have to duplicate the type check logic below.

@@ +679,5 @@
> +    } else if (IsCacheableGetPropCallNative(checkObj, holder, shape)) {
> +        // Don't enable call native stubs if cache is idempotent, since
> +        // they can be effectful.
> +        if (!cache.idempotent())
> +            callNative = true;

If cache.idempotent() it's possible for both callNative and readSlot to be false, which will trigger the assert below. We should return early here:

if (cache.idempotent())
    return true;

callNative = true;

@@ +701,5 @@
>  
>      if (cache.stubCount() < MAX_STUBS) {
>          cache.incrementStubCount();
>  
> +        if (readSlot) 

Nit: trailing whitespace.

::: js/src/ion/shared/MacroAssembler-x86-shared.h
@@ +374,5 @@
>          JS_ASSERT(framePushed() == initialDepth + IonExitFrameLayout::Size());
>          return true;
>      }
>  
> +    bool buildOOLFakeExitFrame(const Register &scratch, void *fakeReturnAddr) {

You can remove the now unused "scratch" argument.
Attachment #665570 - Flags: review?(jdemooij)
Attachment #665572 - Flags: review?(jdemooij) → review+
Attachment #665573 - Flags: review?(dvander) → review+
Attachment #665576 - Flags: review?(dvander) → review+
(Assignee)

Comment 103

7 years ago
Posted patch IC JSNative getters (obsolete) — Splinter Review
With issues addressed.  Some extra change in this one since mozilla-inbound recently made calling |getProto| on proxies a no-no.  Instead, we have to call |getTaggedProto| and convert that to an object pointer.
Attachment #665570 - Attachment is obsolete: true
Attachment #666031 - Flags: review?(jdemooij)
(Assignee)

Comment 104

7 years ago
Attached wrong patch (old on instead of updated one).  This is the correct one.
Attachment #666031 - Attachment is obsolete: true
Attachment #666031 - Flags: review?(jdemooij)
Attachment #666032 - Flags: review?(jdemooij)
Comment on attachment 665569 [details] [diff] [review]
Add ION exit frame type for OOL calls to PropertyOp

This landed with the wrong bug number (a closed bug that I can't access even with s-s privs at that; so I can't also post a message there for hg blame posterity):
https://hg.mozilla.org/mozilla-central/rev/4858ec60ad5f
Note: If the bug number had been correct, the marking tool we use (m-cMerge) would have closed this bug, since [leave open] hadn't been added to the whiteboard.
Based on Are we fast yet, the following commits are causing a performance regression on kraken / sunspider / v8 and assorted tests.

https://hg.mozilla.org/integration/mozilla-inbound/rev/44465ef545e3
https://hg.mozilla.org/integration/mozilla-inbound/rev/a2843362ce9b
https://hg.mozilla.org/mozilla-central/rev/a2843362ce9b
https://hg.mozilla.org/mozilla-central/rev/44465ef545e3
Status: NEW → RESOLVED
Last Resolved: 7 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla18
(In reply to Nicolas B. Pierron [:pierron] [:nbp] from comment #107)
> Based on Are we fast yet, the following commits are causing a performance
> regression on kraken / sunspider / v8 and assorted tests.
> 
> https://hg.mozilla.org/integration/mozilla-inbound/rev/44465ef545e3
> https://hg.mozilla.org/integration/mozilla-inbound/rev/a2843362ce9b

Backed out:

https://hg.mozilla.org/integration/mozilla-inbound/rev/59665618b6c9
https://hg.mozilla.org/integration/mozilla-inbound/rev/963b6aadad7c
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Comment on attachment 666032 [details] [diff] [review]
IC JSNative getters

Review of attachment 666032 [details] [diff] [review]:
-----------------------------------------------------------------

Awesome, r=me with comments below addressed.

::: js/src/ion/IonCaches.cpp
@@ +293,5 @@
> +        // If we need a scratch register, use either an output register or the object
> +        // register (and restore it afterwards). After this point, we cannot jump
> +        // directly to |stubFailure| since we may still have to pop the object register.
> +        Label prototypeFailures;
> +        if (output.hasValue()) {

"output" is always a Value now so you can remove the other two branches, the "restoreScratch" boolean and the comment.

@@ +467,5 @@
> +{
> +    MacroAssembler masm;
> +    RepatchLabel failures;
> +
> +    JS_ASSERT(!idempotent());

Also add JS_ASSERT(allowGetters());

@@ +578,5 @@
> +    bool callNative = false;
> +
> +    if (IsCacheableGetPropReadSlot(checkObj, holder, shape)) {
> +        readSlot = true;
> +    } else if (IsCacheableGetPropCallNative(checkObj, holder, shape) && !cache.idempotent()) {

This should also check cache.allowGetters().
Attachment #666032 - Flags: review?(jdemooij) → review+
(Assignee)

Comment 111

7 years ago
(In reply to Ed Morley [:edmorley UTC+1] from comment #106)
> Note: If the bug number had been correct, the marking tool we use (m-cMerge)
> would have closed this bug, since [leave open] hadn't been added to the
> whiteboard.

Thanks Ed.  I'll add that to when I check it in next.  Sorry about the mixup.  They've been backed out by nbp for perf regressions anyway so I'll be pushing them in again, this time more careful with the checking message.
(Assignee)

Comment 112

7 years ago
(In reply to Kannan Vijayan [:djvj] from comment #111)
> (In reply to Ed Morley [:edmorley UTC+1] from comment #106)
> > Note: If the bug number had been correct, the marking tool we use (m-cMerge)
> > would have closed this bug, since [leave open] hadn't been added to the
> > whiteboard.
> 
> Thanks Ed.  I'll add that to when I check it in next.  Sorry about the
> mixup.  They've been backed out by nbp for perf regressions anyway so I'll
> be pushing them in again, this time more careful with the checking message.

Ed: Actually, that patch wasn't backed out.  I'll back it out and re-commit it with the appropriate bug number.
(Assignee)

Updated

7 years ago
Whiteboard: [ion:p1:fx18] → [ion:p1:fx18] [leave open]
(Assignee)

Comment 113

7 years ago
(In reply to Ed Morley [:edmorley UTC+1] from comment #105)
> Comment on attachment 665569 [details] [diff] [review]
> Add ION exit frame type for OOL calls to PropertyOp
> 
> This landed with the wrong bug number (a closed bug that I can't access even
> with s-s privs at that; so I can't also post a message there for hg blame
> posterity):
> https://hg.mozilla.org/mozilla-central/rev/4858ec60ad5f

Backed out and recommitted with correct message:

backout: https://hg.mozilla.org/integration/mozilla-inbound/rev/a7c637172709
recommit: https://hg.mozilla.org/integration/mozilla-inbound/rev/fcac73861b03
(Assignee)

Comment 114

7 years ago
(In reply to Nicolas B. Pierron [:pierron] [:nbp] from comment #109)
> (In reply to Nicolas B. Pierron [:pierron] [:nbp] from comment #107)
> > Based on Are we fast yet, the following commits are causing a performance
> > regression on kraken / sunspider / v8 and assorted tests.
> > 
> > https://hg.mozilla.org/integration/mozilla-inbound/rev/44465ef545e3
> > https://hg.mozilla.org/integration/mozilla-inbound/rev/a2843362ce9b
> 
> Backed out:
> 
> https://hg.mozilla.org/integration/mozilla-inbound/rev/59665618b6c9
> https://hg.mozilla.org/integration/mozilla-inbound/rev/963b6aadad7c

The bug for the "Skip GetElem ICs for non-native objects" was that Array objects were getting counted as non-native (as their isNative was false).  Fixing that leaves a 0.6% regression in Kraken still, I'm guessing because the patch moves a GetPCScript from inside a conditional that'll execute some of the time to the main function body and execute all of the time.

Reordering the checks to put the cheap ones first (check for non-native, non-array object before doing the GetPCScript) fixes this.  I'll check this in tomorrow after I confirm that this doesn't affect the perf improvements to the dromaeo code.

Still need to test the "allow ICing of GETPROPS on potentially null/undefined objects" patch independently to see if it's causing perf snags.  It really shouldn't, and I expect to be able to check that back in as is.   Reasoning:  GetProps on null or undefined will already raise TypeErrors, which will take the deopt path in all cases in Ion, so I'm not adding any new deoptimization cases.  Alternatively, I'm converting some unconditional calls to GetProperty to ICs, in cases where the presence of null or undefined in the typeset is really an artifact of type-inference and not "real" in any meaningful sense.
(Assignee)

Comment 116

7 years ago
Fixed patch.  Same logic as before, but the |this| is checked to be both non-array AND non-native instead of just non-native, and the GetPcScript is nested within a check for whether the target is a non-native, non-array object, to make it rarer.
Attachment #665576 - Attachment is obsolete: true
Attachment #666355 - Flags: review?(dvander)
Attachment #666355 - Flags: review?(dvander) → review+
(Assignee)

Comment 117

7 years ago
Bug 786126 - part 2 - Skip IC generation for GetElems of int32 indexes on non-native objects. (r=dvander)
https://hg.mozilla.org/integration/mozilla-inbound/rev/738c9ad0f809
(Assignee)

Comment 118

7 years ago
Just looked into the 'Allow null/undefined in Ion IC this-object typeset' patch.  These do introduce a regression (of about 2.5% on kraken and 0.5% on sunspider on 64-bit Linux).

The regressions are caused by extra recompiles introduced by the change.

In SS, the recompiles are on 3d-raytrace:189 (Scene.prototype.intersect), and in Kraken/crypto-ccm (crypto-ccm-data line 20).

I'm not sure why these recompile are getting induced.  But I'm putting that aside for the time being to investigate the try-failures for the IC jsnative/property-op getters patch, seen here:

https://tbpl.mozilla.org/?tree=Try&rev=30c1f7c47411
(Assignee)

Updated

7 years ago
Depends on: 795801
(Assignee)

Updated

7 years ago
Depends on: 795803
(Assignee)

Updated

7 years ago
Depends on: 797435
(Assignee)

Comment 120

7 years ago
Currently ICs are disabled on otherwise object-typed values which the typeset indicates may potentially be null or undefined.  This patch enables ICing of those values by prefixing the IC with a fallible object unbox.

The previous implementation of this patch didn't work because it reused the existing codepath for creating the IC instruction, which does things like mark the IC idempotent based on snooping JM IC info, and other fancy stuff.  Expanding the situations in which that code is run actually introduces regressions in benchmarks that don't have anything to do with the DOM case we're trying to address here.

This patch adds an extra path where a GetPropertyCache can be created.  This GetPropertyCache will have nothing special done to it (like getting marked idempotent).  This avoids the SS/Kraken regressions but keeps the perf gains on dromaeo.
Attachment #665573 - Attachment is obsolete: true
Attachment #668235 - Flags: review?(sstangl)
Comment on attachment 668235 [details] [diff] [review]
Allow null/undefined in Ion IC this-object typeset.

Review of attachment 668235 [details] [diff] [review]:
-----------------------------------------------------------------

This looks good. IonBuilder::jsop_getprop() desperately needs a cleanup, but that can be done later.

::: js/src/ion/IonBuilder.cpp
@@ +5878,5 @@
>          return makeCallBarrier(getter, 0, false, types, barrier);
>      }
>  
> +    // If the input is guaranteed to be an object, then we want
> +    // to specialize it via an slot load or an IC.

uber-nit: 'a slot load'

@@ +5933,4 @@
>              load->setAllowGetters();
> +
> +        ins = load;
> +    } else if(obj->type() == MIRType_Value &&

uber-nit: space after 'if'. Both conditions can probably be expressed on a single line.
Attachment #668235 - Flags: review?(sstangl) → review+
(Assignee)

Comment 122

7 years ago
Pushed: Part 3 - Allow null/undefined in typeset of target object of GetProp ICs.
Try: https://tbpl.mozilla.org/?tree=Try&rev=1b80b1a5f1e2
Hg: https://hg.mozilla.org/integration/mozilla-inbound/rev/825de541f566
(Assignee)

Comment 123

7 years ago
With this patch in, my dromaeo run with ion-on vs ion-off showed the following results (left side is ion, right side is no-ion):

http://dromaeo.com/?id=181221,181222

Overall score is at par or better on Ion.  We are still regressing about 10% in the DOM-Attr subset, due to slowdowns in the |element.property = value| (setter) and |element.expando = value| subtests.

Also, in DOM-Query, there is a single test which is slower with ion than with no-ion: "getElementById (not in document)" gets 1567 for Ion vs. 1768 for no-ion.

Tomorrow I take a look at what JM is doing with its setter and dom/expando handling.
"getElementById (not in document)" just measures the performance of some function calls that return null and assign it to a global (declared with var) variable.  Not sure what, if anything, could go wrong there.

"element.property = value" is presumably exercising your new JSStrictPropertyOp setter code, right?

"element.expando" (the _getter_ is what dromaeo claims is slower) is just a completely vanilla property get of a property stored in an object slot.  The only caveat is that it's doing the get 10240 times on the same element with different property names for each get.  I would not be surprised if JM just disables its IC in that situation or some such....  But there's nothing DOM-specific about that part of the test.  Here's a shell testcase that shows the same slowdown with ion compared to --no-ion:

var a = {};
var num = 10240;
for ( var i = 0; i < num; i++ )
    a["test" + num] = function(){};
var ret;
var start = new Date;
for (var j = 0; j < 100; ++j)
    for ( var i = 0; i < num; i++ )
	ret = a["test" + num];
var end = new Date;
print(end - start)
Yeah, on that testcase JM has its GetElem() stubcall which spends most of its time atomizing the string and calling GetProperty.

Ion has GetElementCache, which interns the string and calls GetProperty, but also has calls to TypeMonitorResult and the AutoFlushCache stuff which might well account for the difference, at least in part.
Duplicate of this bug: 792490
Duplicate of this bug: 790386
(Assignee)

Comment 129

7 years ago
Creating new bugs for the individual test regressions in dromaeo-dom.
(Assignee)

Updated

7 years ago
Depends on: 801105
I did a few Dromaeo runs of Fx19 versus Fx17.

DOM Core: http://dromaeo.com/?id=181914,181945
Indeed, most of this one seems to be the attributes tests.

DOM JSLib: http://dromaeo.com/?id=181932,181947
This one isn't on Talos so I'm not sure if we care. The differences seem small, anyway.

DOM CSS: http://dromaeo.com/?id=181936,181948
We still have a ways to go here - that is bug 786160, I guess.
(Assignee)

Updated

7 years ago
Depends on: 807498

Updated

7 years ago
Depends on: 810588
Kannan, the vast majority of this bug is fixed, right? Should we continue to track this?
(Assignee)

Comment 132

7 years ago
Yeah, the only major thing I can see that's left is disabling of ICs when they get too big.  We can break that out into a separate bug and track it separately, since we are within target numbers for releasing with Ion.

I'm closing bug 801105 (meta bug for DOM-attr regressions).
Bug 807498 is for IC disabling - I'll leave that open.

Technically the first bug depends on the second, bug I think it'd be easier to track what needs to be done now by closing this bug and the dom-attr meta bug, and just leaving the IC-disabling bug to wrap up at some point.

Ack?
Closing this as fixed seems fine.  Let's focus on bug 807498 and bug 786160, and perhaps on better support for proxies.
(Assignee)

Comment 134

7 years ago
Ok closing.
Status: REOPENED → RESOLVED
Last Resolved: 7 years ago7 years ago
Resolution: --- → FIXED

Updated

6 years ago
Depends on: 841956
Duplicate of this bug: 787892
Depends on: 848088
You need to log in before you can comment on or make changes to this bug.