Closed
Bug 962256
Opened 11 years ago
Closed 11 years ago
Exact rooting regressed dromaeo_css and mootools by ~5%
Categories
(Core :: JavaScript Engine, defect)
Core
JavaScript Engine
Tracking
()
RESOLVED
FIXED
mozilla30
People
(Reporter: terrence, Assigned: bzbarsky)
References
Details
Attachments
(4 files, 2 obsolete files)
11.80 KB,
patch
|
jonco
:
review+
|
Details | Diff | Splinter Review |
114.63 KB,
text/html
|
Details | |
6.96 KB,
patch
|
bholley
:
review+
Sylvestre
:
approval-mozilla-aurora+
|
Details | Diff | Splinter Review |
6.79 KB,
patch
|
Details | Diff | Splinter Review |
See this for a starting place:
http://perf.snarkfest.net/compare-talos/index.html?oldRevs=b3c08e6fa790&newRev=61fd0f987cf2&submit=true
We should be able to profile the harder-hit tests to find the regressing roots.
Assignee | ||
Comment 1•11 years ago
|
||
More precisely, it regressed dromaeo_css by 5%. The mootools subtest regressed by 10%. The subtests of the mootools subtest I don't have data on from compare-talos, but it's worth running on dromaeo.com to see what's going on there.
tracking-firefox29:
--- → ?
Comment 2•11 years ago
|
||
Could you help us understand how this will impact users?
Flags: needinfo?(bzbarsky)
Assignee | ||
Comment 3•11 years ago
|
||
How a 10% regression in our handling of a popular library will impact users? Probably by making web pages slower.
Flags: needinfo?(bzbarsky)
Updated•11 years ago
|
Comment 4•11 years ago
|
||
(In reply to Boris Zbarsky [:bz] from comment #3)
> How a 10% regression in our handling of a popular library will impact users?
> Probably by making web pages slower.
Thanks for that I was not aware of the popularity of the library. Tracking based on Comment 3.
Reporter | ||
Comment 5•11 years ago
|
||
For the mootools:div~div test, most of the extra time looks like it's distributed like:
6.3% - js::Proxy::callProp
6.1% - js::Proxy::get
5.5% - mozilla::dom::HTMLDocumentBinding::DOMProxyHandler::get
1.8% - JS_ForwardGetPropertyTo
1.6% - JS_HasPropertyById
0.9% - JS_AlreadyHasOwnPropertyById
5.2% - js::jit::DoGetElemFallback
2.1% - proxy_GetElement
1.9% - js::Proxy::get
1.4% - mozilla::dom::NodeListBinding::DOMProxyHandler::get
1.5% - js::jit::DoGetPropFallback
1.2% - nsINode::QuerySelectorAll
Percentages are increase of total time.
Also of note: Call->Invoke->Invoke->RunScript->Interpret appears to be about 0.2% slower in this trace.
Assignee | ||
Comment 6•11 years ago
|
||
JS_ForwardGetPropertyTo ends up re-rooting everything. Maybe we should just handlify it? The callsites in mozilla::dom::HTMLDocumentBinding::DOMProxyHandler::get are all already rooted.
JS_HasPropertyById is already handlified, though.... So is JS_AlreadyHasOwnPropertyById.
nsINode::QuerySelectorAll shouldn't be affected by exact rooting at all. :(
Comment 7•11 years ago
|
||
I can take a look at handlifying JS_ForwardGetPropertyTo. It doesn't seem to be on the list of bug 959787.
Comment 8•11 years ago
|
||
I'd guess that JS_HasPropertyById and JS_AlreadyHasOwnPropertyById are more expensive with exact rooting because they locally root the return value of a function they call (an object and a shape). Handlification doesn't help because they aren't returning those values to any other function, they are just checking that the return value satisfies some property.
JS_HasPropertyById only checks that the returned shape isn't null (ignoring the object), and JS_AlreadyHasOwnPropertyById only checks that the returned object is the same as the object passed in (ignoring the shape). In theory, I could imagine creating some specialized version of LookupPropertyById that avoids the need to root the return values, by performing those checks deep within the callees, but in practice it would have to happen deep within some call stack, and it seems to involve some kind of generic function callbacks, so I'm not sure if you could even do it.
Reporter | ||
Comment 9•11 years ago
|
||
Two of those four roots aren't used after the call, so it may be safish to from-marked-location on them, as attached. Unfortunately that code calls into the ops table, so if someone is using those outparams to root and expects them to stay live, it could be quite bad.
Another option would be to make the ops support <NoGC>. Or to have some ballast rooted slots around. We could also have an interface that passes in the temporaries so that the caller could shave roots, maybe using the JS stack. All of that is going to be much more code, however.
Reporter | ||
Comment 10•11 years ago
|
||
This is a bit less sketchy. It removes or delays some roots in baseline's GetProp and GetElem fallbacks. I don't think it introduces any new rooting issues, but I've kicked off a try run to verify. Please disregard the review request if that is red.
https://tbpl.mozilla.org/?tree=Try&rev=c9d5fc5339fc
It seems to save about 0.3% - 1.2%, depending on the trace. Having run a handful more traces now, I can say that the stack from main up to event dispatch is either 0% or 6% slower, depending on the trace. I'm guessing this is a layout/code size/timing issue related to adding tons of new inline code more than specific instructions. Would be nice to hit this with VTune to be sure.
The majority of the remaining overhead (~5%) that might feasibly be in our control appears to be the Rooted in Proxy's INVOKE_ON_PROTOTYPE. I'm going to try getPrototype<NoGC> next.
Attachment #8365567 -
Attachment is obsolete: true
Attachment #8366269 -
Flags: review?(sphink)
Reporter | ||
Comment 11•11 years ago
|
||
I was (unsurprisingly) reading instrument's output incorrectly. For GetElem, the above patch actually does:
Total Self
Without: 660ms 255ms
With: 619ms 188ms
No EXR: 536ms 160ms
Not perfect, but still a substantial improvement. Looks like there is one hazard for me to deal with.
The GetProp stub is restored to its original performance, but is too few samples to make any real conclusions either way.
Reporter | ||
Comment 12•11 years ago
|
||
This should be better.
https://tbpl.mozilla.org/?tree=Try&rev=d9056942b386
Attachment #8366269 -
Attachment is obsolete: true
Attachment #8366269 -
Flags: review?(sphink)
Attachment #8366346 -
Flags: review?(sphink)
Reporter | ||
Comment 13•11 years ago
|
||
Proxy::get is not implicated: we bail to handler->get before creating the root. One instance of Proxy::get is even faster in the traces I'm looking at. It looks like the 1% performance shift either way is down to noise.
What *is* implicated is mozilla::dom::HTMLDocumentBinding::DOMProxyHandler::get: the self time here goes from 74ms before to 158ms afterwards. Other than that it's down to 5ms here, 7ms there. There may be some fat in the SetElementIC to trim too, including 20ms in DefinePropertyOrElement. Note, another instance of DOMProxyHandler for NodeListBinding is only a tiny bit slower too, so it may be something specific to HTMLDocumentBinding.
Reporter | ||
Updated•11 years ago
|
Attachment #8366369 -
Attachment description: cssquery-mootools.html → A single-file testcase useful for profiling.
Reporter | ||
Comment 14•11 years ago
|
||
Looking at the generated code, it seems that HTMLDocumentBinding::DOMProxyHandler::get has 4 new Rooted where 1 would do and NodeListBinding::DOMProxyHandler::get has maybe 1, so the numbers above do seem reasonable.
Assignee | ||
Comment 15•11 years ago
|
||
Hmm. So the Rooted in HTMLDocumentBinding::DOMProxyHandler::get are:
1) unforgeableHolder. Really does need to be rooted, afaict.
2) expando. Needs to be rooted if not null.
3) nameVal. Needs to be rooted, afaict.
4) result. This could probably be a raw JSObject*.
How would 1 Rooted do here?
Reporter | ||
Comment 16•11 years ago
|
||
Those 4 roots have non-overlapping lifetimes, so for the objects you could do what we do for the interpreter: Rooted<JSObject*> root; Rooted<JSObject*>& unforgeableHolder = root; Rooted<JSObject*>& expando = root; etc. Or just re-use the root with a generic name.
The Value nameVal is only used in the else branch of a MOZ_LIKELY if-statement, so could be moved into the MOZ_UNLIKELY half and hopefully not created on most calls.
Of course all this is generated code, so those interdependent pieces might be quite hard to implement in practice. I was not even able to find where any of that is generated with my brief grepping.
Assignee | ||
Comment 17•11 years ago
|
||
Ah, I see. I'll write up a patch to try that so we can test it.
Reporter | ||
Comment 18•11 years ago
|
||
Comment on attachment 8366346 [details] [diff] [review]
shave_roots_bl_get_stubs-v1.diff
Switching review, since Steve is sick this week.
Attachment #8366346 -
Flags: review?(sphink) → review?(jcoppeard)
Comment 19•11 years ago
|
||
Comment on attachment 8366346 [details] [diff] [review]
shave_roots_bl_get_stubs-v1.diff
Review of attachment 8366346 [details] [diff] [review]:
-----------------------------------------------------------------
So reduces the amount of rooting by rooting things later or not at all, mostly the JSScript where this is not used past something that can GC.
As such it looks fine, except for:
::: js/src/jit/BaselineIC.cpp
@@ +3971,2 @@
> // Handle optimized arguments[i] access.
> if (!GetElemOptimizedArguments(cx, frame, &lhsCopy, rhs, res, &isOptimizedArgs))
It turns out that GetElemOptimizedArguments() can set isOptimizedArgs to false and also update lhsCopy, meaning that this rearrangement is not valid.
We need to root lhsCopy in the original place so that GelElementOperation() below can see the new value.
The name 'isOptimizedArgs' is pretty misleading - it seems it really means 'is optimised args, and has been handled already'. Maybe renaming to 'handledOptimizedArgs' would make this clearer.
Attachment #8366346 -
Flags: review?(jcoppeard) → review+
Reporter | ||
Comment 20•11 years ago
|
||
Comment 21•11 years ago
|
||
Status: ASSIGNED → RESOLVED
Closed: 11 years ago
status-firefox29:
--- → fixed
Resolution: --- → FIXED
Target Milestone: --- → mozilla29
Assignee | ||
Comment 22•11 years ago
|
||
How do the numbers look now? Should I still be looking into the comment 16 issue?
Flags: needinfo?(terrence)
Reporter | ||
Comment 23•11 years ago
|
||
(In reply to Boris Zbarsky [:bz] from comment #22)
> How do the numbers look now? Should I still be looking into the comment 16
> issue?
http://graphs.mozilla.org/graph.html#tests=[[72,63,21]]&sel=none&displayrange=30&datatype=running
Hard to interpret. We're maybe about halfway back to where we were on dromaeo-css on 11 Jan, but still ahead of where we were on 10 Jan? The roots in the last patch appeared to be about 20% of the overall slowdown with the roots in comment 16 being about 60%. I think we should still at least try to take a stab at those still. Deadline for uplift is this weekend; is there anything I can do to help?
Status: RESOLVED → REOPENED
Flags: needinfo?(terrence)
Resolution: FIXED → ---
Assignee | ||
Comment 24•11 years ago
|
||
Another interesting question: why are we even ending up in Proxy::get here? What's a typical jsid coming through on those gets?
Assignee | ||
Comment 25•11 years ago
|
||
Attachment #8368199 -
Flags: review?(peterv)
Assignee | ||
Comment 26•11 years ago
|
||
Note that this basically backs out bug 950991, since JSAutoCompartment in fact can't GC now
Assignee | ||
Updated•11 years ago
|
Assignee: terrence → bzbarsky
Status: REOPENED → ASSIGNED
Assignee | ||
Comment 27•11 years ago
|
||
I filed bug 965992 on not calling into get() here at all. Once we land the attached patch, I think we should just mark this resolved and then focus on avoiding the slow path. ;)
Depends on: 965992
Reporter | ||
Comment 28•11 years ago
|
||
Agreed! I'm going to close this, as I think we've reached satisfaction with our investigation.
Status: ASSIGNED → RESOLVED
Closed: 11 years ago → 11 years ago
status-firefox29:
fixed → ---
Resolution: --- → FIXED
Target Milestone: mozilla29 → ---
Reporter | ||
Comment 29•11 years ago
|
||
Whoops! I didn't realize that Boris' patch was still waiting on review.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Updated•11 years ago
|
Flags: needinfo?(bobbyholley)
Comment 30•11 years ago
|
||
And please don't forget the uplift request to 29 ;)
Assignee | ||
Comment 31•11 years ago
|
||
Hrm. Which parts of this need uplifting to 29? We should not have landed parts of the bug without others... :(
Comment 32•11 years ago
|
||
That might sound stupid but patches which fixed this bug ? :)
Assignee | ||
Comment 33•11 years ago
|
||
Yes, and which of those are not on 29 already?
Comment 34•11 years ago
|
||
Sorry, I don't know. This bug is on my radar because it still open and tracked.
Comment 35•11 years ago
|
||
Comment on attachment 8368199 [details] [diff] [review]
yet another part. Eliminate some unnecessary roots in HTMLDocumentBinding::DOMProxyHandler::get.
Review of attachment 8368199 [details] [diff] [review]:
-----------------------------------------------------------------
The generated code here looks reasonable enough. I'll trust Boris on the design bits. r=bholley
Attachment #8368199 -
Flags: review?(peterv) → review+
Updated•11 years ago
|
Flags: needinfo?(bobbyholley)
Assignee | ||
Comment 36•11 years ago
|
||
So to answer my own question, the parts that are not my patch already landed on 29, so don't need backporting.
Assignee | ||
Comment 37•11 years ago
|
||
Flags: in-testsuite+
Assignee | ||
Comment 38•11 years ago
|
||
Comment on attachment 8368199 [details] [diff] [review]
yet another part. Eliminate some unnecessary roots in HTMLDocumentBinding::DOMProxyHandler::get.
[Approval Request Comment]
Bug caused by (feature/regressing bug #): Exact rooting turn on
User impact if declined: Slightly slower DOM manipulation
Testing completed (on m-c, etc.): Compiles and runs.
Risk to taking this patch (and alternatives if risky): Risk should be pretty low.
String or IDL/UUID changes made by this patch: None.
Attachment #8368199 -
Flags: approval-mozilla-aurora?
Status: REOPENED → RESOLVED
Closed: 11 years ago → 11 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla30
Updated•11 years ago
|
status-firefox29:
--- → affected
status-firefox30:
--- → fixed
Updated•11 years ago
|
Attachment #8368199 -
Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
Comment 40•11 years ago
|
||
You need to log in
before you can comment on or make changes to this bug.
Description
•