Closed
Bug 638328
Opened 14 years ago
Closed 13 years ago
Investigate the XPC overhead on WebGL calls.
Categories
(Core :: Graphics: CanvasWebGL, defect)
Core
Graphics: CanvasWebGL
Tracking
()
RESOLVED
FIXED
mozilla10
People
(Reporter: mattwoodrow, Assigned: bjacob)
References
(Depends on 1 open bug)
Details
Attachments
(4 files, 1 obsolete file)
1.61 KB,
patch
|
peterv
:
review+
|
Details | Diff | Splinter Review |
1.39 KB,
patch
|
peterv
:
review+
|
Details | Diff | Splinter Review |
5.45 KB,
patch
|
peterv
:
review+
|
Details | Diff | Splinter Review |
1.02 KB,
patch
|
roc
:
review+
|
Details | Diff | Splinter Review |
Inside a call to glUniform3f, I'm see a lot of time spent inside XPC (I think!).
xpc_qsUnwrapArgImpl - 1.1%
castNative - 1.1%
getWrapper - 0.4%
JS_ValueToNumber - 0.2%
The total of these is almost more time than is spent actually inside glUniform3f(). Is there any way we can reduce these?
Comment 1•14 years ago
|
||
There are some ways, yes. And there are plans to redo how all that works for Fx6 or so. Which quickstub are we talking about here? You may be able to do some stuff short-term to make this faster....
Assignee | ||
Comment 2•14 years ago
|
||
the quickstub for WebGL.uniform3f is defined in
content/canvas/src/CustomQS_WebGL.h
and has a 'TN version' (not that I actually know what that is. Vlad wrote that.)
Comment 3•14 years ago
|
||
TN == Traceable Native. That would be called from trace, but on jsgamebench we seem to always be in the methodjit (something else worth figuring out).
Depends on: 622298
Comment 4•14 years ago
|
||
Oh, and I did look at the code; none of the classes involved are doing any of the fast-path stuff we can do in quickstubs, and adding it would be some work. Probably better to just wait for the new bindings work, hence the dependency.
Assignee | ||
Comment 5•14 years ago
|
||
See bug 638321 comment 5: the XPC overhead for certain functions, at least Uniform3f, Uniform4f and BindTexture, is huge (it is as expensive as the function itself).
Assignee | ||
Comment 6•14 years ago
|
||
For Uniform4f:
| + 2.6%, nsIDOMWebGLRenderingContext_Uniform4f(JSContext*, unsigned int, unsigned long long*), XUL
| | + 1.0%, mozilla::WebGLContext::Uniform4f(nsIWebGLUniformLocation*, float, float, float, float), XUL
| | | - 0.7%, glUniform4fARB_Exec, GLEngine, XUL
| | | 0.1%, nsCOMPtr_base::assign_from_qi_with_error, XUL
| | | 0.1%, mozilla::WebGLUniformLocation::Release(), XUL
| | | 0.0%, nsCOMPtr_base::~nsCOMPtr_base(), XUL
| | - 0.4%, xpc_qsUnwrapArgImpl, XUL
| | - 0.3%, getWrapper, XUL
| | 0.2%, mozilla::WebGLContext::QueryInterface(nsID const&, void**), XUL
| | - 0.2%, castNative, XUL
| | - 0.2%, mozilla::WebGLContext::Release(), XUL
| | 0.1%, JS_ValueToNumber, XUL
| | 0.1%, glUniform4f, libGL.dylib
So despite being quickstubbed, we pay a huge price for just converting JS arguments to C++ values.
Assignee | ||
Comment 7•14 years ago
|
||
Another remark: we also pay a high price (10% in that function) for QueryInterface and Release.
Assignee | ||
Comment 8•14 years ago
|
||
Let's break down the time spent in Uniform4f:
* 38% spent converting JS arguments to C++ values
* 31% spent in OpenGL
* 12% spent in Release() (on WebGLContext and on WebGLUniformLocation)
* 12% spent in QueryInterface and assign_from_qi_with_error
Comment 9•14 years ago
|
||
Adding a bit for nsIDOMWebGLRenderingContext and nsIWebGLUniformLocation to DOMCI_CASTABLE_INTERFACES might help.
Assignee | ||
Comment 10•14 years ago
|
||
Now this is weird: a bug report complains that those uniformXx functions perform actually worse in Chrome than in FF:
http://code.google.com/p/chromium/issues/detail?id=74526
@ Peter: thanks for the tip, I will try to make a patch.
Assignee | ||
Comment 11•13 years ago
|
||
Finally got Peter to explain me what he meant!
Peter -> just one thing I don't remember. I can understand that this will allow to optimize things on the xpconnect/quickstubs side. But what about "my" side, in the C++ implementation of WebGL functions: there, I am doing QueryInterface to get e.g. a WebGLUniformLocation* from a nsIWebGLUniformLocation*. Will this QueryInterface magically go faster thanks to the present patch, or should I replace it by something else? (A plain cast?)
Attachment #560851 -
Flags: review?(peterv)
Assignee | ||
Comment 12•13 years ago
|
||
Some WebGL functions were opted out from auto-quickstubbing because they used to directly access the JS context to manipulate the JS return value. They don't do that anymore, instead they return a nsIVariant.
Attachment #560852 -
Flags: review?(peterv)
Assignee | ||
Comment 13•13 years ago
|
||
previous version didn't build, as getContextAttributes is still manipulating its return jsval. Other functions are fine.
Attachment #560852 -
Attachment is obsolete: true
Attachment #560852 -
Flags: review?(peterv)
Attachment #560854 -
Flags: review?(peterv)
Comment 14•13 years ago
|
||
Benoit, if you list nsIWebGLUniformLocation as "builtinclass" in the IDL (see the way nsIDOMEventTarget does it, say), then you can assume that given an nsIWebGLUniformLocation the C++ class is one that you wrote. If there's only one class that implements that interface, then you can just cast to it instead of using QI.
Assignee | ||
Comment 15•13 years ago
|
||
Should I be parcimonious in sprinkling "builtinclass" over the IDL?
I mean, should I only put "builtinclass" on nsIWebGLUniformLocation and other super-performance-critical interfaces, or is it OK to put it all over the place? What is the cost of putting "builtinclass" on more classes than necessary?
I'm asking because in the WebGL implementation we have some generic functions to handle WebGL function arguments, see GetConcreteObjectAndGLName and friends in WebGLContext.h, so it would be nice to keeping treating all WebGL function parameters in the same way, rather than having to treat the "builtinclass" ones differently.
Assignee | ||
Comment 16•13 years ago
|
||
...or wait, is it possible to determine, using template metaprogramming, whether a type is a "builtinclass"? That would allow me to update GetConcreteObject and friends to handle this in a completely fool-proof way.
Comment 17•13 years ago
|
||
(In reply to Benoit Jacob [:bjacob] from comment #15)
> I mean, should I only put "builtinclass" on nsIWebGLUniformLocation and
> other super-performance-critical interfaces, or is it OK to put it all over
> the place? What is the cost of putting "builtinclass" on more classes than
> necessary?
You can put it on any class that's never supposed to be implemented in JS. There shouldn't be any downside afaik. Note that from C++, for example from GetConcreteObjectAndGLName, there is no easy way to detect whether the interface is marked "builtinclass" or not.
Assignee | ||
Comment 18•13 years ago
|
||
Attachment #564061 -
Flags: review?(peterv)
Updated•13 years ago
|
Attachment #560854 -
Flags: review?(peterv) → review+
Updated•13 years ago
|
Attachment #560851 -
Flags: review?(peterv) → review+
Comment 19•13 years ago
|
||
Comment on attachment 564061 [details] [diff] [review]
use builtinclass and remove do_QueryInterface
Review of attachment 564061 [details] [diff] [review]:
-----------------------------------------------------------------
::: content/canvas/src/WebGLContext.h
@@ +2111,5 @@
>
> if (isNull)
> *isNull = PR_FALSE;
>
> + *aConcreteObject = static_cast<ConcreteObjectType*>(aInterface);
It might still be a good idea to assert in a debug build that a QI gives the same object (especially since all those classes already have an IID). Could you also file a bug to add a way to statically assert from C++ that a given interface is marked builtinclass, and to do that here? Given that this is a fairly generic conversion function it'd be good that we could make sure that anything that this converts is safe.
Attachment #564061 -
Flags: review?(peterv) → review+
Assignee | ||
Comment 20•13 years ago
|
||
Filed follow-up bug 694114.
Assignee | ||
Comment 21•13 years ago
|
||
Comment 22•13 years ago
|
||
https://hg.mozilla.org/mozilla-central/rev/ac13dbf11d24
https://hg.mozilla.org/mozilla-central/rev/ed8aa9cecb82
https://hg.mozilla.org/mozilla-central/rev/ca162fd3c352
Assignee: nobody → bjacob
Status: NEW → RESOLVED
Closed: 13 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla10
Assignee | ||
Comment 23•13 years ago
|
||
This crept in my last minute changes, unreviewed, to address review comments. sorry.
Attachment #567068 -
Flags: review?(roc)
Attachment #567068 -
Flags: review?(roc) → review+
Assignee | ||
Comment 24•13 years ago
|
||
Assignee | ||
Comment 25•13 years ago
|
||
These patches make a huge difference on Google MapsGL!
Assignee | ||
Comment 26•13 years ago
|
||
Should I try to get Aurora approval for these patches? If WebGL performance is considered strategic?
Comment 27•13 years ago
|
||
Comment on attachment 567068 [details] [diff] [review]
fix bug in assertion, bad usage of queryinterface
>+ nsCOMPtr<ConcreteObjectType> tmp(do_QueryInterface(aInterface, &rv));
Nit: this should actually read
nsRefPtr<ConcreteObjectType> tmp(do_QueryObject(aInterface, &rv));
[The effect is the same in this case.]
Updated•13 years ago
|
OS: Mac OS X → All
Hardware: x86 → All
Assignee | ||
Updated•13 years ago
|
Blocks: mapsgl-performance
You need to log in
before you can comment on or make changes to this bug.
Description
•