Open
Bug 935315
Opened 11 years ago
Updated 2 years ago
Import WebGL API directly into Wasm without needing to route via handwritten thunks in JavaScript
Categories
(Core :: Graphics: CanvasWebGL, enhancement, P5)
Tracking
()
NEW
People
(Reporter: dougc, Unassigned)
References
(Blocks 1 open bug)
Details
(Whiteboard: webgl-next)
Attachments
(1 file)
337.94 KB, image/png
* The asm.js subset does not handle JS objects, and the current WebGL spec requires JS objects for many functions, both as arguments and as results. As a result, asm.js code has to call JS trampoline functions that map integer names to the respective JS objects. This is rather inefficient and also makes it impossible to call the WebGL functions directly from asm.js.
Perhaps it would be possible to extend WebGL to assign integer names to objects and to accept these where objects are currently accepted. New utility functions could be added to map the integer names to the JS objects.
It's not clear how best to handle the WebGL functions that return a JS object. Asm.js might still be able to inline function calls that immediately map the result to an integer name, but this would require an extension to the asm.js FFI and might not be practical. For example: createTexture().name
* Further, asm.js works efficiently with one heap, and it is currently necessary to copy between this heap and separate typed arrays when calling WebGL functions.
Perhaps the WebGL functions that accept a typed array could be extended to accept optional 'start' and 'end' arguments so that the asm.js heap could be passed directly. This would avoid the copying and support the direct calling of WebGL functions from asm.js.
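To make the first point concrete, here is a minimal sketch of the kind of JS-side trampoline described above: a table that assigns integer names to objects, plus wrappers that translate names to objects before calling into WebGL. `HandleTable`, `TextureApi` and `makeTrampolines` are hypothetical names used only for illustration, and `TextureApi` is a simplified stand-in for the two WebGL methods involved, not the real interface.

```typescript
// Hypothetical handle table: maps small integer "names" to arbitrary objects.
// In a real page T would be WebGLTexture, WebGLBuffer, and so on.
class HandleTable<T> {
  private objects: (T | null)[] = [null]; // name 0 is reserved as "no object"
  private freeNames: number[] = [];

  // Store an object and return its integer name.
  register(obj: T): number {
    const name = this.freeNames.length > 0 ? this.freeNames.pop()! : this.objects.length;
    this.objects[name] = obj;
    return name;
  }

  // Look up the object for a name; returns null for 0 or unknown names.
  lookup(name: number): T | null {
    return this.objects[name] ?? null;
  }

  // Drop the object and make the name reusable.
  release(name: number): void {
    if (name > 0 && name < this.objects.length && this.objects[name] !== null) {
      this.objects[name] = null;
      this.freeNames.push(name);
    }
  }
}

// Simplified stand-in for the WebGL methods used below (an assumption).
interface TextureApi<T> {
  createTexture(): T;
  bindTexture(target: number, texture: T | null): void;
}

// Trampolines that compiled code would import: integers in, integers out.
function makeTrampolines<T>(gl: TextureApi<T>) {
  const textures = new HandleTable<T>();
  return {
    glGenTexture: (): number => textures.register(gl.createTexture()),
    glBindTexture: (target: number, name: number): void =>
      gl.bindTexture(target, textures.lookup(name)),
  };
}
```

Every call from compiled code pays for one table lookup plus one extra JS call; that is the overhead this bug proposes to remove.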
Comment 1•11 years ago
Are we trying to solve a performance problem here, i.e. is it the case that the needed layer between asm.js code and code calling WebGL is a large performance burden? Do you have profiles showing that?
Comment 2•11 years ago
Alon, have you run into similar issues while working on Emscripten OpenGL->WebGL translation?
Flags: needinfo?(azakai)
Comment 3•11 years ago
I don't think this is worth investigating without having some benchmarks which indicate that these are actual pain points. In particular, GL is largely unbottlenecked by API call volume, so making the average call less expensive shouldn't have much impact on performance.
It should not be necessary to copy to a separate typed array for calling WebGL functions. Please note which API entrypoints require this. buffer(Sub)Data and tex(Sub)Image2D both accept ArrayBufferViews, which can be made from slices of existing ArrayBuffers at the expense of the garbage created by constructing these temporary views. With GGC, I don't think there will be any major GC pressure to worry about.
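As a small illustration of the temporary-view approach described in this comment, the sketch below creates a view over a sub-range of a single heap without copying any bytes. The `heapView` helper and the `HEAPU8` name in the usage note are assumptions made for the example (the latter follows Emscripten's naming convention); only `Uint8Array.prototype.subarray` is standard API.

```typescript
// Create a temporary view over bytes [ptr, ptr + len) of the heap.
// subarray() shares the underlying ArrayBuffer, so no bytes are copied;
// the only cost is the short-lived view object itself.
function heapView(heap: Uint8Array, ptr: number, len: number): Uint8Array {
  return heap.subarray(ptr, ptr + len);
}
```

With a real context this would be used along the lines of `gl.bufferData(gl.ARRAY_BUFFER, heapView(HEAPU8, ptr, len), gl.STATIC_DRAW)`, leaving the view as garbage after the call.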
Comment 4•11 years ago
We do need to do an FFI call from asm to do WebGL. However, in a well-optimized codebase there should not be too many such calls, I am hoping. Aside from the FFI, we do need to do lookups to e.g. match gl ids to WebGL objects, then we pass the objects to WebGL. That adds some overhead, but again, I hope in a good codebase it isn't terrible. I don't have benchmarks showing either way though.
My thinking actually is that we can eliminate most of this overhead by proxying WebGL calls from a codebase running in a worker. So glDrawArrays would send a message using integer IDs and so forth, and the main thread would do the actual lookup and then WebGL call. If postMessage were very optimized, it seems like this would minimize the amount of work done in the compiled codebase.
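A rough sketch of the proxying idea, under stated assumptions: the worker records calls as arrays of integers, and the main thread replays them, doing the ID-to-object lookup there. The opcode values, `CommandRecorder`, `DrawApi` and `replay` are all hypothetical, and the transport is abstracted as a `send` callback that would wrap `postMessage` in practice.

```typescript
// Hypothetical wire format: [opcode, ...integer arguments].
const OP_BIND_TEXTURE = 1;
const OP_DRAW_ARRAYS = 2;

type Command = number[];

// Worker side: record calls as plain integers; no WebGL objects are touched.
class CommandRecorder {
  private pending: Command[] = [];

  bindTexture(target: number, textureId: number): void {
    this.pending.push([OP_BIND_TEXTURE, target, textureId]);
  }

  drawArrays(mode: number, first: number, count: number): void {
    this.pending.push([OP_DRAW_ARRAYS, mode, first, count]);
  }

  // Hand the batch to a sender (e.g. a function wrapping postMessage) and reset.
  flush(send: (batch: Command[]) => void): void {
    const batch = this.pending;
    this.pending = [];
    send(batch);
  }
}

// Simplified stand-in for the WebGL methods used on the main thread.
interface DrawApi<T> {
  bindTexture(target: number, texture: T | null): void;
  drawArrays(mode: number, first: number, count: number): void;
}

// Main-thread side: resolve integer IDs to objects, then make the real calls.
function replay<T>(gl: DrawApi<T>, textures: (T | null)[], batch: Command[]): void {
  for (const cmd of batch) {
    switch (cmd[0]) {
      case OP_BIND_TEXTURE:
        gl.bindTexture(cmd[1], textures[cmd[2]] ?? null);
        break;
      case OP_DRAW_ARRAYS:
        gl.drawArrays(cmd[1], cmd[2], cmd[3]);
        break;
      default:
        throw new Error("unknown opcode " + cmd[0]);
    }
  }
}
```

Batching per frame rather than per call is a design choice of this sketch; it keeps the number of messages low, at the cost of the worker not seeing return values synchronously.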
Flags: needinfo?(azakai)
Updated•11 years ago
Whiteboard: webgl-next
Comment 5•11 years ago (Reporter)
(In reply to Jeff Gilbert [:jgilbert] from comment #3)
> I don't think this is worth investigating without having some benchmarks
> which indicate that these are actual pain points. In particular, GL is
> largely unbottlenecked by API call volume, so making the average call less
> expensive shouldn't have much impact on performance.
I saw some hints that these could be hot when profiling wmw. The asm.js call overhead accounted for a few % of the cycles, and perhaps the calls to the JS WebGL trampoline functions account for some of this. There were also hints that the GC activity was excessive. I agree that some targeted profiling would be useful.
> It should not be necessary to copy to a separate typed array for calling
> WebGL functions. Please note which API entrypoints require this.
> buffer(Sub)Data and tex(Sub)Image2D both accept ArrayBufferViews, which can
> be made from slices of existing ArrayBuffers at the expense of the garbage
> created by constructing these temporary views. With GGC, I don't think there
> will be any major GC pressure to worry about.
Good points. It would probably be possible for the asm.js syntax to accept calls to WebGL functions that require an ArrayBufferView if the argument creates the view - it could optimize away the creation of the view and directly call an internal function with the buffer and offset.
Further, it would probably be possible for the asm.js syntax to accept calls to WebGL functions that return JS objects so long as the result is immediately mapped back to an integer name - the creation of the result object could be optimized away.
This leaves extending WebGL to:
1. Assign an integer 'name' element to objects created by WebGL functions.
2. Accept this integer name in place of the object in WebGL function arguments.
3. Add functions to map these integer names back to their associated JS objects (which would ideally be allocated only if requested).
The Emscripten GL library already does this in trampolines between asm.js and the JS WebGL functions.
These same optimizations could also help non-asm.js JS code reduce consing and optimize calls to WebGL functions.
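The three steps above could look roughly like the following from the caller's side. This is a sketch of the proposed behaviour only, not of any existing WebGL API: `NamedObjectRegistry`, `NamedObject` and all method names are hypothetical. The point it illustrates is step 3, where the JS-visible object is allocated only when somebody asks for it.

```typescript
// Hypothetical wrapper type standing in for a JS-visible WebGL object.
interface NamedObject {
  readonly name: number;
}

class NamedObjectRegistry {
  private nextName = 1;
  private live: boolean[] = []; // live[name] is true while the name is valid
  private wrappers: (NamedObject | undefined)[] = []; // filled in lazily
  wrappersAllocated = 0; // counter kept only to make the laziness observable

  // Step 1: creation hands back only an integer name.
  create(): number {
    const name = this.nextName++;
    this.live[name] = true;
    return name;
  }

  // Step 2: calls take the name directly; here we only validate it.
  isValid(name: number): boolean {
    return this.live[name] === true;
  }

  // Step 3: map a name back to its object, allocating it on first request.
  objectFor(name: number): NamedObject | null {
    if (!this.isValid(name)) return null;
    let wrapper = this.wrappers[name];
    if (wrapper === undefined) {
      wrapper = { name };
      this.wrappers[name] = wrapper;
      this.wrappersAllocated++;
    }
    return wrapper;
  }
}
```

Code that only ever passes names around never triggers a wrapper allocation, which is where the reduced consing would come from.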
Comment 6•11 years ago
I don't think moving the int->class trampoline into C++ will help extraordinarily with performance, since we have to convert int->class at some point. There's a bunch of information attached to the CPP classes that we need to process calls; we certainly can't just pipe WebGL int values through to GL directly. (also WebGL objects could be backed by between zero and two GL objects, at times)
I would want to see some damning evidence of performance issues with doing this step in JS before I would suggest trying to modify the spec.
I would instead suggest putting together a performant JS library to provide the functionality you want.
I do quite like the idea of trying to optimize clearly-transient ArrayBufferViews. I think we should spin that off into another bug and see what the JS guys think. I filed bug 936168 for this.
Comment 7•8 years ago
In the new WebGL 2 spec, this was optimized by some amount by creating new entry points that don't need to allocate any temporary garbage. This has been observed to effectively reduce stuttering, as well as giving a nice 5% or so performance win on WebGL heavy applications.
Now, when profiling an upcoming WebGL-heavy demo running with Wasm and WebGL 2, I see that the JS -> C/C++ input validation generated by the WebIDL bindings generator is the third-highest CPU bottleneck once ANGLE and WebGL API validation are "optimized" out ("optimized" here meaning hackishly disabled for profiling purposes).
Running with AMD CodeXL, we see the following locations as the biggest hotspots for the application on the native C/C++ side:
http://clb.demon.fi/dump/profiles/webgl_dom_validation_after_webgl_hacks.png
The estimation is that the JS -> C/C++ WebIDL validation (highlighted in red) is slowing down the application by 11.6% in the tested demo.
This profile strongly encourages pursuing this type of optimization. Luke has already been roadmapping this for the future, so "importing" the WebGL 2 API directly into Wasm would be a relatively big performance gain.
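For reference, the garbage-free WebGL 2 entry points mentioned at the start of this comment take a source offset and length, so the whole heap can be passed as-is. The sketch below models this on the WebGL 2 overload `bufferSubData(target, dstByteOffset, srcData, srcOffset, length)`; `BufferSubDataApi` is a simplified stand-in interface and `uploadFromHeap` is a hypothetical helper, both assumptions made for the example.

```typescript
// Minimal structural stand-in for the one WebGL 2 method used below.
interface BufferSubDataApi {
  bufferSubData(
    target: number,
    dstByteOffset: number,
    srcData: Uint8Array,
    srcOffset: number, // in elements of srcData (bytes for a Uint8Array)
    length: number     // in elements of srcData
  ): void;
}

const ARRAY_BUFFER = 0x8892; // numeric value of gl.ARRAY_BUFFER

// Upload heap[ptr .. ptr + len) to the start of the bound buffer without
// constructing a temporary view: the heap itself is the srcData argument.
function uploadFromHeap(gl: BufferSubDataApi, heap: Uint8Array, ptr: number, len: number): void {
  gl.bufferSubData(ARRAY_BUFFER, 0, heap, ptr, len);
}
```

Compared with creating a `subarray` per call, nothing is allocated here, which matches the reduced stuttering described above.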
Comment 8•8 years ago
Err, adding the above screenshot as attachment in case the hosting is not available in the future.
Updated•8 years ago
Summary: Spec. WebGL compatible with asm.js → Import WebGL API directly into Wasm without needing to route via handwritten thunks in JavaScript
Comment 9•8 years ago
Thanks for that data. 11.6% is quite significant and a good motivation. Do you have any estimate of whether this is the common case, and of what the realistic (i.e., not synthetic/pathological) worst case is?
Comment 10•8 years ago
This is a common case for WebGL heavy applications, and I think given that this is one of the largest demos so far, also probably the worst case we are seeing in the real world.
Updated•8 years ago
Blocks: webgl-perf-parity
Updated•6 years ago
Type: defect → enhancement
Priority: -- → P5
Updated•2 years ago
Severity: normal → S3