Use 1:1 texture sampling with swgl when possible
Categories
(Core :: Graphics: WebRender, enhancement, P3)
Tracking
()
People
(Reporter: jrmuizel, Assigned: lsalzman)
References
(Blocks 3 open bugs)
Details
(Keywords: perf-alert)
Attachments
(1 file)
When we're using the brush opacity shader, we're generally sampling from the source 1:1. Special-casing this somehow will let us go quite a bit faster.
Assignee | ||
Comment 1•4 years ago
Having a variant of the shader that uses texelFetch could be one way to achieve this, since it would bypass linear filtering...
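To illustrate why a texelFetch-style read can stand in for linear filtering in the 1:1 case, here is a minimal sketch in Rust. The `Texture` type and both methods are hypothetical stand-ins written for this note, not WebRender or SWGL code. It assumes texel centers sit at half-integer coordinates, so a sample taken exactly at a center has zero interpolation weight on its neighbors.

```rust
/// A single-channel, row-major texture; a hypothetical stand-in for illustration.
struct Texture {
    width: usize,
    height: usize,
    texels: Vec<f32>,
}

impl Texture {
    /// Analogue of GLSL `texelFetch`: read one texel by integer coordinate.
    fn texel_fetch(&self, x: usize, y: usize) -> f32 {
        self.texels[y * self.width + x]
    }

    /// Bilinear sample at a position given in texel units, clamping at the edges.
    /// Texel (x, y) has its center at (x + 0.5, y + 0.5).
    fn sample_linear(&self, u: f32, v: f32) -> f32 {
        let fx = u - 0.5;
        let fy = v - 0.5;
        let x0f = fx.floor();
        let y0f = fy.floor();
        // Fractional weights toward the right / lower neighbor.
        let tx = fx - x0f;
        let ty = fy - y0f;
        let clamp = |c: f32, size: usize| -> usize {
            c.max(0.0).min((size - 1) as f32) as usize
        };
        let x0 = clamp(x0f, self.width);
        let x1 = clamp(x0f + 1.0, self.width);
        let y0 = clamp(y0f, self.height);
        let y1 = clamp(y0f + 1.0, self.height);
        let top = self.texel_fetch(x0, y0) * (1.0 - tx) + self.texel_fetch(x1, y0) * tx;
        let bottom = self.texel_fetch(x0, y1) * (1.0 - tx) + self.texel_fetch(x1, y1) * tx;
        top * (1.0 - ty) + bottom * ty
    }
}
```

When every destination pixel maps to a texel center, `sample_linear` does four reads and three blends only to return the value that `texel_fetch` gives with one read, which is the cost a nearest variant would avoid.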
Reporter | ||
Comment 2•4 years ago
Glenn, how easy is it to know in WebRender when we're doing 1:1 texture sampling?
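For readers unfamiliar with what such a check involves, the sketch below shows one conceivable test for 1:1 sampling. It is not WebRender's actual logic: `Rect`, `Transform2D`, `is_one_to_one`, and the epsilon value are all assumptions made for illustration. The idea is that sampling is 1:1 when the transform is an axis-aligned unit scale, the destination and source rectangles have the same size, and both origins land on whole pixels.

```rust
/// Hypothetical axis-aligned rectangle (origin plus size), in pixels or texels.
#[derive(Clone, Copy, Debug, PartialEq)]
struct Rect {
    x: f32,
    y: f32,
    w: f32,
    h: f32,
}

/// Hypothetical 2D affine transform:
/// device = (a * x + c * y + tx, b * x + d * y + ty).
struct Transform2D {
    a: f32,
    b: f32,
    c: f32,
    d: f32,
    tx: f32,
    ty: f32,
}

/// Returns true when drawing `uv` (texel space) into `dest` (local space)
/// under `xf` maps each texel onto exactly one device pixel.
fn is_one_to_one(dest: Rect, uv: Rect, xf: &Transform2D) -> bool {
    const EPS: f32 = 1e-4; // tolerance chosen arbitrarily for this sketch
    let near = |p: f32, q: f32| (p - q).abs() < EPS;
    let is_int = |v: f32| near(v, v.round());

    // No rotation, skew, or scaling.
    let unit_scale =
        near(xf.a, 1.0) && near(xf.d, 1.0) && near(xf.b, 0.0) && near(xf.c, 0.0);
    // Source and destination cover the same number of pixels.
    let same_size = near(dest.w, uv.w) && near(dest.h, uv.h);
    // Both origins are snapped to whole pixels / texels.
    let snapped = is_int(dest.x + xf.tx)
        && is_int(dest.y + xf.ty)
        && is_int(uv.x)
        && is_int(uv.y);

    unit_scale && same_size && snapped
}
```

A real implementation would also have to account for details this sketch ignores, such as clipping and device pixel scale.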
Reporter | ||
Comment 3•4 years ago
Another approach is a variant of glBlitFramebuffer that does blending.
Assignee | ||
Comment 4•4 years ago
It looks like https://www.khronos.org/registry/OpenGL/extensions/NV/NV_draw_texture.txt would be one way to piggyback on a quad-drawing extension that already supports blending inherently.
Or it would be worth considering just graduating most brush opacity cases into actual surfaces that would be handled by the render compositor...
Comment 5•4 years ago
Promoting them into compositor surfaces is an intriguing idea, because the vast majority of the code is already present.
I think all we'd need to do is (a) support opacity in compositor surfaces and (b) add a field to the compositor capabilities trait that expresses that a compositor wants / prefers to have images promoted whenever possible (perhaps with some kind of resolution limit / max promoted count or similar).
Both of these should be a small amount of code, and fit well into the existing infrastructure. If I understand correctly, that would mean for SWGL that those images get composited in parallel with the main tile rasterization code? It's also probably beneficial for DC (and perhaps CA) for large images too?
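As a rough sketch of points (a) and (b), the capability field and the limits could look something like the following. Every name here (`PromotionCaps`, `ImageCandidate`, `should_promote`, and each field) is hypothetical and chosen for illustration; none of it is taken from WebRender's actual compositor capabilities.

```rust
/// Hypothetical capability flags a compositor might report to the renderer.
#[derive(Clone, Copy, Debug)]
struct PromotionCaps {
    /// (b) The compositor wants images promoted to surfaces whenever possible.
    prefers_image_promotion: bool,
    /// (a) The compositor can apply opacity to a compositor surface.
    supports_surface_opacity: bool,
    /// Upper bound on promoted surfaces per frame.
    max_promoted_count: usize,
    /// Resolution limit, expressed as a pixel count.
    max_promoted_pixels: u64,
}

/// Hypothetical description of an image being considered for promotion.
struct ImageCandidate {
    width: u32,
    height: u32,
    opacity: f32,
}

/// Decide whether one image should become a compositor surface.
fn should_promote(caps: &PromotionCaps, img: &ImageCandidate, already_promoted: usize) -> bool {
    if !caps.prefers_image_promotion {
        return false;
    }
    if already_promoted >= caps.max_promoted_count {
        return false;
    }
    // A translucent image needs opacity support in the compositor.
    if img.opacity < 1.0 && !caps.supports_surface_opacity {
        return false;
    }
    (img.width as u64) * (img.height as u64) <= caps.max_promoted_pixels
}
```

Keeping the decision in one function driven by reported capabilities would let each compositor backend opt in with its own limits.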
Reporter | ||
Comment 6•4 years ago
One disadvantage of promoting is that memory usage goes up, because we need to keep all of the temporaries around until compositing instead of being able to reuse them during rendering. I don't know how much WebRender reuses and how much of a problem this would be in practice.
Assignee | ||
Comment 7•4 years ago
Mainly this implements a new set of SWGL intrinsics based around swgl_allowTextureNearest
and swgl_commitTextureNearest which can fairly easily provide a further fast-path above
and beyond swgl_commitTextureLinear. This requires the row be from an axis-aligned 1:1
draw so that we can do something not unlike a fast copy of the texture data straight
to the destination in cases where even the linear filter would be essentially doing
the same thing in a more expensive way. For now, only a few WR shaders that were already
using swgl_commitTextureLinear have been fast-pathed with the new intrinsics to see if
this provides significant performance benefit.
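The row-level idea can be sketched as follows. This is only an illustration in Rust: the real intrinsics live in SWGL and their signatures are not shown in this bug, so `row_allows_nearest` and `commit_row_nearest` are hypothetical names, and the packed `u32` pixel format is an assumption. A row qualifies when the source coordinate advances by exactly one texel per destination pixel and starts on a texel center. In that case the row can be copied directly instead of filtered.

```rust
/// Hypothetical eligibility check: `src_start` is the source coordinate (in
/// texel units) sampled by the first destination pixel, and `src_step` is how
/// far that coordinate advances per destination pixel.
fn row_allows_nearest(src_start: f32, src_step: f32) -> bool {
    // Exactly one texel per pixel, starting on a texel center (k + 0.5).
    src_step == 1.0 && src_start >= 0.5 && (src_start - 0.5).fract() == 0.0
}

/// Hypothetical fast path: copy a row of packed pixels straight from the
/// source row into the destination span. Returns false (leaving `dst`
/// untouched) when the row is not eligible, so the caller can fall back to
/// the linear-filtering path.
fn commit_row_nearest(dst: &mut [u32], src_row: &[u32], src_start: f32, src_step: f32) -> bool {
    if !row_allows_nearest(src_start, src_step) {
        return false;
    }
    let first = (src_start - 0.5) as usize;
    // Refuse rows that would read past the end of the source.
    if first + dst.len() > src_row.len() {
        return false;
    }
    dst.copy_from_slice(&src_row[first..first + dst.len()]);
    true
}
```

This sketch covers only the plain copy. Applying opacity or blending would need per-pixel work on top of it, though still without the four-tap filter.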
Comment 9•4 years ago
bugherder
Comment 10•4 years ago
== Change summary for alert #28213 (as of Wed, 23 Dec 2020 09:38:03 GMT) ==
Improvements:
Ratio | Suite | Test | Platform | Options | Absolute values (old vs new)
---|---|---|---|---|---
33% | glterrain | | macosx1014-64-shippable-qr | e10s stylo webrender-sw | 8.70 -> 5.86
31% | rasterflood_gradient | | macosx1014-64-shippable-qr | e10s stylo webrender-sw | 179.83 -> 235.92
31% | glterrain | | windows10-64-shippable-qr | e10s stylo webrender-sw | 3.28 -> 2.26
27% | rasterflood_gradient | | linux64-shippable-qr | e10s stylo webrender-sw | 216.00 -> 274.92
27% | rasterflood_gradient | | linux64-shippable-qr | e10s stylo webrender-sw | 216.58 -> 274.83
27% | rasterflood_gradient | | windows10-64-shippable-qr | e10s stylo webrender-sw | 203.83 -> 258.17
12% | tsvgx | | macosx1014-64-shippable-qr | e10s stylo webrender-sw | 371.28 -> 327.58
10% | tsvgx | | windows10-64-shippable-qr | e10s stylo webrender-sw | 274.42 -> 247.58
For up to date results, see: https://treeherder.mozilla.org/perfherder/alerts?id=28213