Closed Bug 1669841 (sw-wr-perf-nearest) Opened 4 years ago Closed 3 years ago

Use 1:1 texture sampling with swgl when possible

Categories

(Core :: Graphics: WebRender, enhancement, P3)

Tracking

RESOLVED FIXED
86 Branch
Tracking Status
firefox83 --- disabled
firefox86 --- fixed

People

(Reporter: jrmuizel, Assigned: lsalzman)

References

(Blocks 3 open bugs)

Details

(Keywords: perf-alert)

Attachments

(1 file)

When we're using the brush opacity shader, we're generally sampling from the source 1:1. Special-casing this somehow will let us go quite a bit faster.

Blocks: 1669520
Severity: -- → N/A
Priority: -- → P3
Blocks: 1623093
Blocks: sw-wr-perf

Having a variation of the shader that used texelFetch could be one way to achieve this, since we would bypass linear filtering...
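To illustrate why bypassing linear filtering is lossless in the 1:1 case, here is a minimal sketch, not WebRender or SWGL code. The `Texture` type and its methods are hypothetical stand-ins: `texel_fetch` models GLSL `texelFetch`, and `sample_linear` models a clamp-to-edge bilinear filter. When the sample point sits exactly on a texel center, both interpolation weights are zero and the filter returns the same value as the direct fetch.

```rust
/// Hypothetical single-channel texture, stored row-major.
struct Texture {
    width: usize,
    height: usize,
    texels: Vec<f32>,
}

impl Texture {
    /// Models GLSL `texelFetch`: read one texel by integer coordinate, no filtering.
    fn texel_fetch(&self, x: usize, y: usize) -> f32 {
        self.texels[y * self.width + x]
    }

    /// Clamp-to-edge read used by the bilinear filter below.
    fn fetch_clamped(&self, x: i64, y: i64) -> f32 {
        let cx = x.clamp(0, self.width as i64 - 1) as usize;
        let cy = y.clamp(0, self.height as i64 - 1) as usize;
        self.texel_fetch(cx, cy)
    }

    /// Bilinear sample at (u, v) in texel units, where texel (x, y)
    /// has its center at (x + 0.5, y + 0.5).
    fn sample_linear(&self, u: f32, v: f32) -> f32 {
        let fx = u - 0.5;
        let fy = v - 0.5;
        let x0 = fx.floor();
        let y0 = fy.floor();
        // Fractional weights; both are 0.0 when (u, v) is a texel center.
        let tx = fx - x0;
        let ty = fy - y0;
        let (x0, y0) = (x0 as i64, y0 as i64);
        let top = self.fetch_clamped(x0, y0) * (1.0 - tx)
            + self.fetch_clamped(x0 + 1, y0) * tx;
        let bottom = self.fetch_clamped(x0, y0 + 1) * (1.0 - tx)
            + self.fetch_clamped(x0 + 1, y0 + 1) * tx;
        top * (1.0 - ty) + bottom * ty
    }
}
```

In this model the linear path reads four texels and does three interpolations per pixel to produce a value that a single fetch would have given, which is the cost a texelFetch-style variant would avoid.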

Glenn, how easy is it to know in WebRender when we're doing 1:1 texture sampling?

Flags: needinfo?(gwatson)

Another alternative approach is a variant of glBlitFramebuffer that does blending.

It looks like https://www.khronos.org/registry/OpenGL/extensions/NV/NV_draw_texture.txt would be one way to piggyback on a quad-drawing extension that already supports blending inherently.

Or it would be worth considering just graduating most brush opacity cases into actual surfaces that would be handled by the render compositor...

Promoting them into compositor surfaces is an intriguing idea, because the vast majority of the code is already present.

I think all we'd need to do is (a) support opacity in compositor surfaces and (b) add a field to the compositor capabilities trait that expresses that a compositor wants / prefers to have images promoted whenever possible (perhaps with some kind of resolution limit / max promoted count or similar).

Both of these should be a small amount of code, and fit well into the existing infrastructure. If I understand correctly, that would mean for SWGL that those images get composited in parallel with the main tile rasterization code? It's also probably beneficial for DC (and perhaps CA) for large images too?
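A rough sketch of what points (a) and (b) could look like follows. Every name here (`PromotionCaps`, `ImageCandidate`, `should_promote`, and all fields) is hypothetical and chosen for illustration; none of it is WebRender's actual compositor interface.

```rust
/// Hypothetical capability flags a compositor could report.
/// Not WebRender's actual capabilities type.
#[derive(Clone, Copy, Debug)]
struct PromotionCaps {
    /// (b) The compositor prefers images to be promoted whenever possible.
    prefers_promotion: bool,
    /// (a) The compositor can apply opacity to a compositor surface.
    supports_opacity: bool,
    /// Resolution limit: largest image area, in pixels, worth promoting.
    max_promoted_pixels: u64,
    /// Upper bound on the number of promoted surfaces per frame.
    max_promoted_count: usize,
}

/// Hypothetical description of an image primitive being considered.
#[derive(Clone, Copy, Debug)]
struct ImageCandidate {
    width: u32,
    height: u32,
    opacity: f32,
}

/// Decide whether an image should become a compositor surface.
fn should_promote(
    caps: &PromotionCaps,
    image: &ImageCandidate,
    promoted_so_far: usize,
) -> bool {
    if !caps.prefers_promotion {
        return false;
    }
    // Translucent images need opacity support in the compositor.
    if image.opacity < 1.0 && !caps.supports_opacity {
        return false;
    }
    if promoted_so_far >= caps.max_promoted_count {
        return false;
    }
    let pixels = u64::from(image.width) * u64::from(image.height);
    pixels <= caps.max_promoted_pixels
}
```

Keeping the limits in the capabilities value, rather than hard-coding them, would let each compositor backend pick thresholds that suit it.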

Flags: needinfo?(gwatson)

One disadvantage of promoting is that memory usage goes up, because we need to keep all of the temporaries around until compositing instead of being able to reuse them during rendering. I don't know how much WebRender reuses and how much of a problem this would be in practice.

Blocks: 1678800
Blocks: 1681778
Blocks: 1678779
Blocks: 1676253
Blocks: 1675621
Blocks: 1674478
Blocks: 1674015
Blocks: 1655530
Alias: sw-wr-perf-nearest

Mainly this implements a new set of SWGL intrinsics based around swgl_allowTextureNearest
and swgl_commitTextureNearest, which can fairly easily provide a further fast-path above
and beyond swgl_commitTextureLinear. This requires that the row come from an axis-aligned 1:1
draw, so that we can do something not unlike a fast copy of the texture data straight
to the destination in cases where even the linear filter would be essentially doing
the same thing in a more expensive way. For now, only a few WR shaders that were already
using swgl_commitTextureLinear have been fast-pathed with the new intrinsics to see if
this provides a significant performance benefit.
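
The idea described above can be modeled roughly as follows. This is a sketch in Rust under stated assumptions, not the SWGL implementation (which is C++): `RowSpan`, `allows_nearest`, and `commit_row_nearest` are hypothetical names, and the exact eligibility conditions in the patch may differ. The check accepts a row only when it advances one texel per destination pixel along a single axis and the pixel centers land on texel centers; the commit then becomes a plain slice copy.

```rust
/// Hypothetical description of one destination row of a textured draw.
struct RowSpan {
    /// Texture position (in texel units) sampled at the first pixel's center.
    start_u: f32,
    start_v: f32,
    /// Change in texture position per destination pixel along the row.
    step_u: f32,
    step_v: f32,
    /// Number of destination pixels in the row.
    len: usize,
}

/// Eligibility check: axis-aligned, exactly one texel per pixel,
/// and sample points sitting exactly on texel centers.
fn allows_nearest(span: &RowSpan) -> bool {
    let on_center = |c: f32| (c - 0.5).fract() == 0.0;
    span.step_u == 1.0
        && span.step_v == 0.0
        && on_center(span.start_u)
        && on_center(span.start_v)
}

/// Copy the row straight from `src` (row-major, `src_width` texels wide)
/// into `dst`. Returns false without touching `dst` when the row is not
/// eligible, so the caller can fall back to the linear-filter path.
fn commit_row_nearest(
    dst: &mut [u32],
    src: &[u32],
    src_width: usize,
    span: &RowSpan,
) -> bool {
    if src_width == 0 || !allows_nearest(span) {
        return false;
    }
    if span.start_u < 0.0 || span.start_v < 0.0 {
        return false;
    }
    let x = span.start_u as usize; // e.g. 1.5 truncates to texel 1
    let y = span.start_v as usize;
    let src_height = src.len() / src_width;
    if x + span.len > src_width || y >= src_height || span.len > dst.len() {
        return false;
    }
    let start = y * src_width + x;
    dst[..span.len].copy_from_slice(&src[start..start + span.len]);
    true
}
```

A shader using this model would attempt the nearest path first and keep the linear commit as the fallback for scaled, rotated, or subpixel-offset draws.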

Assignee: nobody → lsalzman
Status: NEW → ASSIGNED
Pushed by lsalzman@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/8b46e29f0c12
provide 1:1 rendering fast-paths for some SWGL shaders. r=jrmuizel
Status: ASSIGNED → RESOLVED
Closed: 3 years ago
Resolution: --- → FIXED
Target Milestone: --- → 86 Branch

== Change summary for alert #28213 (as of Wed, 23 Dec 2020 09:38:03 GMT) ==

Improvements:

Ratio  Suite                 Test  Platform                    Options                  Absolute values (old vs new)
33%    glterrain                   macosx1014-64-shippable-qr  e10s stylo webrender-sw  8.70 -> 5.86
31%    rasterflood_gradient        macosx1014-64-shippable-qr  e10s stylo webrender-sw  179.83 -> 235.92
31%    glterrain                   windows10-64-shippable-qr   e10s stylo webrender-sw  3.28 -> 2.26
27%    rasterflood_gradient        linux64-shippable-qr        e10s stylo webrender-sw  216.00 -> 274.92
27%    rasterflood_gradient        linux64-shippable-qr        e10s stylo webrender-sw  216.58 -> 274.83
27%    rasterflood_gradient        windows10-64-shippable-qr   e10s stylo webrender-sw  203.83 -> 258.17
12%    tsvgx                       macosx1014-64-shippable-qr  e10s stylo webrender-sw  371.28 -> 327.58
10%    tsvgx                       windows10-64-shippable-qr   e10s stylo webrender-sw  274.42 -> 247.58
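
The Ratio column is consistent with the absolute change divided by the old value, rounded to a whole percent. The sketch below only reproduces that arithmetic from the figures listed; `percent_change` is a hypothetical helper, and whether Perfherder computes the ratio exactly this way is an assumption.

```rust
/// Relative change between two measurements, as a percentage of the old value.
/// The sign is dropped, since the table lists the direction separately
/// (every row above appears under "Improvements").
fn percent_change(old: f64, new: f64) -> f64 {
    (new - old).abs() / old * 100.0
}
```

For example, the first glterrain row gives |5.86 - 8.70| / 8.70, which is about 32.6% and rounds to the 33% shown.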

For up to date results, see: https://treeherder.mozilla.org/perfherder/alerts?id=28213

Keywords: perf-alert
Blocks: 1681747
No longer blocks: sw-wr-perf-pbo