Closed Bug 1587428 Opened 5 years ago Closed 5 years ago

Several redundant full-size tile composite passes

Categories

(Core :: Graphics: WebRender, defect, P3)

RESOLVED DUPLICATE of bug 1591526

People

(Reporter: nical, Unassigned)

References

(Blocks 1 open bug)

Details

Attachments

(1 file)

I'm not sure how to interpret what I'm seeing in RenderDoc, but I get a first composite pass that renders the content picture tiles, followed by two z-rejected full-size draw calls over the content area with tile-sized quads at different vertical offsets, and then a final draw call for the tabs.

The attached image shows the three passes (the wireframe is drawn in yellow, which is a bit hard to see).

Previously noted in https://bugzilla.mozilla.org/show_bug.cgi?id=1584794#c17 (see point 5). Might be related to how we have multiple picture caching slices now (for fixed-position stuff versus the scrolled stuff).

Yes, these are (currently) expected, from the background tiles in the content and the UI.

They should be getting z-rejected (appears they are from your comment above), so shouldn't be a huge cost, even though they are not ideal.

My plan is to occlude these tiles on the CPU - I need to do a small amount of refactoring to allow this. The refactoring will allow us to not only skip compositing those tiles, but also know early enough to skip allocating a texture and rasterizing those occluded tiles.
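
A minimal sketch of what CPU-side tile occlusion could look like, assuming tiles are visited front to back and that a tile is treated as hidden only when a single opaque tile in front of it fully covers it (a conservative test). The `Rect`, `Tile`, and `visible_tiles` names are hypothetical and invented for illustration; this is not WebRender's actual implementation or API.

```rust
/// Axis-aligned rectangle in device pixels (hypothetical type for this sketch).
#[derive(Clone, Copy, Debug, PartialEq)]
struct Rect {
    x0: i32,
    y0: i32,
    x1: i32,
    y1: i32,
}

impl Rect {
    /// True if `other` lies entirely inside `self`.
    fn contains_rect(&self, other: &Rect) -> bool {
        self.x0 <= other.x0
            && self.y0 <= other.y0
            && self.x1 >= other.x1
            && self.y1 >= other.y1
    }
}

/// A tile to composite (hypothetical; not WebRender's real tile type).
#[derive(Clone, Copy, Debug)]
struct Tile {
    rect: Rect,
    is_opaque: bool,
}

/// Takes tiles sorted front to back and returns the indices of the tiles
/// that still need to be composited. A tile is dropped only when one
/// opaque tile in front of it covers it completely, so the test never
/// discards a tile that is partly visible.
fn visible_tiles(tiles_front_to_back: &[Tile]) -> Vec<usize> {
    let mut occluders: Vec<Rect> = Vec::new();
    let mut visible = Vec::new();

    for (index, tile) in tiles_front_to_back.iter().enumerate() {
        let hidden = occluders.iter().any(|o| o.contains_rect(&tile.rect));
        if hidden {
            // Occluded: no composite needed, and in principle no texture
            // allocation or rasterization either.
            continue;
        }
        visible.push(index);
        // Only opaque tiles can hide what is behind them.
        if tile.is_opaque {
            occluders.push(tile.rect);
        }
    }
    visible
}
```

With an opaque content tile in front of a background tile of the same size, only the content tile's index is returned. If the front tile is translucent, both indices are returned.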

Assignee: nobody → gwatson
Depends on: 1587676
Summary: Three two redundant full-size tile composite passes → Several redundant full-size tile composite passes

I did some profiling of this on my (reasonably old) Intel HD4600 integrated GPU on a 4k screen in RenderDoc.

I did a WR capture, and then replayed this in wrench with the --no-batch option, and got draw call timings in RenderDoc for each of the tile blits.

The tile blits for the content tiles at the front of the screen are ~0.45 - ~0.5ms per tile. The timings for the occluded tiles are typically ~0.002 - ~0.02 ms per tile.

All up, the reported GPU time for the blits of the occluded tiles on that GPU was 0.21 ms / frame, which is ~1.3% of the GPU frame budget time.
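
The percentage above can be reproduced with a small calculation, assuming a 60 Hz display (a frame budget of about 16.67 ms). The refresh rate is an assumption that is consistent with the ~1.3% figure, and `budget_percent` is a hypothetical helper name.

```rust
/// Percentage of the per-frame GPU budget consumed by `gpu_ms`
/// milliseconds of work on a display refreshing at `refresh_hz`.
fn budget_percent(gpu_ms: f64, refresh_hz: f64) -> f64 {
    // Frame budget in milliseconds, e.g. 1000 / 60 ≈ 16.67 ms.
    let frame_budget_ms = 1000.0 / refresh_hz;
    gpu_ms / frame_budget_ms * 100.0
}
```

Calling `budget_percent(0.21, 60.0)` gives roughly 1.26, which rounds to the ~1.3% quoted above.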

I suspect that this is due to the use of hi-z allowing the tiles to be rejected quickly and with minimal memory bandwidth, but we should perhaps try to verify this with GPA which can probably get the Intel hardware counters.

Given this, I think we can probably make this a low priority for now (there are much bigger GPU wins to be had elsewhere), unless we can find a system where the profiler reports significant GPU time in these passes. Does that sound reasonable?

Flags: needinfo?(nical.bugzilla)
Flags: needinfo?(dmalyshau)

Yes, this is totally low priority. Note that Hi-Z may be subject to HW/driver differences, so your footprint on Android, for example, may be different.

Flags: needinfo?(dmalyshau)
Assignee: gwatson → nobody

It makes the picture caching debugging view very slow and hard to read, though. It would be good to address that at least.

Flags: needinfo?(nical.bugzilla)

In various discussions, it became clear that having the tile occlusion would provide several other benefits too (see https://bugzilla.mozilla.org/show_bug.cgi?id=1591526#c1 for more information).

Given those reasons, I'm planning to make this a higher priority and hopefully get it done in the next week or so.

Status: NEW → RESOLVED
Closed: 5 years ago
Resolution: --- → DUPLICATE