Closed Bug 1623792 Opened 4 years ago Closed 3 years ago

Optimize primitive dependency handling, taking advantage of spatial information from tile assignment.


(Core :: Graphics: WebRender, task)

Not set





(Reporter: gw, Assigned: gw)




(13 files, 1 obsolete file)

47 bytes, text/x-phabricator-request
Details | Review
47 bytes, text/x-phabricator-request
Details | Review
47 bytes, text/x-phabricator-request
Details | Review
47 bytes, text/x-phabricator-request
Details | Review
47 bytes, text/x-phabricator-request
Details | Review
47 bytes, text/x-phabricator-request
Details | Review
47 bytes, text/x-phabricator-request
Details | Review
47 bytes, text/x-phabricator-request
Details | Review
47 bytes, text/x-phabricator-request
Details | Review
47 bytes, text/x-phabricator-request
Details | Review
47 bytes, text/x-phabricator-request
Details | Review
47 bytes, text/x-phabricator-request
Details | Review
47 bytes, text/x-phabricator-request
Details | Review
No description provided.
Assignee: nobody → gwatson
Keywords: leave-open

Remove associated enum and code path that are no longer used since
the removal of Push/PopClipChain display items.

Pushed by
Remove unused `add_prim_to_start` code path. r=nical

There is no longer a benefit to handling local clip rect as a
special case. Instead, if the local clip rect has any effect on
the primitive, add it to the primitive's clip chain.

This removes some special case code paths, and will simplify
future work as part of this bug, since we're no longer storing
clipping related state in the primitive instance.

This also removes the old special case handling of local clip rects
on primitives, which results in several new PASS reftest results.

This is a partial step towards a larger change. The goal of this and the
follow up patches is to move the tile cache instances to be stored in
the render backend, rather than inside the picture / primitive tree.

This will allow better caching of dependency and visibility state
across both frame and scene builds for primitives. This has the potential
to significantly reduce or eliminate the amount of work we do per-frame
to track per-primitive visibility, clip-chain state and tile assignments.

A longer term goal is to allow correlating up-to-date tile caches with
pipeline display lists that haven't changed. This would allow WR to
skip scene building for content display lists that haven't changed, if
only the outer pipeline content has changed.

Pushed by
Store tile cache instances separately from picture primitives. r=Bert,nical

Previously, tile cache instances were destroyed and recreated
each time a new scene was created, as they were embedded inside
the picture primitives. An elaborate but complicated system was
used to retain important state (such as native surfaces, primitive
dependencies) across new scenes.

This patch moves the tile cache instances to be stored inside the
render backend. It removes the previous code for retaining state
for each tile cache. Instead, tile caches are created / reused /
destroyed during new_async_scene_ready.

This removes quite a bit of complexity. More importantly, it is
another step towards being able to cache and retain state such
as primitive tile assignments and visibility state across both
new frames and scenes.

Pushed by
Pt 4 - Retain tile caches in render backend. r=nical,Bert

Previously, we would defer calculation of whether a filter was a
no-op until frame building. However, this complicates decisions
related to which clips apply to which surfaces.

Now, this decision is made during scene building. This makes things
more consistent during frame building, which we can take advantage
of in future. It also makes things like render target allocations
slightly more predictable.

Even when picture caching is disabled, we want to create the
wrapping picture elements for these slices, to simplify
some upcoming patches (we can assume that the picture cache
elements are always present).

The dynamic check at the start of FrameBuilder::build ensures
that these are pass-through pictures when picture caching is

In future, we'll also take advantage of this to track dirty
regions in the disabled case, to allow partial present to
work in all scenarios.

Pushed by
Pt 5 - Determine no-op filters during scene building. r=nical
Pt 6 - Always create picture caching slices. r=nical
Attachment #9156616 - Attachment is obsolete: true

There doesn't seem to be any need to have this per stacking context, as
the device pixel ratio is always the same during scene building (the
previous code just clones this object as each stacking context is

Pushed by
Pt 7 - Remove space snapper from stacking context stack. r=aosmond
Depends on: 1654442

If the picture isn't a pass-through, any clips attached to it are
handled when compositing that picture into its parent. Thus, we
don't need to push those clips onto the clip stack for child

This is probably not a noticeable performance win, but it's worth
doing to ensure it doesn't regress anything. A follow up commit will
remove the push_clip in this path completely, which will unblock
some other clip-chain optimization work.

Add a new flag for spatial nodes that identifies if the spatial
node is guaranteed to be identity (no transformation at all) for
the entirety of the scene (i.e. either a fixed identity reference
frame or a redundant scroll frame).

Although not used yet, this will be useful in future to help
remove clips during scene building that can't possibly affect
the content rendering no matter what occurs during frame building.

At the same time, port the other bool fields in spatial node into
a single bitflags.

Pushed by
Pt 8 - Add IS_IDENTITY to spatial node flags. r=jnicol
Pushed by
Only push pass-through picture clips onto clip stack. r=kvark

Much of the work that is currently done during frame building to
build clip-chain instances can actually be done during scene
building (most clips are in the same local space as the related

This patch adds ClipSet, and does the minimal amount of refactoring
to move clip_chain_id and local_clip_rect into this structure.

Future work will move much of the current clip chain node information
to be stored inline in ClipSet that doesn't need to be recalculated
each frame build. Eventually, this functionality can be stored as
part of the primitive template, which will avoid doing this work
during scene building unless the clips have changed.

Pushed by
Add ClipSet to PrimitiveInstance. r=kvark,nical

Although real world content generally has a small number of picture
cache slices (typically < 8), it's possible to create contrived
cases that create large numbers of picture cache slices. In these
cases, we want to ensure that we don't create too many slices, to
avoid allocating too much GPU memory for cached surface tiles.

Previously, at the end of scene building, WR would check if the
slice count exceeded the limit, and then merge all of those into
a single slice if required.

Instead of that, WR now retains the first MAX-1 slices, and just
creates the last slice to be a container for all subsequent prims,
if we hit that limit.

This means we can rely on the slice a prim is assigned never
changing. From this, we will be able to eliminate the shared
compositor clips once during scene building, rather than needing
to check these during every frame build.

Pushed by
Pt 9 - Refactor how exceeding max cache slices works. r=kvark

Now that picture cache slices are determined during add_prim, we
no longer need to create as many clusters to separate them based
on the prim flags.

Pushed by
Pt 10 - Remove an unused field and method. r=nical
Pushed by
Pt 11 - Remove unnecessary cluster flags. r=nical

Previously, there was typically a large difference between the
number of primitives in the display list and the number of prims
that were considered visible. However, the Gecko display port
ensures that this is not the case. Additionally, the size of
the instance structure was previously sensitive to performance,
as they were copied during picture cache slice creation, but this
is no longer the case.

As part of the work to retain more state between frames and scenes,
keep the prim visibility information embedded inside the instance
rather than creating and storing it per frame in the prim scratch

This shows up as a performance win in talos, and simplifies some
of the code that interacts with this. Follow up patches will
remove the existence of some of these fields, which are doing
duplicate / redundant work.

Pushed by
Pt 12 - Embed prim visibility info inside instance. r=nical

Fixed up the static assert test I missed, will re-land shortly. Thanks!

Flags: needinfo?(gwatson)
Pushed by
Pt 12 - Embed prim visibility info inside instance. r=nical

The leave-open keyword is there and there is no activity for 6 months.
:gw, maybe it's time to close this bug?

Flags: needinfo?(gwatson)

There is more work to be done, but yes, let's close this. We'll open future specific bugs and link to here when we return to working on this.

Closed: 3 years ago
Flags: needinfo?(gwatson)
Resolution: --- → FIXED
You need to log in before you can comment on or make changes to this bug.