Open Bug 1611134 Opened 4 years ago Updated 1 month ago

Process primitives by types instead of always in z-order

Categories

(Core :: Graphics: WebRender, enhancement, P3)

People

(Reporter: nical, Unassigned)

References

(Blocks 2 open bugs)

Details

During frame building, each pass currently goes through the primitives in z-order, and each loop is basically a big match statement on the primitive kind. We get very poor instruction and data locality out of that.
Most of the processing actually doesn't need to be ordered (only the batching phase does).

I propose moving more of the primitive data into their own stores and iterating directly on these for the resource request phases. The specific primitive data would still be grouped by cluster to efficiently skip clusters after the visibility pass.
There is some logic that is common to all primitive types, for example the masking/segmentation logic. This could also be separated out into a single array of whatever data is needed for the masked primitives (and only the masked primitives).
Visibility would also be separated into a single stream.
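
A minimal sketch of the masked-only array combined with a separate visibility stream. Everything here is hypothetical and for illustration only: `MaskData`, its fields and `visible_masked_prims` are not existing WebRender names, and plain `usize` indices stand in for typed indices.

```rust
/// Hypothetical per-mask data. The real fields would be whatever the
/// masking/segmentation logic needs; `corner_radius` is only a placeholder.
pub struct MaskData {
    /// Index of the owning primitive in the global primitive array.
    pub global_idx: usize,
    /// Placeholder payload standing in for clip/segment information.
    pub corner_radius: f32,
}

/// Returns the global indices of the masked primitives that survived the
/// visibility pass. `visible` is the separate visibility stream, holding
/// one flag per global primitive index.
pub fn visible_masked_prims(masks: &[MaskData], visible: &[bool]) -> Vec<usize> {
    masks
        .iter()
        .filter(|mask| visible[mask.global_idx])
        .map(|mask| mask.global_idx)
        .collect()
}
```

Only primitives that have a mask get an entry in `masks`, so this loop never touches unmasked primitives, and the visibility flags are read from one dense array rather than from each primitive's own data.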

Not trying to go full ECS here, but I'd like to get some of the cache efficiency ECSs typically get from processing by systems instead of globally, as well as a clear way to isolate code complexity into smaller parts.

Here is a very hand-wavey skeleton of what I have in mind:

pub struct Cluster {
    spatial_node: SpatialNodeIdx,

    primitives: Range<GlobalIdx>,

    foo_primitives: Range<SpecificIdx>,
    bar_primitives: Range<SpecificIdx>,
    // etc.
    clips: Range<SpecificIdx>,
}

struct Primitive {
    kind: PrimKind, // simple enum with no data fields
    idx: SpecificIdx,
    clip: SpecificIdx,
    // visibility info below potentially in a separate array.
    aabb: LayoutRect,
    visible: bool,
}

struct PrimStore {
    // indexed via GlobalIdx
    primitives: Vec<Primitive>,
    // indexed via SpecificIdx
    foos: Vec<FooPrimitive>,
    bars: Vec<BarPrimitive>,
    // etc.
    clips: Vec<ClipData>,
}

// Most of the frame building code would look like a succession of simple loops like:

fn process_foo(picture, ..) {
    for cluster in &picture.clusters {
        if !cluster.visible {
            continue;
        }
        for foo_primitive in &prim_store.foos[cluster.foo_primitives] {
            // If we need finer grained culling than cluster level, do it here as well,
            // for some types of primitives it might be cheaper to just process the
            // whole cluster including occluded primitives.
            if !prim_store.primitives[foo_primitive.global_idx].visible {
                continue;
            }

            // Do the thing.
        }
    }
}
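
For reference, a reduced version of the skeleton above that compiles. It makes several assumptions the skeleton leaves open: indices are plain `usize` instead of `GlobalIdx`/`SpecificIdx`, `Cluster` carries a `visible` flag written by the visibility pass, `FooPrimitive` stores a `global_idx` back-reference, and `payload` and `example_scene` are made-up names used only to have something to observe.

```rust
use std::ops::Range;

// All types below are hypothetical stand-ins, not WebRender's actual types.
#[derive(Clone, Copy, Debug, PartialEq)]
pub enum PrimKind {
    Foo,
    Bar,
}

pub struct Primitive {
    pub kind: PrimKind,
    /// Index into the per-kind store (e.g. `PrimStore::foos`).
    pub idx: usize,
    pub visible: bool,
}

pub struct FooPrimitive {
    /// Back-reference into `PrimStore::primitives`.
    pub global_idx: usize,
    /// Placeholder for the foo-specific data.
    pub payload: u32,
}

pub struct Cluster {
    /// Assumed to be written by the visibility pass.
    pub visible: bool,
    pub foo_primitives: Range<usize>,
}

pub struct PrimStore {
    pub primitives: Vec<Primitive>,
    pub foos: Vec<FooPrimitive>,
}

/// Walks the foo store cluster by cluster, skipping hidden clusters and
/// hidden primitives, and returns the payloads that were processed.
pub fn process_foo(clusters: &[Cluster], store: &PrimStore) -> Vec<u32> {
    let mut processed = Vec::new();
    for cluster in clusters {
        if !cluster.visible {
            continue;
        }
        // `Range` is not `Copy`, so it is cloned to index the slice.
        for foo in &store.foos[cluster.foo_primitives.clone()] {
            if !store.primitives[foo.global_idx].visible {
                continue;
            }
            processed.push(foo.payload); // "Do the thing."
        }
    }
    processed
}

/// Builds a tiny scene: one visible cluster with two foos (one of them
/// hidden) and one hidden cluster with a single foo.
pub fn example_scene() -> (Vec<Cluster>, PrimStore) {
    let store = PrimStore {
        primitives: vec![
            Primitive { kind: PrimKind::Foo, idx: 0, visible: true },
            Primitive { kind: PrimKind::Bar, idx: 0, visible: true },
            Primitive { kind: PrimKind::Foo, idx: 1, visible: false },
            Primitive { kind: PrimKind::Foo, idx: 2, visible: true },
        ],
        foos: vec![
            FooPrimitive { global_idx: 0, payload: 10 },
            FooPrimitive { global_idx: 2, payload: 20 },
            FooPrimitive { global_idx: 3, payload: 30 },
        ],
    };
    let clusters = vec![
        Cluster { visible: true, foo_primitives: 0..2 },
        Cluster { visible: false, foo_primitives: 2..3 },
    ];
    (clusters, store)
}
```

With `example_scene()`, `process_foo` only processes the first foo: the second is culled by the per-primitive check and the third by the cluster-level check. The inner loop reads `foos` contiguously and never matches on `PrimKind`.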
Depends on: 1647299
Severity: normal → S3
Blocks: wr-todos
No longer blocks: gfx-complexity