Investigate update_visibility overhead & prepare_primitives
Categories
(Core :: Graphics: WebRender, enhancement, P3)
Tracking
()
Tracking | Status | |
---|---|---|
firefox78 | --- | affected |
People
(Reporter: bpeers, Assigned: bpeers)
References
(Blocks 1 open bug)
Details
Attachments
(6 files)
Some measuring with Tracy shows that update_visibility
is called hundreds of times each frame, regardless of caching, and that more than half this time is in update_prim_dependencies
.
We seem to be spending about as much time in the loop around calling add_prim_dependency
as in that call itself. Which matches this TODO that there may be some wins here.
I added some tracy_plots
and, in a windowed browser scrolling Earth:
- the majority of tile caches has 8 tiles;
- the majority of primitives touch 1 tile;
- the majority of primitives touch the same tile as the previous primitive;
- scrolling to the bottom already hits Y=70 and "Earth" is not really a huge page, so a fixed 2D array won't cut it;
- there are some outliers, some primitive thinks it's at tile Offset X = 112;
So a simple 2D array is probably too basic, unless it supports scrolling / virtual origins, and/or has a hashmap for overflowing/storing "odd" outliers.
Instead I can start with a small vec (the above suggests 8 entries) of {TileOffset, Box<Tile>}
for 16 bytes each or 2 cache lines total, and do a simple linear scan to find the offset, and remembering+returning the last search.
It should be interesting to see if that helps, and not be so much work that it's a waste when the answer is "No".
Disclaimer: I'm trying to measure with RUSTC OPT LEVEL 2 and "release [optimized]" but I'm not 100% sure my local build is the fastest it could be. Still probably ballpark correct.
Comment 1•5 years ago
|
||
A fixed 2D array based on the content size won't work, but the grid of tiles we maintain is based around the tiles that intersect the current display port, so should be relatively small and constant for a given screen size (and should be independent of the content size).
The key parts are [1] where we select a tile grid around the current viewport, and discard / retain tiles, and [2] where we clamp the indices of tiles that are affected by a primitive to the current tile grid size.
[1] https://searchfox.org/mozilla-central/source/gfx/wr/webrender/src/picture.rs#2817
[2] https://searchfox.org/mozilla-central/source/gfx/wr/webrender/src/picture.rs#2459
Assignee | ||
Comment 2•5 years ago
|
||
8-entry smallVec with 1 item of LRU on Xeon.
Assignee | ||
Comment 3•5 years ago
|
||
.. and Haswell. Both have a savings of ~0.1 to ~0.2ms, although I should get more runs to make this fair. I'll see what Talos says instead.
I like the SmallVec as it doesn't need to be set up or track any dynamic windows, it's pretty simple and fast (no multiplications, modulo, bounds checking). The key part is that we 'almost always' only hit 1 tile and it's 'almost always' the same one as last time, so all that matters is a very fast LRU check and everything else is just backing store for the "special case". Well, at least scrolling Earth is like that :)
I need to vendor tracy-rs first before I can send this out for review & Talos.
Assignee | ||
Comment 4•5 years ago
|
||
Comment 6•5 years ago
|
||
bugherder |
Assignee | ||
Comment 7•5 years ago
|
||
Plot glyph rasterization, evictions, GPU allocs
Assignee | ||
Updated•5 years ago
|
Updated•5 years ago
|
Assignee | ||
Comment 8•5 years ago
|
||
Reduce loop overhead when registering tile dependencies;
most dependencies touch the same one tile repeatedly, so
do a last-access optimization, and also keep the tiles in
2 cache lines for the common case of <= 8 tiles per cache.
Assignee | ||
Comment 9•5 years ago
|
||
This is not update_visibility
per se, but regarding PreparePrims
(which comes immediately after): I noticed when scrolling Earth that ~100 TextRuns come in per frame and they all have a bunch of glyphs that need to be checked against the cache. Again simply due to volume this takes a bit of time (eg. ~0.3ms) even though nothing needs to be done (all glyphs have been seen before).
Part of that goes into preparing the GlyphKeys, I noticed that the subpixel mode was Horizontal most of the time and the transform was the identity most of the time; but special-casing that yielded small savings per frame (only about 20us per frame).
match subpx_dir {
SubpixelDirection::Horizontal => {
if transform.is_identity() {
self.glyph_keys_range = scratch.glyph_keys.extend(
glyphs.iter().map(|src| {
GlyphKey::new_subpixel_horizontal(src.index, src.point.x + prim_offset.x)
}));
} else {
self.glyph_keys_range = scratch.glyph_keys.extend(
glyphs.iter().map(|src| {
let src_point = src.point + prim_offset;
let device_offset = transform.transform(&src_point);
GlyphKey::new_subpixel_horizontal(src.index, device_offset.x)
}));
}
},
_ => {
if transform.is_identity() {
self.glyph_keys_range = scratch.glyph_keys.extend(
glyphs.iter().map(|src| {
let src_point = src.point + prim_offset;
let device_offset = DevicePoint::new(src_point.x, src_point.y);
GlyphKey::new(src.index, device_offset, subpx_dir)
}));
} else {
self.glyph_keys_range = scratch.glyph_keys.extend(
glyphs.iter().map(|src| {
let src_point = src.point + prim_offset;
let device_offset = transform.transform(&src_point);
GlyphKey::new(src.index, device_offset, subpx_dir)
}));
}
}
}
Oh well :)
Comment 10•5 years ago
|
||
Comment 11•5 years ago
|
||
Backed out for build bustages.
Failure log: https://treeherder.mozilla.org/logviewer.html#?job_id=304656231&repo=autoland
Backout: https://hg.mozilla.org/integration/autoland/rev/bcf1957993d59ef5cd3c167817b4a9b0dce47515
Assignee | ||
Updated•5 years ago
|
Assignee | ||
Comment 12•5 years ago
|
||
Comment 13•5 years ago
|
||
Comment 14•5 years ago
|
||
bugherder |
Comment 15•4 years ago
|
||
The leave-open keyword is there and there is no activity for 6 months.
:jimm, maybe it's time to close this bug?
![]() |
||
Updated•4 years ago
|
Updated•4 years ago
|
Comment 16•4 years ago
|
||
The leave-open keyword is there and there is no activity for 6 months.
:jimm, maybe it's time to close this bug?
![]() |
||
Updated•4 years ago
|
Comment 17•3 years ago
|
||
The leave-open keyword is there and there is no activity for 6 months.
:jimm, maybe it's time to close this bug?
![]() |
||
Updated•3 years ago
|
Comment 18•11 months ago
|
||
Hi :gw, is there anything to salvage here (like https://phabricator.services.mozilla.com/D77573) or should this be closed?
Updated•11 months ago
|
Updated•11 months ago
|
Description
•