Closed
Bug 1480317
Opened 6 years ago
Closed 6 years ago
Scrolled surface caching
Categories
(Core :: Graphics: WebRender, enhancement, P4)
RESOLVED
FIXED
People
(Reporter: kats, Unassigned)
References
Details
(Whiteboard: [gfx-noted])
We should investigate the cost/benefit of caching the final frame that gets produced by WR, so that we can do less work when async scrolling and consistently hit 60fps.
Updated•6 years ago
Blocks: wr-investigate
Updated•6 years ago
Priority: -- → P3
Comment 1•6 years ago
Here's a rough description from Glenn of how this could work:
"
- So, we currently calculate screen bounding rects for each item
- We can easily extend this to use those to mark a coarse tile grid to let us know which tiles are affected by each primitive
- e.g. 1024 x 512 tiles
- each primitive gets assigned to an alpha-batcher struct for each tile it touches
- we draw each tile to an off-screen surface, setting a scissor rect
- each tile is then blitted to the main framebuffer at the correct offset
complexities are:
- there is a performance hit on the first frame - we have to draw to a surface, then blit - so it's not always clear when we *should* be doing this (unless we just decide to always do it)
- the scissor rect may affect batching a little bit - there are some tricks we can use here to avoid needing a scissor rect in most cases (normally we'd only need a scissor in the presence of 3d transforms)
- during a scroll, we need to consider which tiles may be dirty due to any changed property animations or external textures
- doing this fairly conservatively should be easy
- basically: track any primitives that are affected by a property animation or external texture
- when a property animation / texture changes, tag those tiles as needing to be re-painted
- it's not clear to me if there are cases where that would result in re-rendering much more than needed, but it's possible there are some edge cases we'd need to solve there
We could initially do this for the root scroll frame
What about when we get a new display list?
- that would then rely on the same caching we intend to do for other off-screen picture effects, that is - a deep compare of the prims/clips to see if the content matches
- I had intended to only do this for pictures where we think the overdraw / pixel count makes it worthwhile to cache, but it can be done more widely if it makes sense to.
"
Comment 2•6 years ago
<gw> bholley: at a very high level, we can approximately say that WR performance is linear in the number of blended pixels. Estimating this pixel count for a given scene is very easy (we have 90% of the information now). So, my theory is that we can actually have a very reasonable guess of whether a page will be "slow" to draw before we draw it for the first time. If that works, that would be the main driver as to whether it makes sense to cache this surface.
<gw> bholley: (and would mean we don't rely on drawing it and timing it the first time)
<jrmuizel> bholley: for a page like francine the browser ui will still be blocked during rendering even if it's async because all access to the gpu is serialized
<jrmuizel> s/even if it's async/even if it's offscreen/
<gw> jrmuizel: bholley: in the specific case of francine, it seems that the best option is for WR to estimate (based on the above) that the blended pixel count is very high, and therefore cache the render output of that picture. controlling how long that surface exists in the cache is an open question - could be explicit control by gecko, or could be some kind of LRU cache for surfaces?
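The heuristic gw describes — estimate the blended-pixel count up front and cache the surface when it is high — might be sketched like this. The `Prim` struct, the function names, and the budget threshold are all assumptions for illustration, not WebRender's implementation.

```rust
// Hypothetical sketch of the blended-pixel heuristic from the chat above.
struct Prim {
    screen_area: u64,  // screen-space pixel area of this primitive
    needs_blend: bool, // does drawing it require blending?
}

/// Estimate scene cost as the total screen area of primitives that
/// require blending (per the "performance is linear in blended pixels"
/// approximation quoted above).
fn blended_pixel_estimate(prims: &[Prim]) -> u64 {
    prims
        .iter()
        .filter(|p| p.needs_blend)
        .map(|p| p.screen_area)
        .sum()
}

/// Decide whether to cache the rendered surface, given some per-frame
/// blended-pixel budget (the budget value itself would need tuning).
fn should_cache(prims: &[Prim], budget: u64) -> bool {
    blended_pixel_estimate(prims) > budget
}
```

The open question from the chat — how long the cached surface lives — is not addressed here; that could be explicit control by Gecko or an LRU cache of surfaces, as suggested above.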
Updated•6 years ago
Blocks: stage-wr-trains
Updated•6 years ago
Priority: P3 → P4
Comment 3•6 years ago
Glenn, Jeff -- It seems like we don't need this optimization for MVP, or do we not have enough info/data to make that decision?
Flags: needinfo?(jmuizelaar)
Flags: needinfo?(gwatson)
Comment 4•6 years ago
There are a small number of bugs in the P2/P3/P4 buckets that probably rely on scrolled surface caching to run at 60 fps on lower end GPUs.
It's unclear at the moment how many of those sites are not fast enough on discrete GPUs for the initial target, and how many of those are actually blockers for shipping (they tend to be edge cases on a small number of sites).
Flags: needinfo?(gwatson)
Updated•6 years ago
Flags: needinfo?(jmuizelaar)
Comment 5•6 years ago
This is fixed by our current picture caching stuff.
Status: NEW → RESOLVED
Closed: 6 years ago
Resolution: --- → FIXED