Closed Bug 1592044 Opened 5 years ago Closed 4 years ago

Recycle IOSurfaces + framebuffers

Categories

(Core :: Graphics: WebRender, enhancement, P3)

All
macOS
enhancement

Tracking

()

RESOLVED FIXED
mozilla73
Tracking Status
firefox73 --- fixed

People

(Reporter: mstange, Assigned: mstange)

References

(Blocks 1 open bug)

Details

Attachments

(2 files, 1 obsolete file)

In the WebRender + OS compositor configuration, during fast scrolling we destroy and allocate a lot of surfaces.

Here's a profile: https://perfht.ml/2JFbXJl

Of the non-idle time, the renderer thread time breaks down as follows:

29% glFlush
15% CA commit
7.8% IOSurface creation
7.1% Binding IOSurface to new texture
6.7% Framebuffer destruction
6.3% Framebuffer creation
3.2% glClientWaitSync

So that's 32% from too much GPU work (probably mostly due to bug 1591529), 15% because we have too many CALayers, and 28% from IOSurface/texture/framebuffer/renderbuffer creation and destruction.
Recycling those resources should allow us to cut down those 28% a lot.

Priority: -- → P3

Definitely cache framebuffers, ideally immutably!

FWIW in WebGL we cache some validation at the framebuffer level, and additionally for new framebuffers GL has to evaluate for framebuffer completeness before use, which is likely to add overhead.

Assignee: nobody → mstange

This is the opposite of caching framebuffers immutably. :)
Why do you want to add depth/stencil planes after creation?

Oh right! I had forgotten about your comment about immutability. I can definitely change it to do that.

The API I have allows calling GetFramebufferForSurface for the same IOSurface at different times with different values for aNeedsDepthBuffer. I don't want to create a depth buffer unnecessarily if somebody calls it with aNeedsDepthBuffer == false, and it seemed wasteful to destroy the existing framebuffer if another call with aNeedsDepthBuffer == true comes in later for the same surface. But in practice this is a really unlikely scenario, so I could just as well recreate the framebuffers. Basically, CompositorOGL doesn't need a depth buffer but WebRender does, and they both use the SurfacePool. However, we don't really want to mix CompositorOGL and WebRender in the same instance of Firefox anyway. I think at the moment we can still get into a mixed state because we don't use WebRender for WebExtension arrow panels, but I'm not sure about that.

Attachment #9111816 - Attachment is obsolete: true
Attachment #9112134 - Attachment description: Bug 1592044 - Add MozFramebuffer::HasDepthAndStencil. r=jgilbert → Bug 1592044 - Add MozFramebuffer::HasDepth() and HasStencil(). r=jgilbert
Attachment #9111817 - Attachment description: Bug 1592044 - WIP: Add SurfacePool → Bug 1592044 - Reduce the frequency of IOSurface and framebuffer creation and destruction with the help of a surface pool. r=jgilbert, r=mattwoodrow
Attachment #9111817 - Attachment description: Bug 1592044 - Reduce the frequency of IOSurface and framebuffer creation and destruction with the help of a surface pool. r=jgilbert, r=mattwoodrow → Bug 1592044 - Reduce the frequency of IOSurface and framebuffer creation and destruction with the help of a surface pool. r=jgilbert,mattwoodrow

I've hit a snag with my threading model, exposed by assertions in debug builds on try:
https://treeherder.mozilla.org/logviewer.html#/jobs?job_id=281024646&repo=try&lineNumber=8692

The SurfacePool is shared between all windows, but some windows paint on the main thread (context menus, tooltips) and some on the compositor or renderer thread (regular browser windows). Eviction is done in OnEndFrame. So a call to OnEndFrame on the main thread can evict entries that were created on the compositor / renderer thread. This means that we destroy framebuffers and drop references to GLContexts . But GLContext inherits from the non-thread-safe SupportsWeakPtr, which has asserts that make sure that GLContext is only used on a single thread. These things are at odds with one another.

I can see three ways of dealing with this:

  1. Keep the current architecture and allow using GLContext and MozFramebuffer from multiple threads (but only one thread at a time, with the caller being responsible for proper locking).
  2. Keep the current architecture, but put mechanisms in place that ensure that GL resources are only touched on the thread that they were created on.
  3. Have two thread pools: one for "main-thread windows" and one for "compositor/renderer thread windows".

I'm going to explore option 2 for a bit. Jeff, do you have opinions on this?

Flags: needinfo?(jgilbert)

Option 3 works better. I have a solution that replaces SurfacePool::Get() with SurfacePool::Create(size_t aPoolSizeLimit). I'm giving main thread rendering and non-WebRender one pool per window, with a limit of 0, i.e. "discard everything", which preserves the current (pre-patch) behavior. And WebRender gets one shared SurfacePool in its RenderThread whose lifetime it manages.

Pushed by mstange@themasta.com:
https://hg.mozilla.org/integration/autoland/rev/1b4c3b317568
Add MozFramebuffer::HasDepth() and HasStencil(). r=jgilbert
https://hg.mozilla.org/integration/autoland/rev/f97710c9392e
Reduce the frequency of IOSurface and framebuffer creation and destruction with the help of a surface pool. r=jgilbert
Status: NEW → RESOLVED
Closed: 4 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla73
Flags: needinfo?(jgilbert)
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: