Bug 1193468 (Open) - Opened 9 years ago, updated 2 years ago

Refactor TextureClient recycling and waiting for compositor

Categories: Core :: Graphics: Layers, defect
Version: 29 Branch


Reporter: mattwoodrow
Assignee: Unassigned
Whiteboard: [gfx-noted]

We currently have a huge amount of code to retain client-side objects and (maybe) recycle them. A lot of it appears to be redundant and overly complex; we should try to do better.

Things I know of currently:

* RemoveTextureFromCompositableAsync/AsyncTransactionTracker
    * Holds a reference to a TextureClient until all pending async messages have been processed by the compositor
    * Used for TextureClientRecycleAllocator and gonk fence handle (which I don't really understand right now)
    * Requires opt-in from the compositable (seems wrong), code in each of the compositable implementations and #ifdefs

* KeepUntilFullDeallocation/TKeepAlive
    * Holds a reference to some arbitrary refcounted type T until the compositor side TextureHost object has been destroyed
    * Used for implementation-specific objects that don't hold a reference while being serialized over IPC.
    * Only requires opt-in from the specific TextureClient impl that needs it (much better).

* TextureClient::WaitForCompositorRecycle
    * Only seems to be called by canvas?
    * Host side seems to make mention of tiling, but TiledContentHost piece looks like dead code

* AtomicRefCountedWithFinalize::SetRecycleCallback
    * Used for getting notifications when a TextureClient is unused.
    * Requires manual inclusion into RemoveTextureFromCompositableAsync in order to work correctly.
    * Callback takes a void* closure data object, usually a refcounted object with confusing lifetime management.

* TextureClient::RecycleTexture
    * Used by TextureClientRecycleAllocator to update the texture flags on a texture being reused.
    * Probably fine, but somewhat confusingly named given the above


There's probably more here.

These aren't all necessarily duplicates of each other, but it's really confusing and I think we could cover all the use cases in a much simpler way.
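For reference, the SetRecycleCallback pattern from the list above amounts to roughly the following. This is a minimal sketch of the refcount-reaches-1 trick with illustrative names (TextureClientLike is hypothetical), not the actual AtomicRefCountedWithFinalize code:

#include <atomic>

class TextureClientLike {
public:
  using RecycleCallback = void (*)(TextureClientLike* aClient, void* aClosure);

  void AddRef() { mRefCount.fetch_add(1, std::memory_order_relaxed); }

  void Release() {
    int count = mRefCount.fetch_sub(1, std::memory_order_acq_rel) - 1;
    if (count == 1 && mRecycleCallback) {
      // The recycler holds the one remaining strong ref, so a count of 1
      // means "unused" and the callback fires. The void* closure's lifetime
      // is entirely the caller's problem -- the confusing part noted above.
      mRecycleCallback(this, mClosure);
    } else if (count == 0) {
      delete this;
    }
  }

  void SetRecycleCallback(RecycleCallback aCallback, void* aClosure) {
    mRecycleCallback = aCallback;
    mClosure = aClosure;
  }

private:
  std::atomic<int> mRefCount{0};
  RecycleCallback mRecycleCallback = nullptr;
  void* mClosure = nullptr;
};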
AsyncTransactionTracker (now AsyncTransactionWaiter) was mainly created for fence delivery on gonk. A fence is exposed as a file descriptor (fd), and to deliver the fd from the chrome process to the content process, a layer transaction is necessary. At the time, there was already a recycling mechanism using WaitForCompositorRecycle() from Bug 979489. At first I used this mechanism for fence delivery in ClientTiledThebesLayer, but it did not work for ClientThebesLayer, hw-accelerated SurfaceStream, or ImageLayer.

In the ClientThebesLayer and SurfaceStream cases, while the main thread is calling TextureClient::Lock(), the TextureClient needs to receive the fence's fd from the chrome (b2g) process via a transaction. But the main thread is already busy inside TextureClient::Lock() and cannot be used for fence delivery. So I implemented AsyncTransactionTracker, and the main thread's fences are now delivered via ImageBridge.

In ImageLayer's case, WaitForCompositorRecycle() simply did not work. On an ImageLayer, one TextureClient can be delivered multiple times asynchronously, and the WaitForCompositorRecycle() mechanism could not handle that correctly. Therefore I chose AsyncTransactionTracker for ImageLayer on gonk.

AsyncTransactionTracker is expected to be used together with SetRecycleCallback when TextureClient recycling can be asynchronous (as in video playback or camera preview).
With the recent canvas refactoring, gonk's fence delivery mechanism seems to have been removed, so there might be a risk of a regression like Bug 1006164.
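A minimal sketch of what that waiting mechanism boils down to, with illustrative names (the real AsyncTransactionTracker/AsyncTransactionWaiter also carries a transaction id and, on gonk, the fence fd over IPC):

#include <condition_variable>
#include <mutex>

class AsyncTransactionWaiterSketch {
public:
  // Producer side: block until the compositor has processed the transaction.
  void WaitComplete() {
    std::unique_lock<std::mutex> lock(mMutex);
    mCondVar.wait(lock, [this] { return mDone; });
  }

  // ImageBridge side: called when the compositor's reply (carrying the
  // fence fd on gonk) has arrived, releasing the waiting producer.
  void NotifyComplete() {
    {
      std::lock_guard<std::mutex> lock(mMutex);
      mDone = true;
    }
    mCondVar.notify_all();
  }

private:
  std::mutex mMutex;
  std::condition_variable mCondVar;
  bool mDone = false;
};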
I have a plan.
Assignee: nobody → nical.bugzilla
I like plans, tell me more :)
In short, I am making TextureClient a final class that has a backend-specific TextureData object (X11TextureData, GrallocTextureData, etc.) which lets me:
* put all of the assertions in TextureClient (check that we lock/unlock properly, etc.) instead of duplicating them in all of the subclasses
* remove the crazy hacks around TextureClient's refcounting (right now TextureClients that get recycled are recycled when their refcount reaches 1, and the recycler holds a strong ref to them to compensate, etc.). We will not recycle the TextureClient itself, but recycle the IPDL actor and the TextureData objects instead (which is what we actually want).
* implement the KeepAlive thing properly (the IPDL actor then just keeps the TextureData object instead of having to hack it up for all backends)
* some other stuff related to having a "remote" texture for efficient compositor snapshots.

Right now I am converting the TextureClient subclasses one by one (I'm halfway through), and as soon as I have them all done, the mechanism for recycling textures will be much easier to implement.
The idea would be to have the recycler hold pairs of TextureData+IPDLActor, and create fresh new TextureClients around them. We'd have a recycler per size that we want to recycle (typically tiles and video frames) and per CompositableForwarder. We would have two recycling strategies: a simpler one based on an asynchronous handshake with the compositor to recycle the data after the TextureClient's refcount goes to zero, and what we currently do with tiling (more aggressively recycling TextureClients regardless of their refcount, because tiling has simpler lifetime and ownership characteristics).
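A minimal sketch of the TextureClient/TextureData split described above, under hypothetical naming (the *Sketch types are illustrative, not the final API):

#include <memory>

class TextureDataSketch {
public:
  virtual ~TextureDataSketch() = default;
  virtual bool Lock() = 0;
  virtual void Unlock() = 0;
  // Backend-specific deallocation lives here, not in TextureClient.
  virtual void Deallocate() = 0;
};

class GrallocTextureDataSketch final : public TextureDataSketch {
public:
  bool Lock() override { /* map the gralloc buffer */ return true; }
  void Unlock() override { /* unmap it */ }
  void Deallocate() override { /* free the gralloc buffer */ }
};

class TextureClientSketch final {
public:
  explicit TextureClientSketch(std::unique_ptr<TextureDataSketch> aData)
      : mData(std::move(aData)) {}

  // The lock/unlock assertions live here, once, instead of being duplicated
  // in every backend subclass.
  bool Lock() {
    if (mIsLocked || !mData) return false;
    mIsLocked = mData->Lock();
    return mIsLocked;
  }

  void Unlock() {
    if (!mIsLocked) return;
    mData->Unlock();
    mIsLocked = false;
  }

  // A recycler can take the data back, so the TextureClient object dies
  // without destroying its data (or the IPDL actor paired with it).
  std::unique_ptr<TextureDataSketch> TakeData() { return std::move(mData); }

private:
  std::unique_ptr<TextureDataSketch> mData;
  bool mIsLocked = false;
};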
Adding to Matt's thoughts on the topic:
We have two kinds of recycling, serving different purposes:
 1) avoiding reallocation (basically we only recycle texture memory but not texture content)
 2) avoiding repaint (recycle both memory and content). This is not actually about recycling, but more about synchronizing shared resources that are not immutable.

There are several ways we can manage synchronization resources:
 A) by checking whether a resource is writeable without waiting (typically, copy-on-write locks in tiling). This does not require any message passing, just an atomic integer in a shmem (see the sketch after this list).
 B) by getting notified that a resource is available after a message from the compositor, and eventually add the resource to the set of things that can be reused. The advantage of this approach is to be able to run some specific code to release associated objects, or to have logic start automatically as soon as the resource is available. 
 C) by blocking the thread until a resource is made available, which also requires chatting with the compositor process, either to know that a transaction is complete or to get fence objects. The only advantage of this approach is that it naturally throttles the producer if the compositor cannot keep up, but needing that is actually a sign that the producer code should have been smarter about when to produce and done something more like (B): better to start producing when the resources you depend on are ready than to start producing and block the thread when you realize they are not ready yet. Sometimes the pitfalls of this approach are mitigated by the producer being on its own thread, as with video decoding, but it's still not great.
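As promised in (A), a minimal sketch of a copy-on-write read lock, assuming the counter lives in a shmem mapped by both processes (illustrative, not the actual tiling lock code):

#include <atomic>

struct ShmemReadLockSketch {
  // In the real design this atomic would be placed inside a shmem so the
  // compositor process can take read locks without any message passing.
  std::atomic<int> mReadCount{0};

  void ReadLock()   { mReadCount.fetch_add(1, std::memory_order_acquire); }
  void ReadUnlock() { mReadCount.fetch_sub(1, std::memory_order_release); }

  // The producer writes in place only when the compositor holds no read
  // lock; otherwise it copies into a fresh buffer (copy-on-write).
  bool CanWriteInPlace() const {
    return mReadCount.load(std::memory_order_acquire) == 0;
  }
};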

AsyncTransactionTracker and friends help a lot with (B) and (C), although I think we could do a better job of having TextureClient hide that under its locking API.

We kinda have to support all three synchronization approaches, although in a perfect world we would only be choosing between (A) and (B) depending on what we do with the texture.

Back to recycling, we still have two different use cases (1) and (2), and that'll require at least two mechanisms. (2) being very compositable-specific, we should focus on having a unified solution for (1), and that's not too hard.

> * RemoveTextureFromCompositableAsync/AsyncTransactionTracker

That will have to stay, and will be used at least to forward fence objects to gralloc buffers, and by all of the use cases where we need to block the producer thread until a resource is reusable. We should try to reduce the number of these cases, but that's another story.

> * KeepUntilFullDeallocation/TKeepAlive

This is getting simpler with the work I am doing, but is only used to handle the difference between textures which have to be deallocated on the host and the ones that have to be deallocated on the client. It's not hugely complicated and it's mostly orthogonal to the rest.
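A sketch of that simplification, building on the TextureDataSketch type from the earlier sketch (names illustrative): the IPDL actor keeps the data alive until host-side destruction is confirmed, instead of per-backend TKeepAlive hacks.

#include <memory>

class TextureActorSketch {
public:
  // The actor, not the TextureClient, owns the data while the host-side
  // TextureHost may still be using it.
  void AdoptData(std::unique_ptr<TextureDataSketch> aData) {
    mData = std::move(aData);
  }

  // Called when the compositor confirms the TextureHost is destroyed.
  void OnHostDestroyed(bool aDeallocateOnClient) {
    if (mData && aDeallocateOnClient) {
      mData->Deallocate();  // client-deallocated texture
    }
    // For host-deallocated textures the host already freed the memory;
    // either way the keep-alive reference can now be dropped.
    mData = nullptr;
  }

private:
  std::unique_ptr<TextureDataSketch> mData;
};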

> * TextureClient::WaitForCompositorRecycle
>     * Only seems to be called by canvas?
>     * Host side seems to make mention of tiling, but TiledContentHost piece looks like dead code
> 
> * AtomicRefCountedWithFinalize::SetRecycleCallback
>     * Used for getting notifications when a TextureClient is unused.
>     * Requires manual inclusion into RemoveTextureFromCompositableAsync in order to work correctly.
>     * Callback takes a void* closure data object, usually a refcounted object with confusing lifetime management.

That will mostly disappear with the work I am doing. At most, the TextureClient will just tell the SurfaceAllocator "Hey, I am getting destroyed, feel free to recycle my stuff", and the allocator will decide whether it wants to recycle the TextureData and IPDL actor (and manage the fact that this might happen on any thread). No need for callbacks, nor for refcounting tricks, since the TextureClient object will actually die without destroying its data.
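A minimal sketch of that shape, reusing the TextureDataSketch/TextureClientSketch types from the earlier sketch (again illustrative, not a real SurfaceAllocator API):

#include <memory>
#include <mutex>
#include <vector>

class TextureDataRecyclerSketch {
public:
  // Called from the TextureClient's destructor, possibly on any thread:
  // "Hey, I am getting destroyed, feel free to recycle my stuff."
  void RecycleData(std::unique_ptr<TextureDataSketch> aData) {
    std::lock_guard<std::mutex> lock(mMutex);
    mPool.push_back(std::move(aData));
  }

  // Hands out a fresh TextureClient wrapped around pooled data when
  // available; no callbacks or refcounting tricks involved.
  std::unique_ptr<TextureClientSketch> CreateOrRecycle() {
    std::lock_guard<std::mutex> lock(mMutex);
    if (!mPool.empty()) {
      auto data = std::move(mPool.back());
      mPool.pop_back();
      return std::make_unique<TextureClientSketch>(std::move(data));
    }
    return nullptr;  // real code would allocate fresh TextureData here
  }

private:
  std::mutex mMutex;
  std::vector<std::unique_ptr<TextureDataSketch>> mPool;
};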

> 
> * TextureClient::RecycleTexture
>     * Used by TextureClientRecycleAllocator to update the texture flags on a texture being reused.
>     * Probably fine, but somewhat confusingly named given the above

That will get merged into the shiny and clean recycler that has yet to be made.

There's a whole bunch of stuff in here, so we can talk about them as a whole in this bug, but the different action items all have their respective wins and should happen in separate bugs.
To add to the list of duplicated stuff, the SyncObject class (see TextureClient::SyncWithObject) and FenceHandle serve the same purpose. Both were sorta meant to expose a platform-agnostic interface, but each is only implemented on one platform, and both are exposed to/by TextureClient. I'll merge them once I have figured out the rest.
Depends on: 1200595
Assignee: nical.bugzilla → nobody
Severity: normal → S3