Closed Bug 1249273 Opened 8 years ago Closed 8 years ago

A lot of time spent creating/uploading gl textures when playing videos with BufferTextureHost

Categories

(Core :: Graphics: Layers, defect)

defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla47
Tracking Status
firefox47 --- fixed

People

(Reporter: nical, Assigned: nical)

References

(Blocks 1 open bug)

Details

Attachments

(2 files, 3 obsolete files)

Jeff saw that we spend a lot of time in the driver getting video frames from BufferTextureHost to gl textures on mac (it applies also to other platforms).

A few observations about BufferTextureHost:
1) It uploads its content to a texture source from the transaction (in TextureHosty::Updated).
2) It creates a new gl texture if it doesn't alread have one
3) it doesn't attempt to use a TextureSource that is already in use by the compositable if any.
4) We don't recycle TextureClients/TextureHosts and, because of 3), TextureSources in a lot of video related cases. 

The combination of these 4 things, means we basically create a new shmem, and a gl new texture for every video frame, which is pretty expensive.

Let's focus on the GL texture creation overhead in this bug.

One solution to this could be to get better at recycling TextureClients. This solves 3) and means we would keep BufferTextureHosts around to reuse them with their already-created gl textures.

Another direction would be to do something analogous to what we do with gralloc. When we swap between GrallocTextureHost A and B for a given layer, we try to keep A's gralloc TextureSource for B whenever possible. That logic happens in GrallocTextureHost::PrepareTextureSource and GrallocTextureHost::BindTextureSource.

If we followed this idea with buffer TextureHost we would have something like a GL texture per video stream and when we swap from a TextureHost to another we just upload the new TextureHost's content to that texture.
In fact we would probably need to alternate between 2 or 3 gl textures because drivers end up copying textures if you do an upload to a texture that is currently in use.

I kinda like this approach because it means we allocate less gl textures (2 or 3 per video streams instead of 1 per decoded frame so around 10).

What complicates things somewhat is that video streams don't send textures one by one to the compositor but in batches of N textures with timestamps, and the compositor chooses the current texture when compositing instead of during the transaction.
This means that instead of uploading during the transaction we would upload during compositing (it's not really complicated per say but it's different enough from what other kind of web content does that our code isn't designed for it).
That said, we currently upload the whole batch of video frames we receive in the same transaction, so uploading on demand during composition would spread the load, I guess.
This part makes ImageHost do the PrepareTextureSource/BindTextureSource dance for the chosen image only, rather than for the entire batch.

It should be more efficient (no work done for frames that are not selected) and will give us an opportunity to optimize BufferTextureHost (in the next patch).
Attachment #8720860 - Flags: feedback?(sotaro.ikeda.g)
Attachment #8720860 - Flags: feedback?(jmuizelaar)
The logic here is a simplified version of what gralloc does. When possible we set the BufferTextureHost to reuse an existing DataTextureSource. The actual upload happens later in BufferTextureHost::Lock.
Attachment #8720863 - Flags: feedback?(sotaro.ikeda.g)
Attachment #8720863 - Flags: feedback?(jmuizelaar)
I just tested locally the two patches and it seems to not break things at least. We'll see what try says about that:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=eed5e58574b5

I also need to manually check that the heuristics kick in as expected when playing videos.
Wit a few tweaks I get video streams to reuse TextureSources properly which is good but it breaks other layer types where the current behavior of keeping the TextureSource associated to the BufferTextureHost is desirable (we swap back and forth between two TextureHost rather than getting new ones). I am thinking about adding an API for BufferTextureHost to "forget" their texture source so that the latter can be reused by the next TextureHost and use that specifically in ImageHost. The differences between textures that have an internal buffer (BufferTextureHost, etc.) and the ones that don't (Gralloc, D3D11, etc.) are quite confusing and impractical in this case, unfortunately.
I settled with an API that I am happy with: DataTextureSources can be marked as owned by a specific TextureHost. BufferTextureHost internally uses this information in order to know whether it can recycle in BufferTextureHost::PrepareTextureSource. When a TextureSource changes owner, its update serial is always reset to 0 which means that it doesn't contain valid content.

In ImageHost, since we never reuse a TextureHost with the same content (if a TextureHost is reused it will contain new content), we don't need the TextureHosts to retain their TextureSource once they have been replaced, so we can simply mark the TextureSource as not owned so that another TextureHost recycle it.
The other kinds of layers keep their current behavior by not altering the DataTextureSource's ownership.
Attachment #8721339 - Flags: feedback?(sotaro.ikeda.g)
Attachment #8721339 - Flags: feedback?(jmuizelaar)
This should have a significant impact on video playback performance on platforms where we use OpenGL for compositing (and also help on other platforms to some degree).
nical - how could/should this impact webRTC use?  typically we don't get bursts of frames (and as we discussed at Orlando, the image creation code is sub-optimal, which we want to fix).  Thanks!
Flags: needinfo?(nical.bugzilla)
Comment on attachment 8720860 [details] [diff] [review]
Part 1 - Change the way ImageHost manages TextureSources

Review of attachment 8720860 [details] [diff] [review]:
-----------------------------------------------------------------

On b2g, we need to call the following during the transaction. It seems better for performance. Can't we keep calling them on b2g?

>    img.mFrontBuffer->SetCropRect(img.mPictureRect);	
>    img.mFrontBuffer->Updated();	
>    img.mFrontBuffer->PrepareTextureSource(img.mTextureSource);
Flags: needinfo?(nical.bugzilla)
(In reply to Sotaro Ikeda [:sotaro] from comment #8)
> Comment on attachment 8720860 [details] [diff] [review]
> Part 1 - Change the way ImageHost manages TextureSources
> 
> Review of attachment 8720860 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> On b2g, we need to call the following during the transaction. It seems
> better for performance. Can't we keep calling them on b2g?
> 
> >    img.mFrontBuffer->SetCropRect(img.mPictureRect);	
> >    img.mFrontBuffer->Updated();	
> >    img.mFrontBuffer->PrepareTextureSource(img.mTextureSource);

And calling of SetCropRect() seems not exist after applying the patches.
(In reply to Randell Jesup [:jesup] from comment #7)
> nical - how could/should this impact webRTC use?  typically we don't get
> bursts of frames (and as we discussed at Orlando, the image creation code is
> sub-optimal, which we want to fix).  Thanks!

This patches makes us go from allocating one gl texture per video frame to reusing gl textures between frames for a given layer if the layer is receiving shmem/memory textures. So a video stream will typically swap back and forth between two gl textures to upload the video frames to, instead of creating, uploading and destroying a texture for each video frame which is pretty inside the driver.

I am saying gl texture but the same logic also applies to D3D and other similar "gpu" textures.

(In reply to Sotaro Ikeda [:sotaro] from comment #8)
> On b2g, we need to call the following during the transaction. It seems
> better for performance. Can't we keep calling them on b2g?
> 
> >    img.mFrontBuffer->SetCropRect(img.mPictureRect);	
> >    img.mFrontBuffer->Updated();	
> >    img.mFrontBuffer->PrepareTextureSource(img.mTextureSource);

With this patch we call them as soon as we know which texture is going to be composited next, which is during the transaction if there is only one texture, or during composition the first time the texture appears on screen if we received several textures.

It should actually be more efficient this way because we currently bind more textures than we need to (we bind all textures at the same time including the ones we will composite much later and the ones we may not composite at all). Instead of being bound directly in the transaction, textures are bound during the next transaction which is a tad later but not that much since this is the first thing that happens once the transaction is over. So I don't expect it to be a problem.

> And calling of SetCropRect() seems not exist after applying the patches.

Looking into this.
Flags: needinfo?(nical.bugzilla)
A newer version of these patches fixes the CropRect issue and most of the test failures. I'll put them up for review if the try push comes back green.
(In reply to Nicolas Silva [:nical] from comment #10)
> 
> It should actually be more efficient this way because we currently bind more
> textures than we need to (we bind all textures at the same time including
> the ones we will composite much later and the ones we may not composite at
> all). Instead of being bound directly in the transaction, textures are bound
> during the next transaction which is a tad later but not that much since
> this is the first thing that happens once the transaction is over. So I
> don't expect it to be a problem.

Thanks. It makes sense.
I folded the patch queue differently to make review a bit easier. This patch contains the changes to BufferTextureHost ownership of DataTextureSource. Without the next patch the bahevior should change from what we have now, since the BufferTextureHost owns its TextureSources by default.
Attachment #8720863 - Attachment is obsolete: true
Attachment #8721339 - Attachment is obsolete: true
Attachment #8720863 - Flags: feedback?(sotaro.ikeda.g)
Attachment #8720863 - Flags: feedback?(jmuizelaar)
Attachment #8721339 - Flags: feedback?(sotaro.ikeda.g)
Attachment #8721339 - Flags: feedback?(jmuizelaar)
Attachment #8722490 - Flags: review?(sotaro.ikeda.g)
With this patch, ImageHost only prepares and binds the texture that will be composited instead of all of the textures it receives during the transaction. This can cause PrepareTextureSource to be delayed from a transaction to the next composition if there are multiple textures as discussed earlier. This patch also opts into the mechanism introduced in Part 1 by "disowning" TextureSources, which enables the next TextureHost to reuse them.
Attachment #8720860 - Attachment is obsolete: true
Attachment #8720860 - Flags: feedback?(sotaro.ikeda.g)
Attachment #8720860 - Flags: feedback?(jmuizelaar)
Attachment #8722494 - Flags: review?(sotaro.ikeda.g)
(In reply to Nicolas Silva [:nical] from comment #13)
> Without the next patch the bahevior should change from what we have now,

Sorry, I meant should *not* change.
That said I have done all of the texting with both patches applied and I intend to land them at the same time.

try push: https://treeherder.mozilla.org/#/jobs?repo=try&revision=82f62163dc13
Comment on attachment 8722490 [details] [diff] [review]
Part 1 - Let BufferTextureHost reuse exiting DataTextureSource if they are not owned by another texture

Review of attachment 8722490 [details] [diff] [review]:
-----------------------------------------------------------------

Looks good!

::: gfx/layers/composite/TextureHost.h
@@ +400,5 @@
>     */
>    virtual void UnbindTextureSource() {}
>  
>    /**
> +   * Called when a Texture

nit. Is there a longer version of comment?
Attachment #8722490 - Flags: review?(sotaro.ikeda.g) → review+
Comment on attachment 8722494 [details] [diff] [review]
Part 2 - Change the way ImageHost manages its textures

Review of attachment 8722494 [details] [diff] [review]:
-----------------------------------------------------------------

Looks good!

::: gfx/layers/composite/ImageHost.cpp
@@ +334,2 @@
>    // Make sure the front buffer has a compositor
> +  mCurrentTextureHost->SetCompositor(GetCompositor());

mCurrentTextureHost and img->mTextureHost seems to be mixed in the function. It might be better to unify to one.
Attachment #8722494 - Flags: review?(sotaro.ikeda.g) → review+
(In reply to Sotaro Ikeda [:sotaro] from comment #16)
> nit. Is there a longer version of comment?

Oops! This is a leftover from a previous version of the patch. This function is never called, I removed it from the patch. 

(In reply to Sotaro Ikeda [:sotaro] from comment #17)
> mCurrentTextureHost and img->mTextureHost seems to be mixed in the function.
> It might be better to unify to one.

Good point. I replaced the remaining img->mTextureHost occurences be mCurrentTextureHost in this function.
One of these patches produced a nice win on glterrain on winxp:

https://treeherder.allizom.org/perf.html#/alerts?id=284
https://hg.mozilla.org/mozilla-central/rev/eced4790cbdc
https://hg.mozilla.org/mozilla-central/rev/e6d0c56847ce
Status: NEW → RESOLVED
Closed: 8 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla47
Depends on: 1251910
See Also: → 1251726
See Also: → 1251427
Depends on: 1251726
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: