Surface recycling on Android in GLScreenBuffer is broken
Categories
(Core :: Graphics: CanvasWebGL, defect)
Tracking
()
People
(Reporter: jnicol, Assigned: lsalzman)
References
Details
Attachments
(1 file, 1 obsolete file)
48 bytes,
text/x-phabricator-request
|
pascalc
:
approval-mozilla-beta+
|
Details | Review |
I noticed this while investigating bug 1892716. It exacerbates the issue but it's not the real culprit.
In GLScreenBuffer.cpp, on Android we use a pool of SharedSurface_SurfaceTexture
s, as allocating Android Surfaces is expensive. After we obtain a surface from the pool (or allocate a new one if necessary, we immediately push the surface to the back of the pool here.
Some time later, we push the Surface to the back of the pool again, here in StoreRecycledSurface
.
Now we have the same Surface in the pool multiple times, which presumably could cause some issues with selecting the wrong Surface at the wrong time. But additionally it means that the pool size can grow larger than expected, and due to this check using ==
instead of >=
, we choose not to use an existing surface and allocate a new one instead. We end up allocating a new Surface every time we try to acquire one.
Kelsey, Sotaro, what is the correct fix here? I think we should defensively either add an assert that the pool size is not larger than expected, or switch the condition to >=
. But how should we avoid adding duplicate surfaces to the pool? Do we remove the call here, or should StoreRecycledSurface()
not be being called on Android?
Reporter | ||
Updated•5 months ago
|
Updated•5 months ago
|
Comment 1•4 months ago
|
||
@sotaro Can you take this?
Reporter | ||
Comment 2•4 months ago
|
||
I dug a little deeper in to this. I think the problem is that mDestroyedCallback
should be set when using accelerated canvas, but it is not.
For in-process webgl, mDestroyedCallback
is not set, so we must push the surfaces to the pool in Acquire(). This currently works correctly.
In out-of-process webgl, mDestroyedCallback
is set, and StoreRecycledSurface()
takes care of the recycling, and this code takes care of ensuring we don't add duplicates to the pool, because poolSize == 0
when mDestroyedCallback
is set. This could be clearer, but currently works correctly.
For accelerated canvas, mDestroyedCallback
is not set, but StoreRecycledSurface()
gets called anyway. Hence the problem.
The reason mDestroyedCallback
is not set for accelerated canvas is that we usually set it here. However, for accelerated canvas by the time this gets called we have already registered the texture owner here. If we skip the RegisterTextureOwner()
call in CanvasTranslator::EnsureRemoteTextureOwner()
, then WebGLContext::PushRemoteTexture()
will take care of registering it for us, and it will set the destroyed callback. And then everything works correctly.
Reporter | ||
Comment 3•4 months ago
|
||
CanvasTranslator::EnsureRemoteTextureOwner() was calling
RegisterTextureOwner(), meaning that the owner was already registered
when WebGLContext::PushRemoteTexture() gets called. This means that
the SwapChain::SetDestroyedCallback() never got called for acclerated
canvas swapchains.
This in turn meant that SharedSurfaces were being added the the
swapchain's pool by both SwapChain::Acquire() and
SwapChain::StoreRecycledSurface(), breaking the pooling logic and
resulting in new Surfaces being allocated every time Acquire() was
called.
To fix this, we remove CanvasTranslator's call to
RegisterTextureOwner(), relying on WebGLContext to register it for
us (and set the destroyed callback correctly). We additionally make
the swapchain pooling logic more defensive, and add an assertion that
the callback was correctly set.
Assignee | ||
Comment 4•4 months ago
|
||
Updated•4 months ago
|
Pushed by lsalzman@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/34afcad20473 Disable gl::SwapChain's internal pooling when RemoteTextureMap is used. r=jnicol,gfx-reviewers
Comment 6•4 months ago
|
||
bugherder |
Updated•4 months ago
|
Assignee | ||
Comment 7•4 months ago
|
||
Comment on attachment 9402602 [details]
Bug 1894929 - Disable gl::SwapChain's internal pooling when RemoteTextureMap is used. r?jnicol
Beta/Release Uplift Approval Request
- User impact if declined: Freezing and/or memory leaks on Android.
- Is this code covered by automated tests?: Unknown
- Has the fix been verified in Nightly?: Yes
- Needs manual test from QE?: No
- If yes, steps to reproduce:
- List of other uplifts needed: None
- Risk to taking this patch: Low
- Why is the change risky/not risky? (and alternatives if risky):
- String changes made/needed:
- Is Android affected?: Yes
Updated•4 months ago
|
Comment 8•4 months ago
|
||
Comment on attachment 9402602 [details]
Bug 1894929 - Disable gl::SwapChain's internal pooling when RemoteTextureMap is used. r?jnicol
Approved for 127 beta 7, thanks.
Updated•4 months ago
|
Description
•