Closed Bug 1762388 Opened 2 years ago Closed 2 years ago

Invalid to register the same layer tree twice assertion failure when continually reinitializing compositor

Categories

(Core :: Graphics, defect)

defect

Tracking

()

RESOLVED FIXED
100 Branch
Tracking Status
firefox100 --- fixed

People

(Reporter: jnicol, Assigned: jnicol)

References

Details

Attachments

(1 file)

Bug 1762025 acts as a really good stress test for our GPUProcessManager::DisableWebRender() code path: Following a GPU process restart we are unable to create any EGL surfaces, therefore RenderCompositor::Resume() continually raises a WebRenderError::NEW_SURFACE error, and we continually "fall back" and keep destroying the compositor sessions and creating new ones.

This might work a few times, but we eventually run in to this warning (which then asserts in debug builds due to returning IPC_FAIL_NO_REASON).

This is trivially reproducible on Linux by making RenderCompositor{EGL,SWGL,whatever}::Resume() always raise the NEW_SURFACE error and return false, simulating what occurs in bug 1762025.

When the assertion is avoided, that occurs because BrowserChild::ReinitRenderingForDeviceReset() destroys the old layer manager, which in turn causes ContentCompositorBridgeParent::DeallocPWebRenderBridgeParent() to call EraseLayerState(), removing the layersId from the map before it is added again.

Sooner or later, however, and usually sooner, we will end up creating an already-destroyed WebRenderBridgeParent. During BrowserChild::ReinitRendering() this causes WebRenderLayerManager::Initialize() to fail here, meaning we return early before updating PuppetWidget::mWindowRenderer. Therefore on the following BrowserChild::ReinitRenderingForDeviceReset, the WebRenderLayerManager::DoDestroy() call returns early as this is an old layer manager which has already been destroyed. So the WebRenderBridgeParent does not get deallocated, and the layersId does not get erased from the map.

Then boom, we hit this assertion.

When WebRenderLayerManager::Initialize fails in PuppetWidget::CreateRemoteLayerManager, the new layer manager goes out of scope and is destroyed, rather than stored in mWindowRenderer. However, because we did not save a reference to the already-destroyed WebRenderBridgeChild here, WebRenderLayerManager::DoDestroy() does not call WebRenderBridgeChild::Destroy() here. Meaning the parent doesn't get deallocated and we don't erase the layersId. We might actually be leaking the WebRenderBridgeChild and Parent in this case? However if we save a reference to the WebRenderBridgeChild in WebRenderLayerManager::Initialize(), even if it is already destroyed, then that appears to fix this problem.

Does that sound like a reasonable fix to you, Andrew?

Flags: needinfo?(aosmond)

If the compositor is destroyed whilst in the process of being
reinitialized, it is possible for a WebRenderBridgeParent to be
created in an already-destroyed state. In this case, we currently
avoid saving a reference to the WebRenderBridgeChild in
WebRenderLayerManager. However, this means that
WebRenderLayerManager's destruction does not cause the
WebRenderBridgeChild (and therefore the corresponding
WebRenderBridgeParent) to be destroyed, meaning its layer tree ID does
not get erased from the map.

This patch makes us unconditionally store a reference to the
WebRenderBridgeChild in WebRenderLayerManager, meaning it is correctly
destroyed and the layer tree ID is correctly erased from the map.

Assignee: nobody → jnicol
Status: NEW → ASSIGNED

Explanation makes sense to me.

Flags: needinfo?(aosmond)
Pushed by jnicol@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/0fbddbe55d8f
Store reference to WebRenderBridgeChild in WebrenderLayerManager even if already destroyed. r=aosmond
Status: ASSIGNED → RESOLVED
Closed: 2 years ago
Resolution: --- → FIXED
Target Milestone: --- → 100 Branch
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: