Closed Bug 1760618 Opened 3 years ago Closed 3 years ago

After dragging a tab with an off-main-thread animation into a new window, the old window keeps compositing at 60fps forever

Categories

(Core :: Graphics: WebRender, defect)

defect

Tracking

()

RESOLVED FIXED
100 Branch
Performance Impact high
Tracking Status
firefox100 --- fixed

People

(Reporter: mstange, Assigned: sotaro)

References

(Blocks 2 open bugs)

Details

(Keywords: perf:resource-use)

Attachments

(3 files)

Attached file OMTA testcase

Steps to reproduce:

  1. Open the attached testcase in a tab and switch to that tab.
  2. Drag the tab out of the tab bar to create a new window from it.
  3. Close that new window.

Expected results:
The parent window should stop triggering composites.

Actual results:
Dragging the tab out of the window prints the following in debug builds:

[Parent 30464, Compositor] ###!!! ASSERTION: Tried to delete invalid animation: 'Error', file /Users/mstange/code/mozilla/gfx/layers/wr/OMTASampler.cpp:204

And from that point on, the original window will trigger composites indefinitely and use up CPU and power.

Flags: needinfo?(hikezoe.birchill)

Thank you Markus for filing this bug. This sounds quite bad. I will investigate what's going on. That said, I am going to do NI to sotaro as well since I believe he's more familiar with the code in question rather then I.

Flags: needinfo?(hikezoe.birchill) → needinfo?(sotaro.ikeda.g)

I did accidentally drop my NI. (I thought I did uncheck it, but it seems not).

Flags: needinfo?(hikezoe.birchill)

Moving into WebRender component since I believe this is more or less an issue in WebRender backend.

And I think the stuck is probably due to an assertion FocusState::IsCurrent. I don't see the stuck on release builds. Bug 1659439 is the closest issue for the assertion because the STR in bug 1659439 comment 0 involves dragging tab between browser windows.

Component: DOM: Animation → Graphics: WebRender
See Also: → 1659439

I tried to reproduce the problem on Windows, Linux and MacOS. But I could not reproduce the problem, is there something I am missing? Does the problem happen also on release build?

I was able to reproduce it in a regular Nightly build on macOS with a fresh profile.

Reproduces on a Linux opt build with a "WRRende~ckend#1" thread and the Renderer thread continuing to use 4-5% of a cpu each, and firefox and Compositor, "BHMgr Monitor", GLXVsyncThread, threads using lesser cpu.

Disabling (temporarily or not) compositing in the window manager (and forcing an invalidation through a focus change) is sufficient to stop the cpu usage.

Ah, I also misunderstood the symptom like :hiro.

From Matrix,

gosh! I think I was misunderstanding the stuck mean. I was thinking it's stuck at somewhere our composition work so that no further composition happens, but in fact it means it keeps compositing, right? karlt's comment realized me. :)

It seems that we failed to remove all animation by WebRenderBridgeParent::UpdateWebRender(). The ClearAnimationResources() seemed not not remove all animations.

Thanks to karlt's comment above and thanks to miko's help to understand our display list building, I think now I understand what's going on.

  1. When a tab is moved to a new browser window, OMTASampler::ClearActiveAnimations gets called via AdoptChild IPC call
  2. in the mean time, the display item for the animation in question persists
  3. then in the first paint, CreateOrRecycleWebRenderUserData returns a new WebRenderAnimationData instance rather than re-using the previous one, thus we assign a new animation id for the animation, thus we do call DeleteCompositorAnimations with the previous animation id, then the "Tried to delete invalid animation" happens

I'd think, given that display items persist after moving between browser windows, we don't need to call ClearActiveAnimations at 1)?

Flags: needinfo?(hikezoe.birchill)

FWIW, though I haven't further investigated about the stuck issue, my guess is that when the tab is detached from the original browser window, we don't clear the animation in question, thus the OMTASample instance for the original window keeps the animation id in question. I suppose sotaro has already noticed it though.

Summary: After dragging a tab with an off-main-thread animation into a new window, the old window is stuck compositing forever → After dragging a tab with an off-main-thread animation into a new window, the old window keeps compositing at 60fps forever

(I've rephrased the bug title and description, sorry for the ambiguous wording!)

In CompositorBridgeParent::RecvAdoptChild(), NotifyChildCreated() is called before WebRenderBridgeParent::UpdateWebRender() call. It is not good.

In the UpdateWebRender(), WebRenderBridgeParent::ClearAnimationResources() is called. But in ClearAnimationResources(), OMTASampler is got from new CompositorBridgeParent. In this case, OMTASampler needs to be got from old CompositorBridgeParent.

The NotifyChildCreated() needs to be called after the UpdateWebRender() in CompositorBridgeParent::RecvAdoptChild().

Flags: needinfo?(sotaro.ikeda.g)

(In reply to Sotaro Ikeda [:sotaro] from comment #12)

In the UpdateWebRender(), WebRenderBridgeParent::ClearAnimationResources() is called. But in ClearAnimationResources(), OMTASampler is got from new CompositorBridgeParent. In this case, OMTASampler needs to be got from old CompositorBridgeParent.

It sounds to me that some APZ related functions have the same issue, WebRenderBridgeParent::UpdateAPZFocusState for example. Does it cause bug 1659439? In the APZ cases we don't clear resources thingie in RecvAdoptChild() though. CCing botond just FYI.

Gah, CCing botond was dropped by an comment collision.

Assignee: nobody → sotaro.ikeda.g

(In reply to Hiroyuki Ikezoe (:hiro) from comment #14)

It sounds to me that some APZ related functions have the same issue, WebRenderBridgeParent::UpdateAPZFocusState for example. Does it cause bug 1659439?

I think it's a potential diagnosis for bug 1659439, yeah. Thanks for making the connection. I commented on that bug so we don't forget if we work on it.

Pushed by sikeda.birchill@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/bb74776bc24c Call NotifyChildCreated() after WebRenderBridgeParent::UpdateWebRender() in CompositorBridgeParent::RecvAdoptChild() r=gfx-reviewers,aosmond
Performance Impact: --- → P1
Status: NEW → RESOLVED
Closed: 3 years ago
Resolution: --- → FIXED
Target Milestone: --- → 100 Branch

Hi Sotaro, thanks for fixing this! I don't see a test included in the patch (maybe it's elsewhere?), and I think it would be nice to ensure this doesn't regress. I've done a quick attempt at writing a browser chrome mochitest for this fix that I'm attaching as a proof of concept to show what I had in mind. I verified locally that it passes with your fix included and fails without, but haven't pushed this to try.

Did you have thoughts about testing that I didn't see?

Flags: needinfo?(sotaro.ikeda.g)

Thank you! I did not create a test for it. For testing the situation, it might be better to check omta value directly than "ChromeUtils.vsyncEnabled()".
:hiro knows omta very well.

:hiro, is it possible to create a test that check omta value directly for testing the situation?

Flags: needinfo?(sotaro.ikeda.g) → needinfo?(hikezoe.birchill)

I don't think we need to make sure the OMTA value there. What the mochitest Florian posted is trying to looks basically reasonable to me.

(Current ChromeUtils.vsyncEnabled() returns a global state, if it returns a per-window state, it would be nice? I am totally unsure though)

Flags: needinfo?(hikezoe.birchill)
Blocks: 1690673
No longer blocks: 1742797

It looks like the needinfo chain stopped somehow. :sotaro, are you taking care of the test?

Flags: needinfo?(sotaro.ikeda.g)

I think it would be nice that Florian pushes the mochitest to phabricator to ask review. I am happy to do r+ on that.

Note that writing this kind of browser mochitest would be much appreciated, specifically for Gecko core developers.

Depends on: 1764812

Florian landed the mochitest in bug 1764812. Thank you, Florian!

Flags: needinfo?(sotaro.ikeda.g)
Blocks: 1764812
No longer depends on: 1764812
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: