Closed Bug 1402439 Opened 7 years ago Closed 7 years ago

Compositor isn't properly notified of animations ending, resulting in infinite composites

Categories

(Core :: Graphics: WebRender, enhancement, P1)

Other Branch
enhancement

Tracking

()

RESOLVED FIXED
mozilla58
Tracking Status
firefox58 --- fixed

People

(Reporter: kats, Assigned: kats)

References

Details

(Whiteboard: [wr-mvp] [gfx-noted])

Attachments

(3 files, 2 obsolete files)

With layers-free webrender, it seems like we composite infinitely. This is bad.

I chased it down a little, and it seems that the compositor never really clears out its list of animations, so at [1] there's always some animations that trigger another composite.

I added some logging on the content side to see what was going on and I think that what's happening is that the animation ends on the content side, and that immediately prevents the {nsDisplayOpacity,nsDisplayTransform}::CreateWebRenderCommands function from getting called (maybe it's not an active layer or display item any more? - haven't verified that yet). However the frame is obviously still alive, and so is the WebRenderAnimationData that was created at [2].

The code that triggers the compositor side to clear the animation is also run on a per-paint basis from CreateWebRenderCommands inside [3] so when that stops running we just end up with this dangling animation on the compositor side that keeps triggering composites for no good reason.

We need a better mechanism to know when animations are done and can be discarded on the compositor side.

[1] http://searchfox.org/mozilla-central/rev/15ce5cb2db0c85abbabe39a962b0e697c9ef098f/gfx/layers/wr/WebRenderBridgeParent.cpp#1092
[2] http://searchfox.org/mozilla-central/rev/15ce5cb2db0c85abbabe39a962b0e697c9ef098f/layout/painting/nsDisplayList.cpp#6262
[3] http://searchfox.org/mozilla-central/rev/15ce5cb2db0c85abbabe39a962b0e697c9ef098f/layout/painting/nsDisplayList.cpp#6264
What's worse is that the animations are coming from the UI process on startup, so they basically stick around forever, regardless of what content you're loading.
We can probably put something into RemoveUnusedAndResetWebRenderUserData that discards the compositor animation when we throw away the WebRenderAnimationData.
Blocks: 1395980
I have some WIPs in progress.
Assignee: nobody → bugmail
Try push on top of the layers-free patches, to ensure it doesn't break that
https://treeherder.mozilla.org/#/jobs?repo=try&revision=2fe452faac99774fd96556ac741e1d95dde17eed

And try push on top of m-c tip to ensure it doesn't break layers-full
https://treeherder.mozilla.org/#/jobs?repo=try&revision=65f6402b2ff979da97c8076896091f113e2d738d
I just checked the try push and noticed that the R4 failure[1] is caused by this patches.

[1] layout/reftests/css-animations/stacking-context-transform-none-with-fill-forwards.html
Kats, I think the infinite composition was caused by we didn't add the animation clean up in [1]. With layers, we did clean up the animation when WebRenderContainerLayer got destroyed[2].

[1]http://searchfox.org/mozilla-central/rev/f6dc0e40b51a37c34e1683865395e72e7fca592c/gfx/layers/AnimationInfo.cpp#24
[2]http://searchfox.org/mozilla-central/source/gfx/layers/wr/WebRenderContainerLayer.h#30
With my patch, I can't reproduce the infinite composition in my local.

> to Ethan Lin[:ethlin] from comment #7)
> I just checked the try push and noticed that the R4 failure[1] is caused by
> this patches.
> 
> [1]
> layout/reftests/css-animations/stacking-context-transform-none-with-fill-
> forwards.html

I didn't see this failure with my patch.
https://treeherder.mozilla.org/#/jobs?repo=try&revision=de10b456ea2d7210aa6b9f98baf2f1ba0c001ec9&selectedJob=133028594(In reply
Comment on attachment 8911342 [details]
Bug 1402439 - Redo how we discard compositor animation ids.

https://reviewboard.mozilla.org/r/182814/#review188238

::: commit-message-ddff2:16
(Diff revision 1)
> +animations in that the compositor side never got notified to discard the
> +IDs. This resulted in infinite composition loops. This patch solves this
> +problem by having any unused WebRenderAnimationData trigger discard of
> +the animation id during destruction. This way, even if the nsDisplayItem
> +is deleted in the middle of the animation we have a fallback mechanism
> +to discard the id.

I think the infinite composition could be solved by cleaning up animation during destruction. Do we need another storage 'ActiveCompositorAnimationIds' in WebRenderLayerManager? Do you have the test case?
Comment on attachment 8911341 [details]
Bug 1402439 - Add some documentation for the compositor animation code.

https://reviewboard.mozilla.org/r/182812/#review188284

Thanks for updating the code comments.
Attachment #8911341 - Flags: review?(howareyou322) → review+
(In reply to Ethan Lin[:ethlin] from comment #7)
> I just checked the try push and noticed that the R4 failure[1] is caused by
> this patches.

What makes you say that? The R4 failure is on the push with layers-free enabled, and it only happens in one of the 5 retriggers of that job. I'm positive it's just one of the intermittents we are seeing with layers-enabled, and not specifically a regression from these patches. The layers-full try push has no failures.

(In reply to Peter Chang[:pchang] from comment #11)
> I think the infinite composition could be solved by cleaning up animation
> during destruction.

We could do that, yes. But overall I felt my changes made the whole process easier to follow. Currently we add the animation id for discard on every single transaction, and then remove it from the list if the animation is continuing (via KeepCompositorAnimationsIdAlive which does a linear search). This seems inefficient. Instead my patch changes the code in nsDisplayTransform/nsDisplayOpacity to this form:

if (.. we have animations ..) {
  ...
  add active compositor animation id
} else {
  discard compositor animation id
}

Which puts the logic in a more easily-understood place. I also don't like the fact that we have WebRenderLayerManager-specific code inside AnimationInfo which is shared/generic code.

> Do we need another storage
> 'ActiveCompositorAnimationIds' in WebRenderLayerManager?

I felt this way was better, even though it uses a little more memory. In practice we shouldn't have a lot of compositor animation ids active at any given time.

> Do you have the test case?

Test case for this bug? Just start Firefox with layers-free enabled. It composites infinitely. You don't even need to load any pages. Or did you mean some other test case?

I still prefer my patch because I think it makes the code clearer and adds more documentation, but your patch works also so I'll give it r+ and leave it to you to decide which fix you want to land.
Comment on attachment 8911653 [details]
Bug 1402439 - Clean up the animations when the retained animation data got destroyed,

https://reviewboard.mozilla.org/r/183052/#review188342
Attachment #8911653 - Flags: review?(bugmail) → review+
Please land my part 1 documentation patch with whichever fix you feel is better, thanks.
(In reply to Kartikaya Gupta (email:kats@mozilla.com) from comment #13)
> (In reply to Ethan Lin[:ethlin] from comment #7)
> > I just checked the try push and noticed that the R4 failure[1] is caused by
> > this patches.
> 
> What makes you say that? The R4 failure is on the push with layers-free
> enabled, and it only happens in one of the 5 retriggers of that job. I'm
> positive it's just one of the intermittents we are seeing with
> layers-enabled, and not specifically a regression from these patches. The
> layers-full try push has no failures.
> 
Today I also found some intermittents related to animation in R8. Probably they are the same cause.

> (In reply to Peter Chang[:pchang] from comment #11)
> > I think the infinite composition could be solved by cleaning up animation
> > during destruction.
> 
> We could do that, yes. But overall I felt my changes made the whole process
> easier to follow. Currently we add the animation id for discard on every
> single transaction, and then remove it from the list if the animation is
> continuing (via KeepCompositorAnimationsIdAlive which does a linear search).
> This seems inefficient. Instead my patch changes the code in
> nsDisplayTransform/nsDisplayOpacity to this form:
> 
> if (.. we have animations ..) {
>   ...
>   add active compositor animation id
> } else {
>   discard compositor animation id
> }
Agree, this logic here is easy to maintain.
> 
> Which puts the logic in a more easily-understood place. I also don't like
> the fact that we have WebRenderLayerManager-specific code inside
> AnimationInfo which is shared/generic code.
> 
With this patch, AnimationInfo doesn't need to handle the add/remove animation ids.
> > Do we need another storage
> > 'ActiveCompositorAnimationIds' in WebRenderLayerManager?
> 
> I felt this way was better, even though it uses a little more memory. In
> practice we shouldn't have a lot of compositor animation ids active at any
> given time.
> 
I think we can just save ActiveCompositorAnimationIds and create a temporary nsTArray for IPC only.
How do you think? I can do it as a follow-up because we need to fix the infinite composition first.

> > Do you have the test case?
> 
> Test case for this bug? Just start Firefox with layers-free enabled. It
> composites infinitely. You don't even need to load any pages. Or did you
> mean some other test case?
> 
> I still prefer my patch because I think it makes the code clearer and adds
> more documentation, but your patch works also so I'll give it r+ and leave
> it to you to decide which fix you want to land.
Let's land your version.
Comment on attachment 8911342 [details]
Bug 1402439 - Redo how we discard compositor animation ids.

https://reviewboard.mozilla.org/r/182814/#review188380
Attachment #8911342 - Flags: review?(howareyou322) → review+
Attachment #8911653 - Attachment is obsolete: true
(In reply to Peter Chang[:pchang] from comment #16)
> > 
> I think we can just save ActiveCompositorAnimationIds and create a temporary
> nsTArray for IPC only.
> How do you think? I can do it as a follow-up because we need to fix the
> infinite composition first.

I'm not sure what you have in mind for this. One reason I made it a member variable of WebRenderLayerManager was because we need it to persist from one transaction to the next. If in transaction 1 we have an animation, and then in transaction 2 the nsDisplayTransform is removed entirely, then the WebRenderAnimationData destructor runs as part of transaction 2, and it needs to know that the compositor animation id was active in transaction 1 in order to discard it safely. So if you're suggesting that we make the ActiveCompositorAnimationIds lifetime restricted to a single transaction we'll need to make some other changes to work around this problem.

> > I still prefer my patch because I think it makes the code clearer and adds
> > more documentation, but your patch works also so I'll give it r+ and leave
> > it to you to decide which fix you want to land.
> Let's land your version.

Ok, I'll land this for now and then we can do additional changes in follow-ups. If we can get rid of the extra ActiveCompositorAnimationIds without regressing anything that would be great.
Pushed by kgupta@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/26fb75dd655c
Add some documentation for the compositor animation code. r=pchang
https://hg.mozilla.org/integration/autoland/rev/4480c624a554
Redo how we discard compositor animation ids. r=pchang
(In reply to Kartikaya Gupta (email:kats@mozilla.com) from comment #13)
> (In reply to Ethan Lin[:ethlin] from comment #7)
> > I just checked the try push and noticed that the R4 failure[1] is caused by
> > this patches.
> 
> What makes you say that? The R4 failure is on the push with layers-free
> enabled, and it only happens in one of the 5 retriggers of that job. I'm
> positive it's just one of the intermittents we are seeing with
> layers-enabled, and not specifically a regression from these patches. The
> layers-full try push has no failures.

I opened the file directly and I can reproduced the problem with the patch, though I didn't check peter's patch.
(In reply to Ethan Lin[:ethlin] from comment #20)
> I opened the file directly and I can reproduced the problem with the patch,
> though I didn't check peter's patch.

Hm, interesting. I will try to reproduce locally and investigate. Thanks!
Attached patch Logging patchSplinter Review
Here is a logging patch that applies on top of the other two patches, it helped me trace the flow of the compositor animation ids. Keeping it here in case it comes in handy later.
I'm not able to reproduce the stacking-context-transform-none-with-fill-forwards.html failure locally :/
Status: NEW → ASSIGNED
Priority: P3 → P1
Whiteboard: [gfx-noted][wr-mvp][triage] → [wr-mvp] [gfx-noted]
I just did some debugging and this patch works for me. I haven't read whole changes in this bug. Please feel free to rework the patch or land it.
Ah I see, I can reproduce the problem if I just load the file directly in the browser. And yes, your patch is correct. We need to reset the animationsId passed to the stacking context for both nsDisplayTransform and nsDisplayOpacity when the animation ends. I'll spin this patch out to another bug since I can't push it to autoland without that.
Comment on attachment 8911868 [details] [diff] [review]
Reset the animation id if there is no animation.

I moved this patch to bug 1402925 (and expanded it to do the same for nsDisplayOpacity).
Attachment #8911868 - Attachment is obsolete: true
https://hg.mozilla.org/mozilla-central/rev/26fb75dd655c
https://hg.mozilla.org/mozilla-central/rev/4480c624a554
Status: ASSIGNED → RESOLVED
Closed: 7 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla58
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: