ASSERTION: Tried to delete invalid animation: 'Error', file /home/worker/workspace/build/src/gfx/layers/wr/WebRenderBridgeParent.cpp, line 380

RESOLVED FIXED in Firefox 58

Status

()

defect
P1
normal
RESOLVED FIXED
2 years ago
2 years ago

People

(Reporter: vliu, Assigned: pchang)

Tracking

({regression})

unspecified
mozilla58
ARM
Linux
Points:
---
Dependency tree / graph

Firefox Tracking Flags

(firefox-esr52 unaffected, firefox55 unaffected, firefox56 unaffected, firefox57 unaffected, firefox58 fixed)

Details

(Whiteboard: [wr-mvp])

Attachments

(2 attachments)

Found it in [1] in the beginning, it happens in Linux x64 QuantumRender debug build.

[1]: https://treeherder.mozilla.org/#/jobs?repo=try&revision=aa4791c97c18002b53b49f3ef2ef7de49b5459d3&selectedJob=126276492

Currently I can reproduce it in local on Ubuntu, with the following prefs. The way to reproduce it is just run 593243-1.html.

pref("layers.async-pan-zoom.enabled", false);
pref("gfx.webrender.layers-free", true);
pref("layers.acceleration.force-enabled", true);
In my current investergation, when CreateWebRenderCommands was called for nsDisplayTransform and nsDisplayOpacity, it appended this CompositorAnimation for discard with its id. After that, calling AddWebRenderParentCommand for this animation under the matching conditions found. In [2], the web render commans was received and inserted in mActiveAnimations. Please see [3].

In contend side, it has chance to send SendDeleteCompositorAnimations to delete CompositorAnimations in mActiveAnimations for preious transaction. See [4] for detail.

The crash in this case is : When CreateWebRenderCommands was called, mDiscardedCompositorAnimationsIds was actually append with this id. But after that, because matching conditions not found (fail), it didn't call AddWebRenderParentCommand to send this WebRenderParentCommand. Under this situation, when endDeleteCompositorAnimations with carying animation id, some animation missed in mActiveAnimations. Finally it heit ERROR in [5].

[1]: https://searchfox.org/mozilla-central/rev/999385a5e8c2d360cc37286882508075fc2078bd/gfx/layers/AnimationInfo.cpp#73
[2]: https://searchfox.org/mozilla-central/rev/999385a5e8c2d360cc37286882508075fc2078bd/gfx/layers/wr/WebRenderBridgeParent.cpp#684
[3]: https://searchfox.org/mozilla-central/rev/999385a5e8c2d360cc37286882508075fc2078bd/gfx/layers/wr/WebRenderBridgeParent.cpp#689
[4]: https://searchfox.org/mozilla-central/rev/999385a5e8c2d360cc37286882508075fc2078bd/gfx/layers/wr/WebRenderLayerManager.cpp#936
[5]: https://searchfox.org/mozilla-central/rev/999385a5e8c2d360cc37286882508075fc2078bd/gfx/layers/wr/WebRenderBridgeParent.cpp#426
Here was a WIP to fix. But I am not sure it is the correct direction. For those animations with matching not found, I tried to remove it from mDiscardedCompositorAnimationsIds.
Because I am not familiar with animation relatives and the WIP may not consider too much. I would set the owner to Peter to further investigate.
Assignee: nobody → howareyou322
Has STR: --- → yes
Status: NEW → ASSIGNED
Priority: P3 → P1
Whiteboard: [wr-mvp]
Target Milestone: --- → mozilla57
I could reproduce this issue based on comment 0. For every AddAnimationsForProperty calls, it will call ClearAnimation in the beginning and then call AddAnimations later. But this function could return early if there are no compositorAnimations in [2]. Therefore, we end of discarding a non-existed animation id and hit assertion.

In Bug 1373836, I landed a patch to reuse the CompositorAnimationId per layer or AnimationInfo(layers-free). I think we only need to discard the CompositorAnimationId inside ~AnimationInfo(). I will submit a try to see any failures.


[1]http://searchfox.org/mozilla-central/source/layout/painting/nsDisplayList.cpp#614
[2]http://searchfox.org/mozilla-central/source/layout/painting/nsDisplayList.cpp#630
Comment on attachment 8908037 [details]
Bug 1396471 - Keep the running compositorAnimationsId alive on the compositor,

https://reviewboard.mozilla.org/r/179720/#review184930

::: gfx/layers/AnimationInfo.cpp:68
(Diff revision 1)
>  void
>  AnimationInfo::ClearAnimations()
>  {
>    mPendingAnimations = nullptr;
>  
> -  if (mAnimations.IsEmpty() && mAnimationData.IsEmpty()) {
> +  if (mAnimations.IsEmpty()) {

The mAnimationData is the internal data structure for compositor only. Here we just check the mAnimations to clean up compositor animations because mAnimations will be changed when new animations are available.
Comment on attachment 8908037 [details]
Bug 1396471 - Keep the running compositorAnimationsId alive on the compositor,

https://reviewboard.mozilla.org/r/179720/#review185030

I don't see how this solves the problem. If my understanding is correct, the problem is this: on every transaction, we call ClearAnimations() at [1] followed by re-adding the animations (if there are any) at [2]. The ClearAnimations call will clear the animations on the compositor side from the previous transaction and the AddAnimations will add them back for the new transaction. However sometimes there are no animations added (e.g. because of [3]), and so on the next transaction, the ClearAnimations tries to clear an animation that doesn't exist.

I don't see how your patch would change this behaviour. We are still calling ClearAnimations at [1] for each transaction and we're not resetting mCompositorAnimationsId back to 0. So we will still be calling AddCompositorAnimationsIdForDiscard with an animation id that doesn't exist on the compositor side. Am I missing something?

[1] http://searchfox.org/mozilla-central/rev/6326724982c66aaeaf70bb7c7ee170f7a38ca226/layout/painting/nsDisplayList.cpp#614
[2] http://searchfox.org/mozilla-central/rev/6326724982c66aaeaf70bb7c7ee170f7a38ca226/layout/painting/nsDisplayList.cpp#740
[3] http://searchfox.org/mozilla-central/rev/6326724982c66aaeaf70bb7c7ee170f7a38ca226/layout/painting/nsDisplayList.cpp#630
Comment on attachment 8908037 [details]
Bug 1396471 - Keep the running compositorAnimationsId alive on the compositor,

https://reviewboard.mozilla.org/r/179720/#review185030

Yes, you are right about the first paragraph.

The condition in [1] is the reason to cause the assertion problem.
For every AddAnimations calls in [2], the mAnimations will be changed. The mAnimationData is a kind of temporary data for animation sampling. When running the test case, I found we still tried to discard the animation id inside ClearAnimations() with empty mAnimations (which mean no AddAnimations called last time).

Therefore, I think it's correct to check mAnimations only inside ClearAnimations() because it means we got AddAnimations() called before.

Therefore, we should only check mAnimations
[1]http://searchfox.org/mozilla-central/rev/2c9a5993ac40ec1db8450e3e0a85702fa291b9e2/gfx/layers/AnimationInfo.cpp#67
[2]http://searchfox.org/mozilla-central/rev/2c9a5993ac40ec1db8450e3e0a85702fa291b9e2/gfx/layers/AnimationInfo.cpp#57


When writing the comment, I find we won't modify the mAnimationData with layers-free because we only modify it inside AnimationInfo::SetCompositorAnimations with layers exist. I will dobule confirm this again.
Attachment #8908037 - Flags: review?(bugmail)
(In reply to Peter Chang[:pchang] from comment #8)
> 
> When writing the comment, I find we won't modify the mAnimationData with
> layers-free because we only modify it inside
> AnimationInfo::SetCompositorAnimations with layers exist. I will dobule
> confirm this again.

Cancel the review flag before I confirm above statement.
It looks like there was something wrong when I reproduce the problem with attachment 8908037 [details]. Kats, you are right that it can't fix the problem.(Kats, sorry for the incorrect review request)

Now the animation id is reused for every compositor animation. During the testing, I found the same animation could be changed multiple times within one transaction. Since we store the ids in nsArray[1], we could store the same id twice and then it got deleted twice in the compositor. Store the discarded animation id with the unordered_set to avoid the duplication. I will attach the patch later.

[1]http://searchfox.org/mozilla-central/rev/2c9a5993ac40ec1db8450e3e0a85702fa291b9e2/gfx/layers/wr/WebRenderLayerManager.cpp#931
(In reply to Peter Chang[:pchang] from comment #8)
> The
> mAnimationData is a kind of temporary data for animation sampling.

Agreed.

> Therefore, I think it's correct to check mAnimations only inside
> ClearAnimations() because it means we got AddAnimations() called before.

I don't see why this would make a difference - in layers-free mAnimationData should always be empty on the content side, right? So mAnimationData.IsEmpty() should always be true and that clause in the condition is useless. So sure, we can remove it, but it shouldn't result in any behaviour change (at least for layers-free).

> When writing the comment, I find we won't modify the mAnimationData with
> layers-free because we only modify it inside
> AnimationInfo::SetCompositorAnimations with layers exist. I will dobule
> confirm this again.

And even this should only happen on the compositor side, right? Not on the content side, which is where we're calling ClearAnimations().

(In reply to Peter Chang[:pchang] from comment #10)
> During the
> testing, I found the same animation could be changed multiple times within
> one transaction.

This is very surprising to me, I thought the animation only gets updated per-transaction so there shouldn't be more than one of those per transaction. If you find this to be the case please provide more details as I would like to understand why this happens.
(In reply to Kartikaya Gupta (email:kats@mozilla.com) from comment #11)
> (In reply to Peter Chang[:pchang] from comment #8)
> > The
> > mAnimationData is a kind of temporary data for animation sampling.
> 
> Agreed.
> This is very surprising to me, I thought the animation only gets updated
> per-transaction so there shouldn't be more than one of those per
> transaction. If you find this to be the case please provide more details as
> I would like to understand why this happens.

I can reproduce this with debug build on OSX/Linux but not with opt build by running the following cmd.
Maybe this hits the timing issue because of debug build. I can investigate more next week.

./mach reftest  --setpref reftest.ignoreWindowSize=true --setpref gfx.webrender.layers-free=true --setpref gfx.webrender.enabled=true layout/reftests/bugs/593243*
After investigation, I found this issue with the debug build was not always reproduced. And my statement "the same animation could be changed multiple times within one transaction" in comment 10 was wrong.
After bug 1373836, the compositorAnimationsId is reused for every AnimationInfo. It becomes very hard to trace the clear/add animations pairs during the transactions because we defer the animation clean up to next transaction[1].

If I don't reuse the animation id, I couldn't reproduce the assertion.
Finally, I found the cause and please refer the following example at the end of animation transaction.

[Example]
EndTransaction start
....
....
AddAnimation with id=1 20th
EndTransaction end

EndTransaction start
DiscardCompositorAnimation() -> SendDeleteCompositorAnimations for animation(20th)
ClearAnimation with id=1  21th  // In above AddAnimation(20th), it creates new animation data.
AddAnimation with id=1    21th
EndTransaction end

EndTransaction start
DiscardCompositorAnimation() -> SendDeleteCompositorAnimations for animation(21th)
ClearAnimation with id=1 22th  // In above AddAnimation(21th), it creates new animation data.
//No AddAnimation here
EndTransaction end

EndTransaction start
DiscardCompositorAnimation() -> SendDeleteCompositorAnimations for animation(22th)
********** Assertion happened because we didn't have AddAnimation call in last transaction  ********
EndTransaction end

[1]http://searchfox.org/mozilla-central/rev/2ef8bd8a46a02c68ddbb1d5f25fa254dd7be1fbd/gfx/layers/wr/WebRenderLayerManager.cpp#718
Comment on attachment 8908037 [details]
Bug 1396471 - Keep the running compositorAnimationsId alive on the compositor,

https://reviewboard.mozilla.org/r/179720/#review187448

::: gfx/layers/AnimationInfo.cpp:48
(Diff revisions 1 - 2)
>  
>    Animation* anim = mAnimations.AppendElement();
>  
> +  if (mManager->AsWebRenderLayerManager()) {
> +    mManager->AsWebRenderLayerManager()->
> +      KeepCompositorAnimationsIdAlive(mCompositorAnimationsId);

To fix this issue properly, we can remove the id which was added by last clearAnimation call in the mDiscardedCompositorAnimationsIds. It means we keep the id alive on the compositor side.
My patch also fixed the animation problem in http://www.nintendo.com/games/
Comment on attachment 8908037 [details]
Bug 1396471 - Keep the running compositorAnimationsId alive on the compositor,

https://reviewboard.mozilla.org/r/179720/#review187520

This looks like a good fix, thanks!
Attachment #8908037 - Flags: review?(bugmail) → review+
Pushed by kgupta@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/a72e51b44407
Keep the running compositorAnimationsId alive on the compositor, r=kats
https://hg.mozilla.org/mozilla-central/rev/a72e51b44407
Status: ASSIGNED → RESOLVED
Closed: 2 years ago
Resolution: --- → FIXED
Target Milestone: mozilla57 → mozilla58
You need to log in before you can comment on or make changes to this bug.