Closed Bug 1223249 Opened 4 years ago Closed 4 years ago

Intermittent animate-layer-scale-inherit-3.html | image comparison (==), max difference: 255, number of differing pixels: 10000

Categories

(Core :: CSS Parsing and Computation, defect)

45 Branch
defect
Not set

Tracking

()

RESOLVED FIXED
mozilla47
Tracking Status
firefox45 --- wontfix
firefox46 --- fixed
firefox47 --- fixed
firefox-esr45 --- fixed

People

(Reporter: KWierso, Assigned: hiro)

References

Details

(Keywords: intermittent-failure)

Attachments

(1 file)

I've got a replay of this in teenux (it only took me 10 minute to capture). I picked it cause it was near the top of reftest intermittent:
rr replay ~/mozilla-central/tree/rr-recording-1223249/latest-trace

If we can locate someone who has time and wants to use RR to solve this I can grant them access to this machine.

ni?RyanVM on maybe having a better idea who might be a good candidate to dig through this replay? If I'm wrong just cancel the request.
Flags: needinfo?(ryanvm)
Brian might be interested in debugging this?
Flags: needinfo?(bbirtles)
Not sure I'll have a chance in the next 2 weeks but would be interested after that. Adding Hiro too in case he's interested.
Sure, I like oranges! But at first glance, 100s duration [1] is too long compared to the timeout value[2].

[1] https://dxr.mozilla.org/mozilla-central/rev/e355cacefc881ba360d412853b57e8e060e966f4/layout/reftests/transform/animate-layer-scale-inherit-3.html#11
[2] https://dxr.mozilla.org/mozilla-central/rev/e355cacefc881ba360d412853b57e8e060e966f4/layout/reftests/transform/animate-layer-scale-inherit-3.html#39

I think we should use shorter duration, say, 1s as other animate-layer-scale-inherit-X do.
Ni? to dbaron since he wrote the the tests.
Flags: needinfo?(ryanvm)
Flags: needinfo?(bbirtles)
Thank you, roc!
(In reply to Hiroyuki Ikezoe (:hiro) from comment #16)
> Sure, I like oranges! But at first glance, 100s duration [1] is too long
> compared to the timeout value[2].
> 
...
> 
> I think we should use shorter duration, say, 1s as other
> animate-layer-scale-inherit-X do.

I think the timeout is for the snapshot which is supposed to happen while the animation is still in play on the compositor (but with transform:none). Unless you're suggesting the snapshot might sometimes happen during the initial 10ms window where the scale is 0.2?
Sorry, ignore my previous comment; I was misunderstanding what the test was doing.

From the one screenshot I looked at, it does indeed appear that the reftest snapshot is being taken while the transform is scale(0.2).

I guess one possibility is to make a shorter animation that iterates with alternating direction, and take the snapshot in response to the animationiteration event.  Under high load, I suppose it's still possible that compositor and main threads could be out-of-sync in that case, though.
Component: Layout → CSS Parsing and Computation
How about using a super longer duration with step-start and requestAnimationFrame like this?:

#inner { animation: HoldTransform step-start 1000000s }
@keyrames HoldTransform {
 0% { transform: scale(0.2) }
 0.01%, 100% { transform: none }
};

var animation = document.getElementById("inner").getAnimations()[0];
animation.ready.then(function() {
  requestAnimationFrame(function() {
    document.documentElement.classList.remove("reftest-wait");
  });
});

This works fine locally.
(In reply to Hiroyuki Ikezoe (:hiro) from comment #21)
> How about using a super longer duration with step-start and
> requestAnimationFrame like this?:

100s will be sufficient, if it fails, it's a real timeout I think.
With step-start, even the first frame should be treated as transform:none[1] so I don't think this will end up testing what we want it to.

I think we're just failing because we're waiting 100ms but we're timing it from animationstart which happens before we do painting and set up animations on the compositor. Under load it's possible we might run setTimeout callback on the main thread before the animation has been running on the compositor for 10ms.

So I think introducing requestAnimationFrame like you did in this patch is good. I wonder if waiting one or two frames on the main thread *then* doing the 100ms setTimeout would be enough to ensure the compositor has reached the transform:none part of the animation.

[1] See diagram here: https://drafts.csswg.org/css-transitions/#transition-timing-function-property
Comment on attachment 8719590 [details]
MozReview Request: Bug 1223249 - Use step-end with reverse direction to avoid intermittent failures. r?dbaron

(In reply to Brian Birtles (:birtles) from comment #24)
> With step-start, even the first frame should be treated as transform:none[1]
> so I don't think this will end up testing what we want it to.
> 
> I think we're just failing because we're waiting 100ms but we're timing it
> from animationstart which happens before we do painting and set up
> animations on the compositor. Under load it's possible we might run
> setTimeout callback on the main thread before the animation has been running
> on the compositor for 10ms.
> 
> So I think introducing requestAnimationFrame like you did in this patch is
> good. I wonder if waiting one or two frames on the main thread *then* doing
> the 100ms setTimeout would be enough to ensure the compositor has reached
> the transform:none part of the animation.
> 
> [1] See diagram here:
> https://drafts.csswg.org/css-transitions/#transition-timing-function-property

Thank you, Brian.  I thought step-start means (0,1].  I am still figuring out we can avoid any timeouts somehow.
Attachment #8719590 - Flags: review?(dbaron)
OK, I think we can use step-end with reverse direction.  It is little bit tricky, but does not depend on step-start behavior.
Comment on attachment 8719590 [details]
MozReview Request: Bug 1223249 - Use step-end with reverse direction to avoid intermittent failures. r?dbaron

Review request updated; see interdiff: https://reviewboard.mozilla.org/r/35007/diff/1-2/
Attachment #8719590 - Attachment description: MozReview Request: Bug 1223249 - Use step-start to avoid intermittent failures. r?dbaron → MozReview Request: Bug 1223249 - Use step-end with reverse direction to avoid intermittent failures. r?dbaron
Attachment #8719590 - Flags: review?(dbaron)
Did you test that the reftest still fails if you revert the change in the patch that added the reftest (i.e., to GetScaleForValue in https://hg.mozilla.org/mozilla-central/rev/3021b9c9cb84)?

(If it doesn't still fail... did it still fail on mozilla-central without your changes to the test?)
Flags: needinfo?(hiikezoe)
No, I did not test it at all.  I should have done it.
Now I tried it.  Reverting the change makes the test failure on both cases (with or without attachment 8719590 [details]).
Flags: needinfo?(hiikezoe)
Comment on attachment 8719590 [details]
MozReview Request: Bug 1223249 - Use step-end with reverse direction to avoid intermittent failures. r?dbaron

https://reviewboard.mozilla.org/r/35007/#review32013

::: layout/reftests/transform/animate-layer-scale-inherit-3.html:25
(Diff revision 2)
> -  0% { transform: scale(0.2) }
> +   * NOTE: This frames will be used reverse direction.

"This frames" -> "These keyframes"

r=dbaron with that, although you should probably do a few retriggers on a try run to double-check for intermittency (or maybe test in rr's chaos mode a few times)
Attachment #8719590 - Flags: review?(dbaron) → review+
Comment on attachment 8719590 [details]
MozReview Request: Bug 1223249 - Use step-end with reverse direction to avoid intermittent failures. r?dbaron

Review request updated; see interdiff: https://reviewboard.mozilla.org/r/35007/diff/2-3/
https://reviewboard.mozilla.org/r/35007/#review32013

Thank you, dbaron.  I've been running the test with chaos mode locally, but I just pushed a try with retriggers.
https://treeherder.mozilla.org/#/jobs?repo=try&revision=d4e50e23219f
Hello sheriffs,  Though the test in this bug did not fail at all, there were lots of new reftest failures on the try on Linux ASAN build.  It's worth filing all of oranges?

An example:
https://treeherder.mozilla.org/logviewer.html#?job_id=16871862&repo=try
Flags: needinfo?(dbaron) → needinfo?(ryanvm)
Keywords: checkin-needed
There's a class of known ASAN reftest failures due to GTK3 widget drawing issues, so as long as that's what's showing in the screenshots (i.e. scrollbar/button/etc differences being highlighted), I wouldn't worry about it.
Flags: needinfo?(ryanvm)
Thank you, Ryan!  That's what I wanted to know.
https://hg.mozilla.org/mozilla-central/rev/b635a8067101
Status: NEW → RESOLVED
Closed: 4 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla47
You need to log in before you can comment on or make changes to this bug.