Closed Bug 1395971 Opened 2 years ago Closed 2 years ago

Intermittent dom/animation/test/mozilla/test_deferred_start.html | Starting an animation with a delay starts from the correct point

Categories

(Core :: DOM: Animation, defect, P3)

defect

Tracking

()

RESOLVED FIXED
mozilla62
Tracking Status
firefox57 --- wontfix
firefox61 --- fixed
firefox62 --- fixed

People

(Reporter: aryx, Assigned: mantaroh)

References

Details

(Keywords: intermittent-failure)

Attachments

(5 files)

This showed up twice on the same push but for different platforms (OSX and Windows) but retriggers (3x each) returned green.

https://treeherder.mozilla.org/logviewer.html#?job_id=127732133&repo=autoland
https://treeherder.mozilla.org/logviewer.html#?job_id=127728654&repo=autoland

13:40:05     INFO -  562 INFO None563 INFO TEST-START | dom/animation/test/mozilla/test_deferred_start.html
13:40:05     INFO -  GECKO(9980) | ++DOMWINDOW == 58 (00000258CB850800) [pid = 4352] [serial = 207] [outer = 00000258CB8AA800]
13:40:05     INFO -  GECKO(9980) | ++DOCSHELL 00000258CB8AC000 == 13 [pid = 4352] [id = {4bc0072c-45fa-4bfa-965e-94c7873550a5}]
13:40:05     INFO -  GECKO(9980) | ++DOMWINDOW == 59 (00000258CB8AD800) [pid = 4352] [serial = 208] [outer = 0000000000000000]
13:40:05     INFO -  GECKO(9980) | [Parent 9980] WARNING: [nsFrameLoader] ReallyStartLoadingInternal tried but couldn't show remote browser.
13:40:05     INFO -  GECKO(9980) | : file z:/build/build/src/dom/base/nsFrameLoader.cpp, line 895
13:40:05     INFO -  GECKO(9980) | ++DOMWINDOW == 60 (00000258CB8A1000) [pid = 4352] [serial = 209] [outer = 00000258CB8AD800]
13:40:05     INFO -  GECKO(9980) | ++DOMWINDOW == 61 (00000258CC770800) [pid = 4352] [serial = 210] [outer = 00000258CB8AD800]
13:40:06     INFO -  TEST-INFO | started process screenshot
13:40:06     INFO -  TEST-INFO | screenshot: exit 0
13:40:06     INFO -  Buffered messages logged at 13:40:05
13:40:06     INFO -  564 INFO TEST-PASS | dom/animation/test/mozilla/test_deferred_start.html |  - : Elided 1 passes or known failures.
13:40:06     INFO -  Buffered messages finished
13:40:06    ERROR -  565 INFO TEST-UNEXPECTED-FAIL | dom/animation/test/mozilla/test_deferred_start.html | Starting an animation with a delay starts from the correct point - Starting an animation with a delay starts from the correct point: assert_equals: Got a valid transform matrix on the compositor (got: "") expected 6 but got 0
Thanks for reporting this.

I'll look into it. Perhaps, this is related to bug 1366603.
Assignee: nobody → mantaroh
Status: NEW → ASSIGNED
Is bug 1341294 related, perhaps?
(In reply to Brian Birtles (:birtles) from comment #3)
> Is bug 1341294 related, perhaps?

Yes, I think the root cause is the bug 1341294 .
The workaround of bug 1366603 was not enough to fix this test failure for.

This test looks like to wait for the 'MozAfterPaint' event. The condition of waiting for this event equals that presshell's view manager needs a flush.[1][2] But this check won't work well if animation uses the compositor.
Hence, actually, this test doesn't wait for this event.

[1] http://searchfox.org/mozilla-central/rev/4d8e389498a08668cce9ebf6232cc96be178c3e4/testing/mochitest/tests/SimpleTest/paint_listener.js#60
[2] http://searchfox.org/mozilla-central/rev/4d8e389498a08668cce9ebf6232cc96be178c3e4/layout/base/nsPresContext.cpp#168

Bug 1366603 used the 'rAF', but this number of tick was not enough. I think that we had better to use the waitForAnimationFrames(2) before waiting for 'MozAfterPaint' in order to ensure that compositor sends 'DidComposite' message.(Or use Animation.ready)
Do'h! I pushed earlier.

> Bug 1366603 used the 'rAF', but this number of tick was not enough. I think
> that we had better to use the waitForAnimationFrames(2) before waiting for
> 'MozAfterPaint' in order to ensure that compositor sends 'DidComposite'
> message.(Or use Animation.ready)

If you are not going to fix bug 1366603, I think we should check isRunningOnCompositor flag instead of waiting for one more requestAnimationFrame  (Or Animation.ready) just like attachment 8904857 [details].
(In reply to Hiroyuki Ikezoe (:hiro) from comment #6)
> Do'h! I pushed earlier.
> 
> > Bug 1366603 used the 'rAF', but this number of tick was not enough. I think
> > that we had better to use the waitForAnimationFrames(2) before waiting for
> > 'MozAfterPaint' in order to ensure that compositor sends 'DidComposite'
> > message.(Or use Animation.ready)
> 
> If you are not going to fix bug 1366603, I think we should check
> isRunningOnCompositor flag instead of waiting for one more
> requestAnimationFrame  (Or Animation.ready) just like attachment 8904857 [details]
> [details].

Thanks.
I rewrite the patch using 'isRunningOnCompositor'.
This failure is relatively rare, so I think we should fix bug 1341294 instead of this workaround. (Sorry I meant bug 1341294 in comment 6)
(In reply to Hiroyuki Ikezoe (:hiro) from comment #9)
> This failure is relatively rare, so I think we should fix bug 1341294
> instead of this workaround. (Sorry I meant bug 1341294 in comment 6)

Thanks.

If we can use the code of waiting for 'isRunningOnCompositor' as suggested you, I think that we can use it instead using 'MozAfterPaint'. (i.e. This bug independent from bug 1341294).

I am sorry if my interpretation of this test purpose is mistaken.
Flags: needinfo?(hikezoe)
Comment on attachment 8904887 [details]
Bug 1395971 - Wait for few frames until sending target animation to the compositor.

I cleared the review flag since I should update this.
Attachment #8904887 - Flags: review?(hikezoe)
If the MozAfterPaint setup works as we expected (I believe it should!), isRunningOnCompositor check is not necessary here, right?
Flags: needinfo?(hikezoe)
To be more clear, isRunningOnCompositor flag is set right before we send an animation to the compositor (i.e. it's set before we composite the animation on the compositor), whereas, MozAfterPaint *should* be received after something was composited on the compositor (e.g. in animation case, an animation is composited on the compositor).  So, if MozAfterPaint is supposed to be work as we expected, the test works fine without any workarounds.
Priority: -- → P3
Depends on: 1341294
This failure is getting more frequent for these two days unfortunately.
(In reply to Hiroyuki Ikezoe (:hiro) from comment #15)
> This failure is getting more frequent for these two days unfortunately.

As discussed with you, there are two problems:

 * The 'mViewManagerFlushIsPending'[1] is false when this test calls waitForPaints().
 * We should use 'lastTransactionID' like Talos.[2]

[1] http://searchfox.org/mozilla-central/rev/2c9a5993ac40ec1db8450e3e0a85702fa291b9e2/layout/base/nsRefreshDriver.h#218
[2] http://searchfox.org/mozilla-central/rev/2c9a5993ac40ec1db8450e3e0a85702fa291b9e2/browser/base/content/tab-content.js#819-830
I tried experimental implementation:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=f13101cec3363f5a4d1feb68bf89ed90c6959f60

Very rarely,I hit a two problems on my local environment(Ubuntu + XVFB / RefreshDriver of Software Timer[layout.frame_rate=10]):
 * Test timeout since main process doesn't notify the MozAfterPaint with incremented transactionId.
 * TEST-UNEXPECTED since compositor has not animation property yet when received the corresponded MozAfterPaint.

If I run this test without frame_rate preference, this test failure did not occur(try count is 2000).

I'm not sure why gecko doesn't issue the MozAfterPaint in this case for now. I'll investigate this reason.
I clarify a little this problem:

Gecko will issue the timer which fire the after paint event for fallback if we don't get paint events for some reason.[1] This timer will wake up after 100ms, if we don't get DidComposite message from compositor, this timer will fire the MozAfterPaint.

[1] http://searchfox.org/mozilla-central/rev/40e8eb46609dcb8780764774ec550afff1eed3a5/layout/base/nsPresContext.cpp#3450-3456

This problem is caused by this timer. If I tried to change this interval time to 300ms, I can't reproduce this phenomenon even if 5FPS.
Nice find!
Blocks: 1302071
(In reply to Mantaroh Yoshinaga[:mantaroh] from comment #24)
> I clarify a little this problem:
> 
> Gecko will issue the timer which fire the after paint event for fallback if
> we don't get paint events for some reason.[1] This timer will wake up after
> 100ms, if we don't get DidComposite message from compositor, this timer will
> fire the MozAfterPaint.
> 
> [1]
> http://searchfox.org/mozilla-central/rev/
> 40e8eb46609dcb8780764774ec550afff1eed3a5/layout/base/nsPresContext.cpp#3450-
> 3456
> 
> This problem is caused by this timer. If I tried to change this interval
> time to 300ms, I can't reproduce this phenomenon even if 5FPS.

The reason of this 100ms timer is issuing MozAfterPaint even if target is off screen. (for detail, see bug 612190 comment 2)
(In reply to Mantaroh Yoshinaga[:mantaroh] from comment #26)
> (In reply to Mantaroh Yoshinaga[:mantaroh] from comment #24)
> > I clarify a little this problem:
> > 
> > Gecko will issue the timer which fire the after paint event for fallback if
> > we don't get paint events for some reason.[1] This timer will wake up after
> > 100ms, if we don't get DidComposite message from compositor, this timer will
> > fire the MozAfterPaint.
> > 
> > [1]
> > http://searchfox.org/mozilla-central/rev/
> > 40e8eb46609dcb8780764774ec550afff1eed3a5/layout/base/nsPresContext.cpp#3450-
> > 3456
> > 
> > This problem is caused by this timer. If I tried to change this interval
> > time to 300ms, I can't reproduce this phenomenon even if 5FPS.
> 
> The reason of this 100ms timer is issuing MozAfterPaint even if target is
> off screen. (for detail, see bug 612190 comment 2)

OK, I tried following fix: (The try result is [1])

 1) Remove the 100ms timer.
     We can drop this software timer of EnsureEventualDidPaintEvent() since I think that we don't use this timer now. 
     But the test of the print preview will fail[2] due to mozAfterPaint didn't occur if we remove this timer. The simplify function of print preview uses mozAfterPaint[3]. But current implementation used software timer version instead of real mozAfterPaint.

 2) Introduce transaction id to paint_listener.js
     Current paint_listener doesn't use transaction id, so we can't distinguish the related listener when received mozAfterPaint.
     We should introduce transaction id to paint_listener.js in order to distinguish each listener and mozAfterPaint.


[1] https://treeherder.mozilla.org/#/jobs?repo=try&revision=61267951a88f9a499c0a2562fa3d9d16d78e8c03
[2] https://treeherder.mozilla.org/#/jobs?repo=try&revision=c2653865b01de512e1cff6163ab33e83b8abfa42
[3] http://searchfox.org/mozilla-central/rev/7e090b227f7a0ec44d4ded604823d48823158c51/toolkit/content/browser-content.js#598
Thanks for doing this. Do we need a separate version of paint_listener.js? Can we just make paint_listener.js always check the transaction ID?
See Also: → 1415783
this has been disabled, please remember to enable it while working on a patch
If content doesn't have a paint listener, gecko will never call EnsureEventualDidPaintEvent() since gecko checks that content has paint listener before calling this function.[1][2] 

i.e. the condition of creating timer is:
 * content has paint listener(listen 'mozAfterPaint').
 * parent's ancestor pres context doesn't fire a paint event(i.e. mFireAfterPaintEvents is true).

[1] https://searchfox.org/mozilla-central/rev/bab833ebeef6b2202e71f81d225b968283521fd6/layout/base/PresShell.cpp#6401
[2] https://searchfox.org/mozilla-central/rev/bab833ebeef6b2202e71f81d225b968283521fd6/layout/painting/nsDisplayList.cpp#2417


Try:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=80786be9eb7909d27552716aeffd878f96dca126
I discussed with Hiro,
This change will bring test fail of other tests. I already found test fail of dom/animation.

dom/animation/test/mozilla/file_restyle.html :
 This test will wait for mozAfterPaint of an unthrottled element[1]. So if I remove software timer of mozAfterPaint, this test will time out.

dom/animation/test/style/file_missing-keyframe-on-compositor.html :
 This test will wait for mozAfterPaint even if spite of using useTestRefreshMode. For example, this test waits for MozAfterPaint after calling Animation.pause().[2] But this test use useTestRefreshMode, so MozAfterPaint will not occur here. As result of it, this test will timeout.

[1] https://searchfox.org/mozilla-central/rev/9bab9dc5a9472e3c163ab279847d2249322c206e/dom/animation/test/mozilla/file_restyles.html#70-75
[2] https://searchfox.org/mozilla-central/rev/9bab9dc5a9472e3c163ab279847d2249322c206e/dom/animation/test/style/file_missing-keyframe-on-compositor.html#159-160

I addressed this test, but I'm not sure that how many tests will fail. This try will clarify this fails of this type.
https://treeherder.mozilla.org/#/jobs?repo=try&revision=d05a65bc2a1840852c61f0f5d1846a86d7830f9f
Thanks for the try!

(In reply to Mantaroh Yoshinaga[:mantaroh] from comment #34)
> I discussed with Hiro,
> This change will bring test fail of other tests. I already found test fail
> of dom/animation.
> 
> dom/animation/test/mozilla/file_restyle.html :
>  This test will wait for mozAfterPaint of an unthrottled element[1]. So if I
> remove software timer of mozAfterPaint, this test will time out.

To be clear what I guessed during the discussion is the element in the test is out-of-view so nothing should be painted there, then a MozAfterPaint for removing the element is not fired in ensureElementRemoval().
Can we move this discussion to bug 1341294?

As I understand it, we're aiming to fix that bug first, then fix this bug, right?
Blocks: 1419226
Mantaroh told me on IRC that bug 1419226 fixes this.  After landing bug 1419226, we have to enable this test here in bug 1363957  (The test was disabled in that bug).
No longer blocks: 1419226
Depends on: 1419226
Duplicate of this bug: 1426017
I believe this failure might be reduced by bug 1458457.
See Also: → 1458457
Now the patches for bug 1458457 have been landed on autoland.  I am 99% convinced that the patches fix this failure.  I am going to close this on the next Monday if there will appear no new failures after the patches.
Flags: needinfo?(hikezoe)
Still happens. :/
Flags: needinfo?(hikezoe)
OK, I can reproduce the failure even with the patches for bug 1458457 and bug 1459895.  Actually those bugs reduce frequent of this failure, but do not completely fix.  Unfortunately there is a case where |readyTime| is ahead of |aCurrentTime|.  I guess what happened there is that when we call getOMTAStyle() in the test on the main-thead, we synchronously call CompositorBridgeParent::ApplyAsyncProperties() and this function ends up calling SampleAnimations() with the last compose time on the compositor.  Meanwhile, on the compositor, we surely received the transform animation data in question on the compositor, but at that moment we haven't processed the next tick for the initial composition for the transform animation yet.   That means that the last compose time is actually the previous time stamp.

So a fundamental problem here is that, getOMTAStyle() is a synchronous call, whereas every tick on the compositor happens asynchronously.

A solution I can think of is that we do use |readyTime| for such animations anyway when we call SampleAnimations() through getOMTAStyle().  This should fix completely this failure.
After some more thought, we shouldn't do the way in comment 60, it a wallpaper for the fake MozAfterPaint.
I suspected the failure case happens when the initial styling for the animation in question was skipped.

https://treeherder.mozilla.org/#/jobs?repo=try&revision=2372fcbb8fc42d39b1fd546667dc7f605851db54

And now I am almost convinced that's the case.
Comment on attachment 8982359 [details]
Bug 1395971 - Use arrow function in file_deferred_start.html.

https://reviewboard.mozilla.org/r/248306/#review254554
Attachment #8982359 - Flags: review?(bbirtles) → review+
Comment on attachment 8982360 [details]
Bug 1395971 - Use await/async in file_deferred_start.html.

https://reviewboard.mozilla.org/r/248308/#review254556
Attachment #8982360 - Flags: review?(bbirtles) → review+
Comment on attachment 8982361 [details]
Bug 1395971 - Wait for animation being ready to restyle.

https://reviewboard.mozilla.org/r/248310/#review254558
Attachment #8982361 - Flags: review?(bbirtles) → review+
Thanks for the quick review!
Pushed by hikezoe@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/c31c15686c98
Use arrow function in file_deferred_start.html. r=birtles
https://hg.mozilla.org/integration/autoland/rev/6653e98d4a73
Use await/async in file_deferred_start.html. r=birtles
https://hg.mozilla.org/integration/autoland/rev/e878bf09bf22
Wait for animation being ready to restyle. r=birtles
https://hg.mozilla.org/mozilla-central/rev/c31c15686c98
https://hg.mozilla.org/mozilla-central/rev/6653e98d4a73
https://hg.mozilla.org/mozilla-central/rev/e878bf09bf22
Status: ASSIGNED → RESOLVED
Closed: 2 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla62
No new failure happened since landing the patches.
You need to log in before you can comment on or make changes to this bug.