Remove the timer of MozAfterPaint which brings intermittent test fail of animation.

ASSIGNED
Assigned to

Status

()

Core
Layout
P3
normal
ASSIGNED
2 months ago
2 days ago

People

(Reporter: mantaroh, Assigned: mantaroh)

Tracking

(Blocks: 1 bug)

Trunk
Points:
---

Firefox Tracking Flags

(firefox59 affected)

Details

MozReview Requests

()

Submitter Diff Changes Open Issues Last Updated
Loading...
Error loading review requests:

Attachments

(3 attachments)

(Assignee)

Description

2 months ago
Several tests in animation mochitest uses paint_listener for waiting paint event.[1][2] This paint listener doesn't use transactionId now(for detail, see bug 1264409), so this brings several problems for animation tests.

 * Test might success even if received the wrong paint event.
 * If compositor is delay, paint event of software timer version[2] fired, and test of related with compositor will fail.

This bug will introduce transaction id for paint listener in order to use it on animation tests.
(Assignee)

Comment 1

2 months ago
This patch added transaction id into animation tests:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=d8040299be0b41f1109d9202d88dd295d9a99fa1

However this patch will not fix an intermittent bug(bug 1395971), I need to work following task in order to fix bug 1395971 after landing this:

 * Delete the timer of ensuring eventual did paint event.
 * Modify tests which use the paint listener(i.e. should use transaction id).
   * Especially,  several APZ and Scroll tests fail due to doesn't occur a painting.
(Assignee)

Comment 2

2 months ago
(In reply to Mantaroh Yoshinaga[:mantaroh] from comment #0)
> Several tests in animation mochitest uses paint_listener for waiting paint
> event.[1][2] This paint listener doesn't use transactionId now(for detail,
> see bug 1264409), so this brings several problems for animation tests.
> 
>  * Test might success even if received the wrong paint event.
>  * If compositor is delay, paint event of software timer version[2] fired,
> and test of related with compositor will fail.
> 
> This bug will introduce transaction id for paint listener in order to use it
> on animation tests.

Ah, Sorry for confusion. Bug 1302071 has already fixed the bug 1341294[1]. So I think htat we might not need to check the transaction id on paint_listener.js. I guess the root cause of intermittent test fail(bug 1395971) is software timer of MozAfterPaint. So I will work on it in this bug.


[1] https://hg.mozilla.org/mozilla-central/rev/a535bf3bfff1#l2.170
If that's the case, please mark bug 1341294 as a dupe of bug 1302071 linking to the relevant changeset.
(Assignee)

Comment 4

2 months ago
(In reply to Brian Birtles (:birtles) from comment #3)
> If that's the case, please mark bug 1341294 as a dupe of bug 1302071 linking
> to the relevant changeset.

Thanks!
I've marked it as a dupe.

I'll remove EnsureEventualDidPaintEvent and modify the related code in this bug.
This function will create the timer which fires the MozAfterPaint after attached the MozAfterPaint listener[1]

[1] https://searchfox.org/mozilla-central/rev/919dce54f43356c22d6ff6b81c07ef412b1bf933/layout/base/nsPresContext.cpp#3468-3472
Component: DOM: Animation → Layout
No longer depends on: 1341294
Summary: Introduce transaction id for paint listerner for using it in animation mochitest. → Remove the timer of MozAfterPaint which brings intermittent test fail of animation.
Comment hidden (mozreview-request)
Comment hidden (mozreview-request)
Comment hidden (mozreview-request)

Comment 10

2 months ago
mozreview-review
Comment on attachment 8930792 [details]
Bug 1419226 - Part 3.Wait for MozAfterPaint instead of waiting for frames on file_deferred_start.html .

https://reviewboard.mozilla.org/r/201446/#review207198

::: dom/animation/test/mozilla/file_deferred_start.html:100
(Diff revision 1)
> -  // for paints and only then do we commence the test. Even doing that, this
> -  // test can sometimes pass when it should not due to a stray paint. Most of
> -  // the time, however, it will correctly fail so hopefully even if we do
> +  // that, this test can sometimes pass when it should not due to a stray paint.
> +  // Most of the time, however, it will correctly fail so hopefully even if we
> +  // do occasionally produce a false negative on one platform, another platform
> -  // occasionally produce a false negative on one platform, another platform
>    // will fail as expected.
> -  return waitForDocLoad().then(() => waitForIdle())
> +  return waitForPaints().then(() => {

I thinks we should still wait for document load.  I'd suggest to wait for document load only in the first test case and drop other usage in other test cases.  What do you think?

::: dom/animation/test/mozilla/file_deferred_start.html:105
(Diff revision 1)
> -    // TODO: Current waitForPaint() will not wait for MozAfterPaint in this
> -    // case(Bug 1341294), so this waiting code is workaround for it.
> +      // TODO : This test might run incorrectly due to fire MozAfterPaint based
> +      //        on software timer(100ms). For detail, see bug 1395971.

I don't quite understand this comment.  Could you please elaborate it?  What will happen in bug 1395971?

::: dom/animation/test/mozilla/file_restyles.html:70
(Diff revision 1)
>        resolve(stylingMarkers);
>      });
>    });
>  }
>  
> -function ensureElementRemoval(aElement) {
> +function ensureElementRemoval(aElement, shouldWaitForPaint = true) {

Using boolean argument in multiple argumensts is generally a bad idea.  We shouldn't do it.

::: dom/animation/test/style/file_missing-keyframe-on-compositor.html:160
(Diff revision 1)
> +    SpecialPowers.DOMWindowUtils.advanceTimeAndRefresh(50 * MS_PER_SEC);
>      return waitForPaintsFlushed();
>    }).then(() => {
> -    SpecialPowers.DOMWindowUtils.advanceTimeAndRefresh(50 * MS_PER_SEC);
> -

This change is totally wrong.  Before this change, the test assumes;

1) an animation is sent to the compositor
2) call pause(), thus the animation is updated on the main thread
3) then the animation is re-sent to the compositor
4) advance 50ms on the compositor
5) check the value on the compositor

After this change the time is advanced when the animation is on the main thead, right?
Attachment #8930792 - Flags: review?(hikezoe) → review-
If the fallback timer is now only needed by print preview, rather than leaving it in nsPresContext, it seems better to move it to print preview code?
Comment hidden (mozreview-request)
Comment hidden (mozreview-request)
Comment hidden (mozreview-request)
(Assignee)

Comment 15

2 months ago
(In reply to Brian Birtles (:birtles) from comment #11)
> If the fallback timer is now only needed by print preview, rather than
> leaving it in nsPresContext, it seems better to move it to print preview
> code?

Thanks.
I try to remove the timer from nsPresContxt and add this timer to the print preview code. It seems to work correctly in my local environment. So I updated this patches.
Comment hidden (mozreview-request)
(Assignee)

Comment 17

2 months ago
mozreview-review-reply
Comment on attachment 8930792 [details]
Bug 1419226 - Part 3.Wait for MozAfterPaint instead of waiting for frames on file_deferred_start.html .

https://reviewboard.mozilla.org/r/201446/#review207198

Thank you for the review.

> I thinks we should still wait for document load.  I'd suggest to wait for document load only in the first test case and drop other usage in other test cases.  What do you think?

Yes, I think it will clarify the test. So I drop this change.

> I don't quite understand this comment.  Could you please elaborate it?  What will happen in bug 1395971?

Oh,, Sory. This comment is no longer necessary in this change.

> Using boolean argument in multiple argumensts is generally a bad idea.  We shouldn't do it.

OK. I use element.remove instead of using this flag.

> This change is totally wrong.  Before this change, the test assumes;
> 
> 1) an animation is sent to the compositor
> 2) call pause(), thus the animation is updated on the main thread
> 3) then the animation is re-sent to the compositor
> 4) advance 50ms on the compositor
> 5) check the value on the compositor
> 
> After this change the time is advanced when the animation is on the main thead, right?

Thanks. This change is not need for this bug. This was for previous transaction id.
This test will pass even without changing this test.

Comment 18

2 months ago
mozreview-review
Comment on attachment 8930792 [details]
Bug 1419226 - Part 3.Wait for MozAfterPaint instead of waiting for frames on file_deferred_start.html .

https://reviewboard.mozilla.org/r/201446/#review207278

::: commit-message-a3c8b:5
(Diff revision 3)
> +Bug 1419226 - Part 3.Wait for non-timer MozAfterPaint on an animation mochitest.r?hiro
> +
> +This patch will:
> + * Wait for MozAfterPaint without waiting for another event in test_deferred_start.html.
> + * Remove waiting for MozAfterPaint if paint process doesn't occur in test_restyles.html.

Could you please check when MozAfterPaint is not fired exactly?

Initially I was thinking that removing an element is scrolled-out does not fire MozAfterPaint.  I think this is correct.  But those scrolled-out element is inside another element which has basically scrollbar, so removing the parent should fire MozAfterPaint, right?  Also I am wondering whether removing visibility:hidden element does fire MozAfterPaint or not, since visibility:hidden affects layout,  I guess it should fire (if reflow happens?).
Attachment #8930792 - Flags: review?(hikezoe) → review-
(In reply to Mantaroh Yoshinaga[:mantaroh] from comment #17)
> Comment on attachment 8930792 [details]
> Bug 1419226 - Part 3.Wait for non-timer MozAfterPaint on an animation
> mochitest.
> 
> https://reviewboard.mozilla.org/r/201446/#review207198
> 
> Thank you for the review.
> 
> > I thinks we should still wait for document load.  I'd suggest to wait for document load only in the first test case and drop other usage in other test cases.  What do you think?
> 
> Yes, I think it will clarify the test. So I drop this change.

I think we should change them in this bug.  What I was concerned is that DidComposite happens when nobody listens MozAfterPaint event during waiting for document load.


  const div = addDiv(t, { class: 'target' });

  return waitForDocLoad().then(() => waitForPaints())

  .then(() => {

With this setup, a paint process for 'addDiv' might happens while waiting for document load, right?  So this setup should be

  return waitForDocLoad().then(() => {
    const div = addDiv(t, { class: 'target' });
    return waitForPaints();
  })
Comment on attachment 8930790 [details]
Bug 1419226 - Part 1.Change observing target window of MozAfterPaint.

https://reviewboard.mozilla.org/r/201442/#review207554

Thanks! This probably took care of some bugs because we were attaching the event handler on the wrong window. :/

::: toolkit/content/browser-content.js:594
(Diff revision 2)
>        // will wait for MozAfterPaint event to be fired.
>        let webProgressListener = {
>          onStateChange(webProgress, req, flags, status) {
>            if (flags & Ci.nsIWebProgressListener.STATE_STOP) {
>              webProgress.removeProgressListener(webProgressListener);
> -            let domUtils = content.QueryInterface(Ci.nsIInterfaceRequestor)
> +            let domUtils = contentWindow.QueryInterface(Ci.nsIInterfaceRequestor)

Oof, good catch!
Attachment #8930790 - Flags: review?(mconley) → review+

Comment 21

2 months ago
mozreview-review
Comment on attachment 8930791 [details]
Bug 1419226 - Part 2. Remove notify did paint timer.

https://reviewboard.mozilla.org/r/201444/#review207616
Attachment #8930791 - Flags: review?(matt.woodrow) → review+
Comment hidden (mozreview-request)
Comment hidden (mozreview-request)
Comment hidden (mozreview-request)
(Assignee)

Comment 25

2 months ago
mozreview-review-reply
Comment on attachment 8930792 [details]
Bug 1419226 - Part 3.Wait for MozAfterPaint instead of waiting for frames on file_deferred_start.html .

https://reviewboard.mozilla.org/r/201446/#review207278

Thanks hiro,

I addressed tests code: 
 * drop file_restyles change.
 * change the timing of creating div element.

> Could you please check when MozAfterPaint is not fired exactly?
> 
> Initially I was thinking that removing an element is scrolled-out does not fire MozAfterPaint.  I think this is correct.  But those scrolled-out element is inside another element which has basically scrollbar, so removing the parent should fire MozAfterPaint, right?  Also I am wondering whether removing visibility:hidden element does fire MozAfterPaint or not, since visibility:hidden affects layout,  I guess it should fire (if reflow happens?).

Sorry, this test is same to test_missing-keyframe-on-compoistor.html. i.e. we don't need to modify on this bug.

Comment 26

2 months ago
mozreview-review
Comment on attachment 8930792 [details]
Bug 1419226 - Part 3.Wait for MozAfterPaint instead of waiting for frames on file_deferred_start.html .

https://reviewboard.mozilla.org/r/201446/#review208038

As I suggested in comment 10, we just need to wait for document load in the *first* test case.  Rest of test cases shouldn't wait for the document load since they are promise test, right?

r=me with that change (comments should also be modified).
Attachment #8930792 - Flags: review?(hikezoe) → review+

Updated

2 months ago
Blocks: 1415783

Updated

2 months ago
Blocks: 1395971
No longer depends on: 1395971
Comment hidden (mozreview-request)
Comment hidden (mozreview-request)
Comment hidden (mozreview-request)
(Assignee)

Comment 30

2 months ago
mozreview-review-reply
Comment on attachment 8930792 [details]
Bug 1419226 - Part 3.Wait for MozAfterPaint instead of waiting for frames on file_deferred_start.html .

https://reviewboard.mozilla.org/r/201446/#review208038

Thank you.
I changed to waiting for document load, then create div element after it in first test case only.

Comment 32

2 months ago
Pushed by mantaroh@gmail.com:
https://hg.mozilla.org/integration/autoland/rev/21d9bedcf411
Part 1.Change observing target window of MozAfterPaint. r=mconley
https://hg.mozilla.org/integration/autoland/rev/8cda3fb3ce1a
Part 2. Remove notify did paint timer. r=mattwoodrow
https://hg.mozilla.org/integration/autoland/rev/5a2460c34657
Part 3.Wait for MozAfterPaint instead of waiting for frames on file_deferred_start.html . r=hiro
(In reply to Mantaroh Yoshinaga[:mantaroh] from comment #31)
> Try:
> https://treeherder.mozilla.org/#/
> jobs?repo=try&revision=1491909d4d02e3005643013bc421367f5e1e45a4&group_state=e
> xpanded

I suspect test_bug574663.html is affected by this bug since it uses MozAfterPaint and the intermittent failure (bug 1324160) has never happened for a year. 

https://treeherder.mozilla.org/#/jobs?repo=try&revision=1491909d4d02e3005643013bc421367f5e1e45a4&group_state=expanded&selectedJob=147746596
Flags: needinfo?(mantaroh)
Duplicate of this bug: 1420936
(Assignee)

Comment 38

2 months ago
(In reply to Sebastian Hengst [:aryx][:archaeopteryx] (needinfo on intermittent or backout) from comment #36)
> And these intermittent reftest failures:
>
> https://treeherder.mozilla.org/logviewer.html#?job_id=147773530&repo=autoland
> https://treeherder.mozilla.org/logviewer.html#?job_id=147790611&repo=autoland

Thanks.

This reason is that 'apz-repaints-flushed' events didn't happen.[1] I guess wpt failure is same of this. In any way, I'll look into this more.

[1] https://searchfox.org/mozilla-central/rev/7a8c667bdd2a4a32746c9862356e199627c0896d/layout/tools/reftest/reftest-content.js#680
Flags: needinfo?(mantaroh)

Updated

2 months ago
Priority: -- → P3
(Assignee)

Comment 39

2 months ago
(In reply to Mantaroh Yoshinaga[:mantaroh] from comment #38)
> (In reply to Sebastian Hengst [:aryx][:archaeopteryx] (needinfo on
> intermittent or backout) from comment #36)
> > And these intermittent reftest failures:
> >
> > https://treeherder.mozilla.org/logviewer.html#?job_id=147773530&repo=autoland
> > https://treeherder.mozilla.org/logviewer.html#?job_id=147790611&repo=autoland
> 
> Thanks.
> 
> This reason is that 'apz-repaints-flushed' events didn't happen.[1] I guess
> wpt failure is same of this. In any way, I'll look into this more.
> 
> [1]
> https://searchfox.org/mozilla-central/rev/
> 7a8c667bdd2a4a32746c9862356e199627c0896d/layout/tools/reftest/reftest-
> content.js#680

Oh,, Sorry. This is wrong. Timeout is occurred in next step, i.e. step of waiting to finish. Local test failure log is follow:

REFTEST INFO | Saved log: [CONTENT] MakeProgress: STATE_WAITING_FOR_APZ_FLUSH
REFTEST INFO | Saved log: [CONTENT] AfterPaintListener in file:///home/mantaroh/mozilla-central/layout/reftests/border-image/470250-2.html
REFTEST INFO | Saved log: [CONTENT] SendUpdateCanvasForEvent with 0 rects
REFTEST INFO | Saved log: Updating canvas for invalidation
REFTEST INFO | Saved log: [CONTENT] MakeProgress: apz-repaints-flushed fired
REFTEST INFO | Saved log: [CONTENT] MakeProgress: STATE_WAITING_TO_FINISH
REFTEST INFO | Saved log: [CONTENT] MakeProgress: waiting for MozAfterPaint
REFTEST INFO | Saved log: [CONTENT] MakeProgress: STATE_WAITING_TO_FINISH
REFTEST INFO | Saved log: [CONTENT] MakeProgress: waiting for MozAfterPaint
REFTEST INFO | Saved log: [CONTENT] AfterPaintListener in file:///home/mantaroh/mozilla-central/layout/reftests/border-image/470250-2.html
REFTEST INFO | Saved log: [CONTENT] SendUpdateCanvasForEvent with 1 rects
REFTEST INFO | Saved log: [CONTENT] Rect: 8 8 60 112
REFTEST INFO | Saved log: Updating canvas for invalidation
REFTEST INFO | Saved log: DoDrawWindow 8,8,52,104
REFTEST INFO | Saved log: [CONTENT] MakeProgress: STATE_WAITING_TO_FINISH
REFTEST INFO | Saved log: [CONTENT] MakeProgress: waiting for MozAfterPaint
REFTEST TEST-END | file:///home/mantaroh/mozilla-central/layout/reftests/border-image/470250-2.html == file:///home/mantaroh/mozilla-central/layout/reftests/border-image/470250-2-ref.html
++DOMWINDOW == 10 (0x7fbcc8073c00) [pid = 6575] [serial = 872] [outer = 0x7fbcd9a25470]

reftest.jsm will wait for mozAfterPaint after updating canvas, however it didn't receive MozAfterPaint after updating canvas in this case.

[1] https://searchfox.org/mozilla-central/rev/7a8c667bdd2a4a32746c9862356e199627c0896d/layout/tools/reftest/reftest-content.js#704
(Assignee)

Comment 40

2 months ago
(In reply to Mantaroh Yoshinaga[:mantaroh] from comment #39)
> (In reply to Mantaroh Yoshinaga[:mantaroh] from comment #38)
> > (In reply to Sebastian Hengst [:aryx][:archaeopteryx] (needinfo on
> > intermittent or backout) from comment #36)
> > > And these intermittent reftest failures:
> > >
> > > https://treeherder.mozilla.org/logviewer.html#?job_id=147773530&repo=autoland
> > > https://treeherder.mozilla.org/logviewer.html#?job_id=147790611&repo=autoland
> > 
> > Thanks.
> > 
> > This reason is that 'apz-repaints-flushed' events didn't happen.[1] I guess
> > wpt failure is same of this. In any way, I'll look into this more.
> > 
> > [1]
> > https://searchfox.org/mozilla-central/rev/
> > 7a8c667bdd2a4a32746c9862356e199627c0896d/layout/tools/reftest/reftest-
> > content.js#680
> 
> Oh,, Sorry. This is wrong. Timeout is occurred in next step, i.e. step of
> waiting to finish. Local test failure log is follow:
> 
> REFTEST INFO | Saved log: [CONTENT] MakeProgress: STATE_WAITING_FOR_APZ_FLUSH
> REFTEST INFO | Saved log: [CONTENT] AfterPaintListener in
> file:///home/mantaroh/mozilla-central/layout/reftests/border-image/470250-2.
> html
> REFTEST INFO | Saved log: [CONTENT] SendUpdateCanvasForEvent with 0 rects
> REFTEST INFO | Saved log: Updating canvas for invalidation
> REFTEST INFO | Saved log: [CONTENT] MakeProgress: apz-repaints-flushed fired
> REFTEST INFO | Saved log: [CONTENT] MakeProgress: STATE_WAITING_TO_FINISH
> REFTEST INFO | Saved log: [CONTENT] MakeProgress: waiting for MozAfterPaint
> REFTEST INFO | Saved log: [CONTENT] MakeProgress: STATE_WAITING_TO_FINISH
> REFTEST INFO | Saved log: [CONTENT] MakeProgress: waiting for MozAfterPaint
> REFTEST INFO | Saved log: [CONTENT] AfterPaintListener in
> file:///home/mantaroh/mozilla-central/layout/reftests/border-image/470250-2.
> html
> REFTEST INFO | Saved log: [CONTENT] SendUpdateCanvasForEvent with 1 rects
> REFTEST INFO | Saved log: [CONTENT] Rect: 8 8 60 112
> REFTEST INFO | Saved log: Updating canvas for invalidation
> REFTEST INFO | Saved log: DoDrawWindow 8,8,52,104
> REFTEST INFO | Saved log: [CONTENT] MakeProgress: STATE_WAITING_TO_FINISH
> REFTEST INFO | Saved log: [CONTENT] MakeProgress: waiting for MozAfterPaint
> REFTEST TEST-END |
> file:///home/mantaroh/mozilla-central/layout/reftests/border-image/470250-2.
> html ==
> file:///home/mantaroh/mozilla-central/layout/reftests/border-image/470250-2-
> ref.html
> ++DOMWINDOW == 10 (0x7fbcc8073c00) [pid = 6575] [serial = 872] [outer =
> 0x7fbcd9a25470]
> 
> reftest.jsm will wait for mozAfterPaint after updating canvas, however it
> didn't receive MozAfterPaint after updating canvas in this case.
> 
> [1]
> https://searchfox.org/mozilla-central/rev/
> 7a8c667bdd2a4a32746c9862356e199627c0896d/layout/tools/reftest/reftest-
> content.js#704

I looked into this phenomenon, I guess that we should ensure this event even if display list invalidation doesn't invalidate. (i.e. isMozAfterPaintPending return true since view manager of refresh driver has pending flush, however, this event didn't happen[2])

[1] https://searchfox.org/mozilla-central/rev/7a8c667bdd2a4a32746c9862356e199627c0896d/layout/tools/reftest/reftest-content.js#431
[2] https://searchfox.org/mozilla-central/rev/7a8c667bdd2a4a32746c9862356e199627c0896d/layout/base/nsPresContext.cpp#185

I think that we should add software timer like a print preview into reftest-content.js.

This patch will add software timer which ensures the MozAfterPaint:
https://hg.mozilla.org/try/rev/e5b1466c0da5e4dc4a58e70ba278ba802c4ae7c1

In my local environment, this failure doesn't happen. (try num is 200000). I'll confirm other test cases as well.

Updated

2 months ago
No longer blocks: 1415783
I agree that adding a fallback timer in JS in the reftest harness is preferable to adding it in C++ (where it will affect all tests, including those that don't need/expect it).

One area we might need to check is whether the timer added to the reftest harness will affect crashtests too. I assume it will and it probably is better if it does (since existing crashtests might be relying on that) but we should check.
(In reply to Mantaroh Yoshinaga[:mantaroh] from comment #40)
> I looked into this phenomenon, I guess that we should ensure this event even
> if display list invalidation doesn't invalidate. (i.e.
> isMozAfterPaintPending return true since view manager of refresh driver has
> pending flush, however, this event didn't happen[2])

I am wondering the mismatch between isMozAfterPaintPending flag and MozAfterPaint event is not a problem?  It seems to me that the mismatch might affect mochitest and other tests as well, no?
(In reply to Mantaroh Yoshinaga[:mantaroh] from comment #40)
> I looked into this phenomenon, I guess that we should ensure this event even
> if display list invalidation doesn't invalidate. (i.e.
> isMozAfterPaintPending return true since view manager of refresh driver has
> pending flush, however, this event didn't happen[2])

Oh wait.  I think Mantaroh was misunderstanding here (or, just wrote a different comment what he had in mind).

As per the log in comment 39, MozAfterPaint event was surely fired,  the reftest got stuck while waiting for clearing isMozAfterPaintPending.  Anyway we have to know how this mismatch happens, the mismatch is not intuitive to me.
(Assignee)

Comment 44

2 months ago
(In reply to Hiroyuki Ikezoe (:hiro) from comment #43)
> (In reply to Mantaroh Yoshinaga[:mantaroh] from comment #40)
> > I looked into this phenomenon, I guess that we should ensure this event even
> > if display list invalidation doesn't invalidate. (i.e.
> > isMozAfterPaintPending return true since view manager of refresh driver has
> > pending flush, however, this event didn't happen[2])
> 
> Oh wait.  I think Mantaroh was misunderstanding here (or, just wrote a
> different comment what he had in mind).
> 
> As per the log in comment 39, MozAfterPaint event was surely fired,  the
> reftest got stuck while waiting for clearing isMozAfterPaintPending.  Anyway
> we have to know how this mismatch happens, the mismatch is not intuitive to
> me.

As commented in comment 39, I guess there are cases that doesn't occur the MozAfterPaint even if isMozAfterPaintPending returned true.[1] For example case, nsIWidgetListener of view manager revoked the pending flush, then view manager didn't paint since damaged region is empty.
In this case, I think that isMozAfterPaintPending will return true, then MozAfterPaintPending will not happen.

[1] https://searchfox.org/mozilla-central/rev/9f3bd430c2b132c86c46126a0548661de876799a/view/nsViewManager.cpp#1097

Please correct me, if I am mistaken.
Flags: needinfo?(hikezoe)
(Assignee)

Comment 45

2 months ago
(In reply to Brian Birtles (:birtles) from comment #41)
> I agree that adding a fallback timer in JS in the reftest harness is
> preferable to adding it in C++ (where it will affect all tests, including
> those that don't need/expect it).
> 
> One area we might need to check is whether the timer added to the reftest
> harness will affect crashtests too. I assume it will and it probably is
> better if it does (since existing crashtests might be relying on that) but
> we should check.

Thanks!
I'll check crash tests as well, and I think that I need to change the Marionette harness(lisetener.js) due to same reason.
(https://searchfox.org/mozilla-central/source/testing/marionette/listener.js#1811)
Clearing ni? to me since I did ask Mantaroh on IRC to elaborate what's going on there in the failure case.

What I can tell from the log in comment 39 is reftest tools does not wait for MozAfterPaint event after taking the snapshot (although the log is 'waiting for MozAfterPaint').  It just waits for clearing isMozAfterPaintPending flag.  I've verified this fact locally to run the failure reftest (470250-2.html).  So this is the question what I've been talking.  Why isMozAfterPaintPending flag remains true?  Doesn't it make other tests failure?  Also why does 470250-2.html fail intermittently?
Flags: needinfo?(hikezoe)
(Assignee)

Comment 47

2 months ago
Here's timer:
https://dxr.mozilla.org/mozilla-central/rev/c2248f85346939d3e0b01f57276c440ccb2d16a1/layout/base/PresShell.cpp#3738

I guess that a test waits for the MozAfterPaint forever, if this timer is fired between the Tick() and IsMozAfterPaintPending(). 
I.e. this timer set mViewManagerFlushIsPending to true before next Tick(), then test script ask isMozAfterPaintPending().
(In reply to Mantaroh Yoshinaga[:mantaroh] from comment #47)
> Here's timer:
> https://dxr.mozilla.org/mozilla-central/rev/
> c2248f85346939d3e0b01f57276c440ccb2d16a1/layout/base/PresShell.cpp#3738
> 
> I guess that a test waits for the MozAfterPaint forever,

I wanted to know the fact, not a guess.  Anyway, does 'the MozAfterPaint' mean MozAfterPaint *event*?  I think it's not.
You mean isMozAterPaintPending *flag*, more precisely reftest tools waits for being isMozAfterPaintPending flag cleared [1], right?

[1] https://hg.mozilla.org/mozilla-central/file/38f49346a200/layout/tools/reftest/reftest-content.js#l428
(Assignee)

Comment 49

2 months ago
(In reply to Hiroyuki Ikezoe (:hiro) from comment #48)
> (In reply to Mantaroh Yoshinaga[:mantaroh] from comment #47)
> > Here's timer:
> > https://dxr.mozilla.org/mozilla-central/rev/
> > c2248f85346939d3e0b01f57276c440ccb2d16a1/layout/base/PresShell.cpp#3738
> > 
> > I guess that a test waits for the MozAfterPaint forever,
> 
> I wanted to know the fact, not a guess.  Anyway, does 'the MozAfterPaint'
> mean MozAfterPaint *event*?  I think it's not.
> You mean isMozAterPaintPending *flag*, more precisely reftest tools waits
> for being isMozAfterPaintPending flag cleared [1], right?
> 

isMozAfterPaintPending is not a flag like bool member.
It checks every time :
  * PresContext has unfired MozAfterPaint event(i.e. pending this event) or not.
  * RootPresContext's view manager has the pending flush or not.

In this test failure, the second condition will be true when received MozAfterPaint event, then reftest's status will be 'STATE_WAITING_TO_FINISH' which is waiting for MozAfterPaint event again[1]. Your mentioned check[2] will not start unless receiving the MozAfterPaint event.

If the timer of comment 47 will fire, the second condition of isMozAfterPaintPending will be true due to calling the ScheduleViewManagerFlush()[3]. Then reftest call the isMozAfterPaintPending after this timer finished and status of reftest will be the waiting for the MozAfterPaint event again. 

[1] https://searchfox.org/mozilla-central/rev/be78e6ea9b10b1f5b2b3b013f01d86e1062abb2b/layout/tools/reftest/reftest-content.js#704
[2] https://searchfox.org/mozilla-central/rev/be78e6ea9b10b1f5b2b3b013f01d86e1062abb2b/layout/tools/reftest/reftest-content.js#431
[3] https://searchfox.org/mozilla-central/rev/be78e6ea9b10b1f5b2b3b013f01d86e1062abb2b/layout/base/PresShell.cpp#3751
I am really confused.  If you are right, what happened the log case in comment 39?

REFTEST INFO | Saved log: [CONTENT] SendUpdateCanvasForEvent with 1 rects
REFTEST INFO | Saved log: [CONTENT] Rect: 8 8 60 112
REFTEST INFO | Saved log: Updating canvas for invalidation
REFTEST INFO | Saved log: DoDrawWindow 8,8,52,104
REFTEST INFO | Saved log: [CONTENT] MakeProgress: STATE_WAITING_TO_FINISH
REFTEST INFO | Saved log: [CONTENT] MakeProgress: waiting for MozAfterPaint

The last line shows us the fact that reftest tools reached inside 'if (shouldWaitForPendingPaints())' block [1].  What does it mean?  If your explanation is correct, the log is not a failure case?

[1] https://hg.mozilla.org/mozilla-central/file/38f49346a200/layout/tools/reftest/reftest-content.js#l603
(Assignee)

Comment 51

2 months ago
(In reply to Hiroyuki Ikezoe (:hiro) from comment #50)
> I am really confused.  If you are right, what happened the log case in
> comment 39?
> 
> REFTEST INFO | Saved log: [CONTENT] SendUpdateCanvasForEvent with 1 rects
> REFTEST INFO | Saved log: [CONTENT] Rect: 8 8 60 112
> REFTEST INFO | Saved log: Updating canvas for invalidation
> REFTEST INFO | Saved log: DoDrawWindow 8,8,52,104
> REFTEST INFO | Saved log: [CONTENT] MakeProgress: STATE_WAITING_TO_FINISH
> REFTEST INFO | Saved log: [CONTENT] MakeProgress: waiting for MozAfterPaint
> 
> The last line shows us the fact that reftest tools reached inside 'if
> (shouldWaitForPendingPaints())' block [1].  What does it mean?  If your
> explanation is correct, the log is not a failure case?
> 
> [1]
> https://hg.mozilla.org/mozilla-central/file/38f49346a200/layout/tools/
> reftest/reftest-content.js#l603

As explained to you on F2F, this timeout intermittent mechanism is as follow:

1) 'reftest-content.js' received the 'MozAfterPaint' event when its status is the 'STATE_WAITING_TO_FINISH', then its 'MozAfterEvent' event handler set timeout for running 'MakeProgress' again.[1]

2) A timer of delayed compress paint happen here[2], then set 'mViewManagerFlushIsPending' of refresh driver to true. This flag is used by 'isMozAfterPaintPending()'.

3) A timer of step 1) happen here, then check whether a 'isMozAfterPaintPending' is true or not.[3]
   In this step, 'isMozAfterPaintPending' returns true due to step 2).
   Then 'reftest-content.js' went into the 'STATE_WAITING_TO_FINISH' state.

4) MozAfterPaint event doesn't happen never after this step, then test will timeout.

[1] https://dxr.mozilla.org/mozilla-central/rev/ba283baf4e98aa3a5f45a17981077b98697aa73a/layout/tools/reftest/reftest-content.js#549
[2] https://dxr.mozilla.org/mozilla-central/rev/ba283baf4e98aa3a5f45a17981077b98697aa73a/layout/base/PresShell.cpp#3751
[3] https://dxr.mozilla.org/mozilla-central/rev/ba283baf4e98aa3a5f45a17981077b98697aa73a/layout/base/nsPresContext.cpp#185

I think that we might not need to wait 'MozAfterPaint' event in this step(i.e. STATE_WAITING_TO_FINISH) since target reftests removed 'reftest-wait' attribute in this step. Or.. We need to drop a timer of above step 2), but I'm not sure that to remove this timer affect other code now.
Thanks for the explanation!

(In reply to Mantaroh Yoshinaga[:mantaroh] from comment #51)
> 3) A timer of step 1) happen here, then check whether a
> 'isMozAfterPaintPending' is true or not.[3]
>    In this step, 'isMozAfterPaintPending' returns true due to step 2).
>    Then 'reftest-content.js' went into the 'STATE_WAITING_TO_FINISH' state.
> 
> 4) MozAfterPaint event doesn't happen never after this step,

To be clear, we believe this is the right behavior, right?  I mean there is no need to fire MozAfterPaint at all, more precisely there is no trigger to paint something.

> I think that we might not need to wait 'MozAfterPaint' event in this
> step(i.e. STATE_WAITING_TO_FINISH) since target reftests removed
> 'reftest-wait' attribute in this step. Or.. We need to drop a timer of above
> step 2), but I'm not sure that to remove this timer affect other code now.

So, after some more thought, we should stop the timer to avoid this kind of failures somehow, otherwise we can easily fall into the same pit fall in various test cases (not only reftest, I think).
(Assignee)

Comment 53

2 months ago
(In reply to Hiroyuki Ikezoe (:hiro) from comment #52)
> Thanks for the explanation!
> 
> (In reply to Mantaroh Yoshinaga[:mantaroh] from comment #51)
> > 3) A timer of step 1) happen here, then check whether a
> > 'isMozAfterPaintPending' is true or not.[3]
> >    In this step, 'isMozAfterPaintPending' returns true due to step 2).
> >    Then 'reftest-content.js' went into the 'STATE_WAITING_TO_FINISH' state.
> > 
> > 4) MozAfterPaint event doesn't happen never after this step,
> 
> To be clear, we believe this is the right behavior, right?  I mean there is
> no need to fire MozAfterPaint at all, more precisely there is no trigger to
> paint something.
> 

Yes. This test behavior is correct since isMozAfterPaint returns true.

> > I think that we might not need to wait 'MozAfterPaint' event in this
> > step(i.e. STATE_WAITING_TO_FINISH) since target reftests removed
> > 'reftest-wait' attribute in this step. Or.. We need to drop a timer of above
> > step 2), but I'm not sure that to remove this timer affect other code now.
> 
> So, after some more thought, we should stop the timer to avoid this kind of
> failures somehow, otherwise we can easily fall into the same pit fall in
> various test cases (not only reftest, I think).

Yes, we should test will fail if test uses 'MozAfterPaint' directly or other test suit use it.
Actually, WPT will happen this problem. So I was about to change the marionette, but I think we had better to remove this timer as well. (I'm wondering why we choice 1000 ms..)
(In reply to Mantaroh Yoshinaga[:mantaroh] from comment #53)

> Yes, we should test will fail if test uses 'MozAfterPaint' directly or other
> test suit use it.
> Actually, WPT will happen this problem. So I was about to change the
> marionette, but I think we had better to remove this timer as well. (I'm
> wondering why we choice 1000 ms..)

Note that I don't suggest removing the timer entirely, I'd suggest to stop/cancel the timer somehow.
(Assignee)

Comment 55

2 months ago
(In reply to Mantaroh Yoshinaga[:mantaroh] from comment #53)
> Yes, we should test will fail if test uses 'MozAfterPaint' directly or other
> test suit use it.
> Actually, WPT will happen this problem. So I was about to change the
> marionette, but I think we had better to remove this timer as well. (I'm
> wondering why we choice 1000 ms..)

mattwoodrow,
We create the timer which setting the 'paint compress flag' and call 'PressShell::ScheduleViewManagerFlush()' again after 1000 ms[1]. If this timer occurs, we call the 'nsRefreshDriver::ScheduleViewManagerFlush()', as result of it, test suit will mistake that test should wait for 'MozAfterPaint' event since 'isMozAfterPaint' returns true. This flag will be reset by nsRefreshDriver::Tick().

Now I have a question about compress mode of paint. Should we call the 'nsRefreshDriver::ScheduleViewManagerFlush()' again in this timer? Or Can we know when a timer is no longer need to?

[1] https://hg.mozilla.org/mozilla-central/annotate/ba283baf4e98aa3a5f45a17981077b98697aa73a/layout/base/PresShell.cpp#l3734 (This timer introduced by bug 946952)
Flags: needinfo?(matt.woodrow)
Why don't we get a MozAfterPaint event at the end?

The idea is that if we return true from isMozAfterPaintPending, then we're guaranteeing that we'll fire MozAfterPaint soon.

The implementation of nsPresContext::IsDOMPaintEventPending() records an empty rect invalidation when it returns true, so that we'll have something to send to MozAfterPaint even if we don't actually invalidate anything on the next paint.

So we're returning true from isMozAfterPaintPending (and storing an empty rect invalidation on the pres context), and the reftest harness decides to wait for MozAfterPaint. Then what happens? Do we get the scheduled view manager flush? Why isn't this delivering MozAfterPaint?

Is it because the paint ended up making no changes, we never send anything to the compositor, and then we never get DidComposite back? In that case, I think we need to do something to flush pending MozAfterPaints from the pres context immediately.


I also think we can get rid of the PAINT_DELAYED_COMPRESS code entirely (it was added to workaround a b2g issue), but I think we need to fix MozAfterPaint delivery properly, since there's probably other ways to hit this bug.
Flags: needinfo?(matt.woodrow)
(Assignee)

Comment 57

6 days ago
Thanks, mattwoodrow.

(In reply to Matt Woodrow (:mattwoodrow) from comment #56)
> Why don't we get a MozAfterPaint event at the end?
> 
> The idea is that if we return true from isMozAfterPaintPending, then we're
> guaranteeing that we'll fire MozAfterPaint soon.
> 
> The implementation of nsPresContext::IsDOMPaintEventPending() records an
> empty rect invalidation when it returns true, so that we'll have something
> to send to MozAfterPaint even if we don't actually invalidate anything on
> the next paint.
> 
> So we're returning true from isMozAfterPaintPending (and storing an empty
> rect invalidation on the pres context), and the reftest harness decides to
> wait for MozAfterPaint. Then what happens? Do we get the scheduled view
> manager flush? Why isn't this delivering MozAfterPaint?
> 

Yes, a current implementation will send a MozAfterPaint event with empty rect if the paint ended up making no changes. Several animation tests are using this MozAfterPaint event as a trigger for checking the compositor's animation properties. If gecko consumes a time for sending animation to the compositor, the gecko will fire the MozAfterPaint event due to software timer[1]. That is 'MozAfterPaint' might contain an after painting event which is not finishing paint process. I guess this case affect several tests, like talos/reftest/wpt. Especially, this problem affects to animation tests since animation tests check detail condition after painting content.

So I decided removing this timer(attachment 8930791 [details]), however, it means that test suite(reftest/marionette) should care the case which does not receiving MozAfterPaint. (As mentioned you, it's the case of the paint ended up making no changes) As far as I know, we need to fix the marionette for caring this case.[2]

[1] https://searchfox.org/mozilla-central/rev/7fb999d1d39418fd331284fab909df076b967ac6/layout/base/nsPresContext.cpp#3461-3467
[2] https://searchfox.org/mozilla-central/rev/7fb999d1d39418fd331284fab909df076b967ac6/testing/marionette/listener.js#1683-1692

try:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=6144594c0cbe34fc06317bf2d9db368cc9c3f3cb

mattwoodrow:
I think that user who uses this MozAfterPaint event need to care the case which doesn't happen MozAfterPaint event because current MozAfterPaint contains a fake event which isn't finishing the paint process. So I'd like to remove this fake event and change test suite. What do you think this modification?
Flags: needinfo?(matt.woodrow)
If I understand the Marionette patch correctly it resolves the
promise after 1s, irregardless of whether MozAfterPaint fired.
If MozAfterPaint fires first, the timer isn’t cleared and calls
resolve() after 1s anyway.  Maybe this doesn’t matter because
maybeResolve will have been called first.

(Also note that any change to testing/marionette must also be
reviewed by a Marionette peer.)
WebRender also does not fire MozAfterPaint on some conditions.  It's worth checking.  I guess we can solve this and that bug altogether.
See Also: → bug 1425315
(In reply to Hiroyuki Ikezoe (:hiro) from comment #59)
> WebRender also does not fire MozAfterPaint on some conditions.  It's worth
> checking.  I guess we can solve this and that bug altogether.

The WebRender problem is that we fire too many MozAfterPaint since we don't know when to stop. So it's probably a different problem.
(In reply to Ethan Lin[:ethlin] from comment #60)
> (In reply to Hiroyuki Ikezoe (:hiro) from comment #59)
> > WebRender also does not fire MozAfterPaint on some conditions.  It's worth
> > checking.  I guess we can solve this and that bug altogether.
> 
> The WebRender problem is that we fire too many MozAfterPaint since we don't
> know when to stop. So it's probably a different problem.

Yeah, we've also been seeing unexpected MozAfterPaint events which should not be observed in bug 1395971.  That's the one triggered by the timer.
(In reply to Mantaroh Yoshinaga[:mantaroh] from comment #57)
> I think that user who uses this MozAfterPaint event need to care the case
> which doesn't happen MozAfterPaint event because current MozAfterPaint
> contains a fake event which isn't finishing the paint process. So I'd like
> to remove this fake event and change test suite. What do you think this
> modification?

The current behaviour makes it very easy to write correct JS code, if isMozAfterPaintPending returns true, then you wait for a MozAfterPaint event.

The proposed change means that all callers need to handle the case where the event never gets delivered, which seems annoying and bug prone. We also need to go through all the existing callers (including addons!) and make sure that they are updated correctly to handle this new behaviour.

I'd much prefer if we kept the current behaviour, fixed the bugs where we don't deliver MozAfterPaint normally, and then remove the software time (since it's basically a hack to workaround the bugs).

[1] is at least one place that I'm pretty sure we have this bug. That code runs when we tried to forward changes to the compositor, but it turned out nothing had changed. We return the transaction id to the refresh driver, but nothing will ever tell the pres context to fire the MozAfterPaint event.

The correct fix for this is a bit complicated, since if we're currently waiting for the previous transaction id to be returned from the compositor, we don't want to send the event for those early (or out of order).

I think the correct behaviour is:

* If there are no invalidations on the pres context for the skipped transaction, then do nothing since we must not have promised to anyone that we would fire MozAfterPaint.
* If there are invalidations for the skipped transaction, and also invalidations for an earlier transaction id, then we need to combine them so that the MozAfterPaint events for both gets sent when the earlier one finishes.
* If there are invalidations for the skipped transaction only, then fire MozAfterPaint immediately.



[1] https://searchfox.org/mozilla-central/source/gfx/layers/client/ClientLayerManager.cpp#772
Flags: needinfo?(matt.woodrow)
(Assignee)

Comment 63

2 days ago
Thanks, mattwoodrow.

(In reply to Matt Woodrow (:mattwoodrow) from comment #62)
> (In reply to Mantaroh Yoshinaga[:mantaroh] from comment #57)
> > I think that user who uses this MozAfterPaint event need to care the case
> > which doesn't happen MozAfterPaint event because current MozAfterPaint
> > contains a fake event which isn't finishing the paint process. So I'd like
> > to remove this fake event and change test suite. What do you think this
> > modification?
> 
> The current behaviour makes it very easy to write correct JS code, if
> isMozAfterPaintPending returns true, then you wait for a MozAfterPaint event.
> 
> The proposed change means that all callers need to handle the case where the
> event never gets delivered, which seems annoying and bug prone. We also need
> to go through all the existing callers (including addons!) and make sure
> that they are updated correctly to handle this new behaviour.
> 

I didn't think about add-ons.
Yes, as you mentioned, this proposed fix will affect many tests and add-ons.

> I'd much prefer if we kept the current behaviour, fixed the bugs where we
> don't deliver MozAfterPaint normally, and then remove the software time
> (since it's basically a hack to workaround the bugs).
> 

OK. I'll try it. I don't know why added this timer(100ms), but as far as I can see, several tests of wpt and mochitest will not receive the MozAfterPaint if remove this timer, so we should deliver a MozAfterPaint event in this case. 
(https://treeherder.mozilla.org/#/jobs?repo=try&revision=da9b4cffb73347d6701aa05bf737c80cb10f46af)

> [1] is at least one place that I'm pretty sure we have this bug. That code
> runs when we tried to forward changes to the compositor, but it turned out
> nothing had changed. We return the transaction id to the refresh driver, but
> nothing will ever tell the pres context to fire the MozAfterPaint event.
> 

Thank you for your advice. In my local environment, If added firing MozAfterPaint event in this revoking transaction code, those test will pass without a timeout:
https://hg.mozilla.org/try/rev/6383d2bd34197df023331ed147684a504410ab6a#l3.13
(https://treeherder.mozilla.org/#/jobs?repo=try&revision=1fd06d317ac77ead4941a98b4b63ce4c6de6488d&group_state=expanded)

But, in this result, transition and full screen test are failed on QuantumRender. I'll look into this reason.
You need to log in before you can comment on or make changes to this bug.