Closed Bug 1669239 Opened 4 years ago Closed 4 years ago

Schedule the next tick to paint in PuppetWidget::Paint instead of painting directly

Categories

(Core :: Performance, defect, P3)

defect

Tracking

()

RESOLVED FIXED
83 Branch
Tracking Status
firefox83 --- fixed

People

(Reporter: sefeng, Assigned: sefeng)

References

Details

Attachments

(1 file)

The WillPaintWindow call is problematic for FirstContentfulPaint
algorithm because it causes paints that outside of ticks, which would
revoke the viewer flush for the next tick, hence no firstContentful
paint entries are fired.

The proposed solution here is instead of doing the paint directly in
PuppetWidget::Paint, we ask the next tick to do it.

The WillPaintWindow call is problematic for FirstContentfulPaint
algorithm because it causes paints that outside of ticks, which would
revoke the viewer flush for the next tick, hence no firstContentful
paint entries are fired.

However, this WillPaintWindow call is also useful because it allows
the parent to aware that the layers are updated, so that the parent
can fire corresponding events based on the status. If we don't do it,
parent may end up not knowing that the layers are updated, so some
tests may stall because of the missing events.

The proposed solution here is instead of doing the paint directly in
PuppetWidget::Paint, we ask the next tick to do it.

Attachment #9179621 - Attachment description: Bug 1669239 - Schedule the next tick to paint in PuppetWidget::Paint instead of painting directly → Bug 1669239 - Call Tick to paint in PuppetWidget::Paint instead of using WillPaintWindow
Pushed by sefeng@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/604e2561311f
Call Tick to paint in PuppetWidget::Paint instead of using WillPaintWindow r=emilio,mattwoodrow
Status: NEW → RESOLVED
Closed: 4 years ago
Resolution: --- → FIXED
Target Milestone: --- → 83 Branch
Regressions: 1671592
Regressions: 1672139

Sean, there's an improvement that seems to be caused by your patch, but around this revision it's also a build failure that prevents me from figuring out for certain what's the paych that caused the improvement. Does the improvement below makes sense to be caused by this patch?
== Change summary for alert #27381 (as of Mon, 26 Oct 2020 21:12:38 GMT) ==

Improvements:

Ratio Suite Test Platform Options Absolute values (old vs new)
4% tresize macosx1014-64-shippable e10s stylo 13.02 -> 12.52

For up to date results, see: https://treeherder.mozilla.org/perfherder/alerts?id=27381

Flags: needinfo?(sefeng)

Yeah, the test basically measures the time for window.resizeTo by using MozAfterPaint event, which is indeed something that this patch can impact. Ahthough I am a bit suprised of the improvement, but I think it's possible.

Flags: needinfo?(sefeng)

Actually, if I look more closely, there's a slight improvement in results and the graph became bimodal.

So I tend ore to say that the improvement might be a regression in terms of graph stability. What do you say? Does this make sense to you?

Flags: needinfo?(sefeng)

Yeah, it makes sense to call it an regression in terms of graph stability. This makes sense since the patch started to schedule a paint at where it was painting directly.

I am not worrying about it since the overall performance is better. Would it be okay for you?

Flags: needinfo?(sefeng)

Sorry for replying this late. I am a bit worried as the noise is one of the main factors that are preventing us to identify the regression cause.:sefen

Flags: needinfo?(sefeng)

Wait, did this test stopped running on autoland since more than a month ago?

This patch indeed makes the paint from sync to async, so it makes sense when the tests become nosier. I am not sure if there are things that we can do to improve the graph stability.

Alexandru, the links seem to be broken so I can't check the current status.

Flags: needinfo?(sefeng)

I think I get what you meant. Are you worrying that we might not be able to identify future regressions due to the test becomes nosier? I wish we can still identify regressions for nosier benchmarks....

Honestly, I don't know if there are things that we can do to improve it. What do we usually do for cases like this?

Flags: needinfo?(sefeng) → needinfo?(aionescu)

The performance team is trying to stick the sheriffed tests as stable as possible. A stable test means also a healthy one. I'm not sure what you can do, cases like this don't happen very often and there's no procedure for them. I would look for way to reduce noise, maybe a better approach for the code.
Can you explain is a simple way what is this patch doing, please? My gut feeling is that there's a tradeoff (well, it's always a tradeoff, haha).

Flags: needinfo?(aionescu) → needinfo?(sefeng)

So PuppetWidget::Invalidate was a method that was being used for tab switches and tab resizes. It would make a paint and this paint was a prerequisite for a bunch of painting related events to fire. These events were used in many places, I believe this particular test uses MozAfterPaint.

The issue with this approach was that this paint was executed outside of the Tick, so it messed up first-contentful-paint timing because the spec required paints to happen within the Tick.

So instead of doing paints directly from PuppetWidget::Invalidate, this patch changed it to schedule paints to run at the next Tick by scheduling a Tick.

Since this patch actually scheduled paints to run later than before, it's not clear to me how the performance got improved. I wonder if it's caused by the scheduled paint actually makes MozAfterPaint to be fired earlier because we started to schedule a tick earlier.

Does it sound good?

Flags: needinfo?(sefeng)
Regressions: 1689156
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: