Closed Bug 1675614 Opened 5 years ago Closed 4 years ago

Investigate reducing main thread paint latency by using instant refresh ticks in some scenarios

Tracking

()

Status:

RESOLVED FIXED

Milestone:

89 Branch

Tracking Flags:

Tracking

Status

firefox89

---

fixed

People

(Reporter: mstange, Assigned: mattwoodrow)

References

(Blocks 1 open bug, Regressed 1 open bug)

Details

(Keywords: perf-alert)

Attachments

(3 files)

Bug 1675614 - Don't set the most recent refresh time of the driver when changing timers. r?mstange 4 years ago Matt Woodrow (:mattwoodrow) 48 bytes, text/x-phabricator-request		Details \| Review
Bug 1675614 - Try doing a 'catch-up' refresh driver tick if we get a paint request part way into a tick, and haven't yet painted that tick. r?mstange 4 years ago Matt Woodrow (:mattwoodrow) 48 bytes, text/x-phabricator-request		Details \| Review
Bug 1675614 - Process spellcheck in chunks in the parent process, since it's also async. r?m_kato 4 years ago Matt Woodrow (:mattwoodrow) 48 bytes, text/x-phabricator-request		Details \| Review

Markus Stange [:mstange]

Reporter

Description

•

5 years ago

•

Edited

Our regular main thread paints are currently triggered from vsync notifications: the paint happens within a refresh tick, and the refresh driver is controlled by vsync notifications. The original idea behind this is to make sure we don't paint more frequently than the vsync rate.

This means that, whenever a "paint originating event" for a paint occurs, we wait for the next vsync notification before we paint. This introduces up to 16.7ms (on a 60Hz display) of latency, depending on where exactly within the vsync interval the event happened.

This affects keyboard latency, as can be seen in the following profile: https://share.firefox.dev/3mU9o7c
The profile shows keyboard latency numbers between 23ms and 38ms. This "latency" number is the delta between the key event's native timestamp and the end of the composite. Here's the latency breakdown for one of the events, which has a latency of 29.4ms:

3.1ms of latency between the system timestamp and the key event handler in the content process
7.8ms wait time for the refresh tick <-- This is what this bug aims to reduce
0.5ms refresh tick, which includes the paint
0.4ms until the paint has arrived on the compositor thread
15.7ms wait time for the next vsync
1.9ms of compositing

I believe we can address the main thread latency for this use case while still making sure we only have one paint per vsync interval.

The basic implementation idea is the following:

If the refresh driver hasn't ticked in a while, allow an extra "early tick".
Otherwise, keep the current behavior.

Or, more specifically: Maintain a piece of state that says whether we've ticked in the current "vsync interval". If EnsureTimerStarted() is called at a point where we haven't yet had a tick in the current vsync interval, dispatch a high-priority runnable for the tick, and don't wait for vsync. But the next tick will need to wait for vsync, unless enough time elapses that we're in the next vsync interval. Here's some pseudocode.
This "early tick" can also be considered a "late tick" that's borrowed from the previous vsync: We didn't tick at the last vsync time, so we take the hypothetical refresh tick that could have happened at that point, and shift it over to the current time.

I think this will help reduce keyboard latency, at least in the case where no other animations are going on, when nothing else is ticking the refresh driver: The paint will happen earlier, so it will arrive at the compositor earlier, so we may catch a composite one frame earlier to present the result. For the example event from earlier, I'm hoping for this latency breakdown:

3.1ms of latency between the system timestamp and the key event handler in the content process
no wait time for the refresh tick
0.5ms refresh tick, which includes the paint
0.4ms until the paint has arrived on the compositor thread
6.9ms wait time for the next vsync (catching the previous vsync now!)
1.9ms of compositing

for a total of 12.8ms delay, rather than the original 29.4ms. That's a reduction by 16.6ms, one full vsync interval.

Potential concerns:

More work due to more refresh ticks: This idea can lead to extra refresh ticks if there are multiple tick triggers within one vsync interval. The first trigger will get a refresh tick for itself, and the remaining triggers will be processed by the refresh tick at the next vsync. This is one more tick than in the past, and it causes extra work that otherwise could have been coalesced into one refresh tick: it could be an extra layout flush, or firing an extra non-coalesced mouse move event, etc.
Variable frame latency: Another idea behind the "paint on vsync" strategy was to ensure uniform frame latency for paints that complete within a budget of one vsync interval. Handling some paints early will impact this uniformity because the "early paints" will potentially have one less frame of latency.
Delayed processing of other events: If multiple paint triggers happen within one vsync interval, the early tick for the first trigger might take so long that it delays the tick that processes the rest of the triggers. For example, if three images finish decoding in quick succession, this proposal will cause the early paint to paint just the first image, and the two other images will be painted in the next paint. If the early paint is quick, then the next tick happens at vsync time and the paint for the two other images happens at the same time as it would have happened in the past. But if the early paint is slow, then the next tick is delayed, and the other two images paint later than before.

I believe the idea is worth trying despite the concerns. I'm not worried about frame uniformity, because uniformity only really matters during animations. And this proposal does not affect what happens during animations, because there's have a consistent rate of refresh ticks and no refresh tick pauses during those. The extra paint only happens at the beginning of such a stream of ticks.
Similarly, it also doesn't affect coalesced work during continuous streams of events; for example, when the mouse is moving, we will do just as much coalescing of mouse move events as before, except for the first event of the series. The only extra work happens at the beginning of the streams.
As for the "delayed follow-up paint" issue, that is the one issue that is slightly worrying me. However, it's something that can already happen depending on timing, if vsync happens to fire between the events.

Markus Stange [:mstange]

Reporter

Updated

•

5 years ago

Summary: Investigate reducing main thread paint latency using instant refresh ticks, in some scenarios → Investigate reducing main thread paint latency by using instant refresh ticks in some scenarios

Kartikaya Gupta (email:kats@mozilla.staktrace.com)

Comment 1

•

5 years ago

Do you have data on how often keyboard inputs would be able to take advantage of this speedup? i.e. when a person types, will all of their keypresses get sped up (because the inter-keypress delay is high and nothing else is ticking the refresh driver)?

Markus Stange [:mstange]

Reporter

Comment 2

•

5 years ago

Good question, I don't think we have that data. But we have telemetry on keyboard latency so we should hopefully see those numbers go down once this is finished. My intuition is that it'll help on simple pages like bugzilla but not on pages with lots of ads. I've also seen profiles of the google search form where they were running requestAnimationFrame the entire time and not doing anything with it.

Matt Woodrow (:mattwoodrow)

Assignee

Updated

•

4 years ago

Assignee: nobody → matt.woodrow

Matt Woodrow (:mattwoodrow)

Assignee

Comment 3

•

4 years ago

Attached file Bug 1675614 - Don't set the most recent refresh time of the driver when changing timers. r?mstange — Details

I can't see a reason to overwrite the most recent refresh time of the driver with a timestamp at which we didn't tick (no tests seem to depend on it), and we need to not do this in order to be able to tell later on if we've ticked this driver for the current timer's tick.

Matt Woodrow (:mattwoodrow)

Assignee

Comment 4

•

4 years ago

Attached file Bug 1675614 - Try doing a 'catch-up' refresh driver tick if we get a paint request part way into a tick, and haven't yet painted that tick. r?mstange — Details

Depends on D109690

Matt Woodrow (:mattwoodrow)

Assignee

Comment 5

•

4 years ago

Attached file Bug 1675614 - Process spellcheck in chunks in the parent process, since it's also async. r?m_kato — Details

Depends on D109691

Pulsebot

Comment 6

•

4 years ago

Pushed by mwoodrow@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/24d789ec28a6 Don't set the most recent refresh time of the driver when changing timers. r=mstange https://hg.mozilla.org/integration/autoland/rev/bd899813fd0c Try doing a 'catch-up' refresh driver tick if we get a paint request part way into a tick, and haven't yet painted that tick. r=mstange https://hg.mozilla.org/integration/autoland/rev/cc33a0848d85 Process spellcheck in chunks in the parent process, since it's also async. r=m_kato

Atila Butkovits

Comment 7

•

4 years ago

bugherder

https://hg.mozilla.org/mozilla-central/rev/24d789ec28a6
https://hg.mozilla.org/mozilla-central/rev/bd899813fd0c
https://hg.mozilla.org/mozilla-central/rev/cc33a0848d85

Status: NEW → RESOLVED

Closed: 4 years ago

status-firefox89: --- → fixed

Resolution: --- → FIXED

Target Milestone: --- → 89 Branch

Andreea Pavel [:apavel]

Updated

•

4 years ago

Regressions: 1701168

Alexandru Ionescu (needinfo me) [:alexandrui]

Updated

•

4 years ago

Regressions: 1701866

Alexandru Ionescu (needinfo me) [:alexandrui]

Comment 8

•

4 years ago

(In reply to Pulsebot from comment #6)

Pushed by mwoodrow@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/cc33a0848d85
Process spellcheck in chunks in the parent process, since it's also async.
r=m_kato
== Change summary for alert #29476 (as of Sun, 28 Mar 2021 21:13:01 GMT) ==

Improvements:

Ratio	Suite	Test	Platform	Options	Absolute values (old vs new)
10%	tabpaint		linux1804-64-shippable-qr	e10s stylo webrender	47.15 -> 42.56

For up to date results, see: https://treeherder.mozilla.org/perfherder/alerts?id=29476

Alexandru Ionescu (needinfo me) [:alexandrui]

Updated

•

4 years ago

Keywords: perf-alert

Emilio Cobos Álvarez [:emilio]

Updated

•

4 years ago

Regressions: 1702466

Olli Pettay [:smaug][bugs@pettay.fi]

Updated

•

4 years ago

Regressions: 1617469

Henrik Skupin [:whimboo][⌚️UTC+1]

Updated

•

4 years ago

Regressions: 1673938

Markus Stange [:mstange]

Reporter

Comment 9

•

4 years ago

Bug 1708325 extended the technique from this bug to allow one "extra" tick per frame, if a tick is requested during user input events.

Blocks: 1708325

Timothy Nikkel (:tnikkel)

Updated

•

4 years ago

Regressions: 1718220

Russell Haley

Comment 10

•

4 years ago

I found this through the perf blog, and I think maybe this technique, or something similar, could also remove the frame of lag introduced by Bug 1375949?

Olli Pettay [:smaug][bugs@pettay.fi]

Updated

•

4 years ago

Regressions: 1712123

Kaiido

Comment 11

•

4 years ago

Is it expected that this also affects requestAnimationFrame?
A lot of user-land scripts do use this method as a delay until next paint, e.g as a mean to throttle high frequency async events. Since this change, these will not work anymore, because now requestAnimationFrame from a non-animated document is just a setTimeout(0).

Also, now the second call to rAF in a loop will be random (on my 60Hz monitor between 10ms and 26ms). https://jsfiddle.net/bpxhug2q/

Note that Chrome also has a bug here, but in worse, their second call to rAF is not even waiting for the next painting frame, meaning they've got two callbacks for the same frame...

Markus Stange [:mstange]

Reporter

Comment 12

•

4 years ago

Yes, that's expected. Yes, the delay will be 0 on a non-animated document, on the first call. But repeated calls will still be throttled to the refresh rate. Or are you saying that you get a delay of zero even on repeated calls, if the document is not animated?

Kaiido

Comment 13

•

4 years ago

No this is indeed the behavior I see, I am saying that this is an unexpected and surprising behavior though, and that many user-land scripts do expect that a call to requestAnimationFrame indeed delays their callback until the next painting frame, even from a non-animated document.

Markus Stange [:mstange]

Reporter

Comment 14

•

4 years ago

I guess I still don't understand how that missing could have an undesirable outcome.

(In reply to Kaiido from comment #13)

many user-land scripts do expect that a call to requestAnimationFrame indeed delays their callback until the next painting frame

And that's still the case - the next paint also happens without a delay.

Alexandru Trif, Desktop Test Engineering [:atrif]

Updated

•

4 years ago

Regressions: 1744227

Timothy Nikkel (:tnikkel)

Updated

•

3 years ago

Regressions: 1783681

You need to log in before you can comment on or make changes to this bug.