Open Bug 1814951 Opened 1 year ago Updated 3 months ago

[speedometer] setTimeout throttling on Windows has latency of 15ms vs 4ms for Chrome

Categories

(Core :: Performance, defect)

Performance Impact none

People

(Reporter: tcampbell, Assigned: jlink)

References

(Depends on 1 open bug, Blocks 1 open bug)

Details

(Whiteboard: [sp3])

When chaining setTimeout(0) calls, the HTML spec requires that we clamp the timeout to a minimum of 4ms once calls nest more than five levels deep. When testing this on Firefox builds on Windows, we typically see a latency closer to 16ms.

Simple example: https://jsfiddle.net/sqf2r0wo/

The 16ms is the default Windows timer resolution / kernel tick, which is 15.625ms (64Hz) on a stock system.
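To make the quantization concrete, here's a minimal standalone sketch (mine, not from this bug) that times Sleep(1) with QueryPerformanceCounter. At the default resolution each iteration should report roughly 15.6ms; once something raises the resolution (the profiler, another process) it drops toward 1-2ms:

// Sketch: observe the kernel tick by timing Sleep(1). At the default
// 64Hz resolution each sleep rounds up to the next ~15.6ms tick.
#include <windows.h>
#include <cstdio>

int main() {
    LARGE_INTEGER freq, start, end;
    QueryPerformanceFrequency(&freq);
    for (int i = 0; i < 10; ++i) {
        QueryPerformanceCounter(&start);
        Sleep(1);  // requests 1ms; the actual wake-up is tick-granular
        QueryPerformanceCounter(&end);
        printf("Sleep(1) took %.2f ms\n",
               1000.0 * (end.QuadPart - start.QuadPart) / freq.QuadPart);
    }
    return 0;
}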

Turning on profiling causes the latency issue to go away, which is expected given how the profiler works.

Opening this bug to gather evidence as we evaluate if we need to revisit timer design.

(Also, credit to Doug for re-raising this issue :)

On a Linux machine, I experimented with modifying Chrome builds to adjust the throttle from 4ms to 16ms and looked at the impact on Speedometer 2.1. With that experiment, the Speedometer score drops from 145 to 140, so there continues to be sensitivity to setTimeout scheduling in the benchmark.

(Bug 1808582 covers some notes on building local optimized Chrome builds.)

See https://randomascii.wordpress.com/2020/10/04/windows-timer-resolution-the-great-rule-change/ for some interesting discussion of this. It's not directly on topic, but it mentions off-hand what Chrome does: it sets the global Windows timer interrupt period to 1ms, or 8ms when on battery.
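For illustration, here's a hedged sketch of that policy (my reconstruction from the blog post's description, not Chrome's actual code): pick 1ms on AC power and 8ms on battery, and always pair timeBeginPeriod with a matching timeEndPeriod:

// Sketch of the Chrome-like policy described above (a reconstruction,
// not Chrome's code): 1ms interrupt period on AC, 8ms on battery.
// Link against winmm.lib for timeBeginPeriod/timeEndPeriod.
#include <windows.h>
#include <timeapi.h>
#include <cstdio>

static UINT ChooseTimerPeriodMs() {
    SYSTEM_POWER_STATUS status;
    // ACLineStatus == 0 means we are running on battery.
    if (GetSystemPowerStatus(&status) && status.ACLineStatus == 0)
        return 8;  // gentler on battery
    return 1;
}

int main() {
    UINT period = ChooseTimerPeriodMs();
    if (timeBeginPeriod(period) == TIMERR_NOERROR) {
        printf("timer period raised to %ums\n", period);
        // ... latency-sensitive work runs here ...
        timeEndPeriod(period);  // must match the timeBeginPeriod call
    }
    return 0;
}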

EDIT: This is a weird temporary effect, and after a minute or two things get back to normal.

Using ClockRes I see that when I use the Gecko Profiler the timer resolution changes to 1ms, but when I finish profiling it sticks at a value between 4 and 8ms and doesn't return to 16ms. Restarting the browser gets back to the normal value.

When I get into this mode, I notice a measurable impact on the Speedometer score (3-5%). Our browsertime runs seem to include some initial profiling, and I'm wondering if this is causing us to have strange results in our scores, and also if this is why we have had a hard time reproducing this (and the throttle-depth changes) in CI.
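For anyone who wants to watch this without ClockRes: the value it reports can be read through the undocumented ntdll export NtQueryTimerResolution. A sketch, with the caveat that the signature below is the commonly used reverse-engineered one, not an official API:

// Sketch: query the min/max/current timer resolution the way ClockRes
// does, via the undocumented ntdll export NtQueryTimerResolution.
// Values are in 100ns units, so 156250 == 15.625ms.
#include <windows.h>
#include <cstdio>

typedef LONG (NTAPI* NtQueryTimerResolutionFn)(PULONG MinimumResolution,
                                               PULONG MaximumResolution,
                                               PULONG CurrentResolution);

int main() {
    HMODULE ntdll = GetModuleHandleW(L"ntdll.dll");
    auto query = reinterpret_cast<NtQueryTimerResolutionFn>(
        GetProcAddress(ntdll, "NtQueryTimerResolution"));
    if (!query) return 1;
    ULONG min = 0, max = 0, cur = 0;
    if (query(&min, &max, &cur) == 0 /* STATUS_SUCCESS */) {
        printf("min %.3fms  max %.3fms  current %.3fms\n",
               min / 10000.0, max / 10000.0, cur / 10000.0);
    }
    return 0;
}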

Depends on: 1814966

On try I don't see nearly as exaggerated an effect as I do locally: https://treeherder.mozilla.org/perfherder/comparesubtest?originalProject=try&newProject=try&newRevision=8b4e763d1d7720a900b9840e6336831fc43bb15f&originalSignature=3451139&newSignature=3451139&framework=13&originalRevision=487d5fad0b7e8785ce652405a06084349d875658&page=1. I do think something is fishy here.

Note the following comment in the documentation of timeBeginPeriod:

Prior to Windows 10, version 2004, this function affects a global Windows setting. For all processes Windows uses the lowest value (that is, highest resolution) requested by any process. Starting with Windows 10, version 2004, this function no longer affects global timer resolution. For processes which call this function, Windows uses the lowest value (that is, highest resolution) requested by any process. For processes which have not called this function, Windows does not guarantee a higher resolution than the default system resolution.

One guess at what is going on here is that on systems which don't see an improvement from this change, another process is giving us this win for free. That should generally not be the case though, so I think it's very important that we settle on a strategy for this.

The worker pool for performance tests is labelled win10-64-1803, which could either be an out-of-date name, or they could still be running very old Win10 kernels. I will try to get better info on what is actually running in CI next week.

EDIT: Confirmed that the CI hardware is indeed running Windows 10 1803 builds, so it does have the old (global) timer behaviour.

It would be very interesting to know how setting timeBeginPeriod compares to using the new high-resolution timer flag CREATE_WAITABLE_TIMER_HIGH_RESOLUTION (passed to CreateWaitableTimerEx). Perhaps it lets us have nice things without causing as much of a power regression.
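A minimal sketch of what that path would look like (an illustration of the API, not what TimerThread does today); the flag needs Windows 10 1803 or later:

// Sketch: wait ~4ms on a high-resolution waitable timer instead of
// raising the global interrupt period with timeBeginPeriod. On older
// kernels CreateWaitableTimerExW rejects the flag, so fall back to an
// ordinary waitable timer (with tick-granularity waits).
#include <windows.h>
#include <cstdio>

#ifndef CREATE_WAITABLE_TIMER_HIGH_RESOLUTION
#define CREATE_WAITABLE_TIMER_HIGH_RESOLUTION 0x00000002
#endif

int main() {
    HANDLE timer = CreateWaitableTimerExW(
        nullptr, nullptr, CREATE_WAITABLE_TIMER_HIGH_RESOLUTION,
        TIMER_ALL_ACCESS);
    if (!timer) {
        // Pre-1803 Windows: ordinary waitable timer as fallback.
        timer = CreateWaitableTimerExW(nullptr, nullptr, 0, TIMER_ALL_ACCESS);
    }
    LARGE_INTEGER due;
    due.QuadPart = -40000;  // relative due time in 100ns units: 4ms from now
    SetWaitableTimer(timer, &due, 0, nullptr, nullptr, FALSE);
    WaitForSingleObject(timer, INFINITE);
    printf("timer fired\n");
    CloseHandle(timer);
    return 0;
}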

Justin might have thoughts on this as our newly-found resident timing expert :-).

Performance Impact: --- → none
Flags: needinfo?(jlink)

I probably don't have much to share yet, but this is definitely on my radar and, in fact, something that I'm thinking a bit about right now. This ties in closely with, and is essentially a follow-up to, https://bugzilla.mozilla.org/show_bug.cgi?id=1783405, which is really about making TimerThread efficient and effective. What can be changed as far as timer coalescing on Windows is a bit limited right now (the low frequency of the thread scheduling is essentially forcing some timer coalescing, but not really allowing us to control it). Increasing the frequency could potentially allow us to have both responsive timers and efficient use of wake-ups (or at least give us a better dial for choosing between the two).

Flags: needinfo?(jlink)
Assignee: nobody → jlink
Whiteboard: [sp3]

In experimenting with increasing the thread scheduling frequency via timeBeginPeriod(), I can confirm that, with a 1ms period (what Chrome uses), the delay measured by Ted's test does go down to the expected values and is similar to what I see in Chrome.

Blocks: 1824472
Depends on: 1826224

Bug 1826224 didn't affect sp3 numbers as far as I can see (and I didn't expect it to), right? So perhaps this shouldn't have [sp3] nor block the speedometer3 meta bug?
(This was filed for sp2.1, which has a very different runner.)

Flags: needinfo?(jlink)

smaug: Initially we were seeing a pretty clear improvement locally in SP3 from increasing the timer resolution (and none in CI, for reasons that are understood), so I think that's why it was tagged with sp3.

When checked again more recently, increasing the timer resolution had a much smaller impact locally (probably between 0.5 and 1.0 percent) and still none in CI.

So there probably is some impact to SP3, but it's not huge and it won't be seen in CI.

Flags: needinfo?(jlink)
See Also: → 1872702
Depends on: 1881627

I still can't understand how this could really affect sp3 :), so I've cleared the dependency. Feel free to add it back.
