Closed Bug 1566900 Opened 5 years ago Closed 5 years ago

setTimeout and setInterval take twice as long as they should unless performance profiler is used

Categories

(Core :: XPCOM, defect, P3)

70 Branch
x86_64
Windows 10
defect

Tracking

()

RESOLVED WONTFIX

People

(Reporter: AaronWright3, Unassigned)

Details

Attachments

(1 file)

User Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:70.0) Gecko/20100101 Firefox/70.0

Steps to reproduce:

Used either a nested setTimeout or a setInterval function with a delay of 1000/60 (for 60fps) which measured the delay between intervals and outputted the average number of intervals per second. Tried the same thing while using the performance profiler, to diagnose delay issues.

Actual results:

Function performed at 25-32ms delay, rather than expected 16-18ms, averaging between 33fps and 41fps. Halving the delay actually resulted in the expected framerate. Using the performance profiler on the original delay resulted in the expected framerate as well, but it returned to 33fps upon closing the developer tools after using the profiler. Because the delay behaves as expected only when using the performance profiler, diagnosing was impossible. Behavior was the same regardless of the complexity of the function within, and persisted across new profiles and between Firefox 69 and 70. Using requestAnimationFrame also resulted in the expected framerate regardless of whether the profiler was used, but this is impractical for my intents (the main loop of a game), due to being tied directly to rendering performance. Behavior persisted across reboots as well, but not between different computers. Behavior ceased on its own after several hours, but has happened multiple times.

Expected results:

Delay should have been close to the specified 16.666ms/60fps, and/or should have remained the same during profiling.

OS: Unspecified → Windows 10
Hardware: Unspecified → x86_64
Component: Untriaged → Gecko Profiler
Product: Firefox → Core

Markus and Jan, I think you know more about how the JS engine and the profiler work together... Any idea about this?

Should this bug be moved to Core/Javascript Engine, as it feels to me it's more a JS issue (or feature?) that the profiler somehow influences. (It would be nice if the profiler didn't influence what's it's measuring, even positively.) :-)

Flags: needinfo?(mstange)
Flags: needinfo?(jdemooij)

This is probably caused by insufficient Windows system timer resolution. See bug 1492523 comment 26 for a similar issue.

The recommended solution is to use requestAnimationFrame, because rAF does not rely on high-resolution timers - it uses monitor vsync instead, so it can be accurate even if the Windows system timer resolution is set to the default. If requestAnimationFrame does not work for your particular use case, then I don't have a solution for you, unfortunately.

Does Chrome fire timers accurately in your testing?

Flags: needinfo?(mstange)
Flags: needinfo?(jdemooij)

(In reply to Gerald Squelart [:gerald] from comment #1)

Should this bug be moved to Core/Javascript Engine, as it feels to me it's more a JS issue (or feature?) that the profiler somehow influences.

(Note that setTimeout and setInterval are implemented outside the JS engine, so this would be a DOM issue.)

(In reply to Markus Stange [:mstange] from comment #2)

Does Chrome fire timers accurately in your testing?

It does, I have no timing issues under chrome or edge.

The priority flag is not set for this bug.
:gerald, could you have a look please?

For more information, please visit auto_nag documentation.

Flags: needinfo?(gsquelart)

Hi :hsinyi, reaching out to you as DOM:Core triage owner. As per comment 3, setTimeout and setInterval are implemented in DOM.
Could you please have a look, and either move the bug to your component, or recommend who/where else to go to next?

Flags: needinfo?(gsquelart) → needinfo?(htsai)

Setting priority to satisfy the bot. To be reassessed...

Priority: -- → P3

Olli, do you think the longer time here is kinda expected as our internal throttling mechanism?

Flags: needinfo?(htsai) → needinfo?(bugs)

Any testcase for this?
A simple case, http://mozilla.pettay.fi/moztests/setInterval16.html , seems to work well.
But I was testing on linux. I was told it doesn't work so well on Windows.

Flags: needinfo?(bugs) → needinfo?(AaronWright3)
Component: Gecko Profiler → XPCOM

And since DOM is just using XPCOM timers, the issue is lower level.
Jimm, do you know who could take a look at this? This seems to be rather Windows specific.

Flags: needinfo?(AaronWright3) → needinfo?(jmathies)

(In reply to Olli Pettay [:smaug] from comment #9)

Any testcase for this?
A simple case, http://mozilla.pettay.fi/moztests/setInterval16.html , seems to work well.
But I was testing on linux. I was told it doesn't work so well on Windows.

I'm averaging around 20-25 (sometimes it jumps to 30) on the affected machine (a first generation Thinkpad Yoga X1 with an i5).

It stayed very close to 16.666 for a few times refreshes in a row, but that hasn't happened since.

In case it's any help, I also used clockres to check my system timer resolution and got fairly normal results:

Maximum timer interval: 15.625 ms
Minimum timer interval: 0.500 ms
Current timer interval: 0.997 ms

Also, I'm not sure if this is notable, the very first average interval displayed by the test upon loading the page is usually much closer to the target, often between 13 and 18.

I see the same odd behavior in the test case on win7/64 bit. Adam, this isn't supper pressing, but looks like something we should at least dig into to try and understand. Bent used to work on timers, but he's not around now. When you have some free cycles, maybe you can poke at this.

Flags: needinfo?(jmathies) → needinfo?(agashlin)

Yeah there's definitely something weird going on at the Windows OS level. I checked smaug's comment 9 example in Chrome, and it was keeping time pretty well, but then I noticed that while I had it running, the instance running in Nightly was also keeping time much more closely. And upon closing the tab in Chrome, Nightly returned to the ~6ms slow timing. This might explain why it was accurate while the performance profiler was up, as that probably is asking for a particular timer from Windows. (Testing on Win 10 1903)

I'll keep digging for a bit.

Thanks to mstange's comment 2 I looked over bug 1492523, which deals with the same thing generally. It does look like this is mostly a matter of the resolution being left at the default 15ms. In general it looks like we don't call timeBeginPeriod() to change the resolution unless playing media (bug 1361629), profiling or (base profiling), during some netwerk scheduling thing, or during calibration for PR_Now().

I think at the bottom of all our timers on Windows is a SleepConditionVariableSRW() via this Wait() call in TimerThread. I think the resolution of this, like other sleep/wait stuff on Windows, is all down to that system resolution.

Chrome sets the timer resolution to 1 ms when the delay is < 32 ms. This is disabled when the device is on battery, see the comments here (which references our bug 363258 when the calibration loop was added to nspr), it looks like work on this began back in 2010 to avoid excessive power consumption. We already have some of it in tree, but unused.

It might be worth investigating increasing the resolution in some cases, but I don't personally have the time to do that work in the near future.


Aaron, it generally isn't a good idea to rely on specific timing for a browser game. This is why requestAnimationFrame() comes with a high-resolution timestamp, so that you can adjust to the actual frame rate, and as mstange mentions it will animate more smoothly than trying to simulate 60fps because it's tied into rendering. This doesn't have to mean that your logic has to be in lock step with the frame rate, you can watch the timer and update state whenever you determine a "game state" tick has passed. If you don't want to have big jumps when rendering is delayed or paused, you can keep your own timer and clamp the interval.

Something like this:

var TICK_TIME = 1000 / 100; // 1 tick = 1/100th of a second
var MAX_TIME_STEP = TICK_TIME * 10; // don't jump ahead by more than 10 ticks
var prev_t = 0;
var prev_systime = 0;
var next_tick = 0;

function frame(systime) {
  var elapsed = Math.min(systime - prev_systime, MAX_TIME_STEP);
  var t = prev_t + elapsed;

  // catch up state
  while (next_tick <= t) {
    // do one tick's worth of work

    next_tick += TICK_TIME;
  } 

  // draw frame
  // ...

  prev_t = t;
  prev_systime = systime;
  requestAnimationFrame(frame);
}

prev_systime = performance.now();
requestAnimationFrame(frame);
Flags: needinfo?(agashlin)
Status: UNCONFIRMED → RESOLVED
Closed: 5 years ago
Resolution: --- → WONTFIX
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: