Last Comment Bug 731974 - requestAnimationFrame generates too short/long frames, especially at the beginning of animation
: requestAnimationFrame generates too short/long frames, especially at the begi...
Status: RESOLVED FIXED
[games:p1]
:
Product: Core
Classification: Components
Component: XPCOM (show other bugs)
: unspecified
: x86_64 Linux
: -- normal with 4 votes (vote)
: mozilla20
Assigned To: Vladimir Vukicevic [:vlad] [:vladv]
:
Mentors:
Depends on: 822096 927507 798872 809123 821230 824085 850562 1106115
Blocks: gecko-games 753139 763122 801310 799242
  Show dependency treegraph
 
Reported: 2012-03-01 05:31 PST by Benoit Jacob [:bjacob] (mostly away)
Modified: 2014-12-09 06:15 PST (History)
46 users (show)
See Also:
Crash Signature:
(edit)
QA Whiteboard:
Iteration: ---
Points: ---
Has Regression Range: ---
Has STR: ---


Attachments
test case (click me!) (998 bytes, text/html)
2012-03-01 05:31 PST, Benoit Jacob [:bjacob] (mostly away)
no flags Details
IRC chat log (53.33 KB, text/plain)
2012-06-06 11:54 PDT, Vladimir Vukicevic [:vlad] [:vladv]
no flags Details
Updated test to work in all browsers (1.69 KB, text/html)
2012-06-06 13:52 PDT, Martin Best (:mbest)
no flags Details
testcase + buffer output + simulate game logic CPU time (3.45 KB, text/html)
2012-06-07 10:56 PDT, Benoit Girard (:BenWa)
no flags Details
updated benchmark, prettified output (2.99 KB, text/html)
2012-06-18 13:00 PDT, Vladimir Vukicevic [:vlad] [:vladv]
no flags Details
win32 timer query tool (1.37 KB, text/plain)
2012-06-18 13:39 PDT, Vladimir Vukicevic [:vlad] [:vladv]
no flags Details
updated benchmark (3.74 KB, text/html)
2012-06-29 11:52 PDT, Vladimir Vukicevic [:vlad] [:vladv]
no flags Details
patch (18.30 KB, patch)
2012-06-29 12:01 PDT, Vladimir Vukicevic [:vlad] [:vladv]
bzbarsky: feedback+
Details | Diff | Review
use custom timer for refresh driver (24.54 KB, patch)
2012-08-31 12:47 PDT, Vladimir Vukicevic [:vlad] [:vladv]
no flags Details | Diff | Review
use custom timer for refresh driver (v2) (27.04 KB, patch)
2012-09-04 13:26 PDT, Vladimir Vukicevic [:vlad] [:vladv]
roc: review+
bzbarsky: review+
bugs: review-
Details | Diff | Review
use custom timer, part 2 (5.54 KB, patch)
2012-09-05 12:24 PDT, Vladimir Vukicevic [:vlad] [:vladv]
bugs: review+
bzbarsky: review+
Details | Diff | Review
updated combined patch (31.10 KB, patch)
2012-09-10 11:12 PDT, Vladimir Vukicevic [:vlad] [:vladv]
no flags Details | Diff | Review
final updated patch (31.39 KB, patch)
2012-10-01 14:49 PDT, Vladimir Vukicevic [:vlad] [:vladv]
roc: review+
Details | Diff | Review
newest version, v1 (39.65 KB, patch)
2012-12-05 18:41 PST, Vladimir Vukicevic [:vlad] [:vladv]
no flags Details | Diff | Review
interdiff (7.99 KB, patch)
2012-12-11 10:12 PST, Vladimir Vukicevic [:vlad] [:vladv]
no flags Details | Diff | Review
more context in interdiff (31.79 KB, patch)
2012-12-11 10:23 PST, Vladimir Vukicevic [:vlad] [:vladv]
no flags Details | Diff | Review
more correct and context in interdiff (14.16 KB, patch)
2012-12-11 10:24 PST, Vladimir Vukicevic [:vlad] [:vladv]
ehsan: review+
Details | Diff | Review

Description Benoit Jacob [:bjacob] (mostly away) 2012-03-01 05:31:50 PST
Created attachment 601937 [details]
test case (click me!)

We shouldn't get frames shorter than 16 ms with requestAnimationFrame.

The attached testcase shows that we do, especially in the first few frames which can be shorter than 1 ms!

Reload a few times, as results vary considerably across runs.

Results are not affected by the fact that this animation doesn't paint anything else than the text. I originally got similar results with a WebGL benchmark.

Here's a typical result I'm getting (Nightly 13.0a1, Linux x86-64):

frame 1, 13 ms after start, frame duration 3 ms, far too short!
frame 2, 17 ms after start, frame duration 4 ms, far too short!
frame 3, 23 ms after start, frame duration 6 ms, far too short!
frame 4, 30 ms after start, frame duration 7 ms, far too short!
frame 5, 38 ms after start, frame duration 8 ms, far too short!
frame 6, 46 ms after start, frame duration 8 ms, far too short!
frame 7, 56 ms after start, frame duration 10 ms, slightly too short
frame 8, 65 ms after start, frame duration 9 ms, far too short!
frame 9, 76 ms after start, frame duration 11 ms, slightly too short
frame 10, 92 ms after start, frame duration 16 ms
frame 11, 107 ms after start, frame duration 15 ms, slightly too short
frame 12, 124 ms after start, frame duration 17 ms
frame 13, 141 ms after start, frame duration 17 ms
frame 14, 161 ms after start, frame duration 20 ms
frame 15, 174 ms after start, frame duration 13 ms, slightly too short
frame 16, 190 ms after start, frame duration 16 ms
frame 17, 206 ms after start, frame duration 16 ms
frame 18, 223 ms after start, frame duration 17 ms
Comment 1 Boris Zbarsky [:bz] 2012-03-01 06:57:19 PST
The XPCOM timer implementation is busted.  We know it.  Not much use filing more bugs; we just need to fix it...  :(

Benoit, how do things look for you if you change the 1.5 to a 0 on http://hg.mozilla.org/mozilla-central/file/1c3b291d0830/xpcom/threads/TimerThread.cpp#l241 ?
Comment 2 Benoit Jacob [:bjacob] (mostly away) 2012-03-01 07:57:53 PST
With the tweak suggested in comment 1, it's better, but I still get some too-fast frames. Also, the frame durations remain very irregular:

frame 1, 45 ms after start, frame duration 18 ms
frame 2, 63 ms after start, frame duration 18 ms
frame 3, 80 ms after start, frame duration 17 ms
frame 4, 97 ms after start, frame duration 17 ms
frame 5, 114 ms after start, frame duration 17 ms
frame 6, 131 ms after start, frame duration 17 ms
frame 7, 149 ms after start, frame duration 18 ms
frame 8, 169 ms after start, frame duration 20 ms
frame 9, 184 ms after start, frame duration 15 ms, slightly too short
frame 10, 207 ms after start, frame duration 23 ms
frame 11, 227 ms after start, frame duration 20 ms
frame 12, 236 ms after start, frame duration 9 ms, far too short!
frame 13, 253 ms after start, frame duration 17 ms
frame 14, 270 ms after start, frame duration 17 ms
frame 15, 295 ms after start, frame duration 25 ms
frame 16, 306 ms after start, frame duration 11 ms, slightly too short
frame 17, 323 ms after start, frame duration 17 ms
frame 18, 341 ms after start, frame duration 18 ms
frame 19, 358 ms after start, frame duration 17 ms
frame 20, 381 ms after start, frame duration 23 ms
frame 21, 396 ms after start, frame duration 15 ms, slightly too short
frame 22, 411 ms after start, frame duration 15 ms, slightly too short
frame 23, 428 ms after start, frame duration 17 ms
Comment 3 Boris Zbarsky [:bz] 2012-03-01 07:59:14 PST
That's interesting.  I would not have expected too-short frames in that case...

And I can absolutely believe the frames are irregular.  Especially on Windows, where if I recall correctly the relevant timer is only accurate to like 16ms at best to start with.
Comment 4 Benoit Jacob [:bjacob] (mostly away) 2012-03-01 08:05:30 PST
Another strange phenomenon: i seem to get more requestAnimationFrame callbacks than I do really composited frames.

steps-to-reproduce:
1. run the testcase.
2. as results accumulate, sometimes there is a noticeable pause (maybe a GC pause or something). That's fine. But then, we should not get requestAnimationFrame callbacks, and the next frame we get should show a long duration (like, 100 ms). Instead, when rendering resumes, a bunch of new frames are now reported all at once, with short durations each.

Having more requestAnimationFrame callbacks than composited frames seems like a waste of time, and a different bug than the timers bug.
Comment 5 Boris Zbarsky [:bz] 2012-03-01 08:08:09 PST
Is that with the 1.5 changed to 0 or without?
Comment 6 Benoit Jacob [:bjacob] (mostly away) 2012-03-01 08:09:56 PST
That is with the 1.5 changed to 0.
Comment 7 Benoit Jacob [:bjacob] (mostly away) 2012-03-01 08:17:44 PST
But, the same issue (comment 4) also happens without that 1.5 changed to 0.

The only thing is, you need a non-optimized build to get noticeable pauses in this simple test.
Comment 8 Boris Zbarsky [:bz] 2012-03-01 08:21:58 PST
> But, the same issue (comment 4) also happens without that 1.5 changed to 0.

Yes, that's expected.  With the 1.5 the timer code tries to "guess" when to fire timers so they'll fire "at the right time" based on previous delays, so it can trigger timers that run way too early.

With the 1.5 changed to 0, though, that should not be happening.  Can you post a testcase that reliably reproduces the problem for you when the 1.5 is changed to 0?  You can trigger a guaranteed 100ms pause by doing something like this:

  setTimeout(function() {
      var start = new Date;
      while (new Date - start < 100) /* do nothing */ ;
    }, 200);

Oh, and are you testing on Linux, not on Windows?  In that case you should be getting better accuracy out of your XPCOM timers...
Comment 9 Alon Zakai (:azakai) 2012-05-24 08:30:48 PDT
I am seeing slightly different data than in the first comments here. There are in fact far-too-short requestAnimationFrame firings, but they happen after delays. So for example if frames are 20ms (just to make it simple), we often fire in patterns like this:

0
20
40
70 - took 30ms, not 20, due to some browser delay like a gc
80 - took 10ms, in order to return to the same rhythm as before
100 - back to normal
120 - etc.

So a long frame leads to a short frame after it.

The alternative to this would be something like

0
20
40
70 - 30ms frame, like before
90 - wait 20ms after the long frame, unlike last time
110 - continue in a new rhythm, now %20 is 10 and not 0
Comment 10 Randell Jesup [:jesup] 2012-05-24 12:07:56 PDT
Short delays after long ones makes sense if you're trying to match some realtime timeline.  

There are 3 basic options with 1 variant each:

1) Approximate realtime - long delays are followed by short delays until the next quanta.
   1a) if "long delay" is > 2 quanta, you can either play the next frame, or skip it to stick to realtime (keep sync with other events)

2) Pausing to the next quanta (frame duplication if you will) will guarantee each frame is seen for at least the specified quanta (long delay, followed by another long delay until the next quanta).
   2a) Same, but drop frames if they can't be output "on time" (think vsync tracking - if it doesn't make the deadline, you drop it).

3) Display each frame for >= quanta - if there's a long delay, the nominal next frame will be quanta later, establishing a new rhythm.  This means sync to other elements wanders, but all frames are displayed for quanta ms.  Basically you ignore that a frame took too long and start a new timer after each frame.
   3a) Same, but you drop frames if the accumulated slip pushes you past the expected point for that frame (i.e. frame-drop to maintain sync)

Part of the problem is "which is appropriate for the application"?  We either need multiple functions, parameters, or choose one and let applications that need others in some manner roll their own given what we provide.
Comment 11 Boris Zbarsky [:bz] 2012-05-24 12:22:42 PDT
Sure.  So in comment 10 terms translated to nsITimer, #1 sounds like TYPE_REPEATING_PRECISE_CAN_SKIP (which is what the refresh driver uses).  #3 is TYPE_REPEATING_SLACK.  I'm not sure we have an equivalent of #2 right now.
Comment 12 Boris Zbarsky [:bz] 2012-05-24 12:23:13 PDT
One other note: other UAs are just tying requestAnimationFrame to vsync.  This would be more like #2, in fact.
Comment 13 Jeff Gilbert [:jgilbert] 2012-05-29 17:16:17 PDT
To match vsync behavior, we need our timers trigger on 16.67ms intervals (read: 16 or 17), and no frame can be less than 16ms. If frame 1 takes 20ms, the timer for frame 2 should be 28ms, ideally giving us timer triggers at t=[0,20,48]. Even if frame 2 can hit the t=32ms mark, it should instead wait until t=48ms to fire.

This is #2 above. It seems like #3 might be ideal for us, since it basically only serves to limit us to ~60fps while allowing maximum framerate, which are both desired in the absence of vsync.

An aside that 16ms frames yields 62.5fps, while 17ms yields ~58.8.
Comment 14 Benoit Girard (:BenWa) 2012-05-29 21:17:58 PDT
We can't assume that when the timer fires JS can draw and gecko can compose a frame instantaneously. If we want to achieve something that synchronized to when we present the frame we will need to tie in composite notification to our internal logic.

If we decide to delay when we begin rendering a frame because we fell behind in calling it on time then we're going to be multiplying the delay. If we do this on a per frame basis then we're going to be forcing ourselves to jump from 60 Hz to 30 Hz frequently and that I imagine that feels worse then drawing at 60 Hz and showing a few frames twice when we have contention (GC, background tabs).
Comment 15 Benoit Jacob [:bjacob] (mostly away) 2012-05-30 04:49:33 PDT
(In reply to Benoit Girard (:BenWa) from comment #14)
> If we decide to delay when we begin rendering a frame because we fell behind
> in calling it on time then we're going to be multiplying the delay. If we do
> this on a per frame basis then we're going to be forcing ourselves to jump
> from 60 Hz to 30 Hz frequently and that I imagine that feels worse then
> drawing at 60 Hz and showing a few frames twice when we have contention (GC,
> background tabs).

That would need to be verified. I rather agree with Jeff here as I expect that the user experience is mostly determined by the worst framerate, not the average framerate, in a certain time window (maybe 1 second, maybe 0.1 second, whatever unit of time our brain works with).

So I expect that having one frame shown twice gives the same user experiece as staying at 30 Hz for a certain amount of time (maybe 1 second, maybe 0.1 second). I expect that amount of time to be significantly greater than 0.033 second = 1/(30 Hz) so, if that is confirmed, trying to catch up after a repeated frame would be futile.

Experimentation/documentation needed!
Comment 16 Alon Zakai (:azakai) 2012-05-30 16:52:11 PDT
https://tbpl.mozilla.org/?tree=Try&rev=382e2ac770e0

is a build with one potential approach. It adds a new type of timer, NEVERSHORT, which are never too fast. They ignore the timer system's internal adjustment mechanism (that makes the next firing delay short if the current one was long), and they schedule the next firing time right before actually firing and not before scheduling the firing event (which can get stalled if there are other events). This ensures they are never too early. If they are too late, they just pick up from there, so delays can look like 17, 17, 17, 1000, 17, 17, 17 etc.

A main downside of this is reduced frame rates. We have longer frames but no shorter frames, so on average we decrease. Also, even at the fastest we seem to get frames of 17-18 and not 16-17, I suspect because we (purposefully) don't adjust for the average wait times of the event queue.

The upside is we no longer get frames that are too short. But, I don't personally see a subjective benefit to this, I'm not that sensitive to this sort of thing though so maybe I am a bad tester. If others can test those builds that would be cool.
Comment 17 Alon Zakai (:azakai) 2012-05-30 17:40:12 PDT
Builds showing up in

http://ftp.mozilla.org/pub/mozilla.org/firefox/try-builds/azakai@mozilla.com-382e2ac770e0/
Comment 18 Randell Jesup [:jesup] 2012-05-30 21:22:35 PDT
Alon's build implements type 3 from my post above.

Part of the issue is knowing the purpose of the animation timer. The requirements from a usecase could be any of 1, 2 or 3 (and the 'a' variants).  They are of differing levels of frequency of use, and a number of them can be emulated to any degree or another if they don't exist (and the app knows what type it *is* getting).

Games in particular have very different needs than something showing a cartoon-ish anim or a video clip.  (they want enough time to render the next frame, and preferably have the firing synced to vsync).  Items meant to be synchronized to other events want a different approach.

And for smooth, 30 or 60Hz playback (as monitors are strongly likely to be 60Hz nowadays), you need consistent 16.6*ms rates, and preferably with vsync.  (If you have 2ms of jitter, and the firing is roughly aligned with vsync, you can have a wildly variable framerate visible to the user of 30-60Hz and the firing time jitters to one side or another of vsync with each frame.)

The people at the Games Work Week got direct feedback on this issue; I didn't head it all, but check with them (Martin Best, the JS guys who were there, etc)
Comment 20 Martin Best (:mbest) 2012-05-31 13:24:06 PDT
So Randell's insights are right on in my view.  When using vsync for games you want to target either 30-60fps and that's 16.667ms.  If you do drop down to 30fps, you want to average the idle time for the last few seconds to see if it's greater than 50% to jump decide to jump back up to 60fps to keep it from thrashing back and forth between the two.

However, many gamers prefer to have vsync off and live with consequences.  In fact most PC games have this as the default setting.  This allows for 52 fps for example which provides a more responsive feel to game play at the cost of the occasional tearing artifact.

Having frames go shorter than the computation and rendering time seems counter productive.
Comment 21 Jeff Gilbert [:jgilbert] 2012-05-31 13:43:31 PDT
A note that with vsync, there doesn't need to be any decision whether to render at 60 or 30 (or less), as this behavior is just the result of the constraints:
When swapping, block until vblank, swap, then return.

Or more generally:
At most one swap per vblank cycle.
Swap must occur at vblank.

If the frame takes 20ms, it will naturally overflow the first vblank cycle into the next, taking two vblank cycles to display, yielding 30fps if the base framerate is 60fps.
Comment 22 Martin Best (:mbest) 2012-05-31 14:09:08 PDT
	1	16
25	2	32
	3	48
50	4	64
75	5	80
	6	96
100	7	112

Here is the issue that the averaging is meant to solve.  May not be an issue but just wanted to make sure to discuss.  The right column shows when a new frame is ready, in this case every 25ms, the middle column is the number of frames at 60fps and the left column is the 60fps ruff ms count.  You see the gaps that are created, this can give a jagged look so you want to stick to 30fps in these cases.  You can do this at the games level by simply requesting a larger on requestAnimationFrame.
Comment 23 Jeff Gilbert [:jgilbert] 2012-05-31 14:16:37 PDT
VSync with 25ms frame times looks like:

vblank 1 at 16ms
frame 1 finishes at 25ms, blocking until...
vblank 2 at 32ms, swap frame 1, begin work on frame 2
vblank 3 at 48ms,
frame 2 finishes at (frame_2_start + frame_2_time) = (32ms + 25ms) = 57ms, blocks until...
vblank 4 at 64ms, swap frame 2, begin work on frame 3
etc.

Note that frames are swapped in only every other vblank, since work on the next frame starts at the vblank that swaps the previous frame. Thus, for 25ms frames, they always miss the next frame window, but land part way through the second, and are swapped 32ms after work began.
Comment 24 Randell Jesup [:jesup] 2012-05-31 14:25:23 PDT
Right.  Though if the calculation time is close to 16.6ms it will jitter back and forth between 30 and 60 (and all frame calcs have jitter, and other processes/processing/etc will cause more jitter).  This is something game people are used to.

For most other stuff (non-games), calculation time isn't often the issue, it's mostly timer jitter and thread pauses - wanting to show something smoothly *or* keep it in sync with a different animation, etc.
Comment 25 K. Gadd (:kael) 2012-06-02 01:55:53 PDT
This is causing a lot of problems for my demos in Firefox. Sometimes they run at 75fps or faster (due to requestAnimationFrame) and it's very difficult to correct for this without dropping frames because of the way the timing seems to work.

Is there anything I can do to prevent the requestAnimationFrame timer from running too fast/slow - for example, using requestAnimationFrame to kick off a pushMessage so that the timer doesn't try to compensate for the duration of my frames?

My desired behavior for requestAnimationFrame would be:
When possible, run requestAnimationFrame callback once for every vertical sync
When not possible, run as close to 60hz as possible
and then have the frames I generate presented at the next vsync.

Of course, at present, I'd be happy to just actually get 60hz :) Chrome doesn't seem to vsync requestAnimationFrame output either (lots of tearing), but it at least manages to hit 60hz very precisely. IE does too, except for the fact that their requestAnimationFrame is broken so I can't use it.
Comment 26 Jeff Gilbert [:jgilbert] 2012-06-02 14:06:56 PDT
If you use setTimeout(15) or so, iirc I got a more consistent trigger rate. The real answer is it needs to be fixed.
Comment 27 Jeff Gilbert [:jgilbert] 2012-06-05 18:58:16 PDT
So this is interesting: Without Alon's modification, I get about ~46fps (21.7ms), and with, I get ~26fps (38.4ms). It seem like the modification is now appending ~16ms to the time the frame takes, as 46fps = ~22ms, 22+16ms = 38ms = ~26fps.

This would make sense if each requestAnimationFrame() call sets up a new timer which now always takes 16ms. (and I believe this is the current setup)

Really what I think we want is a consistent timer which fires every 16ms, calling the RAF callback if one has been set.
Comment 28 Boris Zbarsky [:bz] 2012-06-05 19:13:09 PDT
Except we don't actually want wakeups every 1ms.  And we also want sane handling of cases when a callback takes 128ms.
Comment 29 Benoit Jacob [:bjacob] (mostly away) 2012-06-06 04:54:02 PDT
(In reply to Boris Zbarsky (:bz) from comment #28)
> Except we don't actually want wakeups every 1ms.  And we also want sane
> handling of cases when a callback takes 128ms.

What do you call sane handling of this case? I hope that doesnn't involve trying to catch up on skipped frames. If a callback takes 128ms, too bad for the skipped frames, I don't see anything to do about that.
Comment 30 K. Gadd (:kael) 2012-06-06 05:40:27 PDT
Honestly, if a frame callback takes longer than (1000 / refreshRate), just schedule the next frame as soon as possible (i.e. maintain UI responsiveness by pumping messages, but don't wait). That's how basically every modern game works, unless it's directly slaved to vertical sync.

When you're directly slaved to vertical sync you're basically just running as fast as possible, but then you block your main thread on vsync every time you paint, which ends up locking you to integral factors of the vertical sync rate (60, 30, 15, etc). But for requestAnimationFrame it probably makes the most sense to just do the above - try to run at 60hz, and when the callback duration gets too long, just run as fast as possible. It shouldn't be necessary to do any crazy timer scheduling for this.

Also, if you could actually just slave to vsync, that would eliminate the need to do any timer scheduling for 60hz either, but I imagine it's not easy to actually slave to vsync on linux/mac, and I know you can't on windows if you aren't using DirectX.
Comment 31 Boris Zbarsky [:bz] 2012-06-06 08:05:06 PDT
> What do you call sane handling of this case?

Something that doesn't involve 8 immediate calls in a row to the callback after the 128ms thing returns, which is what the setup in comment 27 is liable to produce (see also the REPEATING_PRECISE nsITimer variety and the reasons we stopped using it for requestAnimationFrame).
Comment 32 Benoit Jacob [:bjacob] (mostly away) 2012-06-06 08:29:16 PDT
(In reply to Boris Zbarsky (:bz) from comment #31)
> > What do you call sane handling of this case?
> 
> Something that doesn't involve 8 immediate calls in a row to the callback
> after the 128ms thing returns

Great, we're on the same page.
Comment 33 Vladimir Vukicevic [:vlad] [:vladv] 2012-06-06 09:48:12 PDT
So, to make sure; what we want (ignoring vsync):

- requestAnimationFrame will ideally trigger a frame 16ms from the last time its callback started executing (note, started, not anything else)
- if the last time one started executing was > 16ms, then it should trigger the callback immediately

Note I'm using 16ms here for 60fps; substitute whatever other value is appropriate for the target frame rate. (Would be nice if we had a way to let the page specify its target fps..)

Implementation wise, seems like we should schedule a 16ms timer right at the start of the callback routine.  If RAF is called, just set a flag that it was called, we've already got a timer set up, and carry on.  If it's not called, then the flag won't be set, and the next time the timer fires we should just ignore it.  This approach would handle both cases above -- if the callback takes longer the desired time, the timer will naturally fire as soon as it can afterwards.
Comment 34 Boris Zbarsky [:bz] 2012-06-06 10:02:19 PDT
> Implementation wise, seems like we should schedule a 16ms timer right at the start of
> the callback routine.

Here's how the current implementation works, for reference.  First of all there is only one timer involved.  All RAF calls for the tab, and all async reflow and restyling happen off that one timer.  So the "set a flag" thing happens automatically.  This timer is a  TYPE_REPEATING_PRECISE_CAN_SKIP XPCOM timer, so its behavior is as follows.

1) When the XPCOM timer thread triggers and posts the event to the main thread to run the
   nsITimer's callback, we add 16ms to "now" and record that time in the timer object.

2) When the event posted in step 1 fires on the main thread, we run the website callback.

3) After running the website callback, we schedule a new timer to fire at the time
   recorded in step 1 (or ASAP if that time has already passed).

Note that there can be a noticeable delay between #1 and #2 depending on what else the main thread is doing.  Not much we can do to avoid that.

So unless I'm missing something, this is in fact very much what comment 33 is asking for, no?
Comment 35 Vladimir Vukicevic [:vlad] [:vladv] 2012-06-06 11:54:56 PDT
Created attachment 630653 [details]
IRC chat log

bz, kevin, and I chatted about this in irc today; here's the log.  there are the beginnings of a plan that someone will summarize soon!
Comment 36 Martin Best (:mbest) 2012-06-06 13:52:44 PDT
Created attachment 630707 [details]
Updated test to work in all browsers

So that we can compare, I updated the test case to work in all browsers.  This falls back to setTimeout() if some version of requestAnimationFrame is not present (ie9).
Comment 37 Martin Best (:mbest) 2012-06-06 13:56:44 PDT
I tested the updated test in Firefox 12 and 15, Chrome 19, Safari 5.1.7, IE9.  If someone can test that and make sure it's a valid test.  Currently Chrome seems to have the most consistent result.
Comment 38 Bobby Richter [:secretrobotron] 2012-06-06 14:10:25 PDT
Just tested on Opera a few times. Never dips below 17ms.
Comment 39 Bobby Richter [:secretrobotron] 2012-06-06 14:10:49 PDT
It's interesting to note here that FF is the only browser that dips below 16ms.
Comment 40 Boris Zbarsky [:bz] 2012-06-06 14:12:35 PDT
It's not that interesting.  We know exactly why we dip below 16ms.  The only question is what, if anything, should be done about it.
Comment 41 Alon Zakai (:azakai) 2012-06-06 14:16:56 PDT
(In reply to Boris Zbarsky (:bz) from comment #40)
> We know exactly why we dip below 16ms.  The only
> question is what, if anything, should be done about it.

Yes, exactly. There are benefits to dipping below 16ms (keeping frame rates stable). The question is whether there is a downside to doing so. We suspected that rendering might look less smooth that way, but I haven't seen a subjective visual test that showed a difference. mbest also did some testing and told me the same.
Comment 42 Martin Best (:mbest) 2012-06-06 15:08:22 PDT
Just to clarify, the test Alon is referring too ran about 18 fps without dipping and this test had no really impact on the precised quality in my opinion.  That isn't to say that it wouldn't look better running closer to the refresh rate nor does it reveal that having a lower frame rate was a good or bad thing.  After thinking about it, I'm not sure that having pretty far of the sync rate makes it a test that we can really consider definitive.

To talk about the down side for a moment, here are my thoughts on that:

I think we need to be careful not to get too comfortable with the idea that 14-18 ms spread when no significant JS work executing is acceptable.  We should get as close to 16.667 as possible in the empty loop case.  As long as the delta time provided to game developers is in microsecond, they will be able to deal with any reasonable drift but the closer we can get the better.

Having animation trigger before the JS work of the current frame is done executing seems to me to provide no value from a games perspective.  My understanding is that this will just lead to posting the same frame buffer as last time, will you not?

I think that if we are the only ones doing it following the current model, then we should either change to conform to the implied standard or update/clarify the standard to reflect our desired solution.  This is pretty fundamental as it provides the core animation (or game) loop to developers.  

The fracturing standard is the number 2 argument against html5 I run into when talking to developers.  Poor and fractured Audio being number 1.  Having this inconstant across browsers seems like a very bad idea to help with HTML5 adoption if there is no clear advantage to the current approach.
Comment 43 Boris Zbarsky [:bz] 2012-06-06 15:19:42 PDT
Please see attached IRC discussion.  The fact that people are using rAF for both their animation loop and their game loop is in fact a problem that can't be fixed by tweaks to rAF behavior; it needs a different solution.
Comment 44 Jeff Gilbert [:jgilbert] 2012-06-06 15:38:31 PDT
(In reply to Boris Zbarsky (:bz) from comment #43)
> Please see attached IRC discussion.  The fact that people are using rAF for
> both their animation loop and their game loop is in fact a problem that
> can't be fixed by tweaks to rAF behavior; it needs a different solution.

It shouldn't be a problem to do that, though. This is sort of orthogonal to the discussion of rAF behaving poorly even in the empty loop case.
Comment 45 Randell Jesup [:jesup] 2012-06-06 15:54:32 PDT
Agreed - though they may want animations to be synced to game state changes - not sure.  As for real animations, jitter in timing can cause really annoying variable frame rates if the nominal time is close to vsync (frames will semi-randomly fall on one or the other side of the vsync, causing dups or skips depending if it's on the same side as the previous frame.

Even if they're using a separate mechanism for state updates, the same issues apply including jitter vs. vsync.

And I haven't read the log (I will), we need to make sure developers have a standard way to do this - introducing new standard mechanisms can be done, but it will be a while(!) before they land and are universal, and game writers will need to have fallback code for current rAF (for example).
Comment 46 Benoit Jacob [:bjacob] (mostly away) 2012-06-06 17:09:00 PDT
Quick question: do we all agree that if a game cares about optimal responsiveness (like a first-person shooter), it should not run its entire game loop in the requestAnimationFrame callback, instead it should only do rendering in the requestAnimationFrame callback and run the main game loop in a separate callback, presumably a setTimeout(0) callback?
Comment 47 Boris Zbarsky [:bz] 2012-06-06 17:36:16 PDT
> As for real animations, jitter in timing can cause really annoying variable frame rates

Yes.  We need to fix XPCOM timers, and we have bugs covering that.

In case that wasn't clear, this means:

1)  Resolve the 15ms minimal resolution problem on Windows.
2)  Reland the fix in bug 590422.

Once that's done, you will no longer have problems with too-small delays.

An alternative to #2 that might not require #1 as a requirement is improving the delay-line algorithm to somehow ignore outliers. 

> run the main game loop in a separate callback, presumably a setTimeout(0) callback?

Why?  How does that help?
Comment 48 Jeff Gilbert [:jgilbert] 2012-06-06 17:39:57 PDT
(In reply to Benoit Jacob [:bjacob] from comment #46)
> Quick question: do we all agree that if a game cares about optimal
> responsiveness (like a first-person shooter), it should not run its entire
> game loop in the requestAnimationFrame callback, instead it should only do
> rendering in the requestAnimationFrame callback and run the main game loop
> in a separate callback, presumably a setTimeout(0) callback?

Possibly, but only because it's possible to get slightly better input->network latency with this method (basically pumping the input stack faster). For rendering something at 60fps already, it is at-best the same as batching update-and-render, and slightly worse in the simple case. (rendering using positional data which is up to 4ms old)

A game which is pushing the envelope timing-wise will likely behave better with split callbacks, but a monolithic callback should work fine for anything that's sub-15ms anyways.
Comment 49 Boris Zbarsky [:bz] 2012-06-06 17:41:50 PDT
> Possibly, but only because it's possible to get slightly better input->network latency 

Input should be happening off event handlers, not off a timer loop, I would think.....
Comment 50 Alon Zakai (:azakai) 2012-06-06 17:46:06 PDT
(In reply to Boris Zbarsky (:bz) from comment #47)
> > As for real animations, jitter in timing can cause really annoying variable frame rates
> 
> Yes.  We need to fix XPCOM timers, and we have bugs covering that.
> 
> In case that wasn't clear, this means:
> 
> 1)  Resolve the 15ms minimal resolution problem on Windows.
> 2)  Reland the fix in bug 590422.
> 
> Once that's done, you will no longer have problems with too-small delays.

I think we will still have delays shorter than 16.67 ms. The
timer code uses an adjustment to compensate for the lag between
posting an event and that event running. So if the event loop
is slow for a period of time, we fire events earlier. But when
the event loop becomes fast again, we fire some timers earlier
than they should.

We can fix that too though (one approach, not a good one, is
in the build I posted).
Comment 51 Benoit Jacob [:bjacob] (mostly away) 2012-06-06 17:47:39 PDT
(In reply to Boris Zbarsky (:bz) from comment #49)
> > Possibly, but only because it's possible to get slightly better input->network latency 
> 
> Input should be happening off event handlers, not off a timer loop, I would
> think.....

Input event handlers should only update variables directly representing input (e.g. "up arrow key is pressed"), but the main game engine loop updating characters' positions, doing physics etc, should definitely not run off input event handlers (if only because it will need more frequent callbacks than just the two "key pressed" and "key released" callbacks). Since, in a responsiveness-sensitive game, the game engine loop should also be decoupled from rendering, it shouldn't run off requestAnimationFrame either. So I only see setTimeout(0) i.e. get a callback ASAP.
Comment 52 Boris Zbarsky [:bz] 2012-06-06 17:48:55 PDT
> The timer code uses an adjustment 

Item 2 in my list of things to do is removing that code.
Comment 53 Benoit Jacob [:bjacob] (mostly away) 2012-06-06 17:54:47 PDT
Do I correctly understand that having requestAnimationFrame callbacks being given by a timer is only a fallback/temporary solution for lack of proper vsync handling, and that once we get code to use vsync, we should simply fire a requestAnimationFrame callback on vsync (unless another one is still pending)?

Or is there some deeper reason why timers are relevant to requestAnimationFrame even in presence of vsync?
Comment 54 Alon Zakai (:azakai) 2012-06-06 17:56:31 PDT
(In reply to Boris Zbarsky (:bz) from comment #52)
> > The timer code uses an adjustment 
> 
> Item 2 in my list of things to do is removing that code.

Sorry, I should have read that more carefully. I'll comment in that bug.
Comment 55 Boris Zbarsky [:bz] 2012-06-06 17:59:48 PDT
As far as vsync... there's some discussion about that in the attached log.  But yes, if we stop triggering rAF from timers then the XPCOM timer stuff stops being relevant.
Comment 56 Benoit Girard (:BenWa) 2012-06-07 10:56:27 PDT
Created attachment 631040 [details]
testcase + buffer output + simulate game logic CPU time
Comment 57 Jeff Gilbert [:jgilbert] 2012-06-08 17:54:08 PDT
Submitted bug 763122 to track rAF discussion outside the scope of the main issue with timers/timing in this bug.
Comment 58 Vladimir Vukicevic [:vlad] [:vladv] 2012-06-18 13:00:13 PDT
Created attachment 634148 [details]
updated benchmark, prettified output

Updated the testcase and prettified the output.

I've been experimenting with this with a build that has bug 590422 applied (removing timer filter/adjusting stuff), and that does timeBeginPeriod(1) around XRE_main.  The problem is, with any combination of them, or without any changes, in a fresh build/launch I tend to get 16-17ms reliably basically throughout, with 5-10ms of logic time.  Still experimenting more.
Comment 59 Vladimir Vukicevic [:vlad] [:vladv] 2012-06-18 13:39:16 PDT
Created attachment 634155 [details]
win32 timer query tool

Here's a little tool that can query the current timer resolution and change it.  Note that it uses the internal NtQueryTimerResolution/NtSetTimerResolution functions that timeBeginPeriod/timeEndPeriod use instead of using the time* functions; there's no way to query the current resolution with the time* functions.

On my laptop, when it's plugged in, I see a resolution of 0.5ms, with a range of 0.5ms-15.6001 ms.  I can't get this to budge; I try setting it to 1ms, 5ms, 15ms, and it stays at 0.5ms.  I'll try again when it's on laptop power later, but this could explain why I have a hard time reproducing any timer issues. (Just run to query, run with an (optionally floating-point) arg to set resolution in ms.)
Comment 60 Vladimir Vukicevic [:vlad] [:vladv] 2012-06-18 19:03:19 PDT
On my desktop, the range is also 0.5-15.6001, but the resolution happens to be set at 1.0ms.  I have a bunch of flash running, which I think bumps it up; would be interesting to look at this on a fresh system boot.  However, I think what it's showing is that just a higher resolution timer setting is probably not enough.
Comment 61 Nat Duca 2012-06-22 07:56:58 PDT
If it helps, we have two approximate implementations of the requestAnimationFrame in Chromium.


The first approach, which we've been using since Chrome 13 up to and including now, uses the following trick: on almost all OS', the first swapbuffers/present call on a double-buffered swapchain swapbuffers/D3D present is nearly instananeous, when the system was previously is idle. But, if you then call present *again* right after that one returns, the second present will almost always block until the next vblank interval.

In chrome, we execute these swapbuffers on a secondary thread from javascript. So, effectively, when you first request an animation frame, we fire it immediately. This triggers rendering and sends a swapbuffers to our GL thread which completes immediately. The next request for an animation frame, we also fire immediately. This sends another swap to our GL thread. But, this swap now blocks until the next vsync. The third request animaiton frame that we get, we delay until the previous swap completes.

As it turns out, this gives really smooth frame rates, and recovers quite gracefully when, for example, a webgl game is really GPU heavy. In that case, you dont want to keep ticking rAF at 60Hz--- doing so would just cause more and more work to pile up on the driver.




Chrome Android uses our v2 scheduling approach. It is based on regular timers, but with error correction to deal with the fact that no OS will ever give you a 16.6666ms callback. The implementation of our timer for this is here, and is activated when chrome is in threaded compositing mode (about:flags): trac.webkit.org/browser/trunk/Source/WebCore/platform/graphics/chromium/cc/CCDelayBasedTimeSource.cpp . As it turns out, this timer has slight +/-1 noise when it runs per frame, but on average, it actually does average out to the correct target frame rate due to this error correction.

"Locking to Vsync" is just a matter of starting this timer (or drifting it) such that it its m_timebase is aligned with the hardware vsync timer. Almost all OS' have apis for getting that timebase, so we periodically ask for it, then update the delay based time source to use that new timebase. Our measurements with about:tracing have shown that this is surprisingly effective.

We're in the process of switching desktop chrome over to use this same technique. Our tracking bug for this is:
http://code.google.com/p/chromium/issues/detail?id=129674


Hope this is useful perspective!
Comment 62 Vladimir Vukicevic [:vlad] [:vladv] 2012-06-26 07:12:07 PDT
Hmm, we need to fix this, ideally for fx15.  bz's two points that we need to resolve are:

> 1)  Resolve the 15ms minimal resolution problem on Windows.

I propose that we do this by using timeBeginPeriod/timeEndPeriod based on whether there is requestAnimationFrame active, perhaps just in the current tab.  (Or a low setInterval?)

We can then evaluate the rAF algorithm and see if we need to do the adjustments that chrome does as descried above; I think this will get handled by the timer infrastructure.

> 2)  Reland the fix in bug 590422.

Once we have the above, then look into doing this.  We might regress whatever DHTML test, but we should actually look and see what the test is, and see if it's an actual regression or just on paper.  (e.g. was it depending on old timer behaviour of firing a bunch of 0-spaced ticks to 'catch up', thus giving it a pile of frames at once?  should the tests be rewritten to use rAF?)

Any concerns with the above plan?  I'm going to make myself responsible for this bug, unless anyone really objects :)
Comment 63 Boris Zbarsky [:bz] 2012-06-26 07:19:11 PDT
> e.g. was it depending on old timer behaviour of firing a bunch of 0-spaced ticks to
> 'catch up', thus giving it a pile of frames at once

Yes.  The test was getting 10ms average interframe delay out of a bunch of 0ms and 15ms delays, basically.  It's not just tDHTML; some of the scrolling performance tests were also affected.

> should the tests be rewritten to use rAF?

That depends on what we want to measure.  The user experience the first time I tried to land that bug was that scrolling using timeouts would take longer 50% longer...  So it wasn't just a paper regression.
Comment 64 Vladimir Vukicevic [:vlad] [:vladv] 2012-06-26 08:37:58 PDT
(In reply to Boris Zbarsky (:bz) from comment #63)
> > e.g. was it depending on old timer behaviour of firing a bunch of 0-spaced ticks to
> > 'catch up', thus giving it a pile of frames at once
> 
> Yes.  The test was getting 10ms average interframe delay out of a bunch of
> 0ms and 15ms delays, basically.  It's not just tDHTML; some of the scrolling
> performance tests were also affected.
> 
> > should the tests be rewritten to use rAF?
> 
> That depends on what we want to measure.  The user experience the first time
> I tried to land that bug was that scrolling using timeouts would take longer
> 50% longer...  So it wasn't just a paper regression.

Hmm. Do we care, though? I guess it was using setTimeout(0)?

I'm going to have to get a separate machine to test... after a while, my laptop ends up at 0.5ms timer resolution mode, and there's no point in testing this unless the system is at the default 15ms.  Could be virtualbox; 'powercfg -energy' says that the google talk plugin at least requests 5ms, though not 0.5ms.
Comment 65 Boris Zbarsky [:bz] 2012-06-26 08:51:41 PDT
> Hmm. Do we care, though? 

That's a good question, yes.  Real pages _do_ do this sort of thing.

> I guess it was using setTimeout(0)?

It might have been using 10, not 0.  But something along those lines.
Comment 66 Alon Zakai (:azakai) 2012-06-26 10:09:19 PDT
(In reply to Boris Zbarsky (:bz) from comment #63)
> The user experience the first time
> I tried to land that bug was that scrolling using timeouts would take longer
> 50% longer...  So it wasn't just a paper regression.

That makes sensek, if we remove the timer adjustment entirely then events will simply fire later than they should - because currently the timer adjustment is what tries to make them fire at the right time, when there is load. So removing the timer adjustment will make us less snappy.

An alternative is to add a few lines of code to the RAF implementation that prevents frames that are too short (if it's too soon, schedule a callback for the right time). This would not regress general responsiveness, and would still prevent frames that are too short.
Comment 67 Vladimir Vukicevic [:vlad] [:vladv] 2012-06-26 13:07:11 PDT
What should we look at as success criteria here?

1) 60fps, +/- 1ms, when run on a windows system with default 15ms platform timer granularity (whether we change it internally or not)

2) No too-short frames

Is testing with the benchmark that's here good enough, maybe with 10ms of logic time "work"?
Comment 68 Jeff Gilbert [:jgilbert] 2012-06-26 15:57:53 PDT
For testing, I should think we should shoot for roughly 60fps (16ms is 62.5fps) with <15ms of 'logic', with FPS slowly dropping below 60 as we add more 'logic'.
Comment 69 James Robinson 2012-06-26 18:07:38 PDT
(In reply to Vladimir Vukicevic [:vlad] [:vladv] from comment #67)
> What should we look at as success criteria here?
> 
> 1) 60fps, +/- 1ms, when run on a windows system with default 15ms platform
> timer granularity (whether we change it internally or not)
> 
> 2) No too-short frames
> 
> Is testing with the benchmark that's here good enough, maybe with 10ms of
> logic time "work"?


I think the success criteria for RAF is exactly 1 frame is produced in every vsync interval of the display.  It's possible - deceptively easy, in fact - to have an average framerate that exactly matches the display but miss a ton of frames by producing them just before/just after vsync lines, producing a terrible user experience even though the averages look good.
Comment 70 Vladimir Vukicevic [:vlad] [:vladv] 2012-06-26 18:24:02 PDT
Yep, I think getting it lined up with vsync will be a followup though -- we need to get to solid 16ms first, and then get it aligned correctly.
Comment 71 Nat Duca 2012-06-26 18:31:33 PDT
I buy that. You might shoot for a solid 1000/60.0 == 16.6' then, so that your timer code correctly handles the fact that the OS will never give you what you asked for.
Comment 72 Vladimir Vukicevic [:vlad] [:vladv] 2012-06-29 11:52:22 PDT
Created attachment 637954 [details]
updated benchmark

Added an option to run long, and some additional display to the results (fps, % for each delay time)
Comment 73 Vladimir Vukicevic [:vlad] [:vladv] 2012-06-29 12:01:20 PDT
Created attachment 637957 [details] [diff] [review]
patch

Here's a first attempt at a patch.

- Instead of creating a new timer per RefreshDriver, it just creates two -- one regular, one throttled -- and shares them amongst all RefreshDrivers.

- It enters 1ms-precision mode on windows while requestAnimationFrame is active/outstanding, leaves it afterwards.

- It tracks the desired interval and similar as doubles with sub-ms precision, and then does explicitly figures out when to schedule the next timer instead of relying on nsITimer's interval mode.

- It breaks tests that use advanceTimeAndRefresh.  :(  I'll fix this later, likely with another RefreshDriver implementation (and making some of the calls virtual).

- I noticed that the event tracer often kicks in and throws things for a loop; I added an env var, MOZ_NO_EVENT_TRACER, to test what things are like without it.

With this patch, I can get much smoother framerates with the testcase on Windows.  Still not as smooth as I'd like, but much better.  It doesn't even need the patch to remove the timer filter stuff, though we do see those effects on occasion (which is annoying; we really shouldn't, because we're not doing intervals!).

There's an #if 0 in there too; right now, if the amount of work being done exceeds one frame time, we'll still delay the next frame until the next natural frame tick.  So if you do 20ms of work, your next frame will show up at the 32ms mark.  This is a tossup -- if we had vsync, this is what would happen, but it will also likely "regress" some performance benchmarks.  Open to suggestions on which way we should go.

Note that with this approach we have the option of creating another refresh timer which uses vsync, or uses its own timers separate from nsTimer; however, they all have to post back to the main thread's event loop for execution.
Comment 74 Vladimir Vukicevic [:vlad] [:vladv] 2012-07-10 09:04:15 PDT
bz: feedback ping!
Comment 75 Boris Zbarsky [:bz] 2012-07-12 20:59:02 PDT
Comment on attachment 637957 [details] [diff] [review]
patch

I'm really sorry for the lag here.  I really needed to sit down and think this stuff through.  :(

Tick() probably needs to hold refs to the refresh drivers in the array, right?

This is no longer doing backoff on background refresh drivers, as far as I can tell.  It just ticks them at 1Hz.  Is that OK?

Generally looks OK, I think....
Comment 76 Vladimir Vukicevic [:vlad] [:vladv] 2012-07-13 05:34:52 PDT
(In reply to Boris Zbarsky (:bz) from comment #75)
> Comment on attachment 637957 [details] [diff] [review]
> patch
> 
> I'm really sorry for the lag here.  I really needed to sit down and think
> this stuff through.  :(

No problem!

> Tick() probably needs to hold refs to the refresh drivers in the array,
> right?

Hmm.. I convinced myself that it didn't need to, but perhaps erroneously.  My thinking was that Tick() will always be called on the main thread, and that you can only remove refresh drivers on the main thread... so unless one Tick() can remove a later refresh driver, it should be safe to not take refs.  (Though, now that I think about it -- the callback could close another window or something, which could cause a later refresh driver to be removed maybe?)

> This is no longer doing backoff on background refresh drivers, as far as I
> can tell.  It just ticks them at 1Hz.  Is that OK?

That code looked a little weird to me, but maybe not:

-  rate = mThrottled ? DEFAULT_THROTTLED_FRAME_RATE : DEFAULT_FRAME_RATE;
-  PRInt32 interval = NSToIntRound(1000.0/rate);
-  if (mThrottled) {
-    interval = NS_MAX(interval, mLastTimerInterval * 2);
-  }

Was the intent to keep slowing down background refresh drivers?  1s (1Hz from DEFAULT_THROTTLED_FRAME_RATE) -> 2s -> 4s -> 8s -> 16s etc?

If so, ticking them along at 1Hz is probably not good, especially for mobile power usage.  I can probably do something better there.
Comment 77 Boris Zbarsky [:bz] 2012-07-13 09:03:36 PDT
> the callback could close another window or something

Yeah, callbacks can run arbitrary script, including spinning the event loop, etc.  Think sync XHR or alert() or opening a modal window in the callback.  ;)

> Was the intent to keep slowing down background refresh drivers?

Yes, precisely.
Comment 78 Vladimir Vukicevic [:vlad] [:vladv] 2012-07-13 11:03:22 PDT
(In reply to Boris Zbarsky (:bz) from comment #77)
> > Was the intent to keep slowing down background refresh drivers?
> 
> Yes, precisely.

My first thought to do this is to still have them all be driven off the same timer, but have it slow down globally, and reset (globally) to 1s whenever a new one is added.  That's probably the simplest; it does mean if you have 50 tabs in the background for a while, and then you background a new one, they'll all start back up again at 1Hz, but I don't think that's that big of a deal.

That case is somewhat crappy anyway, now that I think about it -- having *all* your background tabs trigger off the same timer is either good or bad, depending on how you look at it.  It's bad in that they'll all trigger at the same time, potentially causing a user visible delay.  It's good because they'll all trigger at the same time, letting the system be idle the rest of the time, instead of being interleaved.
Comment 79 Boris Zbarsky [:bz] 2012-07-13 12:28:25 PDT
> and reset (globally) to 1s whenever a new one is added.

Yeah, this is probably fine.

I'm still moderately interested in just turning off the refresh driver altogether in background tabs, but I haven't been able to sell people on it yet.  ;)
Comment 80 :Ehsan Akhgari (busy, don't ask for review please) 2012-07-17 20:04:05 PDT
(In reply to comment #79)
> I'm still moderately interested in just turning off the refresh driver
> altogether in background tabs, but I haven't been able to sell people on it
> yet.  ;)

What are the arguments of the people who oppose this idea?
Comment 81 Boris Zbarsky [:bz] 2012-07-17 20:34:59 PDT
Refresh driver drives some things that are detectable from script, in particular firing of SMIL and transitionend and animationend events.  So if turned it off, script would never get those events until the tab goes back to the foreground.

I happen to think this is perfectly fine, personally.
Comment 82 :Ehsan Akhgari (busy, don't ask for review please) 2012-07-17 20:39:51 PDT
(In reply to comment #81)
> Refresh driver drives some things that are detectable from script, in
> particular firing of SMIL and transitionend and animationend events.  So if
> turned it off, script would never get those events until the tab goes back to
> the foreground.
> 
> I happen to think this is perfectly fine, personally.

Can the script detect that this stuff has actually happened without receiving the event?  If yes, then I don't think that's fine (unless you argue that very few people use them in practice...)
Comment 83 Boris Zbarsky [:bz] 2012-07-17 20:44:27 PDT
> Can the script detect that this stuff has actually happened without receiving the event?

The script can detect that time has passed (via setTimeout and Date.now()) and it can detect that animations/transitions have not ticked yet (by checking computed style).  But we won't end up in a situation where the transition _has_ ticked to the end and the event has not fired.  So the most a script can detect is that the time has come for the animation to end, as measured on the wall-clock, but it hasn't ended yet per computed style.  Of course there's no real spec for when animations/transitions tick....
Comment 84 :Ehsan Akhgari (busy, don't ask for review please) 2012-07-17 20:55:18 PDT
(In reply to comment #83)
> > Can the script detect that this stuff has actually happened without receiving the event?
> 
> The script can detect that time has passed (via setTimeout and Date.now()) and
> it can detect that animations/transitions have not ticked yet (by checking
> computed style).  But we won't end up in a situation where the transition _has_
> ticked to the end and the event has not fired.  So the most a script can detect
> is that the time has come for the animation to end, as measured on the
> wall-clock, but it hasn't ended yet per computed style.  Of course there's no
> real spec for when animations/transitions tick....

Hmm, OK, then this urges me to fall on your side in this battle!  But that's the topic of another bug...
Comment 85 Vladimir Vukicevic [:vlad] [:vladv] 2012-07-18 05:53:49 PDT
Yep, the way I look at it is that you'd see similar behaviour if the user just suspended their laptop.  I think we should treat background tabs basically exactly the same way if they haven't been visited in a while.
Comment 86 :Ehsan Akhgari (busy, don't ask for review please) 2012-08-15 10:27:14 PDT
It would be great if we can get a hacks blog post about this when it gets in!
Comment 87 Vladimir Vukicevic [:vlad] [:vladv] 2012-08-31 12:47:48 PDT
Created attachment 657410 [details] [diff] [review]
use custom timer for refresh driver

Ok, here's an updated patch.  Things that are changed:

- the RefreshDriverTimer class is now a virtual base class.  There are two concrete implementations:
-- PreciseRefreshDriverTimer: this implements the core algorithm described here, and is similar to the google approach.  It uses one-shot nsITimers.
-- InactiveRefreshDriverTimer: this is for inactive pages; it doubles the time of each firing, up to 15min (currently set just in the code), at which point it stops firing anything.  It resets back to 1s any time a new timer is made inactive.

- the testing functions work and tests pass.

With the RefreshDriverTimer class, we should be able to implement vsync in a straightforward way in a followup patch.

On windows, a debug build hits 60fps on the benchmark with over 95% of the frames being 16 or 17ms with no nsTimer changes.  (MOZ_EVENT_TRACE from the profiler was kicking in occasionally which can cause hurt.)

Try server run is in progress at https://tbpl.mozilla.org/?tree=Try&rev=410aea338816
Comment 88 Robert O'Callahan (:roc) (Exited; email my personal email if necessary) 2012-09-02 19:20:51 PDT
Comment on attachment 657410 [details] [diff] [review]
use custom timer for refresh driver

Review of attachment 657410 [details] [diff] [review]:
-----------------------------------------------------------------

Looks good, but we need some comments describing the requirements and the design.

::: layout/base/nsRefreshDriver.cpp
@@ +61,5 @@
> +    NS_ASSERTION(!mRefreshDrivers.Contains(aDriver), "AddRefreshDriver for a refresh driver that's already in the list!");
> +    mRefreshDrivers.AppendElement(aDriver);
> +
> +    if (mRefreshDrivers.Length() == 1)
> +      StartTimer();

{}

@@ +71,5 @@
> +    NS_ASSERTION(mRefreshDrivers.Contains(aDriver), "RemoveRefreshDriver for a refresh driver that's not in the list!");
> +    mRefreshDrivers.RemoveElement(aDriver);
> +
> +    if (mRefreshDrivers.Length() == 0)
> +      StopTimer();

{}

@@ +117,5 @@
> +        continue;
> +
> +      drivers[i]->Tick(jsnow, now);
> +    }
> +    //printf_stderr("(done ticks)\n");

Get rid of these printf_stderrs or convert them to PR_LOG

@@ +211,5 @@
> +    }
> +#endif
> +
> +    // calculate lateness
> +    TimeDuration lastTickLateness = aNowTime - mTargetTime;

Remove this since you're not using it? Or else inline it into the logging call, if you're going to keep that.

@@ +265,5 @@
> +    uint32_t delay = static_cast<uint32_t>(mNextTickDuration);
> +    mTimer->InitWithFuncCallback(TimerTick, this, delay, nsITimer::TYPE_ONE_SHOT);
> +
> +    // double the next tick time
> +    mNextTickDuration += mNextTickDuration;

*= 2 reads better

@@ +486,5 @@
>  
> +  // we got here because we're either adjusting the time *or* we're
> +  // starting it for the first time
> +  if (mActiveTimer)
> +    StopTimer();

Just call StopTimer() unconditionally.

@@ +502,5 @@
> +  mActiveTimer->RemoveRefreshDriver(this);
> +  mActiveTimer = nullptr;
> +
> +  if (mRequestedHighPrecision)
> +    SetHighPrecisionTimers(false);

{}
Comment 89 Robert O'Callahan (:roc) (Exited; email my personal email if necessary) 2012-09-02 19:21:22 PDT
(In reply to Vladimir Vukicevic [:vlad] [:vladv] from comment #87)
> - the RefreshDriverTimer class is now a virtual base class.

It's not a virtual base class, which is good because virtual inheritance is evil :-)
Comment 90 Vladimir Vukicevic [:vlad] [:vladv] 2012-09-04 13:26:38 PDT
Created attachment 658203 [details] [diff] [review]
use custom timer for refresh driver (v2)

Updated based on review comments.  Added more comments, {}'s, and added useful logging (for debug builds, at least, since FORCE_PR_LOGGING is not enabled).
Comment 91 Robert O'Callahan (:roc) (Exited; email my personal email if necessary) 2012-09-04 16:37:21 PDT
Comment on attachment 658203 [details] [diff] [review]
use custom timer for refresh driver (v2)

Review of attachment 658203 [details] [diff] [review]:
-----------------------------------------------------------------

::: layout/base/nsRefreshDriver.h
@@ +269,5 @@
>    };
> +
> +  friend class mozilla::RefreshDriverTimer;
> +
> +  void SetHighPrecisionTimers(bool aEnable);

SetHighPrecisionTimersEnabled
Comment 92 Olli Pettay [:smaug] 2012-09-05 02:38:19 PDT
Comment on attachment 658203 [details] [diff] [review]
use custom timer for refresh driver (v2)

Nit, you're using inconsistent coding style.
Try to use Gecko coding style.

>+class RefreshDriverTimer {
{ should be in the next line

>+public:
>+  RefreshDriverTimer(double aRate) {
ditto. Same also elsewhere with classes and methods.
Comment 93 Olli Pettay [:smaug] 2012-09-05 02:50:12 PDT
Comment on attachment 658203 [details] [diff] [review]
use custom timer for refresh driver (v2)

Doesn't this end up running all the inactive drivers in a loop?
That could be rather bad, especially if one has several background tabs.
Pause times might become significantly long.
Comment 94 Olli Pettay [:smaug] 2012-09-05 04:08:51 PDT
Comment on attachment 658203 [details] [diff] [review]
use custom timer for refresh driver (v2)

I see you mentioned the background tab problem yourself in comment 78.
It really needs to be fixed. We're trying to make
bg tabs to cause less jank, not more.

+    nsTArray<nsRefreshDriver*> drivers(mRefreshDrivers);
+    for (size_t i = 0; i < drivers.Length(); ++i) {
+      // don't poke this driver if it's in test mode
+      if (drivers[i]->mTestControllingRefreshes)
+        continue;
+
+      drivers[i]->Tick(jsnow, now);
+    }
This screams about sg:crit bug ;)
Comment 95 Olli Pettay [:smaug] 2012-09-05 05:01:56 PDT
Comment on attachment 658203 [details] [diff] [review]
use custom timer for refresh driver (v2)

>@@ -7,11 +7,27 @@
> /*
>  * Code to notify things that animate before a refresh, at an appropriate
>  * refresh rate.  (Perhaps temporary, until replaced by compositor.)
>+ *
>+ * Each document has its own RefreshDriver, 
Btw, this comment is not correct.
Comment 96 Vladimir Vukicevic [:vlad] [:vladv] 2012-09-05 07:17:04 PDT
(In reply to Olli Pettay [:smaug] from comment #94)
> Comment on attachment 658203 [details] [diff] [review]
> use custom timer for refresh driver (v2)
> 
> I see you mentioned the background tab problem yourself in comment 78.
> It really needs to be fixed. We're trying to make
> bg tabs to cause less jank, not more.

Depends; having multiple background tabs all trigger on their own timers could cause more overall slowness.  However, an alternative could be to only run *one* driver per timer tick, and only double the time after all drivers have ticked once.  This means that initially we'll have only one background tab ticking every second; the amount of time between ticks for each individual tab would depend on the number of background tabs.  But this doesn't sound bad to me, given that they're in the background anyway.  I'm happy to implement that if that sounds better.

> +    nsTArray<nsRefreshDriver*> drivers(mRefreshDrivers);
> +    for (size_t i = 0; i < drivers.Length(); ++i) {
> +      // don't poke this driver if it's in test mode
> +      if (drivers[i]->mTestControllingRefreshes)
> +        continue;
> +
> +      drivers[i]->Tick(jsnow, now);
> +    }
> This screams about sg:crit bug ;)

Can you explain?

> >@@ -7,11 +7,27 @@
> > /*
> >  * Code to notify things that animate before a refresh, at an appropriate
> >  * refresh rate.  (Perhaps temporary, until replaced by compositor.)
> >+ *
> >+ * Each document has its own RefreshDriver, 
> Btw, this comment is not correct.

What's the correct version?
Comment 97 Olli Pettay [:smaug] 2012-09-05 08:09:15 PDT
(In reply to Vladimir Vukicevic [:vlad] [:vladv] from comment #96)
> Depends; having multiple background tabs all trigger on their own timers
> could cause more overall slowness.
Well, bg timers run very rarely. And when they run, they just handle one tab.


>  However, an alternative could be to only
> run *one* driver per timer tick, and only double the time after all drivers
> have ticked once.  This means that initially we'll have only one background
> tab ticking every second; the amount of time between ticks for each
> individual tab would depend on the number of background tabs.  But this
> doesn't sound bad to me, given that they're in the background anyway.
This might work, although in my case it would mean about one minute delay immediately when
tab goes to background. That is a lot. Also, each bg tab would be handled the same way.
I think those tabs which have been in bg longer, should get longer intervals.


> > This screams about sg:crit bug ;)
> 
> Can you explain?
Make sure you keep stuff alive when calling JS callbacks.

> > Btw, this comment is not correct.
> 
> What's the correct version?
There is a refreshdriver per top level chrome document, and refresh driver for top level content document.
Comment 98 Olli Pettay [:smaug] 2012-09-05 09:56:17 PDT
how does this all work with dlbi?
RefreshDrivers end up to the timer's list in random(?) order. Should we guarantee that
top-level chrome refresh drivers are handled after content drivers?
Comment 99 Vladimir Vukicevic [:vlad] [:vladv] 2012-09-05 12:24:59 PDT
Created attachment 658586 [details] [diff] [review]
use custom timer, part 2

Part 2, with smaug's comments incorporated:

- Uses a nsTArray<nsRefPtr<nsRefreshDriver> > instead of bare nsRefreshDriver*s
- The inactive timer ticks only one driver per tick, keeping the same rate until all of them have been poked once; then the rate doubles as before.  This eliminates the possibility that lots of background tabs will cause lots of jank while still poking them on a regular basis -- less frequently based on the number of tabs, which seems like the right thing to do.
Comment 100 Vladimir Vukicevic [:vlad] [:vladv] 2012-09-05 12:25:46 PDT
I can roll part 1 and part 2 into a single patch if that's easier (and will probably do that for eventual landing); wanted to make the followup easier for roc and others who have looked at the original patch.
Comment 101 Vladimir Vukicevic [:vlad] [:vladv] 2012-09-05 12:31:22 PDT
(In reply to Olli Pettay [:smaug] from comment #98)
> how does this all work with dlbi?
> RefreshDrivers end up to the timer's list in random(?) order. Should we
> guarantee that
> top-level chrome refresh drivers are handled after content drivers?

I don't know -- you tell me if that's a thing we should do, or if we need to do it in a followup :)  It seems like they would fire in basically a random order before as well (with each one having its own timer), no?
Comment 102 Olli Pettay [:smaug] 2012-09-05 12:42:26 PDT
Comment on attachment 658586 [details] [diff] [review]
use custom timer, part 2


-      if (drivers[i]->mTestControllingRefreshes)
+      if (drivers[i]->IsTestControllingRefreshesEnabled())
         continue;
if (expr) {
  stmt;
}

>+  static void TickDriver(nsRefreshDriver* driver, int64_t jsnow, TimeStamp now) {
aDriver, aJSNow, aNow
And { goes to the next line.
Though, I don't understand the reason for the method.


>+  /* Runs just one driver's tick. */
>+  void TickOne() {
{ goes to the next line

>+    int64_t jsnow = JS_Now();
>+    TimeStamp now = TimeStamp::Now();
>+
>+    ScheduleNextTick(now);
>+
>+    mLastFireEpoch = jsnow;
>+    mLastFireTime = now;
>+
>+    nsTArray<nsRefPtr<nsRefreshDriver> > drivers(mRefreshDrivers);
>+    if (mNextDriverIndex < drivers.Length() &&
>+        !drivers[mNextDriverIndex]->IsTestControllingRefreshesEnabled())
>+    {
{ should be in the previous line ('if' is a control structure, not method or class).


>+  static void TimerTickOne(nsITimer *aTimer, void *aClosure) {
* goes with types and { should be in the next line.


>+  bool IsTestControllingRefreshesEnabled() const
>+  {
>+      return mTestControllingRefreshes;
2 space indentation everywhere, please.

r=me with nits fixed.


Ask roc or mattwoodrow about dlbi
Comment 103 Robert O'Callahan (:roc) (Exited; email my personal email if necessary) 2012-09-05 19:16:23 PDT
Comment on attachment 658586 [details] [diff] [review]
use custom timer, part 2

Review of attachment 658586 [details] [diff] [review]:
-----------------------------------------------------------------

I don't think this impacts DLBI.

::: layout/base/nsRefreshDriver.cpp
@@ +354,5 @@
> +    {
> +      TickDriver(drivers[mNextDriverIndex], jsnow, now);
> +    }
> +
> +    mNextDriverIndex++;

This approach is somewhat weird IMHO. Making the delay depend on the number of background tabs open doesn't seem very well-motivated. Spacing them out does seem like a good idea though.

How about we give each background tab an independent timer?
Comment 104 Olli Pettay [:smaug] 2012-09-05 22:54:58 PDT
(In reply to Robert O'Callahan (:roc) (Mozilla Corporation) from comment #103)
> 
> How about we give each background tab an independent timer?
That is the current behavior (which is IMO ok).
Comment 105 Boris Zbarsky [:bz] 2012-09-06 22:18:14 PDT
Comment on attachment 658203 [details] [diff] [review]
use custom timer for refresh driver (v2)

The comment about each document having a refresh driver is wrong: there's one for chrome and one per tab, basically.  Though if we really wanted to we could do it that way if we're coalescing timers.  I think the main reason we did the refresh driver sharing was to coalesce timers for everything in a given tab.

The constructor for RefreshDriverTtimer should document what aRate is.  Is it Hz?  Is it ms?  Something else?

Tick()'s documentation means to say "poking all the refresh drivers"?

> It always schedules ticks on multiples of aRate

I assume you've checked that this does not regress the testcase in bug 630127?  Seems like it should be ok, but worth testing.

>+    int numElapsedIntervals = static_cast<int>((aNowTime - mTargetTime).ToMilliseconds() / mRateMilliseconds);

  int numElapsedIntervals = static_cast<int>((aNowTime - mTargetTime) / mRateDuration);

?

And maybe TimeDuration should have a way to multiply by a double that we could use in the lastEffectiveTick calculation?

For that matter, I'm not sure why we need the lastEffectiveTick temporary.  Can't we just add (numElapsedIntervals+1)*mRateDuration to mTargetTime without the temporary?

In the two interval getters, the prefName should probably be a static const char[], now that it's not conditional.

> nsRefreshDriver::AdvanceTimeAndRefresh(int64_t aMilliseconds)

Why are we adding a test for aMilliseconds > 0 here?

Why do we not need to tick in RestoreNormalRefresh?

Why is the removal of EnsureTimerStarted from the MostRecentRefresh getters ok?

> nsRefreshDriver::EnsureTimerStarted(bool aAdjustingTimer)

The comment on the mFrozen || !mPresContext case shouldn't mention "already been started".

r=me modulo the concerns smaug had.
Comment 106 Boris Zbarsky [:bz] 2012-09-06 22:18:50 PDT
Comment on attachment 658586 [details] [diff] [review]
use custom timer, part 2

I think this is OK.
Comment 107 Vladimir Vukicevic [:vlad] [:vladv] 2012-09-09 08:20:54 PDT
(In reply to Boris Zbarsky (:bz) from comment #105)
> Comment on attachment 658203 [details] [diff] [review]
> use custom timer for refresh driver (v2)
> 
> The comment about each document having a refresh driver is wrong: there's
> one for chrome and one per tab, basically.  Though if we really wanted to we
> could do it that way if we're coalescing timers.  I think the main reason we
> did the refresh driver sharing was to coalesce timers for everything in a
> given tab.

Fixed.

> The constructor for RefreshDriverTtimer should document what aRate is.  Is
> it Hz?  Is it ms?  Something else?

Done.  It's ms (as a double).  (Maybe aInterval is a better arg name?)

> Tick()'s documentation means to say "poking all the refresh drivers"?

Yep, fixed.

> > It always schedules ticks on multiples of aRate
> 
> I assume you've checked that this does not regress the testcase in bug
> 630127?  Seems like it should be ok, but worth testing.

Yep, seems fine; tested.  Even better than without, as the testcase is more consistent 60fps in many cases.

> >+    int numElapsedIntervals = static_cast<int>((aNowTime - mTargetTime).ToMilliseconds() / mRateMilliseconds);
> 
>   int numElapsedIntervals = static_cast<int>((aNowTime - mTargetTime) /
> mRateDuration);
> 
> ?
> 
> And maybe TimeDuration should have a way to multiply by a double that we
> could use in the lastEffectiveTick calculation?
> 
> For that matter, I'm not sure why we need the lastEffectiveTick temporary. 
> Can't we just add (numElapsedIntervals+1)*mRateDuration to mTargetTime
> without the temporary?

Yeah, I added a multiply-by-double to TimeDuration.  Mildly worried about overflow, but oh well.  Combined some of the other calculations as well.

> In the two interval getters, the prefName should probably be a static const
> char[], now that it's not conditional.

I just got rid of the separate prefName variables; it was only used once.

> > nsRefreshDriver::AdvanceTimeAndRefresh(int64_t aMilliseconds)
> 
> Why are we adding a test for aMilliseconds > 0 here?

Sillyness.  Earlier bug that left mMostRecentRefresh to be 0.  Fixed.

> Why do we not need to tick in RestoreNormalRefresh?

I didn't think we'd want to -- we start up the normal timer again, and it'll tick when appropriate.

> Why is the removal of EnsureTimerStarted from the MostRecentRefresh getters
> ok?

They're updated in the constructor, in Tick, and in the test function -- the constructor initializes it to "Now", which isn't technically correct; but I think initializing it to 0 would lead to problems elsewhere in code.  We don't need the timer to be running to find out the most recent refresh.

> > nsRefreshDriver::EnsureTimerStarted(bool aAdjustingTimer)
> 
> The comment on the mFrozen || !mPresContext case shouldn't mention "already
> been started".

Fixed.  Thanks for the review!  New patch soon.
Comment 108 Boris Zbarsky [:bz] 2012-09-09 08:44:11 PDT
> I didn't think we'd want to

That depends on whether tests measure things immediately after restoring normal refresh, I expect.

> We don't need the timer to be running to find out the most recent refresh.

Hmm.  I think it used to be that we wanted to make sure the timer started when we handed out recent refresh times so things would sync op properly, but with the new timer setup maybe it's ok.

Looking forward to updated patch!  This is looking good.
Comment 109 Vladimir Vukicevic [:vlad] [:vladv] 2012-09-10 11:12:39 PDT
Created attachment 659787 [details] [diff] [review]
updated combined patch

Updated, combined patch.  Try server results at https://tbpl.mozilla.org/?tree=Try&rev=e2133c25f150 -- I'm still looking through all the failures.  All of them seem to be known intermittent ones, but there's still a lot of them; not sure what to make of that.

Builds are at http://ftp.mozilla.org/pub/mozilla.org/firefox/try-builds/vladimir@pobox.com-e2133c25f150 if anyone wants to try them.
Comment 110 :Ehsan Akhgari (busy, don't ask for review please) 2012-09-19 14:14:50 PDT
Comment on attachment 659787 [details] [diff] [review]
updated combined patch

Review of attachment 659787 [details] [diff] [review]:
-----------------------------------------------------------------

::: layout/base/nsRefreshDriver.cpp
@@ +47,4 @@
>  
>  using namespace mozilla;
>  
> +#ifdef PR_LOGGING

Please verify that this won't be defined in regular release builds so that we can avoid the logging overhead in the cases where performance matters.

@@ +198,5 @@
> +protected:
> +
> +  virtual void StartTimer()
> +  {
> +    mTimer = do_CreateInstance(NS_TIMER_CONTRACTID);

I think it's best to not create a timer object each time this code is called, can you move this to the ctor perhaps?

@@ +379,5 @@
> +
> +    mNextDriverIndex++;
> +  }
> +
> +  static void TimerTickOne(nsITimer* aTimer, void* aClosure)

It would be good if we could ensure that calls to this function and ScheduleNextTick are always interleaved.

::: xpcom/ds/TimeStamp.h
@@ +92,5 @@
> +  TimeDuration operator*(const uint32_t aMultiplier) const {
> +    return TimeDuration::FromTicks(mValue * int64_t(aMultiplier));
> +  }
> +  TimeDuration operator*(const int64_t aMultiplier) const {
> +    return TimeDuration::FromTicks(mValue * int64_t(aMultiplier));

Nit: this cast seems unnecessary.
Comment 111 Vladimir Vukicevic [:vlad] [:vladv] 2012-09-20 13:58:37 PDT
So, an update -- I'm working through fixing a few of the test issues; the only thing so far that isn't a test issue is a few bidi reftest failures, such as layout/reftests/bidi/779003-1.html .  It's almost as if bidi resolution isn't happening correctly; on reload it's a tossup on whether it renders correctly or not (failures always look the same).  But looking at the observer callbacks in the refresh driver, I don't see anything different being called between the good and bad cases!  nsBidiPresUtils::Resolve is also being called an identical number of times for the two cases.  Not really sure how to track it down, any suggestions would be appreciated.
Comment 112 Vladimir Vukicevic [:vlad] [:vladv] 2012-09-20 16:15:16 PDT
Cc'ing smontagu for potential bidi help -- see comment #111.
Comment 113 :Ehsan Akhgari (busy, don't ask for review please) 2012-09-20 16:22:15 PDT
(In reply to comment #112)
> Cc'ing smontagu for potential bidi help -- see comment #111.

So my ideas that I talked to Vlad about on IRC are:

1) Compare the frametrees in both cases.
2) Compare the mozFlushTypes passed to FlushPendingNotifications.

One thing which _could_ be happening here is that we might be hitting a bug in bidi resolution depending on when we exactly reflow.
Comment 114 Vladimir Vukicevic [:vlad] [:vladv] 2012-09-21 10:56:32 PDT
Good news!  I just filed bug 793233 for the failing reftest, because it fails in the same way in a nightly build in the face of a trivial dynamic modification.  Ehsan's theory is that reflow is being triggered at a different point in time due to my changes and so is showing the problem more aggressively.
Comment 115 Vladimir Vukicevic [:vlad] [:vladv] 2012-10-01 14:49:31 PDT
Created attachment 666731 [details] [diff] [review]
final updated patch

Final patch; will carry forward bz's and smaug's reviews, but would like roc's final r+ as well.  This is clean on the tryserver, with the following patches applied (all will be part of the same push):

bug 793233: mark two bidi reftests as random
bug 796156: fix test_bug574663 to drive scrolling directly via advanceTimeAndRefresh
bug 768838: fix intermittent test_bug549262 failure (also drive scrolling directly via advanceTimeAndRefresh)
Comment 116 Vladimir Vukicevic [:vlad] [:vladv] 2012-10-02 07:55:22 PDT
https://hg.mozilla.org/integration/mozilla-inbound/rev/16ae4d5d27d7

Fingers crossed.
Comment 117 Ed Morley [:emorley] 2012-10-02 09:40:47 PDT
Backed out for causing:
https://tbpl.mozilla.org/php/getParsedLog.php?id=15744043&tree=Mozilla-Inbound
...on all platforms. Was this tested locally before pushing?

https://hg.mozilla.org/integration/mozilla-inbound/rev/c0ef7fe877d2
Comment 118 Ed Morley [:emorley] 2012-10-02 09:49:02 PDT
(Seems like not all platforms & just opt; was all of the mochitest-other at time of posting :-))
Comment 119 Vladimir Vukicevic [:vlad] [:vladv] 2012-10-02 10:13:53 PDT
Yeah, it had extensive tryserver runs :p  The various tests are badly written, and depend on not-guaranteed timing.  Working on fixing the tests now.
Comment 120 :Ehsan Akhgari (busy, don't ask for review please) 2012-10-05 12:45:58 PDT
FWIW, this seems to have regressed a bunch of Talos benchmarks, most notably trace malloc allocs increase of about 1% across the board, some ~30% tscroll regressions on 10.6 and 10.7, and others so far.

https://groups.google.com/d/topic/mozilla.dev.tree-management/aT9oWSzjhAA/discussion

Although it seems to have improved tscroll on Linux, for example...
Comment 121 Vladimir Vukicevic [:vlad] [:vladv] 2012-10-05 12:53:56 PDT
Hm, I'll have to look at how tscroll is written.  I don't remember seeing that big percentages on try server runs, but I can see this changing tscroll for sure.  Especially since we're doing everything on 16ms steps now...
Comment 122 :Ehsan Akhgari (busy, don't ask for review please) 2012-10-05 13:34:36 PDT
There seems to be a bunch of Ts, Paint regressions as well.
Comment 123 Randell Jesup [:jesup] 2012-10-05 17:58:15 PDT
Be careful looking at Talos numbers for things like this.  Changing the number and timing of animations/paints/etc can make the code do more (or less) work.  For example, if an anim was running at 8fps because of this, and the patch fixed it to run at the "proper" 10fps, you'll see a lot of extra paints, allocs, CPU use, etc.

That isn't to say it isn't a bug - it might be - but it might not too.  Talos may not be a great test for this type of change.
Comment 124 Ed Morley [:emorley] 2012-10-06 12:50:59 PDT
https://hg.mozilla.org/mozilla-central/rev/739aff49b8bb
Comment 125 Ed Morley [:emorley] 2012-10-08 03:01:31 PDT
Backed out for turning WinXP debug m-oth permaorange (and yes; our WinXP coalescing plus dire mochitest-other intermittent failure situation mean that we didn't notice for 3 days!):
https://tbpl.mozilla.org/?tree=Mozilla-Inbound&jobname=Rev3%20WINNT%205.1%20mozilla-inbound%20debug%20test%20mochitest-other&rev=739aff49b8bb

Backout:
https://hg.mozilla.org/mozilla-central/rev/9738e5a0190a
Comment 126 Vladimir Vukicevic [:vlad] [:vladv] 2012-10-11 14:17:17 PDT
Once more into the breach... https://hg.mozilla.org/integration/mozilla-inbound/rev/5b727a94eebd (includes crash fix from bug 799242)
Comment 127 Ed Morley [:emorley] 2012-10-12 02:49:34 PDT
(In reply to Vladimir Vukicevic [:vlad] [:vladv] from comment #126)
> Once more into the breach...
> https://hg.mozilla.org/integration/mozilla-inbound/rev/5b727a94eebd
> (includes crash fix from bug 799242)

Caused bug 797263 again.

Am going to need to back this out, but currently cannot push to inbound due to bug 766810 :-(
Comment 128 Ed Morley [:emorley] 2012-10-12 03:28:36 PDT
Backout:
https://hg.mozilla.org/integration/mozilla-inbound/rev/bb55953508db
Comment 129 Mark Rejhon 2012-10-27 15:39:18 PDT
Fix the root bug (it sounds like a bug here), but please do not hard code a 16ms limiter, when requestAnimFrame().

In Safari and Chrome (GPU accelerated mode), requestAnimationFrame on 120Hz computer monitors (e.g. Acer VG236H or Samsung S23A700D) lasts for about 8.333ms, and rightfully so.   Computers running 120Hz monitors are usually i7 computers with powerful GPU's, and the user has intentionally switched to 120Hz mode, so when you synchronize requestAnimationFrame() to VSYNC, _please_ honor the VSYNC.   Laptops generally never run at 120Hz, so there's no power concern.
Comment 130 Mark Rejhon 2012-10-27 15:40:29 PDT
Correction: Asus VG236H (not Acer) is the 120Hz model.
Comment 131 Mark Rejhon 2012-10-27 15:53:29 PDT
Correction: Asus VG236H 120Hz or VG278H/VG278HE 144Hz (not Acer)

P.S. The DEFAULT_THROTTLED_FRAME_RATE has got to go -- it is mutually exclusive from the future goal of being able to run requestAnimationFrame 120 times on a 120Hz computer monitor (like other VSYNC'd browsers already do).

You don't need precise VSYNC, as long as you've got compositing.  The frame timings could be +/- 50% of the VSYNC, and the presentation will still be perfectly smooth.   Chrome's timer jitter during VSYNC is still approximately 1ms on a fast system.  Heck!  You could even poll the scan line register to predict when the next VSYNC will occur (on Windows, it's Direct3D "RasterStatus.ScanLine"), and schedule the timer based on that.  The prediction will be accurate enough.
Comment 132 Mark Rejhon 2012-10-27 15:57:35 PDT
Actually, just ran a small test on Chrome -- it has a framerate throttler when chrome://flags has "Disable GPU VSync" enabled; the throttler is apparently configured to 250 in Chrome.

So, when fixing Mozilla to synchronize to VSYNC, then modify DEFAULT_THROTTLED_FRAME_RATE to a high value well beyond the best computer monitors (240 Hz is probably a future-proof value).
Comment 133 K. Gadd (:kael) 2012-10-27 16:13:06 PDT
Running rAF above 60hz is going to break tons of web games. I don't think you can make a change like that unilaterally without putting it behind a pref or a new API, or at least doing a survey of existing content to see how wide the damage would be.

Can you point to any reasonably complex HTML5 games that work with 120hz rAF?
Comment 134 Mark Rejhon 2012-10-27 16:39:08 PDT
A different perspective: How many users are going to run at 120Hz today?  Not many right now.  Damage will be near zero because requestAnimationFrame will keep running at 60Hz for the vast majority of users (even when the throttle is removed, because now VSYNC is throttling requestAnimationFrame).  

Web games that don't adjust themselves are the ones that "broken".  I agree with Chrome's decision to sync to VSYNC, and web games are slowly adjusting to that. 

HTML5 games using WebGL, works wonderfully at 120Hz.  Nothing needs to be fixed for most of these games, because many WebGL games update 3D game world by the real-world timer.   Framerate fluctuates up and down due to performance anyway, so these games are all naturally framerate adaptive, and they all adapted to 120Hz with no modification -- without the game authors knowing it.

It's the Canvas2D games that are written. Some of the most fiercely-complaining users are the users of emulators running illegal online ROM's.  Other users are of the good HTML5 Canvas2D games are framerate adaptive.  Games that adapt to framerate drops due to system performance, almost also always adapt to unexpectedly high framerate.

The "60" for requestAnimFrame is an artifical number that came from the "60" of the common refresh rate.  Don't we want to honor the original purpose of the number "60"?  We don't want to stay as a 60Hz dinosaur, even when we go beyond to 120 Hz.  Stop being a 60Hz dinosaur.

Again, RISK IS LOW because of the low but gradual adoption rate of 120Hz. :-)
Comment 135 Mark Rejhon 2012-10-27 16:42:09 PDT
Oh, and yes -- there's a lot of 2D games that won't work properly at 120Hz.  Provide a fallback for those.  The early 120Hz users will often be power users, and those will often know how to re-enable a framerate throttle.  However, you're claiming a non-issue.  Combined in all browsers syncing to VSYNC, more than 50% of the browser market sync's to VSYNC, and the games that don't adapt to that, are "broken".
Comment 136 Olli Pettay [:smaug] 2012-10-27 17:00:46 PDT
This bug is not about whether to sync to VSYNC or not, bug 707884 is about that.
And before that bug is fixed, 60Hhz will be used.
Comment 137 K. Gadd (:kael) 2012-10-27 17:05:16 PDT
You're already busting out the ad hominems and broken reasoning over *framerate scheduling*. Chill.

The rAF spec does not clearly specify anything resembling the behavior you describe as "correct", nor does it specify that the games you describe are "broken". If you want the behavior you describe to be standard, perhaps you should work to make it so. http://www.w3.org/TR/animation-timing/#requestAnimationFrame

Even if a game happens to run logic based on elapsed time, that does not actually mean it will work correctly at 120hz (or above). Elapsed time based game logic can easily be subject to accumulation of error due to the fact that the numbers in logic computations get smaller at higher framerates. Accumulated error can cause incorrect game behavior, desyncs in multiplayer games, and even crashes. The only way to be sure that a game works at 120hz is to test it, and nobody without a 120hz monitor can realistically test games this way right now. Please note that this is not the same as a game dropping to 30hz (like you mentioned in #134). A change like this would need to come with a feature for letting developers test their games at other update rates. The low precision of Date.now() and other timing mechanisms also present a problem once you go past 60hz and for things to be robust those games would have to already be using the bleeding-edge high precision timing API.

Lots of games run with fixed frame rates for design reasons or process things in an event-based fashion. Even if those games work with rAF running at 120hz, running it at 120hz is wasteful if the game logic only runs at, say, 30hz.

Your comments about games being wrong because they're written using Canvas or because they're emulators add nothing to the discussion. Please refrain from painting developers who disagree with you as incompetent or criminals.

You can write 'RISK IS LOW' in caps and insist that you're correct but given that you continue to provide no data or even examples of games you're wasting your breath. Building HTML5 games that actually run smooth and work right across all browsers is hard enough without introducing more complication, so there really needs to be a demonstratable upside and clear path to improvement.

For example, what OS, GPU, and browser version did you use for your Safari and Chrome tests? Were the browsers using GPU accelerated rendering and compositing? Did you have any flash player plugins running in any processes on your computer? Were you on a desktop, a laptop on mains power, or a laptop on battery? Does your PC have power management enabled? All of these factors can affect the scheduling behavior you see for requestAnimationFrame in modern browsers. I can trivially convince Chrome to drop rAF to 30hz on my top-end gaming machine due to bugs in their implementation. Simply saying 'it works in Safari and Chrome' is not adequate here.
Comment 138 Vladimir Vukicevic [:vlad] [:vladv] 2012-10-27 17:10:48 PDT
Wrong place for this discussion, but decent discussion to have.  What we have now clearly isn't working; the patch and approach here doesn't make it any worse, and makes it much easier to move to 120hz, 240hz, or whatever.  60Hz is the current target, so that's what it'll be.
Comment 139 Mark Rejhon 2012-10-28 01:24:11 PDT
Thanks, yes, it is a decent discussion.  And apologies, if this is the wrong place.   And yes, the patch here is an improvement.  But it can still be so much better.

Answers to your questions:

>>Even if those games work with rAF running at 120hz, running it at 120hz is wasteful if the game logic only runs at, say, 30hz.
-- Depends. If the parallax logic still smooth scrolls at 120fps, it's worthwhile even if the enemies move only at 30Hz.
-- Conversely, if the control logic is sampled during requestAnimFrame, you get less input latency (1/120sec rather than 1/30sec) even if animations run only at 30Hz.
-- This is a mountain out of a molehill "nitpick"

>>Tests: http://www.playwebgl.com (several of the most popular ones) and a few apps from Chrome (Angry Birds, Plants vs Zombies, etc).  My tests were more than a month ago so I cannot remember all names, but I will endeavour to find time in the next month to start a spreadsheet of observations at 60Hz and 120Hz on one of my systems.

>> OS and GPU
- I tested Opera, Safari, Chrome on their Mac, iOS, and Windows versions, as well as IE10 on Windows.  (IE8 doesn't VSYNC).   I have a Linux box, but it's currently headless so I haven't tested that.  Chrome also claims VSYNC support on both Android and Linux according to chrome://gpu.

>>Flash plugins
- Installed but not running.  Occasionally, the flash banners were running on some sites.

>>2 desktops, 3 laptops, iPod, iPad, 2 Android, PlayBook.  Applicable observations:
- Chrome: VSYNC behavior on all 5 machines: Mac x 2, WindowsXP, Windows7, Windows8.
- Safari: VSYNC behavior on all current Apple platforms (Mac, iPhone 6, iPad 6) but doesn't VSYNC on Windows.
- IE10: VSYNC on Windows8 machine.
- Galaxy SII Android (4.0): VSYNC behavior.
- Old Android (2.3): Does not VSYNC.
- BlackBerry PlayBook: Does not VSYNC.
- Mains power made no differences in all tests, with the sole exception of the WindowsXP laptop which switched downwards to 50Hz mode in battery saver mode.  requestAnimationFrame operated 50 times per second in Chrome.  Also, iOS4 and lower platforms do not VSYNC.  iOS 6 does.  iOS 5 is untested.

I am impressed at how suddenly (in a 12 month time period) the majority of both desktop and mobile browsers are now choosing to VSYNC.

>>"I can trivially convince Chrome to drop rAF to 30hz on my top-end gaming machine due to bugs in their implementation. Simply saying 'it works in Safari and Chrome' is not adequate here."

I haven't seen that lately.   I was only able to do that on the WindowsXP machine and the slower Mac, especially when opening new tabs and windows.  I had lots of difficulty making Chrome on Windows 8 (i7, 3770K CPU, SSD, with Radeon HD6850) slow down, even when loading applications in the background and loading other webpages -- it barely went below VSYNC, with only a few frame drops.  IE10 behaved the same (window resizing didn't slow down requestAnimFrame and animations were full framerate even during rapid window resizing).  Animations in games (whether Angry Birds or the Quake 3 Arena clone) ran blissfully unaware of whatever I was doing on the desktop computer (quad core, good CPU, SSD).  On the other hand, I easily got Chrome to stutter on the Windows XP laptop when I opened a new Chrome window while watching animations in existing WebGL games.

Also, more the reason that games need to optimize for unexpected rates.  Safari ran at only approximately 15Hz on my Windows machines.

_____

Some notes:

It appears that eventually, a group of us could approach W3C to standardize web-based VSYNC method.  Perhaps non-requestAnimationFrame methods -- I'm interested in joining that group.  There's a need.  If requestAnimationFrame *isn't* it, there'll eventually be another standardized method, or a JavaScript method to turn on/off vsync in requestAnimationFrame.  Debate and figure out something that makes you and me and others happy.  This is a worthy, and important discussion, due to web browsers recently (as of 2012) suddenly gaining widespread VSYNC behavior.  Suddenly, something that was formerly not practical, is now suddenly practical.  Maybe you are right, but that does not deny the need of *some* kind of mechanism.

One possible interim compromise (if it meets FireFox needs) is a rAF that syncs to VSYNC only when it's near a 60Hz multiple.  (rAF every VSYNC at 60hz, rAF every 'other' VSYNC at 120Hz).  Not ideal, but would smooth out a lot of animations.

Although not fully relevant here, I should point out it is a best-practices in the video game industry (and even in the professionally made web games, like Angry Birds) to synchronize to real-world.  Simple-made Canvas2D web games resembles assumptions that existed in DOS games.  (leading DOS games designed for 386's to run excessively fast on Pentium's, etc).
Comment 140 Boris Zbarsky [:bz] 2012-10-28 13:41:14 PDT
Vlad, please re-cc me if you need my feedback here again (and if the noise level dies down)...
Comment 141 Vladimir Vukicevic [:vlad] [:vladv] 2012-12-05 18:41:07 PST
Created attachment 689051 [details] [diff] [review]
newest version, v1

Here is the newest version of this, with a few changes.  On windows, we drop the high precision timer only after 90 seconds of no requests that need high precision timers.  This avoids us ping-ponging high precision to low precision all the time.  Also caught a related bug, in that we were dropping to low precision every frame because we were scheduling a new timer before we actually got notified that we needed high precision timers for the request (because rAF and similar have to be called in the callback).

A try server run is at https://tbpl.mozilla.org/?tree=Try&rev=73d3112e9f45 with builds at http://ftp.mozilla.org/pub/mozilla.org/firefox/try-builds/vladimir@pobox.com-73d3112e9f45 .

Unfortunately, there are still some orange tests, most of them known orange failures but unfortunately not random in some cases.  It's really annoying.  I've fixed a number of orange tests before, but now there's a new set; I'll look into those tomorrow.
Comment 142 Vladimir Vukicevic [:vlad] [:vladv] 2012-12-11 10:12:39 PST
Created attachment 690941 [details] [diff] [review]
interdiff

The interdiff
Comment 143 Vladimir Vukicevic [:vlad] [:vladv] 2012-12-11 10:23:38 PST
Created attachment 690945 [details] [diff] [review]
more context in interdiff
Comment 144 Vladimir Vukicevic [:vlad] [:vladv] 2012-12-11 10:24:57 PST
Created attachment 690946 [details] [diff] [review]
more correct and context in interdiff
Comment 145 :Ehsan Akhgari (busy, don't ask for review please) 2012-12-11 10:31:02 PST
Comment on attachment 690946 [details] [diff] [review]
more correct and context in interdiff

Review of attachment 690946 [details] [diff] [review]:
-----------------------------------------------------------------

::: nsRefreshDriver.cpp.old
@@ +418,2 @@
>  static int32_t sHighPrecisionTimerRequests = 0;
> +static nsCOMPtr<nsITimer> sDisableHighPrecisionTimersTimer = nullptr;

Please make this a plain pointer, to avoid adding a static constructor.

@@ +710,5 @@
> +      // after 90 seconds.  This is arbitrary, but hopefully good
> +      // enough.
> +      NS_ASSERTION(!sDisableHighPrecisionTimersTimer, "We shouldn't have an outstanding disable-high-precision timer !");
> +
> +      sDisableHighPrecisionTimersTimer = do_CreateInstance(NS_TIMER_CONTRACTID);

(Remember to AddRef the plain pointer here.)
Comment 146 Vladimir Vukicevic [:vlad] [:vladv] 2012-12-11 14:17:57 PST
Fixed and pushed to inbound:
    https://tbpl.mozilla.org/?tree=Mozilla-Inbound&rev=513ec84b5c88

Based on 100% green tryserver run (well, before the plain pointer changes):
    https://tbpl.mozilla.org/?tree=Try&rev=06cda839b7ec
Comment 147 Ed Morley [:emorley] 2012-12-12 02:05:56 PST
https://hg.mozilla.org/mozilla-central/rev/513ec84b5c88
Comment 148 neil@parkwaycc.co.uk 2012-12-12 02:31:53 PST
(In reply to Ehsan Akhgari from comment #145)
> (From update of attachment 690946 [details] [diff] [review])
> > +      sDisableHighPrecisionTimersTimer = do_CreateInstance(NS_TIMER_CONTRACTID);
> (Remember to AddRef the plain pointer here.)
Vlad improved on this by calling forget on a temporary nsCOMPtr into the plain pointer. (Alternatively CallCreateInstance(NS_TIMER_CONTRACTID, &sDisableHighPrecisionTimersTimer); would also have worked.)
Comment 149 Girish Sharma [:Optimizer] 2012-12-18 08:13:49 PST
I know this is closed bug. But since around 4-5 days, I am seeing frames being dropped while doing tab animation. In a sense, that while dragging, I do not see the complete animation, the tab jumps from place to place sometimes.

Windows 7 x64 . 32 bit Nightly build
HWA On. NVidia GTX 260
Comment 150 Vladimir Vukicevic [:vlad] [:vladv] 2012-12-18 08:54:17 PST
Can you file a new bug for that please?
Comment 151 Girish Sharma [:Optimizer] 2012-12-18 09:09:25 PST
(In reply to Vladimir Vukicevic [:vlad] [:vladv] from comment #150)
> Can you file a new bug for that please?

Sure, filed bug 822694.

Note You need to log in before you can comment on or make changes to this bug.