Closed Bug 1080869 Opened 10 years ago Closed 8 years ago

requestAnimationFrame() <canvas> animations are very jerky, vsync issue?

Categories

(Core :: Graphics, defect)

32 Branch
x86_64
Windows 7
defect
Not set
normal

RESOLVED WORKSFORME

People

(Reporter: jerryj, Unassigned)

References

Details

(Whiteboard: [webvr])

Attachments

(1 file)

User Agent: Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/38.0.2125.101 Safari/537.36

Steps to reproduce:

Review http://www.duckware.com/test/chrome/jerky3.html

After waiting a few seconds for the animation to stabilize (repeat), notice that the animation is very jerky.


Actual results:

The animation for me is very jerky -- even though the animation code is incredibly simple and does almost nothing.  I suspect almost half the rendered frames are not making it to the screen.  Could this be a vsync issue?  (Days ago I found a vsync issue in Chrome.)  I am running on a laptop that is three years old, which may have something to do with it.


Expected results:

Run the same jerky3.html test in IE and notice that the animation is incredibly smooth.  The same test in Chrome with vsync off results in over 400 frames/second, suggesting plenty of CPU and GPU headroom.  The monitoring code in the callback suggests that 60FPS are being rendered, but due to the jerks seen, those frames are not making it to the screen.  Are multiple rendered frames ending up in a single vsync frame?
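The kind of callback monitoring mentioned above can be sketched roughly as follows. This is a minimal illustration, not the code used in jerky3.html; the names `summarizeFrameTimes` and `monitorFrames` are hypothetical. It only measures how often the callback runs, which is why a healthy 60FPS reading can coexist with frames that never reach the screen.

```javascript
// Summarize the gaps between consecutive requestAnimationFrame timestamps.
// `timestamps` is an array of millisecond values, as passed to rAF callbacks.
function summarizeFrameTimes(timestamps) {
  if (timestamps.length < 2) {
    return { frames: 0, averageMs: 0, maxMs: 0, fps: 0 };
  }
  let total = 0;
  let maxMs = 0;
  for (let i = 1; i < timestamps.length; i++) {
    const delta = timestamps[i] - timestamps[i - 1];
    total += delta;
    if (delta > maxMs) maxMs = delta;
  }
  const frames = timestamps.length - 1;
  const averageMs = total / frames;
  return { frames, averageMs, maxMs, fps: 1000 / averageMs };
}

// Browser-only driver (hypothetical helper): collect `count` callback
// timestamps, then hand the summary to `onDone`.
function monitorFrames(count, onDone) {
  const stamps = [];
  function tick(now) {
    stamps.push(now);
    if (stamps.length < count) {
      requestAnimationFrame(tick);
    } else {
      onDone(summarizeFrameTimes(stamps));
    }
  }
  requestAnimationFrame(tick);
}
```

In a browser, `monitorFrames(120, console.log)` would report the callback rate over roughly two seconds. A large `maxMs` points at a late callback, but a clean result says nothing about which vsync interval each frame was actually shown in.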
Flags: needinfo?(bgirard)
My machine is having no problem with this content (6ms in rAF).
http://people.mozilla.org/~bgirard/cleopatra/#report=13587efd6d78aa52f81cd66e5eb39224ee2bf6f0

The main-thread/compositor are sleeping yet not aligning particularly well to vsync.

tn, did your profile show the same?
Component: Untriaged → Graphics
Depends on: 1071275
Flags: needinfo?(bgirard) → needinfo?(tnikkel)
Product: Firefox → Core
I am seeing the problem on a Dell L702X (2011) / Windows 7.  On a newer Dell Inspiron 15R (2014) Windows 8.1 I do not see the problem.

I suspect FF is running into a subtle timing issue.  If I open and maximize IE and run the test -- AND then open Firefox and maximize and run the test (so both are running the test, at the same time, FF on top of IE), all of the jerkiness in FF goes away (ultra smooth).  But the moment I close IE, FF is then jerky again.  But if I run FF and IE next to each other (no overlap), FF is jerky.  A clue?

Is there a way to run FF with vsync off?
See Also: → 707884
#1: Benoit, was your test on Windows 7?

I suspect that I am missing every other frame.

So, I updated the jerky3.html test to include a red box in a grid of 60 boxes in two rows, where each box represents a unique animation frame.  I believe this new test proves that this is the case.

I see the red box move horizontally on the top row, or the bottom row, and then sometimes jump between rows (staying in one row means 30fps to the screen).  Compare to IE where the red box alternates between the top and bottom row (proving 60fps to the screen).

But this new test has also revealed something else that I believe is very significant.

In IE, each red box is a lighter shade of red (as the background is always white, except for one entire frame where the box is full red -- so the effect is a lighter shade of red).

But in FF, when I do sometimes (for unknown reasons; what causes that?) see the red box alternate between top row and bottom row (indicating 60fps), I am seeing definite 'tearing' in the red boxes, which is a dead giveaway that updates to the screen are not actually synchronized to vsync.

Again, simply moving a running FF over top of a running IE (window on top of window) causes FF to then run smoothly -- as if IE causes the GPU to 'update' as per vsync, but the GPU is then grabbing the correct data for the screen from the FF buffer, since that is covering IE, and the display looks good.
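The two-row grid described in this comment can be modeled with a small sketch. The exact layout in jerky3.html is not shown in this report, so the mapping below (consecutive frames alternate rows, 30 boxes per row) is an assumption chosen to match the described behavior; `boxForFrame` and `rowsSeen` are hypothetical names.

```javascript
const BOXES_PER_ROW = 30; // assumption: 60 boxes split evenly over two rows

// Map an animation frame number to the box that would be painted red.
// Assumed layout: consecutive frames alternate rows, and the column
// advances by one every two frames.
function boxForFrame(frame) {
  return { row: frame % 2, column: Math.floor(frame / 2) % BOXES_PER_ROW };
}

// Given the frame numbers that actually reached the screen, report which
// rows were ever lit.
function rowsSeen(displayedFrames) {
  const rows = new Set(displayedFrames.map((f) => boxForFrame(f).row));
  return [...rows].sort();
}
```

Under this model, a display that shows every frame lights both rows, while one that shows only every other frame keeps the red box in a single row, which is the 30fps symptom reported for Firefox.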
Short video of FF32 software rendering at 60fps, but only 30fps are making it to the screen:

  http://www.duckware.com/test/chrome/vid/HHI_8853-FF32.MOV
The test at http://www.duckware.com/test/chrome/jerky3.html has been significantly updated.  There is now a very obvious graphical indicator whenever there is no VSYNC synchronization.  Firefox has problems with it.
(In reply to Benoit Girard (:BenWa) from comment #1)
> My machine is having no problem with this content (6ms in rAF).
> http://people.mozilla.org/~bgirard/cleopatra/
> #report=13587efd6d78aa52f81cd66e5eb39224ee2bf6f0
> 
> The main-thread/compositor are sleeping yet not aligning particularly well
> to vsync.
> 
> tn, did your profile show the same?

The profiler addon seems to have stopped working for me. Whenever I click analyze it never opens a new tab with the results. So I can't answer your question until it's working again.
Flags: needinfo?(tnikkel) → needinfo?(bgirard)
(In reply to jerryj from comment #4)
> Short video of FF32 software rendering at 60fps, but only 30fps are making
> it to the screen:
> 
>   http://www.duckware.com/test/chrome/vid/HHI_8853-FF32.MOV

Windows? That's bug 1065233. DWM might be forcing things down to 30 FPS under GPU load. When you get into this state it's likely that your entire window manager is running at 30 FPS.
Flags: needinfo?(bgirard)
(In reply to Benoit Girard (:BenWa) from comment #7)
> Windows? That's bug 1065233. DWM might be forcing things down to 30 FPS
> under GPU load. When you get into this state it's likely that your entire
> window manager is running at 30 FPS.

This is not 1065233.  The video shows that the animation callback is around 60FPS (see above chart).  When the frame rate is allowed to become unbounded (frame-rate=0), then the display is very smooth, and shows a FPS around the 140's (clearly, the display is only updating at 59.8Hz, so many frames are being produced and dumped).  This is all just a bad vsync timing issue/bug in FF.

When frame-rate=-1, the animation callback is called back at the expected 59.8 FPS, matching the VSYNC refresh rate for my monitor.  It is just that not all software generated frames are making it to the screen in their intended VSYNC slots (some slots get two frames, some slots get one frame).

From other bug reports that I have since found, it is VERY clear that Mozilla has NO INTEREST in finding, or fixing, any vsync bugs -- see comment #52 in 894128 by Avi Halachmi -- and that a fix will have to wait.  So just close this bug as "will not fix".

Just make sure that when the fix referred to by Avi is eventually implemented down the road, that the fix is tested against http://www.duckware.com/test/chrome/jerky3.html.

What all tests that I have seen so far lack is validation that every animation frame produced is making it into the proper (and unique) VSYNC time slot on the display -- and that is something that the jerky3.html test provides (albeit as a visual test whose results cannot be accessed by a program, since the test relies on a visual effect caused by your eyes and proper VSYNC timing).
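The slot validation described in this comment can be sketched as a counting exercise. Page script cannot observe when a frame actually reaches the display, so this sketch assumes the presentation times come from an external capture (for example, a high-speed video of the screen); `framesPerVsyncSlot` and `classifySlots` are hypothetical names.

```javascript
// Count how many frames were presented inside each display refresh interval.
// presentTimesMs: times at which frames reached the display, in milliseconds.
// originMs: start of slot 0. periodMs: refresh period (about 16.72 ms at 59.8 Hz).
function framesPerVsyncSlot(presentTimesMs, originMs, periodMs) {
  if (presentTimesMs.length === 0) return [];
  const slots = presentTimesMs.map((t) => Math.floor((t - originMs) / periodMs));
  const last = Math.max(...slots);
  if (last < 0) return []; // every frame was presented before slot 0
  const counts = new Array(last + 1).fill(0);
  for (const s of slots) {
    if (s >= 0) counts[s] += 1;
  }
  return counts;
}

// Classify slots: a well-paced animation has exactly one frame in every slot.
function classifySlots(counts) {
  return {
    empty: counts.filter((c) => c === 0).length,
    single: counts.filter((c) => c === 1).length,
    doubled: counts.filter((c) => c > 1).length,
  };
}
```

A result with any `empty` or `doubled` slots corresponds to the symptom described here: frames are produced at the right rate, but some refresh intervals receive two frames and others none.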
Just want to point out lots of Construct 2 users have noticed issues with v-sync in Firefox lately. Here's a ~90 page thread on our forums discussing the issue: https://www.scirra.com/forum/about-the-jerkiness-on-the-movement_t117554

I filed bug 1028893 with a test that makes measurements that can be compared between browsers; IE11 definitely measures a lot better, and subjectively appears a lot smoother too, so I think there are real v-sync timing issues to be investigated.
(In reply to jerryj from comment #8)
> 
> From other bug reports that I have since found, it is VERY clear that
> Mozilla has NO INTEREST in finding, or fixing, any vsync bugs -- see comment
> #52 in 894128 by Avi Halachmi -- and that a fix will have to wait.  So just
> close this bug as "will not fix".
> 

Hi jerryj. I understand your frustration with the vsync implementation and that it causes janky behavior for you. However, it is very clear that Mozilla is fixing vsync bugs. From the bug you linked, we are not improving the current implementation, we are replacing it such that our architecture is triggered on vsync. (https://wiki.mozilla.org/Project_Silk, bug 987532). I apologize if I misunderstand your intent / comments, the internet is hard to communicate through heh :). But, we are fixing it, and we will test it against http://www.duckware.com/test/chrome/jerky3.html.
See Also: → 1092245
(In reply to Mason Chang [:mchang] from comment #10)
> Hi jerryj. I understand your frustration with the vsync implementation and
> that it causes janky behavior for you. However, it is very clear that
> Mozilla is fixing vsync bugs. From the bug you linked, we are not improving
> the current implementation, we are replacing it such that our architecture
> is triggered on vsync. (https://wiki.mozilla.org/Project_Silk, bug 987532).
> I apologize if I misunderstand your intent / comments, the internet is
> hard to communicate through heh :). But, we are fixing it, and we will test
> it against http://www.duckware.com/test/chrome/jerky3.html.

I seriously doubt that project Silk will address the vsync issues that I am seeing with Firefox.  Please point me to the Mozilla analysis of my test code that points to the ultimate cause of the jankiness.  Unless the ultimate cause of the jankiness is fully understood, the bug described above will never be 'fixed'.

So let's throw vsync synchronizing out the door, and answer the only question that matters.  Can Firefox be used to create a very precisely timed animation?

To find out, I created my own rAF implementation in JavaScript that is able to trigger a rAF callback to the microsecond (and precisely tuned to the Hz of the display).  The animation callback does nothing but display a moving vertical bar.  With OS level double buffering (Windows Aero), that creates a very smooth display.  Without double buffering, there WILL be screen tearing.  To see the tearing, I run under Win7 with Aero off.  When the code is run under IE or Chrome, the results are absolutely what is expected -- screen tearing is seen and locked to a fixed location on the display (because the rAF callback is called back at precisely the Hz of my monitor).  That means that there is almost no frame to frame variation in timings.  However, with Firefox, the display is a mess of flickering because the screen tearing location is 'random' -- since the code is obviously the same as the code running in IE/Chrome, the only possible cause is internal Firefox jankiness.

So the bottom line: Attempting to synchronize an animation to vsync in Firefox is pointless -- if the animation without vsync is already janky due to unknown Firefox internal jankiness.  Until that jankiness in Firefox is fixed, synchronizing to vsync is rather pointless.

From other bug reports, I know that Mozilla does not test Firefox on older hardware (like from three years ago).  I think that is a contributing factor in this case, as I am running on an 8-thread i7 with Intel HD Graphics 3000.  IE and Chrome run fine on that hardware, but Firefox does not.  How is Mozilla going to fix a bug if they can't even replicate the bug to find the ultimate cause?
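The tearing argument in this comment can be put in numbers with a small sketch. It assumes the display scans top to bottom at a constant rate and ignores the vertical blanking interval; `tearLine` and `tearLines` are hypothetical names, and the values are illustrative rather than measured.

```javascript
// Estimate the scanline at which a tear appears when a buffer swap is not
// synchronized to vsync. Assumes a constant top-to-bottom scan and ignores
// the vertical blanking interval.
function tearLine(swapTimeMs, vsyncOriginMs, periodMs, screenHeight) {
  const elapsed = swapTimeMs - vsyncOriginMs;
  // Position of the swap within the current refresh, 0 <= phase < periodMs.
  const phase = ((elapsed % periodMs) + periodMs) % periodMs;
  return Math.floor((phase * screenHeight) / periodMs);
}

// Tear positions for a series of swaps.
function tearLines(swapTimesMs, vsyncOriginMs, periodMs, screenHeight) {
  return swapTimesMs.map((t) => tearLine(t, vsyncOriginMs, periodMs, screenHeight));
}
```

With swaps spaced at exactly the refresh period, every tear lands on the same scanline, which matches the fixed tear location reported for IE and Chrome. Any jitter in the swap times moves the tear from frame to frame, which would look like the flickering reported for Firefox.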
(In reply to jerryj from comment #11)
> I seriously doubt that project Silk will address the vsync issues that I am
> seeing with Firefox.  Please point me to the Mozilla analysis of my test
> code that points to the ultimate cause of the jankiness.  Unless the
> ultimate cause of the jankiness is fully understood, the bug described above
> will never be 'fixed'.

That's not true, as we accidentally fix bugs all the time as unexpected side-effects of making other changes. :) So it's possible that project silk will fix this although, as you say, we won't know that for sure until either (a) we spend the time to investigate this issue or (b) we finish project silk and re-test.

(a) is going to be time consuming - obviously you have already spent a lot of time on this problem and failed to uncover the root cause. I doubt we would do much better without investing a lot of time. And as we only have a limited amount of time, we have to carefully consider whether the time investment is worth it, especially given that we're already hard at work on project silk and (b) is a much cheaper alternative for us.

> From other bug reports, I know that Mozilla does not test Firefox on older
> hardware (like from three years ago).  I think that is a contributing factor
> in this case, as I am running on an 8-thread i7 with Intel HD Graphics 3000.
> IE and Chrome run fine on that hardware, but Firefox does not.

If you're going to keep comparing Mozilla to IE and Chrome I think it would only be fair to also compare the amount of money and developer time that IE and Chrome have compared to Mozilla. If you stack up the numbers I'm sure you'll see at least an order of magnitude difference. I'm not trying to excuse what we're doing (or not doing), but trying to give you a sense of why we have to be more careful in picking our battles. If we want to stay relevant with "mass market users" we have to devote a certain amount of time and effort to more basic things and only then can we spend left over time chasing down issues like this one which many "mass market users" don't care about (think grandma just trying to check her email, not gamer or web developer).

> How is
> Mozilla going to fix a bug if they can't even replicate the bug to find the
> ultimate cause?

We rely much more heavily on contributions from the Mozilla community than IE and Chrome. You've already done a tremendous amount of work on this issue - your test case is by far the most polished test case I've ever seen posted to Bugzilla. It clearly indicates you care a lot about this issue, and we would absolutely welcome your help in tracking down this issue. If you're willing to spend some time building Firefox yourself and debugging, I'm sure we could find some people on the graphics team to mentor you through tracking down this problem.
An update to my comment #11 above...

I just figured out that Firefox is unable to time anything accurately (under Windows).  Once I figured that out, I modified the vsync test at http://www.vsynctester.com to work around the issue, and sure enough, I can now do in JavaScript what Firefox is unable to do in native C++ code -- create animation callbacks and an animation that is NOT janky.  And here is the video proof:

  http://www.duckware.com/test/chrome/vid/ff33.mp4

The video starts out with Firefox's requestAnimationFrame(), and the animation is janky.  I then check the checkbox that switches over to a JavaScript rAF implementation that is able to accurately time events to the Hz of my display, and all jankiness goes away.  Very repeatable.
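The reporter's JavaScript rAF replacement is not included in this report. As a rough idea of how such a timer-driven loop could be built, here is a minimal sketch under stated assumptions: the refresh rate is supplied by the caller, the clock is injectable, and `nextTickIndex` and `startTimedLoop` are hypothetical names, not the reporter's code.

```javascript
// Index of the next tick to fire on a fixed grid anchored at `originMs`.
// Using an integer index (rather than re-deriving the time each round)
// guarantees that a tick is never repeated and that missed ticks are skipped.
function nextTickIndex(nowMs, originMs, periodMs, lastIndex) {
  const elapsed = Math.floor((nowMs - originMs) / periodMs);
  return Math.max(elapsed, lastIndex) + 1;
}

// Hypothetical timer-driven stand-in for requestAnimationFrame.
// `clock` returns milliseconds, for example () => performance.now().
function startTimedLoop(callback, refreshHz, clock) {
  const periodMs = 1000 / refreshHz;
  const originMs = clock();
  let lastIndex = 0;
  let running = true;
  function schedule() {
    if (!running) return;
    lastIndex = nextTickIndex(clock(), originMs, periodMs, lastIndex);
    const target = originMs + lastIndex * periodMs;
    // Wake up slightly early, then spin until the target time, because
    // timers alone can fire a millisecond or more off target.
    setTimeout(() => {
      while (running && clock() < target) { /* short busy-wait */ }
      if (running) {
        callback(target);
        schedule();
      }
    }, Math.max(0, target - clock() - 1));
  }
  schedule();
  return () => { running = false; }; // call the returned function to stop
}
```

Anchoring every tick to one origin avoids the drift that builds up when each timeout is scheduled relative to the previous callback. The busy-wait trades CPU time for accuracy, and the loop is only as accurate as the clock it is given.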
Just tested with Silk enabled on OSX and Windows. On Windows 7, we're as good as Chrome / IE 11. The animation in the background never janks. On OS X, in full screen, it's silky smooth but in non-full screen it's still janky. We should probably take another look at vsync implementation on OS X or how we're ticking the refresh driver.
(In reply to Mason Chang [:mchang] from comment #14)
> We should probably take another look at vsync implementation on OS X
> or how we're ticking the refresh driver.

Do you intend to do this in this bug or is there another bug open for this?
(In reply to Florian Bender from comment #15)
> (In reply to Mason Chang [:mchang] from comment #14)
> > We should probably take another look at vsync implementation on OS X
> > or how we're ticking the refresh driver.
> 
> > Do you intend to do this in this bug or is there another bug open for this?

I'm not sure yet, I'll have to dig in some more until I find a root cause. Comment 14 was just a quick test, but I'll make a better conclusion once I have more time to investigate.
For some reason I'm not getting the vsync markers in the profile, will probably have to investigate more. But we're getting really large layer transactions which is interesting. The animation callback times are also 2-3x longer than on Windows.

I also verified that I vsynced to the correct display by using CVDisplayLinkCreateWithCGDisplay instead of CVDisplayLinkCreateWithActiveCGDisplays, which vsyncs to a specific display rather than all displays.
Interestingly, on Mac, Chrome is failing as well. Only Safari is passing.
Whiteboard: [webvr]
This should be fixed with Silk. Resolving as WFM.
Status: UNCONFIRMED → RESOLVED
Closed: 8 years ago
Resolution: --- → WORKSFORME