Closed Bug 555834 Opened 13 years ago Closed 4 years ago

Use beamsync (vsync) for rendering. Content tearing on scroll and other things


(Core :: Widget: Cocoa, defect)






(Reporter: simon.bugzilla, Unassigned)



(Whiteboard: tpi:-)


(1 file)

User-Agent:       Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10.6; en-US; rv:1.9.3a4pre) Gecko/20100329 Minefield/3.7a4pre
Build Identifier: 

Two-finger scrolling on the touchpad on Mac OS X causes the content to tear. Enabling beamsync using Quartz Debug fixes the tearing issue.

Oh, the tearing is driving me nuts!

Reproducible: Always
Does Safari have the same problem?

We explicitly turned off Coalesced Updates (CGDisableCoalescedUpdates in our Info.plist) because we were spending a huge amount of time waiting for sync in our CGFlush (and CGFlush-alike?) calls. Turning off coalesced updates causes tearing, but gives a big improvement in performance.

To fix this right, we have to be synced up with the OS wrt rendering time. (Being out of sync is the definition of tearing.) I don't know the details of how best to do this.

Potentially related: bug 552020.
Component: Graphics → Widget: Cocoa
QA Contact: thebes → cocoa
We need to get to off-main-thread compositing, then that thread can wait for beam sync as much as it likes without hurting our performance.
Blocks: 676248
This is what we'd need to do for OpenGL.
Depends on: omtc
Ever confirmed: true
Version: unspecified → Trunk
Tearing appears on more drawing than just scrolling.
Summary: Use beamsync (vsync) for rendering. Content tearing on scroll → Use beamsync (vsync) for rendering. Content tearing on scroll and other things
As I understand it we might also need to use glFences (cheaper than glFinish) to make sure we're not queueing frames faster than we vsync.
If we queue frames faster than we vsync, we'll block in [context flushBuffer].
Right. [context flushBuffer] is what we currently call, and I think that's just a wrapper for CGLFlushDrawable.
(In reply to Markus Stange from comment #11)
> If we queue frames faster than we vsync, we'll block in [context
> flushBuffer].

Frames queued faster than vsync are wasted. We should get at most one frame ahead.
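To illustrate the point about queue depth, here is a minimal simulation sketch. It assumes a simple FIFO model (the producer blocks when the queue is full and the display takes the oldest queued frame at each vsync); real drivers may behave differently, and the function and parameter names are made up for this example.

```python
from collections import deque
from typing import List


def simulate(render_ms: float, vsync_ms: float, max_queued: int, n_vsyncs: int) -> List[float]:
    """Return, for each displayed frame, the delay in ms from render start to display.

    Hypothetical model: finished frames wait in a FIFO queue of at most
    max_queued entries, and one frame is shown per vsync.
    """
    queue = deque()          # render-start times of finished, not-yet-shown frames
    latencies = []
    producer_free_at = 0.0   # earliest time the producer can start its next frame
    for k in range(1, n_vsyncs + 1):
        vsync_time = k * vsync_ms
        # Render frames until the queue is full or the next frame
        # could not finish before this vsync.
        while len(queue) < max_queued and producer_free_at + render_ms <= vsync_time:
            queue.append(producer_free_at)
            producer_free_at += render_ms
        blocked = len(queue) >= max_queued
        if queue:
            latencies.append(vsync_time - queue.popleft())
        if blocked:
            # The producer was stalled on a full queue; it resumes at this vsync.
            producer_free_at = max(producer_free_at, vsync_time)
    return latencies
```

With 4 ms frames on a 16 ms vsync interval, `simulate(4, 16, 1, 5)` gives `[16.0, 16.0, 16.0, 16.0, 16.0]`, while `simulate(4, 16, 3, 5)` gives `[16.0, 28.0, 40.0, 48.0, 48.0]`. Both runs display the same number of frames; in this model the deeper queue only adds latency.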
Depends on: 739421
This does not have to wait until OMTC, FWIW. 

This is conveniently exactly what we want, and should be possible to do:

The only performance hit is that we can't begin work on the next frame while waiting for vsync. However, if a frame takes longer than a vsync interval to render, it just plain takes that long. OMTC will free us of this minor performance hit.

If OMTC is very close to landing for everyone, we probably shouldn't bother with this intermediary fix, except for implementing vsync for OMTC. (Which I hope would be a given)
Interesting. I wonder if we can also do that with eglSwapBuffers on Android OMTC. The eglSwapBuffers call is really slow and we would love to be free to answer transactions with content during that time.
No longer depends on: omtc
Depends on: omtc
So this was actually fixed for OpenGL in bug 748816.

We could leave this bug open for non-accelerated rendering on OS X, I suppose.
No longer blocks: 676248
I have done some VSYNC tests on multiple web browsers.
Applicable observations on 2 desktops, 3 laptops, iPod, iPad, 2 Android, PlayBook:

- Chrome: VSYNC behavior on all 5 machines: Mac x 2, WindowsXP, Windows7, Windows8.
- Safari: VSYNC behavior on all current Apple platforms (Mac, iPhone 6, iPad 6) but doesn't VSYNC on Windows.
- IE10: VSYNC on Windows8 machine.
- Galaxy SII Android (4.0): VSYNC behavior.
- Old Android (2.3): Does not VSYNC.
- BlackBerry PlayBook: Does not VSYNC.
- Mains power made no difference in any test, with the sole exception of the WindowsXP laptop, which switched down to 50Hz mode in battery saver mode; requestAnimationFrame then operated 50 times per second in Chrome.  Also, iOS4 and lower platforms do not VSYNC.  iOS 6 does.  iOS 5 is untested.

I am impressed at how suddenly (in a 12 month time period) the majority of both desktop and mobile browsers are now choosing to VSYNC.  It is important that Firefox follows this path, I feel.

Also, 120Hz-native-refresh computer monitors are gradually becoming cheaper.  Examples include the Asus VG236H (120Hz), VG278H (144Hz), Samsung S23A700D (120Hz), Acer GD235Hz (120Hz), and Benq XL2420T (120Hz).  On these displays, moving text is much clearer in both Chrome (all) and Safari (Mac) at 120Hz, especially when scrolling.  Text doesn't blur as much when doing things like dragging the window around or scrolling, and animations in many web games are much smoother.  I'd love to see this happen for Firefox.
I've added more research to #707884
(In reply to Jeff Gilbert [:jgilbert] from comment #14)
> Frames queued faster than vsync are wasted. We should get at most one frame
> ahead.

There's a subtle factor to consider: running at a higher framerate than the refresh rate has an input lag benefit, if you use fresher frames to replace older frames that have not yet been displayed.  This reduces input lag, sometimes by up to a full refresh cycle (if you have a very powerful GPU) -- about 16.7 milliseconds at 60Hz.

Just like in the Olympic 100 meter race, the competitive online gaming environment means that if two people try to shoot at each other at the same time, the person who shoots first wins -- even if they press the fire button 1 millisecond sooner than the other guy.  Even though you can't "feel" one millisecond, it can make a difference, and 10 milliseconds can actually be "felt" by some professionals, too -- an even bigger difference.  Disabling VSYNC means a more freshly generated frame can be displayed at the next VSYNC, and then be seen by human eyes sooner.

This causes reaction time to be a few milliseconds sooner, because they saw a frame that's a few milliseconds fresher during this VSYNC rather than during the next VSYNC.  So an uncapped frame rendering rate has a benefit for competitive online gaming environments.  Although an edge case, it illustrates a subtle factor that is widely known in the video game programming industry (are there any browser developers here that are ex-EA or ex-Ubisoft or ex-id Software programmers?) but not widely known by browser programmers or in W3C standardization.

It is common for competitive video gamers to manually disable VSYNC and let the framerate run uncapped (e.g. 200fps at 60Hz), for the reduced input lag during competitive gaming.  Many desktop PC video games have a setting that allows you to turn VSYNC on or off for this reason.

Long term, at the W3C agreement level, there ideally should be a JavaScript method of enabling/disabling VSYNC (e.g. window.mozSyncOnVsync = true) that would apply to both Canvas2D and WebGL, and perhaps even to the entire browser subsystem (e.g. compositing framerate), along with a window.mozMaxFrameRate = 200 setting (for those times you want to uncap the framerate, or to support 120Hz monitors such as Asus VG236H, Samsung S23A700D, Benq XL2420T, etc).

It would also be very worthwhile to find a resolution to #707884, because it's probably related (in a "subsystem" manner) to this bug.
Indeed, I'm not sure what my logic was. (Maybe a concern about too-deep chaining? Who knows) As an avid FPS player and engine hacker myself, I'm totally aware of the benefits of uncapped framerate.

As there are even cases where an app may not want to update more than 30fps, I think an ideal thing to do is to add a setting to rAF which allows specification of framerate cap in lieu of limiting to vsync. This probably includes a number of 2d games, and probably a decent chunk of mobile games.
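As a rough sketch of what a framerate cap layered on vsync-aligned callbacks could look like: the `FrameCap` class below is hypothetical (it is not an existing rAF or Gecko API) and is written in Python only to show the timing logic. It assumes the caller passes in the timestamp of each vsync-aligned tick.

```python
class FrameCap:
    """Decide, per vsync-aligned tick, whether to render under a framerate cap.

    Hypothetical helper for illustration only; not an existing browser API.
    """

    EPSILON_MS = 0.01  # tolerance for floating-point rounding in timestamps

    def __init__(self, max_fps: float) -> None:
        self.interval_ms = 1000.0 / max_fps
        self.last_ms = None  # scheduled time of the last rendered frame

    def should_render(self, now_ms: float) -> bool:
        if self.last_ms is None:
            self.last_ms = now_ms
            return True
        elapsed = now_ms - self.last_ms
        if elapsed < self.interval_ms - self.EPSILON_MS:
            return False
        # Advance the schedule by whole intervals so the cap does not drift
        # when ticks are not exact multiples of the interval.
        steps = max(1, int((elapsed + self.EPSILON_MS) // self.interval_ms))
        self.last_ms += steps * self.interval_ms
        return True


def count_renders(tick_ms: float, n_ticks: int, max_fps: float) -> int:
    """Count how many of n_ticks evenly spaced ticks would render under the cap."""
    cap = FrameCap(max_fps)
    return sum(cap.should_render(k * tick_ms) for k in range(n_ticks))
```

For example, `count_renders(1000 / 60, 60, 30)` covers one second of 60Hz ticks under a 30fps cap and renders on every other tick.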

I should add that while it used to be easiest to just disable vsync to uncap the framerate, the more modern solution is to ask for triple-buffering, instead of double-buffering. Triple-buffering eliminates tearing while still allowing the producer to render frames at arbitrary speed. It's naturally still an average of half a (producer, not vsync) frame slower, but for us, it would generally be half a composite-frame-length, not a canvas-render-frame-length. Recompositing a page with one dirty rectangle should be trivially fast, so being uncapped here shouldn't really buy us anything.
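The triple-buffering behaviour described here can be sketched as a "mailbox": the producer never blocks, and at each vsync the display takes only the newest completed frame. This is a simplified illustration with made-up names, not the layers or WebGL implementation; the third buffer (the one the producer is currently drawing into) is implicit.

```python
import threading


class TripleBuffer:
    """Mailbox-style triple buffer (illustrative sketch, hypothetical names)."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._pending = None   # newest completed frame not yet shown
        self._front = None     # frame currently on screen
        self.dropped = 0       # completed frames that were never shown

    def publish(self, frame) -> None:
        """Called by the producer when a frame is complete; never blocks on vsync."""
        with self._lock:
            if self._pending is not None:
                self.dropped += 1  # an older undisplayed frame is replaced
            self._pending = frame

    def on_vsync(self):
        """Called once per vsync; returns the frame to display."""
        with self._lock:
            if self._pending is not None:
                self._front = self._pending
                self._pending = None
            return self._front  # repeats the previous frame if nothing new arrived
```

If the producer publishes three frames between two vsyncs, only the last one is shown and `dropped` increases by two, which is the "fresher frame replaces older frame" behaviour without tearing.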

The WebGL backend is already transitioning to triple-buffering, but I believe our layers code should be updated to allow for this. I would guess that with our current implementation with vsync, after we composite a frame, we (would) sit on our hands until the frame goes through.

As for 120Hz monitors, I imagine waiting on vsync there should default to 8.3ms frames. Naturally, once we base rAF against this, we should get at least the baseline functionality we need, and we can move forward with API proposals from there.
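For reference, the frame interval is simply the reciprocal of the refresh rate. A small sketch (the function name is made up for this example):

```python
def frame_interval_ms(refresh_hz: float) -> float:
    """Time between vsyncs, in milliseconds, for a given refresh rate."""
    return 1000.0 / refresh_hz


# Refresh rates mentioned in this bug.
for hz in (50, 60, 120, 144):
    print(f"{hz} Hz -> {frame_interval_ms(hz):.2f} ms")
# prints:
# 50 Hz -> 20.00 ms
# 60 Hz -> 16.67 ms
# 120 Hz -> 8.33 ms
# 144 Hz -> 6.94 ms
```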
At last, someone who understands the VSYNC situation as well as I do.  I agree with everything you have just said, and I have some secret pre-production pages I'd like to share with you for testing if you want to contact me privately.  It will be an excellent test case; I will be launching by end of this year but I'd like to give you a preview because it is one of the first ever webpages that defines rAF VSYNC as a validity requirement for a precision scientific test, and I had to make a browser detect module because there is no object method of detecting rAF VSYNC support.  

One warning about triple buffering.  It is best to let the compositing manager do that instead; don't do it at the OpenGL or Canvas2D level, because then you get two layers of double or triple buffering.  You get the problem of triple buffering that then gets rebuffered by the compositing manager.  The worst-case scenario is two compositing managers (application built-in compositing and OS-level compositing).  That means three layers of either double or triple buffering.  You get input lag hell.  Be careful you don't re-architect into a Catch-22 like this.  The ideal low-lag situation is to detect whether the OS is using GPU-accelerated compositing, and then treat that as the last two buffers.  Here, you can even draw without buffering, which flickers under Windows XP but looks perfectly buffered under Windows 7 or Windows 8.  In this case, you actually want to disable double buffering and triple buffering at all other layers where unnecessary, even the OpenGL layer.  From what I read, Chrome automatically does this.  Plan your buffer layers to avoid unnecessary lag.
Yes, this lines up with what I've heard with regards to modern windowing systems. It's practically another form of buffer bloat, and something to be wary of.

The architecture change happening right now for WebGL is adding flexibility, but some degree of buffering is unfortunately necessary in the general case. (Basically the compositor has to sample from a texture that WebGL drew the frame to, in order to do page layout stuff) It's possible that certain cases (like full-screen with no other effects or overlays) could eventually be optimized to bypass this. 

I would say the priority right now is tear-less compositing that doesn't reduce content (WebGL and such) production rates.

For anything not suitable for a public forum, please email me. (The email address linked to my account here is correct)
Aha, right.  Then there's an additional factor: several compositing managers already sync on VSYNC by default -- for example, Windows 7 and 8 (and Macs, if you enable beamsync).  In this case, you can leave the final 2 buffers (of triple buffering) to the compositing manager, and it will never cause tearing (no matter how out-of-sync with VSYNC you are when passing frames).  You'll need to keep a list of platforms that have compositors (and methods of detecting them), and which platform compositors are already 100% tearproof.

So there's a few potential approaches to minimize buffering:
- Immediate buffer is always internal (to allow texture sampling)
- Triple buffering is left to the compositing window manager (just blit the frames from your immediate buffer to your application browser display buffer; that's the middle buffer of triple buffering).
- The final buffer is the OS-level window compositing.

So you're essentially just double buffering (at the application level), but the compositing window manager becomes your final buffer.  I did some Windows 7 and 8 DirectX videogame tests in some double-buffered games with VSYNC disabled: instead of showing tearing, they behaved *exactly* like triple buffering (fps above refresh, but without lag and without tearing).  Apparently, you actually have to do lots of "work" to force tearing to show up on Windows 7 and Windows 8 platforms, because their compositors are 100% tearproof on modern GPUs.

That eliminates one buffer layer, while keeping the full triple buffering benefits, and while keeping ability to sample from texture.
By "application browser display buffer", I mean the platform-appropriate API for the actual display, not application-level compositing (internal framebuffer).  If you're unable to bypass application-level compositing (for various reasons), then you have a forced minimum of quadruple buffering (WebGL immediate buffer, application-level compositing, OS-level compositing, and the actual visible buffer).
Jeff, sent you the test case (good for all platforms).
This also happens on Windows in the current release. (18.0)

If I scroll on Windows, I get sync tearing, even though I have set AMD's control panel to force vsync on. My Windows Aero is off, and even though I turn off smooth and autoscrolling, it still tears even if it's not noticeable.

So this doesn't affect just Mac.
It's extremely noticeable on this page:
This also happens on Linux: bug 947913.
Can we revive this for the platforms that already support OMTC, please?
Whiteboard: tpi:-

Since this bug was originally filed about macOS, I'm marking it as a duplicate of the bug where macOS "beam sync" was enabled, many years ago.

Closed: 4 years ago
Resolution: --- → DUPLICATE