requestAnimationFrame() callback timestamp argument is incorrect

RESOLVED WONTFIX

Status

()

RESOLVED WONTFIX
2 years ago
5 months ago

People

(Reporter: jerryj, Assigned: mchang)

Tracking

52 Branch
Points:
---
Dependency tree / graph

Firefox Tracking Flags

(Not tracked)

Details

(Whiteboard: gfx-noted)

Attachments

(1 attachment)

(Reporter)

Description

2 years ago
Created attachment 8807510 [details]
firefox-late.jpg

User Agent: Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/54.0.2840.71 Safari/537.36

Steps to reproduce:

In nightly, visit www.vsynctester.com, click on the gear option, click on the 'late' (purple) performance graph line.  This line graphs the 'current time' minus the 'rAF callback time' argument, as in:

  function rafcallback( tRAF ) {
    tLate = window.performance.now()-tRAF;
    }



Actual results:

This purple 'late' line must either be flat (the rAF callback argument is the current time as per the spec), or it must track the frame offset from vsync (the rAF callback argument is the vsync time, as per recommended change to spec).

In Firefox, it is neither.  See attached.  The brownish line is the calculated actual vsync offset (but possible phase shifted up/down a little).




Expected results:

Running an independent third party EXE on the same test system (Win 8.1 / Intel integrated graphics / Dell notebook) that tests DWN vsync timings (the EXE from Chrome issue 467617) shows that timing information from multiple DWM sources all look just fine, so there is no reason for Firefox to have a problem.

Compare the Chrome Canary.

Updated

2 years ago
Component: Untriaged → DOM
Product: Firefox → Core
(Reporter)

Comment 1

2 years ago
I just reviewed gfxWindowsPlatform.cpp and immediately ran into code that is making wrong assumptions about what is being returned by DWM.  And then even more code that attempts to correct the wrong assumptions.  What a mess.
(Reporter)

Comment 2

2 years ago
verified. Firefox computes the wrong refresh rate -- and then run lots of other bogus code to 'correct' what it thinks are DWM errors (when the true error is that Firefox is using the wrong refresh rate).
Do we track rAF issues in Layout, Jet?
Component: DOM → Layout
Flags: needinfo?(bugs)
(Reporter)

Comment 4

2 years ago
To replicate this issue, run FF on a Windows system with a known display Hz just over 60Hz (the problem system was 60.107).
(In reply to Andrew Overholt [:overholt] from comment #3)
> Do we track rAF issues in Layout, Jet?

We typically do, but Windows vsync is GFx. Bas: wdyt?
Flags: needinfo?(bugs) → needinfo?(bas)
(Assignee)

Comment 6

2 years ago
(In reply to jerryj from comment #1)
> I just reviewed gfxWindowsPlatform.cpp and immediately ran into code that is
> making wrong assumptions about what is being returned by DWM.  And then even
> more code that attempts to correct the wrong assumptions.  What a mess.

Can you please clarify what assumptions are  wrong here please?

(In reply to jerryj from comment #2)
> verified. Firefox computes the wrong refresh rate -- and then run lots of
> other bogus code to 'correct' what it thinks are DWM errors (when the true
> error is that Firefox is using the wrong refresh rate).

Which part is the wrong refresh rate [1]. All the info we're getting is from the DWM.

I briefly looked at the source in https://bugs.chromium.org/p/chromium/issues/detail?id=467617#c27, and from your prototype, it says the DWM is nice and buggy.

As for your suggestion to use D3DKMTWaitForVerticalBlankEvent, we took a look at it in https://bugzilla.mozilla.org/show_bug.cgi?id=1146942 but determined not to use it.




[1] http://searchfox.org/mozilla-central/source/gfx/thebes/gfxWindowsPlatform.cpp#1764

> In nightly, visit www.vsynctester.com, click on the gear option, click on
> the 'late' (purple) performance graph line.  This line graphs the 'current
> time' minus the 'rAF callback time' argument, as in:
> 
>   function rafcallback( tRAF ) {
>     tLate = window.performance.now()-tRAF;
>     }
> 
> 
> 
> Actual results:
> 
> This purple 'late' line must either be flat (the rAF callback argument is
> the current time as per the spec), or it must track the frame offset from
> vsync (the rAF callback argument is the vsync time, as per recommended
> change to spec).
 
> In Firefox, it is neither.  See attached.  The brownish line is the
> calculated actual vsync offset (but possible phase shifted up/down a little).

How are you calculating the actual vsync time? With the current rAF callback, we're always feeding it the reported vsync time. However, it is expected that this value would not be equal to time.now(). Also, in bug 1278408, we made this a constant time. I would expect some jitter before bug 1278408, which didn't land until Firefox 50. Can you try on a nightly?
Flags: needinfo?(bas) → needinfo?(jerryj)
(Assignee)

Updated

2 years ago
Assignee: nobody → mchang
(Assignee)

Updated

2 years ago
Component: Layout → Graphics
Whiteboard: gfx-noted
(Reporter)

Comment 7

2 years ago
> Can you please clarify what assumptions are wrong here please?

Sure, this computation:

    float rate = ((float) refreshRate.uiDenominator
        / (float) refreshRate.uiNumerator) * 1000;

does not work.  As a result, a lot of the other code is also wrong.  For a system with a known rate of 16.637, the 'rate' above is computed to be '16.666'.  That may not seem significant, but it is.  Because 'rate' is wrong, that means 'mVsyncRate' is now wrong.  Than, that means that 'vsync' in VBlankLoop() is also wrong:

    vsync = mPrevVsync + mVsyncRate;

That means the following code is being run (when it should not be run), not because "DWMFlush woke up very early", but because 'vsync' is wrong:

    if (vsync > now) {
        // DWMFlush woke up very early, so readjust our times again
        vsync = GetVBlankTime();
        }

Firefox computes a 'vsync' that is pushed forward (for the example above; other systems it can be pushed backwards) of real correct time by 0.029ms every frame, and after 17-18 frames, that catches up with 'now' (when DwmFlush returns, which (in this case) is 'around' 0.5ms past true vsync.

That is the cause of the zip-zag seen in the original firefox-late.jpg attachment above (constant interframe times then a spike, repeated).

> I briefly looked at the source...

Well, look again (just updated to include FF issues above).  Will you run that EXE on a system that you believe has problems and post the resulting test output (or email to me)?

> As for your suggestion to use D3DKMTWaitForVerticalBlankEvent, we took a look at it in
> https://bugzilla.mozilla.org/show_bug.cgi?id=1146942 but determined not to use it.

Too bad.  D3DKMTWaitForVerticalBlankEvent wakes up within microseconds of true vsync, whereas DwmFlush wakes up 0.5ms to 3-4ms past true vsync.  The more the load on the system, the worse DwmFlush gets.

> How are you calculating the actual vsync time?

Review the 'var HZ' object in the vsynctester.com HTML source code.  It is intentionally very public.

> With the current rAF callback, we're always feeding it the reported vsync time

No, not now (because 'vsync' is computed incorrectly as per above), but you should!  The vsync time coming out of dwm appears with microseconds of actual vsync.  Jitter in the microsecond range for the time passed to the rAF callback is just fine and should be used as-is.

Firefox would be far better off just getting rid of all the special case code trying to manually compute 'vsync', which is wrong anyway, and just take the 'now' time (DwmFlush wake up time) and simply compute the vsync time for that 'now' time.  Given ANY vsync time reference whatsoever from the DWM (it does NOT have to be current; it can be any time in the past or any time in the future, which eliminates special case code I see for Win10), the vsync time for 'now' can be computed very simply (single line of code).

>Can you try on a nightly?

This issue was reported against nightly.  Actually, the prior FF works far better (the purple line tracks the brown line at vsynctester.com).
Flags: needinfo?(jerryj) → needinfo?(mchang)
(Assignee)

Comment 8

2 years ago
(In reply to jerryj from comment #7)
> > Can you please clarify what assumptions are wrong here please?
> 
> Sure, this computation:
> 
>     float rate = ((float) refreshRate.uiDenominator
>         / (float) refreshRate.uiNumerator) * 1000;
> 
> does not work.  As a result, a lot of the other code is also wrong.  For a
> system with a known rate of 16.637, the 'rate' above is computed to be
> '16.666'.  That may not seem significant, but it is.  Because 'rate' is
> wrong, that means 'mVsyncRate' is now wrong.  Than, that means that 'vsync'
> in VBlankLoop() is also wrong:

What do you propose instead of using the DWM? That's the rate the DWM is giving us. I understand that the rate
can change every flush due to when the hardware actually vsyncs, but we deliberetly decided to not listen to vsync every vblank.

> 
>     vsync = mPrevVsync + mVsyncRate;
> 
> That means the following code is being run (when it should not be run), not
> because "DWMFlush woke up very early", but because 'vsync' is wrong:
> 
>     if (vsync > now) {
>         // DWMFlush woke up very early, so readjust our times again
>         vsync = GetVBlankTime();
>         }

I forget which machine, but I've had this happen where DWMFlush can wake up in like 11 ms
even though the previous interval was 16 ms. 

> Firefox computes a 'vsync' that is pushed forward (for the example above;
> other systems it can be pushed backwards) of real correct time by 0.029ms
> every frame, and after 17-18 frames, that catches up with 'now' (when
> DwmFlush returns, which (in this case) is 'around' 0.5ms past true vsync.
> 
> That is the cause of the zip-zag seen in the original firefox-late.jpg
> attachment above (constant interframe times then a spike, repeated).
> 
> > I briefly looked at the source...
> 
> Well, look again (just updated to include FF issues above).  Will you run
> that EXE on a system that you believe has problems and post the resulting
> test output (or email to me)?

Sure, I don't remember if I have the exact machine though.

> > How are you calculating the actual vsync time?
> 
> Review the 'var HZ' object in the vsynctester.com HTML source code.  It is
> intentionally very public.

I still don't understand after looking how you're getting the real vsync time from hardware. It looks like you're computing it rather than asking for it from some source. The source looks like some calculations dependent on the rAF callback timestamps and performance.now().

> > With the current rAF callback, we're always feeding it the reported vsync time
> 
> No, not now (because 'vsync' is computed incorrectly as per above), but you
> should!  The vsync time coming out of dwm appears with microseconds of
> actual vsync.  Jitter in the microsecond range for the time passed to the
> rAF callback is just fine and should be used as-is.

This jitter was not ok, which is why we did https://bugzilla.mozilla.org/show_bug.cgi?id=1278408.

> Firefox would be far better off just getting rid of all the special case
> code trying to manually compute 'vsync', which is wrong anyway, and just
> take the 'now' time (DwmFlush wake up time) and simply compute the vsync
> time for that 'now' time.  Given ANY vsync time reference whatsoever from
> the DWM (it does NOT have to be current; it can be any time in the past or
> any time in the future, which eliminates special case code I see for Win10),
> the vsync time for 'now' can be computed very simply (single line of code).

We did this for a while - http://hg.mozilla.org/releases/mozilla-esr45/file/tip/gfx/thebes/gfxWindowsPlatform.cpp#l2922
but changed it to due to the jitter in the vsync timestamp.
Flags: needinfo?(mchang) → needinfo?(jerryj)
(Reporter)

Comment 9

2 years ago
Wow.  I you don't get it by now, you won't get it.
Flags: needinfo?(jerryj)
(Reporter)

Updated

2 years ago
Status: UNCONFIRMED → RESOLVED
Last Resolved: 2 years ago
Resolution: --- → WONTFIX

Comment 10

2 years ago
The owner(?) of this bug made a blog post about this issue here:

http://www.vsynctester.com/firefoxisbroken.html

Here is also a Reddit thread by a user that wanted to bring attention to this issue:

https://www.reddit.com/r/firefox/comments/5ec5ui/vsync_in_firefox_under_windows_is_broken/
Flags: needinfo?(mchang)
(Assignee)

Comment 11

2 years ago
(In reply to 6lobe from comment #10)
> The owner(?) of this bug made a blog post about this issue here:
> 
> http://www.vsynctester.com/firefoxisbroken.html
> 
> Here is also a Reddit thread by a user that wanted to bring attention to
> this issue:
> 
> https://www.reddit.com/r/firefox/comments/5ec5ui/
> vsync_in_firefox_under_windows_is_broken/

Please read the comment sections in reddit and the back comments here. Unfortunately, we seem to be at a standstill. I've tried many of the things he suggested but the solutions do not always work on all machines.
Flags: needinfo?(mchang)
(Assignee)

Updated

2 years ago
Duplicate of this bug: 1333454
(Assignee)

Updated

2 years ago
Depends on: 1295214
(Assignee)

Updated

2 years ago
Depends on: 1278408
You need to log in before you can comment on or make changes to this bug.