Closed Bug 1190257 Opened 5 years ago Closed 5 years ago

Assertion failure: Given time appears to be in the future: 'aTime <= TimeStamp::Now()' in FrameAnimator

Categories

(Core :: Graphics, defect)

x86_64
Windows 10
defect
Not set

Tracking

()

RESOLVED FIXED
mozilla43
Tracking Status
firefox40 --- affected
firefox41 --- fixed
firefox42 --- fixed
firefox43 --- fixed

People

(Reporter: birtles, Assigned: mchang)

References

Details

(Keywords: regression, Whiteboard: gfx-noted)

Attachments

(1 file, 3 obsolete files)

Given trunk build from a few hours ago I get a bunch of assertion failures:

  ###!!! ASSERTION: Given time appears to be in the future: 'aTime <= TimeStamp::Now()', file c:/moz/src/image/FrameAnimator.cpp, line 81:

by running e.g.

  ./mach test dom/animation/test/css-transitions/test_animation-pausing.html

(but I see this with every test I've run so far, so it's not specific to this test)

The stack is:
[6148] ###!!! ASSERTION: Given time appears to be in the future: 'aTime <= TimeStamp::Now()', file c:/moz/src/image/FrameAnimator.cpp, line 81
#01: mozilla::image::FrameAnimator::RequestRefresh (c:\moz\src\image\frameanimator.cpp:203)
#02: mozilla::image::RasterImage::RequestRefresh (c:\moz\src\image\rasterimage.cpp:321)
#03: nsRefreshDriver::Tick (c:\moz\src\layout\base\nsrefreshdriver.cpp:1867)
#04: mozilla::RefreshDriverTimer::TickDriver (c:\moz\src\layout\base\nsrefreshdriver.cpp:198)
#05: mozilla::RefreshDriverTimer::Tick (c:\moz\src\layout\base\nsrefreshdriver.cpp:188)
#06: mozilla::VsyncRefreshDriverTimer::RunRefreshDrivers (c:\moz\src\layout\base\nsrefreshdriver.cpp:440)
#07: mozilla::VsyncRefreshDriverTimer::RefreshDriverVsyncObserver::TickRefreshDriver (c:\moz\src\layout\base\nsrefreshdriver.cpp:375)
#08: nsRunnableMethodArguments<mozilla::TimeStamp>::apply<mozilla::VsyncRefreshDriverTimer::RefreshDriverVsyncObserver,void (__thiscall mozilla::VsyncRefreshDriverTimer::RefreshDriverVsyncObserver::*)(mozilla::TimeStamp)> (c:\moz\src\obj-debug\dist\include\nsthreadutils.h:675)
#09: nsRunnableMethodImpl<void (__thiscall mozilla::VsyncRefreshDriverTimer::RefreshDriverVsyncObserver::*)(mozilla::TimeStamp),1,mozilla::TimeStamp>::Run (c:\moz\src\obj-debug\dist\include\nsthreadutils.h:870)
#10: nsThread::ProcessNextEvent (c:\moz\src\xpcom\threads\nsthread.cpp:867)
#11: NS_ProcessNextEvent (c:\moz\src\xpcom\glue\nsthreadutils.cpp:277)
#12: mozilla::ipc::MessagePump::Run (c:\moz\src\ipc\glue\messagepump.cpp:95)
#13: MessageLoop::RunInternal (c:\moz\src\ipc\chromium\src\base\message_loop.cc:235)
#14: MessageLoop::RunHandler (c:\moz\src\ipc\chromium\src\base\message_loop.cc:228)
#15: MessageLoop::Run (c:\moz\src\ipc\chromium\src\base\message_loop.cc:202)
#16: nsBaseAppShell::Run (c:\moz\src\widget\nsbaseappshell.cpp:167)
#17: nsAppShell::Run (c:\moz\src\widget\windows\nsappshell.cpp:178)
#18: nsAppStartup::Run (c:\moz\src\toolkit\components\startup\nsappstartup.cpp:280)
#19: XREMain::XRE_mainRun (c:\moz\src\toolkit\xre\nsapprunner.cpp:4288)
#20: XREMain::XRE_main (c:\moz\src\toolkit\xre\nsapprunner.cpp:4385)
#21: XRE_main (c:\moz\src\toolkit\xre\nsapprunner.cpp:4474)
#22: do_main (c:\moz\src\browser\app\nsbrowserapp.cpp:212)
#23: NS_internal_main (c:\moz\src\browser\app\nsbrowserapp.cpp:399)
#24: wmain (c:\moz\src\toolkit\xre\nswindowswmain.cpp:131)
#25: __tmainCRTStartup (f:\dd\vctools\crt\crtw32\startup\crt0.c:255)
#26: BaseThreadInitThunk[KERNEL32 +0x13744]
#27: RtlSetCurrentTransaction[ntdll +0x5a064]
#28: RtlSetCurrentTransaction[ntdll +0x5a02f]

In the above test I see this same failure about 60 times.
(In reply to Brian Birtles (:birtles) from comment #0)
> Given trunk build from a few hours ago I get a bunch of assertion failures:

Specifically, I'm looking at revision 32712cd01159.
FWIW, I don't get this on mac with rev 1d4f44ee5166.

Did you try this command with earlier revisions and got no asserts?
(In reply to Timothy Nikkel (:tn) from comment #2)
> FWIW, I don't get this on mac with rev 1d4f44ee5166.
> 
> Did you try this command with earlier revisions and got no asserts?

Yeah, I'd never seen this until yesterday so it appears to be a regression from the last few days (although I did upgrade to Windows 10 a few days ago but I'm pretty sure this appeared after).

I see this with pretty much every test I run.
It might be a Windows 10 thing after all. I'm seeing this with a trunk build from last Thursday (revision 89167fe9d3ba). I'll try going back a bit further.
This depend on VsyncObserver (DWM vsync code).  RefreshDriverVsyncObserver::NotifyVsync seems to be set feature time.

In D3DVsyncSource::D3DVsyncDisplay::VBlankLoop, qpcNow is 0x0000003c`553d53e2, but vblankTime.qpcVBlank is 0x0000003c`553efea8. So usAdjust is negative, then it sets feature time....
Is this a bug in our code? Our what we are getting from Windows?
Flags: needinfo?(m_kato)
(In reply to Timothy Nikkel (:tn) from comment #6)
> Is this a bug in our code? Our what we are getting from Windows?

Both.  DWM's vsync should not be future.  And if is future, we shouldn't adjust time.
Flags: needinfo?(m_kato)
See https://bugzilla.mozilla.org/show_bug.cgi?id=1147953#c21
Assignee: nobody → mchang
Blocks: 1147953
Component: ImageLib → Graphics
Whiteboard: gfx-noted
Pretty sure I saw this on my Windows 7 debug build.
Waiting on try before requesting review:

https://treeherder.mozilla.org/#/jobs?repo=try&revision=187ddab441cd

Looks like a few things have changed since I tried Windows 10 in April. Before, the vblank time was around TimeStamp::Now() +- a few microseconds. Now the vblank time Windows 10 reports from DwmGetCompositionTimingInfo() acts much like CVDisplayLink on OSX. When DWMFlush() wakes up, the vblank time is for the next upcoming vblank in the future on Windows 10. On Windows 7/8, it is the vblank that JUST happened, in the past. This patch fixes the vsync timestamp based on Windows version and adds a few asserts.
(In reply to Milan Sreckovic [:milan] from comment #9)
> Pretty sure I saw this on my Windows 7 debug build.

Hmm, I haven't been able to reproduce this on my Windows 7 machine. I'd be interested in seeing if the build in comment 10 asserts out when it's finished. 

What's also interesting is that on Windows 10 with this Thinkpad, the vsync timestamps aren't actually very clean. They are +-2 ms from 16.6 hz, which is roughly in the range of software timers. Testing on vsynctester.com, all the browsers have lots of jitter, but we're equal to Chrome. Edge is actually the worst performing.
(In reply to Mason Chang [:mchang] from comment #10)
> Created attachment 8645941 [details] [diff] [review]
> Normalize Windows vsync timestamps
> 
> Waiting on try before requesting review:
> 
> https://treeherder.mozilla.org/#/jobs?repo=try&revision=187ddab441cd
> 

Once the failures on other platforms are sorted, I suspect this will be necessary to fix Bug 1193040  (and probably similar other tests that hit that same assertion)
Blocks: 1193040
(In reply to Justin Wood (:Callek) from comment #12)
> Once the failures on other platforms are sorted, I suspect this will be
> necessary to fix Bug 1193040  (and probably similar other tests that hit
> that same assertion)

P.s. please do not push to try, explicitly mentioning win10 until coordinating with me, there is literally only 1 machine to speak of so far. (Takes >30 hours to run the full gambit of tests)
(In reply to Justin Wood (:Callek) from comment #13)
> (In reply to Justin Wood (:Callek) from comment #12)
> > Once the failures on other platforms are sorted, I suspect this will be
> > necessary to fix Bug 1193040  (and probably similar other tests that hit
> > that same assertion)

The results from try from comment 10 are interesting. From the logs, it actually looks like this callstack is causing the assertion failure:

15:28:25     INFO -  Assertion failure: aNowTime <= TimeStamp::Now(), at c:/builds/moz2_slave/try-w64-d-00000000000000000000/build/src/layout/base/nsRefreshDriver.cpp:1673
15:28:25     INFO -  #01: nsRefreshDriver::AdvanceTimeAndRefresh(__int64) [layout/base/nsRefreshDriver.cpp:1094]
15:28:25     INFO -  #02: nsDOMWindowUtils::AdvanceTimeAndRefresh(__int64) [dom/base/nsDOMWindowUtils.cpp:2410]

This isn't coming from Windows / DWMGetCompositionTimingInfo.

> P.s. please do not push to try, explicitly mentioning win10 until
> coordinating with me, there is literally only 1 machine to speak of so far.
> (Takes >30 hours to run the full gambit of tests)

I'm sorry, I think I'm confused. When I pushed to try using try chooser, only Windows XP, 7, and 8 are checked off. Is there something that automatically runs these tests on Windows 10 if I mention Windows 10 in the commit message? What commit message should I avoid? Thanks!
(In reply to Mason Chang [:mchang] from comment #14)
> (In reply to Justin Wood (:Callek) from comment #13)
> > (In reply to Justin Wood (:Callek) from comment #12)
> > > Once the failures on other platforms are sorted, I suspect this will be
> > > necessary to fix Bug 1193040  (and probably similar other tests that hit
> > > that same assertion)
> 
> The results from try from comment 10 are interesting. From the logs, it
> actually looks like this callstack is causing the assertion failure:
> 
> 15:28:25     INFO -  Assertion failure: aNowTime <= TimeStamp::Now(), at
> c:/builds/moz2_slave/try-w64-d-00000000000000000000/build/src/layout/base/
> nsRefreshDriver.cpp:1673
> 15:28:25     INFO -  #01: nsRefreshDriver::AdvanceTimeAndRefresh(__int64)
> [layout/base/nsRefreshDriver.cpp:1094]
> 15:28:25     INFO -  #02: nsDOMWindowUtils::AdvanceTimeAndRefresh(__int64)
> [dom/base/nsDOMWindowUtils.cpp:2410]
> 
> This isn't coming from Windows / DWMGetCompositionTimingInfo.

There are a LOT of things hitting this assert on https://treeherder.mozilla.org/#/jobs?repo=try&revision=8cf3952a5c3a&exclusion_profile=false I don't know root cause for any of them, but they all are breaking tests on win10 ;-)

> > P.s. please do not push to try, explicitly mentioning win10 until
> > coordinating with me, there is literally only 1 machine to speak of so far.
> > (Takes >30 hours to run the full gambit of tests)
> 
> I'm sorry, I think I'm confused. When I pushed to try using try chooser,
> only Windows XP, 7, and 8 are checked off. Is there something that
> automatically runs these tests on Windows 10 if I mention Windows 10 in the
> commit message? What commit message should I avoid? Thanks!

Anything that trychooser can generate is "good", I however pushed targetting win10, its not enabled by default, and as of right now can't be (we only have the one machine with win10 on it, I'm doing these pushes to get "in production" test results). That was mainly a request *not* to emulate my try: message to use win10 yourself [yet]
Use the previous vsync timestamp on windows 10 so that we don't have timestamps sent from vsync that are in the future. This behavior mimics CVDisplayLink on OS X. This patch removes one source of future timestamps sent to the refresh driver. Successful try:

https://treeherder.mozilla.org/#/jobs?repo=try&revision=f56e1a515c3a

Unfortunately, this doesn't actually fix all the assertion errors across platforms. It looks like the test suite itself is giving timestamps in the future, which are NOT coming from vsync:

https://treeherder.mozilla.org/#/jobs?repo=try&revision=8c8f2989dd27
Attachment #8645941 - Attachment is obsolete: true
Attachment #8646425 - Flags: review?(jmuizelaar)
Comment on attachment 8646425 [details] [diff] [review]
Use the previous vsync timestamp on windows 10

Review of attachment 8646425 [details] [diff] [review]:
-----------------------------------------------------------------

::: gfx/thebes/gfxWindowsPlatform.cpp
@@ +2566,5 @@
> +                                          QPC_TIME& aQpcVblankTime)
> +      {
> +        TimeStamp vsync = TimeStamp::Now();
> +        LARGE_INTEGER qpcNow;
> +        QueryPerformanceCounter(&qpcNow);

You should drop:
LARGE_INTEGER qpcNow;
QueryPerformanceCounter(&qpcNow);

into block where it's used
Attachment #8646425 - Flags: review?(jmuizelaar) → review+
Carrying r+, also correctly use the vsync timestamp on Windows 10. 

Successful try - https://treeherder.mozilla.org/#/jobs?repo=try&revision=ec14eb426660
Attachment #8646425 - Attachment is obsolete: true
Attachment #8646973 - Flags: review+
Keywords: checkin-needed
(In reply to Justin Wood (:Callek) from comment #12)
> (In reply to Mason Chang [:mchang] from comment #10)
> > Created attachment 8645941 [details] [diff] [review]
> > Normalize Windows vsync timestamps
> > 
> > Waiting on try before requesting review:
> > 
> > https://treeherder.mozilla.org/#/jobs?repo=try&revision=187ddab441cd
> > 
> 
> Once the failures on other platforms are sorted, I suspect this will be
> necessary to fix Bug 1193040  (and probably similar other tests that hit
> that same assertion)

(In reply to Mason Chang [:mchang] from comment #18)
> Created attachment 8646973 [details] [diff] [review]
> Use the previous vsync timestamp on windows 10
> 
> Carrying r+, also correctly use the vsync timestamp on Windows 10. 
> 
> Successful try -
> https://treeherder.mozilla.org/#/jobs?repo=try&revision=ec14eb426660

I added some win10 jobs to this push, for my own confirmation (of c#12). They should not be indicative of any issue with this patch, for clarity.
(In reply to Mason Chang [:mchang] from comment #18)
> Created attachment 8646973 [details] [diff] [review]
> Use the previous vsync timestamp on windows 10
> 
> Carrying r+, also correctly use the vsync timestamp on Windows 10. 
> 
> Successful try -
> https://treeherder.mozilla.org/#/jobs?repo=try&revision=ec14eb426660

:mchang, :jmuizelaar,

Is the crash on win10 ( https://treeherder.mozilla.org/logviewer.html#?job_id=10363928&repo=try ) that is crashing in 09:04:53 WARNING - PROCESS-CRASH | dom/tests/mochitest/general/test_bug1161721.html | application crashed [@ D3DVsyncSource::D3DVsyncDisplay::VBlankLoop()]   indicative of a problem with this patch?

Noteworthy though, this did eliminate the VAST majority of the assertions we were hitting on win10 M(4)
Flags: needinfo?(mchang)
Flags: needinfo?(jmuizelaar)
url:        https://hg.mozilla.org/integration/mozilla-inbound/rev/32db8dc79f005bb7e5f23090f8f2808170729ff6
changeset:  32db8dc79f005bb7e5f23090f8f2808170729ff6
user:       Mason Chang <mchang@mozilla.com>
date:       Wed Aug 12 09:20:00 2015 -0700
description:
Bug 1190257. Use the previous vsync timestamp on Windows 10. r=jrmuizel
url:        https://hg.mozilla.org/integration/mozilla-inbound/rev/74ec3ff61daaffbe5bb30c7dde09e292be87b66e
changeset:  74ec3ff61daaffbe5bb30c7dde09e292be87b66e
user:       Mason Chang <mchang@mozilla.com>
date:       Wed Aug 12 10:15:54 2015 -0700
description:
Backed out changeset 32db8dc79f00 (bug 1190257)
(In reply to Justin Wood (:Callek) from comment #20)
> (In reply to Mason Chang [:mchang] from comment #18)
> > Created attachment 8646973 [details] [diff] [review]
> > Use the previous vsync timestamp on windows 10
> > 
> > Carrying r+, also correctly use the vsync timestamp on Windows 10. 
> > 
> > Successful try -
> > https://treeherder.mozilla.org/#/jobs?repo=try&revision=ec14eb426660
> 
> :mchang, :jmuizelaar,
> 
> Is the crash on win10 (
> https://treeherder.mozilla.org/logviewer.html#?job_id=10363928&repo=try )
> that is crashing in 09:04:53 WARNING - PROCESS-CRASH |
> dom/tests/mochitest/general/test_bug1161721.html | application crashed [@
> D3DVsyncSource::D3DVsyncDisplay::VBlankLoop()]   indicative of a problem
> with this patch?
> 
> Noteworthy though, this did eliminate the VAST majority of the assertions we
> were hitting on win10 M(4)

Backed out due to Mochitest failures. Investigating. It's odd though from the log, it doesn't look like it's instant and it actually takes a while until the failure occurs.
Flags: needinfo?(mchang)
Flags: needinfo?(jmuizelaar)
Keywords: checkin-needed
Interestingly, I was able to reproduce this once on the integrated Intel HD card with the patch for Windows 10 vsync timestamps. It looked like every vsync timestamp given to us was still in the future, even using the past vsync timestamp. I restarted nightly, and the problem went away after that again. I wonder if this is the case on discrete GPUs as Jeff stated he still had the problem on an AMD GPU. Justin's try servers use a dedicated nvidia GPU as well.

Try: https://treeherder.mozilla.org/#/jobs?repo=try&revision=a873afd76b2c
Files: https://ftp-ssl.mozilla.org/pub/mozilla.org/firefox/try-builds/mchang@mozilla.com-a873afd76b2c

I would expect with this push, that the machines would report "DWM Vsync timestamp is in the future" and no longer see the FrameAnimator.cpp assertion error. I'd be very interested to see what the timestamps look like.

Jeff, Justin, can you please try those builds and paste a sample of the log you get? The builds should be done after 8/12/2015 after 4 PM PST. Thanks!
Another try -  https://treeherder.mozilla.org/#/jobs?repo=try&revision=b889bc89d74f

Looks like Windows 10, can once in a while, have two successive DwmFlush wake ups before vblank, causing the previous vsync to still be in the future by ~ < 0.01ms.

Also, Windows 10 can get a vblank time in the past. This patch tries to correct for both cases. Essentially, there isn't a lot of regularity with DwmFlush / DwmGetCompositionTimingInfo on Windows 10 :(.
The upcoming vsync reported by DwmGetCompositionTimingInfo can still be BEFORE the previous vsync. Another try:

https://treeherder.mozilla.org/#/jobs?repo=try&revision=2645169cefad
Windows 10 and Intel HD vsync timestamps are messy and all over the place once in a while. Most of the time, it reports the upcoming vsync. Sometimes, that upcoming vsync is in the past. Sometimes that upcoming vsync is before the previously seen vsync. Sometimes, the previous vsync is still in the future. In these error cases, we try to normalize to Now().
Attachment #8646973 - Attachment is obsolete: true
Attachment #8648866 - Flags: review?(jmuizelaar)
Comment on attachment 8648866 [details] [diff] [review]
Use the previous vsync timestamp on windows 10

Review of attachment 8648866 [details] [diff] [review]:
-----------------------------------------------------------------

It might be worth integrating this information 'Windows 10 and Intel HD vsync timestamps are messy and all over the place once in a while. Most of the time, it reports the upcoming vsync. Sometimes, that upcoming vsync is in the past. Sometimes that upcoming vsync is before the previously seen vsync. Sometimes, the previous vsync is still in the future. In these error cases, we try to normalize to Now().' into a comment in the source.
Attachment #8648866 - Flags: review?(jmuizelaar) → review+
url:        https://hg.mozilla.org/integration/mozilla-inbound/rev/634dc82163f42cf287d54ff9c45ae3d52f849991
changeset:  634dc82163f42cf287d54ff9c45ae3d52f849991
user:       Mason Chang <mchang@mozilla.com>
date:       Tue Aug 18 09:11:12 2015 -0700
description:
Bug 1190257. Use the previous vsync timestamp on windows 10. r=jrmuizel
(In reply to Jeff Muizelaar [:jrmuizel] from comment #29)
> Comment on attachment 8648866 [details] [diff] [review]
> Use the previous vsync timestamp on windows 10
> 
> Review of attachment 8648866 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> It might be worth integrating this information 'Windows 10 and Intel HD
> vsync timestamps are messy and all over the place once in a while. Most of
> the time, it reports the upcoming vsync. Sometimes, that upcoming vsync is
> in the past. Sometimes that upcoming vsync is before the previously seen
> vsync. Sometimes, the previous vsync is still in the future. In these error
> cases, we try to normalize to Now().' into a comment in the source.

Pushed with this comment.
https://hg.mozilla.org/mozilla-central/rev/634dc82163f4
Status: NEW → RESOLVED
Closed: 5 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla43
See Also: → 1196453
Comment on attachment 8648866 [details] [diff] [review]
Use the previous vsync timestamp on windows 10

Approval Request Comment
[Feature/regressing bug #]: Windows 10.
[User impact if declined]: There could be some weird animations / rendering glitches based on Timestamps on Windows 10.
[Describe test coverage new/current, TreeHerder]: Manual testing on Windows 10. Worked with Justin Wood on testing on preliminary Windows 10 try machines.
[Risks and why]: Low, this fixes Windows 10 Vsync Timestamps to be in line with other the other platforms.
[String/UUID change made/needed]: None.
Attachment #8648866 - Flags: approval-mozilla-aurora?
Comment on attachment 8648866 [details] [diff] [review]
Use the previous vsync timestamp on windows 10

Let's fix this windows 10 regression. 
Mason, any reason why you didn't request the uplift to beta?
Flags: needinfo?(mchang)
Attachment #8648866 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
(In reply to Sylvestre Ledru [:sylvestre] from comment #34)
> Comment on attachment 8648866 [details] [diff] [review]
> Use the previous vsync timestamp on windows 10
> 
> Let's fix this windows 10 regression. 
> Mason, any reason why you didn't request the uplift to beta?

I want to do a slow rollout with more Windows 10 users. I wanted to let sit on Nightly for a couple of days, Aurora for a couple of days, then I'd request uplift to beta.
Flags: needinfo?(mchang)
Comment on attachment 8648866 [details] [diff] [review]
Use the previous vsync timestamp on windows 10

See comment 33.
Attachment #8648866 - Flags: approval-mozilla-beta?
Comment on attachment 8648866 [details] [diff] [review]
Use the previous vsync timestamp on windows 10

Firefox experience on win10 needs to be awesome! Also, special thanks to Mason for stabilizing the patch over Nightly, Aurora, before requesting uplift to Beta.
Attachment #8648866 - Flags: approval-mozilla-beta? → approval-mozilla-beta+
Duplicate of this bug: 1196453
You need to log in before you can comment on or make changes to this bug.