Intermittent application crashed [@ mozilla::VsyncRefreshDriverTimer::RefreshDriverVsyncObserver::TickRefreshDriver(mozilla::TimeStamp)] after Assertion failure: aVsyncTimestamp <= *&rightnow

NEW
Unassigned

Status

()

defect
3 years ago
28 days ago

People

(Reporter: intermittent-bug-filer, Unassigned)

Tracking

({intermittent-failure, leave-open})

unspecified
x86
Windows 7
Points:
---
Dependency tree / graph

Firefox Tracking Flags

(firefox52 wontfix, firefox53 wontfix, firefox61 affected, firefox62 affected)

Details

(Whiteboard: [stockwell disabled], crash signature)

Attachments

(1 attachment)

Component: Layout → Graphics
Too late for firefox 52, mass-wontfix.
In the last 7 days there have been 32 failures.
The most affected platform is the Windows 7 one, but there some failures on windows7-32-stylo-disabled, too.
The failures occur only on the debug build type.

Here is an example of a recent log: https://treeherder.mozilla.org/logviewer.html#?repo=autoland&job_id=141105563&lineNumber=32682
And a relevant snippet from the log:

17:16:57     INFO -  716 INFO TEST-START | browser/components/sessionstore/test/browser_merge_closed_tabs.js
32658
17:16:57     INFO -  GECKO(5208) | ++DOCSHELL 16F09800 == 52 [pid = 5208] [id = {53198832-1a09-4736-b13b-f372b10379a7}]
32659
17:16:57     INFO -  GECKO(5208) | ++DOMWINDOW == 171 (16F09C00) [pid = 5208] [serial = 1159] [outer = 00000000]
32660
17:16:57     INFO -  GECKO(5208) | ++DOMWINDOW == 172 (170C6800) [pid = 5208] [serial = 1160] [outer = 16F09C00]
32661
17:16:57     INFO -  GECKO(5208) | ++DOCSHELL 1616B000 == 53 [pid = 5208] [id = {9ab453a6-223e-41be-bca2-9e148fc13f0d}]
32662
17:16:57     INFO -  GECKO(5208) | ++DOMWINDOW == 173 (1616B400) [pid = 5208] [serial = 1161] [outer = 00000000]
32663
17:16:58     INFO -  GECKO(5208) | ++DOCSHELL 17683000 == 54 [pid = 5208] [id = {9e3855eb-5b20-4f93-9890-2fe7d7494e0d}]
32664
17:16:58     INFO -  GECKO(5208) | ++DOMWINDOW == 174 (18B30800) [pid = 5208] [serial = 1162] [outer = 00000000]
32665
17:16:58     INFO -  GECKO(5208) | ++DOMWINDOW == 175 (191CE400) [pid = 5208] [serial = 1163] [outer = 18B30800]
32666
17:16:58     INFO -  GECKO(5208) | ++DOCSHELL 180F0400 == 5 [pid = 5856] [id = {31d8fbc1-ced2-4e72-8489-0243de6eded7}]
32667
17:16:58     INFO -  GECKO(5208) | ++DOMWINDOW == 24 (180F0C00) [pid = 5856] [serial = 151] [outer = 00000000]
32668
17:16:58     INFO -  GECKO(5208) | ++DOMWINDOW == 25 (18278C00) [pid = 5856] [serial = 152] [outer = 180F0C00]
32669
17:16:58     INFO -  GECKO(5208) | ++DOMWINDOW == 176 (15B15C00) [pid = 5208] [serial = 1164] [outer = 1616B400]
32670
17:16:58     INFO -  GECKO(5208) | [Parent 5208, Main Thread] WARNING: '!staticSink->EnsureInitActiveTIPKeyboard()', file z:/build/build/src/widget/windows/TSFTextStore.cpp, line 1089
32671
17:16:58     INFO -  GECKO(5208) | [Parent 5208, Main Thread] WARNING: '!staticSink->EnsureInitActiveTIPKeyboard()', file z:/build/build/src/widget/windows/TSFTextStore.cpp, line 1091
32672
17:16:58     INFO -  GECKO(5208) | ++DOMWINDOW == 26 (1827EC00) [pid = 5856] [serial = 153] [outer = 180F0C00]
32673
17:16:58     INFO -  GECKO(5208) | ++DOCSHELL 07FA3000 == 4 [pid = 4600] [id = {c8a9c1a4-ccc8-47e0-9ec3-27ed682625cf}]
32674
17:16:58     INFO -  GECKO(5208) | ++DOMWINDOW == 28 (0E5F4000) [pid = 4600] [serial = 251] [outer = 00000000]
32675
17:16:58     INFO -  GECKO(5208) | ++DOMWINDOW == 29 (0EDDEC00) [pid = 4600] [serial = 252] [outer = 0E5F4000]
32676
17:16:58     INFO -  GECKO(5208) | ++DOCSHELL 0EBEA000 == 3 [pid = 2884] [id = {f1d1faa5-f6e2-4d00-ba77-b031fff253fc}]
32677
17:16:58     INFO -  GECKO(5208) | ++DOMWINDOW == 10 (0EBEB400) [pid = 2884] [serial = 49] [outer = 00000000]
32678
17:16:58     INFO -  GECKO(5208) | ++DOCSHELL 1BF2C400 == 55 [pid = 5208] [id = {7c378516-6fd8-445b-95c6-acb447e00572}]
32679
17:16:58     INFO -  GECKO(5208) | ++DOMWINDOW == 177 (1C279800) [pid = 5208] [serial = 1165] [outer = 00000000]
32680
17:16:58     INFO -  GECKO(5208) | ++DOMWINDOW == 11 (101C3400) [pid = 2884] [serial = 50] [outer = 0EBEB400]
32681
17:16:58     INFO -  GECKO(5208) | ++DOMWINDOW == 178 (1C2A7000) [pid = 5208] [serial = 1166] [outer = 1C279800]
32682
17:16:58     INFO -  GECKO(5208) | Assertion failure: aVsyncTimestamp <= TimeStamp::Now(), at z:/build/build/src/layout/base/nsRefreshDriver.cpp:676
32683
17:16:58     INFO -  GECKO(5208) | #01: mozilla::VsyncRefreshDriverTimer::RefreshDriverVsyncObserver::TickRefreshDriver(mozilla::TimeStamp) [layout/base/nsRefreshDriver.cpp:664]
32684
17:16:58     INFO -  GECKO(5208) | #02: mozilla::VsyncRefreshDriverTimer::RefreshDriverVsyncObserver::NotifyVsync(mozilla::TimeStamp) [layout/base/nsRefreshDriver.cpp:587]
32685
17:16:58     INFO -  GECKO(5208) | #03: mozilla::layout::VsyncChild::RecvNotify(mozilla::TimeStamp const &) [layout/ipc/VsyncChild.cpp:70]
32686
17:16:58     INFO -  GECKO(5208) | #04: mozilla::layout::PVsyncChild::OnMessageReceived(IPC::Message const &) [s3:gecko-generated-sources:9a6479c28b0f57ad4e6d1aec1c5258678abc48225caeaa2a25f8a6893b83595687422ba915c00ef9e645c78dda14478aa8e2f891ae54d4764ccdba97f65bc47d/ipc/ipdl/PVsyncChild.cpp::155]
32687
17:16:58     INFO -  GECKO(5208) | #05: mozilla::ipc::PBackgroundChild::OnMessageReceived(IPC::Message const &) [s3:gecko-generated-sources:aa5c68372962a138ea539374325ce3a54189c523188b7763dbaf4b97aa5115b79016d84059c63a72c6b0fb941ba602d83972ff1e64b65142b0726b9024634c46/ipc/ipdl/PBackgroundChild.cpp::1755]
32688
17:16:58     INFO -  GECKO(5208) | #06: mozilla::ipc::MessageChannel::DispatchAsyncMessage(IPC::Message const &) [ipc/glue/MessageChannel.cpp:2120]
32689
17:16:58     INFO -  GECKO(5208) | #07: mozilla::ipc::MessageChannel::DispatchMessageW(IPC::Message &&) [ipc/glue/MessageChannel.cpp:2051]
32690
17:16:58     INFO -  GECKO(5208) | #08: mozilla::ipc::MessageChannel::RunMessage(mozilla::ipc::MessageChannel::MessageTask &) [ipc/glue/MessageChannel.cpp:1896]
32691
17:16:58     INFO -  GECKO(5208) | #09: mozilla::ipc::MessageChannel::MessageTask::Run() [ipc/glue/MessageChannel.cpp:1929]
32692
17:16:58     INFO -  GECKO(5208) | #10: nsThread::ProcessNextEvent(bool,bool *) [xpcom/threads/nsThread.cpp:1038]
32693
17:16:58     INFO -  GECKO(5208) | #11: NS_ProcessNextEvent(nsIThread *,bool) [xpcom/threads/nsThreadUtils.cpp:513]
32694
17:16:58     INFO -  GECKO(5208) | #12: mozilla::ipc::MessagePump::Run(base::MessagePump::Delegate *) [ipc/glue/MessagePump.cpp:97]
32695
17:16:58     INFO -  GECKO(5208) | #13: mozilla::ipc::MessagePumpForChildProcess::Run(base::MessagePump::Delegate *) [ipc/glue/MessagePump.cpp:301]
32696
17:16:58     INFO -  GECKO(5208) | #14: MessageLoop::RunInternal() [ipc/chromium/src/base/message_loop.cc:326]
32697
17:16:58     INFO -  GECKO(5208) | #15: MessageLoop::RunHandler() [ipc/chromium/src/base/message_loop.cc:320]
32698
17:16:58     INFO -  GECKO(5208) | #16: MessageLoop::Run() [ipc/chromium/src/base/message_loop.cc:300]
32699
17:16:58     INFO -  GECKO(5208) | #17: nsBaseAppShell::Run() [widget/nsBaseAppShell.cpp:160]
32700
17:16:58     INFO -  GECKO(5208) | #18: nsAppShell::Run() [widget/windows/nsAppShell.cpp:230]
32701
17:16:58     INFO -  GECKO(5208) | #19: XRE_RunAppShell() [toolkit/xre/nsEmbedFunctions.cpp:877]
32702
17:16:58     INFO -  GECKO(5208) | #20: mozilla::ipc::MessagePumpForChildProcess::Run(base::MessagePump::Delegate *) [ipc/glue/MessagePump.cpp:269]
32703
17:16:58     INFO -  GECKO(5208) | #21: MessageLoop::RunInternal() [ipc/chromium/src/base/message_loop.cc:326]
32704
17:16:58     INFO -  GECKO(5208) | #22: MessageLoop::RunHandler() [ipc/chromium/src/base/message_loop.cc:320]
32705
17:16:58     INFO -  GECKO(5208) | #23: MessageLoop::Run() [ipc/chromium/src/base/message_loop.cc:300]
32706
17:16:58     INFO -  GECKO(5208) | #24: XRE_InitChildProcess(int,char * * const,XREChildData const *) [toolkit/xre/nsEmbedFunctions.cpp:707]
32707
17:16:58     INFO -  GECKO(5208) | #25: mozilla::BootstrapImpl::XRE_InitChildProcess(int,char * * const,XREChildData const *) [toolkit/xre/Bootstrap.cpp:69]
32708
17:16:58     INFO -  GECKO(5208) | #26: content_process_main(mozilla::Bootstrap *,int,char * * const) [ipc/contentproc/plugin-container.cpp:64]
32709
17:16:58     INFO -  GECKO(5208) | #27: NS_internal_main(int,char * *,char * *) [browser/app/nsBrowserApp.cpp:280]
32710
17:16:58     INFO -  GECKO(5208) | #28: wmain [toolkit/xre/nsWindowsWMain.cpp:114]
32711
17:16:58     INFO -  GECKO(5208) | #29: __scrt_common_main_seh [f:/dd/vctools/crt/vcstartup/src/startup/exe_common.inl:283]
32712
17:16:58     INFO -  GECKO(5208) | #30: kernel32.dll + 0x53c45
32713
17:16:58     INFO -  GECKO(5208) | #31: ntdll.dll + 0x637f5
32714
17:16:58     INFO -  GECKO(5208) | #32: ntdll.dll + 0x637c8
32715
17:16:58     INFO -  GECKO(5208) | ++DOMWINDOW == 179 (1CB67C00) [pid = 5208] [serial = 1167] [outer = 1C279800]
32716
17:16:58     INFO -  GECKO(5208) | ###!!! [Parent][MessageChannel] Error: (msgtype=0x150062,name=PBrowser::Msg_UpdateDimensions) Channel error: cannot send/recv
32717
17:16:58     INFO -  GECKO(5208) | ###!!! [Parent][MessageChannel] Error: (msgtype=0x150084,name=PBrowser::Msg_Destroy) Channel error: cannot send/recv
32718
17:16:58     INFO -  GECKO(5208) | MEMORY STAT | vsize 1166MB | vsizeMaxContiguous 296MB | residentFast 324MB | heapAllocated 122MB

:milan, would you please take a look?
Flags: needinfo?(milan)
Whiteboard: [stockwell needswork]
There were 36 failures in the last 7 days.
The failure pattern is the same as in Comment 22
In the last 7 days there are 31 failures. They occur on Windows 7 - debug. 
A recent log example: https://treeherder.mozilla.org/logviewer.html#?repo=autoland&job_id=147939958&lineNumber=29033

:milan, can you please take a look?
Whiteboard: [stockwell unknown] → [stockwell needswork:owner]
Summary: Intermittent font-size-16.html | application crashed [@ mozilla::VsyncRefreshDriverTimer::RefreshDriverVsyncObserver::TickRefreshDriver(mozilla::TimeStamp)] after Assertion failure: aVsyncTimestamp <= TimeStamp::Now() → Intermittent application crashed [@ mozilla::VsyncRefreshDriverTimer::RefreshDriverVsyncObserver::TickRefreshDriver(mozilla::TimeStamp)] after Assertion failure: aVsyncTimestamp <= TimeStamp::Now()
Duplicate of this bug: 1420664
Duplicate of this bug: 1367182
Duplicate of this bug: 1389331
Duplicate of this bug: 1412655
Duplicate of this bug: 1418555
Duplicate of this bug: 1416091
Duplicate of this bug: 1364384
Duplicate of this bug: 1417675
Duplicate of this bug: 1400756
Duplicate of this bug: 1401720
Duplicate of this bug: 1405155
Duplicate of this bug: 1247512
Duplicate of this bug: 1424068
Duplicate of this bug: 1431908
Duplicate of this bug: 1443044
Duplicate of this bug: 1447810
Duplicate of this bug: 1446379
This bug failed 33 times in the last 7 days. Occurs on windows7 platforms and on debug build type.

Recent log:
https://treeherder.mozilla.org/logviewer.html#?job_id=200373801&repo=autoland&lineNumber=1295

INFO - TEST-OK | dom/security/test/csp/test_301_redirect.html | took 1274ms
06:15:02     INFO - GECKO(1628) | ++DOMWINDOW == 12 (00F6A400) [pid = 3588] [serial = 12] [outer = 00FC17A0]
06:15:02     INFO - TEST-START | dom/security/test/csp/test_302_redirect.html
06:15:02     INFO - GECKO(1628) | ++DOMWINDOW == 13 (00F64C00) [pid = 3588] [serial = 13] [outer = 00FC17A0]
06:15:03     INFO - GECKO(1628) | ++DOCSHELL 07FDD400 == 4 [pid = 3588] [id = {c97b4916-ef98-4be3-b359-25a82e1e2e9c}]
06:15:03     INFO - GECKO(1628) | ++DOMWINDOW == 14 (00FC1540) [pid = 3588] [serial = 14] [outer = 00000000]
06:15:03     INFO - GECKO(1628) | ++DOMWINDOW == 15 (07FDE000) [pid = 3588] [serial = 15] [outer = 00FC1540]
06:15:03     INFO - GECKO(1628) | ++DOMWINDOW == 16 (07FDF000) [pid = 3588] [serial = 16] [outer = 00FC1540]
06:15:03     INFO - GECKO(1628) | [Child 3588, Main Thread] WARNING: NS_ENSURE_SUCCESS(rv, rv) failed with result 0x805E0006: file z:/build/build/src/dom/security/nsContentSecurityManager.cpp, line 678
06:15:03     INFO - GECKO(1628) | [Child 3588, Main Thread] WARNING: 'NS_FAILED(rv)', file z:/build/build/src/netwerk/protocol/http/HttpChannelChild.cpp, line 2723
06:15:03     INFO - GECKO(1628) | [Child 3588, Main Thread] WARNING: NS_ENSURE_SUCCESS(rv, rv) failed with result 0x805E0006: file z:/build/build/src/dom/security/nsContentSecurityManager.cpp, line 678
06:15:03     INFO - GECKO(1628) | [Child 3588, Main Thread] WARNING: 'NS_FAILED(rv)', file z:/build/build/src/netwerk/protocol/http/HttpChannelChild.cpp, line 2723
06:15:03     INFO - GECKO(1628) | Assertion failure: aVsyncTimestamp <= TimeStamp::Now(), at z:/build/build/src/layout/base/nsRefreshDriver.cpp:665
06:15:14     INFO - GECKO(1628) | #01: mozilla::VsyncRefreshDriverTimer::RefreshDriverVsyncObserver::NotifyVsync(mozilla::TimeStamp) [layout/base/nsRefreshDriver.cpp:573]
06:15:14     INFO - 
06:15:14     INFO - GECKO(1628) | #02: mozilla::layout::VsyncChild::RecvNotify(mozilla::TimeStamp const &) [layout/ipc/VsyncChild.cpp:78]

davidb: Can you please take a look at this bug?
Flags: needinfo?(dbolter)
Whiteboard: [stockwell unknown] → [stockwell needswork]
I'm not sure how to understand if I should worry about this frequency of failure. How does this rank among other intermittents? If we look at the long history of this one... it is fairly bumpy. I think it if spiked more than it ever has in history* it would be interesting to call out.

* https://treeherder.mozilla.org/intermittent-failures.html#/bugdetails?startday=2018-05-24&endday=2018-09-21&tree=all&bug=1306896
Flags: needinfo?(milaninbugzilla)
Flags: needinfo?(dbolter)
Priority: -- → P5
Duplicate of this bug: 1502153
Summary: Intermittent application crashed [@ mozilla::VsyncRefreshDriverTimer::RefreshDriverVsyncObserver::TickRefreshDriver(mozilla::TimeStamp)] after Assertion failure: aVsyncTimestamp <= TimeStamp::Now() → Intermittent application crashed [@ mozilla::VsyncRefreshDriverTimer::RefreshDriverVsyncObserver::TickRefreshDriver(mozilla::TimeStamp)] after Assertion failure: aVsyncTimestamp <= *&rightnow
So the assertion that's failing is at [1]. This happens because the vsync timestamp is greater than Now(). The vsync timestamp follows a long and convoluted path, but originates propagation at [2]. Note that on the line just before, there is another assertion that the vsync timestamp is <= Now(), and presumably that one passes all the time. So either Now() is returning a non-monotonic value, or the value is getting corrupted somewhere along the way.

The fact that this happens only on Windows, and given bug 1447643, makes me think it's the former.

[1] https://searchfox.org/mozilla-central/rev/7c848ac7630df5baf1314b0c03e015683599efb9/layout/base/nsRefreshDriver.cpp#669
[2] https://searchfox.org/mozilla-central/rev/7c848ac7630df5baf1314b0c03e015683599efb9/gfx/thebes/gfxWindowsPlatform.cpp#1995
:Kats are you still working on this? Is there a way to fix it?  anything we can help with?
Flags: needinfo?(kats)
No, I'm not actively working on this. I don't know if there's a way to fix it - I don't think we actually understand what is happening here yet.
Flags: needinfo?(kats)
:Kats, thank you for clarifying. 

Joel, I saw this whiteboard tag on older bugs [wontfix?] , is it correct to add it here too?
Flags: needinfo?(jmaher)
for bugs that are fixed or stopped failing we can use whiteboard  tags, unfortunately here, we might be stuck with one of those hard to fix or disable bugs.
Flags: needinfo?(jmaher)

Over the last 7 days this bug has 31 failures. These happen only on windows7-32

Here is the most recent log example: https://treeherder.mozilla.org/logviewer.html#/jobs?job_id=221712888&repo=autoland&lineNumber=24311

13:13:10 INFO - TEST-START | /mixed-content/link-css-tag/http-csp/same-host-http/top-level/swap-scheme-redirect/blockable/opt-in-blocks.https.html
13:13:10 INFO - Closing window 6442450950
13:13:10 INFO - PID 3212 | ++DOCSHELL 0126A800 == 1 [pid = 4036] [id = {b1fb9c0a-f928-4ea7-b92d-b48935f2d6c7}]
13:13:10 INFO - PID 3212 | [Child 4036, Main Thread] WARNING: 'NS_FAILED(rv)', file z:/build/build/src/widget/windows/nsLookAndFeel.cpp, line 854
13:13:10 INFO - PID 3212 | ++DOMWINDOW == 1 (0890A2E0) [pid = 4036] [serial = 1] [outer = 00000000]
13:13:10 INFO - PID 3212 | ++DOMWINDOW == 2 (0126BC00) [pid = 4036] [serial = 2] [outer = 0890A2E0]
13:13:10 INFO - PID 3212 | Assertion failure: aVsyncTimestamp <= *&rightnow, at z:/build/build/src/layout/base/nsRefreshDriver.cpp:640
13:13:11 WARNING - Traceback (most recent call last):
13:13:11 WARNING - File "Z:\task_1547469100\build\tests\web-platform\tests\tools\wptrunner\wptrunner\executors\executormarionette.py", line 588, in _run
13:13:11 WARNING - self.result = True, self.func(self.protocol, self.url, self.timeout)
13:13:11 WARNING - File "Z:\task_1547469100\build\tests\web-platform\tests\tools\wptrunner\wptrunner\executors\executormarionette.py", line 700, in do_testharness
13:13:11 WARNING - test_window = protocol.testharness.get_test_window(self.window_id, parent_window)
13:13:11 WARNING - File "Z:\task_1547469100\build\tests\web-platform\tests\tools\wptrunner\wptrunner\executors\executormarionette.py", line 204, in get_test_window
13:13:11 WARNING - elif after[0] == parent and len(after) > 2:
13:13:11 WARNING - IndexError: list index out of range
13:13:11 WARNING -
13:13:11 INFO - mozcrash Copy/paste: Z:\task_1547469100\build\win32-minidump_stackwalk.exe c:\users\task_1547469100\appdata\local\temp\tmpdtvuej.mozrunner\minidumps\077a7578-991c-4bfa-b782-2f08588694ea.dmp Z:\task_1547469100\build\symbols
13:13:22 INFO - mozcrash Saved minidump as Z:\task_1547469100\build\blobber_upload_dir\077a7578-991c-4bfa-b782-2f08588694ea.dmp
13:13:22 INFO - mozcrash Saved app info as Z:\task_1547469100\build\blobber_upload_dir\077a7578-991c-4bfa-b782-2f08588694ea.extra
13:13:22 INFO - PROCESS-CRASH | /mixed-content/link-css-tag/http-csp/same-host-http/top-level/swap-scheme-redirect/blockable/opt-in-blocks.https.html | application crashed [@ mozilla::VsyncRefreshDriverTimer::RefreshDriverVsyncObserver::TickRefreshDriver(mozilla::layers::BaseTransactionId<mozilla::VsyncIdType>,mozilla::TimeStamp)]
13:13:22 INFO - Crash dump filename: c:\users\task_1547469100\appdata\local\temp\tmpdtvuej.mozrunner\minidumps\077a7578-991c-4bfa-b782-2f08588694ea.dmp
13:13:22 INFO - Operating system: Windows NT
13:13:22 INFO - 6.1.7601 Service Pack 1
13:13:22 INFO - CPU: x86
13:13:22 INFO - GenuineIntel family 6 model 63 stepping 2
13:13:22 INFO - 8 CPUs
13:13:22 INFO -
13:13:22 INFO - GPU: UNKNOWN
13:13:22 INFO -
13:13:22 INFO - Crash reason: EXCEPTION_BREAKPOINT
13:13:22 INFO - Crash address: 0x58572631
13:13:22 INFO - Assertion: Unknown assertion type 0x00000000
13:13:22 INFO - Process uptime: 3 seconds
13:13:22 INFO -

Flags: needinfo?(jbonisteel)

Hey Timothy - could you take a look at what additional debugging assertions we could add here that would help us figure out if we could take action on this bug?

Flags: needinfo?(jbonisteel) → needinfo?(tnikkel)
Priority: -- → P4
Whiteboard: [stockwell unknown] → [stockwell needswork:owner]

There are 33 failures in the last 7 days.
They occur on windows7-32 debug.
Recent failure log: https://treeherder.mozilla.org/logviewer.html#/jobs?job_id=225802710&repo=autoland&lineNumber=8491
tnikkel: Hi, Do you have any updates on this? Thanks!

I haven't had a chance to look at this because I've been working on webrender stuff.

Update: there have been 58 failures within the last 7 days on Windows 7x 32 Debug.

Recent log: https://treeherder.mozilla.org/logviewer.html#/jobs?job_id=227773086&repo=autoland&lineNumber=81104

There are 43 total failures in the last 7 days on windows7-32 debug.

Recent failure log: https://treeherder.mozilla.org/logviewer.html#/jobs?job_id=230014849&repo=mozilla-inbound&lineNumber=9624

01:24:29 INFO - TEST-START | devtools/client/performance/test/browser_perf-options-flatten-tree-recursion-02.js
01:24:29 INFO - GECKO(2760) | Adding tab with url: http://example.com/browser/devtools/client/performance/test/doc_simple-test.html#6062736375190076.
01:24:29 INFO - GECKO(2760) | ++DOCSHELL 0B77A400 == 3 [pid = 1588] [id = {c3ef19c4-d439-49ed-beaa-1bc41f10fe14}]
01:24:29 INFO - GECKO(2760) | ++DOMWINDOW == 12 (087152E0) [pid = 1588] [serial = 107] [outer = 00000000]
01:24:29 INFO - GECKO(2760) | ++DOMWINDOW == 13 (0B77DC00) [pid = 1588] [serial = 108] [outer = 087152E0]
01:24:29 INFO - GECKO(2760) | [Parent 4704, Main Thread] WARNING: 'NS_FAILED(rv)', file z:/build/build/src/netwerk/url-classifier/UrlClassifierCommon.cpp, line 122
01:24:29 INFO - GECKO(2760) | ++DOMWINDOW == 14 (09E49000) [pid = 1588] [serial = 109] [outer = 087152E0]
01:24:30 INFO - GECKO(2760) | Initializing a performance panel.
01:24:32 INFO - GECKO(2760) | Assertion failure: aVsyncTimestamp <= *&rightnow, at z:/build/build/src/layout/base/nsRefreshDriver.cpp:685
01:24:32 INFO - GECKO(2760) | Assertion failure: aVsyncTimestamp <= *&rightnow, at z:/build/build/src/layout/base/nsRefreshDriver.cpp:685
01:24:32 INFO - GECKO(2760) | Event: 'Performance:UI:MemoryFlameGraphRendered' on [object MemoryFlameGraphView] received.
01:24:32 INFO - GECKO(2760) | Destroying panel.
01:24:32 INFO - GECKO(2760) | ++DOMWINDOW == 36 (10765000) [pid = 4704] [serial = 442] [outer = 0D84D2E0]
01:24:32 INFO - GECKO(2760) | #01: mozilla::VsyncRefreshDriverTimer::RefreshDriverVsyncObserver::NotifyVsync(mozilla::VsyncEvent const &) [layout/base/nsRefreshDriver.cpp:593]
01:24:32 INFO -
01:24:32 INFO - GECKO(2760) | #02: mozilla::layout::VsyncChild::RecvNotify(mozilla::VsyncEvent const &) [layout/ipc/VsyncChild.cpp:67]
01:24:32 INFO -
01:24:32 INFO - GECKO(2760) | #03: mozilla::layout::PVsyncChild::OnMessageReceived(IPC::Message const &) [s3:gecko-generated-sources:77ce59d8b2c7052469c47c063657e9de1ccc8108986d35814c718a6919e13f00c69b96f485bc73c2590f51f3ea688a95fac179d8497a06fdf9265adfe5cefbb3/ipc/ipdl/PVsyncChild.cpp::168]
01:24:32 INFO -
01:24:32 INFO - GECKO(2760) | #04: mozilla::ipc::PBackgroundChild::OnMessageReceived(IPC::Message const &) [s3:gecko-generated-sources:0040703f80f4c0152f502cebe59e5dbc8d4bb44c958609e14ba1df7a323416ac82d056516b592df5ae81ee8c9aa398d80e5348f15d03fd52af42b5894f8b2880/ipc/ipdl/PBackgroundChild.cpp::2808]
01:24:32 INFO -
01:24:32 INFO - GECKO(2760) | #05: mozilla::ipc::MessageChannel::DispatchAsyncMessage(IPC::Message const &) [ipc/glue/MessageChannel.cpp:2151]
01:24:32 INFO -
01:24:32 INFO - GECKO(2760) | #06: mozilla::ipc::MessageChannel::DispatchMessage(IPC::Message &&) [ipc/glue/MessageChannel.cpp:2077]
01:24:32 INFO -
01:24:32 INFO - GECKO(2760) | #07: mozilla::ipc::MessageChannel::RunMessage(mozilla::ipc::MessageChannel::MessageTask &) [ipc/glue/MessageChannel.cpp:1937]
01:24:32 INFO -
01:24:32 INFO - GECKO(2760) | #08: mozilla::ipc::MessageChannel::MessageTask::Run() [ipc/glue/MessageChannel.cpp:1969]
01:24:32 INFO -
01:24:32 INFO - GECKO(2760) | #09: nsThread::ProcessNextEvent(bool,bool *) [xpcom/threads/nsThread.cpp:1162]
01:24:32 INFO -
01:24:32 INFO - GECKO(2760) | #10: NS_ProcessNextEvent(nsIThread *,bool) [xpcom/threads/nsThreadUtils.cpp:474]
01:24:32 INFO -
01:24:32 INFO - GECKO(2760) | #11: mozilla::ipc::MessagePump::Run(base::MessagePump::Delegate *) [ipc/glue/MessagePump.cpp:110]
01:24:32 INFO -
01:24:32 INFO - GECKO(2760) | #12: MessageLoop::RunInternal() [ipc/chromium/src/base/message_loop.cc:315]
01:24:32 INFO -
01:24:32 INFO - GECKO(2760) | #13: MessageLoop::RunHandler() [ipc/chromium/src/base/message_loop.cc:309]
01:24:32 INFO -
01:24:32 INFO - GECKO(2760) | #14: MessageLoop::Run() [ipc/chromium/src/base/message_loop.cc:291]

Whiteboard: [stockwell disable-recommended] → [stockwell needswork:owner]

There are 50 failures in the last 7 days. All on windows7-32 debug.
:jbonisteel , can you please take a look?

Flags: needinfo?(jbonisteel)
Whiteboard: [stockwell disable-recommended] → [stockwell needswork:owner]
OS: Unspecified → Windows 7
Hardware: Unspecified → x86

The Timestamp implementation on Windows seems to be based off of two system APIs: GetTickCount64 and QueryPerformanceCounter. QPC seems to be preferred over GTC, unless a "failure" in QPC is detected.

There is code in the Timestamp implementation to detect if QPC "fails" be comparing elapsed time of QPC to elapsed time of GTC.

Through logging on try server I observed the following in different failures:

  1. the QPC failure detection kicks in a lot and disables use of QPC
  2. the QPC of the vsync timestamp was in the future compared to the QPC of TimeStamp::Now().
  3. the QPC of TimeStamp::Now() was missing (presumably because it was disabled) and the GTC of the vsync timestamp was in the future compared to the GTC of TimeStamp::Now()
  4. the vsync timestamp printf of raw GTC and QPC values matched at the source (SoftwareDisplay::NotifyVsync) and in the refresh driver, so the vsync timestamp wasn't getting messed up.

I guess we could spend more time on the error detection in our Windows TimeStamp code, but this failure is only happening on 32-bit machines.

Here's a patch to disable the assertions on 32-bit Windows. Depends how important we think this assertion is relative to the work to investigate more.

Flags: needinfo?(tnikkel)
Flags: needinfo?(jbonisteel)

Thanks for digging into this! I think disabling the assertions on Win32 is fine.

Pushed by apavel@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/b7e35dab3ad5
Disable assert on 32-bit Windows. r=kats
Keywords: leave-open
Whiteboard: [stockwell disable-recommended] → [stockwell disabled]
Crash Signature: [@ mozilla::VsyncRefreshDriverTimer::RefreshDriverVsyncObserver::TickRefreshDriver(mozilla::layers::BaseTransactionId<mozilla::VsyncIdType>,mozilla::TimeStamp)]

Honza, here is seemingly another example of Timestamps going backwards, which from seeing other bugs today might be of interest to you. (The assert is disabled now, so it's not of immediate concern)

Flags: needinfo?(honzab.moz)

I'm running out of ideas. Bug 1487778, now on 65 (past release), should protect. The two release failures in Comment 154 are pretty weird.

I'm thinking of simplifying the code and removing the windows timestamp complicated fallback code altogether (but I don't understand how that could go produce values going back either!) and use gettickcount (or even something else) only for fixing bug 1515155 (check the stability and cpu family, then no runtime checking).

Flags: needinfo?(honzab.moz)
You need to log in before you can comment on or make changes to this bug.