Closed Bug 1305628 Opened 3 years ago Closed 3 years ago

Fix vsync sometimes not firing after the GPU process crashes

Categories: Core :: Graphics: Layers, defect

Status: RESOLVED FIXED

People: Reporter: dvander, Assigned: dvander

References: Blocks 1 open bug

Attachments: 2 files

This is totally crazy IMO, but apparently ClientLayerManager's destructor can fire a DidComposite, which can trigger a refresh tick, and a repaint. This causes the compositor to be recreated in the compositor's destructor. I would love to know why we do that, and if we can have it be a message posted to the event loop instead.

Anyway, because of this, we have to make sure to remove any compositor-associated pointers on nsWindow before we release the compositor. In this case the old vsync dispatcher was getting picked up by the new compositor, then destroyed on the assumption that no new compositor existed.
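
To make the ordering problem concrete, here is a minimal, self-contained sketch. It is not the Gecko code: `Window`, `Compositor`, `VsyncDispatcher`, and both `DestroyCompositor*` methods are hypothetical stand-ins, and the re-entrant repaint is modelled by a destructor that calls back into its owner.

```cpp
#include <cassert>
#include <memory>
#include <utility>

// Hypothetical stand-ins; none of these are the real Gecko classes.
struct VsyncDispatcher {
  bool shutDown = false;
};

struct Window;

struct Compositor {
  Window* owner = nullptr;
  bool repaintOnDestroy = false;  // models DidComposite -> refresh tick -> repaint
  std::shared_ptr<VsyncDispatcher> dispatcher;
  ~Compositor();
};

struct Window {
  std::unique_ptr<Compositor> compositor;
  std::shared_ptr<VsyncDispatcher> dispatcher;

  void CreateCompositor(bool repaintOnDestroy) {
    // Reuses whatever dispatcher the window still points at.
    if (!dispatcher) dispatcher = std::make_shared<VsyncDispatcher>();
    auto c = std::make_unique<Compositor>();
    c->owner = this;
    c->repaintOnDestroy = repaintOnDestroy;
    c->dispatcher = dispatcher;
    compositor = std::move(c);
  }

  // Releases the compositor first, then tears down the dispatcher.
  void DestroyCompositorBuggy() {
    auto old = std::move(compositor);
    old.reset();  // may re-enter CreateCompositor and reuse `dispatcher`
    if (dispatcher) {
      dispatcher->shutDown = true;  // assumes no new compositor exists
      dispatcher.reset();
    }
  }

  // Detaches compositor-associated pointers before releasing the compositor.
  void DestroyCompositorFixed() {
    auto oldDispatcher = std::move(dispatcher);
    if (oldDispatcher) oldDispatcher->shutDown = true;
    auto old = std::move(compositor);
    old.reset();  // a re-entrant CreateCompositor now builds a fresh dispatcher
  }
};

Compositor::~Compositor() {
  if (repaintOnDestroy && owner) owner->CreateCompositor(false);
}

// Returns true if the compositor that exists after teardown has a live dispatcher.
bool VsyncAliveAfterRecreate(bool fixedOrder) {
  Window w;
  w.CreateCompositor(/*repaintOnDestroy=*/true);
  if (fixedOrder) {
    w.DestroyCompositorFixed();
  } else {
    w.DestroyCompositorBuggy();
  }
  return w.compositor && w.compositor->dispatcher &&
         !w.compositor->dispatcher->shutDown;
}
```

In this model, `VsyncAliveAfterRecreate(false)` leaves the recreated compositor holding a shut-down dispatcher, while `VsyncAliveAfterRecreate(true)` gives it a fresh one.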
Attached patch: fix
This fixes comment #0, as well as a crash that can occur if the InProcessCompositorWidget is shut down after the compositor.
Attachment #8795166 - Flags: review?(matt.woodrow)
(In reply to David Anderson [:dvander] from comment #0)
> This is totally crazy IMO, but apparently ClientLayerManager's destructor
> can fire a DidComposite, which can trigger a refresh tick, and a repaint.
> This causes the compositor to be recreated in the compositor's destructor. I
> would love to know why we do that, and if we can have it be a message posted
> to the event loop instead.

The DidComposite events get delivered to the refresh driver, which uses them to control its throttling. If the ClientLayerManager is waiting on a DidComposite from the Compositor when it gets destroyed, then we risk losing this message and having the refresh driver wait indefinitely for a message that never arrives.

It's a very weird corner case that might never actually happen. We can definitely do it with a message posted to the event loop instead.
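
As a rough illustration of the "post it to the event loop" idea, the sketch below defers the notification instead of delivering it from inside the destructor. `EventLoop`, `RefreshDriver`, and `LayerManager` are simplified, hypothetical types rather than the real classes, and the sketch assumes the driver outlives any pending task.

```cpp
#include <cassert>
#include <deque>
#include <functional>
#include <utility>

using Task = std::function<void()>;

// Hypothetical single-threaded event loop.
struct EventLoop {
  std::deque<Task> tasks;
  void Post(Task t) { tasks.push_back(std::move(t)); }
  void RunPending() {
    while (!tasks.empty()) {
      Task t = std::move(tasks.front());
      tasks.pop_front();
      t();
    }
  }
};

// Hypothetical refresh driver that throttles until DidComposite arrives.
struct RefreshDriver {
  bool waitingForComposite = false;
  int ticks = 0;
  void DidComposite() {
    waitingForComposite = false;
    ++ticks;  // stands in for the refresh tick and repaint
  }
};

struct LayerManager {
  EventLoop& loop;
  RefreshDriver& driver;
  bool awaitingDidComposite;

  LayerManager(EventLoop& l, RefreshDriver& d, bool awaiting)
      : loop(l), driver(d), awaitingDidComposite(awaiting) {}

  ~LayerManager() {
    if (awaitingDidComposite) {
      // Defer instead of calling driver.DidComposite() here, so no repaint
      // runs while the owner is still in the middle of teardown.
      RefreshDriver* d = &driver;  // assumed to outlive the posted task
      loop.Post([d] { d->DidComposite(); });
    }
  }
};
```

The driver still gets unblocked, but only once the loop next runs, after the destructor's caller has finished its teardown.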

> 
> Anyway, because of this, we have to make sure to remove any
> compositor-associated pointers on nsWindow before we release the compositor.
> In this case the old vsync dispatcher was getting picked up by the new
> compositor, then destroyed in the assumption no new compositor existed.
Attachment #8795166 - Flags: review?(matt.woodrow) → review+
Pushed by danderson@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/7bcb0c169466
Fix vsync sometimes not firing after the GPU process crashes. (bug 1305628, r=mattwoodrow)
Attached patch: followup for gtk
The null-check in the Windows code needed to be added to the GTK code as well.
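
For readers unfamiliar with this kind of guard, a minimal sketch follows. The types are hypothetical and are not the actual widget code or the contents of the patch; the point is only that the dispatcher pointer may already have been cleared by the time the compositor side asks to observe or unobserve vsync.

```cpp
#include <cassert>
#include <mutex>

struct VsyncObserver {};

// Hypothetical dispatcher whose setter takes a lock, so calling it through a
// null pointer would fault inside the lock call.
struct VsyncDispatcher {
  std::mutex lock;
  VsyncObserver* observer = nullptr;
  void SetObserver(VsyncObserver* o) {
    std::lock_guard<std::mutex> guard(lock);
    observer = o;
  }
};

// Hypothetical compositor widget.
struct Widget {
  VsyncDispatcher* dispatcher = nullptr;  // may be cleared during teardown

  // Returns false, and does nothing, if the dispatcher is already gone.
  bool ObserveVsync(VsyncObserver* o) {
    if (!dispatcher) return false;
    dispatcher->SetObserver(o);
    return true;
  }
};
```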
Attachment #8795470 - Flags: review?(matt.woodrow)
Attachment #8795470 - Flags: review?(matt.woodrow) → review+
Backed out for crashing with mozilla::OffTheBooksMutex::Lock:

https://hg.mozilla.org/integration/mozilla-inbound/rev/e3f2279355ba4974913f3d3415bf2da04eee0355

Push with crashes: https://treeherder.mozilla.org/#/jobs?repo=mozilla-inbound&revision=7bcb0c169466f11c639dc1f9a3e36072ddffad01
Failure log: https://treeherder.mozilla.org/logviewer.html#?job_id=36554959&repo=mozilla-inbound

14:10:09  WARNING -  TEST-UNEXPECTED-FAIL | devtools/client/performance/test/browser_timeline-waterfall-generic.js | application terminated with exit code 11
14:10:09     INFO -  runtests.py | Application ran for: 0:00:47.574551
14:10:09     INFO -  zombiecheck | Reading PID log: /tmp/tmpVDnVY_pidlog
14:10:09     INFO -  ==> process 32065 launched child process 32086
14:10:09     INFO -  ==> process 32065 launched child process 32116
14:10:09     INFO -  zombiecheck | Checking for orphan process with PID: 32086
14:10:09     INFO -  zombiecheck | Checking for orphan process with PID: 32116
14:10:09     INFO -  mozcrash Downloading symbols from: https://queue.taskcluster.net/v1/task/EnCg1wT2QjWOXh8r7F-pzw/artifacts/public/build/firefox-52.0a1.en-US.linux-x86_64.crashreporter-symbols.zip
14:10:27     INFO -  mozcrash Copy/paste: /builds/slave/test/build/linux64-minidump_stackwalk /tmp/tmpnB_4Y9.mozrunner/minidumps/05ee682d-ddb2-7551-4d223003-67af9695.dmp /tmp/tmpqrbafB
14:10:39     INFO -  mozcrash Saved minidump as /builds/slave/test/build/blobber_upload_dir/05ee682d-ddb2-7551-4d223003-67af9695.dmp
14:10:39     INFO -  mozcrash Saved app info as /builds/slave/test/build/blobber_upload_dir/05ee682d-ddb2-7551-4d223003-67af9695.extra
14:10:39  WARNING -  PROCESS-CRASH | devtools/client/performance/test/browser_timeline-waterfall-generic.js | application crashed [@ mozilla::OffTheBooksMutex::Lock]
14:10:39     INFO -  Crash dump filename: /tmp/tmpnB_4Y9.mozrunner/minidumps/05ee682d-ddb2-7551-4d223003-67af9695.dmp
14:10:39     INFO -  Operating system: Linux
14:10:39     INFO -                    0.0.0 Linux 3.2.0-76-generic #111-Ubuntu SMP Tue Jan 13 22:16:09 UTC 2015 x86_64
14:10:39     INFO -  CPU: amd64
14:10:39     INFO -       family 6 model 62 stepping 4
14:10:39     INFO -       1 CPU
14:10:39     INFO -  GPU: UNKNOWN
14:10:39     INFO -  Crash reason:  SIGSEGV
14:10:39     INFO -  Crash address: 0x10
14:10:39     INFO -  Process uptime: not available
14:10:39     INFO -  Thread 22 (crashed)
14:10:39     INFO -   0  libxul.so!mozilla::OffTheBooksMutex::Lock [Mutex.h:7bcb0c169466 : 69 + 0x1]
14:10:39     INFO -      rax = 0x0000000000000000   rdx = 0x00007fc10895b1b0
14:10:39     INFO -      rcx = 0x0000000000000000   rbx = 0x0000000000000000
14:10:39     INFO -      rsi = 0x0000000000000000   rdi = 0x0000000000000010
14:10:39     INFO -      rbp = 0x00007fc0ed5fe7b0   rsp = 0x00007fc0ed5fe770
14:10:39     INFO -       r8 = 0x0000000000000008    r9 = 0x0000000000000008
14:10:39     INFO -      r10 = 0x0000000000000000   r11 = 0x00007fc10fe00618
14:10:39     INFO -      r12 = 0x0000000000000000   r13 = 0x0000000000000000
14:10:39     INFO -      r14 = 0x00007fc0ed5fe9a8   r15 = 0x0000000000000000
14:10:39     INFO -      rip = 0x00007fc104f9f46b
14:10:39     INFO -      Found by: given as instruction pointer in context
14:10:39     INFO -   1  libxul.so!mozilla::CompositorVsyncDispatcher::SetCompositorVsyncObserver [Mutex.h:7bcb0c169466 : 164 + 0x5]
14:10:39     INFO -      rbx = 0x0000000000000000   rbp = 0x00007fc0ed5fe7b0
14:10:39     INFO -      rsp = 0x00007fc0ed5fe780   r12 = 0x0000000000000000
14:10:39     INFO -      r13 = 0x0000000000000000   r14 = 0x00007fc0ed5fe9a8
14:10:39     INFO -      r15 = 0x0000000000000000   rip = 0x00007fc106404d38
14:10:39     INFO -      Found by: call frame info
14:10:39     INFO -   2  libxul.so!mozilla::widget::InProcessX11CompositorWidget::ObserveVsync [InProcessX11CompositorWidget.cpp:7bcb0c169466 : 29 + 0xc]
14:10:39     INFO -      rbx = 0x0000000000000000   rbp = 0x00007fc0ed5fe7e0
14:10:39     INFO -      rsp = 0x00007fc0ed5fe7c0   r12 = 0x00007fc108cae5f8
14:10:39     INFO -      r13 = 0x00007fc0ed5fe838   r14 = 0x00007fc0ed5fe9a8
14:10:39     INFO -      r15 = 0x0000000000000000   rip = 0x00007fc1064339ca
14:10:39     INFO -      Found by: call frame info
14:10:39     INFO -   3  libxul.so!mozilla::layers::CompositorVsyncScheduler::UnobserveVsync [CompositorBridgeParent.cpp:7bcb0c169466 : 633 + 0xd]
14:10:39     INFO -      rbx = 0x00007fc0d8c65c80   rbp = 0x00007fc0ed5fe800
14:10:39     INFO -      rsp = 0x00007fc0ed5fe7f0   r12 = 0x00007fc108cae5f8
14:10:39     INFO -      r13 = 0x00007fc0ed5fe838   r14 = 0x00007fc0ed5fe9a8
14:10:39     INFO -      r15 = 0x0000000000000000   rip = 0x00007fc10579475c
14:10:39     INFO -      Found by: call frame info
14:10:39     INFO -   4  libxul.so!mozilla::layers::CompositorVsyncScheduler::Destroy [CompositorBridgeParent.cpp:7bcb0c169466 : 426 + 0x5]
14:10:39     INFO -      rbx = 0x00007fc0d8c65c80   rbp = 0x00007fc0ed5fe820
14:10:39     INFO -      rsp = 0x00007fc0ed5fe810   r12 = 0x00007fc108cae5f8
14:10:39     INFO -      r13 = 0x00007fc0ed5fe838   r14 = 0x00007fc0ed5fe9a8
14:10:39     INFO -      r15 = 0x0000000000000000   rip = 0x00007fc10579dd2d
14:10:39     INFO -      Found by: call frame info
14:10:39     INFO -   5  libxul.so!mozilla::layers::CompositorBridgeParent::StopAndClearResources [CompositorBridgeParent.cpp:7bcb0c169466 : 836 + 0x5]
14:10:39     INFO -      rbx = 0x00007fc0d8b6f800   rbp = 0x00007fc0ed5fe860
14:10:39     INFO -      rsp = 0x00007fc0ed5fe830   r12 = 0x00007fc108cae5f8
14:10:39     INFO -      r13 = 0x00007fc0ed5fe838   r14 = 0x00007fc0ed5fe9a8
14:10:39     INFO -      r15 = 0x0000000000000000   rip = 0x00007fc1057a3bab
14:10:39     INFO -      Found by: call frame info
14:10:39     INFO -   6  libxul.so!mozilla::layers::CompositorBridgeParent::RecvWillClose [CompositorBridgeParent.cpp:7bcb0c169466 : 847 + 0x5]
14:10:39     INFO -      rbx = 0x00007fc0d8b6f800   rbp = 0x00007fc0ed5fe870
14:10:39     INFO -      rsp = 0x00007fc0ed5fe870   r12 = 0x00007fc0ed5fea08
14:10:39     INFO -      r13 = 0x00007fc0ed5fe900   r14 = 0x00007fc0ed5fe9a8
14:10:39     INFO -      r15 = 0x0000000000000000   rip = 0x00007fc1057a3be7
14:10:39     INFO -      Found by: call frame info
14:10:39     INFO -   7  libxul.so!mozilla::layers::PCompositorBridgeParent::OnMessageReceived [PCompositorBridgeParent.cpp:7bcb0c169466 : 1447 + 0xc]
14:10:39     INFO -      rbx = 0x00007fc0d8b6f800   rbp = 0x00007fc0ed5fe990
14:10:39     INFO -      rsp = 0x00007fc0ed5fe880   r12 = 0x00007fc0ed5fea08
14:10:39     INFO -      r13 = 0x00007fc0ed5fe900   r14 = 0x00007fc0ed5fe9a8
14:10:39     INFO -      r15 = 0x0000000000000000   rip = 0x00007fc105496ba5
14:10:39     INFO -      Found by: call frame info
14:10:39     INFO -   8  libxul.so!mozilla::ipc::MessageChannel::DispatchSyncMessage [MessageChannel.cpp:7bcb0c169466 : 1634 + 0x10]
14:10:39     INFO -      rbx = 0x00007fc0ed5fea08   rbp = 0x00007fc0ed5fe9e0
14:10:39     INFO -      rsp = 0x00007fc0ed5fe9a0   r12 = 0x00007fc0ed5feab0
14:10:39     INFO -      r13 = 0x00007fc0d8b6f830   r14 = 0x00007fc0ed5fe9a8
14:10:39     INFO -      r15 = 0x0000000000000000   rip = 0x00007fc1052b9f45
14:10:39     INFO -      Found by: call frame info
14:10:39     INFO -   9  libxul.so!mozilla::ipc::MessageChannel::DispatchMessage [MessageChannel.cpp:7bcb0c169466 : 1597 + 0xe]
14:10:39     INFO -      rbx = 0x00007fc0ed5feab0   rbp = 0x00007fc0ed5fea90
14:10:39     INFO -      rsp = 0x00007fc0ed5fe9f0   r12 = 0x00007fc0d8b6f830
14:10:39     INFO -      r13 = 0x00007fc0ed5fea08   r14 = 0x00007fc0ed5fea38
14:10:39     INFO -      r15 = 0x00007fc0ed5fea10   rip = 0x00007fc1052c2167
14:10:39     INFO -      Found by: call frame info
14:10:39     INFO -  10  libxul.so!mozilla::ipc::MessageChannel::OnMaybeDequeueOne [MessageChannel.cpp:7bcb0c169466 : 1568 + 0xb]
14:10:39     INFO -      rbx = 0x00007fc0ed5feab0   rbp = 0x00007fc0ed5feb40
14:10:39     INFO -      rsp = 0x00007fc0ed5feaa0   r12 = 0x00007fc0d8b6f830
14:10:39     INFO -      r13 = 0x00007fc0ed5fec01   r14 = 0x00007fc0ed5fec58
14:10:39     INFO -      r15 = 0x00007fc0ed6e6f88   rip = 0x00007fc1052c4274
14:10:39     INFO -      Found by: call frame info
14:10:39     INFO -  11  libxul.so!mozilla::detail::RunnableMethodImpl<bool (mozilla::ipc::MessageChannel::*)(), false, true>::Run [nsThreadUtils.h:7bcb0c169466 : 729 + 0x12]
14:10:39     INFO -      rbx = 0x00007fc0ed5fece8   rbp = 0x00007fc0ed5feb50
14:10:39     INFO -      rsp = 0x00007fc0ed5feb50   r12 = 0x00007fc0ed5febd8
14:10:39     INFO -      r13 = 0x00007fc0ed5fec01   r14 = 0x00007fc0ed5fec58
14:10:39     INFO -      r15 = 0x00007fc0ed6e6f88   rip = 0x00007fc1052b61dd
14:10:39     INFO -      Found by: call frame info
14:10:39     INFO -  12  libxul.so!mozilla::ipc::MessageChannel::DequeueTask::Run [MessageChannel.h:7bcb0c169466 : 540 + 0x6]
14:10:39     INFO -      rbx = 0x00007fc0ed5fece8   rbp = 0x00007fc0ed5feb60
14:10:39     INFO -      rsp = 0x00007fc0ed5feb60   r12 = 0x00007fc0ed5febd8
14:10:39     INFO -      r13 = 0x00007fc0ed5fec01   r14 = 0x00007fc0ed5fec58
14:10:39     INFO -      r15 = 0x00007fc0ed6e6f88   rip = 0x00007fc1052b5051
14:10:39     INFO -      Found by: call frame info
14:10:39     INFO -  13  libxul.so!MessageLoop::RunTask [message_loop.cc:7bcb0c169466 : 346 + 0x6]
14:10:39     INFO -      rbx = 0x00007fc0ed5fece8   rbp = 0x00007fc0ed5feb90
14:10:39     INFO -      rsp = 0x00007fc0ed5feb70   r12 = 0x00007fc0ed5febd8
14:10:39     INFO -      r13 = 0x00007fc0ed5fec01   r14 = 0x00007fc0ed5fec58
14:10:39     INFO -      r15 = 0x00007fc0ed6e6f88   rip = 0x00007fc1052a3081
14:10:39     INFO -      Found by: call frame info
14:10:39     INFO -  14  libxul.so!MessageLoop::DeferOrRunPendingTask [message_loop.cc:7bcb0c169466 : 354 + 0x5]
14:10:39     INFO -      rbx = 0x00007fc0ed5fec01   rbp = 0x00007fc0ed5febc0
14:10:39     INFO -      rsp = 0x00007fc0ed5feba0   r12 = 0x00007fc0ed5febd8
14:10:39     INFO -      r13 = 0x00007fc0ed5fec01   r14 = 0x00007fc0ed5fec58
14:10:39     INFO -      r15 = 0x00007fc0ed6e6f88   rip = 0x00007fc1052a712c
14:10:39     INFO -      Found by: call frame info
14:10:39     INFO -  15  libxul.so!MessageLoop::DoWork [message_loop.cc:7bcb0c169466 : 429 + 0x5]
14:10:39     INFO -      rbx = 0x00007fc0ed5fece8   rbp = 0x00007fc0ed5fec10
14:10:39     INFO -      rsp = 0x00007fc0ed5febd0   r12 = 0x00007fc0ed5febd8
14:10:39     INFO -      r13 = 0x00007fc0ed5fec01   r14 = 0x00007fc0ed5fec58
14:10:39     INFO -      r15 = 0x00007fc0ed6e6f88   rip = 0x00007fc1052a71c3
14:10:39     INFO -      Found by: call frame info
14:10:39     INFO -  16  libxul.so!base::MessagePumpDefault::Run [message_pump_default.cc:7bcb0c169466 : 36 + 0xa]
14:10:39     INFO -      rbx = 0x00007fc0ed6e6f70   rbp = 0x00007fc0ed5fec90
14:10:39     INFO -      rsp = 0x00007fc0ed5fec20   r12 = 0x00007fc0ed5fec48
14:10:39     INFO -      r13 = 0x00007fc0ed5fece8   r14 = 0x00007fc0ed5fec58
14:10:39     INFO -      r15 = 0x00007fc0ed6e6f88   rip = 0x00007fc1052a1173
14:10:39     INFO -      Found by: call frame info
14:10:39     INFO -  17  libxul.so!MessageLoop::Run [message_loop.cc:7bcb0c169466 : 225 + 0x8]
14:10:39     INFO -      rbx = 0x00007fc0ed5fece8   rbp = 0x00007fc0ed5fecd0
14:10:39     INFO -      rsp = 0x00007fc0ed5feca0   r12 = 0x00007fc0ed5fece8
14:10:39     INFO -      r13 = 0x00007fc0ed5ff9c0   r14 = 0x0000000000000000
14:10:39     INFO -      r15 = 0x0000000000000003   rip = 0x00007fc1052a0d1c
14:10:39     INFO -      Found by: call frame info
14:10:39     INFO -  18  libxul.so!base::Thread::ThreadMain [thread.cc:7bcb0c169466 : 180 + 0x8]
14:10:39     INFO -      rbx = 0x00007fc0ed6e6f10   rbp = 0x00007fc0ed5fee90
14:10:39     INFO -      rsp = 0x00007fc0ed5fece0   r12 = 0x00007fc0ed5fece8
14:10:39     INFO -      r13 = 0x00007fc0ed5ff9c0   r14 = 0x0000000000000000
14:10:39     INFO -      r15 = 0x0000000000000003   rip = 0x00007fc1052a8e97
14:10:39     INFO -      Found by: call frame info
14:10:39     INFO -  19  libxul.so!ThreadFunc [platform_thread_posix.cc:7bcb0c169466 : 38 + 0x3]
14:10:39     INFO -      rbx = 0x0000000000000000   rbp = 0x00007fc0ed5feea0
14:10:39     INFO -      rsp = 0x00007fc0ed5feea0   r12 = 0x00007fff84069f88
14:10:39     INFO -      r13 = 0x00007fc0ed5ff9c0   r14 = 0x0000000000000000
14:10:39     INFO -      r15 = 0x0000000000000003   rip = 0x00007fc1052a72c1
14:10:39     INFO -      Found by: call frame info
14:10:39     INFO -  20  libpthread-2.15.so + 0x7e9a
14:10:39     INFO -      rbx = 0x0000000000000000   rbp = 0x0000000000000000
14:10:39     INFO -      rsp = 0x00007fc0ed5feeb0   r12 = 0x00007fff84069f88
14:10:39     INFO -      r13 = 0x00007fc0ed5ff9c0   r14 = 0x0000000000000000
14:10:39     INFO -      r15 = 0x0000000000000003   rip = 0x00007fc110fa4e9a
14:10:39     INFO -      Found by: call frame info
14:10:39     INFO -  21  libc-2.15.so + 0xf338d
14:10:39     INFO -      rsp = 0x00007fc0ed5fefc0   rip = 0x00007fc1100b438d
14:10:39     INFO -      Found by: stack scanning
Flags: needinfo?(dvander)
Pushed by danderson@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/9ce619a6dcae
Fix vsync sometimes not firing after the GPU process crashes. (bug 1305628, r=mattwoodrow)
Backout by cbook@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/9d18bbd1e614
Backed out changeset 9ce619a6dcae for crashes
By the way, this seems related to the backout in comment #5. It may be worth doing a try run before the next attempt to check in.
Pushed by danderson@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/7eaccbb8e8d3
Fix vsync sometimes not firing after the GPU process crashes. (bug 1305628, r=mattwoodrow)
Flags: needinfo?(dvander)
Status: ASSIGNED → RESOLVED
Closed: 3 years ago
Keywords: leave-open
Resolution: --- → FIXED