20+ms CompositorD3D WaitForGPUQuery with Facebook

RESOLVED WONTFIX

Status

()

Core
General
RESOLVED WONTFIX
11 months ago
10 months ago

People

(Reporter: mchang, Assigned: bas)

Tracking

(Blocks: 3 bugs)

55 Branch
x86_64
Windows 10
Points:
---
Dependency tree / graph

Firefox Tracking Flags

(Not tracked)

Details

(Whiteboard: [QRC][QRC_NeedAnalysis], URL)

Attachments

(1 obsolete attachment)

(Reporter)

Description

11 months ago
+++ This bug was initially created as a clone of Bug #1363424 +++
Steps to reproduce:
1. Launch browser.
2. Fill Facebook.com in Navigation Start, then press enter.
3. Login with credentials.
4. Record with Gecko Profile:
   Tap on a facebook group, and wait to have the first item displayed. 
5. Share the profile.

Gecko profile:
https://perfht.ml/2q0q2Yr

Notes:
https://docs.google.com/spreadsheets/d/1Kxn850VasyaG_WfRg3pMInW0hbIT8LP7pRPBYTIpdbM/edit?pli=1#gid=679575660


Looks like we spend a lot of time in the Compositor thread at [1]. I think waiting for the commands to finish also cause the content process to block while waiting to get a lock on the TextureClient since the Compositor has it right now. Is there a particular reason we need to wait for the GPU to flush all commands here?

[1] http://searchfox.org/mozilla-central/source/gfx/layers/d3d11/CompositorD3D11.cpp#1131
(Reporter)

Updated

11 months ago
Flags: needinfo?(dvander)
(Reporter)

Updated

11 months ago
Depends on: 1373816
See bug 1260611 for why we do this - we can't let the compositor get more than 1 frame ahead. Buy maybe there is a smarter way to wait.

If you drop the call to GetDeviceRemovedReason here [1], do we do any better? I think it's unnecessary. We can check that the GetData return code is DXGI_ERROR_DEVICE_REMOVED instead.

[1] http://searchfox.org/mozilla-central/rev/b2d072aa5847996b8276bd8d7b150e0ea6bf6283/gfx/layers/d3d11/HelpersD3D11.h#20
Flags: needinfo?(dvander)

Comment 2

11 months ago
Bas, please take a look at the profiler from Mason. Thanks!
Assignee: nobody → bas
Comment hidden (mozreview-request)
Haven't measured if this helps, so we shouldn't close this bug if this patch lands.  On the other hand, it shouldn't make it worse, and try is decent: https://treeherder.mozilla.org/#/jobs?repo=try&revision=756ba63c31b4dbf7aa2be44665f5e47257298610
Comment on attachment 8883705 [details]
Bug 1373815: Remove checking for device reset (twice).

https://reviewboard.mozilla.org/r/154614/#review159702

::: gfx/layers/d3d11/HelpersD3D11.h:20
(Diff revision 1)
>  template <typename T> static inline bool
>  WaitForGPUQuery(ID3D11Device* aDevice, ID3D11DeviceContext* aContext, ID3D11Query* aQuery, T* aOut)
>  {
>    TimeStamp start = TimeStamp::Now();
>    while (aContext->GetData(aQuery, aOut, sizeof(*aOut), 0) != S_OK) {
> -    if (aDevice->GetDeviceRemovedReason() != S_OK) {
> +    MOZ_ASSERT(aDevice->GetDeviceRemovedReason() == S_OK);

I think the assert could fire if there's a device reset. I'd just drop the assert entirely, or if we really want, we can save the return value of GetData and return false if it's DXGI_ERROR_DEVICE_REMOVED.
(In reply to David Anderson [:dvander] from comment #5)
> Comment on attachment 8883705 [details]
> Bug 1373815: Remove checking for device reset (twice).
> 
> https://reviewboard.mozilla.org/r/154614/#review159702
> 
> ::: gfx/layers/d3d11/HelpersD3D11.h:20
> (Diff revision 1)
> >  template <typename T> static inline bool
> >  WaitForGPUQuery(ID3D11Device* aDevice, ID3D11DeviceContext* aContext, ID3D11Query* aQuery, T* aOut)
> >  {
> >    TimeStamp start = TimeStamp::Now();
> >    while (aContext->GetData(aQuery, aOut, sizeof(*aOut), 0) != S_OK) {
> > -    if (aDevice->GetDeviceRemovedReason() != S_OK) {
> > +    MOZ_ASSERT(aDevice->GetDeviceRemovedReason() == S_OK);
> 
> I think the assert could fire if there's a device reset. I'd just drop the
> assert entirely, or if we really want, we can save the return value of
> GetData and return false if it's DXGI_ERROR_DEVICE_REMOVED.

Aren't we going to break out of the while loop if GetData returns anything but S_OK?
(In reply to Milan Sreckovic [:milan] from comment #6)
> (In reply to David Anderson [:dvander] from comment #5)
> > Comment on attachment 8883705 [details]
> > Bug 1373815: Remove checking for device reset (twice).
> > 
> > https://reviewboard.mozilla.org/r/154614/#review159702
> > 
> > ::: gfx/layers/d3d11/HelpersD3D11.h:20
> > (Diff revision 1)
> > >  template <typename T> static inline bool
> > >  WaitForGPUQuery(ID3D11Device* aDevice, ID3D11DeviceContext* aContext, ID3D11Query* aQuery, T* aOut)
> > >  {
> > >    TimeStamp start = TimeStamp::Now();
> > >    while (aContext->GetData(aQuery, aOut, sizeof(*aOut), 0) != S_OK) {
> > > -    if (aDevice->GetDeviceRemovedReason() != S_OK) {
> > > +    MOZ_ASSERT(aDevice->GetDeviceRemovedReason() == S_OK);
> > 
> > I think the assert could fire if there's a device reset. I'd just drop the
> > assert entirely, or if we really want, we can save the return value of
> > GetData and return false if it's DXGI_ERROR_DEVICE_REMOVED.
> 
> Aren't we going to break out of the while loop if GetData returns anything
> but S_OK?

We break out if it *does* return S_OK, we loop around again in case it hasn't resolved yet. But no matter what it returns I think a device reset state could happen at any time.
Comment on attachment 8883705 [details]
Bug 1373815: Remove checking for device reset (twice).

Misread the existing code.
Attachment #8883705 - Attachment is obsolete: true
Attachment #8883705 - Flags: review?(dvander)
(Reporter)

Updated

11 months ago
Blocks: 1363424
No longer depends on: 1363424

Comment 9

10 months ago
Based on IRC chat with Bas.. He indicated that "We've actually enabled advanced layers by default now, this code shouldn't even be run in the near future."
Status: NEW → RESOLVED
Last Resolved: 10 months ago
Resolution: --- → WONTFIX
Well, OK, but it'd be nice to actually measure and see that before we just close the bug :)  Also, advanced layers do not apply to 100% of the user base.  I don't think we actually have the numbers, though we do have estimates, as it is a run time test for the support.
Depends on: 1379731, 1375743
Also AL goes through this exact same path so probably the same delay is there. Though it would be good to verify.

Re: numbers, it's still too early to get something accurate since AL has been on and off for Win8 and was *just* turned on for Win7. But I'll get that on the GFX dashboard shortly.
You need to log in before you can comment on or make changes to this bug.