Closed Bug 1387327 Opened 7 years ago Closed 7 years ago

Investigate large discrepancy in d2d vs skia in OMTP

Categories

(Core :: Graphics: Layers, enhancement, P3)

40 Branch
enhancement

Tracking

()

RESOLVED WORKSFORME

People

(Reporter: dvander, Assigned: mchang)

References

Details

(Whiteboard: [gfx-noted])

Quite consistently on my machine, D2D performs *much* better on the main thread vs Skia on the main thread, but much worse than Skia on the paint thread. In theory it doesn't matter a whole lot as the paints are asynchronous, but we should try to figure out why. It could very well be the Flush problem (bug 1386966).
Note that we also saw similar results when Dominic tested on the Sony Vaio I lent him.
Assignee: nobody → mchang
Profile from comment 2 is with OMTP enabled and AL disabled, on arstechnica.com. Whether AL is enabled or disabled doesn't matter.
Updated profile with OMTP and AL - https://perfht.ml/2x21grd
Another profile - https://perfht.ml/2x2cQm7
Sorry, the correct profile with the paint thread - https://perfht.ml/2wbNZQA
Can you take a look, Bas? Apply the patches in bug 1390755, set the preference "layers.omtp.enabled" to true, and restart Nightly. I'm still confused. arstechnica.com on a hidpi display on any GPU seems to die.

A couple of things that didn't matter:
A) Allowing internal threading optimizations [1].
B) Not waiting for GPU queries [2]. Didn't help.
C) Forcing sync OMTP [3], which should basically cause the same flow as pre-OMTP in that the main thread blocks while the paint thread does sync runnables. Still happens.

[1] http://searchfox.org/mozilla-central/source/gfx/thebes/DeviceManagerDx.cpp#291
[2] http://searchfox.org/mozilla-central/source/gfx/layers/d3d11/MLGDeviceD3D11.cpp#505
[3] http://searchfox.org/mozilla-central/source/gfx/thebes/gfxPrefs.h#604
Flags: needinfo?(bas)
Tried looking through the compositor to see where we lock and don't see anything. Also looked at increasing the maximum frame latency [1], but we don't get an IDXGISwapChain2 to do that. It mostly looks like, if Present is locking up, we're sending too many frames to the GPU and Present blocks until the GPU can clear everything out. I've had 500ms Present calls though. Partial present doesn't seem to matter either, versus presenting the whole screen.

[1] https://msdn.microsoft.com/en-us/library/windows/desktop/dn268313(v=vs.85).aspx
I narrowed it down, at least for arstechnica.com, to the blend effect at FinalizeDrawing [1]. Essentially, arstechnica.com uses the multiply background-blend-mode for its article images. This causes something terrible to happen perf-wise at [2]. I wonder if D2D is just terribly slow at non-primitive composite types? If I just draw the image directly with operator source over, arstechnica.com is smooth as butter and drawsurface times drop from sometimes 2 seconds to a consistent < 0.001ms (which probably just means the flush moves somewhere else).

[1] http://searchfox.org/mozilla-central/source/gfx/2d/DrawTargetD2D1.cpp#1412
[2] http://searchfox.org/mozilla-central/source/gfx/2d/DrawTargetD2D1.cpp#1546
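For reference, here is a minimal sketch of what the two operators compute per pixel, following the blend and source-over formulas in the W3C Compositing and Blending spec. This is not the code in DrawTargetD2D1.cpp and says nothing about how D2D implements its blend effect; `Color`, `BlendMultiply`, `CompositeChannel` and `Composite` are hypothetical names made up for the illustration. Colors are non-premultiplied floats in [0, 1].

```cpp
#include <cassert>

// Hypothetical non-premultiplied RGBA colour, each channel in [0, 1].
struct Color { float r, g, b, a; };

// Multiply blend function from the spec: B(cb, cs) = cb * cs.
inline float BlendMultiply(float cb, float cs) { return cb * cs; }

// Blends (optionally) and composites one colour channel with source-over.
// Returns the premultiplied result for that channel.
inline float CompositeChannel(float cb, float alphaB, float cs, float alphaS,
                              bool multiply) {
  assert(alphaB >= 0.0f && alphaB <= 1.0f);
  assert(alphaS >= 0.0f && alphaS <= 1.0f);
  // Blending step: the source colour is replaced by a mix of itself and
  // B(cb, cs), weighted by the backdrop alpha. Plain source-over skips this.
  const float mixed =
      multiply ? (1.0f - alphaB) * cs + alphaB * BlendMultiply(cb, cs) : cs;
  // Compositing step (source-over): co = as * Cs + (1 - as) * ab * Cb.
  return alphaS * mixed + (1.0f - alphaS) * alphaB * cb;
}

// Composites `source` over `backdrop`, with or without the multiply blend.
Color Composite(const Color& backdrop, const Color& source, bool multiply) {
  const float alphaO = source.a + backdrop.a * (1.0f - source.a);
  if (alphaO == 0.0f) {
    return Color{0.0f, 0.0f, 0.0f, 0.0f};  // fully transparent result
  }
  // Divide by the output alpha to return a non-premultiplied colour.
  return Color{
      CompositeChannel(backdrop.r, backdrop.a, source.r, source.a, multiply) / alphaO,
      CompositeChannel(backdrop.g, backdrop.a, source.g, source.a, multiply) / alphaO,
      CompositeChannel(backdrop.b, backdrop.a, source.b, source.a, multiply) / alphaO,
      alphaO};
}
```

Note that with an opaque source, plain source-over never looks at the backdrop colour, while multiply has to read it for every pixel.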
I should also note that what's happening here is that we're painting with d2d on both the main thread and the paint thread. The ClientPaintedLayer has display items that are in an inactive layer and painted with basic layers, which means we paint and blend on the main thread. The paint thread can also be accessing d2d resources at the same time, but that shouldn't matter. D3D resources such as the factory are guarded by a mutex, and the main thread and paint thread should each have their own instances of DrawTargetD2D1, which create their own DeviceContexts.
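A rough sketch of that ownership model, assuming nothing about the real classes: one shared factory behind a mutex, and one draw target per thread that owns its own context. `SharedFactory`, `DeviceContext`, `DrawTarget`, `PaintOnThread` and `PaintConcurrently` are all hypothetical stand-ins, not the Gecko or Direct2D types.

```cpp
#include <cassert>
#include <memory>
#include <mutex>
#include <thread>
#include <utility>

// Hypothetical stand-in for a per-draw-target device context.
struct DeviceContext {
  explicit DeviceContext(int aId) : id(aId) {}
  int id;
  int drawCalls = 0;
  void Draw() { ++drawCalls; }
};

// Hypothetical stand-in for the shared factory; creation is mutex-guarded.
class SharedFactory {
 public:
  std::unique_ptr<DeviceContext> CreateDeviceContext() {
    std::lock_guard<std::mutex> lock(mMutex);
    return std::make_unique<DeviceContext>(mNextId++);
  }

 private:
  std::mutex mMutex;
  int mNextId = 0;
};

// Each draw target owns its own context, so drawing itself needs no lock.
class DrawTarget {
 public:
  explicit DrawTarget(SharedFactory& aFactory)
      : mContext(aFactory.CreateDeviceContext()) {
    assert(mContext);
  }
  void FillRect() { mContext->Draw(); }
  int ContextId() const { return mContext->id; }
  int DrawCalls() const { return mContext->drawCalls; }

 private:
  std::unique_ptr<DeviceContext> mContext;
};

struct PaintResult {
  int contextId;
  int drawCalls;
};

// Creates a draw target on the calling thread and issues `aDraws` draws.
PaintResult PaintOnThread(SharedFactory& aFactory, int aDraws) {
  DrawTarget target(aFactory);
  for (int i = 0; i < aDraws; ++i) {
    target.FillRect();
  }
  return PaintResult{target.ContextId(), target.DrawCalls()};
}

// Paints on the calling ("main") thread and a second ("paint") thread at once.
std::pair<PaintResult, PaintResult> PaintConcurrently(int aDraws) {
  SharedFactory factory;
  PaintResult paintResult{-1, 0};
  std::thread paintThread(
      [&] { paintResult = PaintOnThread(factory, aDraws); });
  PaintResult mainResult = PaintOnThread(factory, aDraws);
  paintThread.join();
  return std::make_pair(mainResult, paintResult);
}
```

In this model the only contended state is the factory, and the lock is held just long enough to create a context. If the two threads still stall each other, the contention is presumably somewhere below this level.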
Depends on: 1392453
Flags: needinfo?(bas)
Priority: -- → P3
Whiteboard: [gfx-noted]
I can't reproduce this anymore, and we have bugs on file for the concrete performance issues we've found.
Status: NEW → RESOLVED
Closed: 7 years ago
Resolution: --- → WORKSFORME