Closed Bug 1548187 Opened 5 years ago Closed 3 years ago

21.05 - 38.25% glterrain (windows10-64-shippable, windows7-32-shippable) regression on push e8d2d9aff5026ef1f1777b781b47fdcbdb9d8f20 (Mon Apr 29 2019)

Categories

(Core :: Graphics: Layers, defect, P3)

defect

Tracking

()

RESOLVED WONTFIX
Tracking Status
firefox-esr60 --- unaffected
firefox-esr68 --- wontfix
firefox69 --- wontfix
firefox70 --- wontfix
firefox71 --- wontfix

People

(Reporter: ccoroiu, Unassigned)

References

(Regression)

Details

(4 keywords)

Talos has detected a Firefox performance regression from push:

https://hg.mozilla.org/integration/mozilla-inbound/pushloghtml?changeset=e8d2d9aff5026ef1f1777b781b47fdcbdb9d8f20

As author of one of the patches included in that push, we need your help to address this regression.

Regressions:

38% glterrain windows10-64-shippable opt e10s stylo 0.73 -> 1.01
21% glterrain windows7-32-shippable opt e10s stylo 0.74 -> 0.89

Improvements:

11% displaylist_mutate linux64-shippable opt e10s stylo 1,673.98 -> 1,488.03

You can find links to graphs and comparison views for each of the above tests at: https://treeherder.mozilla.org/perf.html#/alerts?id=20725

On the page above you can see an alert for each affected platform as well as a link to a graph showing the history of scores for this test. There is also a link to a treeherder page showing the Talos jobs in a pushlog format.

To learn more about the regressing test(s), please see: https://wiki.mozilla.org/Performance_sheriffing/Talos/Tests

For information on reproducing and debugging the regression, either on try or locally, see: https://wiki.mozilla.org/Performance_sheriffing/Talos/Running

*** Please let us know your plans within 3 business days, or the offending patch(es) will be backed out! ***

Our wiki page outlines the common responses and expectations: https://wiki.mozilla.org/Performance_sheriffing/Talos/RegressionBugsHandling

Component: General → Graphics: Layers
Flags: needinfo?(bas)
Product: Testing → Core
Flags: needinfo?(igoldan)

So Linux is not affected by this pref change, so that improvement is most certainly nonsense.

As for the glterrain regression, we had measured this ourselves as well, we don't understand completely what glterrain is measuring here but the number is so small in absolutes, and this patch merely reduces the work we do, we believe it is some sort of ordering or scheduling issue.

There is significant pageload improvements from this patch so we would be very reluctant to switch back, in my discussions with graphics team members there seemed to be agreement with this.

Jeff, do you have any idea how doing -less- work could be regressing glterrain? I'm a little unsure how this number is being generated. As a point of information here, the only -additional- work this patch do, is copy regions which are -valid- in the front buffer from the previous frame but invalid in the backbuffer from the front to the backbuffer before validating the backbuffer.

Flags: needinfo?(bas) → needinfo?(jgilbert)

Sotaro, Jeff Muizelaar suggested you may have an idea on this as well, I have actually come up with a theory of why this is based on how the test works, and why this regression also occurs with WebRender), I'd like to know what your idea is.

Flags: needinfo?(sotaro.ikeda.g)

I'm curious what the theories are, but I'd be very comfortable accepting such a small absolute-unit regression. This degree of change is generally from adding or removing a copy, which is not a deal-breaker.

It's possible webgl isn't properly tied into the damage-rect code. (over-invalidation?)
We might have been getting lucky with single-buffer mode on this test, also. (Never blocking on reuse, in part because the workload is so small) Switching to double-buffered would thus potentially incur extra memory bandwidth.

Also, switching to double-buffered would indeed incur an extra copy if we're preserving the undamaged remainder of the page around the damanged webgl rect. Particularly if this surrounding content is cheap to recreate (color layers, clears), this could amount to more work being done by copying undirtied regions front-to-back-buffer.

Flags: needinfo?(jgilbert)
Priority: -- → P3
Blocks: 1534654
Flags: needinfo?(igoldan)

(In reply to Bas Schouten (:bas.schouten) from comment #2)

Sotaro, Jeff Muizelaar suggested you may have an idea on this as well, I have actually come up with a theory of why this is based on how the test works, and why this regression also occurs with WebRender), I'd like to know what your idea is.

Sorry for late response. Japan was in long public holidays. I am going to take a look.

It might be caused by a difference of BitBlt model and Flip model.
https://docs.microsoft.com/en-us/windows/desktop/direct3ddxgi/dxgi-flip-model

On Flip model, 2 buffers are used on D3D compositor. In this model, if 2 presents are called during 1 vsync period or 2 buffers are still in use because of heavy tasks, WaitForPreviousPresentQuery() or Present() is going to block until one buffer becomes available. To avoid it, WebRender uses 3 buffers by Bug 1500017.

On BitBlt model, if the above problem happens, it just drops Present() frame.

Flags: needinfo?(sotaro.ikeda.g)

During looking into this bug, I found that Bug 1547775 caused another problem like Bug 1435995. If DXGI_SWAP_EFFECT_FLIP_SEQUENTIAL is used for HWND, when firefox fallback to BasicCompositor, Firefox does not update painting anymore until restart.

It looks like the glterrain results were restored after bug 1547805 landed, is this likely to be related, or is there still potential for improvements here?

Flags: needinfo?(sotaro.ikeda.g)

bug 1547805 seems not related to this bug. Bug 1544538 disabled "advanced layers" usage and bug 1547805 just re-enabled "advanced layers" usage. performance of "advanced layers" is better than CompositorD3D11 in general.

Flags: needinfo?(sotaro.ikeda.g)

[:sotaro] any other progress on this?

Flags: needinfo?(sotaro.ikeda.g)
See Also: → 1594436

As in comment 5, there is a SwapChain model difference.

When double buffering is disabled, BitBlt mode SwapChain is used. When double buffering is enabled, flip model SwapChain is used.

When BitBlt mode SwapChain is used, there could be a case that gecko rendering is skipped and the rendering is not displayed to display. But when flip model SwapChain is used, the rendering skip does not happen. Then, when BitBlt mode SwapChain is used, score of glterrain could be better by the rendering skip. But it does not mean that actual fps on display is better by disabling double buffering.

By Bug 1547775 comment 0, actual performance became better by enabling double buffering. And by Comment 11, actual glterrain regression is not big. Then it seems that we could accept the glterrain regression.

Flags: needinfo?(sotaro.ikeda.g)
Status: NEW → RESOLVED
Closed: 3 years ago
Resolution: --- → WONTFIX
Has Regression Range: --- → yes
You need to log in before you can comment on or make changes to this bug.