Closed Bug 1789849 Opened 2 years ago Closed 2 years ago

46.3 - 15.6% pdfpaint / pdfpaint + 3 more (Linux, OSX) regression on Tue September 6 2022

Categories

(Core :: Graphics: Canvas2D, defect, P3)

defect

Tracking

()

RESOLVED FIXED
106 Branch
Tracking Status
firefox-esr91 --- unaffected
firefox-esr102 --- unaffected
firefox104 --- unaffected
firefox105 --- unaffected
firefox106 --- fixed

People

(Reporter: aglavic, Assigned: lsalzman)

References

(Regression)

Details

(4 keywords)

Attachments

(3 files)

Perfherder has detected a talos performance regression from push 490f5f14eda710bdb43b7cd839e1967da6d434d9. As author of one of the patches included in that push, we need your help to address this regression.

Regressions:

Ratio Test Platform Options Absolute values (old vs new)
46% pdfpaint macosx1015-64-shippable-qr e10s fission stylo webrender 669.20 -> 979.06
46% pdfpaint macosx1015-64-shippable-qr e10s fission stylo webrender-sw 668.46 -> 976.87
40% pdfpaint macosx1015-64-shippable-qr e10s fission stylo webrender 670.43 -> 940.07
17% pdfpaint linux1804-64-shippable-qr e10s fission stylo webrender 764.39 -> 891.59
16% pdfpaint linux1804-64-shippable-qr e10s fission stylo webrender-sw 742.89 -> 858.79

Details of the alert can be found in the alert summary, including links to graphs and comparisons for each of the affected tests. Please follow our guide to handling regression bugs and let us know your plans within 3 business days, or the offending patch(es) may be backed out in accordance with our regression policy.

If you need the profiling jobs you can trigger them yourself from treeherder job view or ask a sheriff to do that for you.

For more information on performance sheriffing please see our FAQ.

Flags: needinfo?(lsalzman)
Severity: -- → S3
Priority: -- → P3
Flags: needinfo?(lsalzman)

:lsalzman, since you are the author of the regressor, bug 1773280, could you take a look?

For more information, please visit auto_nag documentation.

Flags: needinfo?(lsalzman)
Flags: needinfo?(lsalzman)

Sometimes the clip state is thrashed when we need to temporarily override
clipping to disable it. However, in this case, the clip mask itself remains
unchanged. The current invalidation scheme doesn't discern between generation
of the clip mask itself and setting the clip state for the shader, leading to
unnecessary regeneration of the clip mask.

This code just tries to discern when this is happening so we can refresh the
clip state without having to regenerate the clip mask unless truly necessary.

Assignee: nobody → lsalzman
Status: NEW → ASSIGNED

Certain events like waiting on a round-trip to verify that the HostWebGLContext is
done using a shmem, or pushing a Skia layer which will need to be flatten later, can
be expensive, especially if they are used many times throughout a frame. However,
we weren't currently incremening the profile counters for these situations which can
lead to accelerated rendering persisting even when it would be more judicious to
fallback to software rendering.

Depends on D157048

Previously we were reusing the framebuffer's Skia DT to render the clip mask.
This was the path of least resistance since SkCanvas does not allow exporting
clip information, and there is no way to reset the bitmap storage inside an
SkCanvas temporarily.

However, this can cause a feedback cycle of unnecessary WaitForShmem operations,
since we need to wait before we can generate the clip mask into the Skia target,
and then anything else after it needs to wait for the clip mask to finish uploading
before the Skia DT can be used again.

To alleviate this, we just allocate a new DrawTargetSkia to render the clip mask
into. We carefully clip the size of the DT so that in the common case we avoid
having to upload a surface the size of the entire framebuffer. Further, since
this is a completely different DT, we can now use an A8 format (1/4 the memory
overhead) instead of a BGRA8 format for the clip mask, which gives a further
memory usage gain.

A further complication is that we need to log the current clip stack state so
that we can replay it onto the new DrawTargetSkia. This avoids having to add
a mechanism to SkCanvas to export clip information.

Depends on D157049

Pushed by lsalzman@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/9cef48a4e5b6 Avoid regenerating clip mask unless necessary. r=aosmond https://hg.mozilla.org/integration/autoland/rev/0b582207c30d Increment profile counters for a few more expensive events. r=aosmond https://hg.mozilla.org/integration/autoland/rev/b3e786ad051c Use a separate DrawTargetSkia for rendering clip masks. r=aosmond

== Change summary for alert #35384 (as of Wed, 14 Sep 2022 01:01:28 GMT) ==

Improvements:

Ratio Test Platform Options Absolute values (old vs new)
21% pdfpaint linux1804-64-shippable-qr e10s fission stylo webrender 979.43 -> 770.29
17% pdfpaint linux1804-64-shippable-qr e10s fission stylo webrender-sw 904.35 -> 750.90
11% pdfpaint macosx1015-64-shippable-qr e10s fission stylo webrender-sw 791.41 -> 706.21
11% pdfpaint macosx1015-64-shippable-qr e10s fission stylo webrender 793.21 -> 707.97

For up to date results, see: https://treeherder.mozilla.org/perfherder/alerts?id=35384

Regressions: 1799495
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: