YouTube: Investigate rasterization times for bug 1363335

RESOLVED FIXED in Firefox 56

Status

()

Core
Graphics
RESOLVED FIXED
a year ago
a year ago

People

(Reporter: Ehsan, Assigned: lsalzman)

Tracking

unspecified
mozilla56
Points:
---
Dependency tree / graph

Firefox Tracking Flags

(firefox56 fixed)

Details

(Whiteboard: [gfx-noted][QRC_Analyzed][qf:p3][QRC])

Attachments

(1 attachment)

(Reporter)

Description

a year ago
There are some long running rasterizations in the profiles for bug 1363335.  For example, see https://perfht.ml/2rgF0Lj.  There is a 70ms and a 50ms rasterization, among others.

Bas, do you mind having a look at these times and seeing what we can do about them?  This is part of the release criteria so we should do all at our disposal to bring down these times.  Thanks!
Flags: needinfo?(bas)
(Reporter)

Updated

a year ago
Blocks: 1363335
Whiteboard: [gfx-noted]
The 70ms one seems to be 'uploading a bunch of things' (but that still in total only 13ms of the rasterize step). Other than that, nothing stands out in these rasterizes from a quick glance. It seems to just be 'drawing lots of stuff'. Was this on a high-DPI screen? How much content was onscreen? Perhaps there is something we can do about before scrolling executes first getting what's visible in the viewport drawn and then drawing the rest? That would be a non-insignificant amount of work though.
Flags: needinfo?(bas)
Whiteboard: [gfx-noted] → [gfx-noted][QRC_Analyzed][qf]
Bas, you can now look at the profile from the reference hardware, from bug 1363335 comment 4, which answers some of the questions about the hardware.  The workflow is captured in bug 1363335 comment 1 so it should be easy to answer the other questions.
Assignee: nobody → bas
Flags: needinfo?(bas)
Summary: Investigate rasterization times for bug 1363335 → YouTube: Investigate rasterization times for bug 1363335
Whiteboard: [gfx-noted][QRC_Analyzed][qf] → [gfx-noted][QRC_Analyzed][qf][QRC]
(In reply to Milan Sreckovic [:milan] from comment #2)
> Bas, you can now look at the profile from the reference hardware, from bug
> 1363335 comment 4, which answers some of the questions about the hardware. 
> The workflow is captured in bug 1363335 comment 1 so it should be easy to
> answer the other questions.

For some reason that profile is missing symbols for D2D, but that doesn't matter much since it shows about 70% of the time spent in glyph rasterization in one of the very long rasterization calls. That isn't our code so it's not obvious what we can do. It's possible we're overdrawing glyphs, that's in layout land but reducing the amount of glyphs we draw will obviously help. Alternatively it's possible coalescing glyph runs better will offer improvement, however the slow code appears to be 'per glyph' code inside DWrite, not 'per glyph run' code so it's not obvious that would help.
Flags: needinfo?(bas)
Can we profile with D2D disabled, using Skia instead?  How is Chrome doing on this example?
Flags: needinfo?(bas)
p3 unless it shows up as part of a profile that is impacting user experience.
Whiteboard: [gfx-noted][QRC_Analyzed][qf][QRC] → [gfx-noted][QRC_Analyzed][qf:p3][QRC]
(Assignee)

Comment 6

a year ago
Created attachment 8878646 [details] [diff] [review]
optimize box blur surfaces for destination draw target

As part of bug 1357692, I introduced a change that attempted to not use HW accel for box shadows under a certain size to limit the overhead of D2D resource creation. This introduced the other problem that now every time a box shadow is drawn in software, it must be uploaded to D2D if that is the destination draw target type.

For mirrored box shadows that do several draws using the surface, this can mean several individual uploads, at least one for each quadrant drawn. Even if the box shadow is cached, since we're caching the software surface, that means it is still uploading this surface again every time we go to use the result from cache. This can be avoided by just optimizing the blurred surface for the destination draw target before we put it in the cache. That way, there is no upload cost the next time we go to use a cached result. And even ignoring the cache, this means the surface only gets uploaded once, even if we need to draw multiple quadrants from it via multiple FillRects when using the mirroring/minimization optimizations.

In this particular profile, it looks like we have a youtube video with text overlays using box shadows, as is common on youtube videos. This is unfortunately making up the majority of the cost of these 70ms and 50ms paints. This patch should help this greatly.
Assignee: bas → lsalzman
Status: NEW → ASSIGNED
Flags: needinfo?(bas)
Attachment #8878646 - Flags: review?(mchang)
Comment on attachment 8878646 [details] [diff] [review]
optimize box blur surfaces for destination draw target

Review of attachment 8878646 [details] [diff] [review]:
-----------------------------------------------------------------

Yay.
Attachment #8878646 - Flags: review?(mchang) → review+
(Assignee)

Updated

a year ago
Blocks: 1357692

Comment 8

a year ago
Pushed by lsalzman@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/46868432ff69
optimize box blur surfaces for destination draw target. r=mchang
https://hg.mozilla.org/mozilla-central/rev/46868432ff69
Status: ASSIGNED → RESOLVED
Last Resolved: a year ago
status-firefox56: --- → fixed
Resolution: --- → FIXED
Target Milestone: --- → mozilla56

Updated

a year ago
Duplicate of this bug: 1371643
You need to log in before you can comment on or make changes to this bug.