Closed Bug 1360222 Opened 7 years ago Closed 1 year ago

CanvasRenderingContext2D.fillText(text, x, y) is 4x slower in Firefox than in Chromium

Categories

(Core :: Graphics: Canvas2D, defect, P4)

53 Branch
defect

Tracking

()

RESOLVED WORKSFORME

People

(Reporter: mathieu, Assigned: nical)

References

Details

(Whiteboard: [gfx-noted])

Attachments

(2 files)

User Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/59.0.3053.3 Safari/537.36 OPR/46.0.2567.0 (Edition developer)

Steps to reproduce:

The following page renders 2048 individual characters using ctx.fillText(string[index], x, y);

http://jsbin.com/mukaxeneti/edit?html,output


Actual results:

On my laptop, each frame renders in ~26ms in Firefox 53, ~16ms in Edge 14 and ~7ms in Chromium 59




Expected results:

The frames should render under 10ms in Firefox
Component: Untriaged → Canvas: 2D
Product: Firefox → Core
Sorry, this slipped under the radar. Do you still have this test case lying around?
This could be due to D2D being slow at this sort of things. Edge runs canvas commands on a separate thread which may mitigate this if the main thread is busy but firefox does't do that.

Could you try to run the same test case with the pref "gfx.canvas.azure.backends" set to "skia" in about:config?
Assignee: nobody → nical.bugzilla
Flags: needinfo?(mathieu)
Priority: -- → P4
Whiteboard: [gfx-noted]
The test case is still at http://jsbin.com/mukaxeneti/edit?html,output

Setting "gfx.canvas.azure.backends" to "skia" and restarting Firefox 53 didn't change the perf. Still ~26ms in Firefox 53 vs ~7ms in Chrome 59
Flags: needinfo?(mathieu)
Profile of the test case (using the skia backend). https://perfht.ml/2sqe34R
less than a quarter of the time is spent in DrawTarget::FillGlyphs which is full of sadness. having off main thread canvas would help a whole lot here but we should be able to improve things significantly without that.
First low hanging fruit. Getting the style context is expensive, we can cache it for at least the duration of the frame (we could cache it for much longer but I don't know exactly if/when a canvas can move to another style context so let's play it safe for now).
Attachment #8884001 - Flags: review?(bas)
Another paper cut is the creation of temporary gfxContexts, which is pure technical debt. It'd be nice to remove that.
Another one is that call to Telemetry::Accumulate in gfxFontCache::Lookup. The doc [1] warns about using this in performance sensitive places. It should not be too hard to maintain a counter in the font cache and flush it to telemetry less often, and it would probably benefit non-canvas code paths as well. Arguably this one is a small chunk of the profile (about 5%) but a lot of the inefficiencies here are close to this amount and add up.

[1] https://gecko.readthedocs.io/en/latest/toolkit/components/telemetry/telemetry/collection/histograms.html
Thanks for investigating.

What does the perf look like with this patch ?


(In reply to Nicolas Silva [:nical] from comment #4)

> Created attachment 8884001 [details] [diff] [review]
> Cache the style context for the duration of the frame.
> 
> First low hanging fruit. Getting the style context is expensive, we can
> cache it for at least the duration of the frame (we could cache it for much
> longer but I don't know exactly if/when a canvas can move to another style
> context so let's play it safe for now).
(In reply to Mathieu 'p01' Henri from comment #7)
> What does the perf look like with this patch ?

Still pretty terrible. Right now the per-drawText call overhead is death by a thousand cuts, so it'll be a matter of fixing small things one after the other until the overall time gets to something satisfying. Hopefully the common case is to render larger text runs rather than drawing glyphs one by one. In any case it won't quite catch up with chromium or edge unless we implement off-main-thread canvas (my understanding is that it's what they do). Most likely the time measured in their case is the time spent recording canvas commands to send to the canvas thread rather than the actual time spent rendering, while gecko renders the canvas in the content thread directly. Painting canvases off of the main thread is certainly the right thing to do so it's no excuse, but I'm pointing that out because we are probably not measuring exactly the same thing. Off-main-thread canvas would be a fun project but probably not something we can afford to spend time on right now.
Attachment #8884001 - Flags: review?(bas) → review+
(In reply to Nicolas Silva [:nical] from comment #8)
> (In reply to Mathieu 'p01' Henri from comment #7)
> > What does the perf look like with this patch ?
> 
> Still pretty terrible. Right now the per-drawText call overhead is death by
> a thousand cuts, so it'll be a matter of fixing small things one after the
> other until the overall time gets to something satisfying. Hopefully the
> common case is to render larger text runs rather than drawing glyphs one by
> one. In any case it won't quite catch up with chromium or edge unless we
> implement off-main-thread canvas (my understanding is that it's what they
> do). Most likely the time measured in their case is the time spent recording
> canvas commands to send to the canvas thread rather than the actual time
> spent rendering, while gecko renders the canvas in the content thread
> directly. Painting canvases off of the main thread is certainly the right
> thing to do so it's no excuse, but I'm pointing that out because we are
> probably not measuring exactly the same thing. Off-main-thread canvas would
> be a fun project but probably not something we can afford to spend time on
> right now.

How hard would it be to batch a bunch of subsequent DrawText calls and execute them in a single fillglyph? Not certain whether the usecase is common enough to do this though.

FWIW, on my machine we perform slightly worse than Edge. But not by much, so it seems they haven't bothered to make this optimization.
Flags: needinfo?(nical.bugzilla)
> How hard would it be to batch a bunch of subsequent DrawText calls and execute them in a single fillglyph? Not certain whether the usecase is common enough to do this though.

I'd rather not add complexity to specifically optimize successions of many DrawText calls with single few glyphs (unless we find out that it's a common pattern for whatever reason). I am still hopeful we can reduce the overhead of using text in canvases in the general case by fixing paper cuts here and there, though.
Flags: needinfo?(nical.bugzilla)
Status: UNCONFIRMED → ASSIGNED
Ever confirmed: true

There's a r+ patch which didn't land and no activity in this bug for 2 weeks.
:nical, could you have a look please?

Flags: needinfo?(nical.bugzilla)
Severity: normal → S3

reporters_Standalone_testcase

Profiles on my Win11x64+AMD iGPU machine:

  1. GPU-canvas: https://share.firefox.dev/43jY7Sm (7-8ms)
  2. Skia-Canvas: https://share.firefox.dev/3PPlNej (8-9ms)
  3. D2d-canvas: https://share.firefox.dev/3NHCgib (5-6ms)
  4. Alternate D2d-canvas: https://share.firefox.dev/3NZsfhv (36ms. Constant memory increase. https://share.firefox.dev/3NZsfhv) [bug 1842213]
  5. Chrome: 2-3ms

Hi Lee. Are these numbers good enough and can this bug be closed?

Flags: needinfo?(lsalzman)
Status: ASSIGNED → RESOLVED
Closed: 1 year ago
Flags: needinfo?(lsalzman)
Resolution: --- → WORKSFORME
Flags: needinfo?(nical.bugzilla)
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: