Closed Bug 491595 Opened 14 years ago Closed 14 years ago

Canvas: drawing text outside of the visible canvas is very slow


(Core :: Graphics, defect)

Not set





(Reporter: stas, Assigned: roc)




(Keywords: fixed1.9.1, perf)


(7 files, 1 obsolete file)

Canvas method fillText() is very slow when a big text is drawn and it doesn't fit into the visible part of canvas. It seems that we're drawing the whole text instead of just clipping the path to fit in the viewport.

Attached is a suite of testcases that I used to test the performance.
Attached file Testcase
That's a pretty awesome set of test cases.
It's confirmed macos only.

On Linux clip() is fastest, on Windows source-in is.

On Safari/Webkit increasing font instead of scaling is way faster and it seems they never redraw what's outside of the viewport.
Convolving and integrating in my RGB?

It's more likely than you think!

	0.0%	79.0%	CoreGraphics	                                       CGContextShowGlyphsWithAdvances	
	0.0%	79.0%	CoreGraphics	                                        draw_glyphs	
	0.0%	79.0%	libRIP.A.dylib	                                         ripc_DrawGlyphs	
	0.0%	78.8%	CoreGraphics	                                          CGGlyphLockLockGlyphBitmaps	
	0.0%	78.8%	CoreGraphics	                                           create_missing_bitmaps	
	0.0%	78.7%	CoreGraphics	                                            CGFontCreateGlyphBitmaps	
	0.0%	78.7%	CoreGraphics	                                             CGFontCreateGlyphBitmaps32	
	0.0%	78.7%	CoreGraphics	                                              CGFontGetGlyphPaths	
	0.0%	78.7%	CoreGraphics	                                               glyph_path_end	
	1.3%	77.8%	CoreGraphics	                                                CGSScanconverterRenderMask	
	76.4%	76.4%	CoreGraphics	                                                 CGSScanConvolveAndIntegrateRGB
It definitely looks like the CG glyph cache is not working.

I thought perhaps the problem might be that we're drawing the same CGFont at different scales, and possibly CG only caches glyphs for a single CTM per CGFont. But the test that changes the font size uses separate CGFonts with a constant CTM and that is just as slow.
One thing that's clear is that if you shorten the string to just "A", there is no problem. So CG is definitely rasterizing all glyphs without looking at the clip extents to see if a glyph is visible. It might be worthwhile for us to do that; even if the glyph cache is working, if we're doing rotation so that a lot of (glyph, transform) pairs are never actually visible, we'd be wasting a lot of work rasterizing those glyphs.
Yea, I confirm that shortening the string solves the problem, so it has to be CG rasterising glyphs each frame.

I'm not sure if not rasterizing off-screen glyphs will solve the problem because we still can do a lot of harm by just rotating/scaling/moving the very same glyph at the enormous scale inside the viewport.

Should we narrow this down to not caching glyphs and threat ignoring off-screen glyphs as a nice addition?
(In reply to comment #7)
> Should we narrow this down to not caching glyphs and threat ignoring off-screen
> glyphs as a nice addition?

The problem is that "not caching glyphs" isn't something we can fix, whereas "ignoring off-screen glyphs" is something we can do in our code..
As I tried to say in comment #6, even if the glyphs are properly cached, there are lots of examples where ignoring glyphs outside the clip region would still save a lot of work. So we should do that. I'll give it a go, I think it should be fairly straightforward since we already have an efficient "loose" glyph bounds cache that will be fine for this.

I'd kind of like to know why CG glyph caching is behaving the way it is, for future reference, but figuring it out would be a huge pain and I don't think we have anyone volunteering to do that. The best approach would probably be to write a small test program equivalent to what we do, and see if we can get someone at Apple to explain why it's not receiving the benefits of glyph caching.
Assignee: nobody → roc
Flags: wanted1.9.1?
This patch sets doUseIntermediateSurface to always be true for text fills. It makes us uniformly fast for all the fillText tests! WHY???
Hmm.  I wonder if there's just no clip set on the canvas surface, which is confusing CG?  Though with a group there isn't a clip either, so.. hm.
(In reply to comment #10)
> Created an attachment (id=376553) [details]
> always PushGroup when drawing text
> This patch sets doUseIntermediateSurface to always be true for text fills. It
> makes us uniformly fast for all the fillText tests! WHY???

I could not reproduce the improvement on my mozilla-central build with this patch applied.
My results were the same with and without the patch.
I think this patch is reasonable. If we have glyph extents information, then we use it to drop glyphs that are outside the clip region. The overhead in the Draw path looks pretty low.

We do have to add TEXT_NEEDS_BOUNDING_BOX to nsCanvasRenderingContext2D's textruns, to make that extents data available. This will slow down canvas text drawing a little bit in general, but this tradeoff is unavoidable as far as I can see. Extents are cached aggressively so I expect that heavy canvas text users like Bespin won't notice anything.

With this patch, the testcase gives about the same results whether the test string is "A" or some long string. That's good. However, source-over fillText() is still slower than source-in and mozPathText() by nearly a factor of two, which suggests to me that the CG glyph cache is still not working.

(I wonder whether source-in is fast here because CG gives up on regular glyph rasterization for some reason and just fills the path instead.)
I revived my dormant 133t sk1llz and started reverse-engineering CoreGraphics a bit. CGFontCachePrint is a useful little function, and running it at every fillText call shows us a clear picture of what's going on. Basically the glyph cache size is fixed at 1000000 bytes (at least, I think they're bytes), and we're thrashing it.
Indeed, calling CGFontCacheSetMaxSize(..., 0xd0eedb0, 10*1024*1024) stops us thrashing and makes the test run about twice as fast.

CGFontCacheSetMaxSize is undocumented of course, but WKSetUpFontCache in WebKitSystemInterface.h (Webkit's private interface to undocumented OS X functionality) uses it. In Safari 3.2 with the standard Leopard system Webkit, WKSetUpFontCache calls CGFontCacheSetMaxSize with a parameter of 6MB, which explains why that Webkit would beat us.

Curiously, on trunk Webkit WKSetUpFontCache does not call CGFontCacheSetMaxSize, it just calls CGFontSetShouldUseMulticache(true), and I'm not sure what that does. Of course Safari 4 itself might still be calling CGFontCacheSetMaxSize. The call to CGFontCacheSetMaxSize was removed in this commit:
which I don't understand yet.

I guess we should try calling CGFontSetShouldUseMulticache to see if that helps, and if it doesn't, we could consider wrangling CGFontCacheSetMaxSize. Although maybe it doesn't exist in Snow Leopard?
CGFontSetShouldUseMulticache doesn't seem to make a difference. Calling CGFontCacheSetMaxSize to set the size above 3MB helps a lot.

So I don't know of any undocumented stuff that *trunk* Webkit is doing that would make it faster.
This patch makes us do the cache size mojo that Leopard's Webkit does.
Another thing we could consider doing is making all text above some size threshold (in device space) be drawn using path filling, on the assumption that the glyphs will be so large that we'll win due to not having to fill the whole path due to clipping, and due to not busting glyph caches.
Oops, I forgot to attach this glyph cache dump.
Looks like someone in cairo already had this idea! This patch just lets us use a much more reasonable font size threshold if the surface supports native fills (in which case we assume filling is reasonably fast).

This patch alone, without the other patches, nails the testcase in this bug.
I lost my URL to the "real" demo that inspired this bug. It would be great if someone could re-test that demo with the patch in comment #21, the patch in comment #14, and both patches.
Comment on attachment 376659 [details] [diff] [review]
use path filling for large font sizes

This patch is a no-brainer, I reckon.
Attachment #376659 - Flags: review?(jmuizelaar)
No-brainer patch looks good to me; maybe kick off a try server run with the glyph cache magic thing, see if it makes a difference on Tp?
I recompiled with one and I recompiled with the other and both gave no signs of improvement for me.

I'm wondering if I might recompile improperly after applying patches.
Comment on attachment 376659 [details] [diff] [review]
use path filling for large font sizes

I'll take that as Vlad's r+
Attachment #376659 - Flags: review?(jmuizelaar) → review+
Gandalf: it sounds like you aren't rebuilding properly... you did a full build?
Whiteboard: [needs landing]
No, I just rebuilt the components modified by the patch. Should I do a full rebuild?
Yeah, that's the no-brainer way to fix things. You're on Mac so a full build shouldn't take long.

Roc, you rock! :) We're faster in all tests and the default behavior is the fastest one now!!!

We're getting back on track with the demo, thanks!
Which patches was that for?

Have you tried just the patch in comment #21 and no other patches?
No, this was with patches from Comment 21 and from Comment 13
If you try just the patch in comment #21, are the results good?
yes! it works perfectly fine with the same performance with patch from comment #21 only.
Attached patch follow-up patch (obsolete) — Splinter Review
There is a build error:

c:\cygwin\home\user\hg\mozilla-central\gfx\cairo\cairo\src\cairo-platform.h : wa
rning C4819: The file contains a character that cannot be represented in the cur
rent code page (932). Save the file in Unicode format to prevent data loss
: warning C4005: 'cairo_public' : macro redefinition
) : see previous definition of 'cairo_public'
: error C2143: syntax error : missing ';' before 'type'
: error C2065: 'path_fill_threshold' : undeclared identifier
make[1]: *** [cairo-gstate.obj] Error 2
Comment on attachment 377281 [details] [diff] [review]
follow-up patch

Oops! Fixed already.
Attachment #377281 - Attachment is obsolete: true
Checked into trunk:
Bustage fixes:

Those weren't quite just bustage fixes .... I wasn't sure what was going on so I moved the size threshold calculation code into cairo-surface.c, where it seems to be better belong. Hijoyuki's patch makes it obvious the problem was declaring a variable not at the begining of a function in a C file. (Too bad the compiler error message is completely useless.)
Closed: 14 years ago
Resolution: --- → FIXED
Whiteboard: [needs landing] → [needs 191 approval]
Comment on attachment 376659 [details] [diff] [review]
use path filling for large font sizes

Attachment #376659 - Flags: approval1.9.1? → approval1.9.1+
Whiteboard: [needs 191 approval] → [needs 191 landing]
You need to log in before you can comment on or make changes to this bug.