Closed
Bug 328811
Opened 19 years ago
Closed 18 years ago
very slow text rendering on linux
Categories
(Core :: Graphics, defect)
RESOLVED
WORKSFORME
mozilla1.9alpha8
People
(Reporter: vlad, Unassigned)
References
Details
(Keywords: perf, Whiteboard: linux-platform)
Attachments
(1 obsolete file)
On linux, with the default options, we render text using subpixel AA. This means that we render text using component alpha, which is the slowest of the slow paths in compositing: we do a general argb -> argb composite through a mask with component alpha. My oprofile was showing us spending over 65% of all system time in fbCompositeSolidMask_nx8888x8888Cmmx in the X server.
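As a rough illustration of why the component-alpha path is more work per pixel, here is a minimal sketch of the two OVER-through-a-mask blends, written with premultiplied float channels. This is not the X server's MMX code; the function names are made up for the example, and the formulas follow the usual Render-style definition as an assumption.

```python
# Simplified sketch, not the fb/MMX implementation. Pixels are premultiplied
# (a, r, g, b) tuples of floats in [0, 1]. Function names are hypothetical.

def over_gray_mask(src, mask_a, dst):
    """Grayscale AA: one coverage value per pixel scales every channel."""
    src_a = src[0]
    return tuple(s * mask_a + d * (1.0 - src_a * mask_a)
                 for s, d in zip(src, dst))


def over_component_alpha(src, mask, dst):
    """Subpixel AA: the mask carries a separate coverage value per channel."""
    src_a = src[0]
    return tuple(s * m + d * (1.0 - src_a * m)
                 for s, m, d in zip(src, mask, dst))


black = (1.0, 0.0, 0.0, 0.0)   # opaque black text colour
white = (1.0, 1.0, 1.0, 1.0)   # opaque white background

gray_result = over_gray_mask(black, 0.5, white)
subpixel_result = over_component_alpha(black, (1.0, 1.0, 0.5, 0.0), white)
```

With a uniform mask the two functions give the same result; the component-alpha version just has to read and apply four mask values per pixel instead of one.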
Comment 1 (Reporter) • 19 years ago
This patch makes grayscale AA the default unless an env var is specified; as I mention in the comments, this type of init should go in the widget code, but this is a bandaid to make cairo linux builds usable until we have time to debug and otherwise fix this slow path. (Note that our current builds aren't anywhere near as slow, so there's certainly some improvement that we can make.)
Attachment #213424 -
Flags: review?(roc)
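The shape of that bandaid, sketched in Python rather than the C++ the patch would use: default to grayscale AA and only opt into subpixel AA when an environment variable is set. The variable name below is hypothetical; the comment above does not name the one the patch checks.

```python
import os

# Hypothetical variable name, for illustration only.
SUBPIXEL_ENV_VAR = "EXAMPLE_GFX_SUBPIXEL_AA"


def choose_text_antialias(environ=None):
    """Return "subpixel" only when the opt-in variable is set and non-empty."""
    env = os.environ if environ is None else environ
    if env.get(SUBPIXEL_ENV_VAR):
        return "subpixel"
    return "gray"
```

Passing a plain dict instead of the real environment keeps the choice easy to check in isolation.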
Comment 2 (Reporter) • 19 years ago
Bleh, I should look at patches before I post them; ignore everything but the first chunk in that patch (the CurrentPoint and NegativeArc defs).
Comment 3
Is it worth doing this instead of just fixing the problem?
Comment 4 (Reporter) • 19 years ago
Here's what oprofile says, with FC5t3 and nvidia drivers, RenderAccel turned off, and with this patch applied. The test case is the Trender misc-text1 test case, at http://rig.vlad1.com/~vladimir/misc/misc-test1.html running through my time-render bookmarklet, linked from http://blog.vlad1.com/archives/2005/10/28/74/ (increase the c variable to something like 100 to get a reasonable time during which to collect oprofile data).
samples  %        image name                  app name                    symbol name
751330   46.5737  libfb.so                    libfb.so                    fbCompositeSolidMask_nx8x8888mmx
238393   14.7776  Xorg                        Xorg                        (no symbols)
40438    2.5067   libglib-2.0.so.0.1000.0     libglib-2.0.so.0.1000.0     (no symbols)
28987    1.7969   libc-2.3.91.so              libc-2.3.91.so              _int_malloc
28687    1.7783   nvidia_drv.so               nvidia_drv.so               _nv001743X
24726    1.5327   libextmod.so                libextmod.so                XvDestroyPixmap
23466    1.4546   libfb.so                    libfb.so                    fbComposite
21604    1.3392   libpango-1.0.so.0.1199.0    libpango-1.0.so.0.1199.0    pango_default_break
20582    1.2758   libthebes.so                libthebes.so                _cairo_xlib_surface_old_show_glyphs
20532    1.2727   libfb.so                    libfb.so                    fbCompositeSrcAdd_8000x8000mmx
18313    1.1352   libfb.so                    libfb.so                    fbSolidFillmmx
17990    1.1152   libm-2.3.91.so              libm-2.3.91.so              floor
17740    1.0997   libc-2.3.91.so              libc-2.3.91.so              memcpy
17476    1.0833   libthebes.so                libthebes.so                _cairo_hash_table_lookup_internal
15879    0.9843   libthebes.so                libthebes.so                gfxPangoTextRun::DrawString(gfxContext*, gfxPoint)
15877    0.9842   libXft.so.2.1.2             libXft.so.2.1.2             (no symbols)
15671    0.9714   libthebes.so                libthebes.so                _cairo_scaled_font_glyph_device_extents
14574    0.9034   libc-2.3.91.so              libc-2.3.91.so              _int_free
14245    0.8830   nvidia_drv.so               nvidia_drv.so               _nv000310X
11672    0.7235   libgobject-2.0.so.0.1000.0  libgobject-2.0.so.0.1000.0  (no symbols)
With nvidia's RenderAccel turned on, 81.78 drivers, this doesn't change much (the resulting time is maybe 1% faster), except that we go through an nvidia private compositing function (which I'd guess isn't accelerated at all!):
samples  %        image name                  app name                    symbol name
741137   56.2159  nvidia_drv.so               nvidia_drv.so               _nv000990X
52919    4.0140   Xorg                        Xorg                        (no symbols)
40007    3.0346   nvidia_drv.so               nvidia_drv.so               _nv002111X
36694    2.7833   libglib-2.0.so.0.1000.0     libglib-2.0.so.0.1000.0     (no symbols)
20112    1.5255   nvidia_drv.so               nvidia_drv.so               _nv000417X
19430    1.4738   libthebes.so                libthebes.so                _cairo_xlib_surface_old_show_glyphs
19376    1.4697   libpango-1.0.so.0.1199.0    libpango-1.0.so.0.1199.0    pango_default_break
18788    1.4251   nvidia_drv.so               nvidia_drv.so               _nv000052X
16222    1.2305   libm-2.3.91.so              libm-2.3.91.so              floor
I'm not sure why that (no symbols) shows up for Xorg, though; I have the debuginfo package installed and all that.
Comment 5 (Reporter) • 19 years ago
Scratch that; Xorg was suid root, so only root had read-access to the binary. Here's the no-nvidia-accel profile again:
samples  %        image name                  app name                    symbol name
751330   46.5737  libfb.so                    libfb.so                    fbCompositeSolidMask_nx8x8888mmx
28987    1.7969   libc-2.3.91.so              libc-2.3.91.so              _int_malloc
28687    1.7783   nvidia_drv.so               nvidia_drv.so               _nv001743X
26121    1.6192   Xorg                        Xorg                        miGlyphs
24726    1.5327   libextmod.so                libextmod.so                XvDestroyPixmap
23466    1.4546   libfb.so                    libfb.so                    fbComposite
21604    1.3392   libpango-1.0.so.0.1199.0    libpango-1.0.so.0.1199.0    pango_default_break
20582    1.2758   libthebes.so                libthebes.so                _cairo_xlib_surface_old_show_glyphs
20532    1.2727   libfb.so                    libfb.so                    fbCompositeSrcAdd_8000x8000mmx
18313    1.1352   libfb.so                    libfb.so                    fbSolidFillmmx
17990    1.1152   libm-2.3.91.so              libm-2.3.91.so              floor
17740    1.0997   libc-2.3.91.so              libc-2.3.91.so              memcpy
17476    1.0833   libthebes.so                libthebes.so                _cairo_hash_table_lookup_internal
15879    0.9843   libthebes.so                libthebes.so                gfxPangoTextRun::DrawString(gfxContext*, gfxPoint)
15671    0.9714   libthebes.so                libthebes.so                _cairo_scaled_font_glyph_device_extents
15557    0.9644   Xorg                        Xorg                        miComputeCompositeRegion
14574    0.9034   libc-2.3.91.so              libc-2.3.91.so              _int_free
14534    0.9009   Xorg                        Xorg                        CompareISOLatin1Lowered
14245    0.8830   nvidia_drv.so               nvidia_drv.so               _nv000310X
13662    0.8469   Xorg                        Xorg                        miModifyPixmapHeader
12576    0.7796   Xorg                        Xorg                        __i686.get_pc_thunk.bx
10603    0.6573   Xorg                        Xorg                        FindGlyphRef
10403    0.6449   Xorg                        Xorg                        FreePicture
10368    0.6427   Xorg                        Xorg                        ProcRenderCompositeGlyphs
10210    0.6329   Xorg                        Xorg                        damageComposite
9235     0.5725   Xorg                        Xorg                        miCompositeSourceValidate
Comment 6
What does a non-cairo Xft-subpixel-AA profile look like?
Comment 7 (Reporter) • 19 years ago
I'll fire off a non-cairo build in a sec. However, I just added a very hacked-up Xft fast path that draws glyphs using Xft instead of going through cairo_show_glyphs; here are some timings:
          Xft     cairo_show_glyphs   cairo+nvidia render accel
subpixel  275ms   363ms               369ms
gray      279ms   351ms               335ms
This is on a different machine (desktop rather than laptop) than when I originally tested the difference between subpixel and gray AA; I don't have any idea why the subpixel case was so much slower for me originally, because I can't reproduce that right now.
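To put the timings from this comment in relative terms, a small sketch that computes how much slower cairo_show_glyphs is than the hacked-up Xft path in each AA mode. The numbers are copied from the table in this comment; the dictionary keys and helper name are only for the example.

```python
# Timings in milliseconds, copied from the comment above.
timings_ms = {
    "subpixel": {"xft": 275, "cairo_show_glyphs": 363, "cairo_nvidia_accel": 369},
    "gray": {"xft": 279, "cairo_show_glyphs": 351, "cairo_nvidia_accel": 335},
}


def overhead_pct(slow_ms, fast_ms):
    """Percentage by which slow_ms exceeds fast_ms."""
    return 100.0 * (slow_ms - fast_ms) / fast_ms


for mode, row in timings_ms.items():
    pct = overhead_pct(row["cairo_show_glyphs"], row["xft"])
    print(f"{mode}: cairo_show_glyphs is {pct:.0f}% slower than the Xft path")
```

That works out to roughly 32% for subpixel and 26% for gray, while subpixel versus gray within cairo_show_glyphs differs by only about 3%, which matches the note that the original large subpixel slowdown did not reproduce on this machine.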
Comment 8 (Reporter) • 19 years ago
Current non-cairo Xft:
subpixel  252ms
gray      260ms
(why the heck is the gray case consistently slower??)
Comment 9 (Reporter) • 19 years ago
subpixel AA, noncairo Xft profile:
samples  %        image name                  app name                    symbol name
468604   68.1154  libfb.so                    libfb.so                    fbCompositeSolidMask_nx8x8888mmx
15234    2.2144   libfb.so                    libfb.so                    fbComposite
13449    1.9549   nvidia_drv.so               nvidia_drv.so               _nv001743X
12493    1.8160   libfontconfig.so.1.0.4      libfontconfig.so.1.0.4      (no symbols)
12375    1.7988   libfb.so                    libfb.so                    fbCompositeSrcAdd_8000x8000mmx
11220    1.6309   Xorg                        Xorg                        miComputeCompositeRegion
8008     1.1640   libfb.so                    libfb.so                    fbSolidFillmmx
7495     1.0895   libgkgfx.so                 libgkgfx.so                 NSToCoordRound(float)
7415     1.0778   Xorg                        Xorg                        damageComposite
7069     1.0275   Xorg                        Xorg                        FindGlyphRef
6628     0.9634   libXft.so.2.1.2             libXft.so.2.1.2             XftGlyphFontSpecRender
6535     0.9499   nvidia_drv.so               nvidia_drv.so               _nv000794X
6066     0.8817   libXft.so.2.1.2             libXft.so.2.1.2             XftCharIndex
6065     0.8816   Xorg                        Xorg                        CompositePicture
5900     0.8576   nvidia_drv.so               nvidia_drv.so               _nv000990X
4967     0.7220   libgfx_gtk.so               libgfx_gtk.so               nsFontXft::DrawStringSpec(unsigned int*, unsigned int, void*)
4931     0.7168   Xorg                        Xorg                        miGlyphs
4433     0.6444   Xorg                        Xorg                        miModifyPixmapHeader
4275     0.6214   Xorg                        Xorg                        miGlyphExtents
4135     0.6011   libXft.so.2.1.2             libXft.so.2.1.2             XftGlyphExtents
I think I just proved that rendering text sucks on linux, no matter how you do it. I guess I'll stop now.
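For comparing profiles like the ones in this bug, it can help to total the samples per image rather than per symbol. A minimal sketch, assuming the five-column layout shown in these reports (samples, %, image name, app name, symbol name); the rows are an excerpt of the non-cairo Xft profile above, and the helper name is made up.

```python
from collections import defaultdict

# Excerpt of the non-cairo Xft profile from this comment (first seven rows).
REPORT = """\
samples % image name app name symbol name
468604 68.1154 libfb.so libfb.so fbCompositeSolidMask_nx8x8888mmx
15234 2.2144 libfb.so libfb.so fbComposite
13449 1.9549 nvidia_drv.so nvidia_drv.so _nv001743X
12493 1.8160 libfontconfig.so.1.0.4 libfontconfig.so.1.0.4 (no symbols)
12375 1.7988 libfb.so libfb.so fbCompositeSrcAdd_8000x8000mmx
11220 1.6309 Xorg Xorg miComputeCompositeRegion
8008 1.1640 libfb.so libfb.so fbSolidFillmmx
"""


def samples_by_image(report):
    """Sum the sample counts per image name, skipping header or malformed lines."""
    totals = defaultdict(int)
    for line in report.splitlines():
        # Split into at most five fields so symbol names may contain spaces.
        fields = line.split(None, 4)
        if len(fields) < 5 or not fields[0].isdigit():
            continue
        samples, _pct, image, _app, _symbol = fields
        totals[image] += int(samples)
    return dict(totals)


totals = samples_by_image(REPORT)
for image, samples in sorted(totals.items(), key=lambda kv: -kv[1]):
    print(image, samples)
```

On this excerpt libfb.so dominates, which is the same picture the per-symbol view gives: most of the time goes to software compositing in the X server's fb layer.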
Updated (Reporter) • 19 years ago
Attachment #213424 -
Attachment is obsolete: true
Attachment #213424 -
Flags: review?(roc)
Comment 10 (Reporter) • 19 years ago
Cairo build using Xft with no AA gives me 74ms.
Updated • 19 years ago
Flags: blocking1.9a1?
Updated (Reporter) • 19 years ago
Flags: blocking1.9a1? → blocking1.9+
Updated (Reporter) • 18 years ago
Assignee: vladimir → nobody
Whiteboard: linux-platform
Target Milestone: --- → mozilla1.9alpha6
Comment 11 • 18 years ago
punting remaining a6 bugs to b1, all of these shipped in a5, so we're at least no worse off by doing so.
Target Milestone: mozilla1.9alpha6 → mozilla1.9beta1
Comment 12 • 18 years ago
i'm not sure this is still much of an issue -- text renders pretty quickly for me on linux. can we close this/let other bugs take its place?
Updated • 18 years ago
Status: NEW → RESOLVED
Closed: 18 years ago
Resolution: --- → FIXED
Updated • 14 years ago
Resolution: FIXED → WORKSFORME