Closed Bug 328811 Opened 18 years ago Closed 17 years ago

very slow text rendering on linux

Categories

(Core :: Graphics, defect)

x86
Linux
defect
Not set
normal

Tracking

RESOLVED WORKSFORME
mozilla1.9alpha8

People

(Reporter: vlad, Unassigned)

Details

(Keywords: perf, Whiteboard: linux-platform)

Attachments

(1 obsolete file)

On linux, with the default options, we render text using subpixel AA.  This means that we render text using component alpha, which is the slowest of the slow paths in compositing: we do a general argb -> argb composite through a mask with component alpha.  My oprofile was showing us spending over 65% of all system time in fbCompositeSolidMask_nx8888x8888Cmmx in the X server.
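To illustrate why the component-alpha path costs more per pixel, here is a minimal Python sketch of the OVER operator with a solid source and a mask: first with a single 8-bit coverage value (grayscale AA), then with per-channel coverage (subpixel AA). It follows the Render compositing model as I understand it. The function names are made up for illustration, and this is not the X server's code.

```python
def mul_un8(a, b):
    """Multiply two 8-bit normalized values (0..255), rounding to nearest."""
    t = a * b + 128
    return (t + (t >> 8)) >> 8


def over_a8_mask(src, mask, dst):
    """Solid premultiplied (a, r, g, b) source OVER dst through one coverage value."""
    src_alpha = mul_un8(src[0], mask)  # one effective alpha shared by all channels
    return tuple(
        min(255, mul_un8(s, mask) + mul_un8(d, 255 - src_alpha))
        for s, d in zip(src, dst)
    )


def over_component_alpha(src, mask, dst):
    """Same operator, but mask is (a, r, g, b) coverage applied per channel."""
    out = []
    for s, m, d in zip(src, mask, dst):
        channel_alpha = mul_un8(src[0], m)  # effective alpha differs per channel
        out.append(min(255, mul_un8(s, m) + mul_un8(d, 255 - channel_alpha)))
    return tuple(out)


black = (255, 0, 0, 0)
white = (255, 255, 255, 255)
# Full coverage on alpha and red only: green and blue keep the destination.
print(over_component_alpha(black, (255, 255, 0, 0), white))
```

With a uniform mask the two functions agree; the component-alpha version just has to compute a separate effective alpha for every channel, which is the extra work the slow path pays for.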
Attached patch: disable subpixel aa by default (obsolete)
This patch makes grayscale AA the default unless an env var is specified; as I mention in the comments, this type of init should go in the widget code, but this is a bandaid to make cairo linux builds usable until we have time to debug and otherwise fix this slow path.  (Note that our current builds aren't anywhere near as slow, so there's certainly some improvement that we can make.)
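As a rough sketch of the behaviour described here (grayscale AA unless an environment variable asks for subpixel), something like the following Python captures the decision. The variable name MOZ_SUBPIXEL_AA is hypothetical, since the name the patch uses is not stated in this bug, and the real change is in the C++ gfx code rather than anything like this.

```python
import os

# Hypothetical variable name, used only for this illustration.
SUBPIXEL_ENV_VAR = "MOZ_SUBPIXEL_AA"


def choose_antialias_mode(environ=None):
    """Return "subpixel" only if the env var is set to a non-empty value, else "gray"."""
    env = os.environ if environ is None else environ
    return "subpixel" if env.get(SUBPIXEL_ENV_VAR) else "gray"


print(choose_antialias_mode({}))
print(choose_antialias_mode({SUBPIXEL_ENV_VAR: "1"}))
```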
Attachment #213424 - Flags: review?(roc)
Bleh, I should look at patches before I post them; ignore everything but the first chunk in that patch (the CurrentPoint and NegativeArc defs).
Is it worth doing this instead of just fixing the problem?
Here's what oprofile says, with FC5t3 and nvidia drivers, RenderAccel turned off, and with this patch applied.  The test case is the Trender misc-text1 test case, at http://rig.vlad1.com/~vladimir/misc/misc-test1.html running through my time-render bookmarklet, linked from http://blog.vlad1.com/archives/2005/10/28/74/ (increase the c variable to something like 100 to get a reasonable time during which to collect oprofile data).

samples  %        image name               app name                 symbol name
751330   46.5737  libfb.so                 libfb.so                 fbCompositeSolidMask_nx8x8888mmx
238393   14.7776  Xorg                     Xorg                     (no symbols)
40438     2.5067  libglib-2.0.so.0.1000.0  libglib-2.0.so.0.1000.0  (no symbols)
28987     1.7969  libc-2.3.91.so           libc-2.3.91.so           _int_malloc
28687     1.7783  nvidia_drv.so            nvidia_drv.so            _nv001743X
24726     1.5327  libextmod.so             libextmod.so             XvDestroyPixmap
23466     1.4546  libfb.so                 libfb.so                 fbComposite
21604     1.3392  libpango-1.0.so.0.1199.0 libpango-1.0.so.0.1199.0 pango_default_break
20582     1.2758  libthebes.so             libthebes.so             _cairo_xlib_surface_old_show_glyphs
20532     1.2727  libfb.so                 libfb.so                 fbCompositeSrcAdd_8000x8000mmx
18313     1.1352  libfb.so                 libfb.so                 fbSolidFillmmx
17990     1.1152  libm-2.3.91.so           libm-2.3.91.so           floor
17740     1.0997  libc-2.3.91.so           libc-2.3.91.so           memcpy
17476     1.0833  libthebes.so             libthebes.so             _cairo_hash_table_lookup_internal
15879     0.9843  libthebes.so             libthebes.so             gfxPangoTextRun::DrawString(gfxContext*, gfxPoint)
15877     0.9842  libXft.so.2.1.2          libXft.so.2.1.2          (no symbols)
15671     0.9714  libthebes.so             libthebes.so             _cairo_scaled_font_glyph_device_extents
14574     0.9034  libc-2.3.91.so           libc-2.3.91.so           _int_free
14245     0.8830  nvidia_drv.so            nvidia_drv.so            _nv000310X
11672     0.7235  libgobject-2.0.so.0.1000.0 libgobject-2.0.so.0.1000.0 (no symbols)



With nvidia's RenderAccel turned on, 81.78 drivers, this doesn't change much (the resulting time is maybe 1% faster), except that we go through an nvidia private compositing function (which I'd guess isn't accelerated at all!):

samples  %        image name               app name                 symbol name
741137   56.2159  nvidia_drv.so            nvidia_drv.so            _nv000990X
52919     4.0140  Xorg                     Xorg                     (no symbols)
40007     3.0346  nvidia_drv.so            nvidia_drv.so            _nv002111X
36694     2.7833  libglib-2.0.so.0.1000.0  libglib-2.0.so.0.1000.0  (no symbols)
20112     1.5255  nvidia_drv.so            nvidia_drv.so            _nv000417X
19430     1.4738  libthebes.so             libthebes.so             _cairo_xlib_surface_old_show_glyphs
19376     1.4697  libpango-1.0.so.0.1199.0 libpango-1.0.so.0.1199.0 pango_default_break
18788     1.4251  nvidia_drv.so            nvidia_drv.so            _nv000052X
16222     1.2305  libm-2.3.91.so           libm-2.3.91.so           floor

I'm not sure why Xorg shows up as (no symbols), though; I have the debuginfo package installed and all that.
Scratch that; Xorg was suid root, so only root had read-access to the binary.  Here's the no-nvidia-accel profile again:

samples  %        image name               app name                 symbol name
751330   46.5737  libfb.so                 libfb.so                 fbCompositeSolidMask_nx8x8888mmx
28987     1.7969  libc-2.3.91.so           libc-2.3.91.so           _int_malloc
28687     1.7783  nvidia_drv.so            nvidia_drv.so            _nv001743X
26121     1.6192  Xorg                     Xorg                     miGlyphs
24726     1.5327  libextmod.so             libextmod.so             XvDestroyPixmap
23466     1.4546  libfb.so                 libfb.so                 fbComposite
21604     1.3392  libpango-1.0.so.0.1199.0 libpango-1.0.so.0.1199.0 pango_default_break
20582     1.2758  libthebes.so             libthebes.so             _cairo_xlib_surface_old_show_glyphs
20532     1.2727  libfb.so                 libfb.so                 fbCompositeSrcAdd_8000x8000mmx
18313     1.1352  libfb.so                 libfb.so                 fbSolidFillmmx
17990     1.1152  libm-2.3.91.so           libm-2.3.91.so           floor
17740     1.0997  libc-2.3.91.so           libc-2.3.91.so           memcpy
17476     1.0833  libthebes.so             libthebes.so             _cairo_hash_table_lookup_internal
15879     0.9843  libthebes.so             libthebes.so             gfxPangoTextRun::DrawString(gfxContext*, gfxPoint)
15671     0.9714  libthebes.so             libthebes.so             _cairo_scaled_font_glyph_device_extents
15557     0.9644  Xorg                     Xorg                     miComputeCompositeRegion
14574     0.9034  libc-2.3.91.so           libc-2.3.91.so           _int_free
14534     0.9009  Xorg                     Xorg                     CompareISOLatin1Lowered
14245     0.8830  nvidia_drv.so            nvidia_drv.so            _nv000310X
13662     0.8469  Xorg                     Xorg                     miModifyPixmapHeader
12576     0.7796  Xorg                     Xorg                     __i686.get_pc_thunk.bx
10603     0.6573  Xorg                     Xorg                     FindGlyphRef
10403     0.6449  Xorg                     Xorg                     FreePicture
10368     0.6427  Xorg                     Xorg                     ProcRenderCompositeGlyphs
10210     0.6329  Xorg                     Xorg                     damageComposite
9235      0.5725  Xorg                     Xorg                     miCompositeSourceValidate
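For comparing profiles like these, it can help to total the percentage column per image. The following is a small Python sketch that assumes the five-column layout shown in this bug (samples, %, image name, app name, symbol name); the helper name and the handful of sample rows are only for illustration.

```python
from collections import defaultdict

# A few rows copied from the profile above, with its header line.
SAMPLE = """\
samples  %        image name               app name                 symbol name
751330   46.5737  libfb.so                 libfb.so                 fbCompositeSolidMask_nx8x8888mmx
28987     1.7969  libc-2.3.91.so           libc-2.3.91.so           _int_malloc
26121     1.6192  Xorg                     Xorg                     miGlyphs
23466     1.4546  libfb.so                 libfb.so                 fbComposite
15879     0.9843  libthebes.so             libthebes.so             gfxPangoTextRun::DrawString(gfxContext*, gfxPoint)
"""


def percent_by_image(report):
    """Sum the % column per image name, skipping header and blank lines."""
    totals = defaultdict(float)
    for line in report.splitlines():
        # Split into at most five fields so symbols containing spaces stay whole.
        fields = line.split(None, 4)
        if len(fields) < 5 or not fields[0].isdigit():
            continue
        _samples, percent, image, _app, _symbol = fields
        totals[image] += float(percent)
    return dict(totals)


for image, pct in sorted(percent_by_image(SAMPLE).items(), key=lambda kv: -kv[1]):
    print(f"{pct:8.4f}  {image}")
```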
What does a non-cairo Xft-subpixel-AA profile look like?
I'll fire off a non-cairo build in a sec.  However, I just added a very hacked-up Xft fast path that draws glyphs using Xft instead of going through cairo_show_glyphs; here are some timings:

            Xft   cairo_show_glyphs  cairo+nvidia render accel
subpixel   275ms      363ms              369ms
gray       279ms      351ms              335ms

This is on a different machine (desktop rather than laptop) than when I originally tested the difference between subpixel and gray AA; I don't have any idea why the subpixel case was so much slower for me originally, because I can't reproduce that right now.
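To put the table above in relative terms, here is a short Python sketch that divides each cairo timing by the Xft timing for the same AA mode. The numbers are copied from the table; the helper name is made up for illustration.

```python
# Timings in milliseconds, copied from the table above.
timings_ms = {
    "subpixel": {"Xft": 275, "cairo_show_glyphs": 363, "cairo+nvidia render accel": 369},
    "gray": {"Xft": 279, "cairo_show_glyphs": 351, "cairo+nvidia render accel": 335},
}


def slowdown_vs_xft(row):
    """Return each non-Xft timing as a multiple of the Xft timing in the same row."""
    base = row["Xft"]
    return {name: ms / base for name, ms in row.items() if name != "Xft"}


for mode, row in timings_ms.items():
    for name, ratio in slowdown_vs_xft(row).items():
        print(f"{mode:9s} {name:27s} {ratio:.2f}x Xft")
```

On these numbers, cairo_show_glyphs without render accel comes out at around 1.3x the Xft fast path in both modes.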
Current non-cairo Xft:

subpixel   252ms
gray       260ms

(why the heck is the gray case consistently slower??)

subpixel AA, noncairo Xft profile:

samples  %        image name               app name                 symbol name
468604   68.1154  libfb.so                 libfb.so                 fbCompositeSolidMask_nx8x8888mmx
15234     2.2144  libfb.so                 libfb.so                 fbComposite
13449     1.9549  nvidia_drv.so            nvidia_drv.so            _nv001743X
12493     1.8160  libfontconfig.so.1.0.4   libfontconfig.so.1.0.4   (no symbols)
12375     1.7988  libfb.so                 libfb.so                 fbCompositeSrcAdd_8000x8000mmx
11220     1.6309  Xorg                     Xorg                     miComputeCompositeRegion
8008      1.1640  libfb.so                 libfb.so                 fbSolidFillmmx
7495      1.0895  libgkgfx.so              libgkgfx.so              NSToCoordRound(float)
7415      1.0778  Xorg                     Xorg                     damageComposite
7069      1.0275  Xorg                     Xorg                     FindGlyphRef
6628      0.9634  libXft.so.2.1.2          libXft.so.2.1.2          XftGlyphFontSpecRender
6535      0.9499  nvidia_drv.so            nvidia_drv.so            _nv000794X
6066      0.8817  libXft.so.2.1.2          libXft.so.2.1.2          XftCharIndex
6065      0.8816  Xorg                     Xorg                     CompositePicture
5900      0.8576  nvidia_drv.so            nvidia_drv.so            _nv000990X
4967      0.7220  libgfx_gtk.so            libgfx_gtk.so            nsFontXft::DrawStringSpec(unsigned int*, unsigned int, void*)
4931      0.7168  Xorg                     Xorg                     miGlyphs
4433      0.6444  Xorg                     Xorg                     miModifyPixmapHeader
4275      0.6214  Xorg                     Xorg                     miGlyphExtents
4135      0.6011  libXft.so.2.1.2          libXft.so.2.1.2          XftGlyphExtents

I think I just proved that rendering text sucks on linux, no matter how you do it.  I guess I'll stop now.
Blocks: 266582
Attachment #213424 - Attachment is obsolete: true
Attachment #213424 - Flags: review?(roc)
Cairo build using Xft with no AA gives me 74ms.
Keywords: perf
Blocks: 334720
Flags: blocking1.9a1?
Flags: blocking1.9a1? → blocking1.9+
Assignee: vladimir → nobody
Whiteboard: linux-platform
Target Milestone: --- → mozilla1.9alpha6
Punting remaining a6 bugs to b1; all of these shipped in a5, so we're at least no worse off by doing so.
Target Milestone: mozilla1.9alpha6 → mozilla1.9beta1
I'm not sure this is still much of an issue -- text renders pretty quickly for me on Linux.  Can we close this and let other bugs take its place?
Status: NEW → RESOLVED
Closed: 17 years ago
Resolution: --- → FIXED
Resolution: FIXED → WORKSFORME