Closed
Bug 328811
Opened 18 years ago
Closed 17 years ago
very slow text rendering on linux
Categories
(Core :: Graphics, defect)
Tracking
RESOLVED
WORKSFORME
mozilla1.9alpha8
People
(Reporter: vlad, Unassigned)
References
Details
(Keywords: perf, Whiteboard: linux-platform)
Attachments
(1 obsolete file)
On linux, with the default options, we render text using subpixel AA. This means that we render text using component alpha, which is the slowest of the slow paths in compositing: we do a general argb -> argb composite through a mask with component alpha. My oprofile was showing us spending over 65% of all system time in fbCompositeSolidMask_nx8888x8888Cmmx in the X server.
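For readers unfamiliar with the term, the difference between a plain 8-bit coverage mask and a component-alpha mask can be sketched per pixel as below. This is only an illustration of the OVER operator on premultiplied ARGB32 values as I understand it, not the X server's fb code; the names mul8, chan, over_solid_mask_a8 and over_solid_mask_ca are made up for this sketch.

```c
#include <stdint.h>

/* Multiply two 8-bit values treated as fractions of 255, with rounding. */
static uint8_t mul8(uint8_t a, uint8_t b)
{
    uint32_t t = (uint32_t)a * b + 128;
    return (uint8_t)((t + (t >> 8)) >> 8);
}

/* Extract one 8-bit channel of an ARGB32 pixel (shift = 0, 8, 16 or 24). */
static uint8_t chan(uint32_t p, int shift)
{
    return (uint8_t)(p >> shift);
}

/* OVER with a unified 8-bit mask (grayscale AA):
 * a single coverage value scales every channel of the solid source. */
uint32_t over_solid_mask_a8(uint32_t src, uint8_t mask, uint32_t dst)
{
    uint8_t inv = 255 - mul8(chan(src, 24), mask);
    uint32_t out = 0;
    for (int shift = 0; shift <= 24; shift += 8) {
        uint32_t c = mul8(chan(src, shift), mask) + mul8(chan(dst, shift), inv);
        out |= (c > 255 ? 255 : c) << shift;
    }
    return out;
}

/* OVER with a component-alpha mask (subpixel AA):
 * every channel has its own coverage value, so the effective source
 * alpha has to be recomputed per channel. */
uint32_t over_solid_mask_ca(uint32_t src, uint32_t mask, uint32_t dst)
{
    uint8_t sa = chan(src, 24);
    uint32_t out = 0;
    for (int shift = 0; shift <= 24; shift += 8) {
        uint8_t m = chan(mask, shift);
        uint8_t inv = 255 - mul8(sa, m);
        uint32_t c = mul8(chan(src, shift), m) + mul8(chan(dst, shift), inv);
        out |= (c > 255 ? 255 : c) << shift;
    }
    return out;
}
```

When all four mask channels carry the same value, the component-alpha version reduces to the 8-bit-mask version; the extra cost comes from reading a 32-bit mask and computing a separate inverse alpha for each channel.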
Comment 1•18 years ago (Reporter)
This patch makes grayscale AA the default unless an env var is specified; as I mention in the comments, this type of init should go in the widget code, but this is a bandaid to make cairo linux builds usable until we have time to debug and otherwise fix this slow path. (Note that our current builds aren't anywhere near as slow, so there's certainly some improvement that we can make.)
Attachment #213424 -
Flags: review?(roc)
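The attached patch is not reproduced on this page, so the following is only a sketch of the general idea (grayscale AA by default, subpixel AA only on request via the environment). The variable name EXAMPLE_SUBPIXEL_AA, the enum text_aa_mode and both functions are hypothetical and not taken from the patch.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the antialiasing mode handed to the font backend. */
typedef enum { TEXT_AA_GRAY, TEXT_AA_SUBPIXEL } text_aa_mode;

/* Decide the mode from the override value; NULL means the variable is unset.
 * Anything other than "1" keeps the grayscale default. */
text_aa_mode choose_text_aa(const char *value)
{
    if (value != NULL && strcmp(value, "1") == 0)
        return TEXT_AA_SUBPIXEL;
    return TEXT_AA_GRAY;
}

/* Read the (hypothetical) override variable from the process environment. */
text_aa_mode choose_text_aa_from_env(void)
{
    return choose_text_aa(getenv("EXAMPLE_SUBPIXEL_AA"));
}
```

In a cairo-based build the result would presumably be mapped to CAIRO_ANTIALIAS_GRAY or CAIRO_ANTIALIAS_SUBPIXEL and applied with cairo_font_options_set_antialias(); that wiring is left out here so the sketch stays self-contained.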
Comment 2•18 years ago (Reporter)
Bleh, I should look at patches before I post them; ignore everything but the first chunk in that patch (the CurrentPoint and NegativeArc defs).
Comment 3•18 years ago
Is it worth doing this instead of just fixing the problem?
Comment 4•18 years ago (Reporter)
Here's what oprofile says, with FC5t3 and nvidia drivers, RenderAccel turned off, and with this patch applied. The test case is the Trender misc-text1 test case, at http://rig.vlad1.com/~vladimir/misc/misc-test1.html running through my time-render bookmarklet, linked from http://blog.vlad1.com/archives/2005/10/28/74/ (increase the c variable to something like 100 to get a reasonable time during which to collect oprofile data).

samples  %        image name                  app name                    symbol name
751330   46.5737  libfb.so                    libfb.so                    fbCompositeSolidMask_nx8x8888mmx
238393   14.7776  Xorg                        Xorg                        (no symbols)
40438    2.5067   libglib-2.0.so.0.1000.0     libglib-2.0.so.0.1000.0     (no symbols)
28987    1.7969   libc-2.3.91.so              libc-2.3.91.so              _int_malloc
28687    1.7783   nvidia_drv.so               nvidia_drv.so               _nv001743X
24726    1.5327   libextmod.so                libextmod.so                XvDestroyPixmap
23466    1.4546   libfb.so                    libfb.so                    fbComposite
21604    1.3392   libpango-1.0.so.0.1199.0    libpango-1.0.so.0.1199.0    pango_default_break
20582    1.2758   libthebes.so                libthebes.so                _cairo_xlib_surface_old_show_glyphs
20532    1.2727   libfb.so                    libfb.so                    fbCompositeSrcAdd_8000x8000mmx
18313    1.1352   libfb.so                    libfb.so                    fbSolidFillmmx
17990    1.1152   libm-2.3.91.so              libm-2.3.91.so              floor
17740    1.0997   libc-2.3.91.so              libc-2.3.91.so              memcpy
17476    1.0833   libthebes.so                libthebes.so                _cairo_hash_table_lookup_internal
15879    0.9843   libthebes.so                libthebes.so                gfxPangoTextRun::DrawString(gfxContext*, gfxPoint)
15877    0.9842   libXft.so.2.1.2             libXft.so.2.1.2             (no symbols)
15671    0.9714   libthebes.so                libthebes.so                _cairo_scaled_font_glyph_device_extents
14574    0.9034   libc-2.3.91.so              libc-2.3.91.so              _int_free
14245    0.8830   nvidia_drv.so               nvidia_drv.so               _nv000310X
11672    0.7235   libgobject-2.0.so.0.1000.0  libgobject-2.0.so.0.1000.0  (no symbols)

With nvidia's RenderAccel turned on, 81.78 drivers, this doesn't change much (the resulting time is maybe 1% faster), except that we go through an nvidia private compositing function (which I'd guess isn't accelerated at all!):

samples  %        image name                  app name                    symbol name
741137   56.2159  nvidia_drv.so               nvidia_drv.so               _nv000990X
52919    4.0140   Xorg                        Xorg                        (no symbols)
40007    3.0346   nvidia_drv.so               nvidia_drv.so               _nv002111X
36694    2.7833   libglib-2.0.so.0.1000.0     libglib-2.0.so.0.1000.0     (no symbols)
20112    1.5255   nvidia_drv.so               nvidia_drv.so               _nv000417X
19430    1.4738   libthebes.so                libthebes.so                _cairo_xlib_surface_old_show_glyphs
19376    1.4697   libpango-1.0.so.0.1199.0    libpango-1.0.so.0.1199.0    pango_default_break
18788    1.4251   nvidia_drv.so               nvidia_drv.so               _nv000052X
16222    1.2305   libm-2.3.91.so              libm-2.3.91.so              floor

I'm not sure that that (no symbols) is for Xorg, though; I have the debuginfo package installed and all that.
Comment 5•18 years ago (Reporter)
Scratch that; Xorg was suid root, so only root had read-access to the binary. Here's the no-nvidia-accel profile again:

samples  %        image name                  app name                    symbol name
751330   46.5737  libfb.so                    libfb.so                    fbCompositeSolidMask_nx8x8888mmx
28987    1.7969   libc-2.3.91.so              libc-2.3.91.so              _int_malloc
28687    1.7783   nvidia_drv.so               nvidia_drv.so               _nv001743X
26121    1.6192   Xorg                        Xorg                        miGlyphs
24726    1.5327   libextmod.so                libextmod.so                XvDestroyPixmap
23466    1.4546   libfb.so                    libfb.so                    fbComposite
21604    1.3392   libpango-1.0.so.0.1199.0    libpango-1.0.so.0.1199.0    pango_default_break
20582    1.2758   libthebes.so                libthebes.so                _cairo_xlib_surface_old_show_glyphs
20532    1.2727   libfb.so                    libfb.so                    fbCompositeSrcAdd_8000x8000mmx
18313    1.1352   libfb.so                    libfb.so                    fbSolidFillmmx
17990    1.1152   libm-2.3.91.so              libm-2.3.91.so              floor
17740    1.0997   libc-2.3.91.so              libc-2.3.91.so              memcpy
17476    1.0833   libthebes.so                libthebes.so                _cairo_hash_table_lookup_internal
15879    0.9843   libthebes.so                libthebes.so                gfxPangoTextRun::DrawString(gfxContext*, gfxPoint)
15671    0.9714   libthebes.so                libthebes.so                _cairo_scaled_font_glyph_device_extents
15557    0.9644   Xorg                        Xorg                        miComputeCompositeRegion
14574    0.9034   libc-2.3.91.so              libc-2.3.91.so              _int_free
14534    0.9009   Xorg                        Xorg                        CompareISOLatin1Lowered
14245    0.8830   nvidia_drv.so               nvidia_drv.so               _nv000310X
13662    0.8469   Xorg                        Xorg                        miModifyPixmapHeader
12576    0.7796   Xorg                        Xorg                        __i686.get_pc_thunk.bx
10603    0.6573   Xorg                        Xorg                        FindGlyphRef
10403    0.6449   Xorg                        Xorg                        FreePicture
10368    0.6427   Xorg                        Xorg                        ProcRenderCompositeGlyphs
10210    0.6329   Xorg                        Xorg                        damageComposite
9235     0.5725   Xorg                        Xorg                        miCompositeSourceValidate
Comment 6•18 years ago
What does a non-cairo Xft-subpixel-AA profile look like?
Comment 7•18 years ago (Reporter)
I'll fire off a non-cairo build in a sec. However, I just added a very hacked-up Xft fast path that draws glyphs using Xft instead of going through cairo_show_glyphs; here are some timings:

          Xft    cairo_show_glyphs  cairo+nvidia render accel
subpixel  275ms  363ms              369ms
gray      279ms  351ms              335ms

This is on a different machine (desktop rather than laptop) than when I originally tested the difference between subpixel and gray AA; I don't have any idea why the subpixel case was so much slower for me originally, because I can't reproduce that right now.
Comment 8•18 years ago (Reporter)
Current non-cairo Xft:

subpixel  252
gray      260

(why the heck is the gray case consistently slower??)
Comment 9•18 years ago (Reporter)
subpixel AA, noncairo Xft profile:

samples  %        image name                  app name                    symbol name
468604   68.1154  libfb.so                    libfb.so                    fbCompositeSolidMask_nx8x8888mmx
15234    2.2144   libfb.so                    libfb.so                    fbComposite
13449    1.9549   nvidia_drv.so               nvidia_drv.so               _nv001743X
12493    1.8160   libfontconfig.so.1.0.4      libfontconfig.so.1.0.4      (no symbols)
12375    1.7988   libfb.so                    libfb.so                    fbCompositeSrcAdd_8000x8000mmx
11220    1.6309   Xorg                        Xorg                        miComputeCompositeRegion
8008     1.1640   libfb.so                    libfb.so                    fbSolidFillmmx
7495     1.0895   libgkgfx.so                 libgkgfx.so                 NSToCoordRound(float)
7415     1.0778   Xorg                        Xorg                        damageComposite
7069     1.0275   Xorg                        Xorg                        FindGlyphRef
6628     0.9634   libXft.so.2.1.2             libXft.so.2.1.2             XftGlyphFontSpecRender
6535     0.9499   nvidia_drv.so               nvidia_drv.so               _nv000794X
6066     0.8817   libXft.so.2.1.2             libXft.so.2.1.2             XftCharIndex
6065     0.8816   Xorg                        Xorg                        CompositePicture
5900     0.8576   nvidia_drv.so               nvidia_drv.so               _nv000990X
4967     0.7220   libgfx_gtk.so               libgfx_gtk.so               nsFontXft::DrawStringSpec(unsigned int*, unsigned int, void*)
4931     0.7168   Xorg                        Xorg                        miGlyphs
4433     0.6444   Xorg                        Xorg                        miModifyPixmapHeader
4275     0.6214   Xorg                        Xorg                        miGlyphExtents
4135     0.6011   libXft.so.2.1.2             libXft.so.2.1.2             XftGlyphExtents

I think I just proved that rendering text sucks on linux, no matter how you do it. I guess I'll stop now.
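To compare profiles like the ones above by library rather than by symbol, the "%" column can be summed per image name. The helper below is a minimal sketch that assumes each row starts with "samples percent image-name" separated by whitespace, as in the listings in this bug; percent_for_image is a made-up name and not part of oprofile.

```c
#include <stdio.h>
#include <string.h>

/* Sum the "%" column over all rows whose image name equals `image`.
 * `report` holds one row per line. Rows that do not begin with
 * "<integer> <number> <word>" (for example the header row) are skipped. */
double percent_for_image(const char *report, const char *image)
{
    double total = 0.0;
    const char *line = report;

    while (*line != '\0') {
        /* Copy the current line into a buffer so sscanf cannot read
         * past the newline into the following row. */
        size_t len = strcspn(line, "\n");
        char row[512];
        size_t n = len < sizeof row - 1 ? len : sizeof row - 1;
        memcpy(row, line, n);
        row[n] = '\0';

        unsigned long samples;
        double percent;
        char name[128];
        if (sscanf(row, "%lu %lf %127s", &samples, &percent, name) == 3 &&
            strcmp(name, image) == 0)
            total += percent;

        line += len;
        if (*line == '\n')
            line++;
    }
    return total;
}
```

Applied to the rows listed in the non-cairo Xft profile, the four libfb.so entries add up to 68.1154 + 2.2144 + 1.7988 + 1.1640 = 73.2926 percent of samples. The listing only shows the top symbols, so the real share may be slightly higher.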
Updated•18 years ago (Reporter)
Attachment #213424 -
Attachment is obsolete: true
Attachment #213424 -
Flags: review?(roc)
Comment 10•18 years ago (Reporter)
Cairo build using Xft with no AA gives me 74ms.
Updated•18 years ago
Flags: blocking1.9a1?
Updated•18 years ago (Reporter)
Flags: blocking1.9a1? → blocking1.9+
Updated•17 years ago (Reporter)
Assignee: vladimir → nobody
Whiteboard: linux-platform
Target Milestone: --- → mozilla1.9alpha6
Comment 11•17 years ago
Punting remaining a6 bugs to b1. All of these shipped in a5, so we're at least no worse off by doing so.
Target Milestone: mozilla1.9alpha6 → mozilla1.9beta1
Comment 12•17 years ago
I'm not sure this is still much of an issue -- text renders pretty quickly for me on linux. Can we close this and let other bugs take its place?
Updated•17 years ago
Status: NEW → RESOLVED
Closed: 17 years ago
Resolution: --- → FIXED
Updated•13 years ago
Resolution: FIXED → WORKSFORME