Pango text measurement perf issues with thebes GFX

RESOLVED FIXED in Future

Status: RESOLVED FIXED
Type: defect
Priority: --
Severity: major
Opened: 14 years ago
Updated: 8 years ago
Reporter: bzbarsky
Assignee: bzbarsky
Keywords: perf
Version: Trunk
Target Milestone: Future
Platform: x86 Linux
Flags: blocking1.9+
Attachments: 6 (1 obsolete)

I just ran the following test:

1)  Start a thebes-gtk2 build on a FC4 system.
2)  Set default browser charset to UTF8 in the menu.
3)  Load /usr/share/mime-info/gnome-vfs.keys (or you can unzip and load
    https://bugzilla.mozilla.org/attachment.cgi?id=205680).
4)  Start sysprof
5)  After a few minutes, stop sysprof.
6)  Kill the completely hung process.

For what it's worth, on my system 20+ minutes is not enough for that build to load that file.  A GTK2/XFT build from the same source does it in about 2 minutes, which is not great but a lot better.

I'll attach the sysprof output, but the salient part is that 60.5% of total time  (including the system, etc) is spent under gfxPangoTextRun::MeasureString.

For comparison, in the non-thebes build only about 30% of the time is spent under MeasureText, according to sysprof.

Updated

14 years ago
Summary: Text measurement perf issues with thebes GFX → Pango text measurement perf issues with thebes GFX
another sysprof of the same problem, with debuginfo packages installed for pango/freetype/Xft
Note that Red Hat's default firefox has the same problem (since they build with pango text rendering).  You need to go to View -> Character Encoding and switch to UTF8 to see the problem (otherwise everything gets treated as ASCII).
so after looking at the code that's referenced in the profile and running with XFT_DEBUG=16 (XFT_DBG_REFS), it looks like we're thrashing in Xft's font cache -- it's constantly loading and unloading fonts to deal with the various scripts.

The reason is that:

> int XftMaxFreeTypeFiles = 5;

in xftfreetype.c -- and there's no way to change this value programmatically, short of just declaring it "extern int XftMaxFreeTypeFiles" somewhere (!).
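To illustrate why a limit of 5 hurts so badly once a page needs more fonts than that, here is a minimal sketch of a small cache of open font files. It is a hypothetical model, not Xft's code: the function name `count_font_loads`, the round-robin access pattern, and the least-recently-used eviction policy are all assumptions made for illustration, and Xft's real eviction policy may differ.

```c
#include <assert.h>

/* Hypothetical model of a cache of open font files; not Xft's code. */
#define MAX_SLOTS 64

/* Count how many "file opens" happen when `accesses` lookups cycle
 * through `nfonts` distinct fonts in round-robin order, with room for
 * `capacity` open files and least-recently-used eviction (assumed). */
long count_font_loads(int capacity, int nfonts, long accesses)
{
    int slot_font[MAX_SLOTS];   /* font held by each slot, -1 if empty */
    long slot_used[MAX_SLOTS];  /* time of last access, -1 if never used */
    long loads = 0;

    assert(capacity > 0 && capacity <= MAX_SLOTS);
    assert(nfonts > 0);
    for (int i = 0; i < capacity; i++) {
        slot_font[i] = -1;
        slot_used[i] = -1;
    }

    for (long t = 0; t < accesses; t++) {
        int font = (int)(t % nfonts);
        int hit = -1, victim = 0;
        for (int i = 0; i < capacity; i++) {
            if (slot_font[i] == font) { hit = i; break; }
            if (slot_used[i] < slot_used[victim]) victim = i;
        }
        if (hit < 0) {
            /* Miss: evict the least recently used slot and "reopen". */
            slot_font[victim] = font;
            hit = victim;
            loads++;
        }
        slot_used[hit] = t;
    }
    return loads;
}
```

Under these assumptions, cycling through 6 fonts with 5 slots misses on every single lookup, while 50 slots pay only for the first load of each font. That is the same shape as the constant load/unload traffic seen with XFT_DEBUG=16.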
Something like this fixes it for me; 50 might be too high, but 5 is definitely too low.  Not sure what an appropriate compromise value would be.
Attachment #214747 - Flags: review?(bzbarsky)
20:02 <@dbaron> vlad, if the intent is to increase it, you might want to test
                the old value as well to make sure you're actually increasing
                rather than decreasing
20:03 <@dbaron> (the pango folks could decide it should be 100 for the next
                release)

(fixed)
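A minimal sketch of the "only ever increase it" check suggested above. The helper name `raise_xft_file_limit` is hypothetical, and the variable is defined locally here only so the sketch compiles on its own. In a real build that definition would be replaced by the `extern int XftMaxFreeTypeFiles;` declaration, which links only if the installed Xft actually exports the symbol. This is not necessarily what the attached patch does.

```c
#include <assert.h>

/* Stand-in for the variable defined in Xft's xftfreetype.c. In a real
 * build this line would instead be the declaration
 *     extern int XftMaxFreeTypeFiles;
 * which assumes the library exports that symbol. */
int XftMaxFreeTypeFiles = 5;

/* Hypothetical helper: raise the limit to `wanted`, but never lower a
 * value that is already higher (for example if a future release ships
 * with a larger default). Returns the limit now in effect. */
int raise_xft_file_limit(int wanted)
{
    assert(wanted > 0);
    if (XftMaxFreeTypeFiles < wanted)
        XftMaxFreeTypeFiles = wanted;
    return XftMaxFreeTypeFiles;
}
```

With this shape, calling `raise_xft_file_limit(50)` on a library whose default is already 100 would leave the limit at 100 rather than shrinking it.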
Oy.  I'm not sure I'm qualified to review this.  :(  I really don't know enough about this sort of thing to tell whether it's safe to poke random symbols in a different library like that.  I _can_ test this, of course, and I think I can sr, but I'd really like someone more competent to r=.

Should we file a bug on the xft folks to make this configurable?
Luckily, while we figure out how best to fix this, there seems to be a further bug -- with this applied, my page rendering just freezes after scrolling past a certain point in that file.  It's pretty near the top, so I don't think it's anything coordinate related, but I'm guessing something somewhere throws an error and we propagate it up to cairo (which bails on error and turns all further drawing into a no-op).  I need to add some better error-checking and notification to thebes in debug builds...
Bug on file for the error-prop case you just described?
OK, so I ran with that patch.  No font cache thrashing (as per XFT_DEBUG=16), but I still see something like 70+% of the time spent under nsTextFrame::MeasureText.  The time under gfxPangoTextRun::MeasureString is around 66%.  I let it run for about 10 minutes (not profiling that whole time) before killing the process.

For reference, the non-thebes (gtk2/xft) build from the same source tree is done rendering the page in about 90 seconds.
Attachment #214747 - Flags: review?(bzbarsky) → review?(pavlov)
Hmm, I wonder what's going on; for me, with that patch, it finishes in a similar amount of time to 1.5 (maybe around 10-15% slower?).  I guess I should measure that so I can give some real numbers...
Note that I'm comparing to current trunk, not 1.5... I dunno that much has changed here since 1.5, of course.

Attachment #214867 - Attachment is patch: false
Attachment #214867 - Attachment mime type: text/plain → application/zip
I reprofiled today with roc's native theme stuff (and vlad's patch for this bug still applied).  The results are:

Total hit count: 1462711
  1102212 nsTextFrame::MeasureText
    1053091 nsThebesFontMetrics::GetWidth
      962236 gfxPangoTextRun::Measure
      60966 gfxPangoTextRun::Release
      23156 gfxPangoFontGroup::MakeTextRun

etc.  So not much has changed here.  We're still dog-slow (not repainting for minutes at a time over here), and most of the time is in pango text measurement.
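For comparison with the percentages quoted earlier in this bug, the raw hit counts above can be converted to shares of the total. This is a small sketch; the helper name `percent_of` is an illustrative assumption, not something from the profiler.

```c
#include <assert.h>

/* Share of `total` profile samples attributed to `hits`, in percent. */
double percent_of(long hits, long total)
{
    assert(total > 0);
    return 100.0 * (double)hits / (double)total;
}
```

Here `percent_of(1102212, 1462711)` comes out at roughly 75.4 for nsTextFrame::MeasureText and `percent_of(962236, 1462711)` at roughly 65.8 for gfxPangoTextRun::Measure, in line with the earlier "70+%" and "around 66%" figures.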
What pango version?  I wonder if they fixed something in the FC5 version that significantly improves performance.

Comment 17

13 years ago
there was a lot of pango perf work done between the FC4 and FC5 timeframes
~% rpm -q pango
pango-1.8.1-2

Note that we're aiming to support FC3; I could try to do some similar testing there (I'd need to build an optimized build, and that machine is a _lot_ faster, so I'm not sure what the numbers would look like).

Is there something they fixed in particular that we're hitting?  Can we avoid it?
Blocks: 334720
OK.  So I decided to test our theory from this bug.  As the testcase I used about the first half of the testcase for this bug (so it'd load in some sort of finite amount of time).  Results are from page load start to onload firing, in seconds (according to JS Date.now()).

                                       gtk2            cairo-gtk2
pango-1.8.1/glib2-2.6.4                ~23                ~82
pango-1.12.0/glib2-2.10.1              ~23                ~93

I uninstalled and reinstalled the packages several times to make sure; I'm getting those numbers pretty consistently.

So the newer pango does not in fact help.  :(

Does it matter what fonts you have installed?  I have the following font packages (under FC4; not sure which of these fontconfig necessarily sees):

bitmap-fonts-0.3-4
bitstream-vera-fonts-1.10-5
fonts-arabic-1.5-3
fonts-bengali-1.10-2
fonts-chinese-2.15-2
fonts-gujarati-1.10-2
fonts-hebrew-0.100-4
fonts-hindi-1.10-2
fonts-ISO8859-2-100dpi-1.0-14
fonts-ISO8859-2-1.0-14
fonts-ISO8859-2-75dpi-1.0-14
fonts-japanese-0.20050222-3
fonts-KOI8-R-100dpi-1.0-7
fonts-KOI8-R-1.0-7
fonts-KOI8-R-75dpi-1.0-7
fonts-korean-1.0.11-4
fonts-punjabi-1.10-2
fonts-tamil-1.10-2
fonts-xorg-100dpi-6.8.2-1
fonts-xorg-75dpi-6.8.2-1
fonts-xorg-base-6.8.2-1
fonts-xorg-cyrillic-6.8.2-1
fonts-xorg-ISO8859-9-100dpi-6.8.2-1
fonts-xorg-ISO8859-9-75dpi-6.8.2-1
ghostscript-fonts-5.50-13
kon2-fonts-0.3.9b-26
taipeifonts-1.2-26
tetex-fonts-3.0-9.FC4
urw-fonts-2.3-1

Updated

13 years ago
Flags: blocking1.9a1?
Flags: blocking1.9a1? → blocking1.9+
So I just tracked down the difference between our nightlies and my own optimized build over here to this patch (which I had applied in my tree).  The basic difference was that the nightlies freeze for 7-8 seconds about 80% of the time when I scroll to the bottom of a bug page (like if I want to, say... submit the bug form).  After dealing with this for a week, I've gone back to a non-cairo build because it made doing actual work pretty much impossible.  So could we _please_ get this patch in as a bandaid for now?  :(

Details:  For a while I couldn't reproduce those freezes in my own build, but then tried backing this patch out, and they became quite reproducible.  Profile shows more or less the following callstack:

FT_New_Face
...
pango_shape
...
gfxPangoFontGroup::MakeTextRun
....
nsThebesRenderingContext::GetWidth
nsLayoutUtils::GetStringWidth
reflow stuff
PresShell::FlushPendingNotifications
nsGfxScrollFrameInner::FireScrollPortEvent

About 95% of the time spent under PresShell::FlushPendingNotifications is actually spent under FT_New_Face, mostly in ps_hints_apply, FT_Free, FT_Alloc, FT_Stream_OpenLZW, and the other paraphernalia of thrashing the font cache.
Comment on attachment 214747
bump the Xft cache size to something higher to avoid font thrashing

roc, Stuart says you're a better bet....
Attachment #214747 - Flags: review?(pavlov) → review?(roc)
Posted patch: Merged to tip
Attachment #214747 - Attachment is obsolete: true
Checked in this on top of the other patch
Not sure whether we should resolve this or leave it open in search of a "better way".  I'd be tempted to resolve and file other bugs for any followup issues...
I agree; going to mark this FIXED.  We'll want to revisit this testcase once the new text stuff lands, since it's going to change the performance profile significantly.
Status: NEW → RESOLVED
Closed: 12 years ago
Resolution: --- → FIXED
Target Milestone: --- → Future
Depends on: 367177
Assignee: nobody → bzbarsky