Closed Bug 205621 Opened 21 years ago Closed 21 years ago

Performance degradation over time on Xft-enabled builds

Categories

(Core Graveyard :: GFX: Gtk, defect)

x86
Linux
defect
Not set
normal

Tracking

(Not tracked)

RESOLVED WONTFIX

People

(Reporter: bigglesworth, Assigned: blizzard)

References

Details

(Keywords: perf, relnote)

Attachments

(1 file)

136.84 KB, application/x-gzip
Details
User-Agent:       Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.4b) Gecko/20030513
Build Identifier: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.4b) Gecko/20030513

After browsing for a while (sometimes 10 minutes, sometimes 1 hour), Mozilla's
rendering speed decreases gradually but dramatically, and the entire program
(including preference dialog and menus) becomes increasingly unresponsive.

A restart is the only cure.

When this happens, rendering pages (in particular those with lots of text) takes
extremely long (say a minute instead of a second), scrolling slows down, and
even minimizing/restoring the Mozilla window takes forever. 100% CPU is used in
these cases, but only when Mozilla is actually drawing/rendering something.

It only seems to happen on Xft-enabled builds, but it happens both on Redhat 8
and 9, with the 1.2.1 that ships with RH9, 1.3 and every 1.4 I've tested, and
both with gtk2-enabled builds as well as Xft-enabled only.

I've tried with a fresh profile (same result). I've also tried (seeing that this
might somehow be font related) with freetype 2.1.4 and fontconfig 2.2 as well as
with whatever-version-it-is-that-redhat-ships with RH8 and RH9. Same result.

Sorry for being vague about this: I haven't been able to find a simple way to
reproduce this problem. Please tell me if there is anything I can do to track
this down.

But it is a real problem, and at least I'm not the only one seeing this (see
http://www.mozillazine.org/forums/viewtopic.php?t=9978 ).



Reproducible: Always

Steps to Reproduce:
1. Browse for 10 minutes or an hour.

Actual Results:  
Mozilla gets slow.
Keywords: perf
is Mozilla using lots of memory when it gets slow (check with "top")?

==> gtk
Assignee: general → blizzard
Component: Browser-General → GFX: Gtk
QA Contact: general → ian
I've also heard this complaint, but I don't think I've ever seen it.
Blocks: xft_tracking
Status: UNCONFIRMED → NEW
Ever confirmed: true
> is Mozilla using lots of memory when it gets slow (check with "top")?

The slowness doesn't seem to be directly related to memory use (I've seen it
happen at 35M and 80M according to top). That said, it happens more often when
Mozilla is using more memory (probably just because Mozilla tends to leak over
time, and the likelihood of the slowness-phenomenon kicking in increases with
said time).

Also: as far as I can tell it is not so much of a gradual slowing down, rather
that Mozilla all of a sudden becomes very slow. When that happens, performance
degrades rapidly, though. 
I have seen this in the galeon in debian unstable (1.3.4+cvs), with
mozilla-1.3-5 with xft and gtk2. Creating new tabs becomes dramatically slower.
One more datapoint:

Mozilla leaks X pixmaps badly. According to the handy little tool found at 

http://www.xfree86.org/~mvojkovi/restest.c

Mozilla creates 5-15 pixmaps every time I reload, say, slashdot.org. When Mozilla
has reached 1500 allocated pixmaps or so, it becomes slower on my machine.

Just a thought: I'm on a laptop with very little video memory (8128 kB on an ATI
Rage Mobility). Maybe Mozilla/X tries to use it all at some point (with all the
leaked pixmaps etc.)?
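
As a rough back-of-the-envelope check on the figures above (5-15 leaked pixmaps per reload, slowdown at around 1500 allocated pixmaps), the number of reloads needed to reach that count can be estimated as follows. This is only arithmetic on the reported numbers; the function name is made up for illustration, and it assumes every pixmap counted by restest comes from reloads.

```python
def reloads_until(threshold, leaked_per_reload):
    """Reloads needed to accumulate `threshold` pixmaps (rounded up)."""
    return -(-threshold // leaked_per_reload)  # ceiling division


THRESHOLD = 1500          # pixmap count at which the slowdown was noticed
LEAK_LOW, LEAK_HIGH = 5, 15  # pixmaps leaked per reload, as reported

fastest = reloads_until(THRESHOLD, LEAK_HIGH)
slowest = reloads_until(THRESHOLD, LEAK_LOW)
print(f"between {fastest} and {slowest} reloads")
```

So, under these assumptions, the reported pixmap count would be reached after somewhere between 100 and 300 page reloads.
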
FWIW: I'm seeing exactly the same symptoms with Firebird 0.6 (both with xft and
gtk2+xft builds), as well as with epiphany 0.6.0.
iirc, there was an old version of xft that leaked pixmaps (maybe even the one
that shipped with RH 8.0) if you were displaying against a server that didn't
have render.
Well, I'm running RH9 - so it should be fixed by now, right? Anyway, I'm running
against the local XFree 4.3.0, so render should be there.

However, I noticed that Mozilla doesn't leak pixmaps if I disable image loading
(by means of Preferences/Privacy & Security/Images/Do not load any images).
Yeah, RH9 should have the bulk of those fixes.  If you disable image loading and
the problem goes away, we might have a problem in the pixmap code.

Are you interested in testing this with gtk 1.2 to see if the problem goes
away?  We might have a porting issue - maybe something that needs to be included
in the gtk2 code but isn't?
Unfortunately, "slowing down" and "pixmap leaking" seem to be two different bugs.

1. Pixmap leaking occurs on lots of Mozillas with image loading on
(1.4-20030513 and 1.4-20030519 (both gtk builds from ftp.mozilla.org), Firebird 0.6
from ftp.mozilla.org (without xft), Firebird 0.6-xft-gtk1 and Firebird 0.6-xft-gtk2).

Since it happens even with the ones linked against gtk-1.2, it doesn't seem to
be a porting issue?

2. "Slowing down" only seems to happen with xft, but it happens eventually even
when image loading is off (and thus without pixmap leaking), so the two seem to
be unrelated after all. Sorry for the noise -- but it's still quite bad that
Mozilla leaks pixmaps (and memory)...

Anything else I could try? 
You could point a profiler at it and try to figure out where we're spending all
that time.
I noticed problems like this when using TBE extension on RedHat Linux/FireBird.
http://bugzilla.mozilla.org/show_bug.cgi?id=213744

If you have this installed, try disabling it, and see if the problems go away.
No, I don't use any extensions for Firebird (and see the same thing when running
plain Mozilla), so I don't think that's the problem here.

I made a (very) feeble attempt to get a profiler to work (using the eazel
profiler as per http://www.mozilla.org/performance/tools.html), but couldn't get
the darned thing to work (and rebuilding Mozilla takes an eternity and a half
on my poor laptop).

If there's a better way to profile on Linux, I'd be happy to try that.
There's oprofile.
jprof also works.
Attached file jprof output
Thanks Andrew: jprof works fine! (whereas oprofile seems to be broken on RH9 -
saying that the oprofile module is missing :-( ).

I'm attaching jprof output from a mozilla compiled with gtk2 and --enable-xft,
while browsing a few pages on slashdot (the slowdown is worst on pages with a
lot of text, unsurprisingly).

I don't know nearly enough about Mozilla internals (and jprof output) to
interpret this, though. Hope someone else can...
...and the above attachment is gzipped (due to Bugzilla's size limit), not
text/html. Sorry 'bout that...
Attachment #129496 - Attachment mime type: text/html → application/x-gzip
I have the same problem with a 1.4 final xft build that I compiled myself. If I
have it running for more than a few hours (say, about three), performance
degrades to the point that it is unusable. This is with an Athlon XP 1800+ with
512M PC2100 on Slackware 9.0.
Here are the first dozen or so lines of attachment 129496 (Flat profile) that I
just took a look at.

Total hit count: 3955
Count  %Total  Function Name
 1600    40.5  /usr/X11R6/lib/libXft.so.2
 1015    25.7  _end
  576    14.6  /usr/lib/libfreetype.so.6
   36     0.9  __i686.get_pc_thunk.bx
   19     0.5  /usr/lib/libfontconfig.so.1
 
It's not very helpful because libXft was treated as a black box.
nsFontMetricsXft::GetWidth (for 'char' rather than 'Unichar') called Xft APIs
(perhaps XftTextExtents8 or XftGlyphExtents) most often, which is consistent with
the observation that perf. is worst on pages with a lot of text (comment #16).
Perhaps jprof has to be run again with an unstripped version of libXft.

                6344 /usr/X11R6/lib/libXft.so.2
                1999 nsFontMetricsXft::GetWidth(char const*, unsigned, int&, nsRenderingContextGTK*)
                 839 nsFontXft::FillDrawStringSpec(unsigned*, unsigned, void*)
                 224 nsFontMetricsXft::DrawString(char const*, unsigned, int, int, int const*, nsRenderingContextGTK*, nsDrawingSurfaceGTK*)
                  82 nsFontXft::GetTextExtents32(unsigned const*, unsigned, _XGlyphInfo&)
                  13 nsFontMetricsXft::SetupMiniFont()
                   7 nsFontMetricsXft::DrawString(unsigned short const*, unsigned, int, int, int, int const*, nsRenderingContextGTK*, nsDrawingSurfaceGTK*)
                   3 nsFontMetricsXft::CacheFontMetrics()
                   2 nsFontXft::GetXftFont()
                   1 nsFontMetricsXft::DrawStringCallback(unsigned const*, unsigned, nsFontXft*, void*)
                   1 nsRenderingContextGTK::DrawString(char const*, unsigned, int, int, int const*)
                   1 nsRenderingContextGTK::GetWidth(char const*, unsigned, int&)
 11040 1600     3172 /usr/X11R6/lib/libXft.so.2
                6344 /usr/X11R6/lib/libXft.so.2
                 968 _end
                 593 /usr/lib/libfreetype.so.6
                   9 /usr/X11R6/lib/libXrender.so.1
                   1 /usr/X11R6/lib/libX11.so.6
                   1 /usr/lib/libfontconfig.so.1
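
To put a number on how much of the flat profile quoted above lands in the font stack, the rows can be tallied with a few lines of Python. This is a minimal sketch: the parser only understands the "count, percent, name" rows pasted in this comment (not jprof's full HTML output), and the helper names and the choice of which libraries count as "font stack" are assumptions made for illustration.

```python
# Rows copied from the flat profile quoted in this comment.
FLAT_PROFILE = """\
1600   40.5     /usr/X11R6/lib/libXft.so.2
1015   25.7     _end
576   14.6     /usr/lib/libfreetype.so.6
36   0.9     __i686.get_pc_thunk.bx
19   0.5     /usr/lib/libfontconfig.so.1
"""
TOTAL_HITS = 3955  # "Total hit count" from the same profile

# Assumption: these three libraries together make up the font stack.
FONT_LIBS = ("libXft", "libfreetype", "libfontconfig")


def parse_flat_profile(text):
    """Return (hit_count, name) pairs for each well-formed row."""
    rows = []
    for line in text.splitlines():
        parts = line.split(None, 2)
        if len(parts) != 3 or not parts[0].isdigit():
            continue  # skip headers and blank lines
        rows.append((int(parts[0]), parts[2].strip()))
    return rows


def font_stack_share(rows, total):
    """Return (hits, percent of total) spent in the font libraries."""
    hits = sum(count for count, name in rows
               if any(lib in name for lib in FONT_LIBS))
    return hits, 100.0 * hits / total


hits, share = font_stack_share(parse_flat_profile(FLAT_PROFILE), TOTAL_HITS)
print(f"{hits} of {TOTAL_HITS} samples ({share:.1f}%) in Xft/freetype/fontconfig")
```

With the rows above, more than half of all samples fall inside libXft, libfreetype and libfontconfig, which supports the reading that the time is going into text measurement and glyph handling rather than into Mozilla's own code.
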

I'm getting this progressive sluggishness on Mozilla 1.4 with XFree86 4.3.0
also.  In order to help diagnose the problem, I waited until the browser got
sluggish and then I attached gdb to the process.  I would periodically break and
get a backtrace and then continue.  During the stalled periods (lasting several
seconds usually -- and getting worse over time), I would always get this
backtrace (I'm only including the first few lines):

#1  0x4202aaf9 in rand () from /lib/tls/libc.so.6
#2  0x403c13bd in _XftDisplayManageMemory () from /usr/X11R6/lib/libXft.so.2
#3  0x403c8472 in _XftFontManageMemory () from /usr/X11R6/lib/libXft.so.2
#4  0x403c3c71 in XftGlyphExtents () from /usr/X11R6/lib/libXft.so.2
#5  0x41b7d7e8 in nsFontMetricsXft::DrawStringCallback(unsigned, nsFontXft*, void*) ()
   from /usr/lib/mozilla-1.4/components/libgfx_gtk.so
#6  0x41b7edc6 in nsFontXft::GetMaxDescent() ()
   from /usr/lib/mozilla-1.4/components/libgfx_gtk.so
#7  0x41b7d4e3 in nsFontMetricsXft::EnumerateGlyphs(unsigned*, unsigned, void (*)(unsigned, nsFontXft*, void*), void*) ()
   from /usr/lib/mozilla-1.4/components/libgfx_gtk.so
#8  0x41b7bc70 in nsFontMetricsXft::DrawString(char const*, unsigned, int, int, int const*, nsRenderingContextGTK*, nsDrawingSurfaceGTK*) ()
   from /usr/lib/mozilla-1.4/components/libgfx_gtk.so

The details would vary slightly from break to break, but it would _always_ be in
the function _XftDisplayManageMemory().  If I understand this correctly, it
looks like there's a leak in the glyph cache and _XftDisplayManageMemory doesn't
do a very good job freeing it.
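
The break-and-backtrace approach described above amounts to manual sampling, and the samples can be tallied mechanically. The sketch below counts in how many backtraces each function appears. The first sample reuses the innermost frames quoted above; the second sample is hypothetical (invented here only to show more than one sample), and the regular expression assumes the plain "#N  0xADDR in func (...)" layout that gdb printed in this comment.

```python
import re
from collections import Counter

# Matches "#2  0x403c13bd in _XftDisplayManageMemory () ..." as well as
# frames without an address, and captures the function name.
FRAME_RE = re.compile(r"^#\d+\s+(?:0x[0-9a-f]+ in )?([^\s(]+)")

# Innermost frames from the backtrace quoted in this comment.
SAMPLE_1 = """\
#1  0x4202aaf9 in rand () from /lib/tls/libc.so.6
#2  0x403c13bd in _XftDisplayManageMemory () from /usr/X11R6/lib/libXft.so.2
#3  0x403c8472 in _XftFontManageMemory () from /usr/X11R6/lib/libXft.so.2
#4  0x403c3c71 in XftGlyphExtents () from /usr/X11R6/lib/libXft.so.2
"""

# Hypothetical second sample, made up for illustration only.
SAMPLE_2 = """\
#0  _XftDisplayManageMemory () from /usr/X11R6/lib/libXft.so.2
#1  0x403c8472 in _XftFontManageMemory () from /usr/X11R6/lib/libXft.so.2
#2  0x403c3c71 in XftGlyphExtents () from /usr/X11R6/lib/libXft.so.2
"""


def frames(backtrace):
    """Return the function names in one backtrace, innermost first."""
    names = []
    for line in backtrace.splitlines():
        match = FRAME_RE.match(line)
        if match:
            names.append(match.group(1))
    return names


def tally(samples):
    """Count, per function, the number of samples it appears in."""
    counts = Counter()
    for backtrace in samples:
        counts.update(set(frames(backtrace)))  # once per sample
    return counts


counts = tally([SAMPLE_1, SAMPLE_2])
for name, seen in counts.most_common():
    print(f"{seen}/2  {name}")
```

A function that shows up in every sample taken during a stall, as _XftDisplayManageMemory did here, is a strong hint about where the time is going, even without a working profiler.
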

Could this be a dupe of bug 180328?
Thanks, David. Your backtrace is consistent with the jprof output (attachment
129496).

The patch (to Xft) for bug 180328 might have fixed this as well, but I'm not
sure. Until recently (when I recompiled libXft from the updated cvs copy), I had
been running Mozilla with libXft compiled from the Xft cvs (pulled around
mid-May; not sure if that was after Keith's check-in of the patch for bug 180328)
and hadn't observed the problem reported here.

David, can you compile libXft from the cvs copy (it's at http://fontconfig.org,
not at http://www.xfree86.org) and see if the problem goes away?
Xft has a limit on the amount of memory that glyph images can consume in the
server; this keeps the server from growing huge when rendering a page with lots
of different glyphs.  However, the cache management code was freeing fonts but
not counting the memory used by the related glyphs, so the available memory for
glyph storage kept decreasing. Eventually you'd run completely out and the
library would re-rasterize every glyph each time it was drawn.

This bug is fixed in Xft version 2.1.2, which is available at
fontconfig.freedesktop.org
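
The accounting problem described above can be illustrated with a toy model. This is not Xft's code: the class, its fields and the fixed glyph size are all invented for illustration, and the model simply refuses to cache a glyph when the budget looks exhausted (the real library's eviction strategy is not modeled here). What it does reproduce is the described failure: if freeing a font does not give its glyph memory back to the budget, the cache eventually believes it is full and every draw has to rasterize again.

```python
class GlyphCache:
    """Toy model of a glyph memory budget (hypothetical, not Xft's code)."""

    def __init__(self, budget, buggy):
        self.budget = budget      # max bytes of glyph images to keep
        self.used = 0             # bytes the cache believes are in use
        self.buggy = buggy        # True: freeing a font leaks accounting
        self.fonts = {}           # font name -> {glyph id: size in bytes}
        self.rasterizations = 0   # how often a glyph had to be rendered

    def draw(self, font, glyph, size=100):
        glyphs = self.fonts.setdefault(font, {})
        if glyph in glyphs:
            return                # cache hit, nothing to rasterize
        self.rasterizations += 1
        if self.used + size > self.budget:
            return                # budget looks exhausted: do not cache
        glyphs[glyph] = size
        self.used += size

    def free_font(self, font):
        glyphs = self.fonts.pop(font, {})
        if not self.buggy:
            # The fix: give the freed glyphs' memory back to the budget.
            self.used -= sum(glyphs.values())


def simulate(buggy, pages=10, glyphs_per_page=20, budget=10_000):
    """Return the number of rasterizations needed for each page."""
    cache = GlyphCache(budget, buggy)
    per_page = []
    for page in range(pages):
        before = cache.rasterizations
        font = f"font-{page}"
        for _ in range(2):        # every glyph is drawn twice per page
            for glyph in range(glyphs_per_page):
                cache.draw(font, glyph)
        cache.free_font(font)
        per_page.append(cache.rasterizations - before)
    return per_page


print("fixed:", simulate(buggy=False))
print("buggy:", simulate(buggy=True))
```

In the fixed variant each page costs the same number of rasterizations throughout. In the buggy variant the cost per page doubles once the phantom usage reaches the budget, and it never recovers, which matches the "fine for a while, then suddenly slow until restart" behaviour reported in this bug.
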
Thanks, Keith, for the note. 
Asa, do we have to release-note this and bug 180328? The Xft build is not the default Linux
build, but most Linux users use Xft builds these days... Both bugs (and one or two more
bugs) are fixed on the Xft side.
Keywords: relnote
Would updating xft itself to 2.1.2 require a recompile of Mozilla?
No, unless you statically linked Mozilla to Xft, which is very unlikely if possible at all. 
It's certainly not the case that most users of the builds provided by
mozilla.org use XFT (fewer than 20% of the linux downloads from ftp.mozilla.org
are contributed/xft builds) and the release notes we provide are specifically
for the mozilla.org builds, not for some other distribution with differently
configured/compiled products. Other distributions or contributed builds should
have their own release notes. 
I can confirm that xft 2.1.2 from fontconfig.freedesktop.org fixes this problem
for me - Mozilla's CPU usage stays pretty constant over time. 

Overall performance is still degrading over time, though (but much, much less
than with the default libXft in RH8 and RH9), probably due to the huge memory
leaks in Mozilla. Browsing and mail/news-reading easily kick Mozilla up to 100
megs RSS, which obviously hurts performance.

It would be great to have a readme note on the xft problem and solution
somewhere, as the relationship between XFree86 and xft versions is less than
obvious (at least to me). Also, Mozilla seems to be the only program hurt by the
libXft bug; e.g. Konqueror works just fine with a stock libXft.
Marking as WONTFIX due to the upstream nature of the problem.
Status: NEW → RESOLVED
Closed: 21 years ago
Resolution: --- → WONTFIX
Product: Core → Core Graveyard