Open Bug 559110 Opened 14 years ago Updated 2 years ago

Autoscrolling causes Xorg to use ~98% CPU when using mutter / gnome-shell

Categories

(Core :: Web Painting, defect)

x86_64
Linux
defect

Tracking

UNCONFIRMED

People

(Reporter: drago01, Unassigned)

Details

Attachments

(2 files)

User-Agent:       Mozilla/5.0 (X11; U; Linux x86_64; de-DE; rv:1.9.1.9) Gecko/20100330 Fedora/3.5.9-2.fc12 Firefox/3.5.9
Build Identifier: Mozilla/5.0 (X11; U; Linux x86_64; de-DE; rv:1.9.1.9) Gecko/20100330 Fedora/3.5.9-2.fc12 Firefox/3.5.9

When using mutter (a compositing manager based on metacity that uses clutter / OpenGL for drawing on screen) or gnome-shell (which is based on mutter; http://live.gnome.org/GnomeShell/), autoscrolling causes the Xorg process to use 98% CPU, which means I cannot stop the scrolling until it reaches the end of the page.

This is not 100% reproducible, but it does happen with sites that have lots of vertical content (i.e. enough room to scroll for a while).

It does not happen when using metacity or compiz, but after talking with a mutter developer it seems that "maybe firefox is just using an algorithm that fails if the ratio of mouse events to redraw speed is wrong or something".

Hence the report.

This system uses an NVIDIA GPU, but there has been at least one report of this happening on an Intel GPU, so a driver issue is unlikely.

Reproducible: Sometimes

Steps to Reproduce:
1. Start gnome-shell or just mutter
2. Open firefox
3. Open a site that contains enough content (like planet.gnome.org)
4. See it happen
Actual Results:  
Xorg uses 98% CPU and scrolling cannot be stopped until it reaches the end.

Expected Results:  
It should not DoS the X server, and it should allow me to stop scrolling any time I want.
Component: General → Layout: View Rendering
Product: Firefox → Core
QA Contact: general → layout.view-rendering
On what OS does this happen?  How can it be easily reproduced?
(In reply to comment #1)
> On what OS does this happen?  How can it be easily reproduced?

Fedora 12 x86_64 using the Fedora supplied Firefox rpm.

As I wrote in the initial report, it is not 100% reproducible, but it is pretty easy to trigger (I'd say it happens in about 3 out of 5 tries).
How much CPU are Firefox and the compositing manager using?
(Usually Firefox would use a comparable proportion of CPU to that of X.
I don't know how many cores are on your machine.)

It would also be interesting to compare a more recent Firefox
(nightly and/or 3.6).
http://ftp.mozilla.org/pub/mozilla.org/firefox/nightly/latest-mozilla-central/
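
For the CPU comparison asked about above, a minimal sketch of one way to sample per-process CPU use on Linux follows. It is not part of the original report. It assumes the standard `/proc/<pid>/stat` layout (utime and stime are fields 14 and 15), and the helper names `read_cpu_ticks`, `cpu_percent` and `sample` are made up for illustration.

```python
import os
import time


def read_cpu_ticks(pid):
    """Return utime + stime (in clock ticks) for a process, from /proc/<pid>/stat."""
    with open(f"/proc/{pid}/stat") as f:
        data = f.read()
    # The command name (field 2) is parenthesised and may contain spaces,
    # so split on the last ')' before splitting the remaining fields.
    fields = data.rsplit(")", 1)[1].split()
    # fields[0] is field 3 (state); utime and stime are fields 14 and 15.
    return int(fields[11]) + int(fields[12])


def cpu_percent(ticks_before, ticks_after, interval_s, ticks_per_s):
    """CPU use over the interval, as a percentage of ONE core."""
    return 100.0 * (ticks_after - ticks_before) / (ticks_per_s * interval_s)


def sample(pids, interval_s=1.0):
    """Sample several processes over the same interval (hypothetical helper)."""
    ticks_per_s = os.sysconf("SC_CLK_TCK")
    before = {pid: read_cpu_ticks(pid) for pid in pids}
    time.sleep(interval_s)
    return {
        pid: cpu_percent(before[pid], read_cpu_ticks(pid), interval_s, ticks_per_s)
        for pid in pids
    }
```

Calling `sample([...])` with the PIDs of Xorg, Firefox and the compositing manager while autoscrolling would give the three figures side by side. The values are relative to a single core, so ~98% for Xorg means one core is saturated regardless of how many cores the machine has.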
(In reply to comment #3)
> How much CPU are Firefox and the compositing manager using?
> (Usually Firefox would use a comparable proportion of CPU to that of X.
> I don't know how many cores are on your machine.)

No, the only significant CPU consumer is X (firefox ~10%, mutter ~3-6%).

As for the number of cores, it is a quad-core with hyperthreading (Core i7).

> It would also be interesting to compare a more recent Firefox
> (nightly and/or 3.6).
> http://ftp.mozilla.org/pub/mozilla.org/firefox/nightly/latest-mozilla-central/

I tried today's nightly (x86_64) but unfortunately it does not seem to be any better.
Thanks.  That at least excludes the involvement of GDK_POINTER_MOTION_HINT_MASK here.  (It would be used in the Fedora build, but not the Mozilla build.)

Sounds like the first thing to investigate here is whether the mouse events are getting to Firefox (and Firefox is too busy scrolling to read them) or whether something else is holding up the mouse events.
This occurs for me on both Fedora and Ubuntu w/ Gnome+Compiz on x86_64. It seems to occur frequently on long Slashdot threads with many tabs open and Flash eating up the CPU.
Bug 564991 may help here, depending on what the root cause really is.
(In reply to comment #0)
> This system uses a NVIDIA GPU but there has been at least one report of this
> happening on a INTEL GPU so a driver issue is unlikely.

If this is only happening on some systems and can't be reproduced by everyone with mutter/gnome-shell and firefox, then it may be a driver/configuration issue.  e.g. lack of RENDER acceleration support.

drago01: what driver are you using for your NVIDIA card?

(Perhaps a similar problem could occur with an Intel GPU if they have a driver still using XAA without 'Option "XAANoOffscreenPixmaps" "on"'.)
(In reply to comment #8)
> (In reply to comment #0)
> > This system uses a NVIDIA GPU but there has been at least one report of this
> > happening on a INTEL GPU so a driver issue is unlikely.
> 
> If this is only happening on some systems and can't be reproduced by everyone
> with mutter/gnome-shell and firefox, then it may be a driver/configuration
> issue.  e.g. lack of RENDER acceleration support.
> 
> drago01: what driver are you using for your NVIDIA card?

The proprietary one; which has pretty much complete RENDER acceleration on this GPU.
(In reply to comment #9)
> (In reply to comment #8)
> > (In reply to comment #0)
> > > This system uses a NVIDIA GPU but there has been at least one report of this
> > > happening on a INTEL GPU so a driver issue is unlikely.
> > 
> > If this is only happening on some systems and can't be reproduced by everyone
> > with mutter/gnome-shell and firefox, then it may be a driver/configuration
> > issue.  e.g. lack of RENDER acceleration support.
> > 
> > drago01: what driver are you using for your NVIDIA card?
> 
> The proprietary one; which has pretty much complete RENDER acceleration on this
> GPU.

I had autoscrolling disabled for a while due to this bug. I re-enabled it today and found that I cannot reproduce it when clutter does sub-stage redraws rather than always redrawing the whole screen, which it was doing for windows with a height > 300.

Reverting the patch and trying for a bit causes it to happen again, so we might indeed be triggering a driver bug here, but that wouldn't explain why the same happens on Intel, where the driver uses completely different code paths.
Possibly Firefox is sending multiple requests to the server to perform repeated scrolls but never waiting for any kind of response from the server to indicate that the server has caught up.
I'm not sure that it makes sense for all clients performing animations to check that the server has caught up.  A limit on client request queue sizes in the server seems a possible solution here.  I don't know what the current limit is.  Perhaps slow drivers (or operations) can make the queue take a long time to empty.
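
The hypothesis in the two comments above can be illustrated with a toy model. This is only a sketch: it is not Firefox, Xlib or X server code, the rates are invented, and `ticks_until_scrolling_stops` and `max_outstanding` are hypothetical names. It compares a client that queues scroll requests without ever waiting against one that caps its outstanding requests (roughly what waiting on a reply, e.g. via `XSync`, would achieve).

```python
from collections import deque


def ticks_until_scrolling_stops(client_rate, server_rate, stop_tick,
                                max_outstanding=None):
    """Simulate a client queuing scroll requests to a slower server.

    Returns how many ticks pass between the user's stop request and the
    server finishing the last queued scroll.
    """
    queue = deque()
    tick = 0
    while True:
        # Client: issue scroll requests until the user asks to stop.
        if tick < stop_tick:
            for _ in range(client_rate):
                if max_outstanding is not None and len(queue) >= max_outstanding:
                    break  # wait for the server to catch up
                queue.append(tick)
        # Server: process up to server_rate queued requests this tick.
        for _ in range(min(server_rate, len(queue))):
            queue.popleft()
        if tick >= stop_tick and not queue:
            return tick - stop_tick
        tick += 1


# Client produces 5 requests per tick, server handles 1 per tick.
unthrottled = ticks_until_scrolling_stops(5, 1, stop_tick=100)
throttled = ticks_until_scrolling_stops(5, 1, stop_tick=100, max_outstanding=2)
print(unthrottled, throttled)
```

In this model the unthrottled client leaves a backlog that keeps the server busy long after the stop request, which matches the symptom of scrolling that cannot be interrupted. The capped client stops almost immediately. Whether the real behaviour has this cause is exactly what remains unconfirmed in this bug.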
Attached file gdb backtrace
I've observed this bug with both fglrx and intel graphics drivers, and on both Debian squeeze (at various times) and Ubuntu (I don't remember what version it was, 9.04 or 10.04), running GNOME and metacity. I'm currently using iceweasel 3.6.10-1 from Debian's experimental repository.
Reproducibility varies with webpage, but the current Google image search results pages reproduce really well. It freezes indefinitely, long enough for me to grab a backtrace with gdb, and some oprofile samples. I couldn't get call information from opreport, but I'm attaching the report anyway because it confirms that the backtrace is relevant.
Attached file output of opreport -l
Thanks for the profile data, Simon.

This bug involves CPU time spent in the X server, while the data you've attached points to time spent in GDK, so it seems you are seeing different symptoms.  Whether or not the cause is the same, I don't know.

Scrolling is quite different in Firefox 4 and I don't think the code showing up in your profiles gets used (at least not in the same way), so your issue may have been resolved already.
Component: Layout: View Rendering → Layout: Web Painting

Related issue here, but FF itself has the CPU spike, not Xorg.

I'm on Alpine Linux edge x86_64, Firefox ESR v68, i3wm, and no desktop environment.

Severity: normal → S3