Closed Bug 93620 Opened 23 years ago Closed 21 years ago

100% CPU use just moving the mouse around on a large page

Categories

(Core :: DOM: UI Events & Focus Handling, defect, P4)

x86
All
defect

Tracking

()

RESOLVED WORKSFORME
Future

People

(Reporter: jesup, Assigned: waterson)

Details

(Keywords: perf, testcase)

Attachments

(2 files)

0.9.3 official (2001080110) Win2K SP2

I browsed to a jprof .html page (1.3MB) stored on a local server.  Just waving
the mouse around in an empty area causes 100% CPU use and a jerky mouse.

To reproduce: use a jprof-enabled build, set JPROF_FLAGS to "JP_START
JP_REALTIME JP_PERIOD=0.005", run mozilla, browse to the gecko colored table
stress-test (above), and quit or stop jprof.  Then run

 ./jprof mozilla-bin ./jprof-log > tmp.html

and browse to tmp.html from within mozilla.  It takes a LONG time to finish
loading (somewhat understandable, but too long).  Wave the mouse over the page,
watch CPU peg.
Hyatt told me to reassign this to Chris Waterson
Assignee: joki → waterson
Keywords: perf
For reference: this also applies to most large files in lxr.  On the attachment,
moving the mouse around in IE takes only ~5% CPU, 10% if I strum it over links. 

Another interesting note about this case: Watching a CPU meter (windows task
manager), total CPU use for IE on this file starts at 10% and runs to 25%. 
Mozilla quickly hits 100% and stays there (and takes longer to load to boot). 
(This is over a 640Kbps connection, actually downloading it, 750MHz Athlon,
384MB, Win2k).

While this may well be reflow chewing all available cycles while loading, this
is a problem.  Perhaps it can be tuned to avoid this, or perhaps we can add some
heuristics to reduce reflows if they're unlikely to affect the currently-visible
page.

It might be nice to collect a jprof/etc of loading this file.
Blocks: 71668
Blocks: 91351
No longer blocks: 71668
Loaded the first attachment.  Started jprof.  Waved mouse over page (mostly in
the blank area).

I think the flat portion of the jprof I just posted clearly shows the culprit:

Total hit count: 646
Count  %Total  Function Name
314    48.6    nsContainerFrame::GetFrameForPointUsing(nsIPresContext *, nsPoint &, nsIAtom *, nsFramePaintLayer, int, nsIFrame **)
150    23.2    nsFrame::GetNextSibling(nsIFrame **) const
 59     9.1    nsRect::Contains(int, int) const
 38     5.9    nsContainerFrame::GetFrameForPoint(nsIPresContext *, nsPoint &, nsFramePaintLayer, nsIFrame **)
 31     4.8    nsFrame::GetFrameForPoint(nsIPresContext *, nsPoint &, nsFramePaintLayer, nsIFrame **)

Note that I did this test on a dual-CPU Linux box, but it clearly shows the
problem.  OS->All
OS: Windows 2000 → All
My checkin for bug 91794 this morning might have helped this (on Linux). 
GetFrameForPoint does use overflow-checking (it doesn't search the children of a
frame that has no overflow and doesn't contain the point), but it still has to
check everything.  Nobody has yet suggested an optimization that doesn't break
it in some cases.
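
For illustration, the overflow check described above can be sketched as follows. This is a minimal model under stated assumptions, not the Gecko code: Rect, Frame, hasOverflow and HitTest are hypothetical stand-ins, and all rects are assumed to share one coordinate space. It shows why a flat parent that contains the point still forces a visit to every child.

```cpp
#include <vector>

// Hypothetical, simplified stand-ins for illustration; these are not the
// real Gecko nsRect/nsIFrame types. All rects share one coordinate space.
struct Rect {
  int x, y, width, height;
  bool Contains(int px, int py) const {
    return px >= x && px < x + width && py >= y && py < y + height;
  }
};

struct Frame {
  Rect rect;                     // the frame's own box
  bool hasOverflow;              // descendants may paint outside rect
  std::vector<Frame*> children;  // in paint (document) order

  explicit Frame(Rect r, bool overflow = false)
      : rect(r), hasOverflow(overflow) {}
};

// Returns the topmost frame under (px, py), or nullptr if there is none.
// `visited` counts how many frames were examined.
const Frame* HitTest(const Frame& frame, int px, int py, int& visited) {
  ++visited;
  const bool inside = frame.rect.Contains(px, py);
  // Prune: no overflow and the point is outside, so no descendant can match.
  if (!inside && !frame.hasOverflow) return nullptr;

  const Frame* hit = inside ? &frame : nullptr;
  // No early exit: later siblings paint on top, so the last match wins.
  for (const Frame* child : frame.children) {
    if (const Frame* childHit = HitTest(*child, px, py, visited)) {
      hit = childHit;
    }
  }
  return hit;
}
```

In this model a point outside the root costs one visit, while a point inside a root with N children costs N + 1 visits, which matches the shape of the profile above.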
Tried again: pulled and rebuilt.  No apparent change in jprof output.
nsContainerFrame::GetFrameForPointUsing() probably should break out of the loop
when it gets a hit (when we set *aFrame).  Unless there's a reason that we
should be returning the last hit instead of the first, in which case we could
doubly-link the frames and search back from the end.

We're spending (perhaps because of the above) 1/4 of our time in GetNextSibling:
NS_IMETHODIMP nsFrame::GetNextSibling(nsIFrame** aNextSibling) const
{
  *aNextSibling = mNextSibling;
  return NS_OK;
}

This _really_ should be inlined.
And yes, it may not be possible to inline GetNextSibling when called from an
outside class because the .h specifies NS_IMETHOD, which is virtual.
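
A minimal sketch of the inlining idea, assuming a simplified class: Frame here is a stand-in, int stands in for nsresult, and NextSiblingFast() is an invented name, not an existing Gecko method. It contrasts the virtual out-parameter getter with a non-virtual accessor that a compiler is free to expand at the call site.

```cpp
// Hypothetical, simplified frame class; `int` stands in for nsresult and
// NextSiblingFast() is an invented name, not an existing Gecko method.
class Frame {
 public:
  virtual ~Frame() {}

  // Same shape as the quoted method: virtual, result via out-parameter.
  virtual int GetNextSibling(Frame** aNextSibling) const {
    *aNextSibling = mNextSibling;
    return 0;  // stands in for NS_OK
  }

  // Non-virtual accessor defined in the class body, so it is implicitly
  // inline and a compiler can expand it at the call site.
  Frame* NextSiblingFast() const { return mNextSibling; }

  void SetNextSibling(Frame* aNext) { mNextSibling = aNext; }

 private:
  Frame* mNextSibling = nullptr;
};

// Walks the sibling list through the virtual getter.
int CountSiblingsVirtual(Frame* aFirst) {
  int count = 0;
  for (Frame* f = aFirst; f;) {
    ++count;
    Frame* next = nullptr;
    f->GetNextSibling(&next);
    f = next;
  }
  return count;
}

// Same walk through the inline accessor.
int CountSiblingsInline(Frame* aFirst) {
  int count = 0;
  for (Frame* f = aFirst; f; f = f->NextSiblingFast()) ++count;
  return count;
}
```

Both walks return the same count; whether the second is measurably faster would have to be confirmed with a profile, since this sketch makes no timing claim.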
As dbaron pointed out in IRC, we can't break out on a hit because we want to
look for hits in reverse order (because paint occurs in forward order, and
unlike html3, things can overlap (I assume)).

So the options are: find a (faster) way to limit the nodes looked at;
double-link the frame lists and walk backwards to find the frame (so we can stop
on the first hit); ????

I'm working on the double-linking.  It will cost a pointer per nsFrame object.
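
To make the trade-off concrete, here is a rough sketch of the two walks, assuming a flat sibling list with no child recursion. The types and the names LastHitForward, FirstHitBackward and LinkSiblings are hypothetical, and the prev pointer is the extra per-frame cost mentioned above.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical, simplified types for illustration; not Gecko's.
struct Rect {
  int x, y, width, height;
  bool Contains(int px, int py) const {
    return px >= x && px < x + width && py >= y && py < y + height;
  }
};

struct Frame {
  Rect rect;
  Frame* next;  // existing forward link
  Frame* prev;  // the extra pointer that double-linking would add
};

// Current shape: walk forward, remember the last match, never stop early.
const Frame* LastHitForward(const Frame* first, int px, int py, int& visited) {
  const Frame* hit = nullptr;
  for (const Frame* f = first; f; f = f->next) {
    ++visited;
    if (f->rect.Contains(px, py)) hit = f;
  }
  return hit;
}

// With back links: walk from the end and stop at the first match, which is
// the same frame the forward walk would have reported last.
const Frame* FirstHitBackward(const Frame* last, int px, int py, int& visited) {
  for (const Frame* f = last; f; f = f->prev) {
    ++visited;
    if (f->rect.Contains(px, py)) return f;
  }
  return nullptr;
}

// Helper for the example: links a contiguous array of frames as siblings.
void LinkSiblings(std::vector<Frame>& frames) {
  for (std::size_t i = 0; i < frames.size(); ++i) {
    frames[i].prev = (i > 0) ? &frames[i - 1] : nullptr;
    frames[i].next = (i + 1 < frames.size()) ? &frames[i + 1] : nullptr;
  }
}
```

The backward walk only wins when the hit is near the end of the list; a mouse near the top of a long page still visits almost every sibling.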
BTW, in other browsers I've seen things where there was a "FirstVisibleElement"
and "LastVisibleElement" for a (scrollable, positionable) document (i.e. html
frame/iframe/etc).  This amazingly cuts down on the nodes to search.  Is there
any way we could make a similar optimization?  I'm guessing that's not possible
in CSS/HTML4 the way it is in HTML3.x, but I'd love to be proved wrong, or find
some other fast way to eliminate (looking at) nodes.
To capture my thoughts:

Searching back from the end will save on average half, but in many common cases
it would save little.  Still, 1/2 isn't chicken feed.  Cost would be 4
bytes/frame.

Better would be some fairly efficient way to determine which frames are actually
visible, and to check only those frames.  We do look at the containing rect, but
we override that if there's overflow, and more problematically, large, flat
documents (like jprofs and the like) have a giant BlockFrame/ContainerFrame with
a zillion lines, all of which we look into.

If we had some way to minimize the overhead of looking at all those siblings; to
look and test only those that might be the ones (might be visible), there'd be a
giant win.  Perhaps we can key off of nsXxxFrame::Paint and set state bits for
possibly visible; I'm not sure that will be enough.  Or we could use a linked
list of frames that actually had something visible to paint in the last ::Paint
(again, using 4 bytes per frame), and then we can VERY quickly check those. 
Having a fast list of visible items might come in handy elsewhere too....  Hmmm.

I'm open for ideas.
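
One way the "list of frames painted last time" idea might look, as a sketch only. PaintedFrameCache, RecordPaint and the types are hypothetical; it uses a side vector rather than an intrusive per-frame link, and it assumes the cache is rebuilt on every paint, scroll or reflow (otherwise it would return stale frames).

```cpp
#include <cstddef>
#include <vector>

// Hypothetical, simplified types for illustration; not Gecko's.
struct Rect {
  int x, y, width, height;
  bool Contains(int px, int py) const {
    return px >= x && px < x + width && py >= y && py < y + height;
  }
  bool Intersects(const Rect& o) const {
    return x < o.x + o.width && o.x < x + width &&
           y < o.y + o.height && o.y < y + height;
  }
};

struct Frame {
  Rect rect;
};

class PaintedFrameCache {
 public:
  // Called from the paint pass: remember, in paint order, every frame that
  // intersected the viewport. The pointers stay valid only while `frames`
  // is not modified.
  void RecordPaint(const std::vector<Frame>& frames, const Rect& viewport) {
    mPainted.clear();
    for (const Frame& f : frames) {
      if (f.rect.Intersects(viewport)) mPainted.push_back(&f);
    }
  }

  // Hit-tests only the remembered frames, last painted (topmost) first.
  const Frame* HitTest(int px, int py) const {
    for (std::size_t i = mPainted.size(); i-- > 0;) {
      if (mPainted[i]->rect.Contains(px, py)) return mPainted[i];
    }
    return nullptr;
  }

  std::size_t Size() const { return mPainted.size(); }

 private:
  std::vector<const Frame*> mPainted;
};
```

With 1000 stacked lines and a viewport covering ten of them, a mouse move tests at most ten rects instead of a thousand. The cost moves to keeping the list correct whenever the visible set changes.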
I imagine we have performance problems with painting on pages like this too. 
Perhaps what we need to do is take blocks with lots of children and group them
into sub-blocks of a few hundred lines each.  This would help as long as those
sub-blocks didn't have overflow.
Can't we use the line structures as the sub-blocks?

I worry (not knowing enough about how the frames interact with layout/etc) how
many details would have to be dealt with to make new pseudo-blocking work.  Of
course it may not be that bad.
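
A rough sketch of the sub-block grouping, assuming none of the grouped lines has overflow. Chunk, BuildChunks and HitTestChunked are hypothetical names, and lines are modelled as plain rects rather than real line or frame structures.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical, simplified types for illustration; not Gecko's.
struct Rect {
  int x, y, width, height;
  bool Contains(int px, int py) const {
    return px >= x && px < x + width && py >= y && py < y + height;
  }
};

struct Chunk {
  Rect bounds;             // union of the rects of the lines in the chunk
  std::size_t begin, end;  // half-open index range [begin, end) into lines
};

// Groups consecutive lines into chunks of at most chunkSize lines each.
std::vector<Chunk> BuildChunks(const std::vector<Rect>& lines,
                               std::size_t chunkSize) {
  std::vector<Chunk> chunks;
  if (chunkSize == 0) return chunks;
  for (std::size_t begin = 0; begin < lines.size(); begin += chunkSize) {
    const std::size_t end = std::min(begin + chunkSize, lines.size());
    int left = lines[begin].x, top = lines[begin].y;
    int right = left + lines[begin].width;
    int bottom = top + lines[begin].height;
    for (std::size_t i = begin + 1; i < end; ++i) {
      left = std::min(left, lines[i].x);
      top = std::min(top, lines[i].y);
      right = std::max(right, lines[i].x + lines[i].width);
      bottom = std::max(bottom, lines[i].y + lines[i].height);
    }
    chunks.push_back(
        Chunk{Rect{left, top, right - left, bottom - top}, begin, end});
  }
  return chunks;
}

// Returns the index of the topmost (last painted) line containing the point,
// or -1. `tested` counts rect tests, for chunks and lines alike.
long HitTestChunked(const std::vector<Rect>& lines,
                    const std::vector<Chunk>& chunks, int px, int py,
                    int& tested) {
  for (std::size_t c = chunks.size(); c-- > 0;) {
    ++tested;
    if (!chunks[c].bounds.Contains(px, py)) continue;  // skip whole chunk
    for (std::size_t i = chunks[c].end; i-- > chunks[c].begin;) {
      ++tested;
      if (lines[i].Contains(px, py)) return static_cast<long>(i);
    }
  }
  return -1;
}
```

For 1000 lines in chunks of 100, a hit costs at most 10 chunk tests plus 100 line tests. Any line with overflow would have to widen its chunk's bounds or be handled outside the chunks, which is where the layout details mentioned above come in.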

issue raised in performance meeting today.  probably should investigate for 0.9.5...

dbaron/Randell/waterson?  who is interested in taking this bug?
Blocks: 71668
Target Milestone: --- → mozilla0.9.5
Another Seamonkey-only issue is the way the current tooltip code works.  We're
executing JS code all the time when you move the mouse.  I plan to optimize this
code and move it into C++.
excellent!  :-) hyatt is the winner!  reassigning...
Assignee: waterson → hyatt
I think hyatt was thinking of another issue.  The problem this is reporting is
NOT with JS being run.  It's with iterating over large, flat documents (see
.performance discussion over the last few days in response to the perf agenda
from Friday).
No, wait.  There's a separate bug on the tooltip stuff already.  I was just
pointing out a second perf issue with mouse moves.
okay, sorry.  back to original owner of the bug -> waterson
Assignee: hyatt → waterson
Blocks: 97345
I'm not exactly sure what we're trying to fix here -- waving the mouse around on
a page and being disappointed with the CPU usage doesn't seem like a crucial
performance problem to me. If this has impact on real user actions, could
someone clarify?
Status: NEW → ASSIGNED
Priority: -- → P4
Target Milestone: mozilla0.9.5 → Future
First, massive CPU use on any mouse movement will cause problems for anything
else that's running, like plugins, video streams, java, even network interfaces.

Second, the CPU use gets so bad under Win2000 that it makes the mouse jerky and
hard to control (updates of mouse position seemed to occur at about 5 hertz on a
750 MHz machine).

Please un-future.
This bug is supposed to block 97345 which is fixed. Any comment?
We solved 97345 by side-stepping the issue and avoiding walking/regenerating the
lists when inserting options.
Keywords: testcase
QA Contact: madhur → rakeshmishra
Using trunk build 2002060408 on WinXP (1.1 GHz, 512 MB RAM), things are okay when
moving the mouse around, though CPU usage is very high while scrolling.
Using trunk build 2002063008 on WinXP Pro (1.1 GHz, 512 MB RAM), this seems to be
okay now: loading takes 12.268 secs and scrolling is also fine.
QA Contact: rakeshmishra → trix
WFM with build 2003010408 WinXP Athlon1700XP/256MB
Mozilla/5.0 (Macintosh; U; PPC Mac OS X Mach-O; en-US; rv:1.3.1) Gecko/20030425

Haven't seen this in a while now...
marking fixed 
Status: ASSIGNED → RESOLVED
Closed: 21 years ago
Resolution: --- → FIXED
I am not sure that fixed is the right resolution in this case. WFM is probably
better.
Fixing dependency and reopening to fix resolution
No longer blocks: 97345
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
It *does* workforme too, btw. 
Status: REOPENED → RESOLVED
Closed: 21 years ago
Resolution: --- → WORKSFORME
Component: Event Handling → User events and focus handling