Open Bug 1151151 Opened 9 years ago Updated 2 years ago

Intermittent high CPU usage (under PresShell::Paint) on a complex page running no code

Categories

(Core :: Layout, defect)

37 Branch
x86
Windows 7
defect

Tracking

()

UNCONFIRMED

People

(Reporter: mahks1, Unassigned)

Details

(Keywords: perf)

Attachments

(5 files)

User Agent: Mozilla/5.0 (Windows NT 6.1; rv:37.0) Gecko/20100101 Firefox/37.0
Build ID: 20150402191859

Steps to reproduce:

Intermittently, several times a day, CPU usage climbs to 100% while in Firefox.
I only notice this when I am working on a specific app I am developing.

I would like to provide you with a test case but the app is too complex to use itself and I have had no success in pinpointing what triggers the behaviour. I have provided a profile generated by dev tools (when CPU was at about 80%). If this could help someone direct me to what is the trigger I could then try to develop a test case.


Actual results:

If I create a profile under the Performance tab in dev tools, it shows that all the activity is either graphics or gecko. At this time my app has no functions active, no events being triggered, no network access and no visible animations.

If I minimize Firefox or open another tab, CPU usage drops to normal. 
CPU then spikes again when maximized or app tab gets focus.

The weirdest thing is that the behaviour continues after FF is closed then reopened.
- I close FF, CPU usage drops.
- Check processes to ensure FF has stopped.
- Open FF and my app, CPU spikes.

BUT, If I wait 2 minutes or so before re-opening FF, then the behaviour does not occur!


Expected results:

The only addon is Firebug and disabling Firebug has no effect.

I have re-installed FF several times and it has been happening over several versions of FF, but I cannot go back beyond where display:flex was implemented (about FF31 I think) so am unable to say when this began.

The app is fairly complex.
I use iframes a lot, so I can reload portions without reloading the whole site. (I have not found any other way to dynamically reload js files)
So it may have something to do with multiple iframe reloads, but I cannot seem to force it even by reloading an iframe dozens of times.

I also use display:flex a lot, nested several layers. I know there are issues in FF with reflow on complex flex structures, but nowhere does it mention continuous looping, like what must be happening here. This problem began after I started converting to display:flex, but that may just be a coincidence as FF versions changed as well.

It feels similar to a memory leak (maybe CPU creep) in that some loop within the graphics code never ends and calls to it build.

I have run out of ideas on how to track this down further.
Any ideas about what could be the cause, or any way to debug this, would be much appreciated.
Note that during the above profile, firefox is only using about 40-50% of the CPU, dwm.exe 12-20%
Firefox minimized but still using 50% of CPU
The profiles come up empty when I try to upload them to http://people.mozilla.org/~bgirard/cleopatra/ .

Did you follow the instructions at https://developer.mozilla.org/en-US/docs/Mozilla/Performance/Reporting_a_Performance_Problem ? Can you share a link?
Flags: needinfo?(mahks1)
The profiles were created by FF 37 devtools.

I see now Bugzilla has no option to download those json files.
I was able to copy the contents, save as .json and load them into devtools performance tab... 

They seem to load fine into both FF 37 & 38 devtools > performance > import
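For anyone else saving the attachment contents by hand, a quick check that the copied file is still complete, parseable JSON may save a failed import. This is only a sketch: the function name summarize_profile is made up for illustration, and it assumes nothing about the profile format beyond it being JSON.

```python
import json
from pathlib import Path


def summarize_profile(path):
    """Parse a saved profile file and report basic facts about it.

    Hypothetical helper: it only checks that the file is valid JSON and
    lists the top-level keys; it does not interpret the profile data.
    """
    raw = Path(path).read_text(encoding="utf-8")
    # json.loads raises json.JSONDecodeError if the copy was truncated or mangled.
    data = json.loads(raw)
    keys = sorted(data.keys()) if isinstance(data, dict) else []
    return {"bytes": len(raw.encode("utf-8")), "top_level_keys": keys}


if __name__ == "__main__":
    import sys

    if len(sys.argv) > 1:
        print(summarize_profile(sys.argv[1]))
```

If this raises a decode error, the copy/paste most likely cut the file short, and the import would be expected to fail as well.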

If that is not good enough, I will download Nightly when I have more time.
Ah, I see, thanks. The other profiler will give more detailed information, although I'm not sure it will be useful.

Testing in the Nightly & new profile is a good idea anyway, in case this is something that was already fixed in the dev builds.
So if I'm interpreting the profile correctly, it's slow painting (building of the display list).

I'm not sure if we have any other tools to help debug the root cause of this (short of creating a minimized testcase.) Pinging roc for ideas.
Flags: needinfo?(mahks1) → needinfo?(roc)
Keywords: perf
Summary: High CPU usage when app has no active code → Intermittent high CPU usage (under PresShell::Paint) on a complex page running no code
Component: Untriaged → Layout
Product: Firefox → Core
Yes, loaded the new profiler, much better. Will post the result on the next occurrence.

What I don't understand is why there is painting going on when nothing is (supposedly) changing.
Can you see what is triggering the paints?
Right, that's what I don't know how to debug.

The platform profiler shows the backtrace for reflow/restyle (the frames bar in cleopatra, hover over the tiny "scripts"/"styles" bars to get the tooltip): https://developer.mozilla.org/en-US/docs/Mozilla/Performance/Profiling_with_the_Built-in_Profiler#Understanding_Frames_In_Cleopatra

..but not the backtrace for "ViewManagerFlush" in refresh driver's terms. Bug 887976 seems to be the ultimate solution for this.
You could try setting the pref nglayout.debug.paint_flashing to true, that will show you if anything is actually getting repainted and might help you track this down.
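The pref can be flipped in about:config. As an alternative, a user.js file in the profile directory sets it on every startup. The sketch below appends the line to user.js; the helper name add_user_pref and the profile path in the usage line are hypothetical, and it only handles boolean prefs.

```python
from pathlib import Path


def add_user_pref(profile_dir, name, value):
    """Append a boolean pref line to user.js in the given profile directory.

    Hypothetical helper. The caller supplies the profile directory; this
    function does not try to locate the Firefox profile itself.
    """
    line = 'user_pref("%s", %s);\n' % (name, "true" if value else "false")
    user_js = Path(profile_dir) / "user.js"
    existing = user_js.read_text(encoding="utf-8") if user_js.exists() else ""
    # Skip the write if the exact line is already present.
    if line not in existing:
        if existing and not existing.endswith("\n"):
            existing += "\n"
        user_js.write_text(existing + line, encoding="utf-8")
    return user_js


# Example call (the path is a placeholder, not a real profile):
# add_user_pref("/path/to/firefox/profile", "nglayout.debug.paint_flashing", True)
```

Prefs in user.js are reapplied at each startup, so the line has to be removed from the file again to turn paint flashing off for good.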
Yeah, a full SPS profile would be invaluable here.
Flags: needinfo?(roc)
Thought the problem had been solved as it had not occurred for a week or so after I began using nightly.

But it has been happening, just a lot less frequently, and the CPU usage is less than in FF beta.

Attached is a profile that is compatible with Cleopatra.
I see two problems with the profile:
- you don't have the full stack (do you have "stackwalk" in the profiler popup enabled? You need to stop the profiler before clicking the checkboxes)
- you have devtools' Graphs.jsm in the profile. Were you using the devtools while profiling? Did you have it open? Could you describe in more detail what happened when you took the profile?

Did you try paint flashing as suggested by Timothy?
I could not find the stackwalk option (or any popup associated with the performance tab) 
Is that in FF DevTools?
Where does that popup activate from?

I had DevTools open as that is the only way I know of running the profiler. I only watch the performance timeline, while waiting for it to get to about 10 seconds, then click the end & save the profile.

"Could you describe in more detail what happened when you took the profile?"
If you mean what is happening with the page in the browser: there are few JavaScript routines running (if any, they are very short), there may be 1 or 2 small (30 x 30) animated GIFs active, and maybe 1 or 2 CSS animations (that normally do not cause any lag). Otherwise "nothing" should be happening: no mouseovers, and the visual appearance of the page does not change, yet there seems to be a lot of graphics activity in the profile.

I tried the paint flashing. Seemed to only tint the browser, and only change the tint on mouse overs. Could not find any documentation on how it was supposed to work. I suspect all it does is re-tint anything that changes. If that is the case then nothing is changing while all this graphic activity is happening.
Latest occurrence profile
>I had DevTools open as that is the only way I know of running the profiler.
I'm sorry for the confusion. I assumed you already switched to using the platform profiler ("SPS"): https://developer.mozilla.org/en-US/docs/Mozilla/Performance/Reporting_a_Performance_Problem

It has the stackwalk option, and if you profile while the profiler's popup is not open, it shouldn't cause any redraws by itself. Please use it for further profiling, keep the devtools closed.

>there may be 1 or 2 small (30 X 30) animated gif's active, maybe 1 or 2 CSS animations
Earlier you said "no visible animations". Note that there are two possible problems here:
1) A redraw is triggered with no visible changes. In this case we should identify the cause of the redraw (for example, hidden animated GIFs used to cause it)
2) The redraws are triggered as expected (i.e. the page changes), but take more CPU time than they should. In this case you should expect non-zero CPU usage (although perhaps not as high).

I'd try to tackle (1) first. Please specify whether the problem was (1) or (2) for each profile you provide.

> Seemed to only tint the browser, and only change the tint on mouse overs.
It should change tint of any areas that are re-painted. In case (1) no tint changes are expected.
It should rarely change the tint of a large area. Repaints don't normally happen when moving the mouse (unless you have :hover styles or JS changing something in reaction to mouse moves).

You can see an example here: https://msujaws.wordpress.com/2012/02/01/layout-paint-flashing-in-firefox/

> If that is the case then nothing is changing while all this graphic activity is happening.
Animated GIFs and CSS animations you mentioned earlier should have caused tint changes.
For example if I open this with paint flashing enabled: http://upload.wikimedia.org/wikipedia/commons/5/50/Triple-Spiral-Labyrinth-animated.gif - the browser background does not change tint (whether I move the mouse or not), but the area with the GIF gets a new tint with each new frame.
Finally, the behaviour just occurred again.
Enabled Cleopatra 1.16.1 in FF 41.0a1,
but I get the error: Could not understand response from symbolication server at http://symbolapi.mozilla.org/
and Cleopatra would not function...

Would that be a set up issue?
Severity: normal → S3