Closed Bug 335925 Opened 20 years ago Closed 20 years ago

High CPU use with no activity

Categories

(Firefox :: General, defect)

2.0 Branch
PowerPC
macOS
defect
Not set
major

Tracking

()

RESOLVED FIXED
Firefox 2 alpha2

People

(Reporter: stephen, Assigned: jaas)

References

Details

(Keywords: fixed1.8.1, regression)

Attachments

(4 files)

User-Agent: Mozilla/5.0 (Macintosh; U; PPC Mac OS X Mach-O; en-US; rv:1.8) Gecko/20060428 BonEcho/2.0a1 Build Identifier: Mozilla/5.0 (Macintosh; U; PPC Mac OS X Mach-O; en-US; rv:1.8) Gecko/20060429 BonEcho/2.0a1 ID:2006042902 Launch the specific branch build (20060429) with a new profile. Without any activity, Firefox is using near 100% of CPU time. Reproducible: Always Steps to Reproduce: 1. Launch Activity Monitor 2. Launch Firefox 3. Check Firefox CPU useage in Activity Monitor Actual Results: Firefox is using nearly 100% of the CPU with a blank or no browser window. Expected Results: Minimal CPU useage.
Summary: High CPU use with not activity → High CPU use with no activity
I can confirm for my PowerBook g4 (1.25 Ghz) with the same build.
did this regress between 20060428 and 20060429 ?
Severity: normal → major
Status: UNCONFIRMED → NEW
Ever confirmed: true
Keywords: regression
Version: unspecified → 2.0 Branch
Attached file sample-output
This is the output of sample, immediately after startup. Firefox is running at 86% cpu (it's an old 300 MHz laptop), on a freshly created profile where I only replaced the bookmarks file. today's Firefox 2.0 build (20060430), Mac OS 10.2.8
Flags: blocking-firefox2?
I have a sequence of steps that seem to make this problem disappear: 1) Start Bon Echo - don't try to open prefs, ext, themes, etc. 2) Command N to open a new browser window - you should see the cpu % drop significantly 3) Command Q to quit 4) Restart Bon Echo Hopefully you will now be able to continue to have Bon Echo run at a reasonable cpu percentage and also be able to go to the other windows. Note: While testing various sequences, I got to a point where Bon Echo wouldn't launch a window at all. Continuous force quits didn't work either. I had to start Bon Echo in safe mode in order to see an actual window.
Bug number 335926 and this bug are definitely related. When Bon Echo is in the 100% cpu state, none of the pref, theme, extension windows open. Blank lines show up under the Window menu but you are unable to switch to that window. After I run the above sequence, the cpu % goes down to less than 3% and I can see prefs, themes, etc.
-> Josh, to figure out where this broke, and get traction on the issue.
Assignee: nobody → joshmoz
Flags: blocking-firefox2? → blocking-firefox2+
Target Milestone: --- → Firefox 2 alpha2
Further testing on my machine indicates that this is per profile. After performing the workaround from comment 4, I launched Bon Echo using a different profile, and once again saw the high CPU usage at idle time. I closed Bon Echo, went back to the profile I'd been using when applying the workaround, and lo and behold, the CPU was normal. So, I did some testing with new profiles, and lo and behold, the workaround in comment 4 seems to be making some sort of difference at the profile level. To play along at home: - Create two new profiles A and B - Launch firefox -P A and perform the workaround - Launch firefox -P B and note the CPU time - Launch firefox -P A and note the CPU time - scratch your head
Does moving aside XUL.mfasl in the profile directory help?
After performing the work around in comment 4, when installing or uninstalling an extension, the cpu gets pegged at 100% again.
(In reply to comment #8) > Does moving aside XUL.mfasl in the profile directory help? > The only XUL.mfasl I have is from an install of Camino. I don't have one in my profile.
(In reply to comment #9) > After performing the work around in comment 4, when installing or uninstalling > an extension, the cpu gets pegged at 100% again. > Sorry, trying not to spam, but also disabling an extension will do the same thing. Just remember to restart Bon Echo after doing one of these functions.
adding rob strong, who's likely a little busy atm, but might be able to throw us some more troubleshooting ideas since it sounds like it could be tied to extension stuff
Is there a regression range for this? This may have to do with bug 214672. I don't have a Mac OS X system so I can't debug this. :(
(In reply to comment #8) > Does moving aside XUL.mfasl in the profile directory help? > I tried it twice : the first time it didn't help, but the second time (when I nuked both XUL.mfasl and the cache) it did work. I've also tried to nuke localstore.rdf, but that didn't help at all. Note for John : it's not in your usual profile in ~/Library/Application Support, but in ~/Library/Caches
(In reply to comment #14) > (In reply to comment #8) > > Does moving aside XUL.mfasl in the profile directory help? > > > > I tried it twice : the first time it didn't help, but the second time (when I > nuked both XUL.mfasl and the cache) it did work. > > I've also tried to nuke localstore.rdf, but that didn't help at all. > > Note for John : it's not in your usual profile in ~/Library/Application > Support, but in ~/Library/Caches > OK, I found the file and the cache associated with it. I didn't have any luck blowing the file and the cache away. I tried it both while Bon Echo was running and when it wasn't running. After restarts the cpu was still pegged at 100%.
Is anyone seeing this in a debug build? If so, could you breakpoint in the place where we open the jar channels and see why that's happening?
I experienced it again with todays build (20060504), but removing the XUL.mfasl file didn't help this time. But the workaround in comment 4 did.
If you're on a Mac with Developer Tools installed, you can unzip this file and double click on it to see the trace in Sampler.app, which is a little nicer way of looking at the data than flat text. You want to look at thread 4327. Its freaking out in CSSLoaderImpl::SheetComplete. To see more about what is going on in that function open this trace up.
Note: The Sampler.app trace I posted is really a collection of a bunch of traces spaced 20ms apart so you see what is happening over time. Sampler.app puts it all together in a way that is easy to digest.
fwiw, with profiles in which the workaround from comment 4 has been applied, the CPU still goes haywire when I do a manual check for updates ...
I can confirm Peter van der Woude's regression date, the first build that displays this behavior is 4/29/06 2:45:00 PM.
From looking at bonsai, 2006-04-28 11:02 checkin for 333634 (dbaron) looks most suspicious.
Assignee: joshmoz → nobody
While running the 4-29 build I noticed I cannot launch the JS console, Ext and Theme Manager or Preferences, but DOM Inspector works. If I am able to open a new window and avoid the spinning wheel, I can access the JS console.
What's the regression *range*? Have you tried local backout of the checkin you suspect?
BuildID = "2006042804" - Upon launch of a fresh profile, CPU zooms to around 96% in Activity Monitor and then goes down to normal. BuildID = "2006042904" - Upon launch of a fresh profile, CPU zooms to around 96-100% in Activity Monitor and then stays there until you open a new window. Testing on a G5, OS X 10.4.6 with 1.25 GB RAM (In reply to comment #24) > What's the regression *range*? > > Have you tried local backout of the checkin you suspect? >
Don't take the dates I mentioned as regressionwindow. I notice Mac users (2) complaining on the forum and urged them to file a bug. This may just as well have regressed the days before that.
And this is branch builds only, and *not* trunk builds?
So wait. Josh, is your Sampler.app output similar to the first attachment? That is, more or less all the time is spent in nsVoidArray::IndexOf? Marcia, the builds in comment 25 are 1.8 branch builds, right?
So Marcia's regression range from comment 25 is http://bonsai.mozilla.org/cvsquery.cgi?treeid=default&module=all&branch=MOZILLA_1_8_BRANCH&branchtype=match&dir=&file=&filetype=match&who=&whotype=match&sortby=Date&hours=2&date=explicit&mindate=20060428+04&maxdate=20060429+10&cvsroot=%2Fcvsroot or so. Can someone who can reproduce this bug and builds do the following please? 1) Either get a debug build or an optimized build in which you've uncommented line 96 of nsCSSLoader.cpp (the "/ #define FORCE_PR_LOG /* Allow logging in the release build */" line). 2) Set NSPR_LOG_MODULES=nsCSSLoader:5 and whatever NSPR_LOG_FILE you want. 3) Reproduce the bug; let it spin for a minute or so, perhaps? 4) Attach the log file here. http://www.mozilla.org/projects/netlib/http/http-debugging.html has directions on using NSPR logging on OSX; just make sure to use NSPR_LOG_MODULES=nsCSSLoader:5 and not the HTTP modules. ;)
For what it's worth, if the "sample-output" attachment is representative of this problem, then the chance that this is caused by the checkin for bug 333634 is about 0.
bz - yes, most of the time is spent in nsVoidArray::IndexOf. All of the time is spent in nsCSSStyleSheet::SetComplete, frame 25. Frame 26 is most often nsDocument::SetStyleSheetApplicableState but sometimes nsDocument::EndUpdate. Frame 27 is nsDocument::AddStyleSheetToStyleSets whenever frame 26 is nsDocument::SetStyleSheetApplicableState, and PresShell::EndUpdate whenever frame 26 is nsDocument::EndUpdate. Frame 30 is most often nsVoidArray::IndexOf, which is whenever frame 26 is nsDocument::SetStyleSheetApplicableState. The frame before nsVoidArray::IndexOf is always nsDocument::GetIndexOfStyleSheet.
Right. So somehow we're getting into an infinite loop of loading stylesheets, end up with tens or hundreds of thousands of them in that document, etc. I'd love that CSSLoader log. Then we'd know what the stylesheet URIs involved are. ;)
Attached file nsCSSLoader log file
ask and ye shall receive from debug 1.8 branch build
Its "chrome://browser/content/places/places.css" that is getting loaded a bzillion times.
Attachment #220979 - Attachment mime type: application/octet-stream → application/zip
Sounds like a regression from ben's patch to disable places on the branch, then.
So the deal is that chrome://browser/content/hiddenWindow.xul includes http://lxr.mozilla.org/seamonkey/source/browser/base/content/macBrowserOverlay.xul which includes chrome://browser/content/places/places.css. Trying to load places.css on branch gives an error. Still looking into why we end up loading it a whole bunch of times.
Peter, I see us enter nsExpatDriver::HandleProcessingInstruction with mBytesParsed = 1442 and aBuffer pointing to the '<' of the xml-stylesheet PI in question. Then we call MOZ_XML_Parse, after where the return value of XML_GetCurrentByteIndex() is 1442. So we keep passing this PI into expat... I thought we'd fixed this, but maybe that was just trunk? I also don't know why it matters that the sheet fails to load -- the failure is not propagated to the caller in this case....
Attached patch ifdef places cssSplinter Review
Assignee: nobody → joshmoz
Status: NEW → ASSIGNED
Attachment #221032 - Flags: review?(bugs.mano)
Comment on attachment 221032 [details] [diff] [review] ifdef places css r+a181=mano
Attachment #221032 - Flags: review?(bugs.mano)
Attachment #221032 - Flags: review+
Attachment #221032 - Flags: approval-branch-1.8.1+
fixed on 1.8 branch and trunk Someone who knows more about the CSS loader (bz? dbaron?) should probably file a bug on the fact that it spun out of control when it shouldn't have.
Status: ASSIGNED → RESOLVED
Closed: 20 years ago
Resolution: --- → FIXED
My guess is that bug 335926 should also be marked as fixed as it was a direct result of this bug. However, I have yet to do a test with this fix.
The problem is the XML parser, not the CSS loader. The XML parser keeps parsing the same PI over and over and keeps telling the loader to load more and more stylesheets. I really wish we could reproduce this with a standalone XML file linking to (non-chrome, presumably?) CSS... I'm pretty sure it's branch-only, though, since on trunk Peter nuked all that code, so I'm not sure how much we should worry about it.
Keywords: fixed1.8.1
I get this problem in current cvs, as well as older cvs checkouts (post 5/5/06 when the "fix" was checked in). It happens when I try to open the extension manager. Just dumping the LOG_URI information in nsCSSLoader give me the output below repeatedly (only one set below). In current cvs checkouts, it's worse, it happens on startup in Komodo. SheetLoadData::OnDetermineCharset for 'chrome://browser/skin/browser.css' Child uri: 'chrome://global/skin/global.css' Load from: 'chrome://global/skin/global.css' SheetLoadData::OnDetermineCharset for 'chrome://global/skin/global.css' Child uri: 'chrome://global/locale/intl.css' Load from: 'chrome://global/locale/intl.css' Child uri: 'chrome://global/skin/formatting.css' Load from: 'chrome://global/skin/formatting.css' SheetLoadData::OnDetermineCharset for 'chrome://global/locale/intl.css' Finished loading: 'chrome://global/locale/intl.css' SheetLoadData::OnDetermineCharset for 'chrome://global/skin/formatting.css' Finished loading: 'chrome://global/skin/formatting.css' Finished loading: 'chrome://global/skin/global.css' Finished loading: 'chrome://browser/skin/browser.css' Link uri: 'chrome://browser/skin/browser.css' Load from: 'chrome://browser/skin/browser.css'
found a work-around for us. basicly we're still pre-xulrunner app. We build the browser first, then build our own stubs. The browser chrome has always ended up in our builds, and gre likes to load browser.css first instead of our stuff. making some minor changes in our prefs file and removing browser chrome avoids the css parsing problem we encounter with browser.css. I do not think this bug should have been marked fixed. avoiding the problem is not a fix.
peterv has a fix for the actual XML parser bug somewhere...
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: