Closed Bug 29063 Opened 25 years ago Closed 23 years ago

excessive stat() calls

Categories

(Core Graveyard :: Tracking, defect, P4)

All
Linux
defect

Tracking

(Not tracked)

RESOLVED FIXED
mozilla0.9.5

People

(Reporter: colin, Assigned: dp)

References

Details

(Keywords: perf)

Attachments

(4 files, 1 obsolete file)

Not sure which category to put this into. I think it should start life belonging to RDF, but there doesn't appear to be an RDF category. I originally posted this in a couple of n.p.m newsgroups, but got no response other than Dan Veditz saying "You are not being paranoid. These stats are killing us on the Mac as well." M14 code, and I'm still seeing more local file accesses than I would expect/hope. Once the browser is up and running, I go to a page, and then hit the back button. These are the local file accesses I see. Note that these are real hard accesses. Access a simple page: stat /alpha$dka0/users/cblake/.mozilla/mozProfile stat /alpha$dka0/users/cblake/.mozilla/mozProfile stat /dkb100/work/m14opt/chrome/global/skin/default/animthrob.gif open /dkb100/work/m14opt/chrome/global/skin/default/animthrob.gif stat /alpha$dka0/users/cblake/.mozilla/mozProfile stat /alpha$dka0/users/cblake/.mozilla/mozProfile stat /dkb100/work/m14opt/chrome/global/skin/default/progressmeter-busy.gif open /dkb100/work/m14opt/chrome/global/skin/default/progressmeter-busy.gif Go back to previous page: stat /alpha$dka0/users/cblake/.mozilla/mozProfile stat /alpha$dka0/users/cblake/.mozilla/mozProfile stat /dkb100/work/m14opt/chrome/global/skin/default/button32-bg-active.gif open /dkb100/work/m14opt/chrome/global/skin/default/button32-bg-active.gif stat /alpha$dka0/users/cblake/.mozilla/mozProfile stat /alpha$dka0/users/cblake/.mozilla/mozProfile stat /dkb100/work/m14opt/chrome/global/skin/default/animthrob.gif open /dkb100/work/m14opt/chrome/global/skin/default/animthrob.gif stat /alpha$dka0/users/cblake/.mozilla/mozProfile stat /alpha$dka0/users/cblake/.mozilla/mozProfile stat /dkb100/work/m14opt/chrome/global/skin/default/progressmeter-busy.gif open /dkb100/work/m14opt/chrome/global/skin/default/progressmeter-busy.gif Two questions: 1. Why do we keep stat'ing the profile directory all the time? 2. Why do we always seem to stat a file before opening it? To see if its there? Gee, the open call would tell us that. The pairs of stat calls are coming from nsChromeRegistry::GetProfileRoot. We get two stat calls per time through this. Does anyone else think this is excessive and I should file a bug? Or am I just being paranoid because I live on a system that has slow stat's and open's? Colin.
Forgot to say. I am seeing this on the OpenVMS platform, but that's built from the normal code base as if it were a UNIX derivative.
lets start with waterson. also adding dp, mcafee, pav, sspit, alecf, sfraser as general unix/mac hacks to cc list to enjoin them in the hunt. RELEASE THE HOUNDS!
Assignee: chofmann → waterson
perf keyword
Keywords: perf
Stat is certainly an issue. In xpt loading we found 10% of time reading in about 50 small xpt files was being spent on stat. Jband is looking into removing that. I can add more info on this when I get to my office machine with quantify data. I think we need to fix many part of this. Waterson, may I take this bug from you. This is a performance issue. Once we find the culprits I will cycle the bug or file individual bugs to try eliminating them.
all yours dp.
Assignee: waterson → dp
Hooking into startup performance
Blocks: 7251
Severity: normal → major
Status: NEW → ASSIGNED
Priority: P3 → P1
Target Milestone: M14
putterman was seeing the excessive stat calls, too. cc'ing him.
Stat timings on startup Function % #call Time(sec) --------------------------------------------------------- stat 100 557 0.58 nsFileSpec::IsFile 45.32 185 0.26 (bug# 28964) nsFileSpec::Exists 25.26 165 0.15 nsFileSpec::GetFileSize 20.91 92 0.12 nsFileSpec::IsDirectory 8.50 115 0.05 All the 185 IsFile() calls are from xpt that are taken care in bug# 28964. So let us skip that. Looking at Exists(). Function % #call Time(sec) ----------------------------------------------------------------- nsFileSpec::Exists 100 165 0.15 GetProfileDirectory 74.74 96 0.12 <got fix> nsChromeRegistry:: 12.93 24 0.02 CheckForProfileFile nsFileSpec::ResolveSymlink 7.71 28 0.01 GetProfileDirectory() and ChromeRegistry() look like they need to looked to see if they need to be doing this. GetProfileDirectory() was simple enough to fix. Fixing chrome registry could give a bigger longer term yeild. Who want to take that ?
Depends on: 18048
With one ProfileDir fix and xpt file globbing, we have gotten rid of 280 out of 550 calls. What remains is: nsFileSpec::Exists 25.26 70 0.15 nsFileSpec::GetFileSize 20.91 92 0.12 nsFileSpec::IsDirectory 8.50 115 0.05
do we have stack traces for each of these? even the top 4 frames would be great.
The callers are not just one. So there would be more than one trace per category. I would just break in the debugger for say isDirectory() and see the stack.
Target Milestone: M14 → M16
prefs calls nsFileSpec::Exists MANY times more than it needs (in fact, it never needs to call it) - I have patches in my tree awaiting review which cleans up 25-30 of them from startup.
Alec is your patch for pref checked in.
yes - I had to put one back because it broke some stuff. But I got rid of all but that one.
Hoping you would take care of this.
Assignee: dp → warren
Status: ASSIGNED → NEW
M16 has been out for a while now, these bugs target milestones need to be updated.
Can somebody measure this again and report back? Things have changed a lot in the file transport. Or maybe we should close/ignore this bug since everything will change when the chrome comes out of jar files.
Target Milestone: M16 → M20
Its better (see below). mozProfile isn't being accessed all the time now. But we're still stat'ing a file before open'ing it. Is the plan to have all the chrome in a jar file AND have that jar file read permanently into memory? FORWARD stat /m16opt/chrome/skins/modern/global/skin/progressmeter-busy.gif stat /m16opt/chrome/skins/modern/global/skin/animthrob.gif open /m16opt/chrome/skins/modern/global/skin/animthrob.gif open /m16opt/chrome/skins/modern/global/skin/progressmeter-busy.gif BACKWARD stat /m16opt/chrome/skins/modern/global/skin/progressmeter-busy.gif stat /m16opt/chrome/skins/modern/global/skin/animthrob.gif open /m16opt/chrome/skins/modern/global/skin/animthrob.gif open /m16opt/chrome/skins/modern/global/skin/progressmeter-busy.gif
This now just seems like a dup of the bug that the throbber hits the disk each time. Bug 41268.
Warren, I am reassigning some bugs which you own over to me (dougt). If you think that you will have time to address them please reassign them back to yourself. This reassignment was suggested by choffman.
Assignee: warsome → dougt
leger is no longer @netscape. changing qa contact to component's default
QA Contact: leger → chofmann
I think someone needs to look at this issue again. Last February Suresh reported that there were 557 stat calls at startup. Some work was done to reduce this number quite considerably, and startup time improved. I've noticed a slowdown at startup time and so I checked the number of stat calls being made (now that we are using jar files I would have expected the number to be small). We are making 744 stat calls to start up the browser and display 'about:blank'. Whoa!!! And add to that 215 access() calls, the majority of which I bet are just checking for the existance of a file before opening it. Talk about a waste of cycles! I can provide a list of the stat and access calls if required, but I think the output that Suresh posted last time is much more meaningful. Colin.
anybody know how to run dp juju? is it just some output from prlog?
colin is that in a debug build? Also, do you have jar packaging disabled?
Yes, it is a debug build (from a few days ago). I can only trace the calls in a debug build on OpenVMS. jar packaging in certainly ON. My bin tree is void of all the files that make up the jar files so I know its using the jar files.
the dp juju look surprisingly familiar.. I want to say it's gprof - not sure.
when I do this on linux: strace -f mozilla -P seth about:blank |& tee ~/strace.txt I see us making 286 stat calls. I'll attach my strace log, for those are interested.
Seth, your trace is very interesting. More about that in a moment. First I should say that yesterday I found an RDF problem that was causing OpenVMS to always reinstall all the chrome upon startup. Now this is fixed my number of stat calls is down from 744 to 196, so that has helped a lot, but its still higher than your 119 (I was counting only stat, not fstat). Comparing your trace and mine, they start out pretty much the same, but when yours stops mine continues. I see stat'ing of: rdf files in bin/chrome rdf files in user-profile/chrome css files in user-profile/chrome all the files in bin/searchplugins rdf files in user-profile/chrome/overlayinfo/xxx/xxx rdf files in bin/chrome/overlayino/xxx/xxx xml files in bin/res/builtin panels.rdf in user-profile Any idea why OpenVMS is stat'ing all these files while UNIX is not?
mcafee discovered some interesting stuff last night that I would've never thought of. Specifically, it looks like when you run strace on ``just an optimized build'', there is a significant amount of filesystem activity following symlinks from the ``dist'' directory to the actual files (duh!). So, be sure you're measuring on a ``real'' installation that doesn't symlink to files built in another directory.
Once I downloaded a build, I got a run similar to what sspitzer posted here. But ! jprof isn't enabled by default. We either need this on by-default, or: what's the mechanism to package up a build on unix?
You mean you're not seeing all the additional files getting stat'ed, like I am on OpenVMS? Damn. What is going on on OpenVMS???
jprof will not find excessive stat() calls because it is a time-based sampling profiler. only strace or gprof (which probably doesn't work with mozilla) will count these.
add topperf
Keywords: topperf
Comment from Alan Stange (stange@rentec.com) in news://news.mozilla.org/3A8CB264.3000804%40rentec.com : > Typically, [open and stat calls] are paired up. First stat the file, then > open it. There is a very nice optimization here to open() the file, if that > works, then do an fstat(). The fstat() is essentially free once the > open() is done. The idiom of stat() then open() shows up all over.
dougt, are you working on this? are you the right owner?
chris, no I am not actively looking at this bug. I don't think that this bug will be found in one central place, but rather it is a bunch of usage cases. As a sidenote, the nsIFile caches stat info. If you create one of these objects, all sucessive getter() calls will result in only one stat every being made. Briefly looking at the OpenVMS stat log, it looks like we can quickly identify some problems, one of which we know about. However, I am not really sure what this output is. I am guessing these are calls from the application to the os |stat| api. (for example, a stat then an open would only result in one stat being logged, right??) The first problem is that gif files are loading (or at least stat'ed) multiply times. This is a known problem which will go away when necko lands the new image library. The second problem is that some rdf files are also loading multiple times. For example, the search.rdf file is stat'ed twice and localstore.rdf is being loaded three times? Why is this? Another problem is that some js files are being loaded twice. I think that I spoke to waterson (or brendan) about this in the passed. If you look at security-prefs.js, it is being loaded twice. I bet this is caused by it being references twice from XUL. Waterson, does your xul cache hold onto compiled js? Prefernces looks like it is pounding stat second only to imagelib. It stats unix.js and initprefs.js three times, and the other prefs javascript files twice. Prefs io needs a enema. But on the brigher side, I do not see any duplicates of jar files or libraries.
The OpenVMS stat log is showing exactly what you say, Doug. That is, the calls from Mozilla to the O/S. So each stat you see is a real stat() call and NOT one that has been satisfied by the nsIFile stat cache. If you see anything that looks like an OpenVMS specific problem (eg. other platforms only stat some file once but OpenVMS does it twice), please let me know!!
see also bug 69140
Blocks: 69140
kandrot is going to knock this one silly...
Assignee: dougt → kandrot
On a current Linux build, I see about 250 stat() calls at startup, then no more. I used the PRE_LOAD to gather this info (then verified it by adding the same code to XPCOM), overriding stat(), inserting a printf (or fprintf), then calling the original stat(). I do not see the gifs and all, mentioned in previous posts (though I do see the two stat() before an open, which I shall look into).
Status: NEW → ASSIGNED
looks like this should fall of the top perf list if we are in good shape... anyone have any evidence that stat() call are killing startup?
Keywords: topperf
Marking this as 0.9.1, since I do not know if there is enough info yet to complete this bug.
Target Milestone: --- → mozilla0.9.1
are we still seeing lots of stat() calls? Is this still affecting the startup performance?
Summary: excessive stat calls → excessive stat() calls
forgot we decided in last performance meeting (3/9/01) that there doesn't appear to be any issues with stat() call, will take this off tracking list for now
No longer blocks: 7251
Severity: major → normal
Priority: P1 → P3
moving off the 0.9.1 radar
Target Milestone: mozilla0.9.1 → mozilla0.9.2
linux Bug 65845 incorporate code coverage info into link process esp Comments From Roger Chickering on: 2001-06-08 15:10, 2001-06-08 23:18, 2001-06-08 23:28 and 2001-06-08 23:54. and the bug: 84871 Half of Linux cold start is spent stat()ing components files
Target Milestone: mozilla0.9.2 → mozilla1.0
reassigning kandrot bugs.
Assignee: kandrot → dougt
Status: ASSIGNED → NEW
Target Milestone: mozilla1.0 → ---
Blocks: 98275
to dp.
Assignee: dougt → dp
Need to analyze this to figure out exactly what we should do about it.
Target Milestone: --- → mozilla0.9.5
Status: NEW → ASSIGNED
Priority: P3 → P4
Analysis on stat calls: Test case was strace ./mozilla -P 'Default User' file:///usr/tmp/quit.html > trace.out stat() 86 .so 47 .rdf 34 .xpt 31 .js 6 .css 24 searchplugin 12 directory ---------------- 240 total accounted 256 total 125 fstat() Comments: - 80 .so are comming from the prepopulation of registry code. We can cut this down to 40 - 24 searchplugin : why are we loading the search plugin stuff - 47 rdf files mmh!
Attachment #24569 - Attachment is obsolete: true
Attachment #50155 - Attachment description: List of files we stat (in order) on startup (post above patch) → List of files we stat (in order) on startup - BEFORE above patch
Comment on attachment 50154 [details] [diff] [review] xpcom patch to eliminate 86 stats Looks good to me. r=waterson
Attachment #50154 - Flags: review+
Comment on attachment 50154 [details] [diff] [review] xpcom patch to eliminate 86 stats awesome! So glad to see this part done. sr=alecf
Attachment #50154 - Flags: superreview+
Attached file stats after patch
bug# 100834 eliminates the 24 stats from searchplugins and a directory enumeration Will look into plugins next. Looks like we are enumerating (loading too ?) plugins on startup.
Depends on: 100834
Checked in xpcom portions removing 86 stats (Patch 09/20/01 15:28) into trunk.
Timings on these: After fixing searchplugins and xpcom component stats, we have about 101 stats taking 2.6msec out of 8000msec. That is about 0.03% of startup. Not worth going after.
Status: ASSIGNED → RESOLVED
Closed: 23 years ago
Resolution: --- → FIXED
Product: Core → Core Graveyard
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: