Closed Bug 18048 Opened 25 years ago Closed 24 years ago

nsFileSpec - file access all the time?

Categories

(Core :: Networking, defect, P3)

All
Other
defect

Tracking

()

VERIFIED WONTFIX

People

(Reporter: colin, Assigned: dougt)

References

Details

(Keywords: perf)

Attachments

(1 file)

Mozilla isn't really blindingly fast on OpenVMS, and I'm trying to
find out why. I know that the file system on OpenVMS isn't as fast
as on Linux (I'm comparing to Linux because we build the OpenVMS
version from the UNIX ifdefs), and so I investigated how much file
activity was going on. Since the OpenVMS file system doesn't use
UNIX style names, we have jackets around most file system calls and
so its easy for me to trace those calls.

For an example, I traced the file activity while instead mail, and
just selected one mail message. That's all I did. The trace I will
(attempt to) post as an attachment.

There seems to be an awful lots of stat() calls going in. Often 2-3
stat calls on a file before it is finally opened. Why? Surely you're
not checking to see if the file exists. Just try to open in! If it
doesn't exist you'll get an error - you don't need to keep stat'ing
it!!!

And what's with the continual stat'ing of animathrob.gif? Is this
really necessary? Like its going to change between throbs!

If it seems like I'm picking on open and stat, then its for a reason.
These are particularly slow on OpenVMS, and so if we can cut back on the
amount of unnecessary file activity, I know it'll speed the OpenVMS
browser up (it'll speed up the browser on other platforms too, but not
too the same extent).

Though I've used MAIL as an example, I don't think this problem is
unique to MAIL. I think its prevalent throughout the Mozilla code.
Attached file File access log
[Just FYI, you're more likely to receive a prompt reply by posting on the
newsgroups, rather than as a Browser-General bug report.]
Assignee: leger → gagan
Component: Browser-General → Necko
QA Contact: leger → tever
Whiteboard: [PERF]
Many performance issues are being worked on right now.  This is a dup of many
already specific bugs.

tever, find a dup of this one please.

colin...elig is right, go out to the news groups...you can also check:
http://www.mozilla.org/quality/most-frequent-bugs/
for most common known bugs.
I really don't think that newsgroups are the correct place to report and track
problems. I also don't think that this particular problem has been reported
before; at least, I couldn't find it. And its certainly not in the most frequent
bugs list.

There are many performance issues with Mozilla, and I know that a lot of these
are being worked. But this bug report is specific to file accesses and in
particular open and stat calls. And from what I could tell by searching in
bugzilla, this area is not yet being worked. Hence my bug report.
Assignee: gagan → warren
Target Milestone: M12
Colin,

Can you put a breakpoint on stat and see who's calling it? That would help a
lot. Thanks.

Warren
Here's a typical one for the throbber:

DBG> ex /az *file_spec
"//dkb100/work/m11dbg/chrome/global/skin/default/animthrob.gif"
DBG> sho call
 module name     routine name      line           rel PC           abs PC
 NSFILESPEC      GetFileSize      23525       00000000000016B0 0000000000AD1BE0
*NSFILETRANSPORT Open             31508       0000000000004258 0000000001D36378
*NSFILETRANSPORT Process          32129       0000000000008908 0000000001D3AA28
*NSFILETRANSPORT Run              32102       0000000000008724 0000000001D3A844
*NSTHREAD        Run              11602       0000000000003C18 0000000000B11B38
*NSTHREAD        Main             11091       0000000000000FF8 0000000000B0EF18
*PTTHREAD        _pt_root         31806       00000000000001F0 00000000009ACAE0
 SHARE$PTHREAD$RTL                            000000000004E02C 000000000048C02C
 SHARE$PTHREAD$RTL                            00000000000402C4 000000000047E2C4
                                              0000000000000000 0000000000000000

Looking at the code in question, I see that the Open method calls
GetFileSize and IsDirectory, both of which call stat, and Open hasn't
even opened the file yet.

class nsLocalFileSystem : public nsIFileSystem
{
public:
    NS_DECL_ISUPPORTS

    NS_IMETHOD Open(char **contentType, PRInt32 *contentLength) {
        // don't actually open the file here -- we'll do it on demand in the
        // GetInputStream/GetOutputStream methods
        nsresult rv = NS_OK;

        *contentLength = mSpec.GetFileSize();

        if (mSpec.IsDirectory()) {
            *contentType = nsCRT::strdup("application/http-index-format");
        }
        else {
            char* fileName = mSpec.GetLeafName();
            if (fileName != nsnull) {

Many of the stat calls are probably coming from the other functions like
GetFileSize (ones which get other file info such as mod date). Each
function makes its own stat call and so they do tend to mount up.
Chris: Do you think we can eliminate this stat call in the
nsLocalFileSystem::Open method and instead return -1, allowing your code to
just read from the file until eof? Or did you end up backing that stuff out?
warren: i'll try that. i'm gonna try to fix the POP bug today...
Status: NEW → RESOLVED
Closed: 25 years ago
Resolution: --- → FIXED
I think Chris just took out the stat call, but I'm afraid we'll need to add it
back to get proper progress info later. Colin, can you verify that performance
is better (and ideally quantify it). I'm closing the bug for now.
No, the stat() call is still in there. So all is as it should be...I think :-)
Bulk move of all Necko (to be deleted component) bugs to new Networking

component.
Status: RESOLVED → REOPENED
Why is this marked resolved? The situation hasn't changed. I often see three
stat calls before an open (maybe one to check the file exists, one to check the
size, and another to check the date?). Anyway, this is a waste. Only one stat
call is needed per file, regardless of the number of attributes to be read.

I'm taking the liberty of reopening this bug. If somebody wants to say that
doing three separate stat calls to the same file to retrieve three items of data
is the most efficient way that Mozilla can do it, then (I won't believe you but)
I'll let you close it :-)

This is happening all over the place, all the time. Every little rollover, every
gif for every icon, while reading mail/news, everywhere!!
Resolution: FIXED → ---
Clearing FIXED resolution due to reopen.
Target Milestone: M12
Target Milestone: M15
Another place where I see a lot of stat calls (three per file) is when I use
Mozilla to browse the local file system with a URL such as file:///

How about a hybrid "GetFileInfo" function that returns "IsDirectory, Size and
Date". On UNIX this will be one stat call to return three items of information.
Then the calling code can be changed....
I'd rather just see nsFileSpec (or nsIFile) cache IsDirectory, etc. at the
first stat call.

[adding dougt to cc list -- he should catch up on this discussion]
But making nsFileSpec cache the entire PRFileInfo data structure (for example)
will lead to bugs where the file size or mtime becomes incorrect. I'm sorry,
but in the majority of the cases, I've gotta believe that the OS is going to do
a better job of this than we can.

Another way to look at the problem is to say that our API is too narrow. Why
don't we add a method to nsFileSpec that'll return the PRFileInfo struct (or
something comparable)? This will at least allow components like the file
browser to get -all- the attributes it needs in one fell swoop, and avoid
multiple redundant system calls.
I know we can't cache file size, but how about is dir, is link? That stuff
doesn't change (does it?).
I believe that most of the time, caching this information will be marginally
faster than an OS call. I think the real problem is that the API is too narrow,
and we're ending up doing 3 stat() calls e.g. for this:

  if (spec.IsDirectory()) {      // <-- one
    mtime = spec.LastModified(); // <-- two
    size = 0;
  } else {
    mtime = spec.LastModified(); // <-- two
    size  = spec.GetSize();      // <-- three
  }

where what you want is something like:

  PRFileInfo info;
  spec.Stat(&info);
  if (info.type == eDirectory) {
    mtime = info.mtime;
    size  = 0;
  } else {
    mtime = info.mtime;
    size  = info.size;
  }

Forgive the redundant code, but you get the idea.
Sold!
There was a conversation about sucking out this stat type of information into
another interface.  For one reason or another, we did not do this.  I think, at
the time, we believe that the data would be stale, and stat should occur every
time.  Either way, an implementation could buffer what they needed to for
performance reasons (and in fact, the implementation of the stat interface
could stat for ever call.)

I don't think that I am going to have time to do this before beta.  My
primary goal is to land nsIFile without any further changes.  If this is
something that you want to have, I need some help getting it done.
Unix nsIFile already caches everything, and I was thinking that if you wanted to
get fresher stat data you could use Clone() to get a new one.  That's not as
nice as an invalidateStatData call, maybe.

I'd be OK with an nsIFileMetadata interface, since we really can't cache
anything safely in nsIFile: we're referring to a file path, so if we're racing
with something like:
  mkdir /tmp/foo
  rmdir /tmp/foo
  touch /tmp/foo
then even isDirectory will change on us.
Keywords: perf
Bulk add of "perf" to new keyword field.  This will replace the [PERF] we were
using in the Status Summary field.
Does anyone have a feel for how much this slows the app down?  It seems a likely 
candidate for sluggishness on Mac OS too.  Should it go on beta1 radar?
Whiteboard: [PERF]
I've looked pretty closely at file access during startup (autoreg), and don't 
believe that there are remaining excresences there. However, I think it would be 
good to look at necko's file access patterns.
file download uses the file transport. maybe this is the root of the mac 
download speed problems (25108).
Blocks: 29063
Moving non-essential, non-beta2 and performance-related bugs to M17.
Target Milestone: M15 → M17
post-beta2
Target Milestone: M17 → M18
I want to give this to dougt since this is really about the design of nsIFile.
Assignee: warren → dougt
Status: REOPENED → NEW
Keywords: arch
Keywords: arch
nsFileSpec has many problem (preformance, concrete class, non scriptable).  All
of these have been addressed by nsIFile.  Marking as wont fix.  I have written
up bugs for users of nsFileSpec.  Some have been converted like necko, but
others have been latered.  The example of the throbber now uses nsIFile.

 Convert to nsIFile and reap the benefits.

Status: NEW → RESOLVED
Closed: 25 years ago24 years ago
Resolution: --- → WONTFIX
Summary: Why so much file access all the time? → nsFileSpec - file access all the time?
Sounds like this bug should morph into "[something] needs to be converted from 
nsFileSpec to nsIFile".
verified wontfix
Status: RESOLVED → VERIFIED
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: