Closed Bug 113529 Opened 23 years ago Closed 8 years ago

Need better multiplexed file I/O

Categories

(Core :: Networking, defect)

defect
Not set
normal

Tracking

()

RESOLVED INCOMPLETE
Future

People

(Reporter: sfraser_bugs, Unassigned)

References

Details

(Keywords: perf)

We spend a lot of time during startup waiting for file IO to complete, which 
suggests that we multiplex file I/O poorly with other tasks during startup. This 
bug will list some parts of the code that have poor behaviour in this respect.
Keywords: perf
Note: On Mac, when we're waiting for IO and no other threads are running, we end 
up in _MD_PauseCPU, which calls WaitNextEvent (thus yielding time to the OS). We 
have little control over how long it takes before the OS hands control back to 
us, so Mac may suffer more than the other platforms.

The first place where we twiddle our thumbs waiting for IO is xpti's 
ReadManifestIntoMemory, which is called early at startup when no other threads 
are running. On Mac we spin for 350-400ms here.

Loading default prefs is the next culprit: openPrefFileSpec.

nsChromeRegistry::LoadDataSource is next, via RDFXMLDataSource::BlockingParse 
(lots of these).

CSSLoaderImpl::LoadAgentSheet (called from 
nsChromeRegistery::LoadStyleSheetWithURL) is next, which reads a CSS file from a 
jar file, thus invoking nsZipArchive::BuildFileList etc (all on the main thread).

CSSLoaderImpl::LoadAgentSheet() then loads another CSS file, this time from a 
file rather than a jar.

Next we load a bunch of string bundle properties files from a jar (called by 
nsNSSComponent::GetPIPNSSBundleString() (why is this eager?)

nsPrefBranch::GetDefaultFromPropertiesFile is is the next culprit, loading from a 
jar. Is called via nsHttpHandler::PrefsChanged. Several of these.

The next few instances are XPT file loads.

nsCharsetConverterManager::GetBundleValue is next, loading a properties file from 
a file.

To recap, all these are instances of large blocking reads on the main thread.
Fix summary
Summary: Need to better multiplexed file I/O → Need better multiplexed file I/O
Previously noted file I/O was all before appshellService->Run() is called. Next 
set come after appshellService->Run().

Another CSSLoaderImpl::LoadAgentSheet (loading from a file)
And another (imported child sheet from file)

Next is another CSS sheet coming out of a jar file
And another.

LocalStoreImpl::LoadData next calls RDFXMLDataSourceImpl::BlockingParse reading 
from a file (bunch of these)

next is nsImapMailFolder::nsImapMailFolder(), calling 
nsMsgFolder::initializeStrings to read a properties file from a jar. (What? I 
didn't load mail).

Another bunch of LocalStoreImpl::LoadData calls.

The next big culprit is reading from the fastload file; lots of file reading.

Next, loading a .dtd file from a jar. There are several of these.

A bit later, nsChromeRegistry::LoadDataSource shows up again (we're at 
nsXULDocument::EndLoad time now) (nsChromeReg::GetDynamicInfo is on the stack.)

Some XPT file loads show up as the next big chunk of file I/O time.

nsBookmarksServer::getLocaleString does a string bundle load from a jar next.

More .dtd loading shows up during this time (Tokenizer_HandleExternalEntityRef), 
loading from jars.

After lots of XPT loading and chrome overlay loading, we see 
nsPersistentProperties::Read, which is called from nsCharsetAlias2::GetPreferred, 
coming out of the collation code.
To summarize: the biggest culprits of blocking file I/O on the main thread are:

1. Chrome registry LoadDataSource()
2. XPT file loading.
3. CSSLoaderImpl::LoadAgentSheet()
4. Expat loading .dtd files.
5. Various string bundle loads.
6. Fastload
7. LocalStoreImpl::LoadData

If we can change some or all of these to do their file I/O on a different thread 
(e.g. use necko for file I/O) it would help startup a lot.
Some stats ('segments' here are time periods during which I had the profiler
running. It's not beefy enough to collect data for the whole of startup yet).

Time period (thread)                       # read/write calls    % segment time
                                                                   in file IO
-------------------------------------------------------------------------------

main -> appshellservice->Run()
  Main thread                                 134              9.7   (0.8 secs)
  File transport thread                        17              1.1   (0.1 secs)
(we spend 0.8 secs (9.6%) in MD_PauseCPU here)
                                                         (segment time 8.4 sec)
  
appshellservice->Run() -> first window shows
  Main thread                                 256              25.8  (1.7 secs)
  File transport thread                       105              23.4  (1.6 secs)
                                                         (segment time 6.3 sec)

What this suggests is that if we did good multiplexed file I/O, we could save
up to 17% of startup time.

Now is the time for someone to tell me that I'm on crack.
some good news... NSS is no longer eagerly initialized at startup.

also, getting more of the I/O to be asynchronous is obviously tricky because it
would require in most cases significant alterations to the client code.  but,
perhaps we can tackle them one at a time.  perhaps this bug needs to be a meta bug??
Would you rather have reading from the fastload file, or more reading from lots
of different .js members or jars?  Trick question, but at least fastload reduces
and consolidates i/o.  I'm working on lazy fastload (bug 107907), but it needs
more design attention from me to avoid doing just as many reads (sigh).  Time to
take a super-review vacation!

/be
FastLoad is gating new window, so running the reads from the fastload file on
another thread won't help much, I bet.  The trick is to read less.  Sorry if
this is obvious.

/be
1.2
Status: NEW → ASSIGNED
Target Milestone: --- → mozilla1.2
Not going to get to this any time soon.
Target Milestone: mozilla1.2alpha → Future
ah! this is great stuff. see bug 190730 for some of the places that I've also found.
Depends on: 11232
Assignee: sfraser_bugs → nobody
QA Contact: chrispetersen → layout
This is a mass change. Every comment has "assigned-to-new" in it.

I didn't look through the bugs, so I'm sorry if I change a bug which shouldn't be changed. But I guess these bugs are just bugs that were once assigned and people forgot to change the Status back when unassigning.
Status: ASSIGNED → NEW
Component: Layout → Networking
profiles need to be redone
Status: NEW → RESOLVED
Closed: 8 years ago
Resolution: --- → INCOMPLETE
You need to log in before you can comment on or make changes to this bug.