improve startup performance tracking bug

RESOLVED DUPLICATE of bug 447581
(NeedInfo fromanyone)

Status

defect
P3
normal
RESOLVED DUPLICATE of bug 447581
20 years ago
3 years ago

People

(Reporter: chofmann, Unassigned, NeedInfo)

Tracking

(Depends on 6 bugs, {meta, perf, topperf})

Firefox Tracking Flags

(Not tracked)

Details

Attachments

(5 attachments)

all the areas that need to be fixed to make us start fast...
Depends on: 7185, 7187, 7188, 7189, 7190
Blocks: 7229
Whiteboard: [Perf]
Status: NEW → ASSIGNED
Depends on: 7916
Blocks: 8700
Blocks: 8702
No longer blocks: 8700
No longer blocks: 8702
Depends on: 8700, 8702
Depends on: 8724
Depends on: 8745
Blocks: 9147
No longer blocks: 7229
Depends on: 7249
Depends on: 9011
Depends on: 5085
Depends on: 5952
Depends on: 3884
Depends on: 4901
No longer depends on: 8700
Depends on: 12296
No longer depends on: 7249
removing dependency on 7249, since we do this for the optimized builds already.
the bug is still open to get it working for developer builds.
Depends on: 13129
Depends on: 13635
Depends on: 13555
Depends on: 13747
Depends on: 14824
Keywords: perf
Bulk add of "perf" to new keyword field.  This will replace the [PERF] we were
using in the Status Summary field.
*** Bug 26224 has been marked as a duplicate of this bug. ***
Adding beta1 to keyword list

Here's the latest test data for start time after a system reboot.

                            4.71       SM
Win98200MHz, Pentium Pro    2:02s      11:23s
64MB RAM

Mac OS 8.5                  2:89s      13:63s
266MHz, PowerPC G3,
64MB RAM 

Linux RedHat 6.0,           5:11s      27:34s
166MHz, Pentium, 80MBRAM

For beta 1 we need to be within 2x 4.71
Keywords: beta1
-> jevering 
Assignee: chofmann → jevering
Status: ASSIGNED → NEW
Putting on PDT+ radar for beta1.
Whiteboard: [Perf] → [PDT+]
Removed PDT+  to get a re-eval so that this doesn't show in stats.  I'd rather 
see the actual bugs  rated as PDT+, then see this one in addition to all the 
others (at present, there is one other :-/ ).
Whiteboard: [PDT+]
Whiteboard: [PDT-]
Adding 28964
Depends on: 28964
Depends on: 28911
Adding 28911
No longer depends on: 28911
Adding 26512
Depends on: 26512
Depends on: 28911
Depends on: 29063
Depends on: 29249
Depends on: 29363
Depends on: 29685
Depends on: 29725
Depends on: 30201
Depends on: 29728
Depends on: 30365
Taking over this bug.
Assignee: jevering → dp
Status: NEW → ASSIGNED
Target Milestone: --- → M30
Depends on: 27510
Target Milestone: M30 → Future
Updating QA Contact.
QA Contact: leger → paw
*** Bug 50746 has been marked as a duplicate of this bug. ***
Using Quantify and the latest build, it appears that we spend 40% of our boot
time in malloc, allocating 180,000+ blocks.
Depends on: 26516
Depends on: 18277
Depends on: 46801
Depends on: 46800
Keywords: beta1nsbeta1
dp no longer at netscape. assigning to chofmann so this is on someone's radar
Assignee: dp → chofmann
Status: ASSIGNED → NEW
maybe selmers that better owner/watcher for this tracking bug 
from an application point of view.

kandrot, do you know if there is a bug that specifically talks
about all this memory allocation going on at startup?
That is the part memory and performance team might be
interested in.  also, do we know if the memory allocation
that is going on rougly mirrors the code size of the .dll's
or is there any sore thumbs that might stand out as allocating
more than they need to or should at startup?

sounds like evaughn's late loading ideas might have a big impact
on the overall startup time if we can shave off lots of code
that really is not used a large pct. of the time.

waterson, did you see this.  holy heck.  40% of startup
time just to cram the monkey into memory...  whoa!
another good piece of data to get would be to figure out of
the 40% of seamonkey start time in malloc ratio also holds true
in gtkembed and winembed.
->McAfee, CC Vishy
Assignee: chofmann → mcafee
OS: Windows 95 → All
Hardware: PC → All
Whiteboard: [PDT-]
Adding bug 67618 (mallocs, ~40%) and bug 64146 (making charset menu takes ~5%).
Depends on: 64146, 67618
Depends on: 69009
No longer depends on: 69009
Blocks: 71066
added more startup bugs
No longer blocks: 71066
Depends on: 8578, 9012, 26455, 42087, 49145, 49786
Blocks: 71066
Depends on: ? 14889, 6
No longer depends on: ?
Depends on: ? 68045
No longer depends on: ? 3884, 4901, 5952, 7185, 7187, 7188, 7189, 7190, 7916, 8724, 8745, 9011, 12296, 13129, 13555, 13635, 13747, 14824, 18277, 26512, 28964, 29363, 29685, 29728, 30201, 30365, 46801, 6
Depends on: 71447
No longer depends on: 14889, 26516, 68045
Depends on: 71781
Depends on: 71780
No longer depends on: 8702
Depends on: 8702
Depends on: 71861
over to cathleen
Assignee: mcafee → cathleen
Depends on: 71874
No longer depends on: 8578, 8702, 28911, 29249, 29725, 42087, 46800, 67618
No longer depends on: 71447
No longer depends on: 29063
No longer depends on: 71874
Depends on: 69140
Depends on: 26516
Depends on: 12896
Depends on: 72551
Depends on: 63000
Keywords: topperf
Depends on: 35816
No longer depends on: 26516
No longer depends on: 9012, 26455
Depends on: 65266
Depends on: 62164
Depends on: 68074
Depends on: 76705
Depends on: 26291
No longer depends on: 12896
No longer depends on: 63000
No longer depends on: 9012
Depends on: 79521
Depends on: 63246
Depends on: 79580
Depends on: 78695
Adding myself.
Depends on: 65845
Depends on: 46775
Depends on: 77757
Target Milestone: Future → ---
Depends on: 85770
Depends on: 86929
Adding bug 68045 ("precompile chrome JS and load it incrementally"), which is
really "precompile all XUL/XBL/JS/CSS that costs a lot more to parse than to
fast-load from a pre-compiled form".

/be
Depends on: 68045
Depends on: 91804
Depends on: 67618
Depends on: 88583
Depends on: 91241
Depends on: 92477, 92478, 92479, 92480
Blocks: 10770
No longer blocks: 9147, 71066
Keywords: nsbeta1, perf, topperfhelpwanted
QA Contact: paw → sujay
Summary: improve startup performance tracking bug → TRACKING: mouse event handling in tables
ARGH!  Someone who got CC'd on the last update to this please re-add all the
dependencies!  Mozilla (or bugzilla?) apparently emptied the dependency field,
and  when I added one it erased all the old ones.  Please add 91231 when you fix
this.  Sorry; I should have been more paranoid when I noticed an empty text
field there.
Hmmmm.  I just noted that a bunch of the other fields got wierded out as well.
(Summary should be "improve startup performance tracking bug").   Sounds like a
Bugzilla bug....
Restoring the fields as they were. Adding 91231 as requested.
Keywords: helpwantednsbeta1, perf, topperf
QA Contact: sujay → paw
Summary: TRACKING: mouse event handling in tables → improve startup performance tracking bug
Blocks: 9147, 71066
No longer blocks: 10770
sorry for the spam, the field changes fooled me at a sleepy moment...
Now be good this time, bugzilla...
Depends on: 90545, 92256
Mysteriously disappearing values in fields are usually bugs in Mozilla's form
handling.  Randell, were you using Mozilla to update the bug, and if so what
build?

Next time you notice this you should see if it happens with Netscape 4.x or
another browser.  If so, file a bug against Bugzilla; otherwise file one against
Browser->HTML Form Controls or Browser->Form Submission (depending on whether
you notice the problem before or after the form gets submitted) and notify
staff@mozilla.org immediately so we can block that Mozilla build to prevent
corruption of the Bugzilla database.

Usually I would blame Mozilla for this sort of thing ... BUT ... I noticed that
when the dependencies were readded the notifications got sent out merged with
the original notifications when the dep got removed.

AIUI when this happens its a Bugzilla bug, and we're not sure why it happens. 
If that's the case, it's likely that the whole thing might be a Bugzilla bug.
I was using mozilla; a daily from just before 0.9.3.  Note however that it
replaced the subject line with the subject line from the first bug in my query
list - it would be tough for mozilla to do that, since I never even looked at
that bug (other then it being an entry in the query results).  I still think
it's bugzilla.
Depends on: 94199
No longer depends on: 91804
Some analysis on the above.

I'm going to treat main() as the root for the relevant number of stack
traces (that's 1858, as opposed to 1911). Since this is a sampling
profiler, it should yield good results with respect to time spent in
I/O. Unfortunately, it doesn't appear to have picked up symbols from
NSPR (even though I built with seawood's patch for bug 88045).

- Only 2% of the stack traces (37/1911) fell in NS_InitXPCOM(). Of
  that, 16 (or less than 1%) were spent in
  nsComponentManagerImpl::PlatformPrePopulateRegistry(). So, we seem
  to spend very little time reading in the registry. About 1/2% of the
  time was spent reading the XPTI files. So it appears that there is
  little to gain from optimizing XPCOM initialization code.

- About 8% of the time was spent in
  nsAppShellService::CreateHiddenWindow() (158 stacks). From here, 134
  stacks (or 7% overall) was spent in
  nsAppShellService::GetHiddenWindowAndJSContext(). This winds up in
  what appears to be a CreateInstance() call. (See below.)

- 329 stacks, or 17% of the time, was spent in CreateInstance(). The
  biggest offenders are:

  - nsNSSComponentConstructor(), which accounts for 5.5% (103 stacks).
  - nsChromeRegistryConstructor(), which accounts for 2% (39 stacks).
  - nsAutoConfigConstructor(), which accounts for 1.6% (30 stacks).
  - nsPrefServiceConstructor(), which accounts for 1.6% (29 stacks).
  - NS_NewCharsetMenu(), which accounts for 1% (19 stacks).
  - Construct_nsIScriptSecurityManager(), which accounts for 0.5% (10
    stacks).

  Others to note (that take less than 0.5%, but still show up with
  several stacks) are:

  - nsHTTPHandler::Create()
  - nsProfileConstructor()
  - CreateXULDocument()
  - NS_NewLocalStore()
  - InternetSearchDataSourceConstructor()
  - nsBookmarksServiceConstructor()
  - nsCookieHTTPNotifyConstructor()

  All of which read files from the disk.

- We spend 1.7% of the time in nsStaticComponentLoader::GetFactory(),
  which seems high. Perhaps we're paging in the executable to access
  data at this point?

- We spend 18% of the time (330 stacks) in poll(), from
  g_main_poll(). What are we doing?

- We spend 5% of the time (241 stacks) parsing style sheets, that is,
  in CSSParserImpl::Parser().

- We spend 8% of the time (149 stacks) handling ContentInserted()
  notifications from asynchronous XBL loads. It appears that most of
  the time is spent constructing frames, so it's not clear that
  synchronously loading the XBL (i.e., so it would have been ``there
  already'') would buy much.

- It appears that we spend

  - 10.8% of the time (201 stacks) in
    nsCSSFrameConstructor::ConstructFrame(),

  - 10.7% of the time (199 stacks) in
    nsCSSFrameConstructor::ConstructFrameInternal(),

  - 4.9% of the time (90 stacks) in
    nsCSSFrameConstructor::ConstructXULFrame(), and

  - 5.9% of the time (109 stacks) in
    nsCSSFrameConstructor::ProcessChildren().

  That is, when the sample was taken, the stack pointer was somewhere
  in these routines (not somewhere in a descendant of the
  routine). Take this evaluation with a grain of salt: since these
  routines recursively call each other, I'm not completely positive
  that I've properly understood jprof's accouting.

More later...
Chris Waterson writes:
>- We spend 1.7% of the time in nsStaticComponentLoader::GetFactory(),
>  which seems high. Perhaps we're paging in the executable to access
>  data at this point?

        Probably.  dlopen() (depending on the OS) doesn't really load
the entire library; quite a bit will get paged in later.

>- We spend 18% of the time (330 stacks) in poll(), from
>  g_main_poll(). What are we doing?

        Those will be mostly waiting on IO I'd guess; you'll have to look
at the exact stacks.  The question what IO are we waiting on...???  I don't
think we're doing async file IO so that wouldn't show up, and I don't think
we should be doing network IO or DNS (but perhaps we are).  Perhaps some
things/Events are handled or generated off of timers as well, and we're
waiting for those?  This would be good to spend some time understanding.
Perhaps it's a gtk/etc thing?  What was used to end the profile, or shut
mozilla down?  If you did it by hand, that might account for it.  Also,
it may have been waiting for X - I don't know how gtk/gdk handles X;
it may dispatch things async to X and then go back to the main event
loop to poll.(!)

        Also, we have 41 hits (2.2%) in select() (i.e. sleeping waiting
for some form of IO or timer).

>- 329 stacks, or 17% of the time, was spent in CreateInstance(). The
>  biggest offenders are:
>
>  - nsNSSComponentConstructor(), which accounts for 5.5% (103 stacks).
>  - nsChromeRegistryConstructor(), which accounts for 2% (39 stacks).
>  - nsAutoConfigConstructor(), which accounts for 1.6% (30 stacks).
>  - nsPrefServiceConstructor(), which accounts for 1.6% (29 stacks).
>  - NS_NewCharsetMenu(), which accounts for 1% (19 stacks).

        NSSComponentConstructor looks suspicious, especially since we
haven't hit any https links yet.  It spends 1/2 it's time in
NewTempCertificate, a lot of in ASN1 decoding.  Probably parsing key/etc
files?  The others may each deserve a short look for easy gains.  The
CharSetMenu is a good target for lazy creation.

>- We spend 5% of the time (241 stacks) parsing style sheets, that is,
>  in CSSParserImpl::Parser().

        This may be worth looking at a little.

>- We spend 8% of the time (149 stacks) handling ContentInserted()
>  notifications from asynchronous XBL loads. It appears that most of
>  the time is spent constructing frames, so it's not clear that
>  synchronously loading the XBL (i.e., so it would have been ``there
>  already'') would buy much.

        Agreed.

>- It appears that we spend
>
>  - 10.8% of the time (201 stacks) in
>    nsCSSFrameConstructor::ConstructFrame(),
>
>  - 10.7% of the time (199 stacks) in
>    nsCSSFrameConstructor::ConstructFrameInternal(),
>
>  - 4.9% of the time (90 stacks) in
>    nsCSSFrameConstructor::ConstructXULFrame(), and
>
>  - 5.9% of the time (109 stacks) in
>    nsCSSFrameConstructor::ProcessChildren().
>
>  That is, when the sample was taken, the stack pointer was somewhere
>  in these routines (not somewhere in a descendant of the
>  routine). Take this evaluation with a grain of salt: since these
>  routines recursively call each other, I'm not completely positive
>  that I've properly understood jprof's accouting.

        Yes, those all wind in an out of each other a lot, then they
devolve into a load of smaller hits all over.  Untangling these (maybe
looking at the raw stack frames) may help.  Also, profiling some pageloads
this way may show better where the FrameConstructers are spending time.

Other notes:

    - 6.2% is spent in read().

    - 4.8% is spent in nsContainerFrame::ReflowChild

    - 3.6% is spent in free() - and fair bit of that in locking/unlocking
           the pthread mutexes.
    - 3.6% is spent in malloc() - about 2/3's in malloc, 1/3 in pthreads.

    - 3.2% is spent in ~nsCOMPtr_base; about 1/4 in that itself(!?) and the
           rest in a bazillion ::Release methods.  (I was a little
           surprised by the number of hits in this directly.  The hits in
           ::Release are destructors and free().

    The nsObjectHashtable destroy/reset code is hit a lot, but it all
    devolves to ::Release calls (i.e. destructors & free).

    nsXULPrototypeDocument::~nsXULPrototypeDocument(void) seems to get more
    hits than I'd expect.

    - 2.4% is spent in js_GetToken, 1% of it direct hits.

    - 2.3% is spent in js_GC, and separately about 1% in gc_root_marker

    - 2.3% is spent in doContent (from XML_Parse).

    - 1.3% is direct hits in kill (?)

    - 1.1% is spent in nsCSSScanner::Next

    - 1.1% is spent in nsHashtable::Get (from all over)

    - 1% is spent in libc_write

    - 1% is spent inflating zip files

    - 1% is spent parsing StdURL's; probably much of that allocing CStrings

    we're spending a couple of % in the dlopen and dl_runtime code.

    - 0.5% doing stat's (from a bunch of places)

    - 0.5% direct hits in nsCOMPtr_base::begin_assignment (seems odd)

    - 0.5% direct hits in nsStr::StrTruncate (seems odd)

    - 0.5% is spent in JS_DHashTableOperate (mostly direct hits in
           SearchTable)

    - 0.2% direct hits in nsVoidArray::InsertElementAt (I'll look at this)
           plus 0.3% in GrowArrayBy (really in Realloc)

    - 0.2% direct hits in nsVoidArray::ElementAt (I'll look at this but I
           think it's optimal)

I noted there's ShutdownXPCOM called - I assume then this is a
startup/shutdown pair?  Perhaps startup, then signal (or JS?) to stop
the logging would be better, though you'll pick a bunch of poll() hits.

Some hotspots/solutions aren't obvious from a profile, especially when a
lot of it is interpreting XUL/JS/etc.  For example, lazy creation of things
not immediately needed (especially if the lazy creation can be dropped onto
a low-pri background thread; perhaps not possible in this case - or find
some way to evaluate it on the main thread when it's otherwise idle, to
avoid adding locking all over.).
Depends on: 94883
Depends on: 94366
> Those will be mostly waiting on IO I'd guess; you'll have to look
> at the exact stacks.  The question what IO are we waiting on...???  I don't
> think we're doing async file IO so that wouldn't show up, and I don't think
> we should be doing network IO or DNS (but perhaps we are).

FWIW, if a PAC file is specified, it will be loaded during startup (prefs
initialization) rather than waiting for a network event. (bug 83984)
Depends on: 83984
Depends on: 26455
I ran a jprof (JP_REALTIME JP_START JP_PERIOD=0.0003) on a dual 450 Linux
box with plenty of RAM.  This was a warm start, done seconds after having
started and stopped it.  I stoped the profile by hand after it had come up
(with www.mozilla.org as the home, and tinderbox as the sidebar).

Build: from about a week ago, with my ContentIterator fixes.  Compiled with
-O2 optimization.

Summary:

Note: %'s are always % of the total number of non-poll hits (157)!
'direct' is hits directly in the routine.

157 non-poll/etc hits.
   59 (38%) in ResumeParse()
      art of this (<10%) is handling initial reflows.
      Most appears to be in doing *::OpenContainer calls.

   32 (20%) is doing CSSLoaderImpl::ParseSheet()
      Most of that is inserting content, eventually calling
      nsCSSFrameConstructor::ContentInserted.
     7 (4.5%) are in CSSParserImpl::Parse()

   36 (23%) are in CreateInstance() (pre dp's patch)
      22 (14%) are in nsDll::Load, probably all going to dlopen.  Almost
         all of those are in dl_lookup_symbol (21, 13.5%)
      15 (9.6%) in nsGenericFactory::CreateInstance calling Constructors
         and ::Create() methods  

   30 (19%) are in nsCSSFrameConstructor::ConstructFrame (note: see ParseSheet
      above, this overlaps with that.)

   25 (16%) in js_Interpret
      A fair bit of this is XBL.
      6 (3.8%) in js_GetToken, 3 of them direct hits(!)
      5 (3.2%) in js_ForceGC
      5 (3.2%) in js_AtomizeString (1 direct)

   17 (11%) in nsXBLService::LoadBindings
      10 (6.4%) in ::InstallProperties, a lot of it executing JS

   13 (8.3%) in nsDocShell::CreateContentViewer

   13 (8.3%) in nsContainerFrame::Reflow (some overlap with above/below entries)
      8 (5.1%) in nsBlockFrame::Reflow

   13 (8.3%) in XPCWrappedNative::CallMethod, including 3 direct hits(!?)

   13 (8.3%) in malloc()/new

   12 (7.6%) in free(), including 2 direct hits.

   11 (7.0%) in nsBox::Layout, called from Reflow() (see containerframe above)

   11 (7.0%) in XML_Parse

   10 (6.4%) in InitializeProfileService.  Most of that is in:
      8 (5.1%) nsChromeRegistry::FlushCaches calling GetServiceByContractID
          (this may be mostly in opening shared libs)

   10 (6.4%) in nsPipe::nsPipeInputStream, almost all from:
      9 (5.7%) in HaveDecodedRow, drawing images

    7 (4.5%) destroying webshells/docshells and their frames

    3 direct in memset (from 3 places)
    2 direct in nsString::nsString(void)
    2 direct in nsSupportsArray::QueryInterface
    2 direct in memcpy
    2 direct in nsString::Length
    2 direct in JS_GetPrivate()

I can upload the jprof or email it if someone likes
js_GetProperty seems to be a hotspot in the JS code.  (6.4%) and it doesn't fan
out too far.
Sorry, I meant _js_LookupProperty for the hotspot (JS_GetProperty is one of it's
callers).  The jprof will show you.
Depends on: 95823
Depends on: 95825
Depends on: 96457
Depends on: 96461
Depends on: 96464
Depends on: 96469
Depends on: 96701
Depends on: 97171
Depends on: 97172
Depends on: 97173
Depends on: 97175
No longer depends on: 97173
No longer depends on: 97172
trace-malloc results for mozilla startup, with blank page
Type                    Count    Bytes %Total
TOTAL                  115396  5863421 100.00
void*                   23517   943767  16.10
registry-Buffer             1   512000   8.73
xpti-unclassified         250   274080   4.67
unclassified-string      5623   203441   3.47
JSScopeProperty          4460   160560   2.74
gtk-unclassified          928   159473   2.72
nsZipArchive             3438   147662   2.52
JS-Array                 4654   144448   2.46
js_MatchToken            5046   135536   2.31
FrameArena                 31   127565   2.18
JS-script                 866   125423   2.14
nsXULElement             2086   118296   2.02
JS-atom                  4085   114218   1.95
CSSDeclarationImpl       2732   109940   1.88
nsCSSRule                1342   106488   1.82
JSScope                  1140    95760   1.63
nsComponentManagerImpl   2312    90464   1.54
nsTextFragment            360    90106   1.54
JS-unclassified          2110    87392   1.49
nsStaticCaseInsensitiveNameTable   2574    85184   1.45
orkin-unclassified         50    82694   1.41
X-unclassified           3187    82483   1.41
nsCStringKey             2525    77323   1.32
JS-GC-arena                 8    73912   1.26
JS-function              2412    71840   1.23
nsXULAttribute              5    71680   1.22
AtomImpl                 1749    65758   1.12
nsXULAttributes           865    62280   1.06
xptiInterfaceInfo        1409    62232   1.06
nsXMLElement             1137    59124   1.01
RuleHash                 3782    58652   1.00
InMemoryDataSource         92    56074   0.96
nsPersistentProperties   1923    55962   0.95
sscanf                   1504    55325   0.94
xptiManifest             1874    46336   0.79
JS-slots                  818    45536   0.78
nsImageGTK                110    44008   0.75
nsXULPrototypeElement     891    42768   0.73
nsSupportsArray           582    32592   0.56
OTHER                   22918   885039  15.09
Depends on: 97528
Depends on: 38621
Depends on: 101769
Depends on: 102785
Depends on: 103174
Depends on: 93055
No longer depends on: 5085
Depends on: 76944
No longer depends on: 38621
Blocks: 103712
Depends on: 103912
Depends on: 103916
Depends on: 104120
Blocks: 104166
Depends on: 104957
Depends on: 104959
Depends on: 104962
Depends on: 105040
Depends on: 105042
Depends on: 98275
No longer depends on: 96461
Depends on: 105301
Depends on: 92141
Depends on: 105501
Depends on: 105509
Depends on: 49142
Depends on: 103453, 105795
I made my own start up time "profiling", using FileMon from
www.sysinternals.com. There were 9 sec stopwatch measured cold start time, 5.5
sec hot start (filemon activated). See attachments following (a .txt compilation
of the largest times measured, an excel worksheet containing the cold start
profiling data for the mozilla.org tree and an excel worksheet for the cold
start profiling data for the Application Data tree). 

\PROGRAM FILES\MOZILLA.ORG AND SUBDIRECTORIES
7.39 sec for 3245 uncached (cold start) file operations. 0.104 sec for 2649
cached (hot start) file operations on mozilla.org dir and subdirs. Cold start
values:
TOTAL 2.07 sec for MOZILLA.ORG\MOZILLA
TOTAL 2.3 sec for MOZILLA.ORG\MOZILLA\CHROME 
TOTAL 2.41 sec for MOZILLA.ORG\MOZILLA\COMPONENTS, including 0.90 sec for gecko
(gk*.dll)
TOTAL 0.327 sec for MOZILLA.ORG\MOZILLA\PREF
TOTAL 0.361 sec for MOZILLA.ORG\MOZILLA\RES

$HOME\APPLICATION DATA\MOZILLA AND SUBDIRECTORIES
0.566 sec for 436 uncached (cold start) file operations. 0.04 sec for 415 cached
(hot start) file operations.

MY HUMBLE CONCLUSIONS: 
- Huge part of cold start time (7.39 out of 9 sec) is due to file i/o (...you
didn't already know that , heh? :-)
- Oddly enough, hot start time is moderately reduced (9 -> 5.5 sec), although
file io is reduced to a minimal 0.566 sec in that case. Is filemon reporting
exaggerated numbers on cold start?
- chrome is a beast (2.3 sec) with some jar files taking up to 0.482 sec (comm.jar)
- xpcom is a small beast (0.621 sec) 
- Where is this precompiled chrome stuff ? xul.mfl (0.123 sec) maybe ?
- At least 0.45 sec (5.65 % of total file io) could be gained from loading of
messenger chrome and msgmapi.dll with Mail&News, instead of doing it at browser
initialisation. Should I file a bug for it?
- At least 0.106 sec (1.33 % of total file io) could be gained from delayed
loading of chatzilla chrome
- Are ucvlatin.dll (0.109 sec) and other similar libraries needed for creating
the initial browser window?
- Venkman doesn't add significant startup delay
- 721 operations on Windows registry. I speculate many of them are unecessary at
startup, but I have no idea of the time gained. Probably every file io gain
might be significant on a heavily fragmented drive. 
- Oddly, 151 operations on Windows registry resulted in NOT FOUND, most of them
are like this: HKCU\CLSID\{00021401-0000-0000-C000-000000000046}\InprocServer32.
Is this normal ?
- Extremely high level of component.reg usage reported by Daniel Bratell in bug
27510 seem not existing anymore (filemon reports 1064 KB total).

TEST CONDITIONS: Build 2001101909, new clean profile, modern theme, Win2K SP2,
K6-III/400, 192MB ram, 5400 rpm hdd
Coldstart -> warmstart doesn't drop as much as you expected because as a
multi-threaded app, other things (CPU use) are happening during the (cold) file
loads.  On warmstart, you still have all the CPU use but can't overlap it with
file IO.  (Note: this is an assumption.)

Loading mapi/messenger chrome with mail&news would be good; I suspect this has
been thought of before and a bug exists.  Do a query (or check bugs that this
depends on) and add it if you can't find it.

I suspect that a lot of the time for warmstart (and cold) is taken doing
relocations, memory allocations and object/factory creations/initializations.
See some of the jprofs.
The mailnews and chatzilla chrome are opened because both components overlay the
browser to add items to the Tasks menu (and possibly other overlays). If we
can't cache this somehow we could create a small "mailoverlay.jar" if it helps.
Randell, thank you for the clarification.  With all my ignorance about
javascript and the Mozilla structure, I do agree there's a lot of room on
reducing memory allocations and other cpu/memory intensive operations during
startup. But there's also a lot of work on reducing file i/o too. Warm start
times are quite acceptable on low end machines with at least 64MB of memory.
Cold start is what draws the attention of an average user (and he can't use
turbo mode on low memory machines, because of footprint problems). Even worse,
cold start is terribly affected by fragmentation.
Couldn't we keep library and chrome loading to a minimum needed for showing a
simple browser window? Even if start page is not blank, some delays from
late-loaded files wiil be hidden by the network latencies, once a request for
web data retrieval is issued.

Bug 12896 isn't for the mail files I was talking about. I can't find a similar
bug, so I will file a new one. As dveditz suggests, it might be feasible to find
a solution for late loading those files.
Depends on: 105966
Depends on: 106087
Depends on: 106148
Keywords: meta
Depends on: 108612
Depends on: 109207
Depends on: 110340
Depends on: 110555
QA Contact: paw → hong
Depends on: 45008
Depends on: 112308
Depends on: 66877
Depends on: 113393
Depends on: 113467
Depends on: 113739
Depends on: 112767
Depends on: 98533
Depends on: 114914
Depends on: 115217
No longer depends on: 115217
Depends on: 115217
Depends on: 115164
Depends on: 115129
No longer depends on: 91242
Depends on: 116187
Depends on: 116191
Depends on: 115973
Depends on: 46707
Depends on: 102992
Target Milestone: --- → mozilla1.0
Blocks: advocacybugs
moving tracking/meta bugs out of milestone pools. 
Target Milestone: mozilla1.0 → Future
Depends on: 135505
Depends on: 146113
No longer depends on: 26291
Depends on: 91242
*** Bug 174091 has been marked as a duplicate of this bug. ***
*** Bug 185265 has been marked as a duplicate of this bug. ***
Depends on: 118455
Depends on: 196843
Blocks: majorbugs
*** Bug 216801 has been marked as a duplicate of this bug. ***
*** Bug 235263 has been marked as a duplicate of this bug. ***
Could somebody post the link to the graphs or charts which showed the startup
times? Thanks.
(In reply to comment #53)
> Could somebody post the link to the graphs or charts which showed the startup
> times? Thanks.

Anyone? Thanks again.
Blocks: 203448
No longer blocks: majorbugs
*** Bug 308628 has been marked as a duplicate of this bug. ***
I don't know if this is the right place to post this bug, but since about 3 weeks now, minefield has extremely slow startup. Just upgraded to Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9a7pre) Gecko/2007072505 Minefield/3.0a7pre and it took my 5 minutes! to boot it up. If I should post a new bug, just let me know. 

p.s. Is there any way I can make a trace of why this is taking so long?
Depends on: 408113
Assignee: cathleennscp → nobody
Status: NEW → RESOLVED
Closed: 11 years ago
QA Contact: hong_bugmail → chofmann
Resolution: --- → DUPLICATE
Duplicate of bug: 479078
Status: RESOLVED → REOPENED
Depends on: 479078
Resolution: DUPLICATE → ---
How did this get reopened?
Status: REOPENED → RESOLVED
Closed: 11 years ago9 years ago
Resolution: --- → DUPLICATE
Duplicate of bug: 447581
What's about the open 'Depends on' & 'Blocks' Bugs ???
Move to Bug 447581 ???
Flags: needinfo?
Product: Core → Core Graveyard
You need to log in before you can comment on or make changes to this bug.