Last Comment Bug 560647 - add startup timeline instrumentation
: add startup timeline instrumentation
Status: RESOLVED FIXED
:
Product: Core
Classification: Components
Component: General (show other bugs)
: Trunk
: All All
: -- normal with 1 vote (vote)
: mozilla1.9.3a5
Assigned To: :Ehsan Akhgari
:
Mentors:
Depends on: 563082 568912 568915 548612 568324 568917
Blocks: 568921
  Show dependency treegraph
 
Reported: 2010-04-20 13:28 PDT by Vladimir Vukicevic [:vlad] [:vladv]
Modified: 2010-05-28 12:35 PDT (History)
24 users (show)
See Also:
Crash Signature:
(edit)
QA Whiteboard:
Iteration: ---
Points: ---
Has Regression Range: ---
Has STR: ---


Attachments
sample instrumentation patch (14.21 KB, patch)
2010-04-20 13:28 PDT, Vladimir Vukicevic [:vlad] [:vladv]
no flags Details | Diff | Splinter Review
very basic instrumentation & fixes [landed on m-c] (8.41 KB, patch)
2010-04-23 13:39 PDT, Vladimir Vukicevic [:vlad] [:vladv]
bzbarsky: review+
Details | Diff | Splinter Review
Instrument color management initialization [landed on m-c] (574 bytes, patch)
2010-04-23 14:45 PDT, Jeff Muizelaar [:jrmuizel]
ehsan: review+
Details | Diff | Splinter Review
Instrument storage/ (1.59 KB, patch)
2010-04-26 11:06 PDT, Shawn Wilsher :sdwilsh
bugmail: review+
Details | Diff | Splinter Review
log only on main thread (1.87 KB, patch)
2010-04-26 11:37 PDT, Vladimir Vukicevic [:vlad] [:vladv]
no flags Details | Diff | Splinter Review
Instrument storage/ v1.1 [landed on startup-timeline] (3.19 KB, patch)
2010-04-26 11:39 PDT, Shawn Wilsher :sdwilsh
bugmail: review+
Details | Diff | Splinter Review
Basic instrumentation patch [landed on startup-timeline] (23.47 KB, patch)
2010-04-26 15:28 PDT, :Ehsan Akhgari
no flags Details | Diff | Splinter Review
Instrument places [landed on startup-timeline] (1.53 KB, patch)
2010-04-26 16:16 PDT, Shawn Wilsher :sdwilsh
no flags Details | Diff | Splinter Review
Instrument cookies [landed on startup-timeline] (774 bytes, patch)
2010-04-26 16:20 PDT, Shawn Wilsher :sdwilsh
dwitte: review+
Details | Diff | Splinter Review
Instrumentation log sample (50.43 KB, text/plain)
2010-04-27 07:08 PDT, :Ehsan Akhgari
no flags Details
Another log example (54.73 KB, application/x-bzip2)
2010-04-27 11:24 PDT, :Ehsan Akhgari
no flags Details
sample log (446.82 KB, text/plain)
2010-04-27 12:06 PDT, Shawn Wilsher :sdwilsh
no flags Details
sample log (651.94 KB, text/plain)
2010-04-27 12:18 PDT, Shawn Wilsher :sdwilsh
no flags Details
Hacked plot-timeline.py script for generating visual timelines (11.30 KB, text/plain)
2010-04-27 12:52 PDT, Johnathan Nightingale [:johnath]
no flags Details
post-processor for log files (251 bytes, text/plain)
2010-04-27 12:57 PDT, Johnathan Nightingale [:johnath]
no flags Details
instrument main thread events > 5.0 ms [landed on startup-timeline] (1.18 KB, patch)
2010-04-27 13:21 PDT, Vladimir Vukicevic [:vlad] [:vladv]
no flags Details | Diff | Splinter Review
Sample log on mac (61.42 KB, application/x-bzip2)
2010-04-27 15:23 PDT, :Ehsan Akhgari
no flags Details
fix exit/min instrumentation [landed on startup-timeline] (1.51 KB, patch)
2010-04-27 15:48 PDT, Vladimir Vukicevic [:vlad] [:vladv]
no flags Details | Diff | Splinter Review
Log from a release build on Windows (101.34 KB, text/plain)
2010-04-28 12:16 PDT, David Bolter [:davidb]
no flags Details
Log from a release build on Windows (100.50 KB, text/plain)
2010-04-28 16:31 PDT, David Bolter [:davidb]
no flags Details
Log from a release build on Windows (130.26 KB, text/plain)
2010-04-29 19:44 PDT, :Ehsan Akhgari
no flags Details
Log from a PGO build on Windows (86.56 KB, application/x-bzip2)
2010-05-03 15:54 PDT, :Ehsan Akhgari
no flags Details
Vista, non-PGO, rel build, non-empty profile log (213.12 KB, application/zip)
2010-05-04 13:54 PDT, Johnathan Nightingale [:johnath]
no flags Details
Annotated startup log (1.81 MB, application/octet-stream)
2010-05-04 15:22 PDT, Johnathan Nightingale [:johnath]
no flags Details
More completely annotated startup log (1.81 MB, text/plain)
2010-05-05 12:59 PDT, Johnathan Nightingale [:johnath]
no flags Details
mozilla-central patch (154.57 KB, patch)
2010-05-05 16:02 PDT, :Ehsan Akhgari
no flags Details | Diff | Splinter Review
mozilla-central patch (154.58 KB, patch)
2010-05-05 16:33 PDT, :Ehsan Akhgari
no flags Details | Diff | Splinter Review
mozilla-central patch (154.60 KB, patch)
2010-05-05 22:17 PDT, :Ehsan Akhgari
no flags Details | Diff | Splinter Review
mozilla-central patch (155.28 KB, patch)
2010-05-07 11:05 PDT, :Ehsan Akhgari
no flags Details | Diff | Splinter Review
mozilla-central patch (boilerplate) (144.05 KB, patch)
2010-05-13 17:15 PDT, :Ehsan Akhgari
vladimir: review+
Details | Diff | Splinter Review
mozilla-central patch (non-trivial stuff) (8.39 KB, patch)
2010-05-13 17:16 PDT, :Ehsan Akhgari
vladimir: review-
Details | Diff | Splinter Review
mozilla-central patch (non-trivial stuff) (7.06 KB, patch)
2010-05-19 16:18 PDT, :Ehsan Akhgari
vladimir: review+
Details | Diff | Splinter Review
mozilla-central patch (non-trivial stuff) (3.83 KB, patch)
2010-05-19 16:23 PDT, :Ehsan Akhgari
vladimir: review+
Details | Diff | Splinter Review

Description Vladimir Vukicevic [:vlad] [:vladv] 2010-04-20 13:28:24 PDT
Created attachment 440309 [details] [diff] [review]
sample instrumentation patch

Bug 480735 added a lightweight timeline/function timer API for capturing coarse timing data, intended for use in analyzing startup.  We need to add instrumentation that actually takes advantage of this API.

Documentation is available at: https://wiki.mozilla.org/Platform/FunctionTimer_Usage

I've attached an example patch that has a couple places instrumented, to show example usage; it's got a bunch of things mixed in and should be split apart by component.
Comment 1 Vladimir Vukicevic [:vlad] [:vladv] 2010-04-23 13:39:10 PDT
Created attachment 441137 [details] [diff] [review]
very basic instrumentation & fixes [landed on m-c]

Some very basic instrumentation (just a few things in XRE_main, and a mark in win32's nsWindow::Show), and some fixes to the timer code.  The fixes are necessary to avoid some crashes and static constructor initialization order problems.
Comment 2 Jeff Muizelaar [:jrmuizel] 2010-04-23 14:45:47 PDT
Created attachment 441171 [details] [diff] [review]
Instrument color management initialization [landed on m-c]
Comment 3 :Ehsan Akhgari 2010-04-23 14:47:46 PDT
Comment on attachment 441171 [details] [diff] [review]
Instrument color management initialization [landed on m-c]

Let this be my first gfx code review.  :-)  rs=me.
Comment 4 Boris Zbarsky [:bz] (TPAC) 2010-04-23 21:59:40 PDT
Comment on attachment 441137 [details] [diff] [review]
very basic instrumentation & fixes [landed on m-c]

firstShow could just move into Show(), right (and stay static)?
Comment 5 Shawn Wilsher :sdwilsh 2010-04-26 11:06:48 PDT
Created attachment 441545 [details] [diff] [review]
Instrument storage/

I'm kinda surprised at how little this is showing up, but I think it'll be useful to have in there.
Comment 6 Vladimir Vukicevic [:vlad] [:vladv] 2010-04-26 11:37:53 PDT
Created attachment 441553 [details] [diff] [review]
log only on main thread

Untested patch to log only on main thread, per sdwilsh's request
Comment 7 Shawn Wilsher :sdwilsh 2010-04-26 11:39:03 PDT
Created attachment 441554 [details] [diff] [review]
Instrument storage/ v1.1 [landed on startup-timeline]

You are too fast for me asuth; I forgot to include the ExecuteStep stuff.
Comment 8 Andrew Sutherland [:asuth] 2010-04-26 11:54:30 PDT
(In reply to comment #7)
> Created an attachment (id=441554) [details]
> Instrument storage/ v1.1

Printing the statement pointer on executeStep does not seem very useful in and of itself.  Is the assumption that whoever is running this will also be running with the storage PR_LOG enabled so they can map the pointer back to the actual SQL?

Should building with the timeline accordingly do something to automatically enable the storage debugging?
Comment 9 Shawn Wilsher :sdwilsh 2010-04-26 14:44:28 PDT
(In reply to comment #8)
> (In reply to comment #7)
> > Created an attachment (id=441554) [details] [details]
> > Instrument storage/ v1.1
> 
> Printing the statement pointer on executeStep does not seem very useful in and
> of itself.  Is the assumption that whoever is running this will also be running
> with the storage PR_LOG enabled so they can map the pointer back to the actual
> SQL?
> 
> Should building with the timeline accordingly do something to automatically
> enable the storage debugging?
I did it like this such that if someone feels like they need to go after the storage statements, they can then get the sql.  Alternatively, I can just dump out the SQL.  Your call.
Comment 10 Johnathan Nightingale [:johnath] 2010-04-26 14:47:38 PDT
Ehsan has created a user repo for work-in-progress instrumentation patches, here:

http://hg.mozilla.org/users/eakhgari_mozilla.com/startup-timeline/
Comment 11 Andrew Sutherland [:asuth] 2010-04-26 14:53:01 PDT
Comment on attachment 441554 [details] [diff] [review]
Instrument storage/ v1.1 [landed on startup-timeline]

Eh, that's probably a question for whoever is going to be consuming this information.
Comment 12 :Ehsan Akhgari 2010-04-26 14:57:12 PDT
Comment on attachment 441171 [details] [diff] [review]
Instrument color management initialization [landed on m-c]

http://hg.mozilla.org/mozilla-central/rev/f236632a9747
Comment 13 :Ehsan Akhgari 2010-04-26 15:18:33 PDT
Comment on attachment 441554 [details] [diff] [review]
Instrument storage/ v1.1 [landed on startup-timeline]

http://hg.mozilla.org/users/eakhgari_mozilla.com/startup-timeline/rev/d5305304b1b6
Comment 14 :Ehsan Akhgari 2010-04-26 15:28:21 PDT
Created attachment 441626 [details] [diff] [review]
Basic instrumentation patch [landed on startup-timeline]

This is attachment 440309 [details] [diff] [review] updated to trunk.  Not sure who to ask review from; throwing the request to vlad for now.
Comment 15 Shawn Wilsher :sdwilsh 2010-04-26 16:16:49 PDT
Created attachment 441646 [details] [diff] [review]
Instrument places [landed on startup-timeline]
Comment 16 Shawn Wilsher :sdwilsh 2010-04-26 16:20:28 PDT
Created attachment 441647 [details] [diff] [review]
Instrument cookies [landed on startup-timeline]
Comment 17 dwitte@gmail.com 2010-04-26 16:38:34 PDT
Comment on attachment 441647 [details] [diff] [review]
Instrument cookies [landed on startup-timeline]

r=dwitte
Comment 18 :Ehsan Akhgari 2010-04-26 20:30:49 PDT
Comment on attachment 441626 [details] [diff] [review]
Basic instrumentation patch [landed on startup-timeline]

http://hg.mozilla.org/users/eakhgari_mozilla.com/startup-timeline/rev/1f7d855a2575
Comment 19 :Ehsan Akhgari 2010-04-26 20:31:13 PDT
CSS loader instrumentation:

http://hg.mozilla.org/users/eakhgari_mozilla.com/startup-timeline/rev/05f4ad81ad1b
Comment 20 :Ehsan Akhgari 2010-04-26 20:33:27 PDT
Comment on attachment 441646 [details] [diff] [review]
Instrument places [landed on startup-timeline]

http://hg.mozilla.org/users/eakhgari_mozilla.com/startup-timeline/rev/035d3997d170
Comment 21 :Ehsan Akhgari 2010-04-26 20:34:57 PDT
Comment on attachment 441647 [details] [diff] [review]
Instrument cookies [landed on startup-timeline]

http://hg.mozilla.org/users/eakhgari_mozilla.com/startup-timeline/rev/035d3997d170
Comment 22 :Ehsan Akhgari 2010-04-26 20:40:43 PDT
Instrumented the cache service:

http://hg.mozilla.org/users/eakhgari_mozilla.com/startup-timeline/rev/dcea20694e7d
Comment 23 :Ehsan Akhgari 2010-04-26 21:07:21 PDT
Instrumented the accessibility service:

http://hg.mozilla.org/users/eakhgari_mozilla.com/startup-timeline/rev/03570bdb8679
Comment 24 :Ehsan Akhgari 2010-04-26 21:29:05 PDT
Instrumented the DNS services:

http://hg.mozilla.org/users/eakhgari_mozilla.com/startup-timeline/rev/03570bdb8679
Comment 25 :Ehsan Akhgari 2010-04-26 21:38:06 PDT
Instrumented http transaction and http handler:

http://hg.mozilla.org/users/eakhgari_mozilla.com/startup-timeline/rev/e7d05f937f41
Comment 26 :Ehsan Akhgari 2010-04-26 21:44:50 PDT
Instrumented the plain text editor service:

http://hg.mozilla.org/users/eakhgari_mozilla.com/startup-timeline/rev/a440cb669738
Comment 27 :Ehsan Akhgari 2010-04-26 21:57:18 PDT
Instrumented the spell checking startup:

http://hg.mozilla.org/users/eakhgari_mozilla.com/startup-timeline/rev/b0deb5751be9

Instrumented the text control frame editor initialization:

http://hg.mozilla.org/users/eakhgari_mozilla.com/startup-timeline/rev/9e8ac8d53a8c
Comment 28 :Ehsan Akhgari 2010-04-26 22:46:48 PDT
Instrument the URI loader service:

http://hg.mozilla.org/users/eakhgari_mozilla.com/startup-timeline/rev/1fa17094d7e2

Use __PRETTY_FUNCTION__ on gcc:

http://hg.mozilla.org/users/eakhgari_mozilla.com/startup-timeline/rev/2569e343fde3

More to come tomorrow!
Comment 29 Shawn Wilsher :sdwilsh 2010-04-26 22:49:21 PDT
(In reply to comment #28)
> More to come tomorrow!
Is there going to be anything left for the rest of us tomorrow? :P
Comment 30 :Ehsan Akhgari 2010-04-26 22:56:01 PDT
(In reply to comment #29)
> (In reply to comment #28)
> > More to come tomorrow!
> Is there going to be anything left for the rest of us tomorrow? :P

Yep!  :P

At the very least, it would be good for people to try the new instrumentations, and see where they don't make sense, and let me know to remove those (or change them).  Although I personally think that it's best if we have more than enough logs.  I think filtering the logs for non-significant stuff is the job of the tool(s) we'll have for making sense of the logs.

Oh, and also, make sure to suggest any significant stuff that needs to be instrumented.  I'm going to focus a lot on the layout internals tomorrow, but I may forget about some things (or not even know about them in the first place!), so suggestions are welcome!
Comment 31 Johnathan Nightingale [:johnath] 2010-04-27 05:13:05 PDT
I want you to know I love you both very much. Keep em coming!
Comment 32 :Ehsan Akhgari 2010-04-27 06:25:10 PDT
Instrument nsContentDLF:

http://hg.mozilla.org/users/eakhgari_mozilla.com/startup-timeline/rev/c97b9a197904
Comment 33 :Ehsan Akhgari 2010-04-27 07:08:14 PDT
Created attachment 441794 [details]
Instrumentation log sample

This is a sample of the log produced by the probes I have so far.  I'm just attaching this for people to use as a sample of how a log would look like in order to get some scripts ready to use it.  Other than that, this log is of absolutely no use, and comes with no guarantees or anything!
Comment 34 Joe Drew (not getting mail) 2010-04-27 08:27:16 PDT
Instrument imagelib:

http://hg.mozilla.org/users/eakhgari_mozilla.com/startup-timeline/rev/10429ee91d06
Comment 35 :Ehsan Akhgari 2010-04-27 08:36:53 PDT
Instrument pres shell (lots of interesting stuff):

http://hg.mozilla.org/users/eakhgari_mozilla.com/startup-timeline/rev/b92732cc2556
Comment 36 :Ehsan Akhgari 2010-04-27 08:54:41 PDT
Joe also instrumented gfx font list init:

http://hg.mozilla.org/users/eakhgari_mozilla.com/startup-timeline/rev/50b5334f6233
Comment 37 :Ehsan Akhgari 2010-04-27 09:00:54 PDT
Instrument XPCOM initialization:

http://hg.mozilla.org/users/eakhgari_mozilla.com/startup-timeline/rev/15c3b88f58e0
Comment 38 :Ehsan Akhgari 2010-04-27 11:12:17 PDT
Instrument the component manager service:

http://hg.mozilla.org/users/eakhgari_mozilla.com/startup-timeline/rev/3918a86919c6

This gives us info on how much time it takes for each service to get instantiated, and it can also be used to get a list of all services we init on the startup path.
Comment 39 :Ehsan Akhgari 2010-04-27 11:24:54 PDT
Created attachment 441863 [details]
Another log example

Again, please do not use this for any other purpose besides tool development!
Comment 40 Shawn Wilsher :sdwilsh 2010-04-27 12:06:49 PDT
Created attachment 441884 [details]
sample log

Less spammy component stuff: http://hg.mozilla.org/users/eakhgari_mozilla.com/startup-timeline/rev/d793aec54693
Comment 41 Shawn Wilsher :sdwilsh 2010-04-27 12:18:28 PDT
Created attachment 441891 [details]
sample log

As requested by johnath.  Loads about:blank from the command line, and once a window is up I <ctrl+l>example.com<enter> and then close the browser.  We can ignore everything after (and including) the first occurrence of example.com.
Comment 42 Johnathan Nightingale [:johnath] 2010-04-27 12:52:29 PDT
Created attachment 441898 [details]
Hacked plot-timeline.py script for generating visual timelines

Originally from Federico, over here - http://people.gnome.org/~federico/news-2006-03.html#timeline-tools

Hacked to use a different regex, since our output doesn't look like strace, and I'm constantly twiddling PIXELS_PER_SECOND so that the left-edge time scale takes an appropriate amount of space (the default value tends to cluster all the lines in the top left corner, since the text for each line takes more vertical space than the line itself.)

Also - this script doesn't like 0-based millisecond timeouts, so I post-process them to use seconds-since-epoch like strace does, in a tiny little perl script to be attached next.
Comment 43 Johnathan Nightingale [:johnath] 2010-04-27 12:57:29 PDT
Created attachment 441899 [details]
post-processor for log files

As mentioned above, this converts the timestamps to a format that plot-timeline.py likes. It also filters subsequent events within the same millisecond, to cut down on places-we're-unlikely-to-focus-anyhow.

My workflow right now:

1) Get/generate log
2) ./post-process.pl < rawlog > postlog
3) python plot-timeline.py -o outfile.png postlog

On log files much larger than a few meg, this eventually OOMs, but that's maybe a healthy post-processing threshold anyhow. Log it all up front, fine, but too much noise just hides the forest in trees.
Comment 44 Vladimir Vukicevic [:vlad] [:vladv] 2010-04-27 13:21:21 PDT
Created attachment 441909 [details] [diff] [review]
instrument main thread events > 5.0 ms [landed on startup-timeline]
Comment 45 :Ehsan Akhgari 2010-04-27 15:07:40 PDT
Comment on attachment 441909 [details] [diff] [review]
instrument main thread events > 5.0 ms [landed on startup-timeline]

http://hg.mozilla.org/users/eakhgari_mozilla.com/startup-timeline/rev/85a72e5bf0be
Comment 46 :Ehsan Akhgari 2010-04-27 15:13:10 PDT
JS execution instrumentation (thanks a lot to Shaver for doing the real work here):

http://hg.mozilla.org/users/eakhgari_mozilla.com/startup-timeline/rev/994610f9cf10

This is *hot* stuff!!!
Comment 47 :Ehsan Akhgari 2010-04-27 15:23:36 PDT
Created attachment 441952 [details]
Sample log on mac

I created this log by setting the home page to about:blank, opening Firefox, quickly going to www.google.com and closing Firefox.
Comment 48 Johnathan Nightingale [:johnath] 2010-04-27 15:37:22 PDT
(In reply to comment #47)
> Created an attachment (id=441952) [details]
> Sample log on mac
> 
> I created this log by setting the home page to about:blank, opening Firefox,
> quickly going to www.google.com and closing Firefox.

Graph based on this log - http://people.mozilla.org/~johnath/startup/startup-with-js.png
Comment 49 Vladimir Vukicevic [:vlad] [:vladv] 2010-04-27 15:48:53 PDT
Created attachment 441959 [details] [diff] [review]
fix exit/min instrumentation [landed on startup-timeline]
Comment 50 Shawn Wilsher :sdwilsh 2010-04-27 18:28:54 PDT
This fills the first gap we had in startup: http://hg.mozilla.org/users/eakhgari_mozilla.com/startup-timeline/rev/f46f04927337
Turns out, a bunch of time is spent doing                 gInterfaceInfoManager->AutoRegisterInterfaces();

Going to log that one next.
Comment 51 Shawn Wilsher :sdwilsh 2010-04-27 18:54:17 PDT
And this fills it the rest of the way: http://hg.mozilla.org/users/eakhgari_mozilla.com/startup-timeline/rev/a9fdbbc67d82

Also, vlad suggested I link my xpt files which makes the hole go away a bunch more too:
./dist/bin/xpt_link.exe firefox.xpt dist/bin/components/*.xpt
Comment 52 Vladimir Vukicevic [:vlad] [:vladv] 2010-04-27 19:05:58 PDT
(Our release builds do the xpt linking, so we should be profiling with them linked -- it's done as part of the packaging step though, would be nice to fix it for normal opt builds.)
Comment 53 Robert Kaiser 2010-04-28 03:35:43 PDT
Just doing a "make package" and trying on what's in the package doesn't work?
Comment 54 Mike Shaver (:shaver -- probably not reading bugmail closely) 2010-04-28 05:25:20 PDT
I think that the create-instance and execute-step calls aren't adding much value to our current analysis.  It seems like "auto-register-interfaces takes 200ms" is the level of detail we want from the timeline.  Similarly with "130ms spent creating the hidden window" or "80ms initializing Places".  We're not trying to replace a profiler here, after all, we're trying to get to the point where we can say "attacking the hidden window, interface registration, and the network service will give us back about 700ms", so that we can build a plan for doing that.

We can post-filter, but I think that we should probably not bother adding sub-instrumentation once we've identified a culprit for a couple of hundred ms.  If we end up with a 200-entry timeline, or so, I think we're going to be in the sweet spot for planning our next work.
Comment 55 David Bolter [:davidb] 2010-04-28 12:16:19 PDT
Created attachment 442140 [details]
Log from a release build on Windows

This is Ehsan, attaching a log from a release build of trunk with the patches in startup-timeline.

I created a release build using the mozconfig file for 1.9.2 release builds, with --enable-jemalloc stripped out (because this machine can't do that), I ran make package && make installer, I executed the installer and installed Firefox to Program Files, I loaded it using a new profile, set home page to about:blank, I ran the build, and loaded google.com right after the window showed up, and I'm attaching the log from that run here.

Please note that this is a very fast machine, so this should be taken with a grain of salt.
Comment 56 :Ehsan Akhgari 2010-04-28 16:21:58 PDT
I landed two experimental patches which hack nspr to use QueryPerformanceCounter on Windows, and it gives us a much better resolution:

http://hg.mozilla.org/users/eakhgari_mozilla.com/startup-timeline/rev/cf7b100f8b66
http://hg.mozilla.org/users/eakhgari_mozilla.com/startup-timeline/rev/da1c9c08d470
Comment 57 David Bolter [:davidb] 2010-04-28 16:31:46 PDT
Created attachment 442234 [details]
Log from a release build on Windows

This is Ehsan again, attaching a new log produced with the same steps in comment 55, but with a build with high resolution timers.  It's much more useful now.
Comment 58 Johnathan Nightingale [:johnath] 2010-04-28 19:57:27 PDT
Comment on attachment 442234 [details]
Log from a release build on Windows

>[   334.03] @ Main Thread Event 05AD80A8 took  33.64 ms
>[   359.17] < (  2)     15.86 ms (    15.86 ms total) - PresShell::Paint (line 5728) (document: chrome://browser/content/browser.xul)
>[   880.98] < (  2)      1.06 ms (     1.06 ms total) - PresShell::Paint (line 5728) (document: chrome://browser/content/browser.xul)
>[  1272.00] < (  4)      1.18 ms (     1.18 ms total) - PresShell::Paint (line 5728) (document: chrome://browser/content/browser.xul)

I'd say we definitely need to understand what's going on between those Paint calls!
Comment 59 :Ehsan Akhgari 2010-04-29 19:43:08 PDT
Instrument the Windows message loop for messages taking longer than 5ms to be processed:

http://hg.mozilla.org/users/eakhgari_mozilla.com/startup-timeline/rev/cff98aa74b6e

Log the nsIRunnable and timer handler functions inside nsThread::ProcessNextMessage on Windows:

http://hg.mozilla.org/users/eakhgari_mozilla.com/startup-timeline/rev/403b3de93880
Comment 60 :Ehsan Akhgari 2010-04-29 19:44:15 PDT
Created attachment 442578 [details]
Log from a release build on Windows

This is the latest log that I captured using my work from today.
Comment 61 :Ehsan Akhgari 2010-04-29 19:47:34 PDT
We'll need to log more about what's happening on appStartup->Run in XRE_main.  There is also some time spent between the first two PresShell::Paint calls, but I'm not sure what's to blame there yet.
Comment 62 :Ehsan Akhgari 2010-04-29 19:50:52 PDT
I made a blog post about the current state of things with regard to startup timeline measurements.

http://ehsanakhgari.org/blog/2010-04-29/making-sense-firefox-startup-time
Comment 63 Ted Mielczarek [:ted.mielczarek] 2010-05-03 07:23:00 PDT
(In reply to comment #60)
> Created an attachment (id=442578) [details]
> Log from a release build on Windows
> 
> This is the latest log that I captured using my work from today.

Is this from a PGO build? Just curious.
Comment 64 :Ehsan Akhgari 2010-05-03 10:09:48 PDT
(In reply to comment #63)
> (In reply to comment #60)
> > Created an attachment (id=442578) [details] [details]
> > Log from a release build on Windows
> > 
> > This is the latest log that I captured using my work from today.
> 
> Is this from a PGO build? Just curious.

That's a good question.  I built it using our release mozconfig, so yes, it should be.  But aren't PGO builds supposed to open the browser, collect profile data, and then relink?  I don't remember ever seeing the browser open up during the build.  And I don't know how to confirm if the build is PGO based on the binaries...
Comment 65 Ted Mielczarek [:ted.mielczarek] 2010-05-03 10:17:35 PDT
PGO isn't a mozconfig option, you need to run "make -f client.mk profiledbuild" to do a PGO build, and assuming you have the PROFILE_GEN_SCRIPT set in your mozconfig, it will execute that between the two build halves.

(Also, you'd notice, since a PGO build will take >1hr to link libxul the second time.)
Comment 66 :Ehsan Akhgari 2010-05-03 12:39:47 PDT
(In reply to comment #65)
> PGO isn't a mozconfig option, you need to run "make -f client.mk profiledbuild"
> to do a PGO build, and assuming you have the PROFILE_GEN_SCRIPT set in your
> mozconfig, it will execute that between the two build halves.

OK, thanks for the tip.  Now, I'm trying that, but I get:

LINK: fatal error LNK1264: /LTCG:PGINSTRUMENT specified but no code generation required; instrumentation failed

Do you know what's happening here?

> (Also, you'd notice, since a PGO build will take >1hr to link libxul the second
> time.)

Heh, that's a good indication!
Comment 67 :Ehsan Akhgari 2010-05-03 15:54:17 PDT
Created attachment 443216 [details]
Log from a PGO build on Windows

This is the latest log, coming from a PGO build.  To my great surprise, it isn't all that different with the previous one.
Comment 68 Johnathan Nightingale [:johnath] 2010-05-04 13:54:52 PDT
Created attachment 443436 [details]
Vista, non-PGO, rel build, non-empty profile log

This is our best put on a comprehensive windows log right now. It uses a profile with the places_generated_med places DB, one extension installed, two tabs of session restore, and some minimal cache data and urlclassifier data.

We strongly suspect that the remaining large gaps (particularly the gap right before onStartRequest for hiddenWindow) are waiting on I/O.
Comment 69 Johnathan Nightingale [:johnath] 2010-05-04 15:22:54 PDT
Created attachment 443467 [details]
Annotated startup log

Ehsan and I did a first pass at labeling the meaningful blocks of work - search for "****"
Comment 70 Johnathan Nightingale [:johnath] 2010-05-05 12:59:07 PDT
Created attachment 443703 [details]
More completely annotated startup log

The only thing that stops this log from being a solid end point is that Ehsan and I figured out the source of the 600ms block, and it was a source of painful embarrassment for us both, so we're going to re-log with that problem fixed. 

Otherwise we believe the annotations to be solid, and will be extracting out major culprits based on this log. We'll cross check those against the new log once it's ready.

And no, I won't say what it was here, you can go read the log if you like.  :) <grumble>
Comment 71 :Ehsan Akhgari 2010-05-05 16:02:58 PDT
Created attachment 443750 [details] [diff] [review]
mozilla-central patch

This is a rolled up patch from the tip of the startup-timeline repo, so that we can get this stuff landed on mozilla-central.

The only thing which I did not include in this patch is the nspr changes.  I landed a workaround which should give us 1ms resolution on Windows, which is way better than 15ms.  We can land the nspr change separately (that is being tracked in bug 563082).
Comment 72 :Ehsan Akhgari 2010-05-05 16:33:02 PDT
Created attachment 443757 [details] [diff] [review]
mozilla-central patch

Fixed a build problem:

http://hg.mozilla.org/users/eakhgari_mozilla.com/startup-timeline/rev/5a8159e006f6
Comment 73 :Ehsan Akhgari 2010-05-05 22:17:57 PDT
Created attachment 443809 [details] [diff] [review]
mozilla-central patch

Fixed one more build issue on Windows caused by MSDN lying about which header to include.
Comment 74 :Ehsan Akhgari 2010-05-07 11:05:46 PDT
Created attachment 444127 [details] [diff] [review]
mozilla-central patch

Updated to the tip of startup-timeline.
Comment 75 Vladimir Vukicevic [:vlad] [:vladv] 2010-05-12 13:07:17 PDT
(this turned into not a tracking bug)

+      // initialize some common services, so we don't pay the cost for these
at odd times later on;
+      // SetWindowCreator -> ChromeRegistry -> IOService ->
SocketTransportService -> (nspr wspm init), Prefs
+      {
+        nsCOMPtr<nsISupports> comp;

in XRE_main, stick this inside #ifdef NS_FUNCTION_TIMER so that we don't change
the normal startup path.


in mozJSSubScriptLoader::LoadSubScript (and maybe elsewhere?):

+#ifdef NS_FUNCTION_TIMER
+    NS_ConvertUTF16toUTF8 url__(aURL);
+    NS_TIME_FUNCTION_FMT("%s (line %d) (url: %s)", MOZ_FUNCTION_NAME,
+                         __LINE__, url__.get());

don't declare a stack NS_ConvertUTF16toUTF8 -- use
NS_ConvertUTF16toUTF8(aURL).get() in the printf args directly.  Better yet, use
NS_LossyConvertUTF16toASCII instead, no need to pay the cost of a UTF8
conversion.

in PresShell, would be much cleaner to have a macro for

+  nsCAutoString docURL__("N/A");
+  nsIURI *uri__ = mDocument->GetDocumentURI();
+  if (uri__) {
+    uri__->GetSpec(docURL__);
+  }

since it's repeated everywhere and looks really ugly; it'll likely bitrot
unless it looks clean/easy enough to fix for someone making a change.

Also, everywhere, don't use braces around:

+  if (aLoadData->mURI) {
+    aLoadData->mURI->GetSpec(spec__);
+  }

The single-line if doesn't need it, and we're trying to minimize code impact on
existing code here (visual/scanning impact).  I'd even write it on just one
line:

if (aLoadData->mURI) aLoadData->mURI->GetSpec(spec__);

I couldn't come up with a good way to declare and initialize in one go; stupid
getters :/

+  NS_TIME_FUNCTION_MIN_FMT(5, "%s (ARGH line %d) (cid: %s)",
MOZ_FUNCTION_NAME, \
+                           __LINE__, cid_buf__)

get rid of the ARGHs?

Please pull the nsStackWalk/nsThread/nsTimer instrumentation into a separate
patch; it's trickier than just the boilerplate stuff elsewhere.
Comment 76 :Ehsan Akhgari 2010-05-13 17:15:46 PDT
Created attachment 445239 [details] [diff] [review]
mozilla-central patch (boilerplate)

All comments addressed.
Comment 77 :Ehsan Akhgari 2010-05-13 17:16:30 PDT
Created attachment 445240 [details] [diff] [review]
mozilla-central patch (non-trivial stuff)

nsThread/nsTimer/nsStackwalk stuff.
Comment 78 Vladimir Vukicevic [:vlad] [:vladv] 2010-05-18 13:29:14 PDT
Comment on attachment 445239 [details] [diff] [review]
mozilla-central patch (boilerplate)

Ok, this is fine, but with the caveat that if module owners come in and want something changed in their particular area, that should happen quickly.  I'm r+'ing this because it's largely boilerplate and not part of the default build, and we want it in pretty quickly.
Comment 79 Vladimir Vukicevic [:vlad] [:vladv] 2010-05-18 13:32:04 PDT
Comment on attachment 445240 [details] [diff] [review]
mozilla-central patch (non-trivial stuff)

Almost all of this is #if 0'd out, so let's land the only part that's not -- specifically,

+#ifdef NS_FUNCTION_TIMER
+    char message[1024] = {'\0'};
+    if (NS_IsMainThread()) {
+        mozilla::FunctionTimer::ft_snprintf(message, sizeof(message), 
+                                            "@ Main Thread Event %p", event.get());
+    }
+    // If message is empty, it means that we're not on the main thread, and
+    // FunctionTimer won't time this function.
+    NS_TIME_FUNCTION_MIN_FMT(5.0, message);
+#endif

Everything else should go for a future patch, if we want the vtable goop.  That stuff should also get isolated out into its own function instead of being in the middle of the nsThread method.
Comment 80 :Ehsan Akhgari 2010-05-19 16:15:46 PDT
(In reply to comment #78)
> (From update of attachment 445239 [details] [diff] [review])
> Ok, this is fine, but with the caveat that if module owners come in and want
> something changed in their particular area, that should happen quickly.  I'm
> r+'ing this because it's largely boilerplate and not part of the default build,
> and we want it in pretty quickly.

Fair enough.
Comment 81 :Ehsan Akhgari 2010-05-19 16:18:12 PDT
Created attachment 446374 [details] [diff] [review]
mozilla-central patch (non-trivial stuff)

I guess the vtable magic stuff can remain on the startup-timeline repo for now, and we can backport it if needed in the future.
Comment 82 Vladimir Vukicevic [:vlad] [:vladv] 2010-05-19 16:21:28 PDT
Comment on attachment 446374 [details] [diff] [review]
mozilla-central patch (non-trivial stuff)

r=me, but you don't need any of the nsTimerImpl changes (they're needed for the parts of this that were #if 0'd out before).

Also, please stick a (void*) before that event.get(), to avoid weirdness that I've seen in printf and .get()
Comment 83 :Ehsan Akhgari 2010-05-19 16:23:14 PDT
Created attachment 446379 [details] [diff] [review]
mozilla-central patch (non-trivial stuff)

Note You need to log in before you can comment on or make changes to this bug.