Closed Bug 560647 Opened 15 years ago Closed 15 years ago

add startup timeline instrumentation

Categories

(Core :: General, defect)

defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla1.9.3a5

People

(Reporter: vlad, Assigned: ehsan.akhgari)

References

(Depends on 2 open bugs, Blocks 1 open bug)

Details

Attachments

(15 files, 18 obsolete files)

8.41 KB, patch
bzbarsky
: review+
Details | Diff | Splinter Review
574 bytes, patch
ehsan.akhgari
: review+
Details | Diff | Splinter Review
3.19 KB, patch
asuth
: review+
Details | Diff | Splinter Review
23.47 KB, patch
Details | Diff | Splinter Review
1.53 KB, patch
Details | Diff | Splinter Review
774 bytes, patch
dwitte
: review+
Details | Diff | Splinter Review
11.30 KB, text/plain
Details
251 bytes, text/plain
Details
1.18 KB, patch
Details | Diff | Splinter Review
1.51 KB, patch
Details | Diff | Splinter Review
86.56 KB, application/x-bzip2
Details
213.12 KB, application/zip
Details
1.81 MB, text/plain
Details
144.05 KB, patch
vlad
: review+
Details | Diff | Splinter Review
3.83 KB, patch
vlad
: review+
Details | Diff | Splinter Review
Attached patch sample instrumentation patch (obsolete) — Splinter Review
Bug 480735 added a lightweight timeline/function timer API for capturing coarse timing data, intended for use in analyzing startup. We need to add instrumentation that actually takes advantage of this API. Documentation is available at: https://wiki.mozilla.org/Platform/FunctionTimer_Usage I've attached an example patch that has a couple places instrumented, to show example usage; it's got a bunch of things mixed in and should be split apart by component.
Attachment #440309 - Attachment is patch: true
Attachment #440309 - Attachment mime type: application/octet-stream → text/plain
Some very basic instrumentation (just a few things in XRE_main, and a mark in win32's nsWindow::Show), and some fixes to the timer code. The fixes are necessary to avoid some crashes and static constructor initialization order problems.
Attachment #441137 - Flags: review?
Attachment #441137 - Flags: review? → review?(benjamin)
Comment on attachment 441171 [details] [diff] [review] Instrument color management initialization [landed on m-c] Let this be my first gfx code review. :-) rs=me.
Attachment #441171 - Flags: review?(vladimir) → review+
Comment on attachment 441137 [details] [diff] [review] very basic instrumentation & fixes [landed on m-c] firstShow could just move into Show(), right (and stay static)?
Attachment #441137 - Flags: review?(benjamin) → review+
Attached patch Instrument storage/ (obsolete) — Splinter Review
I'm kinda surprised at how little this is showing up, but I think it'll be useful to have in there.
Attachment #441545 - Flags: review?(bugmail)
Attachment #441545 - Flags: review?(bugmail) → review+
Attached patch log only on main thread (obsolete) — Splinter Review
Untested patch to log only on main thread, per sdwilsh's request
You are too fast for me asuth; I forgot to include the ExecuteStep stuff.
Attachment #441545 - Attachment is obsolete: true
Attachment #441554 - Flags: review?(bugmail)
(In reply to comment #7) > Created an attachment (id=441554) [details] > Instrument storage/ v1.1 Printing the statement pointer on executeStep does not seem very useful in and of itself. Is the assumption that whoever is running this will also be running with the storage PR_LOG enabled so they can map the pointer back to the actual SQL? Should building with the timeline accordingly do something to automatically enable the storage debugging?
(In reply to comment #8) > (In reply to comment #7) > > Created an attachment (id=441554) [details] [details] > > Instrument storage/ v1.1 > > Printing the statement pointer on executeStep does not seem very useful in and > of itself. Is the assumption that whoever is running this will also be running > with the storage PR_LOG enabled so they can map the pointer back to the actual > SQL? > > Should building with the timeline accordingly do something to automatically > enable the storage debugging? I did it like this such that if someone feels like they need to go after the storage statements, they can then get the sql. Alternatively, I can just dump out the SQL. Your call.
Ehsan has created a user repo for work-in-progress instrumentation patches, here: http://hg.mozilla.org/users/eakhgari_mozilla.com/startup-timeline/
Comment on attachment 441554 [details] [diff] [review] Instrument storage/ v1.1 [landed on startup-timeline] Eh, that's probably a question for whoever is going to be consuming this information.
Attachment #441554 - Flags: review?(bugmail) → review+
Attachment #441137 - Attachment description: very basic instrumentation & fixes → very basic instrumentation & fixes [landed on m-c]
Comment on attachment 441171 [details] [diff] [review] Instrument color management initialization [landed on m-c] http://hg.mozilla.org/mozilla-central/rev/f236632a9747
Attachment #441171 - Attachment description: Instrument color management initialization → Instrument color management initialization [landed on m-c]
Attachment #441554 - Attachment description: Instrument storage/ v1.1 → Instrument storage/ v1.1 [landed on startup-timeline]
This is attachment 440309 [details] [diff] [review] updated to trunk. Not sure who to ask review from; throwing the request to vlad for now.
Attachment #440309 - Attachment is obsolete: true
Attachment #441626 - Flags: review?(vladimir)
Attachment #441626 - Attachment is patch: true
Attachment #441626 - Attachment mime type: application/octet-stream → text/plain
Attachment #441647 - Attachment is patch: true
Attachment #441647 - Attachment mime type: application/octet-stream → text/plain
Comment on attachment 441647 [details] [diff] [review] Instrument cookies [landed on startup-timeline] r=dwitte
Attachment #441647 - Flags: review?(dwitte) → review+
Attachment #441626 - Attachment description: Basic instrumentation patch → Basic instrumentation patch [landed on startup-timeline]
Attachment #441626 - Flags: review?(vladimir)
Attachment #441646 - Attachment description: Instrument places → Instrument places [landed on startup-timeline]
Attachment #441646 - Flags: review?(dietrich)
Attachment #441647 - Attachment description: Instrument cookies → Instrument cookies [landed on startup-timeline]
(In reply to comment #28) > More to come tomorrow! Is there going to be anything left for the rest of us tomorrow? :P
(In reply to comment #29) > (In reply to comment #28) > > More to come tomorrow! > Is there going to be anything left for the rest of us tomorrow? :P Yep! :P At the very least, it would be good for people to try the new instrumentations, and see where they don't make sense, and let me know to remove those (or change them). Although I personally think that it's best if we have more than enough logs. I think filtering the logs for non-significant stuff is the job of the tool(s) we'll have for making sense of the logs. Oh, and also, make sure to suggest any significant stuff that needs to be instrumented. I'm going to focus a lot on the layout internals tomorrow, but I may forget about some things (or not even know about them in the first place!), so suggestions are welcome!
I want you to know I love you both very much. Keep em coming!
Attached file Instrumentation log sample (obsolete) —
This is a sample of the log produced by the probes I have so far. I'm just attaching this for people to use as a sample of how a log would look like in order to get some scripts ready to use it. Other than that, this log is of absolutely no use, and comes with no guarantees or anything!
Instrument the component manager service: http://hg.mozilla.org/users/eakhgari_mozilla.com/startup-timeline/rev/3918a86919c6 This gives us info on how much time it takes for each service to get instantiated, and it can also be used to get a list of all services we init on the startup path.
Attached file Another log example (obsolete) —
Again, please do not use this for any other purpose besides tool development!
Attached file sample log (obsolete) —
Attachment #441794 - Attachment is obsolete: true
Attachment #441863 - Attachment is obsolete: true
Attached file sample log (obsolete) —
As requested by johnath. Loads about:blank from the command line, and once a window is up I <ctrl+l>example.com<enter> and then close the browser. We can ignore everything after (and including) the first occurrence of example.com.
Attachment #441884 - Attachment is obsolete: true
Originally from Federico, over here - http://people.gnome.org/~federico/news-2006-03.html#timeline-tools Hacked to use a different regex, since our output doesn't look like strace, and I'm constantly twiddling PIXELS_PER_SECOND so that the left-edge time scale takes an appropriate amount of space (the default value tends to cluster all the lines in the top left corner, since the text for each line takes more vertical space than the line itself.) Also - this script doesn't like 0-based millisecond timeouts, so I post-process them to use seconds-since-epoch like strace does, in a tiny little perl script to be attached next.
As mentioned above, this converts the timestamps to a format that plot-timeline.py likes. It also filters subsequent events within the same millisecond, to cut down on places-we're-unlikely-to-focus-anyhow. My workflow right now: 1) Get/generate log 2) ./post-process.pl < rawlog > postlog 3) python plot-timeline.py -o outfile.png postlog On log files much larger than a few meg, this eventually OOMs, but that's maybe a healthy post-processing threshold anyhow. Log it all up front, fine, but too much noise just hides the forest in trees.
Comment on attachment 441909 [details] [diff] [review] instrument main thread events > 5.0 ms [landed on startup-timeline] http://hg.mozilla.org/users/eakhgari_mozilla.com/startup-timeline/rev/85a72e5bf0be
Attachment #441909 - Attachment description: instrument main thread events > 5.0 ms → instrument main thread events > 5.0 ms [landed on startup-timeline]
JS execution instrumentation (thanks a lot to Shaver for doing the real work here): http://hg.mozilla.org/users/eakhgari_mozilla.com/startup-timeline/rev/994610f9cf10 This is *hot* stuff!!!
Attached file Sample log on mac (obsolete) —
I created this log by setting the home page to about:blank, opening Firefox, quickly going to www.google.com and closing Firefox.
(In reply to comment #47) > Created an attachment (id=441952) [details] > Sample log on mac > > I created this log by setting the home page to about:blank, opening Firefox, > quickly going to www.google.com and closing Firefox. Graph based on this log - http://people.mozilla.org/~johnath/startup/startup-with-js.png
This fills the first gap we had in startup: http://hg.mozilla.org/users/eakhgari_mozilla.com/startup-timeline/rev/f46f04927337 Turns out, a bunch of time is spent doing gInterfaceInfoManager->AutoRegisterInterfaces(); Going to log that one next.
And this fills it the rest of the way: http://hg.mozilla.org/users/eakhgari_mozilla.com/startup-timeline/rev/a9fdbbc67d82 Also, vlad suggested I link my xpt files which makes the hole go away a bunch more too: ./dist/bin/xpt_link.exe firefox.xpt dist/bin/components/*.xpt
(Our release builds do the xpt linking, so we should be profiling with them linked -- it's done as part of the packaging step though, would be nice to fix it for normal opt builds.)
Just doing a "make package" and trying on what's in the package doesn't work?
I think that the create-instance and execute-step calls aren't adding much value to our current analysis. It seems like "auto-register-interfaces takes 200ms" is the level of detail we want from the timeline. Similarly with "130ms spent creating the hidden window" or "80ms initializing Places". We're not trying to replace a profiler here, after all, we're trying to get to the point where we can say "attacking the hidden window, interface registration, and the network service will give us back about 700ms", so that we can build a plan for doing that. We can post-filter, but I think that we should probably not bother adding sub-instrumentation once we've identified a culprit for a couple of hundred ms. If we end up with a 200-entry timeline, or so, I think we're going to be in the sweet spot for planning our next work.
Attached file Log from a release build on Windows (obsolete) —
This is Ehsan, attaching a log from a release build of trunk with the patches in startup-timeline. I created a release build using the mozconfig file for 1.9.2 release builds, with --enable-jemalloc stripped out (because this machine can't do that), I ran make package && make installer, I executed the installer and installed Firefox to Program Files, I loaded it using a new profile, set home page to about:blank, I ran the build, and loaded google.com right after the window showed up, and I'm attaching the log from that run here. Please note that this is a very fast machine, so this should be taken with a grain of salt.
Attachment #441891 - Attachment is obsolete: true
Attachment #441952 - Attachment is obsolete: true
I landed two experimental patches which hack nspr to use QueryPerformanceCounter on Windows, and it gives us a much better resolution: http://hg.mozilla.org/users/eakhgari_mozilla.com/startup-timeline/rev/cf7b100f8b66 http://hg.mozilla.org/users/eakhgari_mozilla.com/startup-timeline/rev/da1c9c08d470
Attachment #441959 - Attachment description: fix exit/min instrumentation → fix exit/min instrumentation [landed on startup-timeline]
Attached file Log from a release build on Windows (obsolete) —
This is Ehsan again, attaching a new log produced with the same steps in comment 55, but with a build with high resolution timers. It's much more useful now.
Attachment #442140 - Attachment is obsolete: true
Comment on attachment 442234 [details] Log from a release build on Windows >[ 334.03] @ Main Thread Event 05AD80A8 took 33.64 ms >[ 359.17] < ( 2) 15.86 ms ( 15.86 ms total) - PresShell::Paint (line 5728) (document: chrome://browser/content/browser.xul) >[ 880.98] < ( 2) 1.06 ms ( 1.06 ms total) - PresShell::Paint (line 5728) (document: chrome://browser/content/browser.xul) >[ 1272.00] < ( 4) 1.18 ms ( 1.18 ms total) - PresShell::Paint (line 5728) (document: chrome://browser/content/browser.xul) I'd say we definitely need to understand what's going on between those Paint calls!
Instrument the Windows message loop for messages taking longer than 5ms to be processed: http://hg.mozilla.org/users/eakhgari_mozilla.com/startup-timeline/rev/cff98aa74b6e Log the nsIRunnable and timer handler functions inside nsThread::ProcessNextMessage on Windows: http://hg.mozilla.org/users/eakhgari_mozilla.com/startup-timeline/rev/403b3de93880
Attached file Log from a release build on Windows (obsolete) —
This is the latest log that I captured using my work from today.
Attachment #442234 - Attachment is obsolete: true
We'll need to log more about what's happening on appStartup->Run in XRE_main. There is also some time spent between the first two PresShell::Paint calls, but I'm not sure what's to blame there yet.
I made a blog post about the current state of things with regard to startup timeline measurements. http://ehsanakhgari.org/blog/2010-04-29/making-sense-firefox-startup-time
(In reply to comment #60) > Created an attachment (id=442578) [details] > Log from a release build on Windows > > This is the latest log that I captured using my work from today. Is this from a PGO build? Just curious.
(In reply to comment #63) > (In reply to comment #60) > > Created an attachment (id=442578) [details] [details] > > Log from a release build on Windows > > > > This is the latest log that I captured using my work from today. > > Is this from a PGO build? Just curious. That's a good question. I built it using our release mozconfig, so yes, it should be. But aren't PGO builds supposed to open the browser, collect profile data, and then relink? I don't remember ever seeing the browser open up during the build. And I don't know how to confirm if the build is PGO based on the binaries...
PGO isn't a mozconfig option, you need to run "make -f client.mk profiledbuild" to do a PGO build, and assuming you have the PROFILE_GEN_SCRIPT set in your mozconfig, it will execute that between the two build halves. (Also, you'd notice, since a PGO build will take >1hr to link libxul the second time.)
(In reply to comment #65) > PGO isn't a mozconfig option, you need to run "make -f client.mk profiledbuild" > to do a PGO build, and assuming you have the PROFILE_GEN_SCRIPT set in your > mozconfig, it will execute that between the two build halves. OK, thanks for the tip. Now, I'm trying that, but I get: LINK: fatal error LNK1264: /LTCG:PGINSTRUMENT specified but no code generation required; instrumentation failed Do you know what's happening here? > (Also, you'd notice, since a PGO build will take >1hr to link libxul the second > time.) Heh, that's a good indication!
This is the latest log, coming from a PGO build. To my great surprise, it isn't all that different with the previous one.
Attachment #442578 - Attachment is obsolete: true
Attachment #443216 - Attachment mime type: application/octet-stream → application/x-bzip2
This is our best put on a comprehensive windows log right now. It uses a profile with the places_generated_med places DB, one extension installed, two tabs of session restore, and some minimal cache data and urlclassifier data. We strongly suspect that the remaining large gaps (particularly the gap right before onStartRequest for hiddenWindow) are waiting on I/O.
Attached file Annotated startup log (obsolete) —
Ehsan and I did a first pass at labeling the meaningful blocks of work - search for "****"
Depends on: 563082
The only thing that stops this log from being a solid end point is that Ehsan and I figured out the source of the 600ms block, and it was a source of painful embarrassment for us both, so we're going to re-log with that problem fixed. Otherwise we believe the annotations to be solid, and will be extracting out major culprits based on this log. We'll cross check those against the new log once it's ready. And no, I won't say what it was here, you can go read the log if you like. :) <grumble>
Attachment #443467 - Attachment is obsolete: true
Attached patch mozilla-central patch (obsolete) — Splinter Review
This is a rolled up patch from the tip of the startup-timeline repo, so that we can get this stuff landed on mozilla-central. The only thing which I did not include in this patch is the nspr changes. I landed a workaround which should give us 1ms resolution on Windows, which is way better than 15ms. We can land the nspr change separately (that is being tracked in bug 563082).
Assignee: nobody → ehsan
Status: NEW → ASSIGNED
Attachment #443750 - Flags: review?(vladimir)
Attached patch mozilla-central patch (obsolete) — Splinter Review
Attachment #443750 - Attachment is obsolete: true
Attachment #443757 - Flags: review?(vladimir)
Attachment #443750 - Flags: review?(vladimir)
Attached patch mozilla-central patch (obsolete) — Splinter Review
Fixed one more build issue on Windows caused by MSDN lying about which header to include.
Attachment #443757 - Attachment is obsolete: true
Attachment #443809 - Flags: review?(vladimir)
Attachment #443757 - Flags: review?(vladimir)
Attached patch mozilla-central patch (obsolete) — Splinter Review
Updated to the tip of startup-timeline.
Attachment #443809 - Attachment is obsolete: true
Attachment #444127 - Flags: review?(vladimir)
Attachment #443809 - Flags: review?(vladimir)
(this turned into not a tracking bug) + // initialize some common services, so we don't pay the cost for these at odd times later on; + // SetWindowCreator -> ChromeRegistry -> IOService -> SocketTransportService -> (nspr wspm init), Prefs + { + nsCOMPtr<nsISupports> comp; in XRE_main, stick this inside #ifdef NS_FUNCTION_TIMER so that we don't change the normal startup path. in mozJSSubScriptLoader::LoadSubScript (and maybe elsewhere?): +#ifdef NS_FUNCTION_TIMER + NS_ConvertUTF16toUTF8 url__(aURL); + NS_TIME_FUNCTION_FMT("%s (line %d) (url: %s)", MOZ_FUNCTION_NAME, + __LINE__, url__.get()); don't declare a stack NS_ConvertUTF16toUTF8 -- use NS_ConvertUTF16toUTF8(aURL).get() in the printf args directly. Better yet, use NS_LossyConvertUTF16toASCII instead, no need to pay the cost of a UTF8 conversion. in PresShell, would be much cleaner to have a macro for + nsCAutoString docURL__("N/A"); + nsIURI *uri__ = mDocument->GetDocumentURI(); + if (uri__) { + uri__->GetSpec(docURL__); + } since it's repeated everywhere and looks really ugly; it'll likely bitrot unless it looks clean/easy enough to fix for someone making a change. Also, everywhere, don't use braces around: + if (aLoadData->mURI) { + aLoadData->mURI->GetSpec(spec__); + } The single-line if doesn't need it, and we're trying to minimize code impact on existing code here (visual/scanning impact). I'd even write it on just one line: if (aLoadData->mURI) aLoadData->mURI->GetSpec(spec__); I couldn't come up with a good way to declare and initialize in one go; stupid getters :/ + NS_TIME_FUNCTION_MIN_FMT(5, "%s (ARGH line %d) (cid: %s)", MOZ_FUNCTION_NAME, \ + __LINE__, cid_buf__) get rid of the ARGHs? Please pull the nsStackWalk/nsThread/nsTimer instrumentation into a separate patch; it's trickier than just the boilerplate stuff elsewhere.
Summary: [tracking] add startup timeline instrumentation → add startup timeline instrumentation
All comments addressed.
Attachment #444127 - Attachment is obsolete: true
Attachment #445239 - Flags: review?(vladimir)
Attachment #444127 - Flags: review?(vladimir)
nsThread/nsTimer/nsStackwalk stuff.
Attachment #445240 - Flags: review?(vladimir)
Comment on attachment 445239 [details] [diff] [review] mozilla-central patch (boilerplate) Ok, this is fine, but with the caveat that if module owners come in and want something changed in their particular area, that should happen quickly. I'm r+'ing this because it's largely boilerplate and not part of the default build, and we want it in pretty quickly.
Attachment #445239 - Flags: review?(vladimir) → review+
Comment on attachment 445240 [details] [diff] [review] mozilla-central patch (non-trivial stuff) Almost all of this is #if 0'd out, so let's land the only part that's not -- specifically, +#ifdef NS_FUNCTION_TIMER + char message[1024] = {'\0'}; + if (NS_IsMainThread()) { + mozilla::FunctionTimer::ft_snprintf(message, sizeof(message), + "@ Main Thread Event %p", event.get()); + } + // If message is empty, it means that we're not on the main thread, and + // FunctionTimer won't time this function. + NS_TIME_FUNCTION_MIN_FMT(5.0, message); +#endif Everything else should go for a future patch, if we want the vtable goop. That stuff should also get isolated out into its own function instead of being in the middle of the nsThread method.
Attachment #445240 - Flags: review?(vladimir) → review-
(In reply to comment #78) > (From update of attachment 445239 [details] [diff] [review]) > Ok, this is fine, but with the caveat that if module owners come in and want > something changed in their particular area, that should happen quickly. I'm > r+'ing this because it's largely boilerplate and not part of the default build, > and we want it in pretty quickly. Fair enough.
I guess the vtable magic stuff can remain on the startup-timeline repo for now, and we can backport it if needed in the future.
Attachment #445240 - Attachment is obsolete: true
Attachment #446374 - Flags: review?(vladimir)
Comment on attachment 446374 [details] [diff] [review] mozilla-central patch (non-trivial stuff) r=me, but you don't need any of the nsTimerImpl changes (they're needed for the parts of this that were #if 0'd out before). Also, please stick a (void*) before that event.get(), to avoid weirdness that I've seen in printf and .get()
Attachment #446374 - Flags: review?(vladimir) → review+
Attachment #446374 - Attachment is obsolete: true
Attachment #446379 - Flags: review?(vladimir)
Attachment #446379 - Flags: review?(vladimir) → review+
Status: ASSIGNED → RESOLVED
Closed: 15 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla1.9.3a5
Depends on: 568324
Depends on: 568912
Depends on: 568915
Depends on: 548612
Depends on: 568917
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: