Closed Bug 522375 Opened 15 years ago Closed 13 years ago

built-in startup time measurement

Categories

(Toolkit :: Startup and Profile System, defect)

defect
Not set
normal

Tracking

()

RESOLVED FIXED

People

(Reporter: dietrich, Assigned: taras.mozilla)

References

(Depends on 1 open bug, Blocks 2 open bugs)

Details

(Whiteboard: [ts])

Attachments

(3 files, 4 obsolete files)

we should have a built-in startup time measurement. at the very least we could record when main() starts and expose that in a way that c++ and js can access.

this will allow easily runnable and self-contained ways of measuring startup time, and measuring the effect of code changes to it. this will not only help internally, but having this available to extension developers makes it easier for them to see their effect on startup time.

not only will this ease development, but it'd be useful information for things like about:support, Test Pilot, etc.
Whiteboard: [ts]
OS: Windows NT → All
Hardware: x86 → All
Blocks: 447581
We need to log every startup to be able to better diagnose slowdowns due to aging profiles, extension installs, fragmentation, etc.

I propose having the firefox shell script on mac/linux stick a timestamp into commandline arguments. Then chrome code could log the completion of startup.

I imagine a startup table(timestamp, time-to-startup-in-ms) and an axillary event table (timestamp, event-type, details) where event-type would be extension-install, sqlite-maintenance, firefox version upgrade, etc.  

about:startup would then provide a canvas-based interactive graph with all this date. This would permit us to easily diagnose what happens to startup over time.
Blocks: 569723
Not for nothin', but johnath and I redesigned this EXACT THING, though using internal timing rather than a shell script argument.  I wonder if I can steal some time to work on a patch...
This came up in the weekly Add-ons Manager meeting and it'd be pretty cool if we could throw some basic functionality for this into the Add-on Compatibility Reporter. We can include a few startup-related numbers with each report, which also includes all of the add-ons the user has installed.

We can then analyze the data and identify any problem add-ons. If you're interested in this I can file a separate bug with links to the extension source and figure out what new fields need to be added server-side.
(In reply to comment #2)
> Not for nothin', but johnath and I redesigned this EXACT THING, though using
> internal timing rather than a shell script argument.  I wonder if I can steal
> some time to work on a patch...

I'm not sure that internal timing is as useful. There is an amazing amount of startup slowless before main() is even run.  There are a couple of linker tricks we can do to get the code to run as early as possible, but it still wouldn't capture the full startup time.
I think that the delta from "started starting it" to "first line of main" will be relatively constant for a given user, compared to sqlite evolution, fragmentation due to update patching (oh god, I just thought through that a bit, oh god), extensions, etc.  The linker trick (like __init or equiv, I guess) would also be a good idea.

And on Win32, where our users are, we don't have a shell script!
(In reply to comment #5)

> fragmentation due to update patching (oh god, I just thought through that a
> bit, oh god)

Sigh, filed bug 570058.
(In reply to comment #5)

> 
> And on Win32, where our users are, we don't have a shell script!

I think win32 is the only platform where this will produce a useful number.
What's "this" in comment 7? internal timing? I'm fine if that's the case, fwiw, though I think characterizing it as "useless" on other platforms is pretty bold.  I'd be pretty surprised if, for a given user, it varied wildly enough to make it useless for tracking extension effects or sqlite-fragmentation/scale issues.
FWIW, I think the main benefit of having a self-measured startup time is not for precision performance analysis like Taras has been doing, but looking for gross abnormalities like "When Extension X is installed, my browser now takes 30 seconds to start instead of 3."
Depends on: about:startup
Attached patch simplified startup logging (obsolete) — Splinter Review
This accomplishes the same thing as bug 522375, but:
* logs to a file
* Does io on a thread
* does not provide about:startup. That would be better to accomplish in an addon
Attachment #501543 - Flags: review?(mh+mozilla)
this part is mostly complete. I just need to decide on comma-escaping strategy(if any) and run it through try.
(In reply to comment #11)
> this part is mostly complete. I just need to decide on comma-escaping
> strategy(if any) and run it through try.

I like shaver's suggesting of using \t instead of , for separators
An even simpler alternative to this patch would be to only expose startup timestamps and let an extension drive the logging, but that would likely incur a higher overhead.
(In reply to comment #13)
> An even simpler alternative to this patch would be to only expose startup
> timestamps and let an extension drive the logging, but that would likely incur
> a higher overhead.

I like the idea of an add-on doing the logging. However, if that does work out, could we pref off the logging? I'd rather have no logging for mobile by default.
Hmm, just read the patch and I noticed that without logging, we have no way of getting the current values. Can we expose a script accessible way to get the launch timestamp?
Comment on attachment 501543 [details] [diff] [review]
simplified startup logging

calc_us_since_startup() is missing for the !LINUX && !XP_WIN case.

How about storing the events instead of serializing them, and making them available to something else that could serialize them, display them, drop them, etc.? They could even be dropped automatically if nothing claims them before some to-be-determined event, or if a given contract is not registered with the component manager. (just dumping ideas)
Attachment #501543 - Flags: review?(mh+mozilla) → review-
Attached patch pure startup measurement (obsolete) — Splinter Review
Consensus seems to be that we need to expose this information and not log all the time. So here is a minimal patch that simply exposes the info to js.
So uneval of 
Components.classes["@mozilla.org/toolkit/app-startup;1"].getService(Components.interfaces.nsIAppStartup2).getStartupInfo()
is
({process:(new Date(1294352628959)), XRE_main:(new Date(1294352628998)), firstPaint:(new Date(1294352632499)), sessionstoreWindowsRestored:(new Date(1294352631395))}) 

I added firstpaint since it's a useful metric and sessionstoreWindowsRestored isn't available on android.
All of the fields can disappear when they aren't available. ie no xre_main/firstpaint in xpcshell, or process on mac. 
These are all available after startup in desktop firefox. 

Issues:
Which testsuite does sessionrestore so I can add a testcase?
I know need to update uuid

Feel free to suggest better names. 

I think this is a pretty simple low-risk way to add startup info such that extensions can build on it to send places(bug 585196) or to implement about:startup(bug 593743).

My interface is different from that in bug 593743. I didn't like how bug 593743 spread out startup info across a multitude of places. This also throws in first-paint and accepts the fact that different environments have different startup indicators(ie xpcshell, desktop firefox, fennec).
Attachment #501543 - Attachment is obsolete: true
Attachment #501826 - Flags: review?(mh+mozilla)
Blocks: 597282
Comment on attachment 501826 [details] [diff] [review]
pure startup measurement

Not much to say about this patch, only a few nits.

> sprintf(thread_stat, "/proc/self/task/%d/stat", (pid_t) syscall(__NR_gettid));

Not that it matters much but why __NR_gettid and not SYS_gettid ?

> if (gProcessCreationTimestamp) {
>    JSObject *date = js_NewDateObjectMsec(cx, gProcessCreationTimestamp/PR_USEC_PER_MSEC);
>    JS_DefineProperty(cx, destArray, "process", OBJECT_TO_JSVAL(date), NULL, NULL, JSPROP_ENUMERATE);     
>  }
>  (...)

I'd create a function for these.

The only complaint I'd have for the implementation, but that could be addressed in a followup, is that it basically requires a new static variable for any additional timestamp. IOW, it's not very flexible.

Other than that, I like it.
Attachment #501826 - Flags: review?(mh+mozilla) → review+
Naming ideas:
process, main, firstPaint, sessionRestored

"sessionstore-browser-state-restored" is another notif that could be sent when session restore finishes, but it's either or, not in addition to the "sessionstore-windows-restored" notif. In any case, we wouldn't want an additional timing variable for "sessionstore-browser-state-restored".
Not that you're going to need to create a new interface for the property, we're way past API freeze so can't be taking changes to nsIAppStartup2
Blocks: 623950
Attached patch startup measurement + testcase (obsolete) — Splinter Review
Renamed members according to mfinkle's suggestion. Added testcase, tested on windows. Running through try now.
Attachment #501826 - Attachment is obsolete: true
Attachment #502172 - Flags: review?
Comment on attachment 502172 [details] [diff] [review]
startup measurement + testcase

Vlad, could you give this a quick review. We need to this be able to track startup performance + addon impact.
Attachment #502172 - Flags: review? → review?(vladimir)
(In reply to comment #19)
> "sessionstore-browser-state-restored" is another notif that could be sent when
> session restore finishes, but it's either or, not in addition to the
> "sessionstore-windows-restored" notif. In any case, we wouldn't want an
> additional timing variable for "sessionstore-browser-state-restored".

sessionstore-browser-state-restored is only used later in the session (originally added for private browsing) and can be fired multiple times. sessionstore-windows-restored is fired once per session near startup. It will be fired regardless of if we're actually restoring a session - I think it was mostly added as a "even more delayed startup" point for nsBrowserGlue.

I don't know how much it matters that that topic is only in browser (and mobile? and seamonkey?), yet we're putting that into toolkit.
I had a thought, but that could be part of a followup: it could be nice to also store user and system time as well for each event (as in getrusage(2)).
(In reply to comment #20)
> Not that you're going to need to create a new interface for the property, we're
> way past API freeze so can't be taking changes to nsIAppStartup2

Latest patch adds yet another interface instead. I haven't looked through the patch fully, but instead of api sprawl maybe we could send the data in an event, or stuff it in a pref.
(In reply to comment #25)
> (In reply to comment #20)
> > Not that you're going to need to create a new interface for the property, we're
> > way past API freeze so can't be taking changes to nsIAppStartup2
> 
> Latest patch adds yet another interface instead. I haven't looked through the
> patch fully, but instead of api sprawl maybe we could send the data in an
> event, or stuff it in a pref.

It adds another interface that we can remove once 4.0 has branched. I think it's the right call when we may have multiple consumers wanting to access the data.
> > sprintf(thread_stat, "/proc/self/task/%d/stat", (pid_t) syscall(__NR_gettid));
> 
> Not that it matters much but why __NR_gettid and not SYS_gettid ?

NR is available on android. SYS isn't.

(In reply to comment #23)
> I don't know how much it matters that that topic is only in browser (and
> mobile? and seamonkey?), yet we're putting that into toolkit.

I don't understand the concern.
This adds mac support and passes try.
Attachment #502172 - Attachment is obsolete: true
Attachment #502410 - Flags: review?(vladimir)
Attachment #502172 - Flags: review?(vladimir)
Comment on attachment 502410 [details] [diff] [review]
startup measurement(lin,win,mac) + testcase

this is fine, though I'm confused as to how it relates to bug 593743 (which, as far as I can tell, has everything it needs to land?)
Attachment #502410 - Flags: review?(vladimir) → review+
(In reply to comment #29)
> Comment on attachment 502410 [details] [diff] [review]
> startup measurement(lin,win,mac) + testcase
> 
> this is fine, though I'm confused as to how it relates to bug 593743 (which, as
> far as I can tell, has everything it needs to land?)

It was backed out for crashes, ts cold regression. There are some other concerns about it. With this stuff, that bug should be re-implementable as an addon. This patch is just cleanup useful minimally invasive platform-level bits from that patch.
Attachment #502410 - Flags: approval2.0?
Note that a useful improvement (especially for fennec) would be to track "plugin-container" startup, too.
(In reply to comment #31)
> Note that a useful improvement (especially for fennec) would be to track
> "plugin-container" startup, too.

Great idea
Just want to make sure it is known (or apologize if its some problem on my end) that this patch doesn't build on Android? I see the same error on the try builds that I was getting locally:

/builds/slave/try-mb-br-andrd-r7-bld/build/try/toolkit/components/startup/src/nsAppStartup.cpp:86:24: error: sys/sysctl.h: No such file or directory

linux/sysctl.h seems to work fine, but also requires an #undef of __unused.
Attachment #502410 - Flags: approval2.0? → approval2.0+
http://hg.mozilla.org/mozilla-central/rev/dfdf3e5dc749
Status: NEW → RESOLVED
Closed: 13 years ago
Resolution: --- → FIXED
Attachment #503587 - Flags: review?(dtownsend)
Attachment #503587 - Flags: review?(dtownsend)
Attachment #503587 - Flags: review+
Attachment #503587 - Flags: approval2.0+
Depends on: 625478
FYI, this broke my --disable-libxul build (on Linux64):

c++  -fno-rtti -fno-exceptions -Wall -Wpointer-arith -Woverloaded-virtual -Wsynth -Wno-ctor-dtor-privacy -Wno-non-virtual-dtor -Wcast-align -Wno-invalid-offsetof -Wno-variadic-macros -Werror=return-type -pedantic -Wno-long-long -fno-strict-aliasing -fshort-wchar -pthread -pipe  -DDEBUG -D_DEBUG -DTRACING -g -Os -freorder-blocks -fomit-frame-pointer  -fPIC -shared -Wl,-z,defs -Wl,-h,libtoolkitcomps.so -o libtoolkitcomps.so  nsToolkitCompsModule.o     -lpthread    -Wl,-rpath-link,/OBJDIR/usr/moz/hg1/dist/bin -Wl,-rpath-link,/usr/local/lib  -Wl,--whole-archive ../find/src/libmozfind_s.a ../typeaheadfind/src/libfastfind_s.a ../startup/src/libappstartup_s.a ../statusfilter/libmozbrwsr_s.a  ../downloads/src/libdownload_s.a ../intl/libintl_s.a  ../alerts/src/libalerts_s.a ../url-classifier/src/liburlclassifier_s.a ../feeds/src/libfeed_s.a -Wl,--no-whole-archive -L../../../modules/zlib/src -lmozz -L../../../dist/bin -L../../../dist/lib -lgkgfx /OBJDIR/usr/moz/hg1/dist/lib/libunicharutil_s.a -L/OBJDIR/usr/moz/hg1/dist/bin -lxpcom -lxpcom_core -lmozalloc  -L/OBJDIR/usr/moz/hg1/dist/bin -L/OBJDIR/usr/moz/hg1/dist/lib -lplds4 -lplc4 -lnspr4 -lpthread -ldl -L/OBJDIR/usr/moz/hg1/dist/bin -lmozjs   -Wl,-Bsymbolic -ldl    
/usr/bin/ld: ../startup/src/libappstartup_s.a(nsAppStartup.o): in function nsAppStartup::GetStartupInfo():/usr/moz/hg1/toolkit/components/startup/src/nsAppStartup.cpp:688: error: undefined reference to 'gXRE_mainTimestamp'
/usr/bin/ld: ../startup/src/libappstartup_s.a(nsAppStartup.o): in function nsAppStartup::GetStartupInfo():/usr/moz/hg1/toolkit/components/startup/src/nsAppStartup.cpp:689: error: undefined reference to 'gFirstPaintTimestamp'
collect2: ld returned 1 exit status
make[6]: *** [libtoolkitcomps.so] Error 1
What, how does this work, about:startup seems to point to invalid URL

The URL is not valid and cannot be loaded.

Tested with latest hourly - maybe takes a clobber ?
(In reply to comment #38)
> What, how does this work, about:startup seems to point to invalid URL
> 
> The URL is not valid and cannot be loaded.
> 
> Tested with latest hourly - maybe takes a clobber ?

There is no about:startup. This just provides building blocks for it.
Ie you can use it from error console or via xpcshell:
./xpcshell -e 'print(new Date() - Components.classes["@mozilla.org/toolkit/app-startup;1"].getService(Components.interfaces.nsIAppStartup_MOZILLA_2_0).getStartupInfo().process)'

I'll post a patch to fix nonxul libs.
Attached patch fix nonlibxul builds (obsolete) — Splinter Review
Attachment #503671 - Flags: review?(dtownsend)
Attachment #503671 - Attachment is obsolete: true
Attachment #503703 - Flags: review?(dtownsend)
Attachment #503671 - Flags: review?(dtownsend)
Assignee: nobody → tglek
Attachment #503703 - Flags: review?(dtownsend) → review+
Blocks: 628363
You need to log in before you can comment on or make changes to this bug.