Closed Bug 711297 Opened 13 years ago Closed 13 years ago

Report startup cache age via telemetry

Categories

(Core :: XPCOM, defect)

defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla13

People

(Reporter: taras.mozilla, Assigned: froydnj)

References

Details

Attachments

(4 files, 10 obsolete files)

1.86 KB, patch
froydnj
: review+
vladan
: checkin+
Details | Diff | Splinter Review
3.12 KB, patch
froydnj
: review+
vladan
: checkin+
Details | Diff | Splinter Review
7.44 KB, patch
froydnj
: review+
vladan
: checkin+
Details | Diff | Splinter Review
5.83 KB, patch
froydnj
: review+
vladan
: checkin+
Details | Diff | Splinter Review
It's not clear what the lifetime of entries in startup cache is for a typical user. We wipe it it on upgrades, when corrupted, addon install.
1) What is the startup cache? 2) How is this useful?
Startup cache contains parsed chrome js and other system-specific caches. It would be good to understand how long startup cache lasts before it is corrupted/blown away, especially once we put more stuff like bug 705594 into it. It's in the "local" profile directory. See http://dxr.mozilla.org/mozilla/mozilla-central/startupcache/StartupCache.h.html#l124 for details.
nsIFile and friends don't expose a createdTime for files, mostly because not all platforms usefully support that. Tracking cache age via lastModifiedTime doesn't seem very useful. Instead, I opted to store the creation time directly in the zip file in the comment field of the...uh...end directory struct (not sure what the official zip name for it is). Doing this avoids raciness problems with a separate file. To do this, though, nsZipArchive needs to know about reading comments. nsZipWriter already knows about writing comments, so there's no modifications to be made there.
Attachment #582939 - Flags: review?(taras.mozilla)
Now that we can read comments, it's a simple matter to compute the age of the startup cache when we open it.
Attachment #582945 - Flags: review?(taras.mozilla)
Comment on attachment 582939 [details] [diff] [review] part 1 - enhance nsZipArchive with comment functionality ZipArchive uses zero-copy strings elsewhere. The right way to do this is to store the comment offset and fetch it ondemand. ie do stuff you did in BuildFileList within GetComment(); wrap the map-asseccessing part of GetComment in MOZ_WIN_MEM_TRY_*
Attachment #582939 - Flags: review?(taras.mozilla) → review-
Comment on attachment 582945 [details] [diff] [review] part 2 - store/retrieve creation time and feed age to Telemetry I would skip atoll and use memcmp to copy PRTime as a binary. StartupCache is already machine-specific. This might make the string api upset, this is fine too.
Attachment #582945 - Flags: review?(taras.mozilla) → review+
If I'm going to be storing raw pointers and lengths anyway, I might as well ditch the nsCString bits and just have nsZipArchive clients work directly with the pointers themselves. Or does that cause issues with whatever memory mapping magic is used by nsZipArchive?
Now with on-demand copying. I realized I was talking nonsense about raw pointers and whatnot.
Attachment #582939 - Attachment is obsolete: true
Attachment #583174 - Flags: review?(taras.mozilla)
Now with binary storage. Requesting re-review because of bit fiddling and because I restructured LoadArchive to use return-early style to avoid massive indentation.
Attachment #582945 - Attachment is obsolete: true
Attachment #583177 - Flags: review?(taras.mozilla)
Comment on attachment 583174 [details] [diff] [review] part 1 - enhance nsZipArchive with comment functionality, v2 ># HG changeset patch ># User Nathan Froyd <froydnj@gmail.com> ># Date 1324337115 18000 >Bug 711297 - add comment reading to nsZipArchive; r=taras > > >diff --git a/modules/libjar/nsZipArchive.cpp b/modules/libjar/nsZipArchive.cpp >index 399227d..63cf608 100644 >--- a/modules/libjar/nsZipArchive.cpp >+++ b/modules/libjar/nsZipArchive.cpp >@@ -666,6 +666,18 @@ MOZ_WIN_MEM_TRY_BEGIN > if (sig != ENDSIG) > return NS_ERROR_FILE_CORRUPTED; > >+ // Make the comment available for consumers. >+ if (endp - buf >= ZIPEND_SIZE) { >+ ZipEnd *zipend = (ZipEnd *)buf; >+ >+ buf += ZIPEND_SIZE; >+ PRUint16 commentlen = xtoint(zipend->commentfield_len); >+ if (endp - buf >= commentlen) { >+ char *comment = (char *)buf; >+ mComment.Assign(comment, commentlen); >+ } >+ } This doesn't seem right. All of this stuff seems like it can occur in GetComment(). also irc comments.
Attachment #583174 - Flags: review?(taras.mozilla) → review-
Thinkos fixed. For posterity: the IRC discussion was about better ways to do this, maybe by looking at modtimes of individual entries or somesuch. The nsZipArchive API isn't really conducive to such things, so hijacking the comment field stays.
Attachment #583174 - Attachment is obsolete: true
Attachment #583513 - Flags: review?(taras.mozilla)
Fix some thinkos, clean includes, etc. Nothing big.
Attachment #583177 - Attachment is obsolete: true
Attachment #583177 - Flags: review?(taras.mozilla)
Attachment #583514 - Flags: review?(taras.mozilla)
Comment on attachment 583513 [details] [diff] [review] part 1 - enhance nsZipArchive with comment functionality, v3 nsCString -> nsACString is our usual convention.
Attachment #583513 - Flags: review?(taras.mozilla) → review+
Comment on attachment 583514 [details] [diff] [review] part 2 - store/retrieve creation time and feed age to Telemetry, v3 >+ PRTime now = PR_Now(); >+ if (!mArchive) { >+ nsCString comment; >+ char buf[sizeof(now)]; >+ memcpy(buf, &now, sizeof(buf)); >+ comment.Assign(&buf[0], sizeof(buf)); >+ zipW->SetComment(comment); >+ } This reads like java code. Would comment.Assign((char*)&now, sizeof(now)) be less clear? >+ >+ nsresult LoadArchive(enum TelemetrifyAge flag); > nsresult Init(); > void WriteToDisk(); > nsresult ResetStartupWriteTimer(); >diff --git a/toolkit/components/telemetry/TelemetryHistograms.h b/toolkit/components/telemetry/TelemetryHistograms.h >index 3a0552f..eb04e79 100644 >--- a/toolkit/components/telemetry/TelemetryHistograms.h >+++ b/toolkit/components/telemetry/TelemetryHistograms.h >@@ -123,6 +123,7 @@ HISTOGRAM(MAC_INITFONTLIST_TOTAL, 1, 30000, 10, EXPONENTIAL, "gfxMacPlatformFont > > HISTOGRAM(SYSTEM_FONT_FALLBACK, 1, 100000, 50, EXPONENTIAL, "System font fallback (us)") > HISTOGRAM(SYSTEM_FONT_FALLBACK_FIRST, 1, 40000, 20, EXPONENTIAL, "System font fallback, first call (ms)") >+HISTOGRAM(STARTUP_CACHE_AGE_HOURS, 1, 3000, 20, EXPONENTIAL, "Startup cache age (hours)") > > HISTOGRAM_BOOLEAN(SHUTDOWN_OK, "Did the browser start after a successful shutdown") > Before we land this bug, please add a 3rd patch with a testcase. Modify some startup cache test to make sure that the telemetry histogram is being filled.
Attachment #583514 - Flags: review?(taras.mozilla) → review+
nsCString -> nsACString. Carrying over r+.
Attachment #583513 - Attachment is obsolete: true
Attachment #584754 - Flags: review+
Address Java-isms in the code. This patch also changes the age computation to be ceil(# of hours) rather than floor(# of hours) as it was in previous versions. This is required for tests to be reasonable, since histogram buckets must start at 1, rather than 0, and we'd rather not have a test that takes an hour to run. I believe this change to be relatively non-controversial, so I'm going to carry over r+ on this new patch.
Attachment #583514 - Attachment is obsolete: true
Attachment #584755 - Flags: review+
Since the tests for nsIStartupCache all run in a single process and we don't normally record startup cache age during a single process's lifetime, we need some way of changing this for tests. nsIStartupCache's methods are basically all for testing purposes anyway, so there doesn't seem to be much harm in adding one more.
Attachment #584757 - Flags: review?(mwu)
This patch adds bits to TestStartupCache.cpp to ensure that we do record interesting information in the STARTUP_CACHE_AGE_HOURS histogram. I will admit upfront that it is ugly. However, given that: 1) nsITelemetry wasn't really designed to be used from C++, but rather from JavaScript; 1a) Making it more suitable means having a discussion about whether the ipc/chromium bits should be part of our SDK API, which probably isn't worth having for just one test; 2) Telemetry::GetHistogramById isn't exported from libxul; 2a) Making it so: see 1a; 3) Rewriting the whole test in JavaScript: ugh; This solution is probably the least bad way to do things. I think it would be less ugly if early returns from main() were used instead of the futzing with use_js, but I was trying to keep the execute-all-tests style of the current program. I also don't make any claims about whether I used the JS api 100% correctly; it works well enough for the purposes herein.
Attachment #584760 - Flags: review?(mwu)
Comment on attachment 584757 [details] [diff] [review] part 3 - add recordAgesAlways method to nsIStartupCache Review of attachment 584757 [details] [diff] [review]: ----------------------------------------------------------------- ::: startupcache/StartupCache.cpp @@ +154,4 @@ > > StartupCache* StartupCache::gStartupCache; > bool StartupCache::gShutdownInitiated; > +enum StartupCache::TelemetrifyAge StartupCache::postFlushAgeAction = StartupCache::IGNORE_AGE; IGNORE_AGE is 1, and postFlushAgeAction is a global. I suggest changing the TelemetrifyAge enum to allow this to be initialized to zero. ::: startupcache/StartupCache.h @@ +170,4 @@ > RECORD_AGE = 0, > IGNORE_AGE = 1 > }; > + static enum TelemetrifyAge postFlushAgeAction; Something like gPostFlushAgeAction would be more consistent with how static variables are named in this class.
Attachment #584757 - Flags: review?(mwu) → review+
Address review comments. Carrying over r+.
Attachment #584757 - Attachment is obsolete: true
Attachment #586445 - Flags: review+
Comment on attachment 584760 [details] [diff] [review] part 4 - add test STARTUP_CACHE_AGE_HOURS histogram Well, I don't think there's anything less painful than this at the moment, since we don't have access to the right API to do this directly. Unfortunately I don't know this test code or telemetry particularly well so taras might be able to do a better review.
Attachment #584760 - Flags: review?(mwu) → review?(taras.mozilla)
Attachment #584760 - Flags: review?(taras.mozilla) → review+
Keywords: checkin-needed
Pushing this on try showed test failures on debug builds, so removing checkin-needed while I debug.
Keywords: checkin-needed
Comment on attachment 584760 [details] [diff] [review] part 4 - add test STARTUP_CACHE_AGE_HOURS histogram I'm seeing failures for this test on Try debug builds. The error is: Assertion failure: cx->runtime->requestDepth || cx->runtime->gcRunning, at e:/builds/moz2_slave/try-w32-dbg/build/js/src/jscntxt.cpp:1288 It doesn't appear to happen all the time: I can run Linux debug builds locally just fine, and my Linux Try debug builds sometimes win, but it seems to fail consistently elsewhere. Luke, can you take a look at my JSAPI usage and tell me what bits I might be missing?
Attachment #584760 - Flags: feedback?(luke)
Comment on attachment 584760 [details] [diff] [review] part 4 - add test STARTUP_CACHE_AGE_HOURS histogram jdm pointed out JSAutoRequest and that looks like the solution to my problems.
Attachment #584760 - Flags: feedback?(luke)
Fixed test so it will work on debug builds with JS checking. Carrying over r+.
Attachment #584760 - Attachment is obsolete: true
Attachment #594434 - Flags: review+
Keywords: checkin-needed
Whiteboard: [autoland-try]
Whiteboard: [autoland-try] → [autoland-in-queue]
Autoland Patchset: Patches: 584754, 584755, 586445, 594434 Branch: mozilla-central => try Destination: http://hg.mozilla.org/try/rev/3b9b982beafb Try run started, revision 3b9b982beafb. To cancel or monitor the job, see: https://tbpl.mozilla.org/?tree=Try&rev=3b9b982beafb
Try run for 3b9b982beafb is complete. Detailed breakdown of the results available here: https://tbpl.mozilla.org/?tree=Try&rev=3b9b982beafb Results (out of 206 total builds): success: 186 warnings: 20 Builds (or logs if builds failed) available at: http://ftp.mozilla.org/pub/mozilla.org/firefox/try-builds/autolanduser@mozilla.com-3b9b982beafb
Whiteboard: [autoland-in-queue]
Unfortunately failed make check on Windows: TEST-UNEXPECTED-FAIL | histogram didn't record samples https://tbpl.mozilla.org/?tree=Try&rev=3b9b982beafb
Keywords: checkin-needed
There's no need to check for values > 0; Histogram will do that for us. We also might encounter values == 0 depending on the resolution of PR_Now(). So ditch that check.
Attachment #584755 - Attachment is obsolete: true
Attachment #599010 - Flags: review+
I am an idiot. After much poking at try, I finally realized that PRTime might not have a very big resolution on Windows, resulting in an age of 0 (since the time of creation and the time of reading happen so quickly in tests). But then I realized that I was using the wrong method to check if the histogram had changed! histogram.sum is the sum of all the *samples* taken, not the sum of all the bucket counts. So, naturally, if you are taking samples with a value of zero, the sum of all those is...zero. Updated patch to do bucket-wise comparisons before and after.
Attachment #594434 - Attachment is obsolete: true
Attachment #599011 - Flags: review+
Status: NEW → ASSIGNED
Keywords: checkin-needed
Whiteboard: [autoland-try]
Whiteboard: [autoland-try] → [autoland-in-queue]
Autoland Patchset: Patches: 584754, 586445, 599010, 599011 Branch: mozilla-central => try Error applying patch 586445 to mozilla-central. patching file startupcache/StartupCache.cpp Hunk #2 FAILED at 496 Hunk #3 FAILED at 508 2 out of 5 hunks FAILED -- saving rejects to file startupcache/StartupCache.cpp.rej patching file startupcache/StartupCache.h Hunk #2 FAILED at 169 1 out of 2 hunks FAILED -- saving rejects to file startupcache/StartupCache.h.rej abort: patch failed to apply Could not apply and push patchset:
Whiteboard: [autoland-in-queue]
Whoops, need to tell the bot which order to land things in.
Whiteboard: [autoland-try:584754,599010,586445,599011]
Whiteboard: [autoland-try:584754,599010,586445,599011] → [autoland-in-queue]
Autoland Patchset: Patches: 584754, 586445, 599010, 599011 Branch: mozilla-central => try Error applying patch 586445 to mozilla-central. patching file startupcache/StartupCache.cpp Hunk #2 FAILED at 496 Hunk #3 FAILED at 508 2 out of 5 hunks FAILED -- saving rejects to file startupcache/StartupCache.cpp.rej patching file startupcache/StartupCache.h Hunk #2 FAILED at 169 1 out of 2 hunks FAILED -- saving rejects to file startupcache/StartupCache.h.rej abort: patch failed to apply Could not apply and push patchset:
Whiteboard: [autoland-in-queue]
Releng poked at this problem and declared it fixed. Pushing to try tells all. :) Third time's the charm!
Whiteboard: [autoland-try:584754,599010,586445,599011]
Whiteboard: [autoland-try:584754,599010,586445,599011] → [autoland-in-queue]
Autoland Patchset: Patches: 584754, 599010, 586445, 599011 Branch: mozilla-central => try Destination: http://hg.mozilla.org/try/pushloghtml?changeset=6e815d021d62 Try run started, revision 6e815d021d62. To cancel or monitor the job, see: https://tbpl.mozilla.org/?tree=Try&rev=6e815d021d62
Try run for 6e815d021d62 is complete. Detailed breakdown of the results available here: https://tbpl.mozilla.org/?tree=Try&rev=6e815d021d62 Results (out of 218 total builds): exception: 4 success: 175 warnings: 25 failure: 14 Builds (or logs if builds failed) available at: http://ftp.mozilla.org/pub/mozilla.org/firefox/try-builds/autolanduser@mozilla.com-6e815d021d62 Timed out after 12 hours without completing.
Whiteboard: [autoland-in-queue]
Comment on attachment 584754 [details] [diff] [review] part 1 - enhance nsZipArchive with comment functionality, v4 https://hg.mozilla.org/integration/mozilla-inbound/rev/60d47e224c40
Attachment #584754 - Flags: checkin+
Comment on attachment 599010 [details] [diff] [review] part 2 - store/retrieve creation time and feed age to Telemetry, v5 https://hg.mozilla.org/integration/mozilla-inbound/rev/579029d9950a
Attachment #599010 - Flags: checkin+
Comment on attachment 586445 [details] [diff] [review] part 3 - add recordAgesAlways method to nsIStartupCache, v2 https://hg.mozilla.org/integration/mozilla-inbound/rev/69b718a5a15e
Attachment #586445 - Flags: checkin+
Comment on attachment 599011 [details] [diff] [review] part 4 - add test for STARTUP_CACHE_AGE_HOURS histogram, v3 https://hg.mozilla.org/integration/mozilla-inbound/rev/b3773fd0e281
Attachment #599011 - Flags: checkin+
Keywords: checkin-needed
OS: Windows 7 → All
Hardware: x86 → All
Target Milestone: --- → mozilla13
Depends on: 730196
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: