Closed
Bug 711297
Opened 13 years ago
Closed 13 years ago
Report startup cache age via telemetry
Categories
(Core :: XPCOM, defect)
Core
XPCOM
Tracking
()
RESOLVED
FIXED
mozilla13
People
(Reporter: taras.mozilla, Assigned: froydnj)
References
Details
Attachments
(4 files, 10 obsolete files)
1.86 KB,
patch
|
froydnj
:
review+
vladan
:
checkin+
|
Details | Diff | Splinter Review |
3.12 KB,
patch
|
froydnj
:
review+
vladan
:
checkin+
|
Details | Diff | Splinter Review |
7.44 KB,
patch
|
froydnj
:
review+
vladan
:
checkin+
|
Details | Diff | Splinter Review |
5.83 KB,
patch
|
froydnj
:
review+
vladan
:
checkin+
|
Details | Diff | Splinter Review |
It's not clear what the lifetime of entries in startup cache is for a typical user. We wipe it it on upgrades, when corrupted, addon install.
Assignee | ||
Comment 1•13 years ago
|
||
1) What is the startup cache?
2) How is this useful?
Reporter | ||
Comment 2•13 years ago
|
||
Startup cache contains parsed chrome js and other system-specific caches.
It would be good to understand how long startup cache lasts before it is corrupted/blown away, especially once we put more stuff like bug 705594 into it.
It's in the "local" profile directory. See http://dxr.mozilla.org/mozilla/mozilla-central/startupcache/StartupCache.h.html#l124 for details.
Assignee | ||
Comment 3•13 years ago
|
||
nsIFile and friends don't expose a createdTime for files, mostly because not all platforms usefully support that. Tracking cache age via lastModifiedTime doesn't seem very useful.
Instead, I opted to store the creation time directly in the zip file in the comment field of the...uh...end directory struct (not sure what the official zip name for it is). Doing this avoids raciness problems with a separate file.
To do this, though, nsZipArchive needs to know about reading comments. nsZipWriter already knows about writing comments, so there's no modifications to be made there.
Attachment #582939 -
Flags: review?(taras.mozilla)
Assignee | ||
Comment 4•13 years ago
|
||
Now that we can read comments, it's a simple matter to compute the age of the startup cache when we open it.
Attachment #582945 -
Flags: review?(taras.mozilla)
Reporter | ||
Comment 5•13 years ago
|
||
Comment on attachment 582939 [details] [diff] [review]
part 1 - enhance nsZipArchive with comment functionality
ZipArchive uses zero-copy strings elsewhere. The right way to do this is to store the comment offset and fetch it ondemand.
ie do stuff you did in BuildFileList within GetComment();
wrap the map-asseccessing part of GetComment in MOZ_WIN_MEM_TRY_*
Attachment #582939 -
Flags: review?(taras.mozilla) → review-
Reporter | ||
Comment 6•13 years ago
|
||
Comment on attachment 582945 [details] [diff] [review]
part 2 - store/retrieve creation time and feed age to Telemetry
I would skip atoll and use memcmp to copy PRTime as a binary. StartupCache is already machine-specific. This might make the string api upset, this is fine too.
Attachment #582945 -
Flags: review?(taras.mozilla) → review+
Assignee | ||
Comment 7•13 years ago
|
||
If I'm going to be storing raw pointers and lengths anyway, I might as well ditch the nsCString bits and just have nsZipArchive clients work directly with the pointers themselves. Or does that cause issues with whatever memory mapping magic is used by nsZipArchive?
Assignee | ||
Comment 8•13 years ago
|
||
Now with on-demand copying. I realized I was talking nonsense about raw pointers and whatnot.
Attachment #582939 -
Attachment is obsolete: true
Attachment #583174 -
Flags: review?(taras.mozilla)
Assignee | ||
Comment 9•13 years ago
|
||
Now with binary storage. Requesting re-review because of bit fiddling and because I restructured LoadArchive to use return-early style to avoid massive indentation.
Attachment #582945 -
Attachment is obsolete: true
Attachment #583177 -
Flags: review?(taras.mozilla)
Reporter | ||
Comment 10•13 years ago
|
||
Comment on attachment 583174 [details] [diff] [review]
part 1 - enhance nsZipArchive with comment functionality, v2
># HG changeset patch
># User Nathan Froyd <froydnj@gmail.com>
># Date 1324337115 18000
>Bug 711297 - add comment reading to nsZipArchive; r=taras
>
>
>diff --git a/modules/libjar/nsZipArchive.cpp b/modules/libjar/nsZipArchive.cpp
>index 399227d..63cf608 100644
>--- a/modules/libjar/nsZipArchive.cpp
>+++ b/modules/libjar/nsZipArchive.cpp
>@@ -666,6 +666,18 @@ MOZ_WIN_MEM_TRY_BEGIN
> if (sig != ENDSIG)
> return NS_ERROR_FILE_CORRUPTED;
>
>+ // Make the comment available for consumers.
>+ if (endp - buf >= ZIPEND_SIZE) {
>+ ZipEnd *zipend = (ZipEnd *)buf;
>+
>+ buf += ZIPEND_SIZE;
>+ PRUint16 commentlen = xtoint(zipend->commentfield_len);
>+ if (endp - buf >= commentlen) {
>+ char *comment = (char *)buf;
>+ mComment.Assign(comment, commentlen);
>+ }
>+ }
This doesn't seem right. All of this stuff seems like it can occur in GetComment().
also irc comments.
Attachment #583174 -
Flags: review?(taras.mozilla) → review-
Assignee | ||
Comment 11•13 years ago
|
||
Thinkos fixed.
For posterity: the IRC discussion was about better ways to do this, maybe by looking at modtimes of individual entries or somesuch. The nsZipArchive API isn't really conducive to such things, so hijacking the comment field stays.
Attachment #583174 -
Attachment is obsolete: true
Attachment #583513 -
Flags: review?(taras.mozilla)
Assignee | ||
Comment 12•13 years ago
|
||
Fix some thinkos, clean includes, etc. Nothing big.
Attachment #583177 -
Attachment is obsolete: true
Attachment #583177 -
Flags: review?(taras.mozilla)
Attachment #583514 -
Flags: review?(taras.mozilla)
Reporter | ||
Comment 13•13 years ago
|
||
Comment on attachment 583513 [details] [diff] [review]
part 1 - enhance nsZipArchive with comment functionality, v3
nsCString -> nsACString is our usual convention.
Attachment #583513 -
Flags: review?(taras.mozilla) → review+
Reporter | ||
Comment 14•13 years ago
|
||
Comment on attachment 583514 [details] [diff] [review]
part 2 - store/retrieve creation time and feed age to Telemetry, v3
>+ PRTime now = PR_Now();
>+ if (!mArchive) {
>+ nsCString comment;
>+ char buf[sizeof(now)];
>+ memcpy(buf, &now, sizeof(buf));
>+ comment.Assign(&buf[0], sizeof(buf));
>+ zipW->SetComment(comment);
>+ }
This reads like java code.
Would comment.Assign((char*)&now, sizeof(now)) be less clear?
>+
>+ nsresult LoadArchive(enum TelemetrifyAge flag);
> nsresult Init();
> void WriteToDisk();
> nsresult ResetStartupWriteTimer();
>diff --git a/toolkit/components/telemetry/TelemetryHistograms.h b/toolkit/components/telemetry/TelemetryHistograms.h
>index 3a0552f..eb04e79 100644
>--- a/toolkit/components/telemetry/TelemetryHistograms.h
>+++ b/toolkit/components/telemetry/TelemetryHistograms.h
>@@ -123,6 +123,7 @@ HISTOGRAM(MAC_INITFONTLIST_TOTAL, 1, 30000, 10, EXPONENTIAL, "gfxMacPlatformFont
>
> HISTOGRAM(SYSTEM_FONT_FALLBACK, 1, 100000, 50, EXPONENTIAL, "System font fallback (us)")
> HISTOGRAM(SYSTEM_FONT_FALLBACK_FIRST, 1, 40000, 20, EXPONENTIAL, "System font fallback, first call (ms)")
>+HISTOGRAM(STARTUP_CACHE_AGE_HOURS, 1, 3000, 20, EXPONENTIAL, "Startup cache age (hours)")
>
> HISTOGRAM_BOOLEAN(SHUTDOWN_OK, "Did the browser start after a successful shutdown")
>
Before we land this bug, please add a 3rd patch with a testcase. Modify some startup cache test to make sure that the telemetry histogram is being filled.
Attachment #583514 -
Flags: review?(taras.mozilla) → review+
Assignee | ||
Comment 15•13 years ago
|
||
nsCString -> nsACString. Carrying over r+.
Attachment #583513 -
Attachment is obsolete: true
Attachment #584754 -
Flags: review+
Assignee | ||
Comment 16•13 years ago
|
||
Address Java-isms in the code.
This patch also changes the age computation to be ceil(# of hours) rather than floor(# of hours) as it was in previous versions. This is required for tests to be reasonable, since histogram buckets must start at 1, rather than 0, and we'd rather not have a test that takes an hour to run. I believe this change to be relatively non-controversial, so I'm going to carry over r+ on this new patch.
Attachment #583514 -
Attachment is obsolete: true
Attachment #584755 -
Flags: review+
Assignee | ||
Comment 17•13 years ago
|
||
Since the tests for nsIStartupCache all run in a single process and we don't normally record startup cache age during a single process's lifetime, we need some way of changing this for tests. nsIStartupCache's methods are basically all for testing purposes anyway, so there doesn't seem to be much harm in adding one more.
Attachment #584757 -
Flags: review?(mwu)
Assignee | ||
Comment 18•13 years ago
|
||
This patch adds bits to TestStartupCache.cpp to ensure that we do record interesting information in the STARTUP_CACHE_AGE_HOURS histogram.
I will admit upfront that it is ugly. However, given that:
1) nsITelemetry wasn't really designed to be used from C++, but rather from JavaScript;
1a) Making it more suitable means having a discussion about whether the ipc/chromium bits should be part of our SDK API, which probably isn't worth having for just one test;
2) Telemetry::GetHistogramById isn't exported from libxul;
2a) Making it so: see 1a;
3) Rewriting the whole test in JavaScript: ugh;
This solution is probably the least bad way to do things. I think it would be less ugly if early returns from main() were used instead of the futzing with use_js, but I was trying to keep the execute-all-tests style of the current program.
I also don't make any claims about whether I used the JS api 100% correctly; it works well enough for the purposes herein.
Attachment #584760 -
Flags: review?(mwu)
Comment 19•13 years ago
|
||
Comment on attachment 584757 [details] [diff] [review]
part 3 - add recordAgesAlways method to nsIStartupCache
Review of attachment 584757 [details] [diff] [review]:
-----------------------------------------------------------------
::: startupcache/StartupCache.cpp
@@ +154,4 @@
>
> StartupCache* StartupCache::gStartupCache;
> bool StartupCache::gShutdownInitiated;
> +enum StartupCache::TelemetrifyAge StartupCache::postFlushAgeAction = StartupCache::IGNORE_AGE;
IGNORE_AGE is 1, and postFlushAgeAction is a global. I suggest changing the TelemetrifyAge enum to allow this to be initialized to zero.
::: startupcache/StartupCache.h
@@ +170,4 @@
> RECORD_AGE = 0,
> IGNORE_AGE = 1
> };
> + static enum TelemetrifyAge postFlushAgeAction;
Something like gPostFlushAgeAction would be more consistent with how static variables are named in this class.
Attachment #584757 -
Flags: review?(mwu) → review+
Assignee | ||
Comment 20•13 years ago
|
||
Address review comments. Carrying over r+.
Attachment #584757 -
Attachment is obsolete: true
Attachment #586445 -
Flags: review+
Comment 21•13 years ago
|
||
Comment on attachment 584760 [details] [diff] [review]
part 4 - add test STARTUP_CACHE_AGE_HOURS histogram
Well, I don't think there's anything less painful than this at the moment, since we don't have access to the right API to do this directly. Unfortunately I don't know this test code or telemetry particularly well so taras might be able to do a better review.
Attachment #584760 -
Flags: review?(mwu) → review?(taras.mozilla)
Reporter | ||
Updated•13 years ago
|
Attachment #584760 -
Flags: review?(taras.mozilla) → review+
Assignee | ||
Updated•13 years ago
|
Keywords: checkin-needed
Assignee | ||
Comment 22•13 years ago
|
||
Pushing this on try showed test failures on debug builds, so removing checkin-needed while I debug.
Keywords: checkin-needed
Assignee | ||
Comment 23•13 years ago
|
||
Comment on attachment 584760 [details] [diff] [review]
part 4 - add test STARTUP_CACHE_AGE_HOURS histogram
I'm seeing failures for this test on Try debug builds. The error is:
Assertion failure: cx->runtime->requestDepth || cx->runtime->gcRunning, at e:/builds/moz2_slave/try-w32-dbg/build/js/src/jscntxt.cpp:1288
It doesn't appear to happen all the time: I can run Linux debug builds locally just fine, and my Linux Try debug builds sometimes win, but it seems to fail consistently elsewhere.
Luke, can you take a look at my JSAPI usage and tell me what bits I might be missing?
Attachment #584760 -
Flags: feedback?(luke)
Assignee | ||
Comment 24•13 years ago
|
||
Comment on attachment 584760 [details] [diff] [review]
part 4 - add test STARTUP_CACHE_AGE_HOURS histogram
jdm pointed out JSAutoRequest and that looks like the solution to my problems.
Attachment #584760 -
Flags: feedback?(luke)
Assignee | ||
Comment 25•13 years ago
|
||
Fixed test so it will work on debug builds with JS checking. Carrying over r+.
Attachment #584760 -
Attachment is obsolete: true
Attachment #594434 -
Flags: review+
Assignee | ||
Updated•13 years ago
|
Keywords: checkin-needed
Updated•13 years ago
|
Whiteboard: [autoland-try]
Updated•13 years ago
|
Whiteboard: [autoland-try] → [autoland-in-queue]
Comment 26•13 years ago
|
||
Autoland Patchset:
Patches: 584754, 584755, 586445, 594434
Branch: mozilla-central => try
Destination: http://hg.mozilla.org/try/rev/3b9b982beafb
Try run started, revision 3b9b982beafb. To cancel or monitor the job, see: https://tbpl.mozilla.org/?tree=Try&rev=3b9b982beafb
Comment 27•13 years ago
|
||
Try run for 3b9b982beafb is complete.
Detailed breakdown of the results available here:
https://tbpl.mozilla.org/?tree=Try&rev=3b9b982beafb
Results (out of 206 total builds):
success: 186
warnings: 20
Builds (or logs if builds failed) available at:
http://ftp.mozilla.org/pub/mozilla.org/firefox/try-builds/autolanduser@mozilla.com-3b9b982beafb
Updated•13 years ago
|
Whiteboard: [autoland-in-queue]
Comment 28•13 years ago
|
||
Unfortunately failed make check on Windows:
TEST-UNEXPECTED-FAIL | histogram didn't record samples
https://tbpl.mozilla.org/?tree=Try&rev=3b9b982beafb
Keywords: checkin-needed
Assignee | ||
Comment 29•13 years ago
|
||
There's no need to check for values > 0; Histogram will do that for us. We also might encounter values == 0 depending on the resolution of PR_Now(). So ditch that check.
Attachment #584755 -
Attachment is obsolete: true
Attachment #599010 -
Flags: review+
Assignee | ||
Comment 30•13 years ago
|
||
I am an idiot. After much poking at try, I finally realized that PRTime might not have a very big resolution on Windows, resulting in an age of 0 (since the time of creation and the time of reading happen so quickly in tests).
But then I realized that I was using the wrong method to check if the histogram had changed! histogram.sum is the sum of all the *samples* taken, not the sum of all the bucket counts. So, naturally, if you are taking samples with a value of zero, the sum of all those is...zero.
Updated patch to do bucket-wise comparisons before and after.
Attachment #594434 -
Attachment is obsolete: true
Attachment #599011 -
Flags: review+
Assignee | ||
Updated•13 years ago
|
Updated•13 years ago
|
Whiteboard: [autoland-try] → [autoland-in-queue]
Comment 31•13 years ago
|
||
Autoland Patchset:
Patches: 584754, 586445, 599010, 599011
Branch: mozilla-central => try
Error applying patch 586445 to mozilla-central.
patching file startupcache/StartupCache.cpp
Hunk #2 FAILED at 496
Hunk #3 FAILED at 508
2 out of 5 hunks FAILED -- saving rejects to file startupcache/StartupCache.cpp.rej
patching file startupcache/StartupCache.h
Hunk #2 FAILED at 169
1 out of 2 hunks FAILED -- saving rejects to file startupcache/StartupCache.h.rej
abort: patch failed to apply
Could not apply and push patchset:
Updated•13 years ago
|
Whiteboard: [autoland-in-queue]
Assignee | ||
Comment 32•13 years ago
|
||
Whoops, need to tell the bot which order to land things in.
Whiteboard: [autoland-try:584754,599010,586445,599011]
Updated•13 years ago
|
Whiteboard: [autoland-try:584754,599010,586445,599011] → [autoland-in-queue]
Comment 33•13 years ago
|
||
Autoland Patchset:
Patches: 584754, 586445, 599010, 599011
Branch: mozilla-central => try
Error applying patch 586445 to mozilla-central.
patching file startupcache/StartupCache.cpp
Hunk #2 FAILED at 496
Hunk #3 FAILED at 508
2 out of 5 hunks FAILED -- saving rejects to file startupcache/StartupCache.cpp.rej
patching file startupcache/StartupCache.h
Hunk #2 FAILED at 169
1 out of 2 hunks FAILED -- saving rejects to file startupcache/StartupCache.h.rej
abort: patch failed to apply
Could not apply and push patchset:
Updated•13 years ago
|
Whiteboard: [autoland-in-queue]
Assignee | ||
Comment 34•13 years ago
|
||
Releng poked at this problem and declared it fixed. Pushing to try tells all. :) Third time's the charm!
Whiteboard: [autoland-try:584754,599010,586445,599011]
Updated•13 years ago
|
Whiteboard: [autoland-try:584754,599010,586445,599011] → [autoland-in-queue]
Comment 35•13 years ago
|
||
Autoland Patchset:
Patches: 584754, 599010, 586445, 599011
Branch: mozilla-central => try
Destination: http://hg.mozilla.org/try/pushloghtml?changeset=6e815d021d62
Try run started, revision 6e815d021d62. To cancel or monitor the job, see: https://tbpl.mozilla.org/?tree=Try&rev=6e815d021d62
Comment 36•13 years ago
|
||
Try run for 6e815d021d62 is complete.
Detailed breakdown of the results available here:
https://tbpl.mozilla.org/?tree=Try&rev=6e815d021d62
Results (out of 218 total builds):
exception: 4
success: 175
warnings: 25
failure: 14
Builds (or logs if builds failed) available at:
http://ftp.mozilla.org/pub/mozilla.org/firefox/try-builds/autolanduser@mozilla.com-6e815d021d62
Timed out after 12 hours without completing.
Updated•13 years ago
|
Whiteboard: [autoland-in-queue]
Comment 37•13 years ago
|
||
Comment on attachment 584754 [details] [diff] [review]
part 1 - enhance nsZipArchive with comment functionality, v4
https://hg.mozilla.org/integration/mozilla-inbound/rev/60d47e224c40
Attachment #584754 -
Flags: checkin+
Comment 38•13 years ago
|
||
Comment on attachment 599010 [details] [diff] [review]
part 2 - store/retrieve creation time and feed age to Telemetry, v5
https://hg.mozilla.org/integration/mozilla-inbound/rev/579029d9950a
Attachment #599010 -
Flags: checkin+
Comment 39•13 years ago
|
||
Comment on attachment 586445 [details] [diff] [review]
part 3 - add recordAgesAlways method to nsIStartupCache, v2
https://hg.mozilla.org/integration/mozilla-inbound/rev/69b718a5a15e
Attachment #586445 -
Flags: checkin+
Comment 40•13 years ago
|
||
Comment on attachment 599011 [details] [diff] [review]
part 4 - add test for STARTUP_CACHE_AGE_HOURS histogram, v3
https://hg.mozilla.org/integration/mozilla-inbound/rev/b3773fd0e281
Attachment #599011 -
Flags: checkin+
Updated•13 years ago
|
Comment 41•13 years ago
|
||
Comment 42•13 years ago
|
||
https://hg.mozilla.org/mozilla-central/rev/60d47e224c40
https://hg.mozilla.org/mozilla-central/rev/579029d9950a
https://hg.mozilla.org/mozilla-central/rev/69b718a5a15e
https://hg.mozilla.org/mozilla-central/rev/b3773fd0e281
Status: ASSIGNED → RESOLVED
Closed: 13 years ago
Resolution: --- → FIXED
You need to log in
before you can comment on or make changes to this bug.
Description
•