Closed
Bug 850968
Opened 11 years ago
Closed 10 years ago
Cache Effectiveness Telemetry
Categories
(Core :: Networking, enhancement)
Tracking
()
RESOLVED
WONTFIX
mozilla22
People
(Reporter: mcmanus, Assigned: mcmanus)
Details
Attachments
(2 files, 1 obsolete file)
19.09 KB, patch | u408661: review+
1.03 KB, patch | taras.mozilla: review+
Let's find out whether the HTTP cache helps us or hurts us. The strategy here is a little crude, but I can live with that.

The experiment takes selected Aurora or Nightly sessions and puts them into an "experiment mode" for 15 minutes, starting 2 minutes after startup so that things can settle down. After the 15 minutes expire, things return to their previous behavior.

To be selected, a session needs to have the allow-experiments pref set to true (its default) and the use-cache pref set to on (the default). Beyond that, 1 in 16 sessions is selected at startup time, so if this really screws you up you can just restart :)

If you're in the experiment you get further divided into 1 of 4 groups. Two of those groups have their cache enabled and two have it disabled. Those categories are further split into "fast connections" and "slow connections". (Fast vs. slow is pretty crude: I count the TCP connects in those first 2 minutes that complete in <= 125 ms as fast and those that take > 125 ms as slow. If at least 1/3 of them are fast, I call it a fast-connected machine. If I thought it was worth revising, I would.)

Finally, assuming the experiment is running, a datapoint is collected for every HTTP transaction, measuring the elapsed time from AsyncOpen to the time onStopRequest is called. This is reported through telemetry along with the group you are allocated to.

The intent is that we should be able to see whether having the cache enabled speeds things up, slows things down, or changes the tail, and whether it has a better effect on "slow" networks than on fast ones. It intentionally doesn't measure hit rate or anything like that, just the overall performance profile.
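For illustration only, the fast/slow classification and the four-way grouping described above could be sketched as below. This is not the landed patch: `ExperimentGroup`, `IsFastConnected`, and `AssignGroup` are hypothetical names, the empty-sample case is an assumption, and since the description does not say how sessions are split between cache-on and cache-off, that choice is simply passed in as a flag.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical names throughout; a sketch of the description, not Gecko code.
enum class ExperimentGroup { FastCacheOn, FastCacheOff, SlowCacheOn, SlowCacheOff };

// A TCP connect that completes in <= 125 ms counts as "fast".
constexpr double kFastConnectThresholdMs = 125.0;

// A machine is "fast-connected" when at least 1/3 of the connects observed
// during the 2-minute settling window were fast.
// Assumption: with no connects observed, the machine is treated as slow.
bool IsFastConnected(const std::vector<double>& connectTimesMs) {
  if (connectTimesMs.empty()) {
    return false;
  }
  std::size_t fast = 0;
  for (double ms : connectTimesMs) {
    if (ms <= kFastConnectThresholdMs) {
      ++fast;
    }
  }
  // Equivalent to fast / total >= 1/3, without floating-point division.
  return fast * 3 >= connectTimesMs.size();
}

// Combine the connection class with the (externally chosen) cache setting.
ExperimentGroup AssignGroup(const std::vector<double>& connectTimesMs,
                            bool cacheEnabled) {
  if (IsFastConnected(connectTimesMs)) {
    return cacheEnabled ? ExperimentGroup::FastCacheOn
                        : ExperimentGroup::FastCacheOff;
  }
  return cacheEnabled ? ExperimentGroup::SlowCacheOn
                      : ExperimentGroup::SlowCacheOff;
}
```

Note that the 1/3 threshold is deliberately lenient: a session with one fast connect out of three still lands in a "fast" group.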
Assignee
Comment 1•11 years ago
given that the different groups aren't explicitly the same set of uris from the same locations this is going to rely on having a lot of data. nightly might not be anywhere near enough.
Assignee
Comment 2•11 years ago
Attachment #724753 -
Flags: feedback?(taras.mozilla)
Assignee
Updated•11 years ago
Attachment #724753 -
Flags: feedback?(hurley)
Comment 3•11 years ago
Comment on attachment 724753
patch 0

+ Telemetry::Accumulate(
+   telemID,
+   (TimeStamp::Now() - mCacheEffectExperimentAsyncOpenTime).ToMilliseconds());
+ }

use void AccumulateTimeDelta(ID id, TimeStamp start, TimeStamp end = TimeStamp::Now());

I assume this is a temporary experiment, we should probably bump bucket size to 50 without worrying about overhead to increase chances of bucket success
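To illustrate the shape of that suggestion outside of Gecko, here is a self-contained stand-in built on `std::chrono`. It is not the Telemetry API: `RecordSample`, `RecordTimeDelta`, and `gSamples` are hypothetical names, and only the idea (pass the start timestamp and let the helper compute the delta, with the end defaulting to "now") mirrors the quoted signature.

```cpp
#include <chrono>
#include <cstdint>
#include <map>
#include <vector>

using Clock = std::chrono::steady_clock;
using TimePoint = Clock::time_point;

// Hypothetical stand-in for a telemetry store: histogram id -> samples in ms.
std::map<int, std::vector<std::uint32_t>> gSamples;

// Record one already-computed sample.
void RecordSample(int id, std::uint32_t sampleMs) {
  gSamples[id].push_back(sampleMs);
}

// Record the elapsed time between two timestamps. The default argument is
// evaluated at each call, so omitting `end` means "until now".
void RecordTimeDelta(int id, TimePoint start, TimePoint end = Clock::now()) {
  const auto ms =
      std::chrono::duration_cast<std::chrono::milliseconds>(end - start).count();
  // Clamp negative deltas to zero rather than wrapping around.
  RecordSample(id, static_cast<std::uint32_t>(ms < 0 ? 0 : ms));
}
```

With a helper of this shape the call site only needs to keep the start timestamp, which removes the hand-written subtraction and unit conversion shown in the quoted hunk.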
Attachment #724753 -
Flags: feedback?(taras.mozilla) → feedback+
Comment on attachment 724753
patch 0

Review of attachment 724753:
-----------------------------------------------------------------

Other than Taras' comment about using AccumulateTimeDelta, looks good to me.
Attachment #724753 -
Flags: feedback?(hurley) → feedback+
Assignee
Comment 5•11 years ago
Attachment #724753 -
Attachment is obsolete: true
Attachment #725102 -
Flags: review?
Assignee
Updated•11 years ago
Attachment #725102 -
Flags: review? → review?(hurley)
Comment on attachment 725102
patch v1

Review of attachment 725102:
-----------------------------------------------------------------

ship it!
Attachment #725102 -
Flags: review?(hurley) → review+
Assignee
Comment 7•11 years ago
https://hg.mozilla.org/integration/mozilla-inbound/rev/513fafb75e5b
Comment 8•11 years ago
https://hg.mozilla.org/mozilla-central/rev/513fafb75e5b
Status: NEW → RESOLVED
Closed: 11 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla22
Assignee
Comment 9•11 years ago
There is all of 1 day of data in the telemetry dashboard for this, and given that there is no control for URI or network, there is reason to believe it's going to take a lot of data to make meaningful comparisons. One of our 4 categories has only about 65K datapoints in it, which is likely far too few.

Nonetheless, if you'd like to take the earliest hint, the early data definitely shows the cache helping overall response time, especially on slower networks. The fast networks have about 500K samples, the slow ones around 70K.

These are just desktop numbers for now; there is probably not anywhere near enough data from mobile yet (if any; I haven't really checked).

percentile  fast-on  fast-off  slow-on  slow-off
25               68       135      135       378
50              226       318      378       894
75              533       894     1262      2516
90             1500      2117     3553      5961

You read that by saying "fast networks with the cache on have 25% of their transactions complete in 68ms or less, 50% in 226ms or less, and so on."
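One rough way to compare the groups is the ratio of cache-off to cache-on time at each percentile. The sketch below hard-codes the comment 9 figures; `PercentileRow`, `kComment9`, and `OffOverOn` are hypothetical names introduced for illustration. Because the groups are not controlled for URI or network, these ratios are only a hint, not a measurement of speedup.

```cpp
#include <array>

// One row of the table above; all times are in milliseconds.
struct PercentileRow {
  int percentile;
  double fastOn, fastOff, slowOn, slowOff;
};

// Values copied from the comment 9 table.
const std::array<PercentileRow, 4> kComment9 = {{
    {25, 68, 135, 135, 378},
    {50, 226, 318, 378, 894},
    {75, 533, 894, 1262, 2516},
    {90, 1500, 2117, 3553, 5961},
}};

// A ratio above 1 means the cache-off group was slower at that percentile.
double OffOverOn(double offMs, double onMs) {
  return offMs / onMs;
}
```

For example, at the 25th percentile the fast cache-off group took about twice as long as the fast cache-on group (135 ms vs. 68 ms), and every row in this table has a ratio above 1 for both fast and slow connections.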
Assignee
Comment 10•11 years ago
Now with 4 days' worth of data, the relationships are basically unchanged. At this point, it seems like the cache is pretty useful even as is.

percentile  fast-on  fast-off  slow-on  slow-off
25               81       135      160       378
50              226       318      533       894
75              633       894     1500      2117
90             1500      2117     4222      7083
Comment 11•11 years ago
(In reply to Patrick McManus [:mcmanus] from comment #10)
> now with 4 days worth of data.. relationships are basically unchanged. At
> this point, it seems like the cache is pretty useful even as is.
>
Also note the wild variation in perf with cache on. 10K bucket is empty with cache off for fast connections. I'll let this sit for a few more days but I'll bet that if we filter fast connections by netbooks (aka shitty harddrives) cache will be a net loss.
Comment 12•11 years ago
Interesting. From eyeballing histogram numbers, the situation is much better on Android. Cache appears to be a clear win (if one discounts the fact that we have only 3K datapoints there at the moment).
Assignee
Comment 13•11 years ago
we've got a fair # of datapoints but not a lot of submissions.. taras suggests temporarily turning up the data collection rate.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Assignee
Comment 14•11 years ago
Attachment #732943 -
Flags: review?(taras.mozilla)
Comment 15•11 years ago
Comment on attachment 732943
remove 1 in 16 filter

Let's take this out on Friday.
Attachment #732943 -
Flags: review?(taras.mozilla) → review+
Comment 17•11 years ago
https://hg.mozilla.org/mozilla-central/rev/5b710d7fe073
Status: REOPENED → RESOLVED
Closed: 11 years ago → 11 years ago
Resolution: --- → FIXED
Assignee
Comment 18•11 years ago
Let's back this out for causing bug 858588. We'll need to figure out why disabling the HTTP cache broke an application cache test.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Assignee
Comment 19•11 years ago
backouts
https://hg.mozilla.org/releases/mozilla-aurora/rev/946b17328e71
https://hg.mozilla.org/mozilla-central/rev/40a228f74389
https://hg.mozilla.org/mozilla-central/rev/f4f549a04ee8
Status: REOPENED → RESOLVED
Closed: 11 years ago → 11 years ago
Resolution: --- → FIXED
Comment 20•11 years ago
(In reply to Patrick McManus [:mcmanus] from comment #19)
> backouts

Why is it closed as fixed in that case?
Flags: needinfo?(mcmanus)
Assignee
Updated•11 years ago
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Assignee
Updated•11 years ago
Flags: needinfo?(mcmanus)
Assignee
Updated•10 years ago
Status: REOPENED → RESOLVED
Closed: 11 years ago → 10 years ago
Resolution: --- → WONTFIX