Closed
Bug 1337073
Opened 8 years ago
Closed 8 years ago
Improve the IPC_SYNC_LATENCY_MS probe
Categories
(Core :: IPC, defect)
Core
IPC
Tracking
()
RESOLVED
FIXED
mozilla55
People
(Reporter: ehsan.akhgari, Assigned: nika)
References
Details
Attachments
(1 file, 1 obsolete file)
2.76 KB,
patch
|
billm
:
review+
gchang
:
approval-mozilla-aurora+
gchang
:
approval-mozilla-beta+
|
Details | Diff | Splinter Review |
I have a difficult time interpreting this probe, and I think we can improve it a little bit.
Firstly by not specifying a low: 1 value we always waste one bucket to hold 0 for 0% of the samples.
Also the exponential bucketing makes it a pain to analyze these results. For example for PContent::Msg_GetGfxVars we know that 12% of samples had a 50-126ms pause, which can be 3 to 8 missed frames, which is quite a wide range.
I suggest making the bucketing linear and increasing the number to 20.
Chris, what do you think?
Flags: needinfo?(cpearce)
Comment 1•8 years ago
|
||
Sure. We'll need to delete the old probe and create a new one with a different name.
Note that the documentation states:
"low: Optional, the default value is 1. This field represents the minimum value expected in the histogram. Note that all histograms automatically get a bucket with label "0" for counting values below the "low" value."
https://developer.mozilla.org/en-US/docs/Mozilla/Performance/Adding_a_new_Telemetry_probe#Declaring_a_Histogram
So setting "low " to 1 won't help with the 0 bucket.
Reporter | ||
Comment 2•8 years ago
|
||
(In reply to Chris Pearce (:cpearce) from comment #1)
> Sure. We'll need to delete the old probe and create a new one with a
> different name.
Sounds good.
> Note that the documentation states:
>
> "low: Optional, the default value is 1. This field represents the minimum
> value expected in the histogram. Note that all histograms automatically get
> a bucket with label "0" for counting values below the "low" value."
>
> https://developer.mozilla.org/en-US/docs/Mozilla/Performance/
> Adding_a_new_Telemetry_probe#Declaring_a_Histogram
>
> So setting "low " to 1 won't help with the 0 bucket.
Oh, OK, so ignore that part of my comment then!
Comment hidden (mozreview-request) |
It seems like this is still not going to be great. Won't the first box still be something like 1-99ms? Is that better than the 50-126ms that you were complaining about Ehsan?
Flags: needinfo?(ehsan)
Reporter | ||
Comment 5•8 years ago
|
||
(In reply to Bill McCloskey (:billm) from comment #4)
> It seems like this is still not going to be great. Won't the first box still
> be something like 1-99ms? Is that better than the 50-126ms that you were
> complaining about Ehsan?
No you're right, this won't be much better either. To me the key number that matters is 16ms, perhaps we should adjust our bucketing to get a similar size (or smaller) per bucket?
Flags: needinfo?(ehsan)
The problem is that we're going to be sending a lot of these histograms back (one per message type), and I'd rather not have too much data. Pings cost money (in S3 storage capacity). It probably makes sense to think about what buckets you want and then work backwards from there.
Comment 7•8 years ago
|
||
Perhaps we do in fact want an exponential histogram, but with more buckets so that we have more precision in the low end, and less in the high end?
Is there a way we can predict the bucket sizes of an exponential?
Or we could use an enumerated histogram, with manually defined bucket sizes?
Updated•8 years ago
|
Flags: needinfo?(cpearce)
Attachment #8834174 -
Flags: review?(wmccloskey)
Attachment #8834174 -
Flags: review?(francois)
Reporter | ||
Comment 8•8 years ago
|
||
(In reply to Chris Pearce (:cpearce) from comment #7)
> Perhaps we do in fact want an exponential histogram, but with more buckets
> so that we have more precision in the low end, and less in the high end?
But don't we want less precision in the low end? I mean, a sync IPC message that takes 10ms vs 20ms is nowhere near as bad as one that takes 110ms vs 120ms. (I know this isn't a good example, but most of these probes will probably have most of their values in the first bucket. I think the interesting probes are those who don't adhere to that pattern -- aka have more probes distributed elsewhere.)
> Is there a way we can predict the bucket sizes of an exponential?
>
> Or we could use an enumerated histogram, with manually defined bucket sizes?
That could be another option, yes.
Comment 9•8 years ago
|
||
https://telemetry.mozilla.org/histogram-simulator/ lets you simulate and see buckets for various configs. Note that if you use something other than linear/exponential you'll lose the stddev/mean calculations. I don't know whether that matters in this case.
Thanks Benjamin. I think we probably should be setting the "low" value to something higher. Maybe 32ms or 40ms. That way we still get a 0<=x<40 bucket. And if we use an exponential histogram, we get a lot of precision in the range 40ms-75ms. I also think we should use a lower "high" value. I doubt we care about the difference between, say, 750ms and 1250ms. Anything higher than 500ms or so should probably just count as "too high".
Reporter | ||
Updated•8 years ago
|
Assignee: nobody → michael
Assignee | ||
Comment 11•8 years ago
|
||
I'm not really sure what we want the probe to look like now, but we probably want to fold this probe change into bug 1343729.
Ehsan, do you feel strongly about the linear vs exponential issue? I think we probably want to increase the number of buckets and potentially bring down the high value to get more resolution, but I'm not as confident about whether we want linear or exponential bucketing.
Flags: needinfo?(ehsan)
Reporter | ||
Comment 12•8 years ago
|
||
Hmm. I think Bill's suggestion in comment 10 is pretty reasonable, and I don't think I have a better idea off the top of my head right now, so let's proceed with that...
Flags: needinfo?(ehsan)
Assignee | ||
Comment 13•8 years ago
|
||
This patch applies on top of the patch from bug 1343729 and performs the rename for it.
MozReview-Commit-ID: 7JB7h06wCzu
Attachment #8845167 -
Flags: review?(wmccloskey)
Assignee | ||
Updated•8 years ago
|
Attachment #8834174 -
Attachment is obsolete: true
Attachment #8845167 -
Flags: review?(wmccloskey) → review+
Comment 14•8 years ago
|
||
Pushed by michael@thelayzells.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/d2328234fe31
Improve the precision of the IPC_SYNC_LATENCY_MS probe and rename it to IPC_SYNC_MAIN_LATENCY_MS, r=billm
Assignee | ||
Comment 15•8 years ago
|
||
Comment on attachment 8845167 [details] [diff] [review]
Improve the precision of the IPC_SYNC_LATENCY_MS probe and rename it to IPC_SYNC_MAIN_LATENCY_MS
Approval Request Comment
[Feature/Bug causing the regression]: N/A
[User impact if declined]: Less useful sync IPC telemetry on beta/aurora
[Is this code covered by automated tests?]: No
[Has the fix been verified in Nightly?]: Just landed.
[Needs manual test from QE? If yes, steps to reproduce]: No
[List of other uplifts needed for the feature/fix]: bug 1333489 must be uplifted to beta. Also depends on bug 1343729.
[Is the change risky?]: No
[Why is the change risky/not risky?]: Renames and updates parameters for a telemetry probe.
[String changes made/needed]: None
Attachment #8845167 -
Flags: approval-mozilla-beta?
Attachment #8845167 -
Flags: approval-mozilla-aurora?
Updated•8 years ago
|
Comment 16•8 years ago
|
||
bugherder |
Status: NEW → RESOLVED
Closed: 8 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla55
Comment 17•8 years ago
|
||
Comment on attachment 8845167 [details] [diff] [review]
Improve the precision of the IPC_SYNC_LATENCY_MS probe and rename it to IPC_SYNC_MAIN_LATENCY_MS
Improve precision of telemetry data. Beta53+ & Aurora54+.
Attachment #8845167 -
Flags: approval-mozilla-beta?
Attachment #8845167 -
Flags: approval-mozilla-beta+
Attachment #8845167 -
Flags: approval-mozilla-aurora?
Attachment #8845167 -
Flags: approval-mozilla-aurora+
Comment 18•8 years ago
|
||
bugherder uplift |
Comment 19•8 years ago
|
||
bugherder uplift |
Comment 20•8 years ago
|
||
Setting qe-verify- based on Michael's assessment on manual testing needs (see Comment 15).
Flags: qe-verify-
You need to log in
before you can comment on or make changes to this bug.
Description
•