Closed
Bug 1211411
Opened 9 years ago
Closed 9 years ago
Limit the number of thread hang stats reported to Telemetry
Categories
(Toolkit :: Telemetry, defect, P1)
Toolkit
Telemetry
Tracking
()
RESOLVED
FIXED
mozilla44
People
(Reporter: Dexter, Assigned: Dexter)
References
(Blocks 1 open bug)
Details
(Whiteboard: [unifiedTelemetry][measurement:client])
Attachments
(1 file, 2 obsolete files)
2.76 KB,
patch
|
Dexter
:
review+
lizzard
:
approval-mozilla-beta+
|
Details | Diff | Splinter Review |
In bug 1187864 comment 3 we found out that some pings report too many thread hang stats.
Since the repeated "(chrome script)" stack entries are not really useful we should collapse those into one "(chrome script)" or "(chrome script <N>)".
Assignee | ||
Updated•9 years ago
|
Assignee | ||
Comment 1•9 years ago
|
||
We should also limit the number of reported hangs.
Comment 2•9 years ago
|
||
So, per bug 1187864 there are different things we can do here:
(1) limit the number of hang stat entries
(2) limit the stack depth to a sane number
(3) collapse repeated successive "(chrome script)" to one "(chrome script)" or "(chrome script <N>)"
For (3) i think we should just stay with collapsing into the simple "(chrome script)" (unless there are compelling reasons to make it more complicated).
I'd be fine with moving this part to a follow-up if it is no big win.
Updated•9 years ago
|
Points: 2 → 3
Assignee | ||
Updated•9 years ago
|
Assignee: nobody → alessio.placitelli
Assignee | ||
Updated•9 years ago
|
status-firefox42:
--- → wontfix
status-firefox43:
--- → affected
Assignee | ||
Updated•9 years ago
|
Priority: P2 → P1
Assignee | ||
Comment 3•9 years ago
|
||
Analysing the data from bug 1191846, we found out that fixing the stuff from point (2) and (3) in comment 2, along with the future fix from bug 1213780, is enough to bring down the size of the majority of the big pings to less than 1Mb.
This means that we're not limiting the number of thread hang stat entries, for the moment.
In order to find a reasonable maximum stack depth for (2), we've used a python notebook [1] to check the 95th percentile of BHR stack depths received by the Telemetry servers for the Nightly population. This value is 11, so that's the maximum depth our reported stacks will have.
Thanks Vladan and Georg for their feedback and suggestions.
[1] - https://gist.github.com/Dexterp37/fc4a043fb442d4e4be90
Assignee | ||
Comment 4•9 years ago
|
||
This patch changes the BackgroundHangMonitor to clean up the reported hang stacks by collapsing repeated "(chrome script)"/"(content script)" entries.
It also limits the stack depth by removing the topmost frames if greater than the maximum allowed depth.
Assignee | ||
Updated•9 years ago
|
Attachment #8673517 -
Flags: review?(vladan.bugzilla)
Updated•9 years ago
|
Status: NEW → ASSIGNED
Comment 5•9 years ago
|
||
Comment on attachment 8673517 [details] [diff] [review]
bug1211411.patch
Review of attachment 8673517 [details] [diff] [review]:
-----------------------------------------------------------------
::: xpcom/threads/BackgroundHangMonitor.cpp
@@ +32,5 @@
> #define BHR_BETA_MOD 100;
>
> +// Maximum depth of the call stack in the reported thread hangs. This value represents
> +// the 95th pecentile of the thread hangs stack depths reported by Telemetry.
> +const int kMaxThreadHangStackDepth = 11;
static const int
@@ +417,5 @@
> + // Limit the depth of the reported stack if greater than our limit. Only keep its
> + // last entries.
> + if (mHangStack.length() > kMaxThreadHangStackDepth) {
> + const int elementsToRemove = mHangStack.length() - kMaxThreadHangStackDepth;
> + mHangStack.erase(mHangStack.begin(), mHangStack.begin() + elementsToRemove);
you can replace this with:
mHangStack.resize(std::min(mHangStack.length, kMaxThreadHangStackDepth));
Attachment #8673517 -
Flags: review?(vladan.bugzilla) → review+
Assignee | ||
Comment 6•9 years ago
|
||
Thanks for the review. This patch addresses your first comment and, as discussed over IRC, I've kept the second change and added a little comment there.
Moreover, this patch adds the "(reduced stack)" entry to mark that the stack was reduced.
Attachment #8673517 -
Attachment is obsolete: true
Attachment #8674405 -
Flags: review?(vladan.bugzilla)
Updated•9 years ago
|
Attachment #8674405 -
Flags: review?(vladan.bugzilla) → review+
Assignee | ||
Comment 7•9 years ago
|
||
Assignee | ||
Comment 8•9 years ago
|
||
This patch fixes the test fallout by changing the int constant to a size_t.
Try: https://treeherder.mozilla.org/#/jobs?repo=try&revision=78a986b87c96
Attachment #8674405 -
Attachment is obsolete: true
Attachment #8674878 -
Flags: review+
Assignee | ||
Comment 9•9 years ago
|
||
https://hg.mozilla.org/integration/fx-team/rev/73dbb4a043f8698c835e9b0e2ca4817ecc03f4bf
Bug 1211411 - Limit the number of thread hang stats reported to Telemetry. r=vladan
Assignee | ||
Updated•9 years ago
|
Points: 3 → 1
Comment 10•9 years ago
|
||
Status: ASSIGNED → RESOLVED
Closed: 9 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla44
Comment 11•9 years ago
|
||
Alessio: how do you feel about uplifting this patch and the patch in bug 1219751 to Beta 43? It's already on Aurora 44 afaict.
If you're ok with uplifting these, I'd like to get them in by the end of the week, in time for next week's A/B experiment
Blocks: 1222894
Status: RESOLVED → REOPENED
Flags: needinfo?(alessio.placitelli)
Resolution: FIXED → ---
Assignee | ||
Comment 12•9 years ago
|
||
Comment on attachment 8674878 [details] [diff] [review]
bug1211411.patch
Approval Request Comment
[Feature/regressing bug #]: Limit the depth of thread hang stacks, consistently reducing the size of the pings sent to Telemetry servers.
[User impact if declined]: This isn't a user-facing feature. However, without this patches bug 1222894 could potentially cause sending a lot of oversized pings, wasting both storage and processing time server side, thus increasing costs.
[Describe test coverage new/current, TreeHerder]: This has been on m-c for 3 weeks without any reported issue.
[Risks and why]: Low risk. This only touches Telemetry & Thread Hang stacks, no front facing features or other systems.
[String/UUID change made/needed]: None.
Please note that this is part of an uplift stack required for bug 1222894. This is the stack (the patches correctly apply in order to mozilla-beta): 1213780, 1211411, 1215540, 1219751.
Flags: needinfo?(alessio.placitelli)
Attachment #8674878 -
Flags: approval-mozilla-beta?
Comment 13•9 years ago
|
||
Comment on attachment 8674878 [details] [diff] [review]
bug1211411.patch
OK on m-c and m-a for a while now.
Please uplift this as part of the telemetry work for beta 4.
Attachment #8674878 -
Flags: approval-mozilla-beta? → approval-mozilla-beta+
Comment 14•9 years ago
|
||
Status: REOPENED → RESOLVED
Closed: 9 years ago → 9 years ago
Resolution: --- → FIXED
You need to log in
before you can comment on or make changes to this bug.
Description
•