Closed Bug 1215540 Opened 4 years ago Closed 4 years ago

Telemetry is reporting repeated hang annotations in Thread hangs

Categories

(Toolkit :: Telemetry, defect, P1)

defect
Points:
1

Tracking

()

RESOLVED FIXED
mozilla44
Tracking Status
firefox43 --- fixed
firefox44 --- fixed

People

(Reporter: Dexter, Assigned: Dexter)

References

(Blocks 1 open bug)

Details

(Whiteboard: [unifiedTelemetry][measurement:client])

Attachments

(1 file, 2 obsolete files)

As discovered in bug 1187864 comment 4, Telemetry is reporting repeated annotations for thread hangs [1].

In a particular sample ping, I found repeated annotations which summed up to a 2.2Mb blob.

We should probably address this issue by collapsing identical annotations entries.

[1] - https://pastebin.mozilla.org/8847691
Points: 2 → 1
Attached patch bug1215540.patch (obsolete) — Splinter Review
This patch stacks on the one from bug 1213780.
Assignee: nobody → alessio.placitelli
Status: NEW → ASSIGNED
Attachment #8676339 - Flags: review?(aklotz)
Attachment #8676339 - Flags: review?(aklotz) → review+
Attached patch bug1215540.patch (obsolete) — Splinter Review
Using nsAutoString instead of nsString.
Attachment #8676339 - Attachment is obsolete: true
Attachment #8677551 - Flags: review+
Attached patch bug1215540.patchSplinter Review
This patch works around the rooting issue detected by the hazard analysis. After a discussion with sfink (thanks!), the issue turned out to be a false positive. It was addressed by returning the JS array using a MutableObjectHandle parameter rather then returning the JSObject pointer.

Function '_ZN12_GLOBAL__N_1L23CreateJSHangAnnotationsEP9JSContextRKN7mozilla6VectorINS2_9UniquePtrINS2_11HangMonitor15HangAnnotationsENS2_13DefaultDeleteIS6_EEEELm0ENS2_17MallocAllocPolicyEEE|Telemetry.cpp:JSObject* {anonymous}::CreateJSHangAnnotations(JSContext*, const class mozilla::Vector<mozilla::UniquePtr<mozilla::HangMonitor::HangAnnotations> >*)' has unrooted '<returnvalue>' of type 'JSObject*' live across GC call '_ZN12nsTHashtableI15nsStringHashKeyED1Ev|nsTHashtable<EntryType>::~nsTHashtable() [with EntryType = nsStringHashKey]' at toolkit/components/telemetry/Telemetry.cpp:2956
    toolkit/components/telemetry/Telemetry.cpp:2956: Assign(57,58, return := __temp_23**)
    toolkit/components/telemetry/Telemetry.cpp:2956: Call(58,59, reportedAnnotations.~nsTHashtable()) [[GC call]]
    toolkit/components/telemetry/Telemetry.cpp:2956: Call(59,60, annotationsArray.~Rooted())
    toolkit/components/telemetry/Telemetry.cpp:2957:  [[end of function]]
Attachment #8677551 - Attachment is obsolete: true
Flags: needinfo?(alessio.placitelli)
Attachment #8679403 - Flags: review+
https://hg.mozilla.org/integration/fx-team/rev/56707eeec6e53453878731864eaac19ef1a0c5ab
Bug 1215540 - Fix Telemetry reporting repeated hang annotations for Thread hangs. r=aklotz
https://hg.mozilla.org/mozilla-central/rev/56707eeec6e5
Status: ASSIGNED → RESOLVED
Closed: 4 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla44
Can we uplift this to Beta 43?

See https://bugzilla.mozilla.org/show_bug.cgi?id=1211411#c11

Btw, was there a bug to limit the # of threadHang stacks reported to telemetry? (not just their depth)
Blocks: 1222894
Status: RESOLVED → REOPENED
Flags: needinfo?(alessio.placitelli)
Resolution: FIXED → ---
Comment on attachment 8679403 [details] [diff] [review]
bug1215540.patch

Approval Request Comment
[Feature/regressing bug #]: Remove duplicated annotations for thread hangs, consistently reducing the size of the pings sent to Telemetry servers.
[User impact if declined]: This isn't a user-facing feature. However, without this patches bug 1222894 could potentially cause sending a lot of oversized pings, wasting both storage and processing time server side, thus increasing costs.
[Describe test coverage new/current, TreeHerder]: This has been on m-c for 2 weeks without any reported issue. Thread hangs are xpcshell-test covered (test_ThreadHangStats.js)
[Risks and why]: Low risk. This only touches Telemetry & Thread Hang stacks, no front facing features or other systems.
[String/UUID change made/needed]: None.

Please note that this is part of an uplift stack required for bug 1222894. This is the stack (the patches correctly apply in order to mozilla-beta): 1213780, 1211411, 1215540, 1219751.
Flags: needinfo?(alessio.placitelli)
Attachment #8679403 - Flags: approval-mozilla-beta?
Marking affected for 43 since we intend to uplift this
Comment on attachment 8679403 [details] [diff] [review]
bug1215540.patch

Reducing size of telemetry ping info, please uplift to beta.
Attachment #8679403 - Flags: approval-mozilla-beta? → approval-mozilla-beta+
https://hg.mozilla.org/releases/mozilla-beta/rev/12762fdf5ab6
Status: REOPENED → RESOLVED
Closed: 4 years ago4 years ago
Resolution: --- → FIXED
(In reply to Vladan Djeric (:vladan) -- please needinfo! PTO Nov 16/19/20 from comment #11)
> Btw, was there a bug to limit the # of threadHang stacks reported to
> telemetry? (not just their depth)

Ouch, sorry, I forgot to answer that one. No, we didn't put any hard limit in place, as the other measures fixed all the big ping samples that I had.
You need to log in before you can comment on or make changes to this bug.