Closed Bug 735768 Opened 13 years ago Closed 13 years ago

make persistent telemetry more robust

Tracking

()

Status:

RESOLVED FIXED

Milestone:

mozilla14

Tracking Flags:

Tracking

Status

firefox13

---

fixed

People

(Reporter: froydnj, Assigned: froydnj)

References

Details

(Whiteboard: [qa-])

Attachments

(7 files)

part 1 - robustify C++ side 13 years ago Nathan Froyd [:froydnj] 1.64 KB, patch	taras.mozilla : review+ vladan : checkin+	Details \| Diff \| Splinter Review
part 2 - robustify JS side 13 years ago Nathan Froyd [:froydnj] 2.52 KB, patch	taras.mozilla : review+ vladan : checkin+	Details \| Diff \| Splinter Review
part 3 - add tests 13 years ago Nathan Froyd [:froydnj] 10.64 KB, patch	taras.mozilla : review+ vladan : checkin+	Details \| Diff \| Splinter Review
telemetry submission rates post-patch 13 years ago Nathan Froyd [:froydnj] 78.77 KB, image/png		Details
part 1 - robustify C++ side, aurora version 13 years ago Nathan Froyd [:froydnj] 1.57 KB, patch	froydnj : review+ lsblakk : approval-mozilla-aurora+	Details \| Diff \| Splinter Review
part 2 - robustify JS side, aurora version 13 years ago Nathan Froyd [:froydnj] 2.43 KB, patch	froydnj : review+ lsblakk : approval-mozilla-aurora+	Details \| Diff \| Splinter Review
part 3 - add tests, aurora version 13 years ago Nathan Froyd [:froydnj] 10.45 KB, patch	froydnj : review+ lsblakk : approval-mozilla-aurora+	Details \| Diff \| Splinter Review

Nathan Froyd [:froydnj]

Assignee

Description

•

13 years ago

There are some potential problems lurking in persistent telemetry that need to be addressed. Patches incoming.

Nathan Froyd [:froydnj]

Assignee

Comment 1

•

13 years ago

Attached patch part 1 - robustify C++ side — Details — Splinter Review

We should have been checking for what happened when we reflected persisted histograms into JS. Just skip corrupt histograms as we do for normal telemetry.

Attachment #605877 - Flags: review?(taras.mozilla)

Nathan Froyd [:froydnj]

Assignee

Comment 2

•

13 years ago

Attached patch part 2 - robustify JS side — Details — Splinter Review

We should make sure on the JS side that if something goes wrong, we basically abort sending the persisted data.

Attachment #605878 - Flags: review?(taras.mozilla)

Nathan Froyd [:froydnj]

Assignee

Updated

•

13 years ago

Attachment #605878 - Attachment is patch: true

Nathan Froyd [:froydnj]

Assignee

Comment 3

•

13 years ago

Attached patch part 3 - add tests — Details — Splinter Review

Should have done this initially. Needs some tweaks to the IDL interface, but otherwise straightforward. Testing for corruption here is hard. It can happen during testing, but it doesn't happen very often. In any event, having tests makes me feel better about everything not falling apart.

Attachment #605879 - Flags: review?(taras.mozilla)

Nathan Froyd [:froydnj]

Assignee

Updated

•

13 years ago

Whiteboard: [autoland-try:-b do -p all -u all -t none]

Mozilla RelEng Bot

Updated

•

13 years ago

Whiteboard: [autoland-try:-b do -p all -u all -t none] → [autoland-in-queue]

Mozilla RelEng Bot

Comment 4

•

13 years ago

Autoland Patchset: Patches: 605877, 605878, 605879 Branch: mozilla-central => try Destination: http://hg.mozilla.org/try/pushloghtml?changeset=0505978c2624 Try run started, revision 0505978c2624. To cancel or monitor the job, see: https://tbpl.mozilla.org/?tree=Try&rev=0505978c2624

(dormant account)

Updated

•

13 years ago

Attachment #605877 - Flags: review?(taras.mozilla) → review+

(dormant account)

Comment 5

•

13 years ago

Comment on attachment 605878 [details] [diff] [review] part 2 - robustify JS side should record this corruption via telemetry

Attachment #605878 - Flags: review?(taras.mozilla) → review+

(dormant account)

Updated

•

13 years ago

Attachment #605879 - Flags: review?(taras.mozilla) → review+

Nathan Froyd [:froydnj]

Assignee

Comment 6

•

13 years ago

Try run looks pretty green, marking checkin-needed. (In reply to Taras Glek (:taras) from comment #5) > should record this corruption via telemetry If I do this, I'll do it in a followup bug, but we already have telemetry for corrupt histograms in a session, so doing corruption counting for persisted histograms just sounds like double-counting for little benefit.

Keywords: checkin-needed

Mozilla RelEng Bot

Comment 8

•

13 years ago

Try run for 0505978c2624 is complete. Detailed breakdown of the results available here: https://tbpl.mozilla.org/?tree=Try&rev=0505978c2624 Results (out of 216 total builds): exception: 1 success: 171 warnings: 44 Builds (or logs if builds failed) available at: http://ftp.mozilla.org/pub/mozilla.org/firefox/try-builds/autolanduser@mozilla.com-0505978c2624

Mozilla RelEng Bot

Updated

•

13 years ago

Whiteboard: [autoland-in-queue]

Dão Gottwald [:dao]

Updated

•

13 years ago

Assignee: nobody → nfroyd

Vladan Djeric (:vladan)

Comment 9

•

13 years ago

https://hg.mozilla.org/integration/mozilla-inbound/rev/d8f18e891c9a https://hg.mozilla.org/integration/mozilla-inbound/rev/49a7eafa791f https://hg.mozilla.org/integration/mozilla-inbound/rev/4ee7f4594ba9

Keywords: checkin-needed

Target Milestone: --- → mozilla14

Vladan Djeric (:vladan)

Updated

•

13 years ago

Attachment #605877 - Flags: checkin+

Vladan Djeric (:vladan)

Updated

•

13 years ago

Attachment #605878 - Flags: checkin+

Vladan Djeric (:vladan)

Updated

•

13 years ago

Attachment #605879 - Flags: checkin+

Marco Bonardo [:mak]

Comment 10

•

13 years ago

https://hg.mozilla.org/mozilla-central/rev/d8f18e891c9a https://hg.mozilla.org/mozilla-central/rev/49a7eafa791f https://hg.mozilla.org/mozilla-central/rev/4ee7f4594ba9

Status: NEW → RESOLVED

Closed: 13 years ago

Resolution: --- → FIXED

Nathan Froyd [:froydnj]

Assignee

Comment 11

•

13 years ago

Attached image telemetry submission rates post-patch — Details

Here's a graph of what happened to our telemetry submission rates after we fixed this. Putting this here for evidence of Aurora uplift goodness.

Nathan Froyd [:froydnj]

Assignee

Comment 12

•

13 years ago

Attached patch part 1 - robustify C++ side, aurora version — Details — Splinter Review

[Approval Request Comment] Regression caused by (bug #): Bug 707320 and follow on patches. User impact if declined: Spurious error messages in the console. Testing completed (on m-c, etc.): On m-c; see attached graph of telemetry submission rates for evidence this pushes us in the right direction. Risk to taking this patch (and alternatives if risky): Epsilon. We will not be collecting telemetry data if this patch is declined.

Attachment #607954 - Flags: review+

Attachment #607954 - Flags: approval-mozilla-aurora?

Nathan Froyd [:froydnj]

Assignee

Comment 13

•

13 years ago

Attached patch part 2 - robustify JS side, aurora version — Details — Splinter Review

Attachment #607955 - Flags: review+

Attachment #607955 - Flags: approval-mozilla-aurora?

Nathan Froyd [:froydnj]

Assignee

Comment 14

•

13 years ago

Attached patch part 3 - add tests, aurora version — Details — Splinter Review

This patch is optional; the tests are merely there for peace of mind. The previous two patches are the important ones. [Approval Request Comment] Regression caused by (bug #): Bug 707320 and follow on patches. User impact if declined: None. Testing completed (on m-c, etc.): On m-c. Risk to taking this patch (and alternatives if risky): Epsilon.

Attachment #607956 - Flags: review+

Attachment #607956 - Flags: approval-mozilla-aurora?

Lukas Blakk [:lsblakk] use ?needinfo

Comment 15

•

13 years ago

Comment on attachment 607954 [details] [diff] [review] part 1 - robustify C++ side, aurora version [Triage Comment] Very nice. Happy to see this and thank you for the screenshot to prove the data is being positively affected.

Attachment #607954 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+

Lukas Blakk [:lsblakk] use ?needinfo

Updated

•

13 years ago

Attachment #607955 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+

Lukas Blakk [:lsblakk] use ?needinfo

Updated

•

13 years ago

Attachment #607956 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+

Lukas Blakk [:lsblakk] use ?needinfo

Comment 16

•

13 years ago

Please continue to monitor the data after this lands, let's check in a week or so from now and ensure there are no regressions.

Nathan Froyd [:froydnj]

Assignee

Comment 17

•

13 years ago

Pushed to aurora: https://hg.mozilla.org/releases/mozilla-aurora/rev/2fdf7be9bd2c https://hg.mozilla.org/releases/mozilla-aurora/rev/f30faa038ccb https://hg.mozilla.org/releases/mozilla-aurora/rev/fe19b0150074

status-firefox13: --- → fixed

u279076

Comment 18

•

13 years ago

Is there something QA can do to verify this fix?

Whiteboard: [qa?]

Nathan Froyd [:froydnj]

Assignee

Comment 19

•

13 years ago

I don't think there's a good STR for this.

u279076

Comment 20

•

13 years ago

Okay, thanks Nathan. If something comes to light which we can use to test please mark this [qa+].

Whiteboard: [qa?] → [qa-]

You need to log in before you can comment on or make changes to this bug.