Closed Bug 1639843 Opened 5 years ago Closed 5 years ago

Add a "reason" field to command-received telemetry events

Tracking

()

Status:

RESOLVED FIXED

Milestone:

Firefox 80

Tracking Flags:

Tracking

Status

firefox80

---

fixed

People

(Reporter: rfkelly, Assigned: lougenia)

References

(Blocks 1 open bug)

Details

Attachments

(2 files, 3 obsolete files)

Bug 1639843 - Added reason field to command-received telemetry events 5 years ago Lougenia Bailey 47 bytes, text/x-phabricator-request		Details \| Review
data-review-request.md 5 years ago Lougenia Bailey 2.80 KB, text/plain		Details
data-review-request.md 5 years ago Lougenia Bailey 2.78 KB, text/plain		Details
data-review-request2.md 5 years ago Lougenia Bailey 3.48 KB, text/plain		Details
data-review-request2.md 5 years ago Lougenia Bailey 3.55 KB, text/plain	chutten\|PTO : data-review+	Details

Ryan Kelly [:rfkelly]

Reporter

Description

•

5 years ago

The sync ping contains "command-received" events designed to measure the success rate of send-tab. I don't believe there is currently a way to tell whether the command was received as a result of a push notification (which is the intended and best behaviour) or as a result of periodic background polling (which is fallback with poorer user experience).

We have quite a few reports of push-based receipt of tabs not working well. We could add a "reason" field to these events to distinguish between the two cases, and get a sense of how well the push-based delivery of tabs is actually working in practice.

Ryan Kelly [:rfkelly]

Reporter

Comment 1

•

5 years ago

(The situation is probably not quite as simple as I make out in the description above, because a single push can trigger us to fetch many sent tabs in a single operation. I recall us having some metrics around the number of tabs discovered via fallback polling in the past, but removing them.)

Lougenia Bailey

Assignee

Comment 2

•

5 years ago

(In reply to Ryan Kelly [:rfkelly] from comment #1)

(The situation is probably not quite as simple as I make out in the description above, because a single push can trigger us to fetch many sent tabs in a single operation. I recall us having some metrics around the number of tabs discovered via fallback polling in the past, but removing them.)

Do you have any references to that fallback polling work that I could review?

Mark Hammond [:markh] [:mhammond]

Comment 3

•

5 years ago

(In reply to Ryan Kelly [:rfkelly] from comment #1)

(The situation is probably not quite as simple as I make out in the description above, because a single push can trigger us to fetch many sent tabs in a single operation. I recall us having some metrics around the number of tabs discovered via fallback polling in the past, but removing them.)

Those metrics were never actually removed - they just expired. They are here and I re-enabled them as part of bug 1644598. Note however that this polling function is only called when the user hits "sync now", so I'm not convinced the metrics will be useful (eg, if a user never hits "sync now", then they will never record they had missed commands even though they do.)

Ryan Kelly [:rfkelly]

Reporter

Comment 4

•

5 years ago

Adding notes/context.

The telemetry event is emitted here, and I expect the reason field could be added alongside flowID and streamID in the event properties. As a new data collection, it will require data review.

The entrypoint for receiving a command is pollDeviceCommands, which already has the metrics Mark links to above. The receivedIndex argument, if non-zero, tells you which command we got a push message about, so I expect that if you plumb this down into the call to _handleCommands, you can use this to determine whether each individual received command was fetched thanks to a push (its index == the received index) or not.

The API that you're talkkng to here (and the shape of the returned commands) is documented here.

Ryan Kelly [:rfkelly]

Reporter

Comment 5

•

5 years ago

As a mental check that this is worth doing: with this extra data in place, I expect we can look at all the command-received telemetry events on a given day, and determine what percentage were retrieved as a result of a push notification and what percentage weren't (e.g. they were retreived via "sync now", or discovered in the process of fetching some other command from a push notification). We want to aim for most tabs being received via push notification.

Note however that this polling function is only called when the user hits "sync now", so I'm not convinced the metrics will be useful
(eg, if a user never hits "sync now", then they will never record they had missed commands even though they do.)

Noting for completeness that the approach in Comment 4 would help with this usefulness problem, because it will log cases when the user discovers missed commands as a side-effect of handling a push message for a more recent command. But it won't help for commands that are never received at all.

Mark Hammond [:markh] [:mhammond]

Comment 6

•

5 years ago

(In reply to Ryan Kelly [:rfkelly] from comment #5)

Determine what percentage were retrieved as a result of a push notification and what percentage weren't (e.g. they were retreived via "sync now", or discovered in the process of fetching some other command from a push notification). We want to aim for most tabs being received via push notification.

Right - however, as you imply, it's subtle - it's more about "did we get a push notification for this tab". IOW, if I sent a tab last week but didn't get a push, then this week I sent a tab and did get a push, I'll see both tabs based on getting that push. We only want that second tab to count as "via push" and that first tab should count as "via {something else}".

I was a bit worried that was going to be really tricky - but I think it's only going to be a little tricky :) For whoever takes this (IIUC, Lougenia?) I think this means:

The entry-point here is here

This will be called with 0 when we are polling for missed commands via "sync now", or it will be called with an "index" when called via push.
However, at line 116 we adjust this to be the index we last saw - which will almost certainly be non-zero, but less than receivedIndex if specified.
Each message we get back in the payload has an index property.
If the original receivedIndex is zero (ie, we are polling), then every command is considered missed.
If the original receivedIndex is non-zero, every message with an index less than that value is considered missed - even though it was a push notification that caused us to see it.

The slightly tricky part here is that the code which sends the telemetry event doesn't know these indexes and so can't know if it should be "missed" or not - but it shouldn't be difficult to pass the 'reason' down to where it's recorded.

Assuming that's correct: I actually only came here to ask what values we should record, and whether it's worth trying to capture the 3 possible cases:

explicit poll for no good reason (ie, "sync now" mashed)
push for a different command and we noticed the old one?
push for this command (which is what we want to see!)

I can't see a good reason not to record all 3 - so I propose reason be one of poll, push-missed, push?

(push-missed kinda sucks, but it doesn't matter as it's just for us, but bike-sheds welcome ;)

Lougenia Bailey

Assignee

Comment 7

•

5 years ago

Attached file Bug 1639843 - Added reason field to command-received telemetry events — Details

Lougenia Bailey

Assignee

Comment 8

•

5 years ago

Also https://treeherder.mozilla.org/#/jobs?repo=try&revision=e0bca219f06395cc56e662cd8651d894f5c3bf35

Phabricator Automation

Updated

•

5 years ago

Assignee: nobody → lougenia

Status: NEW → ASSIGNED

Lougenia Bailey

Assignee

Comment 9

•

5 years ago

Attached file data-review-request.md (obsolete) — Details

Lougenia Bailey

Assignee

Comment 10

•

5 years ago

Attached file data-review-request.md (obsolete) — Details

Attachment #9162015 - Attachment is obsolete: true

Attachment #9162025 - Flags: data-review?(chutten)

Chris H-C :chutten|PTO (back Oct 23)

Comment 11

•

5 years ago

Comment on attachment 9162025 [details] data-review-request.md There is a [new version](https://github.com/mozilla/data-review/blob/master/request.md) of the data review request form with a question pertaining to the location of the collection's documentation that should be used instead of the older version.

Attachment #9162025 - Flags: data-review?(chutten)

Lougenia Bailey

Assignee

Comment 12

•

5 years ago

Attached file data-review-request2.md (obsolete) — Details

Replacing the old data review request form using the new template.

Attachment #9162025 - Attachment is obsolete: true

Attachment #9162301 - Flags: data-review?(chutten)

Mark Hammond [:markh] [:mhammond]

Comment 13

•

5 years ago

We probably want to add this to https://searchfox.org/mozilla-central/source/toolkit/components/telemetry/docs/data/sync-ping.rst and then you can add a link to it to the data review.

Lougenia Bailey

Assignee

Comment 14

•

5 years ago

Attached file data-review-request2.md — Details

Attachment #9162301 - Attachment is obsolete: true

Attachment #9162301 - Flags: data-review?(chutten)

Lougenia Bailey

Assignee

Comment 15

•

5 years ago

(In reply to Mark Hammond [:markh] [:mhammond] from comment #13)

We probably want to add this to https://searchfox.org/mozilla-central/source/toolkit/components/telemetry/docs/data/sync-ping.rst and then you can add a link to it to the data review.

Just updated and reloaded the data form--thanks.

Lougenia Bailey

Assignee

Updated

•

5 years ago

Attachment #9162831 - Flags: data-review?(chutten)

Chris H-C :chutten|PTO (back Oct 23)

Comment 16

•

5 years ago

Comment on attachment 9162831 [details] data-review-request2.md DATA COLLECTION REVIEW RESPONSE: Is there or will there be documentation that describes the schema for the ultimate data set available publicly, complete and accurate? Yes. Is there a control mechanism that allows the user to turn the data collection on and off? Yes. This collection is Telemetry so can be controlled through Firefox's Preferences. If the request is for permanent data collection, is there someone who will monitor the data over time? Yes, Lougenia Bailey is responsible. Using the category system of data types on the Mozilla wiki, what collection type of data do the requested measurements fall under? Category 2, Interaction. Is the data collection request for default-on or default-off? Default on for all channels. Does the instrumentation include the addition of any new identifiers? No. Is the data collection covered by the existing Firefox privacy notice? Yes. Does there need to be a check-in in the future to determine whether to renew the data? No. This collection is permanent. --- Result: datareview+

Attachment #9162831 - Flags: data-review?(chutten) → data-review+

Pulsebot

Comment 17

•

5 years ago

Pushed by lobailey@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/9c735f9393c9 Added reason field to command-received telemetry events r=markh,rfkelly

Atila Butkovits

Comment 18

•

5 years ago

bugherder

https://hg.mozilla.org/mozilla-central/rev/9c735f9393c9

Status: ASSIGNED → RESOLVED

Closed: 5 years ago

status-firefox80: --- → fixed

Resolution: --- → FIXED

Target Milestone: --- → Firefox 80

You need to log in before you can comment on or make changes to this bug.