Telemetry feature for Fennec Stumbler

RESOLVED FIXED in Firefox 37

Status

defect
RESOLVED FIXED
5 years ago
5 years ago

People

(Reporter: garvan, Assigned: garvan)

Tracking

Firefox 37
Firefox 37

Firefox Tracking Flags

(Not tracked)

Details

Attachments

(4 attachments, 6 obsolete attachments)

Adding histograms for the following:

"STUMBLER_TIME_BETWEEN_UPLOADS_SEC"
"STUMBLER_BYTES_UPLOADED_PER_SEC"
"STUMBLER_TIME_BETWEEN_START_SEC;
"STUMBLER_BYTES_PER_UPLOAD"
"STUMBLER_OBSERVATIONS_PER_UPLOAD"
"STUMBLER_CELLS_PER_UPLOAD"
"STUMBLER_WIFIS_PER_UPLOAD"
"STUMBLER_OBSERVATIONS_PER_DAY"

The mozstumbler github issue for this is https://github.com/mozilla/MozStumbler/pull/1378
Summary: Telemetry for Fennec Stumbler → Telemetry feature for Fennec Stumbler
Version: Firefox 35 → Firefox 37
Hopefully this isn't too many new histograms to add.
Attachment #8543028 - Flags: review?(rnewman)
Attachment #8543030 - Flags: review?(vng)
Attachment #8543031 - Flags: review?(vng)
Comment on attachment 8543028 [details] [diff] [review]
Part 1: histograms.json update

Review of attachment 8543028 [details] [diff] [review]:
-----------------------------------------------------------------

::: toolkit/components/telemetry/Histograms.json
@@ +7061,5 @@
>      "keyed": true,
>      "description": "Counts of plugin and content process crashes which are reported with a crash dump."
> +  },
> +  "STUMBLER_TIME_BETWEEN_UPLOADS_SEC": {
> +    "expires_in_version": "never",

Probably all of these measurements should be marked as expiring in Fx45 or sooner. You can always extend them later.

@@ +7067,5 @@
> +    "n_buckets": 50,
> +    "high": 259200,
> +    "description": "Stumbler: The time in seconds between uploads."
> +  },
> +  "STUMBLER_BYTES_UPLOADED_PER_SEC": {

I'm not sure about this measurement. Is this transfer rate, a measure of amount of data uploaded, or something else?

@@ +7081,5 @@
> +    "n_buckets": 50,
> +    "high": 259200,
> +    "description": "Stumbler: The time between the service starts."
> +  },
> +  "STUMBLER_BYTES_PER_UPLOAD": {

For the uploads, I would prefer the pattern:

  STUMBLER_UPLOAD_BYTES
  STUMBLER_UPLOAD_OBSERVATIONS

etc.

@@ +7085,5 @@
> +  "STUMBLER_BYTES_PER_UPLOAD": {
> +    "expires_in_version": "never",
> +    "kind": "exponential",
> +    "n_buckets": 50,
> +    "high": 100000,

Is 100K the max? Should you make the max higher to find bugs?

@@ +7102,5 @@
> +    "n_buckets": 50,
> +    "high": 5000,
> +    "description": "Stumbler: The cells per upload."
> +  },
> +  "STUMBLER_WIFIS_PER_UPLOAD": {

STUMBLER_UPLOAD_APS?

@@ +7113,5 @@
> +  "STUMBLER_OBSERVATIONS_PER_DAY": {
> +    "expires_in_version": "never",
> +    "kind": "exponential",
> +    "n_buckets": 50,
> +    "high": 5000,

What are the realistic high values for observations per day, APs and cells per upload? Do we expect any of those to actually hit 5000? Or dramatically exceed it?
Thanks Richard, I addressed your comments in this updated patch.

> Probably all of these measurements should be marked as expiring in Fx45 or
> sooner. You can always extend them later.

Agreed, changed to expire in 45.
 
> > +  "STUMBLER_BYTES_UPLOADED_PER_SEC": {
> 
> I'm not sure about this measurement. Is this transfer rate, a measure of
> amount of data uploaded, or something else?

It is a volume measurement, normalized to per/sec, I updated the name and description.
 
> For the uploads, I would prefer the pattern:
> 
>   STUMBLER_UPLOAD_BYTES
>   STUMBLER_UPLOAD_OBSERVATIONS

Done.

> Is 100K the max? Should you make the max higher to find bugs?

Good point, upped this.
 
> STUMBLER_UPLOAD_APS?

Name changed (..WIFI_AP_COUNT)
 
> @@ +7113,5 @@
> > +  "STUMBLER_OBSERVATIONS_PER_DAY": 
> What are the realistic high values for observations per day, APs and cells
> per upload? Do we expect any of those to actually hit 5000? Or dramatically
> exceed it?

I raised these values in the updated patch, but I don't have enough data to really state what expected max is.
Attachment #8543028 - Attachment is obsolete: true
Attachment #8543028 - Flags: review?(rnewman)
Attachment #8544187 - Flags: review+
Attachment #8544187 - Flags: review+ → review?(rnewman)
- Updated to match changes in Histograms.json

- Added one more probe: TELEMETRY_TIME_BETWEEN_RECEIVED_LOCATIONS_SEC
Attachment #8543031 - Attachment is obsolete: true
Attachment #8543031 - Flags: review?(vng)
Attachment #8544192 - Flags: review?(vng)
(prev patch accidentally included debugging code in it, cleaned up)

- Updated to match changes in Histograms.json

- Added one more probe: TELEMETRY_TIME_BETWEEN_RECEIVED_LOCATIONS_SEC
Attachment #8544192 - Attachment is obsolete: true
Attachment #8544192 - Flags: review?(vng)
Attachment #8544227 - Flags: review?(vng)
Comment on attachment 8544187 [details] [diff] [review]
Part 1: histograms.json update

Review of attachment 8544187 [details] [diff] [review]:
-----------------------------------------------------------------

::: toolkit/components/telemetry/Histograms.json
@@ +7114,5 @@
> +    "expires_in_version": "45",
> +    "kind": "exponential",
> +    "n_buckets": 50,
> +    "high": 10000,
> +    "description": "Stumbler: The observations per day."

Make sure this is well-defined. Each time an entry is added to this histogram, does it represent observations in the last 24 hours? If not, what does it represent, and is it useful in the aggregate?

@@ +7121,5 @@
> +    "expires_in_version": "45",
> +    "kind": "exponential",
> +    "n_buckets": 50,
> +    "high": 86400,
> +    "description": "Stumbler: The time between receiving passive locations."

Time since last passive location during stumbling activity?
Attachment #8544187 - Flags: review?(rnewman) → review+
Attachment #8543030 - Flags: review?(vng) → review+
Comment on attachment 8544227 [details] [diff] [review]
Part 3: added telemetry to stumbler java code

Review of attachment 8544227 [details] [diff] [review]:
-----------------------------------------------------------------

Just have to make sure that we backport the TELEMETRY_TIME_BETWEEN_RECEIVED_LOCATIONS_SEC metric back into the Android standalone client.
Attachment #8544227 - Flags: review?(vng) → review+
> > +    "description": "Stumbler: The observations per day."
> Make sure this is well-defined. Each time an entry is added to this
> histogram, does it represent observations in the last 24 hours? If not, what
> does it represent, and is it useful in the aggregate?

Good point. I'll update the comment for this to clarify its use. It is normalized to per-day for reporting purposes. What we know in the code is observation count per time period (often a few days), which is updated only when uploading (which is a sporadic event). We would like to make statements about roughly how many observations per-day, per-week, per-month, are normal/low/high.
 
> @@ +7121,5 @@
> > +    "expires_in_version": "45",
> > +    "kind": "exponential",
> > +    "n_buckets": 50,
> > +    "high": 86400,
> > +    "description": "Stumbler: The time between receiving passive locations."
> 
> Time since last passive location during stumbling activity?

After a passive location arrives, when the next one arrives, probe that time diff. Should I reword the description? I am not sure what you are clarifying.
> Just have to make sure that we backport the
> TELEMETRY_TIME_BETWEEN_RECEIVED_LOCATIONS_SEC metric back into the Android
> standalone client.

Will do, I am still testing this locally to ensure I am happy with the metrics before changing to checkin-needed. Looks ok when I create an artificial test environment, gathering enough data in the wild is time-consuming, and tricky because I have to make sure Fennec stays live, since the local telemetry data appears to be in-memory.
> After a passive location arrives, when the next one arrives, probe that time
> diff. Should I reword the description? I am not sure what you are clarifying.

Yeah, just trying to make these descriptions as clear as possible. Clarifying: does this only record while we're stumbling, or is it recorded every time we get a passive location? (Depends on when you unattach listeners, but it's worth documenting here.)

E.g., "Time in seconds between passive location updates received while stumbling."
> Yeah, just trying to make these descriptions as clear as possible.
> Clarifying: does this only record while we're stumbling, or is it recorded
> every time we get a passive location? (Depends on when you unattach
> listeners, but it's worth documenting here.)

I think I can clarify a bit more, and in the process maybe come up with a better wording.
The phrase 'while stumbling' I avoid because it can mean 2 things:
1) User opted in to stumbling, passive listeners are registered at all times, thus user is "stumbling"
2) Because in (1) stumbling system is idle, only when a passive location arrives does the system wake up, run some scanning and then go back to sleep after a short time period. This is actively "stumbling" during this scan.

We are interested in this time distribution between passive location events arriving.

Hmm, still stuck for a better wording than the original one :).
Updating patch to remove an unused block of code. Carrying over r+ from [:vng].
Attachment #8544227 - Attachment is obsolete: true
Attachment #8544953 - Flags: review+
Updated description for observations per day histogram: "Stumbler: The number of observations between upload events, normalized to per day."
Carry over r+ from rnewman
Attachment #8544187 - Attachment is obsolete: true
Attachment #8544957 - Flags: review+
Carry over r+ from vng.
Removed unnecessary variable.
Attachment #8544953 - Attachment is obsolete: true
Attachment #8544972 - Flags: review+
vng, here is a debugging patch, this will create toasts on the scanning events:
https://pastebin.mozilla.org/8167129
Attachment #8544957 - Attachment description: histogramsjson.diff → Part 1: updates to Histograms.json
https://hg.mozilla.org/mozilla-central/rev/3d6070292804
https://hg.mozilla.org/mozilla-central/rev/db1367a5cc7c
https://hg.mozilla.org/mozilla-central/rev/6ba69cd7598b
Status: NEW → RESOLVED
Closed: 5 years ago
Resolution: --- → FIXED
Whiteboard: [fixed-in-fx-team]
Target Milestone: --- → Firefox 37
You need to log in before you can comment on or make changes to this bug.