Closed Bug 1238810 Opened 5 years ago Closed 1 year ago

Move WEAVE_CONFIGURED and FXA_CONFIGURED into the telemetry environment

Categories

(Firefox :: Sync, defect, P2)

defect

Tracking

()

RESOLVED FIXED
Firefox 72
Tracking Status
firefox71 --- fixed
firefox72 --- fixed

People

(Reporter: markh, Assigned: tcsc)

References

(Depends on 1 open bug, Blocks 2 open bugs)

Details

(Whiteboard: [sync-metrics])

Attachments

(2 files, 1 obsolete file)

The fact these probes are recorded only once per session means it's not recorded in sub-sessions. See bug 1232050 comment 10 and bug 1236383 comment 11 for more context.
Flags: firefox-backlog+
Priority: -- → P2
Mark, in our new prioritization system, should this be P2 or P3?
Flags: needinfo?(markh)
(In reply to Chris Karlof [:ckarlof] from comment #1)
> Mark, in our new prioritization system, should this be P2 or P3?

P3 sounds good given we have nothing planned that relies on these probes existing.
Flags: needinfo?(markh)
Priority: P2 → P3
Whiteboard: [sync-metrics]
Priority: P3 → P2
I think this is fairly important to fix - maybe we can set the histogram on every sync?
(In reply to Leif Oines [:loines] from comment #3)
> I think this is fairly important to fix - maybe we can set the histogram on
> every sync?

We are recording a boolean, so it's important that we consistently report true and false, right? Because users not configured don't sync, this will avoid us recording the false value in sub-sessions.

Another alternative is that we listen for the 'internal-telemetry-after-subsession-split' observer topic (https://searchfox.org/mozilla-central/search?q=internal-telemetry-after-subsession-split) and always additionally record our historgram when that is sent?
you're right Mark - your solution would be much preferred
There are 2 things to explicitly call out here:

* This uses the "internal-telemetry-after-subsession-split" notification. This
  notification is documented in the telemetry code as being internal to the
  module - however, it has already escaped into BrowserUsageTelemetry.jsm
  via Dexter, so I'm hoping it's OK to use it here too. If not, please let
  me know what an alternative is.

* WEAVE_CONFIGURED is reported with either a 1 or 0 depending on whether
  sync is configured. However, FXA_CONFIGURED is different - it only records
  a 1 when FxA is configured and doesn't report anything when it's not.
  This hasn't changed in this patch - if FxA isn't configured, it will still
  report nothing after the session-split notification. Leif, is this OK, or
  should we make an attempt to have this probe report 1 and 0?
Leif, I can't find your account on phabricator, so couldn't flag you there for review - please see the above message and let me know what you think.
Flags: needinfo?(loines)
(In reply to Mark Hammond [:markh] from comment #6)
> Created attachment 9032841 [details]
> Bug 1238810 - record WEAVE_CONFIGURED and FXA_CONFIGURED after telemetry
> session splits. r?dexter
> 
> There are 2 things to explicitly call out here:
> 
> * This uses the "internal-telemetry-after-subsession-split" notification.
> This
>   notification is documented in the telemetry code as being internal to the
>   module - however, it has already escaped into BrowserUsageTelemetry.jsm
>   via Dexter, so I'm hoping it's OK to use it here too. If not, please let
>   me know what an alternative is.

BrowserUsageTelemetry.jsm was written by our team and it is an artifact of that.
internal-telemetry-after-subsession-split is internal and supported for use by other teams.

I'd like to understand the problem here better and find the right solution for it.

> The fact these probes are recorded only once per session means it's not recorded in sub-sessions.

Why exactly is that a problem?
Flags: needinfo?(markh)
Depending on the underlying problem, i think there different solutions for this:
- Make analysis work with the current approach (e.g. a clients_daily roll-up could count clients that had this true at least once).
- Put data points into the environment. (this is where we currently have most of the data fields that are describing the overall session characteristics, but it requires extra data engineering work)
- Don't reset specific measurements on subsession splits. (We could support a "lifetime" parameter for scalars and histograms like Glean does.)

Either way, i would recommend doing any of this as new metrics, as it changes the semantics significantly.

(In reply to Georg Fritzsche [:gfritzsche] from comment #8)
> internal-telemetry-after-subsession-split is internal and supported for use
> by other teams.

*not supported
> Created attachment 9032841 [details]
> Bug 1238810 - record WEAVE_CONFIGURED and FXA_CONFIGURED after telemetry
> session splits. r?dexter
> 
> There are 2 things to explicitly call out here:
> 
> * This uses the "internal-telemetry-after-subsession-split" notification.
> This
>   notification is documented in the telemetry code as being internal to the
>   module - however, it has already escaped into BrowserUsageTelemetry.jsm
>   via Dexter, so I'm hoping it's OK to use it here too. If not, please let
>   me know what an alternative is.
> 
> * WEAVE_CONFIGURED is reported with either a 1 or 0 depending on whether
>   sync is configured. However, FXA_CONFIGURED is different - it only records
>   a 1 when FxA is configured and doesn't report anything when it's not.
>   This hasn't changed in this patch - if FxA isn't configured, it will still
>   report nothing after the session-split notification. Leif, is this OK, or
>   should we make an attempt to have this probe report 1 and 0?

Ideally we would also have FXA_CONFIGURED be 1 or 0 also, though if that is substantially more work then maybe we can tackle it at another point.
Flags: needinfo?(loines)
(In reply to Georg Fritzsche [:gfritzsche] from comment #8)
> Why exactly is that a problem?

It leads to many null values for the probe when people have tried counting clients with sync configured on a given submission_date. True, they could carry the probe's value from the previous session-level ping forward until the end of the session during analysis, but that can be more difficult and computationally more expensive. Analysis would be simplified considerably if the value could be non-null for all ping submissions.
Flags: needinfo?(markh)
(In reply to Georg Fritzsche [:gfritzsche] [away Dec 22 - Jan 14] from comment #8)
> 
> BrowserUsageTelemetry.jsm was written by our team and it is an artifact of
> that.
> internal-telemetry-after-subsession-split is internal and supported for use
> by other teams.
> 
> I'd like to understand the problem here better and find the right solution
> for it.

I suspect it is the exact same reason it is already used in browser/modules/BrowserUsageTelemetry.jsm - "so that they reflect the correct value for the new subsession."

> 
> > The fact these probes are recorded only once per session means it's not recorded in sub-sessions.
> 
> Why exactly is that a problem?

For the same reason it's a problem in BrowserUsageTelemetry.jsm? ;)

Does comment 11 help clarify?
Flags: needinfo?(gfritzsche)

Ok, i think the cleanest solution would be that some of your probes/metrics don't reset on subsession splits.
This could be achieved by introducing a new property in Scalars.yaml/Histograms.json.

Then you could say in e.g. Scalars.yaml:

services.sync:
  configured:
    kind: boolean
    lifetime: application # This will make it not reset on subsession splits.
    # ...

Then you only record services.sync.configured once in a Firefox session and get the value sent in every main ping from that session.

@leif/mhammond: How is that for you?

@mreid: What do you think of adopting the lifetime feature from Glean early into Firefox Telemetry?
Should we do a lightweight proposal around this

Flags: needinfo?(gfritzsche) → needinfo?(mreid)
Flags: needinfo?(markh)
Flags: needinfo?(loines)

(In reply to Georg Fritzsche [:gfritzsche] from comment #13)

@mreid: What do you think of adopting the lifetime feature from Glean early into Firefox Telemetry?
Should we do a lightweight proposal around this

I think a proposal around this is a good idea, this change could definitely result in surprising behaviour at analysis time. For what it's worth I think adding lifetimes specifically for boolean scalars is pretty low risk and could be helpful for other things like this as well.

Flags: needinfo?(mreid)

This seems like a nice solution, but do we have a rough idea about the timeline for implementing this new property? I'm not saying we need this fixed ASAP, but it would be nice if we could get something riding the trains ASAP - I've had a few people try to use this probe and I've had to do a lot of explaining about why doing things like select submission_date, sync_configured, count(distinct client_id) group by 1,2 is misleading :)

OTOH, we have more accurate counts of sync clients server-side, so from that perspective this isn't mission critical. however it is important for correlating sync usage to other browser metrics, as this can't be done with the server-side data.

Flags: needinfo?(loines)

For timeline, realistically i think that could hit the next Firefox release, 67.
It's not much work from the client-side technically, but we would need to collect feedback on this change first.

If you want to take something settled right now, i propose putting this into the Telemetry environment and working with data engineering to get this into the datasets you need it in.

(In reply to Georg Fritzsche [:gfritzsche] from comment #13)

@leif/mhammond: How is that for you?

If it's good for @leif it's good for me :) However, I'm going to struggle to find the time to help implement this in the core telemetry framework.

Flags: needinfo?(markh)
Attachment #9032841 - Attachment is obsolete: true

(In reply to Mark Hammond [:markh] from comment #17)

(In reply to Georg Fritzsche [:gfritzsche] from comment #13)

@leif/mhammond: How is that for you?

If it's good for @leif it's good for me :) However, I'm going to struggle to find the time to help implement this in the core telemetry framework.

We would do those changes.

If you want to do something short-term, you can add a new property to TelemetryEnvironment.jsm (this is low effort).

:markh, can you think of any explanation for this: https://sql.telemetry.mozilla.org/queries/61690/source#158773 (note that this data is a 1% sample but its the relative differences that matter)

The number of clients with null is to be expected based on the discussion in this bug, but it seems odd to me that the number of clients with sync_configured = true and sync_configured = false are roughly even. Given the account registration rates from the FxA pipeline (as a share of overall firefox users), I would expect false to outnumber true many times over, regardless of whether it is only recorded at session start.

Flags: needinfo?(markh)

To be able to track the right telemetry with sync decouple, we need to make sure that the sync_configured probe is implemented correctly and should be triggered more frequently.

(In reply to Georg Fritzsche [:gfritzsche] from comment #13)

Ok, i think the cleanest solution would be that some of your probes/metrics don't reset on subsession splits.
This could be achieved by introducing a new property in Scalars.yaml/Histograms.json.

...

Dexter,
Georg isn't accepting needinfo requests, so I'm asking you :) Please redirect if you are the wrong person. Georg mentioned some improvements were being made to the telemetry modules to make recording this information possible - are they now capable of supporting this use-case?

Flags: needinfo?(markh) → needinfo?(alessio.placitelli)

adding ni back on myself for comment 19

Flags: needinfo?(markh)

I think it would be fine for us to just set this to true on syncs (or somewhere simliarly frequently called for sync users). It would mean it would never have a value of 0, but that's true of FXA_CONFIGURED now, which Leif indicated has the right semantics.

Flags: needinfo?(alessio.placitelli) → needinfo?(chutten)

No movement on the Scalar lifetimes feature has happened in the past seven months, I'm afraid. Given our team's current focus on Glean, I think the least-worst course of action would be to add the necessary points of information to the Telemetry Environment.

I don't think we want to trigger new "main" pings each time these values change, so we'll probably want to follow the pattern we use when GFX features are updated: listen on a topic, update the value, do not call _onEnvironmentChange.

Does this sound amenable?

Flags: needinfo?(chutten)

:chutten, is there a reason we couldn't add a pref that indicates this value? That seems to be the easiest way to add things to the telemetry environment. It's also pretty unlikely that these values change frequently -- Users don't sign in/out often, but maybe I'm incorrect about your concern here.

(Alternatively, my initial plan in comment 23 was to move the WEAVE_CONFIGURED probe to somewhere it was more likely to be hit reliably (at the start of a sync, perhaps), and just record a 1. This would somewhat mirror how we behave for FXA_CONFIGURED, and be simpler to implement).

That said, providing

Assignee: nobody → tchiovoloni

Ugh. I actually meant to just assign myself, and forgot this comment was here. I'm going to approach this by adding both of these to the environment, as suggested. It seems like the best approach. I'd edit the comment, but bugzilla won't let me 🤷‍♂️

Flags: needinfo?(markh)
Summary: Change WEAVE_CONFIGURED and FXA_CONFIGURED telemetry so it is correctly recorded in subsessions. → Move WEAVE_CONFIGURED and FXA_CONFIGURED into the telemetry environment
Blocks: 1582253

Bug 1238810 (part 2) - Remove FXA_CONFIGURED and WEAVE_CONFIGURED telemetry histograms. r?markh,loines

Attached file DataReviewREquest.md

I'm not sure who to request data-review from? Please redirect if I've chosen poorly.

Attachment #9100888 - Flags: data-review?

Ah, I notice now that I've goofed up the arc command somehow. This was supposed to be two patches...

Attachment #9100887 - Attachment description: Bug 1238810 (part 1) - Add whether or not a user uses sync/fxa to the telemetry environment. r?chutten,markh → Bug 1238810 - Replace {FXA,WEAVE}_CONFIGURED with entries in the telemetry environment. r?chutten,markh,loines
Comment on attachment 9100888 [details]
DataReviewREquest.md

I thought I did this already, but for some reason it was blank? You're listed as the data peer for firefox telemetry on https://wiki.mozilla.org/Firefox/Data_Collection, let me know if I should flag someone else, though.
Attachment #9100888 - Flags: data-review? → data-review?(chutten)
Comment on attachment 9100888 [details]
DataReviewREquest.md

DATA COLLECTION REVIEW RESPONSE:

    Is there or will there be documentation that describes the schema for the ultimate data set available publicly, complete and accurate?

Yes. This collection is part of the Telemetry Environment so is documented in [its in-tree documentation](https://firefox-source-docs.mozilla.org/toolkit/components/telemetry/telemetry/data/environment.html).

    Is there a control mechanism that allows the user to turn the data collection on and off?

Yes. This collection is Telemetry so can be controlled through Firefox's Preferences.

    If the request is for permanent data collection, is there someone who will monitor the data over time?

Yes, Leif Oines is responsible.

    Using the category system of data types on the Mozilla wiki, what collection type of data do the requested measurements fall under?

Category 2, Interaction.

    Is the data collection request for default-on or default-off?

Default on for all channels.

    Does the instrumentation include the addition of any new identifiers?

No.

    Is the data collection covered by the existing Firefox privacy notice?

Yes.

    Does there need to be a check-in in the future to determine whether to renew the data?

No. This collection is permanent.

---
Result: datareview+
Attachment #9100888 - Flags: data-review?(chutten) → data-review+
Pushed by tchiovoloni@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/3d7b77564cb7
Replace {FXA,WEAVE}_CONFIGURED with entries in the telemetry environment. r=chutten,markh,loines
Flags: needinfo?(tchiovoloni)
Pushed by tchiovoloni@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/bc70189de000
Replace {FXA,WEAVE}_CONFIGURED with entries in the telemetry environment. r=chutten,markh,loines
Status: NEW → RESOLVED
Closed: 1 year ago
Resolution: --- → FIXED
Target Milestone: --- → Firefox 72

Comment on attachment 9100887 [details]
Bug 1238810 - Replace {FXA,WEAVE}_CONFIGURED with entries in the telemetry environment. r?chutten,markh,loines

Beta/Release Uplift Approval Request

  • User impact if declined: Sync team will be unable to monitor how the sync and fxa separation will go, and thus issues users experience might not be detected.
  • Is this code covered by automated tests?: Yes
  • Has the fix been verified in Nightly?: Yes
  • Needs manual test from QE?: No
  • If yes, steps to reproduce:
  • List of other uplifts needed: None
  • Risk to taking this patch: Low
  • Why is the change risky/not risky? (and alternatives if risky): Changes limited to code running on an observer notification emitted by sync, and doesn't trigger emitting a telemetry ping when it runs.
  • String changes made/needed:
Attachment #9100887 - Flags: approval-mozilla-beta?

Marking 71 as affected since there is an uplift request.

Comment on attachment 9100887 [details]
Bug 1238810 - Replace {FXA,WEAVE}_CONFIGURED with entries in the telemetry environment. r?chutten,markh,loines

Telemetry update, covered by tests , uplift approved for 71 beta 6, thanks.

Attachment #9100887 - Flags: approval-mozilla-beta? → approval-mozilla-beta+
You need to log in before you can comment on or make changes to this bug.