Bug 1582278 Comment 3 Edit History


Here's my first cut at this. I don't expect it to be the last word. I also did it quickly, in about an hour. We should have another data scientist (or myself when I'm back from PTO) take a more comprehensive look. But, I wanted to at least get us started on this before I go out.
https://sql.telemetry.mozilla.org/queries/65789/source#167081

Method:
1. I looked at calendar-week periods for a 1% sample of clients, starting in August. I ignored data where `fxa_configured = null` (this seemed infrequent enough to ignore for now).
2. I extracted each client's last submitted value of `fxa_configured` within each week (so, when comparing WoW, the previous week's value is the most recent one submitted in that week). I did this out of convenience.
3. For the following I only counted clients when they had values set for `fxa_configured` in consecutive weeks, i.e. I did not consider a client "disconnected" if they did not submit any values for `fxa_configured` in the previous calendar week (to count as disconnected, their previous value must have been `true` AND the number of the last week in which they submitted telemetry must have been equal to the number of the current week minus 1). This constraint could easily be relaxed. As it is, I think it makes interpretation easier, but it might also bias things toward the patterns of users who are more likely to be active at least once a week. Maybe we only really care about those users though.
4. I then defined "disconnected" as clients that had `fxa_configured = true` the previous week but had `fxa_configured = false` for the current week. I divided this by the total number of clients that were connected in the previous week (`prop_of_last_week_connected_that_disconnected_this_week`) and by the total number of clients that were connected in the current week (`prop_of_this_week_connected_that_disconnected_this_week`). The latter proportion is to facilitate comparison to "new" connections, i.e. it can be thought of as the additional % of clients that we would have had active in the current week if they had not disconnected.
5. I also defined "continuing clients" as those who had `fxa_configured = true` the previous week and also had `fxa_configured = true` for the current week. I divided this by the total number of clients that were connected in the previous week (`prop_of_last_week_connected_that_are_continuing`) and by the total number of clients that were connected in the current week (`prop_of_this_week_connected_that_are_continuing`).
6. I also defined "connected or reconnected" as clients who had `fxa_configured = false` the previous week but had `fxa_configured = true` for the current week. These could be new accounts among existing Firefox users or re-connections of existing accounts. I divided this by the total number of clients that were connected in the current week (`prop_of_this_week_connected_that_not_connected_last_week`).
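
To make the definitions above concrete, here is a rough Python sketch of the same bookkeeping. It is not the linked query: the input shape (`(client_id, week, fxa_configured)` tuples in submission order) and the function names are made up for illustration, and I am assuming the denominators count every client whose last value in that week was `true`, whether or not they also reported in the adjacent week.

```python
def last_value_per_week(pings):
    """Keep each client's last non-null fxa_configured value per week.

    `pings` is an iterable of (client_id, week_number, fxa_configured)
    tuples in submission order (a hypothetical input shape).
    """
    last = {}
    for client_id, week, value in pings:
        if value is None:  # ignore fxa_configured = null
            continue
        last[(client_id, week)] = value  # later pings overwrite earlier ones
    return last


def weekly_proportions(pings, week):
    """Compute the week-over-week proportions for `week` versus `week - 1`."""
    last = last_value_per_week(pings)
    prev = {c: v for (c, w), v in last.items() if w == week - 1}
    curr = {c: v for (c, w), v in last.items() if w == week}

    # Transitions are only counted for clients with values in both weeks.
    both = prev.keys() & curr.keys()
    disconnected = sum(1 for c in both if prev[c] and not curr[c])
    continuing = sum(1 for c in both if prev[c] and curr[c])
    newly_connected = sum(1 for c in both if not prev[c] and curr[c])

    # Assumption: denominators are all clients connected in that week.
    connected_prev = sum(1 for v in prev.values() if v)
    connected_curr = sum(1 for v in curr.values() if v)

    def ratio(n, d):
        return n / d if d else None

    return {
        "prop_of_last_week_connected_that_disconnected_this_week": ratio(disconnected, connected_prev),
        "prop_of_this_week_connected_that_disconnected_this_week": ratio(disconnected, connected_curr),
        "prop_of_last_week_connected_that_are_continuing": ratio(continuing, connected_prev),
        "prop_of_this_week_connected_that_are_continuing": ratio(continuing, connected_curr),
        "prop_of_this_week_connected_that_not_connected_last_week": ratio(newly_connected, connected_curr),
    }
```

If the real query restricts the denominators to clients seen in both weeks, the "this week" proportions would come out higher than in this sketch, so that choice is worth checking against the SQL.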

What I've found so far, given above:
1. In a given week, 16-20% of clients who were connected in the previous week (`fxa_configured = true`) show up as disconnected the next week (`fxa_configured = false`).
2. However, in the long run, week-to-week "attrition" looks to be largely compensated for by connections from clients that were not connected in the previous week (but were active in Firefox). Put another way, the loss of clients due to "disconnection" seems to be (basically) offset by roughly the same number of new connections occurring per week (again, "new" here just means that they had `fxa_configured = false` in the last week).
3. In a typical week, 35-40% of users who have `fxa_configured = true` also had `true` the previous week, i.e. are "continuing" users. 

More to be done here and to check:
1. Is `fxa_configured` basically stable within clients within a week? That is, if it changes value, does it tend to stay that way ping-to-ping in the short run?
2. Maybe it would be better to use sliding windows of 7 days rather than calendar weeks.
3. The ~17% figure lines up pretty closely with [this view from the server metrics](https://analytics.amplitude.com/mozilla-corp/chart/new/246gkzd), but the continuing number from the query above seems to be low. It's not an apples-to-apples comparison at all though, as a user can be continuing at the user level (e.g. on another device) but not at the device level.
4. Assuming I've been going about this right and the 17% number is roughly in the right ballpark, the next step would be to use the client telemetry, e.g. interactions with the FxA menu(s), to give our best estimate of how many of the `true` -> `false` transitions might plausibly be user-initiated.
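
On the sliding-window idea, something like the following could be a starting point. This is only a sketch under assumptions: the `(client_id, day, fxa_configured)` input shape and the function names are hypothetical, and here the denominator is every client connected in the previous window, which is looser than the consecutive-week constraint used in the query.

```python
from datetime import date, timedelta


def last_value_in_window(pings, start, end):
    """Last non-null fxa_configured per client with start <= day <= end."""
    last = {}
    # sorted() is stable, so same-day pings keep their submission order.
    for client_id, day, value in sorted(pings, key=lambda p: p[1]):
        if value is not None and start <= day <= end:
            last[client_id] = value
    return last


def sliding_disconnect_rate(pings, end_day, window_days=7):
    """Share of clients connected in the previous window whose last value
    in the current window (ending on `end_day`) is false."""
    curr_start = end_day - timedelta(days=window_days - 1)
    prev_end = curr_start - timedelta(days=1)
    prev_start = prev_end - timedelta(days=window_days - 1)

    prev = last_value_in_window(pings, prev_start, prev_end)
    curr = last_value_in_window(pings, curr_start, end_day)

    connected_prev = [c for c, v in prev.items() if v]
    if not connected_prev:
        return None
    # Clients with no data in the current window are not counted as disconnected.
    disconnected = [c for c in connected_prev if curr.get(c) is False]
    return len(disconnected) / len(connected_prev)
```

Calling `sliding_disconnect_rate` for each successive `end_day` would give a daily series instead of one point per calendar week, which should smooth out artifacts from where the week boundaries happen to fall.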
