Measure share of AVG users that moved from having passwords to not having passwords
Categories
(Data Science :: Investigation, task)
Tracking
(Not tracked)
People
(Reporter: RT, Assigned: tdsmith)
References
Details
Brief description of the request:
We hit an issue with AVG that caused users to lose access to their password DB: bug 1558765
To decide whether this should be fixed through a dot release, we need to know how many users are impacted.
Request: Count AVG (antivirus) users daily that moved from having passwords in their password DB to not having passwords at all (PWMGR_NUM_PASSWORDS_PER_HOSTNAME) in the last 2 weeks.
Link to any assets:
Incident doc: https://docs.google.com/document/d/1Q74sLPUoq30llfWNbBxamK0DUPY_xBn57jL2rSQIyzc/edit
Is there a specific data scientist you would like or someone who has helped to triage this request:
no
Comment 1•5 years ago
PWMGR_NUM_SAVED_PASSWORDS will be much easier to work with since it's a total count, not a histogram counting number of logins for a given hostname.
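For illustration only, here is a minimal sketch of how the requested daily count could be computed from a total-count probe such as PWMGR_NUM_SAVED_PASSWORDS. It is not the query used in this bug. The input shape (tuples of client ID, ISO date string, and saved-password count) and the function name `count_clients_losing_passwords` are assumptions made for the example.

```python
from collections import defaultdict


def count_clients_losing_passwords(rows):
    """Count, per day, clients whose saved-password total dropped from >0 to 0.

    `rows` is an iterable of (client_id, day, num_saved_passwords) tuples,
    where `day` is an ISO date string such as "2019-06-11" (hypothetical
    input shape). Each client is counted once, on the day of its first
    observed drop to zero.
    """
    observations = defaultdict(list)
    for client_id, day, num_saved in rows:
        observations[client_id].append((day, num_saved))

    lost_per_day = defaultdict(int)
    for obs in observations.values():
        obs.sort()  # ISO date strings sort chronologically
        # Compare each observation with the one that follows it.
        for (_, previous), (day, current) in zip(obs, obs[1:]):
            if previous > 0 and current == 0:
                lost_per_day[day] += 1
                break  # count each client at most once
    return dict(lost_per_day)
```

Clients that never reported a non-zero count are ignored, so users who simply never saved a password do not inflate the result.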
Assignee
Comment 2•5 years ago
:wbeard got us started here; thanks Chris!
Count AVG (antivirus) users daily that moved from having passwords in their password DB to not having passwords at all (PWMGR_NUM_PASSWORDS_PER_HOSTNAME) in the last 2 weeks.
About 20%, or 50k users.
Note that we can't count AVG use among Windows 7 users, who are about 40% of our userbase; if we assume they use AVG at the same rate, that implies a number closer to about 83k.
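The extrapolation above amounts to dividing the observed count by the share of the userbase that can be observed. A minimal sketch of that arithmetic follows; the function name `extrapolate_total` is hypothetical, and it relies on the same assumption stated in the comment, namely that Windows 7 users run AVG at the same rate as everyone else.

```python
def extrapolate_total(observed_users, unobserved_share):
    """Scale an observed count up to the whole population.

    Assumes the unobserved share of users (here, Windows 7) is affected
    at the same rate as the observed share.
    """
    if not 0 <= unobserved_share < 1:
        raise ValueError("unobserved_share must be in [0, 1)")
    return observed_users / (1 - unobserved_share)


# 50k observed users, with about 40% of the userbase unobservable.
print(round(extrapolate_total(50_000, 0.40)))  # 83333
```

The result is only as good as the equal-rate assumption; if AVG usage differs on Windows 7, the true figure could be higher or lower.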
Notebook and summary: https://dbc-caf9527b-e073.cloud.databricks.com/#notebook/132863/dashboard/132972/present
Assignee
Comment 3•5 years ago
Just a note that re-running that query today, it looks like it's closer to 206k users (124k observed + unobserved Windows 7 users), which we can probably expect to be the final number since aiui active AVG users should have received a definition update by now (albeit with unknown-to-us penetration).
My query used main_summary as a data source, which is updated on a daily cadence early in the morning GMT based on the pings received the previous day. The numbers on the 14th were current as of midnight GMT the morning of Jun 14; it appears that we heard from additional users after that point.
I don't think that difference is large enough that it would have changed our response, though it might have been useful to do additional work to either a) rely on main pings using the Dataset API for low-latency access instead of waiting for main_summary to update the next day, or b) do additional forecasting work instead of just reporting the number of users affected to date.