Bug 1507073 Comment 0 Edit History

Brief description of the request:

We suspect that many of our new profiles may be created by labs or internet cafes that wipe and re-image their computers after every login or at the end of every day. This assumes that the image does not already have a profile created, but this is likely sometimes the case.

We'd like to identify such profiles and, more generally, develop a notion of "real profiles" and better understand the relationships between profiles, users, and computers.

Link to any assets:

Felix left a comment on one of Jesse's docs:

---
So far I've heard a lot about internet cafes and schools that look like fresh profiles every day, and I could imagine that they might dominate this number (do they?)

Should we aspire to measure the number of new ongoing users? A few ways we could do this:
- Don't count chickens until they're hatched: count "new >1 week old profiles" so that the wiped-every-day profiles never get counted
- Do something with IPs (e.g. count distinct IPs) to reduce the impact of this effect
- Use geo and machine data to predict for each new client whether they're going to be around for another week (could be as simple as binning on country and taking an average with a prior - basically we'd use the fact that internet cafes are more popular in certain places)
---
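The first and third suggestions in the comment above could be sketched roughly as follows. This is only an illustration under stated assumptions: the input is a hypothetical in-memory list of (profile_id, country, active_date) rows rather than any real telemetry table, the function names are made up for this example, and the "prior" is a simple shrinkage toward the global retention rate, which is one of several ways to read "taking an average with a prior".

```python
from collections import defaultdict
from datetime import date, timedelta


def retained_new_profiles(activity, min_age_days=7):
    """Return ids of profiles seen again at least `min_age_days` after first seen.

    `activity` is a list of (profile_id, country, active_date) rows
    (a hypothetical schema, not a real telemetry table).
    """
    first_seen, last_seen = {}, {}
    for pid, _country, day in activity:
        first_seen[pid] = min(first_seen.get(pid, day), day)
        last_seen[pid] = max(last_seen.get(pid, day), day)
    threshold = timedelta(days=min_age_days)
    return {pid for pid in first_seen if last_seen[pid] - first_seen[pid] >= threshold}


def smoothed_retention_by_country(activity, prior_strength=10.0, min_age_days=7):
    """Per-country retention rate, shrunk toward the global rate.

    `prior_strength` acts like a number of pseudo-profiles that behave like
    the global average, so countries with few profiles stay near that average.
    """
    retained = retained_new_profiles(activity, min_age_days)

    # Assign each profile to the first country it was observed in (an assumption).
    country_of = {}
    for pid, country, _day in activity:
        country_of.setdefault(pid, country)
    if not country_of:
        return {}

    totals, kept = defaultdict(int), defaultdict(int)
    for pid, country in country_of.items():
        totals[country] += 1
        kept[country] += pid in retained

    global_rate = len(retained) / len(country_of)
    return {
        country: (kept[country] + prior_strength * global_rate)
        / (totals[country] + prior_strength)
        for country in totals
    }


# Made-up example rows: profile "b" is active on a single day only.
start = date(2018, 11, 1)
example_activity = [
    ("a", "US", start), ("a", "US", start + timedelta(days=10)),
    ("b", "IN", start),
    ("c", "IN", start), ("c", "IN", start + timedelta(days=8)),
]
example_retained = retained_new_profiles(example_activity)
example_rates = smoothed_retention_by_country(example_activity)
```

Counting only the profiles in `example_retained` means the wiped-every-day profiles never enter the "new profiles" number, while the smoothed per-country rates give a rough prior for how likely a brand-new profile is to still be around a week later.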

Is there a specific data scientist you would like or someone who has helped to triage this request:

Jesse plans to work on this, or Felix may take over if he's interested.

_____ UPDATE _______
I've broadened the scope of this bug to understanding, in general, our profiles that are active for only a single day. This includes "internet cafe" profiles, organic churn after a single day, and telemetry opt-outs.

Rosanne Scholl suggests: "Some of those probably gave us their email addresses while creating an account, right? Could we email them a survey or an invite for a paid interview?" This is a very good idea.

Romain Testard also suggests looking at the proportion of active profiles that were created recently (perhaps within the last year), as this will help shed light on the issues around acquisition and retention.
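
As a rough illustration of that last suggestion, the share of currently active profiles that were created recently could be computed along these lines. The (profile_id, created_date, last_active_date) rows, the 28-day definition of "active", and the function name are all assumptions made for this sketch, not existing definitions.

```python
from datetime import date, timedelta


def share_recently_created(profiles, as_of, active_window_days=28, recent_days=365):
    """Fraction of active profiles whose creation date is within `recent_days`.

    `profiles` is an iterable of (profile_id, created_date, last_active_date)
    rows (hypothetical schema). A profile counts as active if it was last
    seen within `active_window_days` before `as_of`.
    Returns None when there are no active profiles.
    """
    active_cutoff = as_of - timedelta(days=active_window_days)
    recent_cutoff = as_of - timedelta(days=recent_days)

    active_created = [
        created
        for _pid, created, last_active in profiles
        if active_cutoff <= last_active <= as_of
    ]
    if not active_created:
        return None

    recent = sum(1 for created in active_created if created >= recent_cutoff)
    return recent / len(active_created)


# Made-up rows: "a" and "b" are active, but only "a" was created in the last year.
example_profiles = [
    ("a", date(2018, 6, 1), date(2018, 11, 29)),
    ("b", date(2016, 1, 1), date(2018, 11, 20)),
    ("c", date(2018, 1, 1), date(2018, 2, 1)),
]
example_share = share_recently_created(example_profiles, as_of=date(2018, 11, 30))
```

A low share would suggest that the active population is dominated by long-lived profiles, which points toward a retention problem for new profiles rather than an acquisition problem.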