Closed Bug 1077654 Opened 10 years ago Closed 6 years ago

FHR should send the stableId as soon as possible after a new profile is created

Categories

(Android Background Services Graveyard :: Firefox Health Report Service, defect)

All
Android
defect
Not set
normal

Tracking

(fennec+)

RESOLVED INVALID
Tracking Status
fennec + ---

People

(Reporter: mfinkle, Unassigned)

Does this depend on bug 981698 for moving to stableIds?

Objective:
We need a record created in FHR and sent to the FHR servers ASAP so we can track the metrics for retention and attrition. There is too much of a delay right now.

Background (From an email):
I certainly don't have anything approaching definitive answers. What I can say is that the unit of analysis has so far been the profile; on desktop, each profile on a machine has its own FHR record, so at least on desktop we are bound by technical necessity to state things in terms of profiles and hope that in most cases there are nice 1-1 mappings between profiles, machines and users. This probably approaches a first-order approximation of reality, and in any case we have no way of knowing the extent to which this approximation fails. If anything, my assumption (based completely on my hunches about the world) is that this approximate alignment of users/profiles/machines would hold even more strongly for Fennec than Desktop. But I really don't know how profiles work on Fennec.

I believe that we have never attempted to go beyond collecting data in terms of FF profiles for a few reasons:
(1) Trying to focus on users is kind of a lost cause-- we really have no good way of detecting when there are multiple people sharing the same FF instance/profile.
(2) The original concept behind FHR is that it is meant to diagnose the health of a FF instance; each profile, which has its own addons, cache, history DB, etc, will have its own 'health' that is determined by the state of the profile. Since the focus was on health, efforts to reach beyond the profile and uniquely identify the underlying machine or user were not considered necessary, and indeed thought of as overreaching in terms of user privacy.

So, unless things have changed in the two years since the decision to implement FHR was made (and they really might have, a lot has changed WRT Mozilla's attitude towards data), I think that for the sake of consistency between Desktop and Fennec it makes the most sense to do this on a per-profile basis. Also, it's *essential* that the stableId that is sent with the first run ping be *the same* stableId that is sent with all subsequent FHR packets. So having one first-run ping per device will muddy that considerably if there is more than one Gecko profile on a device.

In the case of Fennec, that would mean sending a first-run ping per Gecko profile, including a new one any time something happens that blows up an older profile. We could revisit the discussion of profiles/devices/users, but I think that would be the approach that is most consistent with the existing precedents.
I'm gonna go wild and assume this is tracking 35.

I think the approach here is:

* Immediately† instantiate the upload service.
* Immediately compute a stable ID.
* Immediately attempt to upload an empty‡ payload.
* If that upload fails, allow for normal failure handling to occur -- we'll try again soon, and will probably upload a full document.
* If the user opts out via the data reporting notification, we won't retry.
* Question: if a user opts out, do we delete the uploaded document? If so, under what conditions?


† "Immediate" is a balance of startup perf and need for metrics. We should probably schedule an alarm as soon as we instantiate BrowserHealthRecorder, because that's at a happy place in startup, and actually run the ping ~30 seconds after start.

‡ "Empty" is a good question. Is this truly empty? Does it include the profile creation timestamp, or other environment details like CPU/Android version/etc., and is only empty of event data?
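
A rough sketch of the steps listed above, written in Python for brevity rather than in Fennec's Java. Every name in it (`Profile`, `get_or_create_stable_id`, `attempt_first_run_ping`, `schedule_first_run_ping`, the `upload` and `schedule_after` callbacks) is hypothetical and not taken from the Fennec codebase. It also assumes the "empty" payload carries only the stable ID, which is still an open question here.

```python
import uuid
from dataclasses import dataclass
from typing import Callable, Optional

# Hypothetical constant: the "~30 seconds after start" delay suggested above.
FIRST_PING_DELAY_SECONDS = 30


@dataclass
class Profile:
    """Stand-in for the FHR state persisted with one Gecko profile."""
    stable_id: Optional[str] = None
    opted_out: bool = False
    first_ping_sent: bool = False


def get_or_create_stable_id(profile: Profile) -> str:
    # Compute the ID once and keep it, so the first-run ping and every later
    # upload from this profile carry the same value.
    if profile.stable_id is None:
        profile.stable_id = str(uuid.uuid4())
    return profile.stable_id


def attempt_first_run_ping(profile: Profile, upload: Callable[[dict], bool]) -> str:
    """Try the first-run upload once and report what happened."""
    if profile.opted_out:
        return "skipped"  # the user opted out, so we never retry
    if profile.first_ping_sent:
        return "already-sent"
    # Assumption: "empty" means the stable ID and nothing else.
    payload = {"stableId": get_or_create_stable_id(profile)}
    if upload(payload):
        profile.first_ping_sent = True
        return "sent"
    return "retry-later"  # leave it to the normal failure handling


def schedule_first_run_ping(schedule_after, profile, upload) -> None:
    # schedule_after(delay_seconds, fn) is a hypothetical alarm hook that
    # would be called when the recorder is instantiated.
    schedule_after(FIRST_PING_DELAY_SECONDS,
                   lambda: attempt_first_run_ping(profile, upload))
```

The point of routing everything through `get_or_create_stable_id` is that a failed first attempt and the later retry still report the same ID.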
Status: NEW → ASSIGNED
tracking-fennec: --- → 35+
Depends on: 981698
OS: Linux → Android
Hardware: x86_64 → All
+cc bsmedberg

Just to give a little more background on the 'first run ping'.

The original idea for this (which is what I was referencing in that email) was that this would be part of the FHR+telemetry reboot:
https://docs.google.com/a/mozilla.com/document/d/1IGpzsYGi_sq3YFQDAPyKOkU_BKvXAC95fZYA2i4ceVs/edit#

In that case, the activation ping was envisioned as one of the "special pings" that a client would submit. The activation ping *must not* be opt-out-able, and it must be sent ASAP after activation, which would mean sending it *before* the "data choices" hanger.

To eliminate privacy concerns, it's ok if that means that it contains only super minimal information-- it must contain the clientId, but it need not have anything beyond that. IMO, the sweet spot for privacy and utility is that the activation ping would have the clientId and minimal environment info equivalent to what is found in the blocklist ping (product, os, version, ... there are a couple others). But I would say no to more highly identifying info like profile creation timestamp, CPU, etc-- we'll get those if they make it past "data choices", but we don't need them for an MVP that will help us learn useful stuff about day one retention.

Ideally, under the plan for FHR reboot, if a profile opts out of FHR we'd get another minimal opt-out ping, so that we can disentangle the rates of true attrition from the rate of FHR opt-out, and so that we can estimate the population-wide rate of opt-out and adjust our numbers accordingly. I'm not sure if it's possible to approach these issues here.
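
To make the "disentangle" point concrete, here is a small worked sketch in Python. The function name and the numbers are made up for illustration, and it rests on two assumptions that would need checking: every opt-out is recorded by an opt-out ping, and profiles that opt out are retained at the same rate as profiles that stay opted in.

```python
def retention_estimates(activations: int, opt_outs: int, still_reporting: int) -> dict:
    """Compare naive retention with retention adjusted for FHR opt-out.

    activations:     profiles that sent an activation ping
    opt_outs:        of those, profiles that sent an opt-out ping
    still_reporting: opted-in profiles still sending FHR data later on
    """
    opted_in = activations - opt_outs
    if activations <= 0 or opt_outs < 0 or opted_in <= 0:
        raise ValueError("need at least one activated, opted-in profile")
    if not 0 <= still_reporting <= opted_in:
        raise ValueError("still_reporting must be between 0 and the opted-in count")
    return {
        # Treats every silent profile as lost, including opted-out ones.
        "naive_retention": still_reporting / activations,
        # Population-wide opt-out rate.
        "opt_out_rate": opt_outs / activations,
        # Retention among profiles we can actually observe.
        "adjusted_retention": still_reporting / opted_in,
    }


# Made-up numbers: 1000 activations, 200 opt-outs, 400 profiles still reporting.
print(retention_estimates(1000, 200, 400))
```

With those made-up inputs the naive figure is 40% while the adjusted figure is 50%. Without opt-out pings, that gap would be misread as attrition.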
Since Android and Desktop have completely separate implementations, they would require separate bugs if we were to do this in both places.

I am not convinced that this is something we should do, for either product. There is ~0 value to the user in sending this ping. At first glance there seems to be very little value to Mozilla either: knowing that we had a new user without knowing anything else about them doesn't seem like it would drive any kind of useful decision-making.

In any case, if we did this and the user subsequently opts out of FHR, we should delete the original ping on the server. That's the promise we make with FHR data: that we will delete it if you opt out.
> I am not convinced that this is something we should do, for either product.
> There is ~0 value to the user in sending this ping. At first glance there
> seems to be very little value to Mozilla either: knowing that we had a new
> user without knowing anything else about them doesn't seem like it would
> drive any kind of useful decision-making.

The Fennec team is interested in first day retention. For smartphone apps, which have a low barrier to entry and exit, the initial impression and engagement are critical, and knowing how many users are being retained through the first run experience is pretty useful and may indeed drive decision-making about what kind of efforts should be spent there. (It seems like we'd want to know about this part of our acquisition funnel on desktop too...)

> In any case, if we did this and the user subsequently opts out of FHR, we
> should delete the original ping on the server. That's the promise we make
> with FHR data that we will delete it if you opt out.

Maybe so; in that case, we need to revisit that promise. I believe it does no harm to users to capture existence and opt-out pings, and that the promise should be that if they opt out we get to keep those bits of data, but that we'll delete all other pings we've received as well as deleting the stableId from their system, so that the existence and opt-out pings can never be linked back to them. Ideally, we'd keep the meager system info that we get from virtually all clients in the Blocklist Ping, but if that is too much, we should indeed at least get
    {clientId: X, activationDate:Y}
and
    {clientId: X, fhrOptOutDate:Z}
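
For illustration, a minimal Python sketch of those two payloads and of the proposed server-side rule (keep the existence and opt-out pings, delete everything else for that client). The function names, the in-memory list standing in for server storage, and the ISO date format are all assumptions, not an existing FHR API.

```python
from datetime import date

# The only field sets that would survive an opt-out under this proposal.
MINIMAL_PING_SHAPES = (
    {"clientId", "activationDate"},
    {"clientId", "fhrOptOutDate"},
)


def activation_ping(client_id: str, activation_date: date) -> dict:
    return {"clientId": client_id, "activationDate": activation_date.isoformat()}


def opt_out_ping(client_id: str, opt_out_date: date) -> dict:
    return {"clientId": client_id, "fhrOptOutDate": opt_out_date.isoformat()}


def apply_opt_out(stored_pings: list, client_id: str) -> list:
    """Return the pings left after client_id opts out.

    Pings from other clients are untouched. For the opted-out client, only
    pings whose fields exactly match a minimal shape are kept.
    """
    def is_minimal(ping: dict) -> bool:
        return set(ping) in MINIMAL_PING_SHAPES

    return [p for p in stored_pings
            if p["clientId"] != client_id or is_minimal(p)]
```

Matching on the exact field set, rather than on the presence of a date field, means a fuller document that happens to carry one of those dates would still be deleted.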
(In reply to brendan c from comment #2)

> and it must be sent ASAP after activation, which would mean
> sending it *before* the "data choices" hanger.

The data choices hanger is shown immediately on startup on Fennec, as a system notification.

It's impossible to run any code sooner than that, so the first attempt will happen at about the same time.


> But I would say no to more highly
> identifying info like profile creation timestamp, CPU, etc

OK.


> Ideally, under the plan for FHR reboot, if a profile opts out of FHR we'd
> get another minimal opt-out ping, so that we can disentangle the rates of
> true attrition from the rate of FHR opt-out, and so that we can estimate the
> population-wide rate of opt-out and adjust our numbers accordingly. I'm not
> sure if it's possible to approach these issues here.

Please file a separate bug for that!
(In reply to Richard Newman [:rnewman] from comment #5)
> (In reply to brendan c from comment #2)
> 
> > and it must be sent ASAP after activation, which would mean
> > sending it *before* the "data choices" hanger.
> 
> The data choices hanger is shown immediately on startup on Fennec, as a
> system notification.
> 
> It's impossible to run any code sooner than that, so the first attempt will
> happen at about the same time.

Ah ok. I think on desktop the data choices hanger doesn't display until after a 24hr delay or something. Didn't know it was different on Fennec. Sounds good.


> > Ideally, under the plan for FHR reboot, if a profile opts out of FHR we'd
> > get another minimal opt-out ping, so that we can disentangle the rates of
> > true attrition from the rate of FHR opt-out, and so that we can estimate the
> > population-wide rate of opt-out and adjust our numbers accordingly. I'm not
> > sure if it's possible to approach these issues here.
> 
> Please file a separate bug for that!

I think this can wait for the broader discussion of FHR v4 reboot... particularly since, as Benjamin notes, it may mean revisiting the promises we make about FHR data.
I will probably not get to this before 35 is in Beta. Too many stability bugs on my list.
Assignee: rnewman → nobody
Status: ASSIGNED → NEW
tracking-fennec: 35+ → +
We don't do FHR anymore.
Status: NEW → RESOLVED
Closed: 6 years ago
Resolution: --- → INVALID