Closed Bug 1520741 Opened 6 years ago Closed 6 years ago

Add the "app_channel" field to glean pings

Categories

(Toolkit :: Telemetry, enhancement, P1)

enhancement

Tracking

()

RESOLVED FIXED
Tracking Status
firefox66 --- affected

People

(Reporter: Dexter, Assigned: Dexter)

References

Details

(Whiteboard: [telemetry:mobilesdk:m7])

Attachments

(3 files)

As stated in bug 1508305, some applications might need to report the "release channel" they are on (e.g. "beta"). However, in our mobile ecosystem, there's no unified way to report this channel and not every application is even reporting it.

We should make it possible to report the release channel for apps that need it, without adding a useless new field for apps that don't. This might as well rely on bug 1520740 to happen.

Blocks: 1491345
Depends on: 1520740
Priority: -- → P3
Whiteboard: [telemetry:mobilesdk:m?]
Depends on: 1510547
Whiteboard: [telemetry:mobilesdk:m?] → [telemetry:mobilesdk:backlog]

There seems to be consensus around the idea that this should be application defined/specific. We have two possible ways forward for this.

*** Proposal 1 - Allow extending the "baseline" ping ***
This proposal doesn't really require us to do much. Applications using glean can simply define their product specific metrics (e.g. "release_channel") and declare them to live in all pings (i.e. send_in_pings: default or send_in_pings: baseline in the metrics.yaml file). No schema changes are required, as such metrics, being normal metrics, would be product specific and live in the ping payload.

*** Proposal 2 - Introduce a ping tagging mechanism ***
This would expose a new API in glean, e.g. addPingTag(tagName, tagValue)/removePingTag(tagName), that would allow adding length limited strings to the "ping_info" or "client_info" section (see this bug) of all the outgoing pings. It might even be worth introducing a separate top level "tags" object for the ping, without cluttering the other ones.
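
To make the tagging idea concrete, here is a minimal Python sketch of what such an API could look like. It is not Glean code: the `PingTags` class, the snake_case method names, the 50-character limit and the top-level `tags` key are all assumptions made for illustration. The proposal itself only names addPingTag/removePingTag and "length limited strings".

```python
from typing import Dict

# Hypothetical limit; the proposal only says tags are "length limited".
MAX_TAG_LENGTH = 50


class PingTags:
    """Illustrative store for tags attached to every outgoing ping."""

    def __init__(self) -> None:
        self._tags: Dict[str, str] = {}

    def add_ping_tag(self, name: str, value: str) -> None:
        # Truncate rather than reject, so an over-long value never blocks a ping.
        self._tags[name[:MAX_TAG_LENGTH]] = value[:MAX_TAG_LENGTH]

    def remove_ping_tag(self, name: str) -> None:
        # Removing an unknown tag is a no-op.
        self._tags.pop(name[:MAX_TAG_LENGTH], None)

    def apply_to(self, ping: dict) -> dict:
        """Return a shallow copy of the ping with a top-level "tags" object, if any tags are set."""
        tagged = dict(ping)
        if self._tags:
            tagged["tags"] = dict(self._tags)
        return tagged
```

Note that every tag name and type would be chosen at runtime by the application, so nothing in this sketch tells the pipeline in advance which tags exist.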

In case we go for the second, a lightweight proposal document will be written to iron out the details.

@all - what's your take on this?

Flags: needinfo?(tlong)
Flags: needinfo?(mdroettboom)
Flags: needinfo?(gfritzsche)
Flags: needinfo?(fbertsch)

For pipeline/schema work, Proposal 2 is going to be much more difficult. We're going to have to know which tags are added, what their types are, scrape that into the probe-info-service, and load that into the final schemas.

Proposal 1 is simple and fits in with the current architecture. It has my vote, with send_in_pings: default (or perhaps send_in_pings: all) to send the metric in all of the outgoing pings.

Flags: needinfo?(fbertsch)

(1) sounds reasonable, with send_in_pings:all (we'd probably want it in all pings anyway?).
Can you expand how this would be used with Android applications currently and how that would end up being used in datasets?
I may have had the context, but I'm a bit vague on details now.

Do we expect different application ids per channel for our products? Or do they use the same application id across channels?
Do we expect the applications to recognize which channel they are on?
Do we expect the applications to record the channel then into release_channel themselves?

Does this then end up, pipeline side producing different datasets? Or one dataset and users should filter it down by release_channel?

Flags: needinfo?(gfritzsche)

(In reply to Georg Fritzsche [:gfritzsche] from comment #3)

(1) sounds reasonable, with send_in_pings:all (we'd probably want it in all pings anyway?).

The send_in_pings:all proposal sounds neat. My only concern with the introduction of this new reserved tool is its potential misuse: I fear PII being mistakenly sent with all pings. If we really want to introduce this (it might be as a follow-up, non-blocking bug), we'd need to at least flag it with data-stewards. This looks like another big thing to watch out for in addition to expiration: never.

Can you expand how this would be used with Android applications currently and how that would end up being used in datasets?
I may have had the context, but I'm a bit vague on details now.

Applications would need to declare this as a custom metric in their metrics.yaml, probably with Lifetime: application, and then set it at startup. This would end up in the "metrics": { "strings": { "app_channel": "beta" }} section of the payloads.

To make things more consistent, we could declare the metric within the glean sdk (i.e. release_channel) and let users set its value, if/when needed. We don't send empty metrics anyway.
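
As a rough illustration of the payload shape described above, the Python sketch below assembles the `metrics` section from whatever string metrics the application has set. The `build_metrics_section` helper is hypothetical and only mirrors the behaviour stated in this comment: a metric that was set shows up under `metrics.strings`, and one that was never set is simply omitted.

```python
import json
from typing import Dict, Optional


def build_metrics_section(string_metrics: Dict[str, Optional[str]]) -> dict:
    """Build the "metrics" part of a ping payload (hypothetical helper).

    Metrics that were never set (None) are dropped, since empty metrics
    are not sent.
    """
    strings = {name: value for name, value in string_metrics.items() if value is not None}
    # Omit the "strings" group entirely when nothing was recorded.
    return {"strings": strings} if strings else {}


# An app that knows its channel sets the metric at startup...
with_channel = {"metrics": build_metrics_section({"app_channel": "beta"})}
# ...while an app that does not know it never sets the metric.
without_channel = {"metrics": build_metrics_section({"app_channel": None})}

print(json.dumps(with_channel))
print(json.dumps(without_channel))
```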

I think we automatically generate columns for tables with stuff in metrics for all the pings.

@Frank, is the above statement correct?

Do we expect different application ids per channel for our products? Or do they use the same application id across channels?

I'm afraid this depends on the app. For example, Firefox beta has a different id than normal Firefox, in the Play Store (it's org.mozilla.firefox_beta vs org.mozilla.firefox). Focus, on the other hand, would have the same id for both channels (see https://github.com/mozilla-mobile/focus-android/wiki/Release-tracks).

@Sebastian, is the above statement correct?

Do we expect the applications to recognize which channel they are on?

Some, I guess. I don't think we have a way to tell for all of them, as far as I understood.

Do we expect the applications to record the channel then into release_channel themselves?

Yes, as the concept of channel is not the same across products.

Does this then end up, pipeline side producing different datasets? Or one dataset and users should filter it down by release_channel?

I suspect this would end up in the same dataset, with users having to manually filter by release_channel.

This is another @Frank question though :)

Flags: needinfo?(s.kaspari)
Flags: needinfo?(fbertsch)

Does comment 4 answer your questions?

Flags: needinfo?(gfritzsche)
Whiteboard: [telemetry:mobilesdk:backlog] → [telemetry:mobilesdk:m7]
Summary: Add the release "channels" field to the baseline ping → Add the "release_channel" field to glean pings

(In reply to Alessio Placitelli [:Dexter] from comment #4)

I think we automatically generate columns for tables with stuff in metrics for all the pings.

@Frank, is the above statement correct?

Right, this is correct.

I suspect this would end up in the same dataset, with users having to manually filter by release_channel.

This is another @Frank question though :)

That's what I imagined the use-case being as well. If someone wants a channel-specific dataset we can always create views.
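
A small Python sketch of that usage pattern, using made-up rows: everything lands in one dataset, and a channel-specific "view" is just a filter on the channel column. The row layout, the `release_channel` column name and the `channel_view` helper are assumptions for illustration, not the actual pipeline schema.

```python
from typing import Dict, Iterable, Iterator, Optional

Row = Dict[str, Optional[str]]

# Hypothetical rows from a single per-application dataset.
ROWS = [
    {"client_id": "a", "release_channel": "beta"},
    {"client_id": "b", "release_channel": "release"},
    {"client_id": "c", "release_channel": None},  # app did not report a channel
]


def channel_view(rows: Iterable[Row], channel: str) -> Iterator[Row]:
    """Yield only the rows reported from the given channel."""
    return (row for row in rows if row.get("release_channel") == channel)


beta_rows = list(channel_view(ROWS, "beta"))
print(beta_rows)
```

Rows from applications that never report a channel carry no value in that column, so they fall out of every channel-specific view.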

Flags: needinfo?(fbertsch)

(In reply to Alessio Placitelli [:Dexter] from comment #4)

Can you expand how this would be used with Android applications currently and how that would end up being used in datasets?
I may have had the context, but I'm a bit vague on details now.

Applications would need to declare this as a custom metric in their metrics.yaml, probably with Lifetime: application, and then set it at startup. This would end up in the "metrics": { "strings": { "app_channel": "beta" }} section of the payloads.

To make things more consistent, we could declare the metric within the glean sdk (i.e. release_channel) and let users set its value, if/when needed. We don't send empty metrics anyway.

Ok, maybe this is a reasonable option currently instead of send_in_pings:all?
That avoids a few concerns right now and could be revisited in Q2 if needed?

Flags: needinfo?(gfritzsche)

(In reply to Georg Fritzsche [:gfritzsche] from comment #8)

To make things more consistent, we could declare the metric within the glean sdk (i.e. release_channel) and let users set its value, if/when needed. We don't send empty metrics anyway.

Ok, maybe this is a reasonable option currently instead of send_in_pings:all?
That avoids a few concerns right now and could be revisited in Q2 if needed?

This sounds good to me.

Because of reducing the namespace visibility, should this be a field in the configuration object that gets passed in on Glean.initialize()?

Flags: needinfo?(tlong)

(In reply to Alessio Placitelli [:Dexter] from comment #4)

Do we expect different application ids per channel for our products? Or do they use the same application id across channels?

I'm afraid this depends on the app. For example, Firefox beta has a different id than normal Firefox, in the Play Store (it's org.mozilla.firefox_beta vs org.mozilla.firefox). Focus, on the other hand, would have the same id for both channels (see https://github.com/mozilla-mobile/focus-android/wiki/Release-tracks).

@Sebastian, is the above statement correct?

Yes, that's correct. And it's hard to predict what we are going to use for future apps. Both mechanisms have advantages and disadvantages.

One important thing is that builds that use the Google Play track system (with the same application id) can be promoted from track to track (e.g. you release something to the alpha track, if it is good you promote it to beta, and if there are no issues you promote it to release). Therefore those builds are usually just regular release builds, which means the build is not aware of what channel it is for (potentially all) and therefore can't tell Glean either.

Flags: needinfo?(s.kaspari)

(In reply to Travis Long from comment #10)

Because of reducing the namespace visibility, should this be a field in the configuration object that gets passed in on Glean.initialize()?

We'd rather not clutter any of these, more than they are already :(

Assignee: nobody → alessio.placitelli
Priority: P3 → P1

(In reply to Alessio Placitelli [:Dexter] from comment #12)

(In reply to Travis Long from comment #10)

Because of reducing the namespace visibility, should this be a field in the configuration object that gets passed in on Glean.initialize()?

We'd rather not clutter any of these, more than they are already :(

Actually, this might not be a bad idea at all: we have startup pings that will need this, and the only way for them to know the channel soon enough would be to have this bit of information as part of the configuration/initialize. Otherwise, since our outer facing API is async, we risk not having the channel in time.
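
The timing argument can be sketched in Python as follows. This is an illustration only, not the Glean SDK: `Configuration`, `FakeGlean` and the field names are assumptions. The point is that a value passed synchronously at initialization is available to the very first ping, whereas a value set later through an asynchronous metric API may not have been recorded yet.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Configuration:
    """Hypothetical configuration object passed at initialization."""
    channel: Optional[str] = None  # None means the app does not know its channel


class FakeGlean:
    """Stand-in for an SDK entry point; not the real Glean API."""

    def __init__(self) -> None:
        self._client_info: dict = {}

    def initialize(self, config: Configuration) -> None:
        # The channel is stored synchronously, before any ping can be assembled.
        if config.channel is not None:
            self._client_info["app_channel"] = config.channel

    def assemble_ping(self) -> dict:
        # Startup pings read client_info directly; no async work is involved.
        return {"client_info": dict(self._client_info)}


glean = FakeGlean()
glean.initialize(Configuration(channel="beta"))
startup_ping = glean.assemble_ping()
print(startup_ping)
```

An application that does not know its channel would pass the default configuration, and the field would be left out of `client_info`.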

Attached file Implementation PR
Attached file Schema PR
Flags: needinfo?(mdroettboom)
Attached file glean-app-channel.md

data-review? for adding an app_channel field in the client_info section of all glean pings, only if the application provides glean with this data.

Attachment #9053610 - Flags: data-review?(liuche)
Summary: Add the "release_channel" field to glean pings → Add the "app_channel" field to glean pings

I didn't see comment #11 addressed directly, but to be explicit, if an app is released through the Google Play Store, the release/beta channel field would probably only be used by apps that have separate app ids for their beta/release channels, e.g. Firefox Nightly vs Firefox, because the apk wouldn't be changing when promoted from alpha -> beta -> release via Play Store. fwiw, on Firefox TV we also do not have a "beta" release channel, and only have release/debug build flavors, which do not contain a representation of a "beta" channel.

Just a heads up on what Android apps would use this field.

Comment on attachment 9053610 [details] glean-app-channel.md

1) Is there or will there be documentation that describes the schema for the ultimate data set available publicly, complete and accurate?
Yes, in metrics.yaml

2) Is there a control mechanism that allows the user to turn the data collection on and off?
Consumers of the Glean SDK must provide a data toggle

3) If the request is for permanent data collection, is there someone who will monitor the data over time?
Yes, telemetry clients team

4) Using the [category system of data types](https://wiki.mozilla.org/Firefox/Data_Collection) on the Mozilla wiki, what collection type of data do the requested measurements fall under?
Type 1

5) Is the data collection request for default-on or default-off?
default on

6) Does the instrumentation include the addition of any new identifiers (whether anonymous or otherwise; e.g., username, random IDs, etc. See the appendix for more details)?
No

7) Is the data collection covered by the existing Firefox privacy notice?
Yes

8) Does there need to be a check-in in the future to determine whether to renew the data? (Yes/No) (If yes, set a todo reminder or file a bug if appropriate)
baseline telemetry data

9) Does the data collection use a third-party collection tool?
No
Attachment #9053610 - Flags: data-review?(liuche) → data-review+

(In reply to Chenxia Liu [:liuche] from comment #17)

I didn't see comment #11 addressed directly, but to be explicit, if an app is released through the Google Play Store, the release/beta channel field would probably only be used by apps that have separate app ids for their beta/release channels, e.g. Firefox Nightly vs Firefox, because the apk wouldn't be changing when promoted from alpha -> beta -> release via Play Store. fwiw, on Firefox TV we also do not have a "beta" release channel, and only have release/debug build flavors, which do not contain a representation of a "beta" channel.

Just a heads up on what Android apps would use this field.

Thanks Chenxia! We are aware of this: the field is explicitly meant to support applications that do know about the channel they are on.

This was reviewed and PRs were merged.

Status: NEW → RESOLVED
Closed: 6 years ago
Resolution: --- → FIXED