Closed Bug 1366253 Opened 7 years ago Closed 7 years ago

Add environment.experiments annotation to main summary

Categories

(Data Platform and Tools :: General, enhancement, P1)

x86
macOS
enhancement
Points:
2

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: wlach, Assigned: wlach)

Details

Attachments

(1 file)

56 bytes, text/x-github-pull-request
Details | Review
Followup from bug 1362520:

We should add the environment.experiments information to the main summary so people can correlate other information in the main ping with an experiment being active/inactive.

It looks like this is just an array of strings:

http://searchfox.org/mozilla-central/rev/24c443a440104dabd9447608bd41b8766e8fc2f5/toolkit/components/telemetry/schemas/core.schema.json#16

We can probably just store it as such, similar to what we do with the list of active addons?
mreid, does the above approach make sense?
Flags: needinfo?(mreid)
Let me tell you how I and others are typically going want to query it. I want to select everyone who's in a particular experiment, and then group by the experiment branch.

So in terms of presto query semantics, I'd be looking for something like this

SELECT element_at(experiments, 'flashrollout').branch AS branch
FROM main_summary
WHERE element_at(experiments, 'flashrollout') IS NOT NULL

At least if I'm reading this right as a map of structs. Poking through an array for a particular experiment is a bit harder than a map. With an array you're almost stuck exploding it or using magic like filter() which IIRC requires a presto version we don't have deployed yet.
The "core" ping is a bit of a special case. The one we want is from the "experiments" field of the "Environment" section shared by a bunch of ping types:

https://gecko.readthedocs.io/en/latest/toolkit/components/telemetry/telemetry/data/environment.html

It is very much like addons, it is a map from experiment id to some properties (of which there's only one right now, namely "branch").

For addons, we converted the map to an array of structs, rolling the "id" in as a field.

Benjamin's comments make sense to me, and I think it's likely that most analyses will be looking at a particular experiment, rather than analyzing experiments in general. Addons generally use both approaches, so it made sense to use an array (as well as a special dataset with addons exploded into rows).

I'm inclined to mirror the structure in the ping and keep this as a map of string -> struct.

Sunah, do you have an opinion on this?
Flags: needinfo?(mreid) → needinfo?(ssuh)
I agree with Mark -- structure it as a map, and then we can explode experiments out into a separate dataset partitioned by experiment ID based on this field if we find we're using it often.
Flags: needinfo?(ssuh)
Attached file PR
I have a PR in progress for this.
Points: --- → 2
Priority: -- → P1
The PR is merged and should be available in the next deploy (it functions as described in comment 3).
Status: NEW → RESOLVED
Closed: 7 years ago
Resolution: --- → FIXED
Component: Datasets: Main Summary → General
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: