Closed
Bug 1402492
Opened 7 years ago
Closed 7 years ago
Validate experiments daily aggregation logic
Categories
(Data Platform and Tools :: General, enhancement)
Tracking
(Not tracked)
RESOLVED
WONTFIX
People
(Reporter: spenrose, Unassigned)
Attachments
(1 file)
9.02 KB,
text/plain
The attached notebook identifies a client who submitted three pings with subsession_start_dates of 8-21-2017 and a main_summary.experiments value of
{u'clicktoplay-rollout': u'test',
u'e10sCohort': u'multiBucket4',
u'pref-flip-searchcomp1-pref1-1390584': u'treatment',
u'pref-flip-searchcomp1-pref2-1390584': u'control-ten',
u'pref-flip-searchcomp1-pref3-1390584': u'gen1ser3gen5'}
The subsession_length values, in hours, were [4.106388888888889, 0.9661111111111111, 0.060833333333333336], which sum to 5.133333333333334, but the corresponding row in experiments-daily has a subsession_hours_sum of 13.467777. So that's ... a problem.
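The arithmetic can be re-checked with a few lines of Python. This is only a sketch using the numbers quoted in this comment; the variable names are illustrative and not taken from the attached notebook.

```python
import math

# Per-ping subsession lengths for the client, in hours (values quoted above).
subsession_hours = [4.106388888888889, 0.9661111111111111, 0.060833333333333336]

# subsession_hours_sum reported by the experiments-daily row (value quoted above).
reported_sum = 13.467777

# Sum expected from main_summary, and how far the dataset value is from it.
expected_sum = math.fsum(subsession_hours)
ratio = reported_sum / expected_sum

print(f"expected={expected_sum:.6f} reported={reported_sum:.6f} ratio={ratio:.3f}")
```

For this particular row the ratio comes out at roughly 2.6.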
Comment 1•7 years ago
I've dug into this a bit more and I have some additional light to shed.
The v1 dataset (s3://net-mozaws-prod-us-west-2-pipeline-analysis/spenrose/experiments-daily/bug1390584/v1/) contains 1 row per client/date for profiles enrolled in all 3 pref-flip search experiments listed in comment 0, covering only the active days when profiles were enrolled. It spans a 3-week period.
- As described in comment 0, this dataset exhibits inconsistencies in aggregated activity measures relative to main_summary. I compared it against an ad hoc client/date aggregation of main_summary for the corresponding profiles, looking at subsession_hours_sum, active_hours_sum, search_count_all_sum, and scalar_parent_browser_engagement_total_uri_count_sum.
- Most client/days were aggregated over the same number of pings, but had different values. Many experiments-daily activity values were specifically 3x higher.
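The comparison described above can be sketched roughly as follows. This is a minimal illustration in plain Python on made-up rows, not the code from the notebook: the row layout, the `MEASURES` subset, and the helper names `aggregate_pings` and `ratios` are all assumptions.

```python
from collections import defaultdict

# Illustrative subset of measures; the real comparison covered four of them.
MEASURES = ("subsession_hours", "active_hours")


def aggregate_pings(pings):
    """Sum each measure per (client_id, date) and count the contributing pings."""
    totals = defaultdict(lambda: {"n_pings": 0, **{m: 0.0 for m in MEASURES}})
    for ping in pings:
        row = totals[(ping["client_id"], ping["date"])]
        row["n_pings"] += 1
        for measure in MEASURES:
            row[measure] += ping[measure]
    return dict(totals)


def ratios(adhoc_row, daily_row):
    """Dataset value divided by ad hoc value, per measure (None if ad hoc is 0)."""
    return {
        m: (daily_row[m + "_sum"] / adhoc_row[m] if adhoc_row[m] else None)
        for m in MEASURES
    }


# Made-up pings standing in for main_summary rows.
pings = [
    {"client_id": "c1", "date": "2017-08-21", "subsession_hours": 1.0, "active_hours": 0.5},
    {"client_id": "c1", "date": "2017-08-21", "subsession_hours": 0.5, "active_hours": 0.25},
]
# Made-up experiments-daily row for the same client/date.
daily = {"n_pings": 2, "subsession_hours_sum": 4.5, "active_hours_sum": 2.25}

adhoc = aggregate_pings(pings)[("c1", "2017-08-21")]
same_ping_count = adhoc["n_pings"] == daily["n_pings"]
measure_ratios = ratios(adhoc, daily)
print(same_ping_count, measure_ratios)
```

Checking the ping count separately from the values is what distinguishes "different pings were aggregated" from "the same pings were aggregated differently".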
The v2 dataset (s3://net-mozaws-prod-us-west-2-pipeline-analysis/spenrose/experiments-daily/bug1390584/v2/) contains the same rows as v1, and additionally includes prior data for each of these profiles, spanning up to a month before each profile entered the experiment.
- However, for each v1 row included in the v2 dataset, there is also a second v2 row for the same client and date for which the branch identifier is null.
- This null-branch row (compared on the same measures listed above) almost always matches main_summary (aside from cases where the later run would have pulled more late-submitted pings into the aggregation).
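One way to pick out the paired rows and pull the null-branch ones is sketched below. It assumes each row carries `client_id`, `date`, and a `branch` field that is `None` for the null-branch rows; that layout and the helper name `split_by_branch` are hypothetical, and the rows are made up.

```python
def split_by_branch(rows):
    """Index rows by (client_id, date), separately for with-branch and null-branch rows."""
    with_branch, null_branch = {}, {}
    for row in rows:
        key = (row["client_id"], row["date"])
        target = null_branch if row["branch"] is None else with_branch
        target[key] = row
    return with_branch, null_branch


# Made-up v2-style rows: one pre-enrollment day, one enrolled day with both row types.
rows = [
    {"client_id": "c1", "date": "2017-08-01", "branch": None, "subsession_hours_sum": 2.0},
    {"client_id": "c1", "date": "2017-08-21", "branch": None, "subsession_hours_sum": 1.5},
    {"client_id": "c1", "date": "2017-08-21", "branch": "treatment", "subsession_hours_sum": 4.5},
]

with_branch, null_branch = split_by_branch(rows)

# Client/dates that have both a with-branch row and a null-branch row.
paired_keys = sorted(with_branch.keys() & null_branch.keys())

# Null-branch rows only, ordered by client and date.
good_rows = [null_branch[key] for key in sorted(null_branch)]
print(paired_keys, len(good_rows))
```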
My conclusions based on this investigation are:
- the v2 null-branch rows contain the good data (both prior to and during the experiment). My plan is to extract these and use them for the analysis.
- something funky happened during aggregation for the with-branch rows. Given the 3x inflation factor, my guess is that this is related to the fact that these profiles were all enrolled in 3 experiments. AIUI, for each client/date, 3 rows (which should all contain identical data) are selected from base experiments-daily, which has 1 row per client/date/experiment, and 1 of those rows is retained.
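To illustrate how the guess above would produce a factor of 3, here is a toy sketch. It is not the actual aggregation job: the rows are made up, and `retain_one` and `sum_all` are hypothetical helpers contrasting the intended behaviour (keep 1 of the 3 identical rows) with a mistaken one (sum across all 3).

```python
from collections import defaultdict

# Made-up base rows: 1 row per client/date/experiment, identical activity values.
base_rows = [
    {"client_id": "c1", "date": "2017-08-21",
     "experiment": "pref-flip-searchcomp1-pref1-1390584", "subsession_hours_sum": 2.0},
    {"client_id": "c1", "date": "2017-08-21",
     "experiment": "pref-flip-searchcomp1-pref2-1390584", "subsession_hours_sum": 2.0},
    {"client_id": "c1", "date": "2017-08-21",
     "experiment": "pref-flip-searchcomp1-pref3-1390584", "subsession_hours_sum": 2.0},
]


def retain_one(rows):
    """Intended behaviour: keep the value from the first row seen per client/date."""
    kept = {}
    for row in rows:
        kept.setdefault((row["client_id"], row["date"]), row["subsession_hours_sum"])
    return kept


def sum_all(rows):
    """Mistaken behaviour: add up the value across every experiment row."""
    totals = defaultdict(float)
    for row in rows:
        totals[(row["client_id"], row["date"])] += row["subsession_hours_sum"]
    return dict(totals)


key = ("c1", "2017-08-21")
print(retain_one(base_rows)[key], sum_all(base_rows)[key])
```

If something like the second path were taken for the with-branch rows, each measure would be inflated by the number of experiments the profile was enrolled in.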
Full details are in this rather long-winded notebook: https://metrics.mozilla.com/protected/dzeber/tmp/unified-search-v3-pull-data.html
Comment 2•7 years ago
We no longer plan to maintain experiments_daily as a separate dataset; instead, we're adding experiments data to the very similar clients_daily dataset. See Bug 1431777.
Status: NEW → RESOLVED
Closed: 7 years ago
Resolution: --- → WONTFIX
Updated•2 years ago
Component: Datasets: Experiments → General