Closed Bug 1286226 Opened 8 years ago Closed 8 years ago

Backfill: Update derived datasets and reports

Categories

(Cloud Services Graveyard :: Metrics: Pipeline, defect, P2)

defect

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: mreid, Unassigned)

References

Details

Once the raw data has been updated per bug 1286220, we need to recompute any derived datasets and reports. This includes:

Derived Datasets:
- main_summary v2 and v3
- longitudinal
- client counts
- Telemetry aggregates
- redshift data for Firefox Desktop Report
- Fennec dashboard
- churn dataset
- others

Reports:
- Firefox desktop report (v4-weekly)
- Retention csv
- other a.t.m.o reports
- sql.t.m.o scheduled queries
- others
Blocks: 1285621
Depends on: 1286220
Depends on: 1286227
Points: --- → 3
Priority: -- → P2
NIing people who can help with this. Most of this should be in Airflow now, and we should be able to launch 600 instances of the c3.4xlarge type, so parallelizing across multiple datasets should not be a problem.
Flags: needinfo?(rvitillo)
Flags: needinfo?(mreid)
Flags: needinfo?(mdoglio)
:whd do you have a time range for this?
Telemetry aggregates (t.m.o.) don't need to be backfilled, as it's OK to have a few days with less data given the use cases. The longitudinal dataset has been backfilled, and the client_count one will reflect reality once main_summary is backfilled. Re:dash dashboards based on the longitudinal and client_count datasets are regenerated automatically every week, so no action is required there either.
Flags: needinfo?(rvitillo)
(In reply to Mauro Doglio [:mdoglio] from comment #2)
> :whd do you have a time range for this?
Affected dates include July 4 to July 9.
Flags: needinfo?(mreid)
Depends on: 1275889
main_summary v2 backfill is running presently, and should complete in about 3 hours. main_summary v3 will be handled in bug 1275889.
main_summary v2 has been backfilled for July 4, 6, 7, 8 and 9. July 5th appears to have some data errors (bad JSON values in the histograms / keyedHistograms fields).
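For tracking which days still need attention, here is a minimal sketch (not the actual backfill job) that lists the affected range of July 4 to July 9 and subtracts the days reported as backfilled. The helper names `affected_dates` and `pending_backfill` are hypothetical, and the year is an assumption, since the comments give only month and day.

```python
from datetime import date, timedelta


def affected_dates(start, end):
    """Return every date from start to end, inclusive."""
    days = (end - start).days
    return [start + timedelta(days=n) for n in range(days + 1)]


def pending_backfill(affected, completed):
    """Return the affected dates that have not been backfilled, in order."""
    done = set(completed)
    return [d for d in affected if d not in done]


# Assumption: the year is not stated in the bug; 2016 is used for illustration.
YEAR = 2016

affected = affected_dates(date(YEAR, 7, 4), date(YEAR, 7, 9))
completed = [date(YEAR, 7, day) for day in (4, 6, 7, 8, 9)]
remaining = pending_backfill(affected, completed)

print([d.isoformat() for d in remaining])  # → ['2016-07-05']
```

The same check could be repeated per dataset (main_summary, churn, crash_aggregates) by keeping one `completed` list for each.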
Depends on: 1287585
crash_aggregates dataset backfill done.
Flags: needinfo?(mdoglio)
The churn dataset has been backfilled.
(In reply to Mark Reid [:mreid] from comment #6)
> main_summary v2 has been backfilled for July 4, 6, 7, 8 and 9. July 5th
> appears to have some data errors (bad JSON values in the histograms /
> keyedHistograms fields).
The data for July 5th was also added a few days ago.
The retention CSV has also been updated.
(In reply to Roberto Agostino Vitillo (:rvitillo) from comment #3)
> The longitudinal dataset has been back-filled and the client_count one will
> reflect reality once main_summary is back-filled.
main_summary has been backfilled. Does something need to be triggered to update client_count?
Flags: needinfo?(rvitillo)
Depends on: 1290458
(In reply to Mark Reid [:mreid] from comment #11)
> (In reply to Roberto Agostino Vitillo (:rvitillo) from comment #3)
> > The longitudinal dataset has been back-filled and the client_count one will
> > reflect reality once main_summary is back-filled.
> main_summary has been backfilled, does something need to be triggered to
> update client_count?
No.
Flags: needinfo?(rvitillo)
Depends on: 1290540
Depends on: 1299153
This is finished, as far as I know.
Status: NEW → RESOLVED
Closed: 8 years ago
Resolution: --- → FIXED
Product: Cloud Services → Cloud Services Graveyard