Closed Bug 1337044 Opened 8 years ago Closed 8 years ago

Add retention dataset to telemetry-batch-view

Categories

(Cloud Services Graveyard :: Metrics: Pipeline, defect, P3)

Tracking

(Not tracked)

RESOLVED WONTFIX

People

(Reporter: amiyaguchi, Unassigned)

Details

The churn/retention dataset currently exists at telemetry-parquet/churn/v2 and is generated by a Jupyter notebook [1]. It has a few uses, including work for search and stub attribution.

This job should be rewritten in Scala for consistency with the other broadly used datasets. It should come with a suite of unit tests that validates expected behavior, and documentation that supports its design and use case.

Since this job is being migrated, there should also be a sanity check that the old and new versions generate the same data. A Jupyter notebook in bug 1329842 supports the transition for a rollup that uses Python + Redshift and Scala + Spark. A similar validation report can be made for retention.

[1] https://github.com/mozilla/mozilla-reports/blob/master/etl/churn.kp/orig_src/Churn.ipynb
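The sanity check described above could be sketched roughly as follows. This is only an illustration in plain Python, not the notebook from bug 1329842 and not the actual churn schema: the function name `compare_datasets` and the column names `channel`, `week`, and `n_profiles` are hypothetical, and rows are assumed to be already loaded as dictionaries from the old and new outputs.

```python
def compare_datasets(old_rows, new_rows, key_fields, value_fields, tolerance=1e-9):
    """Compare two datasets given as lists of dict rows.

    Rows are matched on key_fields; value_fields are compared numerically.
    Returns keys missing from the new data, keys only in the new data,
    and per-field mismatches for keys present in both.
    """
    def index(rows):
        indexed = {}
        for row in rows:
            key = tuple(row[f] for f in key_fields)
            if key in indexed:
                # A duplicate key means the dimensions do not identify a row.
                raise ValueError("duplicate key: %r" % (key,))
            indexed[key] = row
        return indexed

    old, new = index(old_rows), index(new_rows)
    missing = sorted(set(old) - set(new))   # in old output, absent from new
    extra = sorted(set(new) - set(old))     # in new output, absent from old
    mismatched = []
    for key in sorted(set(old) & set(new)):
        for field in value_fields:
            if abs(old[key][field] - new[key][field]) > tolerance:
                mismatched.append((key, field, old[key][field], new[key][field]))
    return {"missing": missing, "extra": extra, "mismatched": mismatched}


# Hypothetical sample rows standing in for the notebook and Scala outputs.
old_rows = [
    {"channel": "release", "week": "2017-01-01", "n_profiles": 10},
    {"channel": "beta", "week": "2017-01-01", "n_profiles": 3},
]
new_rows = [
    {"channel": "release", "week": "2017-01-01", "n_profiles": 10},
    {"channel": "beta", "week": "2017-01-01", "n_profiles": 4},
    {"channel": "nightly", "week": "2017-01-01", "n_profiles": 1},
]
report = compare_datasets(old_rows, new_rows, ["channel", "week"], ["n_profiles"])
```

An empty report (all three lists empty) would indicate the two versions agree on the compared fields; anything else points at the rows worth inspecting by hand.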
Assignee: nobody → amiyaguchi
Points: --- → 3
Priority: -- → P2
Depends on: 1337037, 1323598
Assignee: amiyaguchi → nobody
Priority: P2 → P3
Status: NEW → RESOLVED
Closed: 8 years ago
Resolution: --- → WONTFIX
Product: Cloud Services → Cloud Services Graveyard