Closed
Bug 1337044
Opened 8 years ago
Closed 8 years ago
Add retention dataset to telemetry-batch-view
Categories
(Cloud Services Graveyard :: Metrics: Pipeline, defect, P3)
Tracking
(Not tracked)
RESOLVED
WONTFIX
People
(Reporter: amiyaguchi, Unassigned)
References
Details
The churn/retention dataset currently exists at telemetry-parquet/churn/v2 and is generated by a Jupyter notebook [1]. It has a few uses, including work for search and stub attribution.
This job should be written in Scala for consistency with the other broadly used datasets. It should come with a suite of unit tests that validates expected behavior, and documentation that supports its design and use case.
Since this job is also being migrated, there should be a sanity check that the old and new versions generate the same data. A Jupyter notebook in bug 1329842 supports the transition for a rollup that uses Python + Redshift and Scala + Spark. A similar report for validating the change can be made for retention.
[1] https://github.com/mozilla/mozilla-reports/blob/master/etl/churn.kp/orig_src/Churn.ipynb
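The old-versus-new sanity check described above could be sketched roughly as follows. This is only an illustration in plain Python, not the notebook from bug 1329842 and not the actual churn job: the function names (`index_rows`, `compare_datasets`) and the column names used in the usage note (`channel`, `geo`, `n_profiles`) are assumptions made for the example. The idea is to aggregate both outputs by the same dimension columns and report keys that are missing, extra, or whose metrics disagree.

```python
from typing import Dict, Iterable, List, Mapping, Sequence, Tuple

Key = Tuple[object, ...]


def index_rows(
    rows: Iterable[Mapping[str, object]],
    dimensions: Sequence[str],
    metrics: Sequence[str],
) -> Dict[Key, List[float]]:
    """Sum each metric per unique combination of dimension values."""
    totals: Dict[Key, List[float]] = {}
    for row in rows:
        key = tuple(row[d] for d in dimensions)
        acc = totals.setdefault(key, [0.0] * len(metrics))
        for i, metric in enumerate(metrics):
            acc[i] += float(row[metric])
    return totals


def compare_datasets(
    old_rows: Iterable[Mapping[str, object]],
    new_rows: Iterable[Mapping[str, object]],
    dimensions: Sequence[str],
    metrics: Sequence[str],
    tolerance: float = 1e-9,
) -> Dict[str, List[Key]]:
    """Report dimension keys on which the two job outputs disagree."""
    old = index_rows(old_rows, dimensions, metrics)
    new = index_rows(new_rows, dimensions, metrics)
    return {
        # Keys produced by the old job but absent from the new one.
        "missing_in_new": sorted(k for k in old if k not in new),
        # Keys produced only by the new job.
        "extra_in_new": sorted(k for k in new if k not in old),
        # Keys present in both whose metric totals differ beyond tolerance.
        "mismatched": sorted(
            k
            for k in old.keys() & new.keys()
            if any(abs(a - b) > tolerance for a, b in zip(old[k], new[k]))
        ),
    }
```

For example, calling `compare_datasets(old_rows, new_rows, ["channel", "geo"], ["n_profiles"])` on rows exported from both versions returns three lists; all three being empty would suggest the migrated job reproduces the original output for those columns. With Spark, the same comparison would more likely be a full outer join on the dimension columns.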
Updated by reporter • 8 years ago
Assignee: nobody → amiyaguchi
Updated by reporter • 8 years ago
Points: --- → 3
Priority: -- → P2
Updated by reporter • 8 years ago
Updated by reporter • 8 years ago
Assignee: amiyaguchi → nobody
Updated by reporter • 8 years ago
Priority: P2 → P3
Updated by reporter • 8 years ago
Status: NEW → RESOLVED
Closed: 8 years ago
Resolution: --- → WONTFIX
Updated • 6 years ago
Product: Cloud Services → Cloud Services Graveyard