Closed Bug 1324943 Opened 9 years ago Closed 9 years ago

Add churn/retention v2 dataset to hive

Tracking

(Not tracked)

Status:

RESOLVED FIXED

People

(Reporter: amiyaguchi, Assigned: amiyaguchi)

Anthony Miyaguchi [:amiyaguchi]

Assignee

Description

9 years ago

The production location for the desktop churn/retention dataset will be `s3://telemetry-parquet/churn/v2`. This is necessary to expose the parquet partitions to presto and redash.

Anthony Miyaguchi [:amiyaguchi]

Assignee

Updated

9 years ago

Assignee: nobody → amiyaguchi

Blocks: 1311816

Points: --- → 1

Priority: -- → P1

Anthony Miyaguchi [:amiyaguchi]

Assignee

Comment 1

9 years ago

:sunahsuh provided me with the hive machine location and :wdh sent me the private key to access the machine. I appended the job for the churn location with the new version, which should update the partitions automatically once a day. For reference, `crontab -l` will list all parquet2hive jobs. A new job should probably not overlap with any of the others, to avoid OOM (for example main_summary, which takes 6+ hours as of now; this will change in the near future).
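
As a rough illustration of the overlap concern above, here is a minimal Python sketch that checks whether a proposed daily job window collides with existing ones. The job names, start times, and the 30-minute duration for the churn job are assumptions made up for the example; only the 6-hour figure for main_summary comes from this comment. It is not part of parquet2hive or of the actual crontab on the hive machine.

```python
DAY = 24 * 60  # minutes in a day; jobs repeat daily, so times wrap around

def overlaps(start_a, dur_a, start_b, dur_b):
    """Return True if two daily windows overlap.

    Each window is given as a start (minutes after midnight) and a
    duration in minutes (assumed to be between 1 and DAY - 1).
    The modulo handles windows that run past midnight.
    """
    return (start_b - start_a) % DAY < dur_a or (start_a - start_b) % DAY < dur_b

def conflicting_jobs(new_job, existing_jobs):
    """Return the names of existing jobs whose window overlaps new_job.

    Jobs are (name, start_minute, duration_minutes) tuples.
    """
    _, start, dur = new_job
    return [name for name, s, d in existing_jobs if overlaps(start, dur, s, d)]

# Hypothetical schedule: main_summary starting at 00:00 and running 6 hours.
EXISTING = [("main_summary", 0, 6 * 60)]

if __name__ == "__main__":
    # Hypothetical churn job at 05:00 for 30 minutes: collides with main_summary.
    print(conflicting_jobs(("churn_v2", 5 * 60, 30), EXISTING))
    # The same job moved to 07:00: no collision.
    print(conflicting_jobs(("churn_v2", 7 * 60, 30), EXISTING))
```

Running a check like this against the start times shown by `crontab -l` before adding an entry would flag a slot that falls inside a long-running job's window.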

Anthony Miyaguchi [:amiyaguchi]

Assignee

Comment 2

9 years ago

The data is available for querying [1], limited to 3 months of backfill.

[1] https://sql.telemetry.mozilla.org/queries/1957/source#table

Status: NEW → RESOLVED

Closed: 9 years ago

Resolution: --- → FIXED

BMO Automation

Updated

7 years ago

Product: Cloud Services → Cloud Services Graveyard

Categories

(Cloud Services Graveyard :: Metrics: Pipeline, defect, P1)
