Airflow task taar_weekly.dataflow_import_avro_to_bigtable failing on 2022-05-22
Categories
(Data Platform and Tools :: General, defect, P1)
Tracking
(Not tracked)
People
(Reporter: linh, Assigned: epavlov)
Details
(Whiteboard: [airflow-triage])
The Airflow task taar_weekly.dataflow_import_avro_to_bigtable failed on 2022-05-22. See error log here.
Comment 1•3 years ago
The Airflow task taar_weekly.dataflow_import_avro_to_bigtable failed on 2022-07-10.
The Airflow task taar_weekly.taar_lite https://workflow.telemetry.mozilla.org/tree?dag_id=taar_daily
The issue seems to be: pyspark.sql.utils.AnalysisException: 'Path does not exist: gs://moz-fx-data-derived-datasets-parquet/clients_daily/v6/submission_date_s3=20220720;'
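For anyone triaging a similar failure, a minimal sketch of how the expected partition paths could be derived and compared against what actually exists. The base path comes from the error message above; the helper names `partition_path` and `missing_partitions` are hypothetical and not part of the TAAR jobs, and the set of existing paths is assumed to come from some bucket listing done elsewhere.

```python
from datetime import date, timedelta

# Base path copied from the AnalysisException above.
BASE = "gs://moz-fx-data-derived-datasets-parquet/clients_daily/v6"


def partition_path(day: date, base: str = BASE) -> str:
    """Return the parquet partition path for one submission date (hypothetical helper)."""
    return f"{base}/submission_date_s3={day:%Y%m%d}"


def missing_partitions(existing: set[str], start: date, end: date) -> list[str]:
    """List expected partition paths in [start, end] that are absent from `existing`."""
    days = (end - start).days
    expected = [partition_path(start + timedelta(days=i)) for i in range(days + 1)]
    return [path for path in expected if path not in existing]


# Illustrative data only: pretend 2022-07-19 was exported and 2022-07-20 was not.
existing = {partition_path(date(2022, 7, 19))}
gaps = missing_partitions(existing, date(2022, 7, 19), date(2022, 7, 20))
```

Here `gaps` would contain only the path named in the error, which makes it easier to tell a single missing export from a longer outage.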
Comment 2•3 years ago (Assignee)
Is gs://moz-fx-data-derived-datasets-parquet/clients_daily/v6 still a relevant data source, or was it deprecated and should we migrate to something else?
Comment 3•3 years ago
If you can migrate to reading mozdata.telemetry.clients_daily (or moz-fx-data-shared-prod.telemetry_derived.clients_daily_v6), that would be good.
The parquet export for gs://moz-fx-data-derived-datasets-parquet/clients_daily/v6/submission_date_s3=20220720 seems to have started failing while I was on PTO. It's still maintained exclusively for this use case, and I haven't had a chance to investigate yet.
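A rough sketch of what the BigQuery equivalent of one parquet partition read might look like. The table name is the one given above; the `submission_date` column name, the `SELECT *` shape, and the helper names are assumptions for illustration, not how the TAAR jobs are written.

```python
from datetime import datetime

# Table name from this comment.
TABLE = "moz-fx-data-shared-prod.telemetry_derived.clients_daily_v6"


def s3_partition_to_date(value: str) -> str:
    """Convert a submission_date_s3 value such as '20220720' to an ISO date string."""
    return datetime.strptime(value, "%Y%m%d").date().isoformat()


def clients_daily_query(s3_value: str, table: str = TABLE) -> str:
    """Build a query for one day of data (assumes a `submission_date` DATE column)."""
    day = s3_partition_to_date(s3_value)
    return f"SELECT * FROM `{table}` WHERE submission_date = '{day}'"


# The partition that is currently missing from the parquet export.
query = clients_daily_query("20220720")
```

The main point is the date format change: the parquet layout keys on `submission_date_s3=YYYYMMDD`, while a BigQuery filter would take an ISO date.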
Comment 4•3 years ago (Assignee)
:relud is it just replacing the name or switching to reading from BigQuery?
If the latter, it would require changing and retesting three TAAR jobs to query BigQuery instead of Spark, and there's nobody to work on that. I'm still the maintainer, but I'm at Pocket now, so I can only make quick fixes. Maybe data engineering has the resources to work on this? Otherwise we can just stick with the old parquet scheme.
Comment 6•3 years ago
https://github.com/mozilla/telemetry-airflow/pull/1531 should get parquet working again
Comment 7•3 years ago
also blocked on https://github.com/mozilla/telemetry-airflow/pull/1532
Comment 8•3 years ago
parquet exports are fixed now
Comment 9•3 years ago
Job is now working. Evgeny, do we need to backfill the failed tasks, or can we mark this complete?
Comment 10•3 years ago (Assignee)
We don't need to backfill, closing