Closed Bug 1752803 Opened 3 years ago Closed 3 years ago

Airflow DAG prerelease_telemetry_aggregates not scheduling since 2022-01-21

Categories

(Data Platform and Tools :: General, defect)

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: kik, Assigned: linh)

Details

(Whiteboard: [airflow-triage])

Attachments

(2 files)

The Airflow DAG prerelease_telemetry_aggregates has not been scheduling new runs since 2022-01-21.

This appears to be caused by the combination of tasks having the depends_on_past flag set to True and the schedule interval being once per day.
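
To illustrate the blocking behaviour, here is a minimal sketch in plain Python (no Airflow import). It models the documented depends_on_past rule, where a task instance runs only if the previous instance of the same task succeeded or was skipped. The function name can_start, the history dict, and the dates are hypothetical and are not taken from the real DAG.

```python
from datetime import date, timedelta

# Per the Airflow docs, depends_on_past lets an instance run only if the
# previous instance of the same task succeeded or was skipped.
SATISFYING_STATES = {"success", "skipped"}


def can_start(states, run_date, first_date, interval=timedelta(days=1)):
    """Simplified model of depends_on_past=True for a single task."""
    if run_date == first_date:
        return True  # the first scheduled run has no predecessor to wait for
    return states.get(run_date - interval) in SATISFYING_STATES


# Hypothetical history, for illustration only (not the real DAG's dates).
first = date(2022, 1, 1)
history = {date(2022, 1, 1): "success", date(2022, 1, 2): "failed"}

for day in (date(2022, 1, 2), date(2022, 1, 3), date(2022, 1, 4)):
    print(day, "can start" if can_start(history, day, first) else "blocked")
```

In this model the run right after the failed one is blocked, and so is every later daily run, because each one waits on a predecessor that never reaches a satisfying state.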

The tasks that succeeded in the previous DAG run were triggered and executed as expected. However, the mozaggregator2bq_extract task fails to start because of the past failure, with the following errors:

[2022-01-26 19:54:43,918] {{pod_launcher.py:149}} INFO - 22/01/26 19:54:43 INFO Executor: Starting executor ID driver on host mozaggregator2bq-extract.21d3e67223304474a5e4b52b6fe0abd4
[2022-01-26 19:54:43,936] {{pod_launcher.py:149}} INFO - 22/01/26 19:54:43 ERROR SparkContext: Error initializing SparkContext.
[2022-01-26 19:54:43,936] {{pod_launcher.py:149}} INFO - org.apache.spark.SparkException: Invalid Spark URL: spark://HeartbeatReceiver@mozaggregator2bq-extract.21d3e67223304474a5e4b52b6fe0abd4:36027
[2022-01-26 19:54:43,936] {{pod_launcher.py:149}} INFO - 	at org.apache.spark.rpc.RpcEndpointAddress$.apply(RpcEndpointAddress.scala:66)
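
The log does not say why the URL is rejected. One possible reading, and this is an assumption rather than something confirmed in this bug, is that the final label of the host name starts with a digit (21d3...). The RFC 2396 hostname grammar, which java.net.URI follows, requires the last label to start with a letter. The sketch below approximates that grammar with a regular expression; is_rfc2396_hostname is a hypothetical helper, not Spark or Airflow code.

```python
import re

# Approximation of the RFC 2396 hostname grammar (assumed to be the relevant rule):
#   hostname = *( domainlabel "." ) toplabel [ "." ]
# where toplabel must start with a letter.
LABEL = r"[A-Za-z0-9](?:[A-Za-z0-9-]*[A-Za-z0-9])?"
TOP_LABEL = r"[A-Za-z](?:[A-Za-z0-9-]*[A-Za-z0-9])?"
HOSTNAME = re.compile(rf"(?:{LABEL}\.)*{TOP_LABEL}\.?")


def is_rfc2396_hostname(host):
    """Return True if host matches the approximate RFC 2396 hostname grammar."""
    return HOSTNAME.fullmatch(host) is not None


# Host name copied from the log above.
pod_host = "mozaggregator2bq-extract.21d3e67223304474a5e4b52b6fe0abd4"
print(pod_host, is_rfc2396_hostname(pod_host))
```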

The failed task also prevents all downstream tasks from that point on from being executed. Furthermore, because the affected DAG runs cannot complete, 6 of them are now active in parallel. That is also the limit of active DAG runs for this specific DAG, so no more runs of this DAG can be scheduled at the moment.
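
The way stuck runs exhaust the active-run limit can be sketched with a small simulation. This is a simplified model and not Airflow's scheduler: the simulate function and its parameters are hypothetical, and the limit of 6 is the value reported above (in Airflow this kind of limit is set through the DAG's max_active_runs argument).

```python
def simulate(days, max_active_runs, run_completes):
    """Simplified model: one new run per day unless the active limit is hit."""
    active = []       # runs that were created and have not finished
    not_created = []  # days on which no new run could be created
    for day in range(days):
        # Drop runs that managed to finish before today's scheduling decision.
        active = [run for run in active if not run_completes(run)]
        if len(active) < max_active_runs:
            active.append(day)
        else:
            not_created.append(day)
    return active, not_created


# No run ever completes, mirroring runs that are blocked by a failed task.
active, not_created = simulate(days=8, max_active_runs=6,
                               run_completes=lambda run: False)
print("active runs:", active)
print("days without a new run:", not_created)
```

Once six runs are stuck, every later day lands in not_created, which matches the symptom of the DAG no longer scheduling.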

Link to DAG: https://workflow.telemetry.mozilla.org/tree?dag_id=prerelease_telemetry_aggregates

Reminder to revert this when the data starts flowing and is backfilled.

Attachment #9262661 - Attachment is patch: true
Status: NEW → RESOLVED
Closed: 3 years ago
Resolution: --- → FIXED
Component: Datasets: General → General