Bug 1777705 (Closed)
Opened 2 years ago
Closed 2 years ago
Airflow task bhr_collection.bhr_collection.run_dataproc_pyspark failing on 2022-06-30, 07:00:00
Categories: Data Platform and Tools :: General (defect)
Tracking: Not tracked
Status: RESOLVED FIXED
People: (Reporter: kik, Assigned: alexical)
Details: (Whiteboard: [airflow-triage])
Airflow task bhr_collection.bhr_collection.run_dataproc_pyspark failing on 2022-06-30, 07:00:00
Error log extract:
Traceback (most recent call last):
  File "/usr/local/lib/python3.7/site-packages/airflow/models/taskinstance.py", line 1165, in _run_raw_task
    self._prepare_and_execute_task_with_callbacks(context, task)
  File "/usr/local/lib/python3.7/site-packages/airflow/models/taskinstance.py", line 1283, in _prepare_and_execute_task_with_callbacks
    result = self._execute_task(context, task_copy)
  File "/usr/local/lib/python3.7/site-packages/airflow/models/taskinstance.py", line 1313, in _execute_task
    result = task_copy.execute(context=context)
  File "/usr/local/lib/python3.7/site-packages/airflow/providers/google/cloud/operators/dataproc.py", line 1627, in execute
    super().execute(context)
  File "/usr/local/lib/python3.7/site-packages/airflow/providers/google/cloud/operators/dataproc.py", line 1067, in execute
    project_id=self.project_id, job=self.job["job"], region=self.region
  File "/usr/local/lib/python3.7/site-packages/airflow/providers/google/common/hooks/base_google.py", line 425, in inner_wrapper
    return func(self, *args, **kwargs)
  File "/usr/local/lib/python3.7/site-packages/airflow/providers/google/cloud/hooks/dataproc.py", line 948, in submit_job
    metadata=metadata,
  File "/usr/local/lib/python3.7/site-packages/google/cloud/dataproc_v1beta2/services/job_controller/client.py", line 413, in submit_job
    response = rpc(request, retry=retry, timeout=timeout, metadata=metadata,)
  File "/usr/local/lib/python3.7/site-packages/google/api_core/gapic_v1/method.py", line 145, in __call__
    return wrapped_func(*args, **kwargs)
  File "/usr/local/lib/python3.7/site-packages/google/api_core/grpc_helpers.py", line 69, in error_remapped_callable
    six.raise_from(exceptions.from_grpc_error(exc), exc)
  File "<string>", line 3, in raise_from
google.api_core.exceptions.NotFound: 404 Not found: Cluster projects/airflow-dataproc/regions/us-west1/clusters/bhr-collection-2022-06-30
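The NotFound above means the PySpark job was submitted to a Dataproc cluster that no longer existed at submit time, a common failure mode when the upstream cluster-creation task fails or an ephemeral cluster is deleted before the submit step runs. The resource path in the message identifies exactly which cluster was expected; a minimal sketch of pulling it apart when triaging (the regex and variable names are illustrative, not part of the DAG):

```python
import re

# Error string copied from the log extract above.
error = ("404 Not found: Cluster projects/airflow-dataproc/"
         "regions/us-west1/clusters/bhr-collection-2022-06-30")

# Dataproc clusters are addressed as projects/{p}/regions/{r}/clusters/{c};
# extract the three components to know where to look in the Cloud Console.
match = re.search(r"projects/([^/]+)/regions/([^/]+)/clusters/([^/\s]+)", error)
project, region, cluster = match.groups()
print(project, region, cluster)
```

The cluster name here embeds the execution date, which is consistent with a per-run ephemeral cluster rather than a long-lived one.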
Airflow task logs:
https://workflow.telemetry.mozilla.org/log?dag_id=bhr_collection.bhr_collection&task_id=run_dataproc_pyspark&execution_date=2022-06-30T05%3A00%3A00%2B00%3A00
Dataproc log link:
https://console.cloud.google.com/dataproc/jobs/bhr-collection_617fae4a/monitoring?region=us-west1&project=airflow-dataproc
Extract from dataproc logs:
Processing modules...
Processing modules took 0ms to complete
22/07/01 11:00:16 WARN org.apache.spark.scheduler.TaskSetManager: Lost task 138.0 in stage 5.0 (TID 2141, bhr-collection-2022-06-30-w-1.c.airflow-dataproc.internal, executor 20): org.apache.spark.SparkException: Python worker exited unexpectedly (crashed)
    at org.apache.spark.api.python.BasePythonRunner$ReaderIterator$$anonfun$1.applyOrElse(PythonRunner.scala:490)
    at org.apache.spark.api.python.BasePythonRunner$ReaderIterator$$anonfun$1.applyOrElse(PythonRunner.scala:479)
    at scala.runtime.AbstractPartialFunction.apply(AbstractPartialFunction.scala:38)
    at org.apache.spark.api.python.PythonRunner$$anon$3.read(PythonRunner.scala:597)
    at org.apache.spark.api.python.PythonRunner$$anon$3.read(PythonRunner.scala:575)
    at org.apache.spark.api.python.BasePythonRunner$ReaderIterator.hasNext(PythonRunner.scala:410)
    at org.apache.spark.InterruptibleIterator.hasNext(InterruptibleIterator.scala:37)
    at scala.collection.Iterator$GroupedIterator.fill(Iterator.scala:1209)
    at scala.collection.Iterator$GroupedIterator.hasNext(Iterator.scala:1215)
    at scala.collection.Iterator$$anon$10.hasNext(Iterator.scala:458)
    at org.apache.spark.shuffle.sort.UnsafeShuffleWriter.write(UnsafeShuffleWriter.java:188)
    at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:99)
    at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:55)
    at org.apache.spark.scheduler.Task.run(Task.scala:123)
    at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$3(Executor.scala:414)
    at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1360)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:417)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:750)
Caused by: java.io.EOFException
    at java.io.DataInputStream.readInt(DataInputStream.java:392)
    at org.apache.spark.api.python.PythonRunner$$anon$3.read(PythonRunner.scala:582)
    ... 16 more
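On the Dataproc side, an EOFException out of `DataInputStream.readInt` inside `PythonRunner` means the executor's Python worker process died mid-stream, frequently because the OS OOM killer took it. A common mitigation is to raise the memory reserved for Python workers when submitting the job. A hedged sketch of properties one might tune (these are standard Spark setting names, but the values are illustrative and not the ones this DAG actually uses):

```python
# Illustrative Spark properties often raised when Python workers are
# OOM-killed on Dataproc; pass these in the job's "properties" map.
# Values are examples only, not tuned for this workload.
spark_properties = {
    "spark.executor.memoryOverhead": "4096",  # MiB outside the JVM heap, shared by Python workers
    "spark.executor.memory": "8g",            # JVM heap per executor
    "spark.python.worker.memory": "2g",       # per-Python-worker memory before spilling to disk
}
```

Fewer, larger executors (or more worker nodes on the ephemeral cluster) is an alternative lever when raising overhead alone does not help.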
Updated 2 years ago
Assignee: nobody → dothayer
Updated 2 years ago
Component: Datasets: General → General
Comment 1 (2 years ago)
This looks to be resolved; the Airflow tasks were re-run successfully. Kik, Doug, if that doesn't seem correct, feel free to reopen.
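For reference, re-running a failed task instance like this one is typically done by clearing it so the scheduler retries it. A hedged sketch using the Airflow 2.x CLI (older 1.10 deployments use `airflow clear` instead; the IDs below are taken from this bug, the execution date from the log URL, and the actual command is shown commented out since it only makes sense on the Airflow host):

```shell
# Identifiers from this bug report.
DAG_ID="bhr_collection.bhr_collection"
TASK_ID="run_dataproc_pyspark"
EXEC_DATE="2022-06-30T05:00:00+00:00"

# Clearing the task instance marks it for retry (Airflow 2.x CLI form):
# airflow tasks clear "$DAG_ID" --task-regex "$TASK_ID" \
#     --start-date "$EXEC_DATE" --end-date "$EXEC_DATE" --yes
echo "would clear $TASK_ID in $DAG_ID at $EXEC_DATE"
```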
Updated 2 years ago
Status: NEW → RESOLVED
Closed: 2 years ago
Resolution: --- → FIXED