Airflow task bqetl_main_summary.client_probe_processes__v1 failed for 2023-07-21
Categories
(Data Platform and Tools :: General, defect)
Tracking
(Not tracked)
People
(Reporter: wichan, Unassigned)
Details
(Whiteboard: [airflow-triage])
Attachments
(1 file)
Airflow task bqetl_main_summary.client_probe_processes__v1 failed for 2023-07-21
Log extract:
[2023-07-22, 00:37:49 UTC] {pod_manager.py:235} INFO - Error in query string: Error processing job 'moz-fx-data-shared-
[2023-07-22, 00:37:49 UTC] {pod_manager.py:235} INFO - prod:bqjob_r43a55583b369dc7e_000001897b07fabe_1': Queries in UNION ALL have
[2023-07-22, 00:37:49 UTC] {pod_manager.py:235} INFO - mismatched column count; query 1 has 15 columns, query 2 has 16 columns; failed
[2023-07-22, 00:37:49 UTC] {pod_manager.py:235} INFO - to parse view 'moz-fx-data-shared-prod.telemetry.client_probe_counts' at [7:3]
[2023-07-22, 00:37:50 UTC] {pod_manager.py:235} INFO - Traceback (most recent call last):
[2023-07-22, 00:37:50 UTC] {pod_manager.py:235} INFO - File "<string>", line 1, in <module>
[2023-07-22, 00:37:50 UTC] {pod_manager.py:235} INFO - File "/app/bigquery_etl/cli/__init__.py", line 74, in cli
[2023-07-22, 00:37:50 UTC] {pod_manager.py:235} INFO - group(prog_name=prog_name)
[2023-07-22, 00:37:50 UTC] {pod_manager.py:235} INFO - File "/usr/local/lib/python3.10/site-packages/click/core.py", line 1130, in __call__
[2023-07-22, 00:37:50 UTC] {pod_manager.py:235} INFO - return self.main(*args, **kwargs)
[2023-07-22, 00:37:50 UTC] {pod_manager.py:235} INFO - File "/usr/local/lib/python3.10/site-packages/click/core.py", line 1055, in main
[2023-07-22, 00:37:50 UTC] {pod_manager.py:235} INFO - rv = self.invoke(ctx)
[2023-07-22, 00:37:50 UTC] {pod_manager.py:235} INFO - File "/usr/local/lib/python3.10/site-packages/click/core.py", line 1657, in invoke
[2023-07-22, 00:37:50 UTC] {pod_manager.py:235} INFO - return _process_result(sub_ctx.command.invoke(sub_ctx))
[2023-07-22, 00:37:50 UTC] {pod_manager.py:235} INFO - File "/usr/local/lib/python3.10/site-packages/click/core.py", line 1657, in invoke
[2023-07-22, 00:37:50 UTC] {pod_manager.py:235} INFO - return _process_result(sub_ctx.command.invoke(sub_ctx))
[2023-07-22, 00:37:50 UTC] {pod_manager.py:235} INFO - File "/usr/local/lib/python3.10/site-packages/click/core.py", line 1404, in invoke
[2023-07-22, 00:37:50 UTC] {pod_manager.py:235} INFO - return ctx.invoke(self.callback, **ctx.params)
[2023-07-22, 00:37:50 UTC] {pod_manager.py:235} INFO - File "/usr/local/lib/python3.10/site-packages/click/core.py", line 760, in invoke
[2023-07-22, 00:37:50 UTC] {pod_manager.py:235} INFO - return __callback(*args, **kwargs)
[2023-07-22, 00:37:50 UTC] {pod_manager.py:235} INFO - File "/usr/local/lib/python3.10/site-packages/click/decorators.py", line 26, in new_func
[2023-07-22, 00:37:50 UTC] {pod_manager.py:235} INFO - return f(get_current_context(), *args, **kwargs)
[2023-07-22, 00:37:50 UTC] {pod_manager.py:235} INFO - File "/app/bigquery_etl/cli/query.py", line 832, in run
[2023-07-22, 00:37:50 UTC] {pod_manager.py:235} INFO - _run_query(
[2023-07-22, 00:37:50 UTC] {pod_manager.py:235} INFO - File "/app/bigquery_etl/cli/query.py", line 936, in _run_query
[2023-07-22, 00:37:50 UTC] {pod_manager.py:235} INFO - subprocess.check_call(["bq"] + query_arguments, stdin=query_stream)
[2023-07-22, 00:37:50 UTC] {pod_manager.py:235} INFO - File "/usr/local/lib/python3.10/subprocess.py", line 369, in check_call
[2023-07-22, 00:37:50 UTC] {pod_manager.py:235} INFO - raise CalledProcessError(retcode, cmd)
[2023-07-22, 00:37:50 UTC] {pod_manager.py:235} INFO - subprocess.CalledProcessError: Command '['bq', 'query', '--dataset_id=telemetry_derived', '--project_id=moz-fx-data-shared-prod', '--destination_table=mozilla-public-data:telemetry_derived.client_probe_processes_v1']' returned non-zero exit status 1.
[2023-07-22, 00:37:52 UTC] {pod_manager.py:288} INFO - Pod client-probe-processes--v1-868f1goe has phase Running
[2023-07-22, 00:37:54 UTC] {kubernetes_pod.py:691} INFO - Skipping deleting pod: client-probe-processes--v1-868f1goe
[2023-07-22, 00:37:54 UTC] {taskinstance.py:1776} ERROR - Task failed with exception
Traceback (most recent call last):
  File "/home/airflow/.local/lib/python3.10/site-packages/airflow/providers/google/cloud/operators/kubernetes_engine.py", line 532, in execute
    result = super().execute(context)
  File "/home/airflow/.local/lib/python3.10/site-packages/airflow/providers/cncf/kubernetes/operators/kubernetes_pod.py", line 516, in execute
    return self.execute_sync(context)
  File "/home/airflow/.local/lib/python3.10/site-packages/airflow/providers/cncf/kubernetes/operators/kubernetes_pod.py", line 545, in execute_sync
    self.cleanup(
  File "/home/airflow/.local/lib/python3.10/site-packages/airflow/providers/cncf/kubernetes/operators/kubernetes_pod.py", line 671, in cleanup
    raise AirflowException(
airflow.exceptions.AirflowException: Pod client-probe-processes--v1-868f1goe returned a failure:
Comment 1•2 years ago
This seems related to non-normalized aggregations now being available in GLAM. The histogram tables have a non_norm_aggregates column while the scalar tables don't, which would explain the mismatched column count in the UNION ALL. Are there plans to add this column to the scalar tables?
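To make the mismatch concrete, here is a minimal sketch of a schema comparison that reports which columns exist on only one side of a UNION ALL. The `column_diff` helper and the shortened column lists are hypothetical and only for illustration; the only column name taken from this bug is non_norm_aggregates, and the real tables have 15 and 16 columns.

```python
def column_diff(left_columns, right_columns):
    """Return (only_in_left, only_in_right), preserving input order."""
    left_set, right_set = set(left_columns), set(right_columns)
    only_left = [c for c in left_columns if c not in right_set]
    only_right = [c for c in right_columns if c not in left_set]
    return only_left, only_right


# Hypothetical, shortened schemas. Only non_norm_aggregates comes from this bug.
scalar_columns = ["metric", "client_agg_type", "aggregates"]
histogram_columns = ["metric", "client_agg_type", "aggregates", "non_norm_aggregates"]

only_scalar, only_histogram = column_diff(scalar_columns, histogram_columns)
print("only in scalar:", only_scalar)
print("only in histogram:", only_histogram)
```

In practice the two column lists would come from the table schemas rather than being hard-coded.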
Comment 2•2 years ago
The normalization in question is a histogram normalization, so scalars were never normalized in that sense. What I'm currently doing for GLAM's sake is adding a non_norm_aggregates column before exporting the data. See this open PR:
https://github.com/mozilla/bigquery-etl/pull/4107/files
Comment 3•2 years ago
I'm going to add the columns to the actual histogram and scalar tables instead of fixing the view query.
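As a rough illustration of that approach, the sketch below reproduces the column-count failure and then aligns the schemas by adding the missing column to the table itself. It uses an in-memory SQLite database as a stand-in for BigQuery, and the table names scalar_counts and histogram_counts, the two-column schema, and the sample rows are all made up for the example. This is not the actual change made to the production tables.

```python
import sqlite3

# In-memory SQLite stands in for BigQuery here; table names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE scalar_counts (metric TEXT, aggregates TEXT);
    CREATE TABLE histogram_counts (
        metric TEXT, aggregates TEXT, non_norm_aggregates TEXT
    );
    INSERT INTO scalar_counts VALUES ('s1', 'a');
    INSERT INTO histogram_counts VALUES ('h1', 'b', 'c');
    """
)

union_sql = "SELECT * FROM scalar_counts UNION ALL SELECT * FROM histogram_counts"

# Before the fix: the two branches have 2 and 3 columns, so the query is rejected.
try:
    conn.execute(union_sql)
    mismatch_error = None
except sqlite3.OperationalError as exc:
    mismatch_error = str(exc)
print("before:", mismatch_error)

# Fix at the table level: add the missing column (existing rows get NULL).
conn.execute("ALTER TABLE scalar_counts ADD COLUMN non_norm_aggregates TEXT")

# After the fix: both branches have 3 columns and the UNION ALL runs.
rows = conn.execute(union_sql).fetchall()
print("after:", rows)
```

Changing the tables rather than the view keeps `SELECT *` style unions working without padding one branch with a NULL placeholder in the view definition.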
Comment 4•2 years ago
Comment 5•2 years ago
Comment 6•2 years ago
The job succeeded on its own on a second run, and the tables have been updated so that they can be combined with UNION. I'm considering this fixed.
Comment 7•2 years ago
Eduardo, can we mark these tasks and the associated DAG runs as successful, or do we need to re-run the affected days? https://workflow.telemetry.mozilla.org/dags/bqetl_main_summary/grid?run_id=scheduled__2023-07-22T02%3A00%3A00%2B00%3A00&execution_date=2023-07-22+02%3A00%3A00%2B00%3A00&tab=code&base_date=2023-07-23T14%3A11%3A02Z&dag_run_id=scheduled__2023-07-20T02%3A00%3A00%2B00%3A00&task_id=client_probe_processes__v1