Closed Bug 1868852 Opened 1 year ago Closed 1 year ago

private_bqetl_historical_transactions.historical_transactions_derived__historical_transactions__v2 failed for exec_date 2023-12-07, 04:00:00 UTC

Categories

(Data Platform and Tools :: General, defect)

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: mhirose, Assigned: srose)

Details

(Whiteboard: [airflow-triage])

Airflow task private_bqetl_historical_transactions.historical_transactions_derived__historical_transactions__v2 failed for exec_date 2023-12-07, 04:00:00 UTC

Task link:
https://workflow.telemetry.mozilla.org/dags/private_bqetl_historical_transactions/grid?dag_run_id=scheduled__2023-12-06T04%3A00%3A00%2B00%3A00&task_id=historical_transactions_derived__historical_transactions__v2&tab=logs
Log extract:

[2023-12-07, 05:06:37 UTC] {pod_manager.py:437} INFO - [base]  [{'code': 403, 'errors': [{'message': 'Access Denied: Table moz-fx-data-shared-prod:org_mozilla_fenix_nightly_stable.client_deduplication_v1: User does not have permission to query table moz-fx-data-shared-prod:org_mozilla_fenix_nightly_stable.client_deduplication_v1, or perhaps it does not exist in location US.', 'domain': 'global', 'reason': 'accessDenied'}], 'response': {'headers': {'alt-svc': 'h3=":443"; ma=2592000,h3-29=":443"; ma=2592000', 'cache-control': 'private', 'content-encoding': 'gzip', 'content-type': 'application/json; charset=UTF-8', 'date': 'Thu, 07 Dec 2023 05:05:16 GMT', 'server': 'ESF', 'transfer-encoding': 'chunked', 'vary': 'Origin, X-Origin, Referer', 'x-content-type-options': 'nosniff', 'x-frame-options': 'SAMEORIGIN', 'x-xss-protection': '0'}}, 'message': 'Access Denied: Table moz-fx-data-shared-prod:org_mozilla_fenix_nightly_stable.client_deduplication_v1: User does not have permission to query table moz-fx-data-shared-prod:org_mozilla_fenix_nightly_stable.client_deduplication_v1, or perhaps it does not exist in location US.'}]
[2023-12-07, 05:06:37 UTC] {pod_manager.py:437} INFO - [base] Cannot get schema for moz-fx-data-shared-prod.org_mozilla_fenix_nightly.client_deduplication: Error when dry running SQL file moz-fx-data-shared-prod/org_mozilla_fenix_nightly/client_deduplication/query.sql
[2023-12-07, 05:06:37 UTC] {pod_manager.py:437} INFO - [base] moz-fx-data-shared-prod/org_mozilla_fennec_aurora/client_deduplication/query.sql ERROR
[2023-12-07, 05:06:37 UTC] {pod_manager.py:437} INFO - [base]  [{'code': 403, 'errors': [{'message': 'Access Denied: Table moz-fx-data-shared-prod:org_mozilla_fennec_aurora_stable.client_deduplication_v1: User does not have permission to query table moz-fx-data-shared-prod:org_mozilla_fennec_aurora_stable.client_deduplication_v1, or perhaps it does not exist in location US.', 'domain': 'global', 'reason': 'accessDenied'}], 'response': {'headers': {'alt-svc': 'h3=":443"; ma=2592000,h3-29=":443"; ma=2592000', 'cache-control': 'private', 'content-encoding': 'gzip', 'content-type': 'application/json; charset=UTF-8', 'date': 'Thu, 07 Dec 2023 05:05:18 GMT', 'server': 'ESF', 'transfer-encoding': 'chunked', 'vary': 'Origin, X-Origin, Referer', 'x-content-type-options': 'nosniff', 'x-frame-options': 'SAMEORIGIN', 'x-xss-protection': '0'}}, 'message': 'Access Denied: Table moz-fx-data-shared-prod:org_mozilla_fennec_aurora_stable.client_deduplication_v1: User does not have permission to query table moz-fx-data-shared-prod:org_mozilla_fennec_aurora_stable.client_deduplication_v1, or perhaps it does not exist in location US.'}]

Flags: needinfo?(srose)
Flags: needinfo?(spatil)
Assignee: nobody → srose
Status: NEW → ASSIGNED
Flags: needinfo?(srose)

The Airflow task failed with the following error, which occurred during automated SQL generation:

Traceback (most recent call last):
  File "/usr/local/lib/python3.10/site-packages/multiprocess/pool.py", line 125, in worker
    result = (True, func(*args, **kwds))
  File "/usr/local/lib/python3.10/site-packages/multiprocess/pool.py", line 48, in mapstar
    return list(map(*args))
  File "/usr/local/lib/python3.10/site-packages/pathos/helpers/mp_helper.py", line 15, in <lambda>
    func = lambda args: f(*args)
  File "/app/sql_generators/glean_usage/__init__.py", line 179, in <lambda>
    lambda f: f[0](f[1]),
  File "/app/sql_generators/glean_usage/event_monitoring_live.py", line 56, in generate_per_app_id
    app_name=[
IndexError: list index out of range

However, that error is only a symptom. SQL generation ran in the first place because the query file couldn't be found:

No files matching: /root/private-bigquery-etl/sql/moz-fx-data-shared-prod/historical_transactions_derived/historical_transactions_v2/script.sql
Generating SQL content in /tmp/tmpzuz94s9f.

The ETL's sql_file_path setting is correct in the private_bqetl_historical_transactions DAG checked into the private-bigquery-etl repo's main branch, but incorrect in the private-generated-sql branch. The latter branch started being used when we switched to syncing DAGs using Git on 2023-10-17.

https://github.com/mozilla/bigquery-etl/pull/4662 will stop a missing query file from being a silent failure.
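
For illustration, here is a minimal sketch of the kind of fail-fast check described above. It is not the code from the PR: the function name resolve_query_file, the exception name QueryFileNotFoundError, and the sql_dir argument are hypothetical, and the error text simply mirrors the "No files matching" log line.

```python
from pathlib import Path


class QueryFileNotFoundError(FileNotFoundError):
    """Raised when a configured query file is missing (hypothetical name)."""


def resolve_query_file(sql_dir, sql_file_path):
    """Return the query file path, or raise if it does not exist.

    Hypothetical helper: raising here makes a wrong sql_file_path fail
    immediately with a clear message, instead of letting the run continue
    into unrelated steps that fail with a misleading error.
    """
    candidate = Path(sql_dir) / sql_file_path
    if not candidate.is_file():
        raise QueryFileNotFoundError(f"No files matching: {candidate}")
    return candidate
```

With a check like this, a bad sql_file_path would surface as a missing-file error that names the path, rather than as an IndexError from SQL generation.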

https://github.com/mozilla/bigquery-etl/pull/4668 has resolved the underlying issue, and I have successfully re-run all the failed task instances.

Status: ASSIGNED → RESOLVED
Closed: 1 year ago
Flags: needinfo?(spatil)
Resolution: --- → FIXED