Closed Bug 1768864 Opened 3 years ago Closed 3 years ago

Airflow task bqetl_subplat.mozilla_vpn_external__users__v1 failing on 2022-05-10

Categories

(Data Platform and Tools :: General, defect)

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: kik, Assigned: relud)

Details

(Whiteboard: [airflow-triage])

Airflow task bqetl_subplat.mozilla_vpn_external__users__v1 failing on 2022-05-10

DAG link:
https://workflow.telemetry.mozilla.org/log?dag_id=bqetl_subplat&task_id=mozilla_vpn_external__users__v1&execution_date=2022-05-10T01%3A45%3A00%2B00%3A00

Error log:

[2022-05-11 02:46:03,965] {{pod_launcher.py:149}} INFO - Error in query string: Error processing job 'moz-fx-data-shared-
[2022-05-11 02:46:03,966] {{pod_launcher.py:149}} INFO - prod:bqjob_r3f31b6e0e63458f4_00000180b102583c_1': No matching signature for
[2022-05-11 02:46:03,966] {{pod_launcher.py:149}} INFO - function IF for argument types: BOOL, STRUCT<id INT64, email STRING, fxa_uid
[2022-05-11 02:46:03,968] {{pod_launcher.py:149}} INFO - STRING, ...>, STRUCT<id INT64, email STRING, fxa_uid STRING, ...>. Supported
[2022-05-11 02:46:03,968] {{pod_launcher.py:149}} INFO - signature: IF(BOOL, ANY, ANY) at [2:3]
[2022-05-11 02:46:04,110] {{pod_launcher.py:149}} INFO - Traceback (most recent call last):
[2022-05-11 02:46:04,111] {{pod_launcher.py:149}} INFO -   File "/usr/local/lib/python3.8/runpy.py", line 194, in _run_module_as_main
[...]
[2022-05-11 02:46:04,113] {{pod_launcher.py:149}} INFO - subprocess.CalledProcessError: Command '['bq', 'query', '--project_id=moz-fx-data-shared-prod', "--parameter=external_database_query:STRING:SELECT * FROM users WHERE DATE(updated_at) = DATE '2022-05-10'", '--dataset_id=mozilla_vpn_external', '--destination_table=users_v1']' returned non-zero exit status 1.

Unfortunately, when I try to reproduce the failure in the BigQuery UI console, I get a different error:

Access Denied: Connection moz-fx-guardian-prod-bfc7.us.guardian-sql-prod: User does not have bigquery.connections.use permission for connection moz-fx-guardian-prod-bfc7.us.guardian-sql-prod.

It seems like this might be the offending line:
https://github.com/mozilla/bigquery-etl/blob/main/sql/moz-fx-data-shared-prod/mozilla_vpn_external/users_v1/query.sql#L6

Looks like the underlying table had a schema update.

cc :srose

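The upstream table dropped the attribution column, which we aren't using anymore, so the upstream rows no longer matched the schema of users_v1. I resolved this by running the following queries, which copy the existing attribution values into a separate table and then drop the column from users_v1, and then re-running the Airflow task: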

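-- Preserve the existing attribution values in a separate table.
CREATE OR REPLACE TABLE
  `moz-fx-data-shared-prod`.mozilla_vpn_external.users_attribution_v1 AS
SELECT
  id,
  attribution
FROM
  `moz-fx-data-shared-prod`.mozilla_vpn_external.users_v1;

-- Drop the column so the destination schema matches the upstream table.
ALTER TABLE
  `moz-fx-data-shared-prod`.mozilla_vpn_external.users_v1
DROP COLUMN
  attribution;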
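
For future triage, this kind of schema drift can be spotted by diffing the column lists of the destination table and the upstream table. Below is a minimal sketch, not the actual bigquery-etl tooling: `diff_schemas` is a hypothetical helper, the schemas are hard-coded rather than fetched from BigQuery, and the STRING type given for `attribution` is an assumption (only id, email and fxa_uid appear in the error log).

```python
from typing import Dict, List, Tuple

# Column name -> BigQuery type name.
Schema = Dict[str, str]


def diff_schemas(
    destination: Schema, upstream: Schema
) -> Tuple[List[str], List[str], List[str]]:
    """Return (only_in_destination, only_in_upstream, type_changed), each sorted."""
    only_in_destination = sorted(set(destination) - set(upstream))
    only_in_upstream = sorted(set(upstream) - set(destination))
    type_changed = sorted(
        name
        for name in set(destination) & set(upstream)
        if destination[name] != upstream[name]
    )
    return only_in_destination, only_in_upstream, type_changed


# Hypothetical schemas. id, email and fxa_uid come from the error log;
# the type of `attribution` is an assumption made for illustration.
destination = {
    "id": "INT64",
    "email": "STRING",
    "fxa_uid": "STRING",
    "attribution": "STRING",
}
upstream = {"id": "INT64", "email": "STRING", "fxa_uid": "STRING"}

only_in_destination, only_in_upstream, type_changed = diff_schemas(
    destination, upstream
)
print(only_in_destination)  # columns the upstream table no longer provides
```

A non-empty first list means the destination table still carries columns that the upstream table has dropped, which is the situation the ALTER TABLE above resolves.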
Status: NEW → RESOLVED
Closed: 3 years ago
Resolution: --- → FIXED
Component: Datasets: General → General