Closed Bug 1817409 Opened 2 years ago Closed 2 years ago

Airflow task merino_jobs.wikipedia_indexer_copy_export failed for 2023-02-06

Categories

(Data Platform and Tools :: General, defect)


Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: kik, Assigned: wstuckey)

Details

(Whiteboard: [airflow-triage])

Airflow task merino_jobs.wikipedia_indexer_copy_export failed for 2023-02-06

Error logs:

[2023-02-16, 20:36:25 UTC] {{pod_manager.py:226}} INFO - Forbidden: 403 GET
[2023-02-16, 20:36:25 UTC] {{pod_manager.py:226}} INFO - https://storage.googleapis.com/storage/v1/b/merino-jobs-dev/o?projection=noAcl&p
[2023-02-16, 20:36:25 UTC] {{pod_manager.py:226}} INFO - refix=wikipedia-exports&prettyPrint=false:
[2023-02-16, 20:36:25 UTC] {{pod_manager.py:226}} INFO - default-workloads@moz-fx-data-airflow-gke-prod.iam.gserviceaccount.com does not
[2023-02-16, 20:36:25 UTC] {{pod_manager.py:226}} INFO - have storage.objects.list access to the Google Cloud Storage bucket. Permission
[2023-02-16, 20:36:25 UTC] {{pod_manager.py:226}} INFO - 'storage.objects.list' denied on resource (or it may not exist).
[2023-02-16, 20:36:26 UTC] {{pod_manager.py:273}} INFO - Pod wikipedia-indexer-copy-export-8eec27d32d104332b820ef406d5c199b has phase Running
[2023-02-16, 20:36:28 UTC] {{kubernetes_pod.py:456}} INFO - skipping deleting pod: wikipedia-indexer-copy-export-8eec27d32d104332b820ef406d5c199b
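
The 403 message above is wrapped across several log lines, and the request URL is split mid-word. A minimal sketch of how the message could be rejoined and the key fields pulled out follows. It assumes the wrapping happens at 80 columns (so a line of exactly 80 characters was cut mid-word); the helper names `join_wrapped` and `summarize_denial` are hypothetical and not part of Airflow or the job.

```python
import re

# Log lines copied verbatim from the error output in this bug.
LOG_LINES = [
    "[2023-02-16, 20:36:25 UTC] {{pod_manager.py:226}} INFO - Forbidden: 403 GET",
    "[2023-02-16, 20:36:25 UTC] {{pod_manager.py:226}} INFO - https://storage.googleapis.com/storage/v1/b/merino-jobs-dev/o?projection=noAcl&p",
    "[2023-02-16, 20:36:25 UTC] {{pod_manager.py:226}} INFO - refix=wikipedia-exports&prettyPrint=false:",
    "[2023-02-16, 20:36:25 UTC] {{pod_manager.py:226}} INFO - default-workloads@moz-fx-data-airflow-gke-prod.iam.gserviceaccount.com does not",
    "[2023-02-16, 20:36:25 UTC] {{pod_manager.py:226}} INFO - have storage.objects.list access to the Google Cloud Storage bucket. Permission",
    "[2023-02-16, 20:36:25 UTC] {{pod_manager.py:226}} INFO - 'storage.objects.list' denied on resource (or it may not exist).",
]

# Matches the "[timestamp] {{file.py:NNN}} LEVEL - " prefix on each line.
PREFIX = re.compile(r"^\[[^\]]+\]\s+\{\{[^}]+\}\}\s+\w+\s+-\s")

# Assumption: the message was wrapped at 80 columns before being logged.
WRAP_WIDTH = 80


def join_wrapped(lines, width=WRAP_WIDTH):
    """Strip the log prefix from each line and rejoin the wrapped message."""
    message = ""
    previous_was_full = False
    for line in lines:
        part = PREFIX.sub("", line)
        # A full-width line was cut mid-word, so join without a space.
        separator = "" if (previous_was_full or not message) else " "
        message += separator + part
        previous_was_full = len(part) == width
    return message


def summarize_denial(lines):
    """Extract the bucket, service account, and denied permission."""
    message = join_wrapped(lines)
    bucket = re.search(r"/b/([^/]+)/o", message)
    account = re.search(r"(\S+@\S+\.iam\.gserviceaccount\.com)", message)
    permission = re.search(r"Permission\s*'([^']+)'\s+denied", message)
    return {
        "bucket": bucket.group(1) if bucket else None,
        "service_account": account.group(1) if account else None,
        "permission": permission.group(1) if permission else None,
    }


if __name__ == "__main__":
    print(summarize_denial(LOG_LINES))
```

Read this way, the failure is the production Airflow service account being denied `storage.objects.list` on the `merino-jobs-dev` bucket.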

From discussion in https://mozilla-hub.atlassian.net/browse/DSRE-1198, I think we just need to switch the bucket being used in the job to gs://moz-fx-data-prod-external-data. If it's desirable to use the dev bucket while testing the new job, I can provide a Terraform snippet that grants the relevant GCS access to workgroup:dataops-managed/airflow-gke, but we generally try to avoid granting Airflow access to dev resources.

This is failing again this week.

Thanks relud. I've turned the job off until the following PR is merged and applied: https://github.com/mozilla-services/cloudops-infra/pull/4750

As a side note, this should be excluded from triage. Was it found via that process, or was it just a manual audit of the failing DAGs?

Marking this as FIXED as the original issue is resolved.

Status: NEW → RESOLVED
Closed: 2 years ago
Resolution: --- → FIXED

I noticed it as part of triage, because I missed the triage/no_triage tag.
