Closed Bug 1856366 Opened 1 year ago Closed 1 year ago

Airflow task copy_deduplicate.copy_deduplicate_main_ping failed for exec_date 2023-09-29

Categories

(Data Platform and Tools :: General, defect)

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: anicholson, Assigned: relud)

Details

(Whiteboard: [airflow-triage])

Airflow task copy_deduplicate.copy_deduplicate_main_ping failed for exec_date 2023-09-29

Task link:
https://workflow.telemetry.mozilla.org/log?dag_id=copy_deduplicate&task_id=copy_deduplicate_main_ping&execution_date=2023-09-29T01%3A00%3A00%2B00%3A00&map_index=-1

Incompatible table schemata error between tmp tables:

Log extract:

[2023-09-30, 04:50:56 UTC] {pod_manager.py:235} INFO - google.api_core.exceptions.BadRequest: 400 Incompatible table schemata between tables: moz-fx-data-shared-prod:tmp.anonc5eaaa3f20724fc4b4cf63bfba9cec48$20230929, moz-fx-data-shared-prod:tmp.anon1265ab95a17e46fcb571abf91ffd8f0d$20230929, moz-fx-data-shared-prod:tmp.anon13aa80708e834eb3b8fe0d09e31c3a3b$20230929, moz-fx-data-shared-prod:tmp.anon48a5db4053584d23b4a69b7a0ad057cd$20230929, moz-fx-data-shared-prod:tmp.anon554390e0dd124ca19ee5f9a498470a10$20230929, moz-fx-data-shared-prod:tmp.anon51bb02c2143f4967a9db413832f90c07$20230929, moz-fx-data-shared-prod:tmp.anona2f6392ab31d40fc8660a283f0c7ce55$20230929, moz-fx-data-shared-prod:tmp.anona93bb32a8d1e4734890a0d0f8ca7d06a$20230929, moz-fx-data-shared-prod:tmp.anon7db5003dda9b4e4692d04e635da30321$20230929, moz-fx-data-shared-prod:tmp.anon518df59f1a00464e9eb6571c51a68753$20230929, moz-fx-data-shared-prod:tmp.anonfe1ff2ad055245e4a69994728f54a93b$20230929, moz-fx-data-shared-prod:tmp.anon1da2b70d625b4cfc927c844f484e6e18$20230929, moz-fx-data-shared-prod:tmp.anonce0d0943c45b4c418fe9a601b54bcd47$20230929, moz-fx-data-shared-prod:tmp.anon6f1c501e08eb479ba43ca416c295cf76$20230929, moz-fx-data-shared-prod:tmp.anon821ed43978764ff0b11a8c9579c4f0df$20230929, moz-fx-data-shared-prod:tmp.anon5ecc980fff7e4122bf2040494d89a8f9$20230929, moz-fx-data-shared-prod:tmp.anon1f0c66eae3db458999389295d64d9e5b$20230929, moz-fx-data-shared-prod:tmp.anon301c6b0c3ae34b508a5f8977e4554c53$20230929, moz-fx-data-shared-prod:tmp.anon37fe6858369d4233b161d940c3d81c54$20230929, moz-fx-data-shared-prod:tmp.anon79c274a78c17461c98cf6e624688aa88$20230929.
Assignee: nobody → dthorn

On attempt 2, main_use_counter_v4 and main_v5 succeeded; on attempt 3, main_v4 and main_v5 succeeded. I checked that all three tables have the same number of rows for 2023-09-29, as they should.
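For reference, a row-count check like the one described above could be sketched as follows. This is a minimal sketch, not the query that was actually run: the dataset name and the `submission_timestamp` column are assumptions, and `row_count_query` and `counts_match` are hypothetical helpers.

```python
from datetime import date

# Assumed names: the dataset and the partitioning column are illustrative
# guesses, not taken from this bug.
DATASET = "moz-fx-data-shared-prod.telemetry_stable"
TABLES = ["main_v4", "main_use_counter_v4", "main_v5"]


def row_count_query(day: date, tables=TABLES, dataset=DATASET) -> str:
    """Build one query returning a row count per table for a single day."""
    selects = [
        f"SELECT '{table}' AS table_name, COUNT(*) AS n_rows\n"
        f"FROM `{dataset}.{table}`\n"
        f"WHERE DATE(submission_timestamp) = '{day.isoformat()}'"
        for table in tables
    ]
    return "\nUNION ALL\n".join(selects)


def counts_match(counts: dict) -> bool:
    """True when every table reports the same row count."""
    return len(set(counts.values())) <= 1
```

The query text from `row_count_query(date(2023, 9, 29))` can be pasted into the BigQuery console, and the resulting counts passed to `counts_match` as a `{table_name: n_rows}` dict.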

This means the task has effectively succeeded, so I cleared the task and all downstream tasks, then marked it successful.

As for the explanation: a schema deploy likely overlapped with the task, so the first slices had a different schema than the last slices.
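To illustrate the suspected failure mode, here is a minimal sketch of how slices written before and after a schema deploy end up incompatible. It assumes a simplified model where a schema is a flat list of (field name, field type) pairs and slices must match exactly; BigQuery's real compatibility rules are more involved, and the slice names and fields below are hypothetical.

```python
from collections import defaultdict


def group_slices_by_schema(slice_schemas):
    """Map each distinct schema to the slices that produced it.

    slice_schemas: dict of slice name -> list of (field_name, field_type).
    """
    groups = defaultdict(list)
    for slice_name, fields in slice_schemas.items():
        groups[tuple(fields)].append(slice_name)
    return dict(groups)


def schemas_compatible(slice_schemas):
    """True when all slices share one schema, so they can be combined."""
    return len(group_slices_by_schema(slice_schemas)) <= 1


# Hypothetical example: a deploy adds a field between two slices.
before = [("document_id", "STRING"), ("payload", "RECORD")]
after = before + [("new_field", "STRING")]
slices = {"slice_00": before, "slice_01": before, "slice_19": after}
```

Here `schemas_compatible(slices)` is false because `slice_19` was written after the hypothetical deploy, which mirrors the "Incompatible table schemata" error when the temporary tables are combined.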

This doesn't happen on other tables because they aren't copied in slices. This task hasn't previously failed for this reason because it used to handle only one table, so retries were sufficient.

This should probably be mitigated by moving main_v4, main_use_counter_v4, and main_v5 to separate copy deduplicate tasks.
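A rough sketch of what that split buys, assuming one task per table: a failure then only requires retrying the task for the affected table. The task-id scheme and both helpers are hypothetical and are not the actual DAG definition; the table spelling main_use_counter_v4 follows the earlier comment.

```python
def split_copy_deduplicate_tasks(tables, prefix="copy_deduplicate"):
    """Return a mapping of task id -> tables handled, one table per task."""
    return {f"{prefix}_{table}": [table] for table in tables}


def tasks_to_retry(tasks, failed_tables):
    """List the task ids that handle at least one failed table."""
    failed = set(failed_tables)
    return [task_id for task_id, tables in tasks.items() if failed & set(tables)]


# One combined task: any failing table forces a retry of all three copies.
combined = {"copy_deduplicate_main_ping": ["main_v4", "main_use_counter_v4", "main_v5"]}

# Separate tasks: only the failing table is retried.
separate = split_copy_deduplicate_tasks(["main_v4", "main_use_counter_v4", "main_v5"])
```

With the combined task, a failure on main_v4 alone reruns everything; with separate tasks, `tasks_to_retry(separate, ["main_v4"])` selects only `copy_deduplicate_main_v4`.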

Status: NEW → RESOLVED
Closed: 1 year ago
Resolution: --- → FIXED