Closed Bug 1759019 Opened 3 years ago Closed 3 years ago

Shredder overzealously trimmed partitions that were being copy-deduped

Categories

(Data Platform and Tools :: General, task, P1)

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: chutten, Assigned: relud)

Attachments

(2 files)

Turns out BQ doesn't fail us in the middle of a Shredder operation if the table changes out from underneath us. Who knew!

This bug is about making Shredder logic more robust in the face of BQ not caring about things like this.

Attached file race_test.sh

The issue arises when handling deletion requests for a whole table in a single statement (for tables over 10TiB). In a DELETE statement you can say WHERE client_id IN (...) AND DATE(submission_timestamp) < "2022-03-07" to delete from all partitions before 2022-03-07 in a single statement. In a SELECT statement, however, WHERE (client_id IN (...)) IS NOT TRUE AND DATE(submission_timestamp) < "2022-03-07" returns only rows from before 2022-03-07, so when the result replaces the whole table, all partitions on or after 2022-03-07 are dropped.

This was missed because, for most tables, Shredder handles deletion requests for a single partition per statement. In that case the DELETE statement uses WHERE client_id IN (...) AND DATE(submission_timestamp) = "2022-03-06" and the SELECT statement uses WHERE (client_id IN (...)) IS NOT TRUE AND DATE(submission_timestamp) = "2022-03-06". Both do the correct thing, because the SELECT statement has a destination "table" of a single partition.
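To make the difference concrete, here is a minimal sketch in Python that models the three cases on an in-memory stand-in for a date-partitioned table. It is not Shredder code and does not talk to BigQuery. The function names, the dict-of-partitions layout, and the sample rows are all assumptions made for illustration.

```python
from datetime import date

# Hypothetical stand-in for a date-partitioned table:
# partition date -> list of (client_id, payload) rows.

def delete_statement(table, client_ids, before):
    """Model of DELETE ... WHERE client_id IN (...) AND DATE(submission_timestamp) < before.
    Rows that do not match the predicate are left untouched."""
    return {
        day: [row for row in rows if not (row[0] in client_ids and day < before)]
        for day, rows in table.items()
    }

def select_overwrite_table(table, client_ids, before):
    """Model of SELECT ... WHERE (client_id IN (...)) IS NOT TRUE
    AND DATE(submission_timestamp) < before, with the result replacing
    the whole table: only rows the query returns survive."""
    result = {}
    for day, rows in table.items():
        kept = [row for row in rows if row[0] not in client_ids and day < before]
        if kept:
            result[day] = kept
    return result

def select_overwrite_partition(table, client_ids, day):
    """Model of the same SELECT with DATE(submission_timestamp) = day and a
    single partition as the destination: other partitions are not written."""
    result = dict(table)
    result[day] = [row for row in table.get(day, []) if row[0] not in client_ids]
    return result

table = {
    date(2022, 3, 6): [("a", "ping1"), ("b", "ping2")],
    date(2022, 3, 7): [("a", "ping3"), ("c", "ping4")],
}
cutoff = date(2022, 3, 7)

deleted = delete_statement(table, {"a"}, cutoff)
overwritten = select_overwrite_table(table, {"a"}, cutoff)
per_partition = select_overwrite_partition(table, {"a"}, date(2022, 3, 6))

# The whole-table overwrite loses the 2022-03-07 partition entirely.
print(sorted(deleted), sorted(overwritten), sorted(per_partition))
```

In this model the DELETE and the single-partition SELECT agree, while the whole-table SELECT silently drops every partition the date filter excludes.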

> Turns out BQ doesn't fail us in the middle of a Shredder operation if the table changes out from underneath us. Who knew!

This is still true, though. The attached test showed that BQ does not protect against conflicting writes or race conditions.
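The kind of conflict described here can be sketched as a lost update: one job reads a table, another job writes to it, and the first job then overwrites the table from its stale read. The Python below is a toy model only. The `VersionedTable` class, the version counter, and the guard are assumptions for illustration, not BigQuery behavior and not the fix that was applied to Shredder.

```python
class ConflictError(Exception):
    """Raised when the table changed between the read and the write."""

class VersionedTable:
    """Hypothetical table with a counter that increases on every write."""
    def __init__(self, rows):
        self.rows = list(rows)
        self.version = 0

    def overwrite(self, rows):
        self.rows = list(rows)
        self.version += 1

def shred_unguarded(table, client_ids, concurrent_write=None):
    snapshot = list(table.rows)            # read phase
    if concurrent_write:
        concurrent_write(table)            # another job writes in between
    # Write phase uses the stale snapshot, so the other job's rows are lost.
    table.overwrite([row for row in snapshot if row[0] not in client_ids])

def shred_guarded(table, client_ids, concurrent_write=None):
    snapshot, seen_version = list(table.rows), table.version
    if concurrent_write:
        concurrent_write(table)
    # Refuse to write if anything changed since the read. In a real system
    # this check and the write would have to happen atomically.
    if table.version != seen_version:
        raise ConflictError("table changed during the operation")
    table.overwrite([row for row in snapshot if row[0] not in client_ids])

def dedupe_job(table):
    # Stand-in for a concurrent writer such as a copy-dedupe job.
    table.overwrite(table.rows + [("c", "new")])

unguarded = VersionedTable([("a", "p1"), ("b", "p2")])
shred_unguarded(unguarded, {"a"}, dedupe_job)

guarded = VersionedTable([("a", "p1"), ("b", "p2")])
try:
    shred_guarded(guarded, {"a"}, dedupe_job)
    conflict_detected = False
except ConflictError:
    conflict_detected = True
```

The unguarded run ends without the row the concurrent job added, while the guarded run keeps that row and reports the conflict so the operation can be retried.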

The following five tables were impacted by this at 2022-03-18 21:54 UTC:

moz-fx-data-shared-prod.regrets_reporter_ucs_stable.events_v1
moz-fx-data-shared-prod.org_mozilla_bergamot_stable.events_v1
moz-fx-data-shared-prod.mozillavpn_stable.events_v1
moz-fx-data-shared-prod.glean_dictionary_stable.events_v1
moz-fx-data-shared-prod.firefox_translations_stable.events_v1

Never mind, those are all new tables that have no pings in them at all.

Status: NEW → RESOLVED
Closed: 3 years ago
Resolution: --- → FIXED
See Also: → 1767487