Closed
Bug 1516014
Opened 7 years ago
Closed 6 years ago
Productionize duplicate job id on disk check
Categories
(Data Platform and Tools :: General, enhancement, P2)
Tracking
(Not tracked)
RESOLVED
WONTFIX
People
(Reporter: bugzilla, Unassigned)
References
Details
(Whiteboard: [DataPlatform])
I hacked up a script for Bug 1515730 that checks parquet file names on disk for duplicate job IDs. It works for most of our cases, but the parquet output file names don't seem to be consistent everywhere. We need to:
- Understand why parquet job IDs differ in some cases in normal runs
- Clean up the script (needs some refactoring, etc.)
- Run this over historical data
- Schedule this to run periodically
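For illustration only, here is a minimal sketch of what such a check could look like. It is not the script from Bug 1515730. It assumes Spark-style output names of the form `part-<task>-<job uuid>[-c<N>][.<codec>].parquet` and treats a directory holding files from more than one job ID as suspicious; both the naming pattern and that interpretation of "duplicate" are assumptions, and the function names are hypothetical. Files that don't match the pattern are returned separately, since inconsistent naming is exactly the open question above.

```python
import re
from collections import defaultdict
from pathlib import PurePosixPath

# ASSUMPTION: Spark-style names, e.g.
#   part-00000-<uuid>-c000.snappy.parquet
# Other writers may name files differently; those end up in `unmatched`.
PART_RE = re.compile(
    r"^part-(?P<task>\d+)-"
    r"(?P<job>[0-9a-f]{8}(?:-[0-9a-f]{4}){3}-[0-9a-f]{12})"
    r"(?:-c\d+)?(?:\.\w+)*\.parquet$"
)


def job_ids_by_directory(paths):
    """Group the job IDs parsed from file names by parent directory.

    Returns (mapping of directory -> set of job IDs, list of paths whose
    file name did not match the assumed pattern).
    """
    seen = defaultdict(set)
    unmatched = []
    for raw in paths:
        path = PurePosixPath(raw)
        match = PART_RE.match(path.name)
        if match is None:
            unmatched.append(raw)
            continue
        seen[str(path.parent)].add(match.group("job"))
    return dict(seen), unmatched


def directories_with_multiple_jobs(paths):
    """Return only the directories that contain files from more than one job ID."""
    seen, _ = job_ids_by_directory(paths)
    return {d: sorted(jobs) for d, jobs in seen.items() if len(jobs) > 1}
```

The functions take plain path strings, so the listing could come from a local walk or an object-store listing. Reporting unmatched names alongside the flagged directories would also help with the first item in the list above.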
Updated•7 years ago
Points: --- → 2
Priority: -- → P2
Updated•7 years ago
Whiteboard: [DataPlatform]
Status: NEW → RESOLVED
Closed: 6 years ago
Resolution: --- → WONTFIX
Updated•3 years ago
Component: Datasets: General → General