Closed
Bug 1296302
Opened 8 years ago
Closed 6 years ago
Increase speed of parquet2hive when --success-only flag is set
Categories
(Data Platform and Tools Graveyard :: Operations, defect, P3)
Tracking
(Not tracked)
RESOLVED
WONTFIX
People
(Reporter: frank, Unassigned)
References
Details
parquet2hive currently checks every partition it encounters for a _SUCCESS file, which makes it slow for datasets partitioned on multiple dimensions. This could be improved by reducing client/server round trips and by using multiprocessing.
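A minimal sketch of the two ideas, not parquet2hive's actual code. It assumes the object keys for a dataset can be fetched in one recursive listing, so the _SUCCESS check becomes a local scan instead of one request per partition. The function names, the `has_marker` callable, and the example keys are all hypothetical, and the second function uses a thread pool rather than processes on the assumption that the per-partition checks are I/O-bound.

```python
from concurrent.futures import ThreadPoolExecutor
import posixpath

SUCCESS_MARKER = "_SUCCESS"


def successful_partitions(keys):
    """Return the partition prefixes that contain a _SUCCESS marker.

    `keys` is a flat iterable of object keys, e.g. the result of a single
    recursive listing (assumption), so no further remote calls are needed.
    """
    return {
        posixpath.dirname(key)
        for key in keys
        if posixpath.basename(key) == SUCCESS_MARKER
    }


def check_partitions_concurrently(partitions, has_marker, max_workers=8):
    """Fallback when a single listing is not possible: run the checks in parallel.

    `has_marker` is a hypothetical callable (prefix -> bool) that performs
    one remote existence check for the _SUCCESS file under that prefix.
    """
    partitions = list(partitions)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        results = list(pool.map(has_marker, partitions))
    return {prefix for prefix, ok in zip(partitions, results) if ok}


if __name__ == "__main__":
    # Hypothetical keys for a dataset partitioned on two dimensions.
    example_keys = [
        "ds/v1/day=1/ch=a/part-0.parquet",
        "ds/v1/day=1/ch=a/_SUCCESS",
        "ds/v1/day=1/ch=b/part-0.parquet",
        "ds/v1/day=2/ch=a/_SUCCESS",
    ]
    print(sorted(successful_partitions(example_keys)))
```

With the single-listing approach the number of requests no longer grows with the number of partitions. The concurrent variant still makes one request per partition but overlaps their latency.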
Updated•8 years ago
Points: --- → 2
Priority: -- → P3
Reporter
Updated•7 years ago
Component: Metrics: Pipeline → Presto
Product: Cloud Services → Data Platform and Tools
Comment 1•6 years ago
We aren't using --success-only with any of our datasets. Is this still a concern, :frank?
Flags: needinfo?(fbertsch)
Updated•6 years ago
Component: Presto → Operations
QA Contact: moconnor
Reporter
Comment 2•6 years ago
Not if we're not using it.
Status: NEW → RESOLVED
Closed: 6 years ago
Flags: needinfo?(fbertsch)
Resolution: --- → WONTFIX
Updated•1 year ago
Product: Data Platform and Tools → Data Platform and Tools Graveyard