Closed Bug 1304439 Opened 9 years ago Closed 9 years ago

Create a parquet lua module

Categories

(Cloud Services Graveyard :: Metrics: Pipeline, defect, P1)

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: trink, Assigned: trink)

No description provided.
Assignee: nobody → mtrinkala
Blocks: 1304412
Points: --- → 3
Priority: -- → P2
Status: NEW → RESOLVED
Closed: 9 years ago
Resolution: --- → DUPLICATE
No longer blocks: 1304412
The high-level tasks for the initial version of the parquet module follow:
- define writer API
- implement schema parser (data output by the parquet-dump-schema utility)
- implement schema object
  - record creation
  - column creation
  - finalize
- implement a loader to iterate the schema parse table and create the corresponding schema object
- implement writer property builder
  - support for the following properties
    - global
      - enable_dictionary (bool)
      - dictionary_pagesize_limit (size_t)
      - write_batch_size (size_t)
      - data_pagesize (size_t)
      - version ("1.0", "2.0")
      - created_by (string)
      - encoding ("plain", "rle")
      - compression ("uncompressed", "snappy", "gzip", "lzo", "brotli")
      - enable_statistics (bool)
    - column_properties
      - col_name1
        - enable_dictionary (bool)
        - encoding ("plain", "rle")
        - compression ("uncompressed", "snappy", "gzip", "lzo", "brotli")
        - enable_statistics (bool)
      - col_nameN
- implement dremel column striping
- implement writer append_row (column output from each message)
- implement writer close (outputs the row group and closes the file)
  - Note: this design will only allow one row group per file
- testing/documentation
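
To make the writer property list above concrete, here is a minimal sketch of how such a property table might be validated before being handed to a writer. It is written in Python purely to illustrate the logic; it is not the Lua module's code. The property names and allowed values are taken from the list above. The function name validate_properties, the table layout (a top-level dict with an optional "column_properties" key), and the treatment of size_t as a non-negative integer are assumptions made for this example.

```python
# Hypothetical sketch: validate_properties and the dict layout are
# illustrative assumptions, not the parquet Lua module's actual API.
ENCODINGS = {"plain", "rle"}
COMPRESSIONS = {"uncompressed", "snappy", "gzip", "lzo", "brotli"}
VERSIONS = {"1.0", "2.0"}

# Properties accepted both globally and per column.
COLUMN_SPEC = {
    "enable_dictionary": bool,
    "encoding": ENCODINGS,
    "compression": COMPRESSIONS,
    "enable_statistics": bool,
}

# Global level: the column properties plus the global-only ones.
# size_t is modelled here as a non-negative int (an assumption).
GLOBAL_SPEC = dict(
    COLUMN_SPEC,
    dictionary_pagesize_limit=int,
    write_batch_size=int,
    data_pagesize=int,
    version=VERSIONS,
    created_by=str,
)


def _check(name, value, rule):
    """Raise ValueError if value does not satisfy rule."""
    if isinstance(rule, set):
        # Enumerated string property.
        if not isinstance(value, str) or value not in rule:
            raise ValueError(f"{name}: expected one of {sorted(rule)}")
    elif rule is int:
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, int) or value < 0:
            raise ValueError(f"{name}: expected a non-negative integer")
    elif not isinstance(value, rule):
        raise ValueError(f"{name}: expected {rule.__name__}")


def validate_properties(props):
    """Split props into (global_props, column_props), validating both."""
    global_props, column_props = {}, {}
    for name, value in props.items():
        if name == "column_properties":
            if not isinstance(value, dict):
                raise ValueError("column_properties: expected a table")
            for col, cprops in value.items():
                for cname, cvalue in cprops.items():
                    if cname not in COLUMN_SPEC:
                        raise ValueError(f"{col}.{cname}: unknown property")
                    _check(f"{col}.{cname}", cvalue, COLUMN_SPEC[cname])
                column_props[col] = dict(cprops)
        elif name in GLOBAL_SPEC:
            _check(name, value, GLOBAL_SPEC[name])
            global_props[name] = value
        else:
            raise ValueError(f"{name}: unknown property")
    return global_props, column_props
```

In this sketch a per-column entry may only carry the four properties listed under col_name1, so passing, say, version inside a column table is rejected rather than silently ignored.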
Blocks: 1304412
Priority: P2 → P1
Status: RESOLVED → REOPENED
Resolution: DUPLICATE → ---
Blocks: 1317385
Status: REOPENED → RESOLVED
Closed: 9 years ago
Resolution: --- → FIXED
Product: Cloud Services → Cloud Services Graveyard