[emr-bootstrap-presto] Set hive configuration hive.parquet.use-column-names=true

RESOLVED FIXED

Status

P1
normal
2 years ago
a year ago

People

(Reporter: amiyaguchi, Assigned: robotblake)

Details

(Whiteboard: [SvcOps])

(Reporter)

Description

2 years ago
The Parquet data in our data lake relies on schema evolution, so the columns in a given file may not line up positionally with the table schema. The vanilla Presto configuration on EMR resolves Parquet columns by position (ordinal offset) unless `hive.parquet.use-column-names=true` is set for the Hive connector. This setting is documented in [1] and should be reflected in the emr-bootstrap-presto [2] repository.


[1] https://github.com/fbertsch/schema_evolution_exploration
[2] https://github.com/mozilla/emr-bootstrap-presto
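
To illustrate why positional resolution is a problem for evolved schemas, here is a toy sketch in plain Python. It does not use Presto or Parquet; the column names and helper functions are hypothetical and only model the difference between matching columns by offset and matching them by name.

```python
# Toy model: a "file row" is a list of (column_name, value) pairs in the
# order the file was written. All names below are hypothetical.

def read_by_position(table_columns, file_row):
    """Map table columns to file values by ordinal offset."""
    values = [value for _, value in file_row]
    return {
        col: (values[i] if i < len(values) else None)
        for i, col in enumerate(table_columns)
    }

def read_by_name(table_columns, file_row):
    """Map table columns to file values by name; missing columns become None."""
    by_name = dict(file_row)
    return {col: by_name.get(col) for col in table_columns}

# Table schema after evolution: "channel" was added in the middle.
table_columns = ["client_id", "channel", "country"]

# An older file, written before "channel" existed.
old_file_row = [("client_id", "abc"), ("country", "US")]

positional = read_by_position(table_columns, old_file_row)
named = read_by_name(table_columns, old_file_row)

print(positional)  # "US" ends up under "channel"
print(named)       # "channel" is None, "country" is "US"
```

With offsets, the old file's `country` value is read as `channel`; with names, the missing column is simply null. That is the behaviour the requested setting is meant to select.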

Updated

2 years ago
Assignee: nobody → jthomas
Priority: -- → P1
Whiteboard: [SvcOps]

Updated

2 years ago
Depends on: 1358232
This will be addressed as part of bug 1358232.
Assignee: jthomas → bimsland
(Assignee)

Comment 2

a year ago
`hive.parquet.use-column-names=true` has been added to the configs on the new cluster.
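
The actual change is not shown in this bug. As a hedged sketch, one common way to supply a Hive connector property on EMR is a configuration classification passed at cluster creation. The snippet below assumes the `presto-connector-hive` classification (which targets Presto's `hive.properties`); the function name is hypothetical, and the classification should be checked against the EMR release in use.

```python
import json

def presto_hive_configuration(use_column_names=True):
    """Build an EMR-style configuration list for the Presto Hive connector.

    Assumption: the "presto-connector-hive" classification maps to
    Presto's hive.properties on the target EMR release.
    """
    return [
        {
            "Classification": "presto-connector-hive",
            "Properties": {
                # Property values in EMR configurations are strings.
                "hive.parquet.use-column-names": "true" if use_column_names else "false",
            },
        }
    ]

configurations = presto_hive_configuration()
print(json.dumps(configurations, indent=2))
```

The resulting JSON could be saved to a file and passed to cluster creation, or kept alongside the bootstrap scripts, depending on how the cluster is provisioned.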
Status: NEW → RESOLVED
Last Resolved: a year ago
Resolution: --- → FIXED