[emr-bootstrap-presto] Set hive configuration hive.parquet.use-column-names=true

RESOLVED FIXED

Status

Product: Data Platform and Tools
Component: Presto
Priority: P1
Severity: normal
Status: RESOLVED FIXED
Opened: 10 months ago
Resolved: 6 months ago

People

(Reporter: amiyaguchi, Assigned: robotblake)

Tracking

Details

(Whiteboard: [SvcOps])

(Reporter)

Description

10 months ago
The Parquet data in our data lake is written with evolving schemas. The vanilla Presto configuration on EMR resolves Parquet columns by offset (position) rather than by name unless `hive.parquet.use-column-names=true` is specified. This configuration is documented in [1] and should be reflected in the emr-bootstrap-presto [2] repository.


[1] https://github.com/fbertsch/schema_evolution_exploration
[2] https://github.com/mozilla/emr-bootstrap-presto
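
To illustrate why offset-based resolution is a problem for evolving schemas, here is a minimal Python sketch. It is not Presto's reader code; the functions `read_by_position` and `read_by_name`, the column names, and the sample row are all hypothetical and only model the two lookup strategies.

```python
# Hypothetical illustration only; this does not reproduce Presto internals.

def read_by_position(table_columns, row):
    """Map each table column to the file value at the same ordinal position."""
    return {
        name: (row[i] if i < len(row) else None)
        for i, name in enumerate(table_columns)
    }


def read_by_name(table_columns, file_columns, row):
    """Map each table column to the file value stored under the same name."""
    by_name = dict(zip(file_columns, row))
    return {name: by_name.get(name) for name in table_columns}


# Table schema after evolution: "channel" was added between existing columns.
table_columns = ["client_id", "channel", "country"]

# An older file written before "channel" existed (made-up sample data).
old_file_columns = ["client_id", "country"]
old_row = ["abc123", "US"]

positional = read_by_position(table_columns, old_row)
named = read_by_name(table_columns, old_file_columns, old_row)

print(positional)  # "US" lands in "channel"; "country" comes back empty
print(named)       # "country" is "US"; the missing "channel" is None
```

With positional lookup the old file's `country` value is returned under `channel`, which is the kind of silent mismatch that name-based lookup avoids.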

Updated

9 months ago
Assignee: nobody → jthomas
Priority: -- → P1
Whiteboard: [SvcOps]

Updated

9 months ago
Depends on: 1358232

Comment 1

9 months ago
This will be addressed as part of bug 1358232.
Assignee: jthomas → bimsland
(Assignee)

Comment 2

6 months ago
This has been added to the configs on the new cluster.
Status: NEW → RESOLVED
Last Resolved: 6 months ago
Resolution: --- → FIXED
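
The bug does not record how the setting was applied on the new cluster. As one possible sketch, assuming the cluster is configured through EMR configuration classifications and that the `presto-connector-hive` classification is the right one for the Hive connector properties, the JSON could be generated as follows. The helper name `presto_hive_connector_config` is hypothetical.

```python
import json


def presto_hive_connector_config(use_column_names=True):
    """Build an EMR-style configuration list for the Presto Hive connector.

    Assumption: the setting is delivered via the "presto-connector-hive"
    classification; the bug itself does not confirm this mechanism.
    """
    return [
        {
            "Classification": "presto-connector-hive",
            "Properties": {
                # Property name taken from the bug title; values are strings.
                "hive.parquet.use-column-names": str(use_column_names).lower(),
            },
        }
    ]


config_json = json.dumps(presto_hive_connector_config(), indent=2)
print(config_json)
```

On a cluster that manages the Presto catalog file directly, the equivalent would be the single line `hive.parquet.use-column-names=true` in the Hive catalog properties.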