Create EMR bootstrap scripts for Presto.

RESOLVED FIXED

Status

Cloud Services
Metrics: Pipeline
RESOLVED FIXED
2 years ago
2 years ago

People

(Reporter: rvitillo, Assigned: rvitillo)

Tracking

Firefox Tracking Flags

(Not tracked)

Details

The bootstrap scripts should also setup redash and scan/load Hive datasets stored within the parquet bucket.
(Assignee)

Comment 1

2 years ago
See https://github.com/vitillo/emr-bootstrap-presto. I have tried airpal as well but it doesn't seem to support arrays, maps and structs, which are used by our longitudinal dataset.
Status: NEW → RESOLVED
Last Resolved: 2 years ago
Resolution: --- → FIXED
(Assignee)

Comment 2

2 years ago
Note that some changes [1] were required to PyHive, the Python interface to Presto used by redash, to display correctly structs.

[1] https://github.com/vitillo/PyHive/commit/26a565c88e6efffbc1997dcee6630c43d166355e
You need to log in before you can comment on or make changes to this bug.