Closed Bug 1248924 Opened 8 years ago Closed 8 years ago

Create EMR bootstrap scripts for Presto.

Categories

(Cloud Services Graveyard :: Metrics: Pipeline, defect)

Type: defect
Priority: Not set
Severity: normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: rvitillo, Assigned: rvitillo)


The bootstrap scripts should also set up redash and scan and load the Hive datasets stored in the parquet bucket.
See https://github.com/vitillo/emr-bootstrap-presto. I have tried airpal as well but it doesn't seem to support arrays, maps and structs, which are used by our longitudinal dataset.
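As a rough illustration of the "scan" step, the sketch below groups object keys from a bucket listing into datasets and versions, then picks the newest version of each as the location a Hive table could point at. It is not taken from the linked repository. The key layout `<dataset>/<version>/<file>.parquet`, the function names `group_datasets` and `latest_locations`, and the bucket name are all assumptions made for the example.

```python
from collections import defaultdict


def group_datasets(keys):
    """Map dataset name -> sorted versions.

    Assumes (hypothetically) keys shaped like '<dataset>/<version>/.../<file>.parquet'.
    """
    datasets = defaultdict(set)
    for key in keys:
        parts = key.strip("/").split("/")
        # Skip anything that is not a parquet file below a dataset/version prefix.
        if len(parts) < 3 or not parts[-1].endswith(".parquet"):
            continue
        datasets[parts[0]].add(parts[1])
    return {name: sorted(versions) for name, versions in datasets.items()}


def latest_locations(keys, bucket):
    """Return dataset name -> S3 prefix of its lexicographically latest version."""
    return {
        name: f"s3://{bucket}/{name}/{versions[-1]}/"
        for name, versions in group_datasets(keys).items()
    }


if __name__ == "__main__":
    # Hypothetical listing; a real script would obtain keys from the bucket itself.
    listing = [
        "longitudinal/v20160229/part-0.parquet",
        "longitudinal/v20160301/part-0.parquet",
        "README.txt",
    ]
    print(latest_locations(listing, "example-parquet-bucket"))
```

Picking the latest version by string order only works if version names sort lexicographically, as date-stamped names do; a different naming scheme would need its own ordering.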
Status: NEW → RESOLVED
Closed: 8 years ago
Resolution: --- → FIXED
Note that some changes [1] to PyHive, the Python interface to Presto used by redash, were required to display structs correctly.

[1] https://github.com/vitillo/PyHive/commit/26a565c88e6efffbc1997dcee6630c43d166355e
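To show the general kind of problem involved, here is a minimal sketch of turning nested result values (arrays, maps, structs) into strings that a results table can display. It is not the change from the commit above and does not reproduce PyHive's internals; `render_cell` and `render_rows` are hypothetical helpers, and the assumption is that nested values arrive as Python lists and dicts.

```python
import json


def render_cell(value):
    """Return a display string for a single result cell."""
    if isinstance(value, (list, tuple, dict)):
        # Serialize nested values as JSON so arrays, maps and structs stay readable.
        return json.dumps(value, sort_keys=True, default=str)
    return "" if value is None else str(value)


def render_rows(rows):
    """Apply render_cell to every value in every row."""
    return [[render_cell(v) for v in row] for row in rows]


if __name__ == "__main__":
    # Hypothetical row: a scalar id plus a struct-like value.
    print(render_rows([(1, {"os": "Linux", "versions": [44, 45]})]))
```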
Product: Cloud Services → Cloud Services Graveyard