Bug 1257615 Opened 9 years ago Closed 9 years ago

Hive metastore can't import Parquet tables with binary column

Categories

(Cloud Services Graveyard :: Metrics: Pipeline, defect, P2)


Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: rvitillo, Assigned: whd)

References

Details

(Whiteboard: [SvcOps])

Even though Hive 1.0 supports binary columns, the following statement fails:

create external table KPI(
    activityDate string,
    normalizedChannel string,
    country string,
    version string,
    e10sEnabled boolean,
    e10sCohort string,
    hll binary
)
stored as parquet
location 's3://telemetry-test-bucket/KPI/v2016030120160313';

FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. java.lang.UnsupportedOperationException: Unknown field type: binary
Apparently Hive 1.0 doesn't support the binary data type for Parquet, but 1.1 does: https://issues.apache.org/jira/browse/HIVE-7073. Wesley, could you upgrade our Hive metastore? That might also fix Bug 1255457.
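For reference, once the metastore runs Hive 1.1+ (per HIVE-7073) the CREATE above should go through, and a minimal read exercises the binary column on the scan path as well as at DDL time. A sketch, assuming the table from comment 0; this is not output captured from our setup:

-- the DDL succeeding is the fix itself; this read additionally checks the scan path
select hll from KPI limit 10;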
Flags: needinfo?(whd)
Assignee: nobody → whd
Points: --- → 3
Flags: needinfo?(whd)
Priority: -- → P1
I've got Hive updated on the current "prod" Presto instance. With 1.1 I ran into https://issues.apache.org/jira/browse/HIVE-10831, but 1.2 seems to be working so far without issue; the example CREATE from comment 1, for instance, succeeded. It's hacked together right now, so I'll file a PR to "operationalize" it, assuming it continues to work as intended.
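For posterity, a hedged sketch of the verification on the upgraded 1.2 instance (standard HiveQL; the expected listing is an assumption, not captured output):

-- the schema should round-trip through the metastore with hll typed 'binary'
describe formatted KPI;

-- a full scan confirms the Parquet files under the S3 location are readable
select count(*) from KPI;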
Not happening this sprint; back to P2 until :whd comes up for air after ops-related activities.
Priority: P1 → P2
Whiteboard: [SvcOps]
Status: NEW → RESOLVED
Closed: 9 years ago
Resolution: --- → FIXED
Product: Cloud Services → Cloud Services Graveyard