Closed
Bug 1304100
Opened 8 years ago
Closed 8 years ago
Unknown field 'binary' in parquet2hive
Categories
(Cloud Services Graveyard :: Metrics: Pipeline, defect)
Tracking
(Not tracked)
RESOLVED
FIXED
People
(Reporter: frank, Assigned: frank)
References
Details
When using parquet2hive on a dataset that includes a binary field, it doesn't add it and reports the following error:
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. java.lang.UnsupportedOperationException: Unknown field type: binary
Comment 1•8 years ago
This can be tested on the cluster with:
parquet2hive s3://telemetry-parquet/client_count -ulv | bash
Comment 2•8 years ago
This used to work with p2h 0.2.7:
[hadoop@ip-172-31-27-197 ~]$ pip freeze | grep parquet2hive
You are using pip version 6.1.1, however version 8.1.2 is available.
You should consider upgrading via the 'pip install --upgrade pip' command.
parquet2hive==0.2.7
[hadoop@ip-172-31-27-197 ~]$ parquet2hive s3://telemetry-parquet/client_count -ulv
Analyzing dataset client_count, v2016032020160920
hive -hiveconf hive.support.sql11.reserved.keywords=false -e 'drop table if exists client_count_v2016032020160920; create external table client_count_v2016032020160920(`activity_date` string, `devtools_toolbox_opened` boolean, `loop_activity_open_panel` boolean, `normalized_channel` string, `country` string, `locale` string, `app_name` string, `app_version` string, `e10s_enabled` boolean, `e10s_cohort` string, `os` string, `os_version` string, `hll` binary) stored as parquet location '"'s3://telemetry-parquet/client_count/v2016032020160920'"'; msck repair table client_count_v2016032020160920;'
hive -e 'drop table if exists client_count; create external table client_count(`activity_date` string, `devtools_toolbox_opened` boolean, `loop_activity_open_panel` boolean, `normalized_channel` string, `country` string, `locale` string, `app_name` string, `app_version` string, `e10s_enabled` boolean, `e10s_cohort` string, `os` string, `os_version` string, `hll` binary) stored as parquet location '"'s3://telemetry-parquet/client_count/v2016032020160920'"'; msck repair table client_count;'
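To see which columns in a generated statement use the type that older Hive versions reject, the DDL can be scanned before it is piped to bash. The following is a minimal sketch, not part of parquet2hive: the helper name `columns_of_type` and the regular expression are illustrative assumptions, and the pattern only handles flat column types like the ones in the output above (not nested types such as structs or arrays).

```python
import re

# Matches `column_name` type pairs for flat (non-nested) column types only.
COLUMN_PATTERN = re.compile(r"`(\w+)`\s+(\w+)")


def columns_of_type(ddl, wanted_type):
    """Return the names of columns declared with wanted_type in a DDL string.

    Hypothetical helper for illustration; parquet2hive does not provide it.
    """
    return [
        name
        for name, col_type in COLUMN_PATTERN.findall(ddl)
        if col_type.lower() == wanted_type.lower()
    ]


# Shortened version of the statement printed above.
ddl = (
    "create external table client_count(`activity_date` string, "
    "`e10s_enabled` boolean, `hll` binary) stored as parquet"
)
print(columns_of_type(ddl, "binary"))  # prints ['hll']
```

For the full statement above, the only binary column is `hll`.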
Updated•8 years ago
Severity: normal → major
Updated•8 years ago
Assignee: nobody → fbertsch
Comment 3•8 years ago
Turns out I misunderstood what this bug is about. parquet2hive is not compatible with older Hive versions (e.g. 1.0.0), like the one installed by atmo v1. That's a non-issue, though, as we are moving to atmo v2, which comes with Hive 2.1.0.
Status: NEW → RESOLVED
Closed: 8 years ago
Resolution: --- → FIXED
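Since the failure depends on the Hive version, a cluster can be checked before the parquet2hive output is run. This is a rough sketch under stated assumptions: the function names are hypothetical, the version text is assumed to contain a dotted `major.minor.patch` number, and 2.1.0 is used as the floor only because comment 3 names it as the version shipped with atmo v2. The actual minimum compatible Hive version is not stated in this bug.

```python
import re

# Assumption: 2.1.0 is treated as known-good only because comment 3 mentions it;
# the true minimum compatible version is not given in this bug.
KNOWN_GOOD_HIVE = (2, 1, 0)


def parse_hive_version(text):
    """Extract (major, minor, patch) from text such as 'Hive 2.1.0'."""
    match = re.search(r"(\d+)\.(\d+)\.(\d+)", text)
    if match is None:
        raise ValueError("no version number found in %r" % text)
    return tuple(int(part) for part in match.groups())


def hive_is_known_good(text, minimum=KNOWN_GOOD_HIVE):
    """Return True if the version in text is at least the assumed floor."""
    return parse_hive_version(text) >= minimum


print(hive_is_known_good("Hive 1.0.0"))  # prints False
print(hive_is_known_good("Hive 2.1.0"))  # prints True
```

Comparing tuples of integers avoids the string-comparison trap where "10.0.0" would sort before "2.1.0".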
Updated•6 years ago
Product: Cloud Services → Cloud Services Graveyard