Closed Bug 1418674 Opened 8 years ago Closed 8 years ago

Error querying Athena: HIVE_PARTITION_SCHEMA_MISMATCH

Categories

(Data Platform and Tools Graveyard :: Operations, defect)

Type: defect
Priority: Not set
Severity: normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: mreid, Assigned: robotblake)

Details

(Whiteboard: [SvcOps])

Running the query at https://sql.telemetry.mozilla.org/queries/48512/source#130992 fails with:
Error running query: HIVE_PARTITION_SCHEMA_MISMATCH: There is a mismatch between the table and partition schemas. The types are incompatible and cannot be coerced. The column 'locale' in table 'telemetry.main_summary' is declared as type 'string', but partition 'submission_date_s3=20171117/sample_id=42' declared column 'attribution' as type 'struct<source:string,medium:string,campaign:string,content:string>'.
Blake looked into this over the weekend. It appears to be a problem with the AWS Glue Catalog, and seems to be a result of schema evolution (namely the removal of the e10s_cohort field in bug 1413515). I tried switching the above query back to the "Presto" data source, but got an error about the query exceeding the max size of 30GB.
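For anyone trying to read the error message: the fact that it pairs two differently named columns ('locale' and 'attribution') suggests the table and partition schemas are being compared by position rather than by name. The following is a minimal sketch of that kind of check, not Athena's or Glue's actual implementation. The function name, the column order, and the `client_id` column are assumptions made up for illustration; only the names `e10s_cohort`, `locale`, and `attribution` and their types come from this bug.

```python
from typing import List, Optional, Tuple

# A column is a (name, type) pair; a schema is an ordered list of columns.
Column = Tuple[str, str]


def first_positional_mismatch(
    table: List[Column], partition: List[Column]
) -> Optional[str]:
    """Compare two schemas position by position and describe the first
    pair of columns whose types differ. Returns None if the overlapping
    columns all agree. (Hypothetical helper, for illustration only.)"""
    for index, ((t_name, t_type), (p_name, p_type)) in enumerate(
        zip(table, partition)
    ):
        if t_type != p_type:
            return (
                f"position {index}: table column '{t_name}' is '{t_type}', "
                f"but partition column '{p_name}' is '{p_type}'"
            )
    return None


ATTRIBUTION_TYPE = (
    "struct<source:string,medium:string,campaign:string,content:string>"
)

# Assumed layout: the table schema still lists e10s_cohort ...
table_schema: List[Column] = [
    ("client_id", "string"),      # hypothetical column
    ("e10s_cohort", "string"),
    ("locale", "string"),
    ("attribution", ATTRIBUTION_TYPE),
]

# ... while a newer partition was written after the field was removed,
# so every later column sits one position earlier.
partition_schema: List[Column] = [
    ("client_id", "string"),      # hypothetical column
    ("locale", "string"),
    ("attribution", ATTRIBUTION_TYPE),
]

print(first_positional_mismatch(table_schema, partition_schema))
```

Under these assumptions, the pair of string columns at the position of the removed field passes the check unnoticed, and the first reported difference is 'locale' against 'attribution', which matches the shape of the error above.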
Update: it appears the "main_summary_v4" table is working properly in Athena, so there are now two possible workarounds:
1. Change queries to use the "main_summary_v4" table instead of "main_summary".
2. Change the data source to Presto.
These should both be considered temporary until the underlying problem is resolved.
This has been fixed, but Blake wants to continue looking at the underlying cause.
Assignee: nobody → bimsland
Whiteboard: [SvcOps]
Status: NEW → RESOLVED
Closed: 8 years ago
Resolution: --- → FIXED
Product: Data Platform and Tools → Data Platform and Tools Graveyard