Closed Bug 1361097 Opened 8 years ago Closed 7 years ago

Convert socorro crash schema to use snake_case

Categories

(Data Platform and Tools :: General, enhancement, P3)

x86_64
Linux
enhancement

Tracking

(Not tracked)

RESOLVED WONTFIX

People

(Reporter: amiyaguchi, Unassigned)

Details

The Socorro crash importer [1] currently builds its schema from a JSON schema. Athena is an AWS-hosted Presto service that will most likely succeed our EMR-hosted Presto cluster. To support the migration to Athena, all camelCase column names should be converted to snake_case. The renaming should be handled during construction of the PySpark schema.

[1] https://github.com/mozilla-services/data-pipeline/blob/master/reports/socorro_import/ImportCrashData.ipynb
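
A minimal sketch of the kind of renaming requested here, not the importer's actual code. It assumes the source schema is a plain JSON-schema dict with nested "properties" and "items" keys. The helper names (to_snake_case, snake_case_properties) and the sample field names are hypothetical. The same name conversion could be applied to each field name at the point where the PySpark schema is built.

```python
import re

# Hypothetical helper. Matches the zero-width boundary between a lowercase
# letter or digit and an uppercase letter ("buildId"), and the boundary
# before the last capital of an acronym run ("HTTPResponse").
_BOUNDARY = re.compile(r"(?<=[a-z0-9])(?=[A-Z])|(?<=[A-Z])(?=[A-Z][a-z])")


def to_snake_case(name):
    """Convert a camelCase name to snake_case; snake_case input is unchanged."""
    return _BOUNDARY.sub("_", name).lower()


def snake_case_properties(schema):
    """Return a copy of a JSON-schema fragment with property names converted.

    Assumption: only "properties", "required", and dict-valued "items" carry
    or contain column names; every other key is copied through as-is.
    """
    result = dict(schema)
    if "properties" in schema:
        result["properties"] = {
            to_snake_case(key): snake_case_properties(value)
            for key, value in schema["properties"].items()
        }
    if "required" in schema:
        result["required"] = [to_snake_case(key) for key in schema["required"]]
    if isinstance(schema.get("items"), dict):
        result["items"] = snake_case_properties(schema["items"])
    return result


# Illustrative schema fragment; the field names are made up for the example.
example = {
    "type": "object",
    "required": ["buildId"],
    "properties": {
        "buildId": {"type": "string"},
        "cpuInfo": {
            "type": "object",
            "properties": {"coreCount": {"type": "integer"}},
        },
    },
}
converted = snake_case_properties(example)
```

Converting names on the schema dict before any Spark types are created keeps the change in one place, and the input dict is left unmodified.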
Priority: -- → P2
Anthony - any idea who should own this? Is this active this quarter or should we downgrade to P[3,4]?
I would classify this as a good first bug (as in anyone could take it); the priority is probably P3.
Priority: P2 → P3
This is being done automatically when the data is read into the metastore.
Status: NEW → RESOLVED
Closed: 7 years ago
Resolution: --- → WONTFIX
Component: Datasets: General → General