Create Vertica tables for FHR v3 rollups

RESOLVED FIXED

Status

Cloud Services
Metrics: Dashboard
RESOLVED FIXED
3 years ago
3 years ago

People

(Reporter: Katie Parlante, Assigned: sheeri)

Tracking

Firefox Tracking Flags

(Not tracked)

Details

(Reporter)

Description

3 years ago
Like https://bugzilla.mozilla.org/show_bug.cgi?id=1136012, only for FHR v3

Sheeri and Hamilton are already working on this; adding this bug for tracking.

My understanding is that it should be ready ~Monday or Tuesday.
(Assignee)

Comment 1

3 years ago
These tables are created, and I'm just finalizing tweaks for the imports - stripping quotes and skipping records that contain commas (0.004% of the data, so not a big deal to leave out):

fhr_v3_rollups_weekly
fhr_v3_rollups_monthly
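
As a rough illustration of the import tweaks mentioned above (stripping quotes and skipping comma-containing data), here is a minimal Python sketch. It is not the actual import script; the function names and the assumption that each record has already been split into a list of string values are hypothetical.

```python
def clean_fields(fields):
    """Strip double quotes from each value.

    Returns the cleaned list, or None if any value contains a comma
    (such records are skipped rather than repaired).
    """
    cleaned = []
    for value in fields:
        value = value.replace('"', '')
        if ',' in value:
            return None
        cleaned.append(value)
    return cleaned


def clean_rows(rows):
    """Clean every record; return (kept_rows, skipped_count)."""
    kept = []
    skipped = 0
    for fields in rows:
        cleaned = clean_fields(fields)
        if cleaned is None:
            skipped += 1
        else:
            kept.append(cleaned)
    return kept, skipped
```

Returning the skipped count alongside the kept rows makes it easy to check that the dropped share stays near the small fraction quoted above.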

Hamilton, the fields are named the same as the keys in the JSON file, plus a "process_date" field at the end, which is the date of the process file - e.g. data that comes from a file in the directory 2015-02-23 will have 2015-02-23 as its process date.
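
To make the field mapping concrete, here is a small Python sketch of how one JSON record could become a row with "process_date" appended. This is only an illustration: the helper names, the example keys ("geo", "channel"), and the file path are hypothetical, and it assumes one JSON object per line inside a directory named YYYY-MM-DD.

```python
import json
import os
from datetime import datetime


def process_date_from_path(path):
    """Return the YYYY-MM-DD name of the directory that contains `path`."""
    dir_name = os.path.basename(os.path.dirname(path))
    # Raises ValueError if the directory name is not a valid date.
    datetime.strptime(dir_name, "%Y-%m-%d")
    return dir_name


def record_to_row(json_line, path):
    """Map one JSON object to a column -> value dict.

    Column names are the JSON keys as-is; process_date is added last.
    """
    row = json.loads(json_line)
    row["process_date"] = process_date_from_path(path)
    return row


# Hypothetical record and path, for illustration only.
example = record_to_row('{"geo": "US", "channel": "release"}',
                        "data/2015-02-23/part-00000.json")
```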

Let me know if that doesn't make sense.

The weekly table is almost fully populated with 2/23 data. My script is going very slowly, and I'll work to make it batch-load better, but for now, there are plenty of rows in fhr_v3_rollups_weekly. It may be another day before all the data is loaded into the monthly table.
(Assignee)

Comment 2

3 years ago
There are currently about 1.3 million rows in the weekly table, which is about 90% done; after that, the load will start filling in the monthly table.

Comment 3

3 years ago
Sounds great - thanks, sheeri! At that point we can talk about making views (or whatever it is that you do with Saptarshi's tables to make a single current view).

Updated

3 years ago
Blocks: 1135930
(Assignee)

Comment 4

3 years ago
The monthly table has a bunch of data populated now if you want to start playing around with the data. It's not "rolled up" yet though.
(Assignee)

Comment 5

3 years ago
I have the batch script working great - each file takes about 30 seconds to import, as opposed to more than 30 minutes per file. The 2/23 data is fully loaded in and 2/16 will be done in about 15 minutes.

Now that the batch process is down pat (and btw, I added a file_name field to the Vertica tables so debugging is even easier), let's talk about next steps - a daily table? rollups?
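
For anyone curious what per-file batch loading can look like, here is a hedged Python sketch. It is not the actual script: it assumes the speedup comes from writing each source file's rows to one delimited staging file and loading it with a single Vertica COPY ... FROM LOCAL statement, and it assumes file_name is the last column. The function names and the staging-file approach are hypothetical, and the sketch only builds the SQL string; it does not connect to a database.

```python
import os


def write_batch_file(rows, source_path, out_path, delimiter="|"):
    """Write all rows from one source file to one delimited staging file.

    The source file's base name is appended to every row as file_name
    (assumed here to be the last column). Returns the number of rows written.
    """
    file_name = os.path.basename(source_path)
    count = 0
    with open(out_path, "w") as out:
        for row in rows:
            values = [str(v) for v in row] + [file_name]
            # A value containing the delimiter would corrupt the row layout.
            if any(delimiter in v for v in values):
                raise ValueError("value contains delimiter: %r" % (values,))
            out.write(delimiter.join(values) + "\n")
            count += 1
    return count


def copy_statement(table, staging_path, delimiter="|"):
    """Build one COPY statement that loads the whole staging file at once."""
    return "COPY {} FROM LOCAL '{}' DELIMITER '{}'".format(
        table, staging_path, delimiter)
```

Carrying file_name on every row means a bad value in the table can be traced back to the exact input file it came from.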

We should probably set up a meeting for this...
(Assignee)

Updated

3 years ago
Status: NEW → RESOLVED
Last Resolved: 3 years ago
Resolution: --- → FIXED