Possible to re-export some old FxA flow data?

RESOLVED FIXED

Status

Product: Cloud Services
Component: Metrics: Pipeline
Priority: P1
Severity: trivial
Status: RESOLVED FIXED
a year ago
a year ago

People

(Reporter: pb, Assigned: whd)

Tracking

Firefox Tracking Flags

(Not tracked)

Details

(Reporter)

Description

a year ago
I've marked this as severity:trivial because if there's any significant effort involved at all for whatever reason, it's probably not worth fixing.

But anyway, I accidentally clobbered some of the old flow data CSVs this weekend when the instance I was running a script on ran out of disk space and wrote zero length files to S3. In a predictably short-sighted example of best-case thinking on my part, the script wasn't checking $? before writing.
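
For illustration, the missing guard could look something like the sketch below. It is not the script from this bug: `run_export` and `safe_to_upload` are hypothetical names, and the export command is whatever produces the CSV. The idea is simply to refuse to upload unless the command exited with status 0 and left a non-empty file behind.

```python
import os
import subprocess
from typing import Sequence


def safe_to_upload(returncode: int, path: str) -> bool:
    """Return True only if the export succeeded and produced a non-empty file."""
    return returncode == 0 and os.path.isfile(path) and os.path.getsize(path) > 0


def run_export(command: Sequence[str], out_path: str) -> bool:
    """Run a (hypothetical) export command, capturing its stdout in out_path.

    Returns True when the result looks safe to upload, False otherwise.
    The caller should skip the S3 write whenever this returns False.
    """
    with open(out_path, "wb") as out:
        result = subprocess.run(list(command), stdout=out)
    return safe_to_upload(result.returncode, out_path)
```

The size check catches the case where a command exits cleanly but writes nothing; the exit-status check is the equivalent of testing `$?` in a shell script.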

Fortunately, I spotted one of the errors before it got too far, so only a handful of CSVs were clobbered. And none of them are recent enough to be in the 3-month window visible in our main view of the data in redash.

The affected CSVs were:

s3://net-mozaws-prod-us-west-2-pipeline-analysis/fxa-flow/data/flow-2016-09-30.csv
s3://net-mozaws-prod-us-west-2-pipeline-analysis/fxa-flow/data/flow-2016-10-01.csv
s3://net-mozaws-prod-us-west-2-pipeline-analysis/fxa-flow/data/flow-2016-10-02.csv
s3://net-mozaws-prod-us-west-2-pipeline-analysis/fxa-flow/data/flow-2016-10-03.csv
s3://net-mozaws-prod-us-west-2-pipeline-analysis/fxa-flow/data/flow-2016-10-04.csv
s3://net-mozaws-prod-us-west-2-pipeline-analysis/fxa-flow/data/flow-2016-10-05.csv
s3://net-mozaws-prod-us-west-2-pipeline-analysis/fxa-flow/data/flow-2016-10-06.csv
s3://net-mozaws-prod-us-west-2-pipeline-analysis/fxa-flow/data/flow-2016-10-07.csv
s3://net-mozaws-prod-us-west-2-pipeline-analysis/fxa-flow/data/flow-2016-10-08.csv
s3://net-mozaws-prod-us-west-2-pipeline-analysis/fxa-flow/data/flow-2016-10-09.csv
s3://net-mozaws-prod-us-west-2-pipeline-analysis/fxa-flow/data/flow-2016-10-10.csv
s3://net-mozaws-prod-us-west-2-pipeline-analysis/fxa-flow/data/flow-2016-10-11.csv
s3://net-mozaws-prod-us-west-2-pipeline-analysis/fxa-flow/data/flow-2016-10-12.csv
s3://net-mozaws-prod-us-west-2-pipeline-analysis/fxa-flow/data/flow-2016-10-13.csv
s3://net-mozaws-prod-us-west-2-pipeline-analysis/fxa-flow/data/flow-2016-10-14.csv
s3://net-mozaws-prod-us-west-2-pipeline-analysis/fxa-flow/data/flow-2016-10-15.csv
s3://net-mozaws-prod-us-west-2-pipeline-analysis/fxa-flow/data/flow-2016-11-15.csv
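
In case it helps with scripting the re-export, the list above is one contiguous date range plus a single extra day, so it can be generated rather than copied. This is only a sketch: `flow_csv_paths` is a hypothetical helper, and the prefix is taken from the paths listed above.

```python
from datetime import date, timedelta

# Prefix copied from the paths listed above.
PREFIX = "s3://net-mozaws-prod-us-west-2-pipeline-analysis/fxa-flow/data/"


def flow_csv_paths(start: date, end: date) -> list:
    """Return one flow CSV path per day from start to end, inclusive."""
    days = (end - start).days + 1
    return [
        f"{PREFIX}flow-{(start + timedelta(days=n)).isoformat()}.csv"
        for n in range(days)
    ]


# 2016-09-30 through 2016-10-15, plus the lone 2016-11-15 file.
affected = (
    flow_csv_paths(date(2016, 9, 30), date(2016, 10, 15))
    + flow_csv_paths(date(2016, 11, 15), date(2016, 11, 15))
)
```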

(I've deleted the zero-length files now btw, because I wasn't sure whether they might cause problems for our scheduled import jobs)

So, if and only if re-exporting them is a piece of cake, would it be possible to dump those dates back into S3? And if it's at all tricky or tiresome for any reason, this bug can be closed no probs.
(Reporter)

Comment 1

a year ago
Just realised flow-2016-09-29.csv was affected as well. The script got about halfway through that one before failing.
(Assignee)

Comment 2

a year ago
Yes, it's possible, and very low effort. I'll do it today.
Assignee: nobody → whd
Status: NEW → ASSIGNED
Points: --- → 1
Priority: -- → P1
(Assignee)

Comment 3

a year ago
I've re-exported these. It occurs to me now that they might have a different number of (probably empty) fields than the originals, but hopefully that won't affect anything downstream.
Status: ASSIGNED → RESOLVED
Last Resolved: a year ago
Resolution: --- → FIXED
(Reporter)

Comment 4

a year ago
Thanks :whd!

> ...they might have a different number of (probably empty) fields than the originals...

Not a problem. The errant script I was running was actually meant to pad out the old CSVs, because the conditional code for different field counts in our import scripts was technical debt I wanted to ditch.
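
For anyone curious what "padding out" means here, a rough sketch follows. It is not the actual script: `pad_rows` and `pad_csv` are hypothetical names, and the target field count is an assumed parameter. Each row is right-padded with empty fields so every CSV ends up with the same number of columns.

```python
import csv
import io


def pad_rows(rows, width):
    """Yield each row right-padded with empty strings to `width` fields.

    Rows that already have `width` or more fields pass through unchanged.
    """
    for row in rows:
        yield row + [""] * (width - len(row))


def pad_csv(text, width):
    """Return CSV `text` with every row padded to `width` fields."""
    reader = csv.reader(io.StringIO(text))
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerows(pad_rows(reader, width))
    return out.getvalue()
```

With uniform field counts, the import scripts no longer need a separate code path per CSV vintage.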