Closed Bug 1348641 Opened 7 years ago Closed 7 years ago

Possible to re-export some old FxA flow data?

Categories

(Cloud Services Graveyard :: Metrics: Pipeline, defect, P1)

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: pb, Assigned: whd)

Details

I've marked this as severity:trivial because if there's any significant effort involved at all for whatever reason, it's probably not worth fixing.

But anyway, I accidentally clobbered some of the old flow data CSVs this weekend when the instance I was running a script on ran out of disk space and wrote zero-length files to S3. In a predictably short-sighted example of best-case thinking on my part, the script wasn't checking $? before writing.
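
(In case it helps anyone else avoid the same trap, here's roughly the guard the script was missing. This is a sketch only: export_flow_data and the local filename are stand-ins for whatever the real export step was; only the S3 destination is taken from the paths below.)

    # Generate the CSV locally first; export_flow_data and the local
    # filename are hypothetical stand-ins for the real export command.
    export_flow_data "$DATE" > "flow-$DATE.csv"

    # Bail out if the export failed (e.g. the disk filled up) instead of
    # uploading a zero-length or truncated file over the good one in S3.
    if [ $? -ne 0 ]; then
      echo "export for $DATE failed, not uploading" >&2
      exit 1
    fi

    aws s3 cp "flow-$DATE.csv" \
      "s3://net-mozaws-prod-us-west-2-pipeline-analysis/fxa-flow/data/flow-$DATE.csv"

(Chaining the upload with && or running the script under set -e would achieve the same thing with less ceremony.)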

Fortunately, I spotted one of the errors before it got too far, so only a handful of CSVs were clobbered. And none of them are recent enough to be in the 3-month window visible in our main view of the data in redash.

The affected CSVs were:

s3://net-mozaws-prod-us-west-2-pipeline-analysis/fxa-flow/data/flow-2016-09-30.csv
s3://net-mozaws-prod-us-west-2-pipeline-analysis/fxa-flow/data/flow-2016-10-01.csv
s3://net-mozaws-prod-us-west-2-pipeline-analysis/fxa-flow/data/flow-2016-10-02.csv
s3://net-mozaws-prod-us-west-2-pipeline-analysis/fxa-flow/data/flow-2016-10-03.csv
s3://net-mozaws-prod-us-west-2-pipeline-analysis/fxa-flow/data/flow-2016-10-04.csv
s3://net-mozaws-prod-us-west-2-pipeline-analysis/fxa-flow/data/flow-2016-10-05.csv
s3://net-mozaws-prod-us-west-2-pipeline-analysis/fxa-flow/data/flow-2016-10-06.csv
s3://net-mozaws-prod-us-west-2-pipeline-analysis/fxa-flow/data/flow-2016-10-07.csv
s3://net-mozaws-prod-us-west-2-pipeline-analysis/fxa-flow/data/flow-2016-10-08.csv
s3://net-mozaws-prod-us-west-2-pipeline-analysis/fxa-flow/data/flow-2016-10-09.csv
s3://net-mozaws-prod-us-west-2-pipeline-analysis/fxa-flow/data/flow-2016-10-10.csv
s3://net-mozaws-prod-us-west-2-pipeline-analysis/fxa-flow/data/flow-2016-10-11.csv
s3://net-mozaws-prod-us-west-2-pipeline-analysis/fxa-flow/data/flow-2016-10-12.csv
s3://net-mozaws-prod-us-west-2-pipeline-analysis/fxa-flow/data/flow-2016-10-13.csv
s3://net-mozaws-prod-us-west-2-pipeline-analysis/fxa-flow/data/flow-2016-10-14.csv
s3://net-mozaws-prod-us-west-2-pipeline-analysis/fxa-flow/data/flow-2016-10-15.csv
s3://net-mozaws-prod-us-west-2-pipeline-analysis/fxa-flow/data/flow-2016-11-15.csv

(I've deleted the zero-length files now btw, because I wasn't sure whether they might cause problems for our scheduled import jobs)
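
On the off-chance it's useful for double-checking I caught them all: the standard aws s3api call below lists any remaining zero-length objects under that prefix (bucket and prefix copied from the paths above), and anything it turns up can be removed with aws s3 rm.

    aws s3api list-objects-v2 \
      --bucket net-mozaws-prod-us-west-2-pipeline-analysis \
      --prefix fxa-flow/data/ \
      --query 'Contents[?Size==`0`].[Key]' \
      --output text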

So, if and only if re-exporting them is a piece of cake, would it be possible to dump those dates back into S3? And if it's at all tricky or tiresome for any reason, this bug can be closed, no probs.
Just realised flow-2016-09-29.csv was affected as well; the script got about half-way through that one before failing.
Yes, it's possible, and very low effort. I'll do it today.
Assignee: nobody → whd
Status: NEW → ASSIGNED
Points: --- → 1
Priority: -- → P1
I've re-exported these. It occurs to me now that they might have a different number of (probably empty) fields than the originals, but hopefully that won't affect anything downstream.
Status: ASSIGNED → RESOLVED
Closed: 7 years ago
Resolution: --- → FIXED
Thanks :whd!

> ...they might have a different number of (probably empty) fields than the originals...

Not a problem. The errant script I was running was actually padding out the old CSVs, because the conditional code for handling different field counts in our import scripts was technical debt I wanted to ditch.
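
For the record, the padding itself is just a one-liner along these lines; a sketch only, since it assumes the goal is appending empty comma-separated fields up to a fixed count (8 is a made-up number, and naive splitting on commas ignores any quoted fields):

    # Pad every row out to 8 fields; 8 is hypothetical, the real target
    # count isn't stated in this bug.
    awk -F, 'BEGIN { OFS = "," } { for (i = NF + 1; i <= 8; i++) $i = ""; print }' \
      flow-2016-09-30.csv > flow-2016-09-30.padded.csv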
Product: Cloud Services → Cloud Services Graveyard