Having the correlation scripts over in the socorro-infra repo has betrayed us. For many days, we've had failing scripts because the wrong versions were running. Bring them over to Socorro.
Notes about a possible plan/roadmap: https://wiki.mozilla.org/Breakpad/Status_Meetings/2015-09-30#Operations_Updates
Correlations Roadmap

The correlation reports are already in a general form similar to the standard Socorro FTS apps. I propose literally bringing them over to the Socorro repo and refactoring them as TransformRules to be run by a processor-like FTS app. They could then be run either as long-running apps like a processor, reading an RMQ stream of crashes and spewing output at the end of each day, or as daily crontabber jobs. There are advantages and disadvantages to both systems. However, with the flexibility of the FTS apps, the code base is identical and we can experiment with whatever works best for us.

Steps:
0) proof of concept - throw together an example of this working.
0.1) DONE
0.2) output: http://hastebin.com/nalilazusu.coffee
0.3) code at https://github.com/twobraids/socorro/blob/correlation-conjunction/socorro/analysis/correlations/correlation_rules.py#L121
1) create a tarfile crashstorage system compatible with the correlation scripts for temporary use as data input. See Bug 1210260
1.1) DONE
2) disambiguate the dump file location in the crashstorage system. This will unify all the crashstorage classes that suffered some evolutionary bifurcation as they were used differently by the processor FTS family vs the submitter FTS family. See Bug 1212334
3) unify the iterator models between the processor and submitter families of FTS apps. Since the correlation reports will be encoded as TransformRules, this means using the processor as the base app class. However, the processor is long running and its iteration system doesn't ever raise the StopIteration exception. The submitter app class has the ability to allow an iterator to run to exhaustion and then quit, but doesn't have a good method of using a TransformRule system. See Bug ...
4) unify the transform models between the processor and submitter families. The transformations done by FTS apps have been coded using inheritance rather than dependency injection. Switch them to dependency injection and then the submitter and processor families can share transformation algorithms easily. In other words, this collapses repeated code between the two families. This lets us have FTS apps with the processor's transforms and the submitter's iteration model. See Bug ...
5) code the correlation scripts into TransformRules. See Bug ...
6) add transform rules to either a CorrelationsProcessorApp or a CorrelationsProcessorCronTabberJob. See Bug ...
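To make steps 3 and 4 concrete, here is a minimal sketch of the shape being proposed: transform rules are handed to the app (dependency injection) and the app iterates over a finite crash source until it is exhausted, then quits. This is not Socorro's actual code. The names SignatureCountRule, FiniteApp and one_day_of_crashes are hypothetical, and the rule interface (act/close) is an assumption made for illustration only.

```python
from collections import Counter


class SignatureCountRule:
    """Hypothetical rule: tallies crashes per signature as they pass through."""

    def __init__(self):
        self.counts = Counter()

    def act(self, raw_crash, processed_crash):
        # Accumulate state for each crash instead of emitting output immediately.
        self.counts[processed_crash.get("signature", "unknown")] += 1

    def close(self):
        # Called once after the crash source is exhausted; returns the report.
        return dict(self.counts)


class FiniteApp:
    """Hypothetical app: rules are injected, not inherited."""

    def __init__(self, crash_source, rules):
        self.crash_source = crash_source
        self.rules = rules

    def run(self):
        # The for loop ends when the source raises StopIteration,
        # which is the submitter-style "run to exhaustion" model.
        for raw_crash, processed_crash in self.crash_source:
            for rule in self.rules:
                rule.act(raw_crash, processed_crash)
        return [rule.close() for rule in self.rules]


def one_day_of_crashes():
    # Stand-in for a tar file or query-backed source of (raw, processed) pairs.
    yield {}, {"signature": "js::GC"}
    yield {}, {"signature": "js::GC"}
    yield {}, {"signature": "nsFoo::Bar"}


app = FiniteApp(one_day_of_crashes(), [SignatureCountRule()])
reports = app.run()
```

Because the rules are passed in rather than baked into the app class, the same rule objects could in principle be handed to a long-running app or to a run-once job without changing the rule code.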
What about backfill? If done in a stream of a daemon, what do we do if we (accidentally) kill that daemon (e.g. a deployment or an emergency)? Perhaps I'm out of my depth but I don't see a new tool that goes back in time in that roadmap.
None of this code work requires that we use the correlation rules in an existing processor that's getting a live stream. The backfill and the day-interrupted-by-power-failure problems indicate that we ought to do this through crontabber. No matter how it is eventually deployed, all this same work has to be done. To create the crontabber version, we just make a crontabber job with backfill capabilities using the FTS core as the script.

data source:
phase 1) we can have this crontabber app get its data from the tar file using the TarFileCrashStore that just landed (after having created another crontabber app to create the tar file).
phase 2) we feed the app a PG query to select the date range of crashes that we want. It can then draw the processed crashes from either PG or S3: we'll experiment to see what works better.

output:
phase 1) we push the data into tar files at the end of a day's run
phase 2) we swap out the tarfile for some other system such as PG or S3
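As a rough illustration of what "backfill capabilities" buys us, the sketch below works out which days a daily job still owes after an interruption and runs them oldest first. It is plain Python and does not use crontabber's real API. The names days_to_backfill, run_with_backfill and run_one_day are hypothetical, and it assumes the job records the last day it finished successfully.

```python
from datetime import date, timedelta


def days_to_backfill(last_success, today):
    """Return every complete day after last_success and before today, oldest first."""
    days = []
    day = last_success + timedelta(days=1)
    while day < today:  # today is still in progress, so it is excluded
        days.append(day)
        day += timedelta(days=1)
    return days


def run_with_backfill(last_success, today, run_one_day):
    """Run the (hypothetical) daily job for each missed day and return the new marker."""
    for day in days_to_backfill(last_success, today):
        run_one_day(day)
        # Advance the marker only after the day succeeds, so a crash
        # mid-backfill resumes from the first unfinished day.
        last_success = day
    return last_success


processed = []
marker = run_with_backfill(date(2015, 10, 1), date(2015, 10, 5), processed.append)
```

With this shape, killing the job during a deployment only delays the report for that day; the next run picks up from the stored marker.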
For what it's worth, I think we should do S3 upload as part of phase 1. Mind you, I might have misunderstood your roadmap, and I'm sorry for that, but the point I'm trying to make is that when this becomes a crontabber app we do NOT want to write to disk. At all. The only reliable disk, going forward, is S3. (Or some persistent database would be good too.)
peterbe: Bug 1218220 sets up a BotoS3 connection class to be used to put correlation reports into S3.
Commit pushed to master at https://github.com/mozilla/socorro https://github.com/mozilla/socorro/commit/610c4f66fcc282ce85a3a4050c3336cfc5963694 Merge pull request #3077 from twobraids/common-corr Fixes Bug 1209526 - the correlation scripts as TransformRules + correlations_app
Status: NEW → RESOLVED
Last Resolved: 3 years ago
Resolution: --- → FIXED