Closed Bug 1117776 Opened 9 years ago Closed 8 years ago

document reprocessing

Categories

(Socorro :: General, task)


Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: rhelmer, Unassigned)

References

Details

There is a reprocessing crontabber job (socorro.cron.jobs.reprocessingjobs.ReprocessingJobsApp|5m) which reads the reprocessing_jobs table.

For instance:
insert into reprocessing_jobs
  select uuid from reports
  where date_processed > now() - '7 days'::interval;
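For queuing an explicit list of crash IDs rather than a query against reports, a small sketch of the same idea (this helper is hypothetical, and it assumes reprocessing_jobs takes one uuid per row, as the query above implies; in real code, prefer a parameterized query over string building):

```python
# Hypothetical helper: build an INSERT for the reprocessing_jobs table from
# an explicit list of crash IDs, mirroring the SELECT-based insert above.
# Assumes reprocessing_jobs has a single uuid column, as the query implies.

def reprocessing_insert(crash_ids):
    """Return a single INSERT statement queuing the given crash IDs."""
    values = ",\n  ".join("('{}')".format(cid) for cid in crash_ids)
    return "insert into reprocessing_jobs values\n  {};".format(values)
```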

This isn't currently documented; it should be.
I agree that this should be documented.  I further propose that a new section be added to the docs for common admin/maintenance/operational tasks.
Depends on: 1117778
Note that this will pull already-processed crashes from the reports table. If you need to process submitted crashes that never went through processing to begin with, you'll need to grab the relevant UUIDs from S3 and enter them that way. The command line is easier than Cyberduck.
To retrieve the list of uuids, I'm running this:

aws s3 ls s3://org.mozilla.crash-stats.production.crashes/v1/raw_crash/ | grep 2015-07-0[6-7] > crashreports

However, it's sure taking quite a long time, so I'm hoping rhelmer may have some trick he can share that he does when he needs a list of uuids.  In the meantime, it's pulling at the rate of about 300 per minute.
(In reply to JP Schneider [:jp] from comment #3)
> However, it's sure taking quite a long time, so I'm hoping rhelmer may have
> some trick he can share that he does when he needs a list of uuids.  In the
> meantime, it's pulling at the rate of about 300 per minute.

Listing the S3 bucket is going to be terribly slow and should be an absolute last resort - if there were crashes that were never written to the reports table that need reprocessing, one place to get them from would be the collector/crashmover logs.

S3 list by date would be much faster and more realistic if crashes were stored with the date in the prefix instead of at the very end, e.g.:

s3://bucket/v2/raw_crash/2015-07-06/...
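If the collector/crashmover logs are available, crash IDs can be mined out of them directly. If I remember right, Socorro crash IDs are UUID-shaped with the YYMMDD submission date as the last six characters, so you can also filter by date without listing S3 at all. A sketch (the regex and sample data are illustrative, not taken from real logs):

```python
# Sketch for mining crash IDs out of collector/crashmover log text.
# Socorro crash IDs look like UUIDs whose last six characters are the
# YYMMDD submission date, so they can be filtered by date in place.
import re

CRASH_ID = re.compile(r"\b[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-"
                      r"[0-9a-f]{6}(\d{6})\b")

def crash_ids_for_dates(text, yymmdd_suffixes):
    """Return crash IDs whose date suffix is in yymmdd_suffixes."""
    return [m.group(0) for m in CRASH_ID.finditer(text)
            if m.group(1) in yymmdd_suffixes]
```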
Ah, shazbot, OK.  Well, we didn't grab those logs, so I'm doing this from an EC2 node in screen.


[centos@i-c5757232 ~]$ aws s3 --region us-west-2 ls s3://org.mozilla.crash-stats.production.crashes/v1/raw_crash/ | grep 2015-07-0[6-7] > crashreports
Moved the node:
[centos@i-101f1be6 ~]$ aws s3 --region us-west-2 ls s3://org.mozilla.crash-stats.production.crashes/v1/raw_crash/ | grep 2015-07-0[6-7] > crashreports
I don't know where else to put this, but Lars shared this trick. He SSHes into a node with consul and creates this bash script:

#!/usr/bin/bash
. /data/socorro/socorro-virtualenv/bin/activate
envconsul -prefix socorro/common -prefix socorro/processor socorro submitter \
    --destination.crashstorage_class=socorro.external.rabbitmq.crashstorage.RabbitMQCrashStorage \
    --destination.routing_key=socorro.reprocessing \
    --producer_consumer.producer_consumer_class=socorrolib.lib.task_manager.TaskManager \
    --source.temporary_file_system_storage_path=/tmp \
    --new_crash_source.new_crash_source_class=socorro.collector.submitter_app.DBSamplingCrashSource \
    --new_crash_source.crash_id_query="select '$1'"

Then he runs that like this:

./reprocess.sh some-long-uuid-thing
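To run that over a whole file of crash IDs (one per line, e.g. the "crashreports" file from the S3 listing above), a small wrapper like this would do; it's a hypothetical sketch, and the `runner` hook only exists so the loop can be exercised without actually shelling out:

```python
# Hypothetical wrapper around Lars's reprocess.sh: run it once per crash ID
# listed in a file, one ID per line. `runner` defaults to really executing
# the script but can be swapped out for testing.
import subprocess

def reprocess_from_file(path, runner=subprocess.check_call):
    submitted = []
    with open(path) as f:
        for line in f:
            crash_id = line.strip()
            if crash_id:
                runner(["./reprocess.sh", crash_id])
                submitted.append(crash_id)
    return submitted
```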


Talking directly to RabbitMQ, instead of going through the PG reprocessing_jobs table, should be a priority, since it goes straight to the meat rather than having to involve Postgres and a crontabber app.
We now have the "Reprocess" tool on the report index page. A tool is better than documentation :)

Also, I've updated the Mana documentation with the new headlines:
* "To re-process a UUID"
* "To re-process lots of UUIDs"
Status: NEW → RESOLVED
Closed: 8 years ago
Resolution: --- → FIXED