Closed Bug 1388449 Opened 7 years ago Closed 7 years ago

convenience scripts for processing crashes in docker local dev environment

Categories: Socorro :: General
Type: task
Priority: Not set
Severity: normal
Tracking: (Not tracked)
Status: RESOLVED FIXED
People: (Reporter: willkg, Assigned: willkg)
Attachments: (3 files)

There are various ways to generate crash data already. That's great. However, it's tricky to get crash data into the docker-based dev environment.

This is something we're going to want to do often, so it behooves us to implement some things that make this easy. Maybe a set of shell scripts that can be used in different ways depending on what the user already has.

This bug covers writing something that solves the basic problem. Then we can iterate from there in other bugs.
Personally, I'd like a shell script that takes a list of crash_ids and a directory laid out like the pseudo-directory structure we use to store raw crash data on S3. It would verify the files are there for each crash_id, copy the data into the s3 container, and then add the crash_ids to the socorro.normal RabbitMQ queue for the processor to process.
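
For reference, here's a minimal sketch of building those pseudo-directory paths. It assumes the v2 raw crash key layout (a 3-character entropy prefix plus the date encoded in the last 6 characters of the crash id); the authoritative scheme is in Socorro's S3 crash storage code.

    # Hedged sketch: build the pseudo-directory path for a raw crash.
    # Assumes the v2 key layout; see Socorro's S3 crash storage code
    # for the authoritative scheme.
    def raw_crash_path(root, crash_id):
        entropy = crash_id[:3]       # first 3 hex chars spread keys out
        date = "20" + crash_id[-6:]  # crash ids end in YYMMDD
        return "%s/v2/raw_crash/%s/%s/%s" % (root, entropy, date, crash_id)

    print(raw_crash_path("my_crashes", "0b794045-87ec-4649-9ce1-73ec10191120"))
    # my_crashes/v2/raw_crash/0b7/20191120/0b794045-87ec-4649-9ce1-73ec10191120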

Then we can write additional scripts to generate that directory of files. We could also use Antenna. It'd depend on where the crash data was coming from.

I think I'm going to start with that.
Assignee: nobody → willkg
Blocks: 1387104
Status: NEW → ASSIGNED
Redoing the summary so its intent is clearer.

I spent some time thinking about how to do this. I think I want to create at least three scripts:

1. copy_s3: Copies contents from the host disk to the s3 container. I'm not sure exactly what shape this should take, but it seems helpful to be able to copy a directory tree into the container and copy one back out.

2. add_crashid_to_queue: Given a RabbitMQ queue and one or more crash ids, adds the crash ids to the specified queue. This is a generalization of send_to_stage.py, which I wrote on the -prod admin box. (A sketch of this one follows the list.)

3. fetch_raw_crash: Given an API token, a host, and one or more crash ids, fetches the raw crash data from that host and writes it into a raw crash directory structure.
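
As a concrete illustration of item 2, here's a minimal sketch using pika to talk to RabbitMQ; the host name, durability settings, and the use of pika itself are assumptions about the docker dev environment, not the final implementation.

    #!/usr/bin/env python
    """Hedged sketch of add_crashid_to_queue; not the final implementation."""
    import sys

    import pika  # assumes pika is how we talk to RabbitMQ here

    def add_crashids_to_queue(queue, crash_ids, host="rabbitmq"):
        conn = pika.BlockingConnection(pika.ConnectionParameters(host=host))
        channel = conn.channel()
        channel.queue_declare(queue=queue, durable=True)
        for crash_id in crash_ids:
            channel.basic_publish(exchange="", routing_key=queue, body=crash_id)
            print("added %s to %s" % (crash_id, queue))
        conn.close()

    if __name__ == "__main__":
        # usage: add_crashid_to_queue.py QUEUE CRASHID [CRASHID...]
        add_crashids_to_queue(sys.argv[1], sys.argv[2:])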


To use them, you'd get the raw crash data into a directory on your host computer, run copy_s3 to get it into the right place in the s3 container, run add_crashid_to_queue to add the crash ids to the processor queue, and then run the processor, which would process the crashes in the queue.

Having these as individual atoms makes it easier to write scripts that automate things with additional complex steps like data validation. I think these three scripts cover most, if not all, of my needs.
Summary: shell script to set up a crash for processing in docker dev environment → convenience scripts for processing crashes in docker local dev environment
Commit pushed to master at https://github.com/mozilla-services/socorro

https://github.com/mozilla-services/socorro/commit/a9e892cac868aadacfcd50d7bc4e4fb893704c8e
bug 1388449 - add add_crashid_to_queue and scaffolding (#3915)

* bug 1388449 - add add_crashid_to_queue and scaffolding

* adds an is_crash_id_valid lib function for verifying crash ids
* adds a scripts/add_crashid_to_queue.py script which is a stub that executes
  functionality in socorro/scripts/add_crashid_to_queue.py
* adds unit tests for new module

* Add missing headers, de-duplicate epilog

* Fixes per Peter's suggestions
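
For flavor, validation along these lines is roughly what I'd expect is_crash_id_valid to do; the exact rules (UUID shape, YYMMDD date suffix) are assumptions based on how crash ids look, so see the actual module for the real checks.

    # Hedged sketch of is_crash_id_valid; the real checks live in the
    # new module. Assumes crash ids are UUID-shaped with the last 6
    # characters encoding a YYMMDD date.
    import datetime
    import re

    CRASH_ID_RE = re.compile(
        r"^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-"
        r"[0-9a-f]{6}[0-9]{6}$"
    )

    def is_crash_id_valid(crash_id):
        if not CRASH_ID_RE.match(crash_id):
            return False
        try:
            # the trailing date has to parse as a real date
            datetime.datetime.strptime(crash_id[-6:], "%y%m%d")
        except ValueError:
            return False
        return True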
Commit pushed to master at https://github.com/mozilla-services/socorro

https://github.com/mozilla-services/socorro/commit/61498103ef420196c7b8ff455e7e2c348dbdf93a
bug 1388449 - add socorro_aws_s3.sh script (#3924)

* add awscli to the processor in a place where it shouldn't affect anything
* create a socorro_aws_s3.sh script and friends that wrap aws by pulling
  configuration out of the environment, making it easier to use

The end result of this is that we can copy files into and out of the s3
container, see what's in there, manipulate buckets, and so on.
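
To illustrate the idea (in Python here, for consistency with the other sketches; the real script is shell), a wrapper like this reads the S3 endpoint from the environment and hands everything else to awscli. The environment variable name and endpoint default are assumptions about the dev environment.

    # Hedged Python rendition of the socorro_aws_s3.sh idea: pull S3
    # connection details from the environment and delegate the rest of
    # the command line to awscli. Env var names here are assumptions.
    import os
    import subprocess
    import sys

    cmd = [
        "aws",
        "--endpoint-url",
        os.environ.get("S3_ENDPOINT_URL", "http://s3:5000"),
        "s3",
    ] + sys.argv[1:]

    # e.g. pass: cp --recursive ./my_crashes/ s3://dev_bucket/
    sys.exit(subprocess.call(cmd))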
Depends on: 1391637
Commit pushed to master at https://github.com/mozilla-services/socorro

https://github.com/mozilla-services/socorro/commit/b9b6798251fac02768382e7ea313941334a9da10
fixes bug 1388449 - implement fetch_crash_data and pull it all together (#3928)

* fixes bug 1388449 - implement fetch_crash_data and pull it all together

* add scripts/fetch_crash_data.py to fetch data from -prod
* add docker/as_me.sh to automate running one-off scripts in the processor
  container with the host's uid/gid so file permissions are ok
* add documentation to processor docs covering how to use the scripts

* Fix raw_crash location
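
As a rough illustration of the fetch step, something like the following pulls a raw crash from a host's RawCrash API endpoint and writes it into the pseudo-directory structure. The /api/RawCrash/ endpoint and Auth-Token header come from Socorro's public API, but treat the rest (output layout, error handling) as assumptions; scripts/fetch_crash_data.py is the real thing.

    # Hedged sketch of fetching raw crash data over the public API and
    # writing it into the raw crash pseudo-directory structure.
    import json
    import os

    import requests

    def fetch_raw_crash(host, api_token, crash_id, outdir):
        resp = requests.get(
            host + "/api/RawCrash/",
            params={"crash_id": crash_id},
            headers={"Auth-Token": api_token},
        )
        resp.raise_for_status()
        # reuse the v2-style layout from the earlier sketch
        path = "%s/v2/raw_crash/%s/20%s/%s" % (
            outdir, crash_id[:3], crash_id[-6:], crash_id
        )
        os.makedirs(os.path.dirname(path), exist_ok=True)
        with open(path, "w") as fp:
            json.dump(resp.json(), fp)

    # fetch_raw_crash("https://crash-stats.mozilla.org", TOKEN, crash_id, ".")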
Status: ASSIGNED → RESOLVED
Closed: 7 years ago
Resolution: --- → FIXED
