Closed Bug 1430937 Opened 6 years ago Closed 6 years ago

rewrite stage submitter as AWS lambda job

Categories

(Socorro :: General, task, P1)

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: willkg, Assigned: willkg)

References

Details

Attachments

(1 file)

The stage submitter is implemented as a Socorro app and runs via crontab (not crontabber!) every 2 minutes on a dedicated node in the webeng AWS infrastructure. It listens to the socorro.submitter RabbitMQ queue in the -prod environment, pulls the crash data from the -prod S3 crash bucket, assembles an HTTP POST payload, and then POSTs that payload as a crash report to collectors in -stage, -stage-new, and -prod-new.

Pigeon in the -prod environment populates the socorro.submitter queue with 10% of the crashes set to be processed in -prod. Thus the -stage, -stage-new, and -prod-new environments see only 10% of the crashes intended for processing.
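For illustration, a percentage-based throttle like the 10% sampling described above can be done deterministically by hashing the crash id into a bucket. This is a hypothetical sketch, not Pigeon's actual throttling code:

```python
def accept_crash(crash_id, throttle_percent=10):
    """Return True for roughly throttle_percent of crashes.

    Deterministic sketch: hash the crash id into a 0-99 bucket and
    accept it if the bucket falls below the throttle percentage. The
    real Pigeon implementation may sample differently.
    """
    bucket = sum(crash_id.encode("utf-8")) % 100
    return bucket < throttle_percent
```

Deterministic sampling has the nice property that re-running the submitter over the same crashes picks the same subset.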

That sort of works, but it's clunky and it requires a dedicated node. Further, we need a way to take everything that's in -prod and submit it to -prod-new so that -prod-new can mirror -prod with regard to *all* the crash data.

One way to do this better would be to reimplement the stage submitter as an AWS Lambda app that kicks off for every S3 ObjectCreated:Put event, pulls the data from the -prod S3 crash bucket, assembles it into an HTTP POST, and submits it to one host. It'd have a few configuration options:

1. submit-all / submit-accepted
2. throttle value
3. destination collector
4. S3 information
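A minimal sketch of the event-handling side of such a Lambda job, assuming hypothetical key layouts and function names (the real implementation lives in socorro-submitter and may differ):

```python
import random

def keys_to_submit(event, throttle=10):
    """Pick the raw-crash S3 keys out of an ObjectCreated:Put event
    that should be resubmitted, applying a percentage throttle.

    Hypothetical sketch: assumes raw crash objects have "raw_crash"
    in their S3 key, which may not match the real bucket layout.
    """
    keys = []
    for record in event.get("Records", []):
        key = record["s3"]["object"]["key"]
        if "raw_crash" not in key:
            continue  # ignore dumps, processed crashes, etc.
        if random.randint(1, 100) > throttle:
            continue  # throttled out
        keys.append(key)
    return keys
```

The actual handler would then fetch each selected crash from the -prod S3 bucket, assemble the multipart HTTP POST payload, and submit it to the configured destination collector.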

This bug covers investigating that idea further and, if it seems like a good idea, implementing it.
Making this a P2 because we need it for the new infrastructure.
Priority: -- → P2
I spent some time updating pigeon, which we run as an AWS Lambda job. As part of that, I looked at how fleshed out the dev environment is. It's not. There are some fundamental pieces still missing, covered in bug #1356401 and bug #1432491.

I mulled over ways to fix them, but that'll take time. Until then, while I don't necessarily want to hard-block on those bugs, I also don't want to embark on rewriting the submitter as an AWS Lambda job until they've been figured out.

One of the plans that bumped this up in priority was this line:

"""
Further, we need a way to take everything that's in -prod and submit it to -prod-new so that -prod-new can mirror -prod in regards to *all* the crash data.
"""

Miles' current plan is different: it doesn't involve the submitter at all.

Given that, I think we should push this off in favor of other more pressing things and thus I'm bumping this down to P3.
Priority: P2 → P3
Bumping this up to P1 and grabbing this to do soon.

I'm pretty sure this will be straightforward to implement. I've written a lot of the bits this will need in other places. I just need to pull all those bits into one Lambda job.
Assignee: nobody → willkg
Status: NEW → ASSIGNED
Priority: P3 → P1
Going back a bit: why an AWS Lambda job vs. a component on a dedicated node?

1. It only needs to run in -prod, so we don't run this in -stage or in the local dev environment.

2. By making this a separate AWS Lambda job that triggers on S3 ObjectCreated:Put events rather than pulling from a queue, we can reduce the complexity of pigeon and related parts.

3. We can get rid of another thing to monitor in RabbitMQ.

4. In making this a separate project, we can remove the submitter bits and poster bits from Socorro.

We could accomplish a few of these things by making it a separate component, but it seems a lot easier to do this as an AWS Lambda job.
I've got parts of it done. Writing tests and fixing things as I go along now.

https://github.com/willkg/socorro-submitter
PR 3 landed in https://github.com/mozilla-services/socorro-submitter/commit/6f7bf65c10ea4b0eace148e7149fb59f14c2b56d

I wrote up bug #1456654 to set it up in prod. Marking this as FIXED.
Status: ASSIGNED → RESOLVED
Closed: 6 years ago
Resolution: --- → FIXED