1583930 - [tracker] switch from pubsub to aws sqs

fix the s3mock issue in Antenna so we could update to the latest boto3/botocore (bug #1417807)
update Socorro from boto to boto3, which involved a semi-extensive rewrite (bug #1433148) and also fixes AWS auth things (bug #1425475)

I've finished that work and it's sitting on stage now. It'll go to prod in December.

However, I'm pretty sure my window for working on this SQS project is over and I need to switch back to MLS. I'm hoping I find time to pick this back up in 2020 and finish up the SQS project plan.

Will Kahn-Greene [:willkg] ET needinfo? me (Assignee), Updated • 6 years ago

Depends on: 1598765

Will Kahn-Greene [:willkg] ET needinfo? me (Assignee), Updated • 6 years ago

Depends on: 1601455

Will Kahn-Greene [:willkg] ET needinfo? me (Assignee), Updated • 6 years ago

Depends on: 1602120

Will Kahn-Greene [:willkg] ET needinfo? me (Assignee), Updated • 6 years ago

Depends on: 1602121

Will Kahn-Greene [:willkg] ET needinfo? me (Assignee), Comment 3 • 6 years ago

I landed the last of the required code changes for supporting AWS SQS. I updated the migration plan. I think we're all set to migrate in January 2020. I'll work with Brian to schedule that.

Will Kahn-Greene [:willkg] ET needinfo? me (Assignee), Updated • 6 years ago

Depends on: 1605716

Will Kahn-Greene [:willkg] ET needinfo? me (Assignee), Updated • 5 years ago

Depends on: 1617008

Will Kahn-Greene [:willkg] ET needinfo? me (Assignee), Updated • 5 years ago

Depends on: 1617187

Will Kahn-Greene [:willkg] ET needinfo? me (Assignee), Updated • 5 years ago

Depends on: 1617977

Will Kahn-Greene [:willkg] ET needinfo? me (Assignee), Updated • 5 years ago

Depends on: 1618201

Will Kahn-Greene [:willkg] ET needinfo? me (Assignee), Comment 4 • 5 years ago

We switched stage over to AWS SQS yesterday. First we switched the collector (5:18pm EST), waited for the Pub/Sub queue to dry up, then switched the processor (5:35pm EST), then the webapp (6:31pm EST).
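The ordering described above (switch the producer, wait for the old queue to drain, then switch the consumers) can be sketched roughly as follows. This is only an illustration, not the deploy tooling that was actually used: `cutover`, its `steps` argument, and the `backlog` callable are hypothetical names, and how the old queue's backlog is measured is left to the caller.

```python
import time
from typing import Callable, List, Sequence, Tuple


def cutover(
    steps: Sequence[Tuple[str, Callable[[], None]]],
    backlog: Callable[[], int],
    poll_seconds: float = 30.0,
    timeout_seconds: float = 3600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> List[str]:
    """Switch the first service, wait for the old queue to drain,
    then switch the remaining services in the order given.

    steps:   (name, switch_function) pairs; the first entry is the
             producer (here, the collector). Hypothetical interface.
    backlog: returns the number of messages left in the old queue.
    """
    done: List[str] = []

    # 1. Switch the producer so nothing new lands in the old queue.
    first_name, first_switch = steps[0]
    first_switch()
    done.append(first_name)

    # 2. Let the consumers that still read the old queue drain it.
    waited = 0.0
    while backlog() > 0:
        if waited >= timeout_seconds:
            raise TimeoutError(f"old queue not empty after {waited:.0f}s")
        sleep(poll_seconds)
        waited += poll_seconds

    # 3. Switch the remaining services (processor, then webapp).
    for name, switch in steps[1:]:
        switch()
        done.append(name)
    return done
```

Switching the collector first means the old queue only shrinks from that point on, so waiting for it to reach zero before moving the processor avoids stranding crash ids in it.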

The Pub/Sub queue graphs continue to show a non-zero number that fluctuates between roughly 14 and 30. I don't know where those crash ids are coming from or where they're going. Socorro has a job that runs nightly to catch any crash reports that weren't processed, so I don't think there's any risk here. It's just curious.

Everything else looks fine.

Will Kahn-Greene [:willkg] ET needinfo? me (Assignee), Comment 5 • 5 years ago

Regarding the stage migration, the graphs in Grafana for Pub/Sub were showing prod--not stage. That's why they looked curious after the migration.

We switched prod over to AWS SQS today. I sent an email to the stability mailing list and notified #stability and #breakpad on chat.mozilla.org. First we switched the collector over (9:35am EDT), waited for the Pub/Sub queue to dry up, then switched the processor (9:55am EDT), then the webapp (10:10am EDT).

Pub/Sub queue hit zero. Nothing in Sentry. Grafana graphs look fine.
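For the SQS side, a similar spot check of queue depth could look like the sketch below. It assumes a boto3 SQS client is passed in (for example `boto3.client("sqs")`) and uses its `get_queue_attributes` call; the helper name `sqs_queue_depth` and the queue URL are hypothetical, and this is not how the Grafana graphs mentioned above are produced.

```python
from typing import Any, Dict


def sqs_queue_depth(sqs_client: Any, queue_url: str) -> Dict[str, int]:
    """Return approximate visible and in-flight message counts.

    sqs_client: a boto3 SQS client, e.g. boto3.client("sqs").
    queue_url:  URL of the queue to inspect (hypothetical here).
    """
    resp = sqs_client.get_queue_attributes(
        QueueUrl=queue_url,
        AttributeNames=[
            "ApproximateNumberOfMessages",
            "ApproximateNumberOfMessagesNotVisible",
        ],
    )
    # SQS returns attribute values as strings, so convert them.
    attrs = resp.get("Attributes", {})
    return {
        "visible": int(attrs.get("ApproximateNumberOfMessages", 0)),
        "in_flight": int(attrs.get("ApproximateNumberOfMessagesNotVisible", 0)),
    }
```

Both counts are approximate by design in SQS, so a single non-zero reading right after a cutover is not necessarily a problem; a value that keeps growing would be.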

I notified the stability mailing list and #stability and #breakpad on chat.mozilla.org that the maintenance window was done.

I'll keep an eye on things for the rest of today and write up bugs for removing Pub/Sub configuration, code, and documentation.

Will Kahn-Greene [:willkg] ET needinfo? me (Assignee), Updated • 5 years ago

Assignee: nobody → willkg

Status: NEW → ASSIGNED

Will Kahn-Greene [:willkg] ET needinfo? me (Assignee), Comment 6 • 5 years ago

Everything looks fine.

We pushed all the Pub/Sub code removal to production for Socorro and Antenna.

We're done here. Marking as FIXED.

Status: ASSIGNED → RESOLVED

Closed: 5 years ago

Resolution: --- → FIXED

Categories: Socorro :: General, task, P2

Tracking: Not tracked

People: Reporter: willkg, Assigned: willkg

Security: public
