Deal with persona instances that were taken out of service

RESOLVED FIXED

Status

Product: Cloud Services
Component: Operations
Status: RESOLVED FIXED
Reported: 4 years ago
Modified: 4 years ago

People

(Reporter: gene, Assigned: gene)

Tracking

Firefox Tracking Flags

(Not tracked)

Details

Description

4 years ago
I removed these instances from the persona-org-1203 load balancer in us-east-1:
i-d88146a4
i-da8146a6
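
For reference, a removal like the one above could be scripted roughly as follows. This is only a sketch: the ticket does not say which tool was used, and the helper name `deregister_instances` is hypothetical. It assumes a boto3 Classic ELB client, whose `deregister_instances_from_load_balancer` call returns the instances that remain registered.

```python
from typing import Any, Iterable, List


def deregister_instances(elb_client: Any, load_balancer: str,
                         instance_ids: Iterable[str]) -> List[str]:
    """Remove instances from a Classic ELB; return the IDs still registered.

    `elb_client` is assumed to behave like boto3.client("elb").
    """
    response = elb_client.deregister_instances_from_load_balancer(
        LoadBalancerName=load_balancer,
        Instances=[{"InstanceId": iid} for iid in instance_ids],
    )
    # The response lists the instances that remain behind the load balancer.
    return [inst["InstanceId"] for inst in response.get("Instances", [])]


# Usage (assumes boto3 is installed and AWS credentials are configured):
# import boto3
# elb = boto3.client("elb", region_name="us-east-1")
# deregister_instances(elb, "persona-org-1203", ["i-d88146a4", "i-da8146a6"])
```

Passing the client in as an argument keeps the helper easy to exercise against a stub before pointing it at a production load balancer.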

http://personastatus.org/#1389732300

These two instances were causing the backend 5xx errors shown in this graph:

https://perf.identity.us-east-1.prod.mozaws.net/?from=12%3A00_20140114&height=308&width=586&_salt=1389742612.696&until=18%3A00_20140114&showTarget=aws.elb.1203.persona-org.httpcode_backend_5xx.sum.count&target=aws.elb.1203.persona-org.httpcode_backend_5xx.sum.count

We need to either figure out the root cause, destroy the instances and let autoscaling replace them, or determine whether there's an issue with the us-east-1d availability zone.

Updated

4 years ago
Assignee: nobody → gene

Comment 1

4 years ago
This problem occurred on 1 machine in us-west-2 this morning.

The root cause on all systems was not the Amazon AZ; the drives had filled up because the logs were not being rotated. I've cleared the /var/browserid/log/static.log file on all systems to give us some breathing room.
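
As an illustration of the stopgap described above, here is a minimal sketch of checking how full a filesystem is and emptying a log file in place. The helper names are hypothetical and the ticket does not record the actual commands used; only standard-library calls are involved.

```python
import os
import shutil


def disk_used_fraction(path: str) -> float:
    """Return the fraction (0.0 to 1.0) of the filesystem holding `path` that is in use."""
    usage = shutil.disk_usage(path)
    return usage.used / usage.total


def truncate_in_place(log_path: str) -> int:
    """Empty a log file without deleting it; return its previous size in bytes."""
    size = os.path.getsize(log_path)
    os.truncate(log_path, 0)  # the file keeps its name and inode, only the length changes
    return size


# Usage (path taken from the comment above):
# if disk_used_fraction("/var/browserid/log") > 0.9:
#     truncate_in_place("/var/browserid/log/static.log")
```

Truncating only buys time; proper log rotation, tracked in the ticket linked below, is the lasting fix.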

I've re-added the 2 instances above to the load balancer.

I'll prioritize this ticket: https://github.com/mozilla/identity-ops/issues/126
Status: NEW → RESOLVED
Last Resolved: 4 years ago
Resolution: --- → FIXED

Comment 2

4 years ago
After I cleared the static.log file on 16 machines, 4 of them began serving 500s (the complexities of log rotation).

I restarted services on those 4 systems and the hosts are healthy again.
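
The comments above do not say why clearing the file led to 500s, so the following is not a diagnosis. It is a sketch of one common log-rotation pitfall on POSIX systems: a process that holds the log open without append mode keeps its old write offset after the file is truncated underneath it. The function name is hypothetical.

```python
import os


def size_after_truncate_and_write(path: str, append: bool) -> int:
    """Write 1000 bytes, truncate the file from outside, write 10 more; return the final size."""
    mode = "ab" if append else "wb"
    # buffering=0 so each write reaches the file immediately
    with open(path, mode, buffering=0) as writer:
        writer.write(b"x" * 1000)
        os.truncate(path, 0)     # an operator clears the log while it is still open
        writer.write(b"y" * 10)  # the service keeps logging through its old handle
    return os.path.getsize(path)


# Append-mode writers resume at the new end of file: final size 10.
# Other writers resume at their old offset: final size 1010, with a hole before the new data.
```

Restarting a service makes it reopen its log files, which is consistent with the restart described above clearing the problem.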