Closed
Bug 1447785
Opened 7 years ago
Closed 7 years ago
large spike in number of running instances for balrog prod on Feb 5
Categories
(Cloud Services :: Operations: Miscellaneous, task)
Tracking
(Not tracked)
RESOLVED
FIXED
People
(Reporter: bhearsum, Assigned: oremj)
Details
I noticed this while looking at Datadog today: on Feb 5th (around 0800 UTC) we jumped from ~40 running instances to ~90. I don't see a spike in requests around that time, nor am I aware of a deployment happening. https://screenshots.firefox.com/1nMdAZCf25LZBOpL/app.datadoghq.com
It's not causing any problems, but it seems like something we should be able to explain. I see instances cycling around this time, which makes me wonder if we switched to smaller instances, and need more to handle the traffic now?
Flags: needinfo?(oremj)
Comment 1 (Assignee) • 7 years ago
I was occasionally getting paged because the app would scale down too far and then get a huge burst of requests, which would degrade the service until it had time to scale back up. So I pinned it to never go below 70 instances, which seems to be about what we need to handle the traffic bursts.
It's possible that on Feb 5th, I manually flipped it to 90 for recovery.
Assignee: nobody → oremj
Flags: needinfo?(oremj)
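For anyone reading this later, the floor described in comment 1 can be sketched roughly as below. This is only an illustration of clamping a load-based target to a minimum, not the actual Balrog autoscaling configuration. The function name, the per-instance capacity figure, and the ceiling of 120 are all hypothetical; only the floor of 70 comes from the comment.

```python
import math


def desired_instances(requests_per_sec, capacity_per_instance,
                      min_instances=70, max_instances=120):
    """Return a target instance count clamped to [min_instances, max_instances].

    min_instances=70 mirrors the floor mentioned in comment 1.
    max_instances=120 and capacity_per_instance are assumptions made
    purely for illustration.
    """
    if capacity_per_instance <= 0:
        raise ValueError("capacity_per_instance must be positive")
    # Instances needed to serve the current load with no headroom.
    needed = math.ceil(requests_per_sec / capacity_per_instance)
    # The floor keeps spare capacity around for sudden bursts;
    # the ceiling bounds cost.
    return max(min_instances, min(needed, max_instances))


if __name__ == "__main__":
    # Quiet period: load alone would call for 40, but the floor holds at 70.
    print(desired_instances(400, 10))
    # Burst: load calls for 90, which sits between the floor and the ceiling.
    print(desired_instances(900, 10))
```

With a floor like this, the running-instance graph stops tracking request volume during quiet periods, which would match the observation that the count rose without a corresponding spike in requests.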
Comment 2 (Reporter) • 7 years ago
Ahhhh, okay. Thanks!
Status: NEW → RESOLVED
Closed: 7 years ago
Resolution: --- → FIXED