Bug 1179132
Opened 10 years ago
Closed 10 years ago
Determine/monitor whether GA UT spikes will require Amazon pre-warming
Categories
(Cloud Services Graveyard :: Metrics: Pipeline, defect, P1)
Tracking
(Not tracked)
RESOLVED
FIXED
People
(Reporter: whd, Assigned: whd)
Details
(Whiteboard: [rC] [unifiedTelemetry])
The client fuzzes the daily ping submission over an hour (see bug #1140037), which gives our servers a chance to cope with the load for timezones where we have a lot of users. As noted in that bug, however, the spikes may still be large enough to overwhelm the ELB, depending on what GA traffic actually looks like. This should not cause data loss if the client retries when receiving a 5xx code, but it is obviously not desirable. We should be able to monitor ELB metrics (e.g. surge queue) to determine whether we need to file a support request with Amazon to keep the ELBs warm, which can be accomplished with:
https://mana.mozilla.org/wiki/display/SVCOPS/AWS+Pre-warming+ELBs
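As a rough illustration of the surge queue monitoring idea, here is a minimal sketch that pulls the peak `SurgeQueueLength` for a classic ELB from CloudWatch. It assumes a boto3-style CloudWatch client is passed in. The load balancer name, the 24 hour window, and the threshold of 100 are hypothetical placeholders, not values from this bug.

```python
from datetime import datetime, timedelta, timezone


def max_surge_queue(cloudwatch, elb_name, hours=24, period=300):
    """Return the largest SurgeQueueLength datapoint in the window, or 0 if none."""
    end = datetime.now(timezone.utc)
    start = end - timedelta(hours=hours)
    resp = cloudwatch.get_metric_statistics(
        Namespace="AWS/ELB",
        MetricName="SurgeQueueLength",
        Dimensions=[{"Name": "LoadBalancerName", "Value": elb_name}],
        StartTime=start,
        EndTime=end,
        Period=period,            # five-minute buckets
        Statistics=["Maximum"],
    )
    return max((p["Maximum"] for p in resp.get("Datapoints", [])), default=0)


def needs_prewarm_request(max_queue, threshold=100):
    """Hypothetical rule: a sustained non-trivial surge queue suggests asking for pre-warming."""
    return max_queue >= threshold
```

With boto3 installed this could be called as `max_surge_queue(boto3.client("cloudwatch"), "my-elb")`, where `my-elb` is a made-up name. The right threshold would need to be picked from real traffic.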
Updated•10 years ago
Assignee: nobody → whd
Priority: -- → P1
Whiteboard: [rC] [unifiedTelemetry]
Comment 1•10 years ago
https://aws.amazon.com/articles/1636185810492479#pre-warming "AWS recommends spikes no more than 50% over a five minute interval". Looking at our current traffic pattern, we have a regular spike at about 14:45 PDT, which is the only spike fast enough to exceed the AWS recommended limit. Based on past experience with tiles, I think this particular spike is likely to smooth out enough when UT hits GA that it won't be an issue. The moment when UT hits GA might be different, but it should only happen once, and the retry will take care of it. Since this is non-lossy (because of retries), I think we should set up monitoring and deal with this reactively instead of proactively. Specifically, I think we should set monitoring to alert on 5xx spikes from the ELB.
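For reference, the 50% guideline can be checked against request counts with something like the sketch below. It reads the guideline as "request volume should not grow by more than 50% from one five-minute bucket to the next", which is my interpretation. The function name and the sample numbers are made up for illustration and are not our real traffic.

```python
def fast_spikes(counts, max_growth=0.5):
    """Return labels of five-minute buckets that grew more than max_growth over the previous one.

    counts: list of (label, request_count) for consecutive five-minute buckets.
    """
    flagged = []
    for (_, prev), (label, cur) in zip(counts, counts[1:]):
        # Skip empty buckets to avoid dividing by zero.
        if prev > 0 and (cur - prev) / prev > max_growth:
            flagged.append(label)
    return flagged


# Made-up sample: only the jump into the 14:45 bucket exceeds 50%.
sample = [("14:35", 1000), ("14:40", 1100), ("14:45", 1800), ("14:50", 1900)]
print(fast_spikes(sample))
```

Run against real ELB request counts, a check like this would show which daily spikes, if any, are steep enough to matter.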
Updated•10 years ago
Iteration: --- → 43.1 - Aug 24
Comment 2•10 years ago (Assignee)
I agree with :relud's assessment here. We have an alert for ELB 5xxs, and given the soft launch at 5%, we can monitor this and, if needed, deal with it more proactively for full release volume. Closing.
Status: NEW → RESOLVED
Closed: 10 years ago
Resolution: --- → FIXED
Updated•7 years ago
Product: Cloud Services → Cloud Services Graveyard