Closed
Bug 1220585
Opened 9 years ago
Closed 9 years ago
[snippets] AVG Response time doubled
Categories
(Infrastructure & Operations Graveyard :: WebOps: Engagement, task)
Tracking
(Not tracked)
VERIFIED
FIXED
People
(Reporter: giorgos, Unassigned)
References
Details
(Whiteboard: [kanban:https://webops.kanbanize.com/ctrl_board/2/2077] )
New Relic reports that the average response time for snippets.mozilla.com-python doubled from ~25ms to ~50ms on Oct 28th. https://rpm.newrelic.com/accounts/263620/applications/2904874 Snippet editors report a significant performance degradation. This is probably related to the PHX1 exit. Is this something we can fix with more machine power, or is it another bottleneck?
Reporter
Comment 1•9 years ago
Bumping importance to major since this reduces our ability to work with the service.
Severity: normal → major
Updated•9 years ago
Change Request: --- → emergency
Comment 2•9 years ago
Giorgos, we know what's causing this and are putting in an emergency change request (because of the change freeze) to remedy it. Thanks!
Comment 3•9 years ago
It looks like one of the virtual IPs that should have gotten shuffled in the move got missed. =\ I've switched that IP over to the new load balancing cluster and the response times in New Relic look more like the ones from a couple of weeks ago. @giorgos: Can you verify that the snippets editors are able to work with the service again?
Comment 4•9 years ago
We've backed this out because it failed after 10-15 mins of being fine. I've spoken to Ben Sternthal and we'll revisit this after the change freeze is over on the 11th Nov.
Change Request: emergency → not successful
Comment 6•9 years ago
Can we call November 11th the delivery date for this? It makes it very hard to edit and deploy snippets, and we're in the middle of a campaign cycle.
Reporter
Comment 7•9 years ago
Folks, we need some eyes on this. Can you give us an ETA?
Flags: needinfo?(smani)
Comment 8•9 years ago
(In reply to Giorgos Logiotatidis [:giorgos] from comment #7)
> Folks, we need some eyes on this. Can you give us an ETA?

We're working on this with netops, should be done tomorrow.
Flags: needinfo?(smani)
Comment 9•9 years ago
James,

These are the IPs that are moving:

[shyam@snippetsadm1.private.phx1 ~]$ host snippets-rw-zeus.phx1.mozilla.com
snippets-rw-zeus.phx1.mozilla.com has address 10.8.70.90
[shyam@snippetsadm1.private.phx1 ~]$ host snippets-ro-zeus.phx1.mozilla.com
snippets-ro-zeus.phx1.mozilla.com has address 10.8.70.99
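
When comparing VIPs before and after a move like this one, the `host` output in this comment can be turned into a name-to-address mapping and diffed. A minimal sketch, assuming the "has address" line format shown here; `parse_host_output`, `HOST_LINE`, and `SAMPLE` are hypothetical names for illustration, not part of any Mozilla tooling.

```python
import re

# Matches lines such as "name.example.com has address 10.0.0.1",
# the format printed by the `host` lookups quoted in this comment.
HOST_LINE = re.compile(
    r"^(?P<name>\S+) has address (?P<ip>\d{1,3}(?:\.\d{1,3}){3})$"
)


def parse_host_output(text):
    """Return {hostname: [ip, ...]} for every 'has address' line in text."""
    result = {}
    for line in text.splitlines():
        match = HOST_LINE.match(line.strip())
        if match:
            result.setdefault(match.group("name"), []).append(match.group("ip"))
    return result


# Sample input copied from the lookups in this comment (prompts omitted).
SAMPLE = """\
snippets-rw-zeus.phx1.mozilla.com has address 10.8.70.90
snippets-ro-zeus.phx1.mozilla.com has address 10.8.70.99
"""

vips = parse_host_output(SAMPLE)
for name, ips in vips.items():
    print(name, ips)
```

Running the same parse on output captured after the move and comparing the two dictionaries would show which names still point at the old addresses.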
Flags: needinfo?(jbarnell)
Comment 10•9 years ago
We'll do this at 1100 PST tomorrow. If this goes south, we will need some time to see why and therefore the site might be offline for a bit while we do that digging around. Thanks!
Flags: needinfo?(jbarnell)
Comment 11•9 years ago
(In reply to Shyam Mani [:fox2mike] from comment #10)
> We'll do this at 1100 PST tomorrow. If this goes south, we will need some
> time to see why and therefore the site might be offline for a bit while we
> do that digging around.
>
> Thanks!

Hi Shyam!

Can you elaborate on which elements of the site will be offline?

Is it just the snippets admin? Or is about:home also at risk?
Flags: needinfo?(smani)
Comment 12•9 years ago
(In reply to Cory Price [:ckprice] from comment #11)
> Can you elaborate on which elements of the site will be offline?

Sorry, "which elements of the site _may_ be offline"
Reporter
Comment 13•9 years ago
(In reply to Cory Price [:ckprice] from comment #11)
> (In reply to Shyam Mani [:fox2mike] from comment #10)
> > We'll do this at 1100 PST tomorrow. If this goes south, we will need some
> > time to see why and therefore the site might be offline for a bit while we
> > do that digging around.
> >
> > Thanks!
> Hi Shyam!
>
> Can you elaborate on which elements of the site will be offline?
>
> Is it just the snippets admin? Or is about:home also at risk?

If the snippets servers go down, users do not get a degraded experience, i.e. about:home works as expected. The only difference is that some users (the ones who do not hit the cache) will not get new snippets while the downtime lasts, which will affect the snippet views and metrics.
Comment 14•9 years ago
We have updated the VIPs to host from neo-phx1 and presuming they remain in good standing, we're done here.
Flags: needinfo?(smani)
Updated•9 years ago
Status: NEW → RESOLVED
Closed: 9 years ago
Resolution: --- → FIXED
Reporter
Comment 15•9 years ago
I verify that the response time dropped back to ~20ms according to NR. Thanks folks!
Status: RESOLVED → VERIFIED
Updated•8 years ago
Product: Infrastructure & Operations → Infrastructure & Operations Graveyard