Bug 1289772
Opened 8 years ago
Closed 8 years ago
Stage submitter down to about a tenth again
Categories
(Socorro :: Infra, task, P1)
Tracking
(Not tracked)
RESOLVED
FIXED
People
(Reporter: peterbe, Unassigned)
Attachments
(1 file)
455.00 KB, image/png
The stage website is down, but I can query Elasticsearch locally. Since Sunday, 24 July, the stage submitter has been down to about a tenth of its usual volume.
Comment 1•8 years ago
Making this critical and a P1 since it impacts development. I checked Datadog and this graph suggests the stage submitter dipped pretty significantly on Saturday, July 23rd at 17:10: https://app.datadoghq.com/monitors#238219?group=all&from_ts=1469160419267&to_ts=1469624916564 On that date, it drops from 300ish to 50ish. Since that's not 0, we're not getting any email notification. Why is it at 50ish? Is it possible we're getting crashes from other sources, like testing? I'll log into the node now and see what I can see.
Severity: normal → critical
Priority: -- → P1
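For illustration only: the comment above notes that a drop from 300ish to 50ish sends no email because the count never reaches 0. A minimal sketch of a relative-drop check, which would catch that kind of partial dip, might look like the following. The function name `should_alert`, the `drop_ratio` parameter, and the sample counts are all hypothetical; this is not the actual Datadog monitor configuration.

```python
from statistics import median


def should_alert(recent_counts, current_count, drop_ratio=0.5):
    """Return True when current_count is below drop_ratio of the recent median.

    recent_counts: submission counts from earlier, healthy intervals (assumed input).
    drop_ratio: hypothetical threshold; 0.5 means "alert below half of normal".
    """
    if not recent_counts:
        return False  # no baseline to compare against
    baseline = median(recent_counts)
    if baseline <= 0:
        return False  # a zero baseline cannot express a relative drop
    return current_count < baseline * drop_ratio


# Made-up numbers in the spirit of "300ish" and "50ish".
healthy = [300, 310, 295, 305]
print(should_alert(healthy, 50))   # True: a partial dip is caught
print(should_alert(healthy, 0))    # True: a full outage is also caught
print(should_alert(healthy, 290))  # False: ordinary variation
```

Using the median rather than the mean keeps one unusual interval from skewing the baseline; whether that suits the real traffic pattern is an assumption.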
Comment 2•8 years ago (Reporter)
Note, it also happened on 15 July: https://bugzilla.mozilla.org/show_bug.cgi?id=1288170. That occurrence was fixed, but the bug was never marked resolved.
Comment 3•8 years ago
I'm nixing the depends... I talked with Peter and the Datadog monitor pretty clearly shows the dip, so we don't need the -stage webapp working to work on this.
No longer depends on: 1289783
Comment 4•8 years ago
I logged into AWS and did a search on EC2 nodes for "submitter" and there are three of them. One is stopped. The other two are running. Shouldn't this be a highlander node? THERE CAN BE ONLY ONE SUBMITTER?
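For illustration only: the "only one submitter" question above can be sketched as a small check over a list of instance records. The record shape (`name`, `state`), the function names, and the sample node names are all hypothetical; this does not call AWS and is not how the search in the console was done.

```python
def running_matches(instances, needle="submitter"):
    """Return the names of running instances whose name contains needle."""
    return [
        inst["name"]
        for inst in instances
        if needle in inst["name"] and inst["state"] == "running"
    ]


def is_singleton(instances, needle="submitter"):
    """True when exactly one matching instance is running."""
    return len(running_matches(instances, needle)) == 1


# Made-up inventory mirroring the situation described: three matches, one stopped.
inventory = [
    {"name": "example-submitter-a", "state": "running"},
    {"name": "example-submitter-b", "state": "running"},
    {"name": "example-submitter-c", "state": "stopped"},
    {"name": "example-webapp", "state": "running"},
]
print(running_matches(inventory))  # ['example-submitter-a', 'example-submitter-b']
print(is_singleton(inventory))     # False
```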
Comment 5•8 years ago
The node named "prod-submitter" has no logs in /var/logs/socorro/, so I can't see what's going on. I think JP said it's running in the background via screen, but that's not something I want to fiddle with without talking to JP first. I can't connect to the node named "prod-submitter5". The AWS console suggests it's running, but it's failing half its health checks. I'm going to have to talk to JP before I do anything with that, too.
Comment 6•8 years ago
I restarted the process once again, so we're back up to the correct number of submissions. We have a bug associated with this issue to 'realify' the prod submitter, which I *think* will resolve the issue.
Updated•8 years ago
Status: NEW → RESOLVED
Closed: 8 years ago
Resolution: --- → FIXED