Closed
Bug 975614
Opened 11 years ago
Closed 11 years ago
APK Production: Monitoring
Categories
(Cloud Services :: Operations: Marketplace, task, P1)
Tracking
(Not tracked)
RESOLVED
FIXED
People
(Reporter: oremj, Assigned: whd)
References
Details
No description provided.
Reporter
Comment 1•11 years ago
Note that we need, at least, an http check for sentry.
Reporter
Comment 2•11 years ago
Comment 3•11 years ago
We should also monitor the review instances at:
https://controller-review.apk.firefox.com
https://generator-review.apk.firefox.com
https://signer-review.apk.firefox.com
For signer:
signer.apk.firefox.com/system/tools
Comment 4•11 years ago
For controller:
/system/generator returns 203 on success
For generator:
/system/signer returns 203 on success
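A minimal sketch of how the two endpoint checks above could be written as a Nagios-style probe, assuming a plain GET and the conventional plugin exit codes (0 for OK, 2 for CRITICAL). The function names are hypothetical and this is not the check that was actually deployed; only the paths and the 203 status come from this bug.

```python
import urllib.error
import urllib.request

OK, CRITICAL = 0, 2  # conventional Nagios plugin exit codes


def evaluate(status, expected=203):
    """Map an HTTP status code to an (exit_code, message) pair."""
    if status == expected:
        return OK, f"OK - got HTTP {status}"
    return CRITICAL, f"CRITICAL - got HTTP {status}, expected {expected}"


def fetch_status(url, timeout=10.0):
    """Return the HTTP status of a GET on url, or None if it is unreachable."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        # 4xx/5xx responses still carry a status code worth reporting.
        return err.code
    except (urllib.error.URLError, OSError):
        # DNS failure, refused connection, timeout, and similar.
        return None


def check(url, expected=203, fetch=fetch_status):
    """Probe url and report whether it answered with the expected status."""
    status = fetch(url)
    if status is None:
        return CRITICAL, f"CRITICAL - {url} unreachable"
    return evaluate(status, expected)
```

For the controller the probe would target `/system/generator`, and for the generator `/system/signer`, for example `check("https://controller.apk.firefox.com/system/generator")`. The `fetch` parameter exists only so the logic can be exercised without network access.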
Reporter
Updated•11 years ago
Assignee: server-ops-amo → whd
Comment 5•11 years ago
Added to PHX1 nagios:
https://controller.apk.firefox.com
AWS monitoring:
https://generator.apk.firefox.com
https://signer.apk.firefox.com
Assignee
Comment 6•11 years ago
Some basic monitoring (controller sites and sentry http) is available here: https://opsview.shared.us-west-2.prod.mozaws.net/status/hostgroup?parentid=1 (web admin password is in hiera)
Reporter
Updated•11 years ago
Priority: -- → P1
Assignee
Comment 8•11 years ago
controller{,-review}.apk.firefox.com (which now respond to pings) have been added to PHX1 nagios. I'll be adding AWS monitoring presently for the other sites.
Assignee
Comment 9•11 years ago
https://generator.apk.firefox.com
https://signer.apk.firefox.com
https://generator-review.apk.firefox.com
https://signer-review.apk.firefox.com
All of these internal ELBs now have AWS API ELB checks that will alert if any host behind them is unhealthy. The alerts should go to both pagerduty (APK escalation path) and amohubot.
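As a rough illustration of the "alert if any host behind the ELB is unhealthy" rule, here is a sketch of the decision logic only. It assumes the health data has the shape returned by boto3's classic ELB `describe_instance_health` call (an `InstanceStates` list whose entries have `InstanceId` and `State`). How the real checks obtain that data is not stated in this bug, and the function names and instance IDs are hypothetical.

```python
def unhealthy_instances(health):
    """Return the IDs of instances whose state is anything but InService."""
    return [
        state["InstanceId"]
        for state in health.get("InstanceStates", [])
        if state.get("State") != "InService"
    ]


def should_alert(health):
    """Alert as soon as at least one instance behind the ELB is unhealthy."""
    return bool(unhealthy_instances(health))


# Hypothetical sample data, shaped like a describe_instance_health response.
sample = {
    "InstanceStates": [
        {"InstanceId": "i-aaaa1111", "State": "InService"},
        {"InstanceId": "i-bbbb2222", "State": "OutOfService"},
    ]
}
```

With the sample above, `unhealthy_instances(sample)` yields the single out-of-service instance ID, and `should_alert(sample)` is true. Note that this sketch does not alert on an ELB with no registered instances; whether that case should page is a separate policy decision.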
Assignee
Comment 10•11 years ago
Stackdriver<->Pagerduty is set up now such that any "PROD - APK Factory" health check failure (something turning red) in Stackdriver will send an alert to Pagerduty. I can set up an SNS hook to add IRC notifications as well.
Assignee
Comment 11•11 years ago
I misunderstood the meaning of the /system/signer and /system/generator checks. They are now implemented correctly at:
https://opsview.shared.us-west-2.prod.mozaws.net/cgi-bin/extinfo.cgi?type=2&service=APK+Controller+Generator+access&host=controller-review.apk.firefox.com
https://opsview.shared.us-west-2.prod.mozaws.net/cgi-bin/extinfo.cgi?type=2&service=APK+Controller+Generator+access&host=controller.apk.firefox.com
https://opsview.shared.us-west-2.prod.mozaws.net/cgi-bin/extinfo.cgi?type=2&service=APK+Generator+Signer+access&host=generator.apk.firefox.com
https://opsview.shared.us-west-2.prod.mozaws.net/cgi-bin/extinfo.cgi?type=2&service=APK+Generator+Signer+access&host=generator-review.apk.firefox.com
Opsview -> Pagerduty is having some issues when I try to segment the alerts so that only APK alerts go to APK escalation (and IRC); I'm addressing that presently.
Assignee
Comment 12•11 years ago
I have fixed the Opsview escalation problem. Everything spec'd here is implemented, but we can always add more and tweak as needed. Closing this.
Status: NEW → RESOLVED
Closed: 11 years ago
Resolution: --- → FIXED
Updated•10 years ago
Component: Server Operations: AMO Operations → Operations: Marketplace
Product: mozilla.org → Mozilla Services