Closed Bug 975614 Opened 11 years ago Closed 11 years ago

APK Production: Monitoring

Categories

(Cloud Services :: Operations: Marketplace, task, P1)

x86
macOS
task

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: oremj, Assigned: whd)

No description provided.
Blocks: 960779
Blocks: 960764
Note that we need, at least, an http check for sentry.
We should also monitor the review instances at:
https://controller-review.apk.firefox.com
https://generator-review.apk.firefox.com
https://signer-review.apk.firefox.com
For signer: signer.apk.firefox.com/system/tools
For controller: /system/generator returns 203 on success
For generator: /system/signer returns 203 on success
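A rough sketch of what an HTTP check against the 203-on-success endpoints could look like, in case it helps when wiring these into a monitoring system. This is not the check that was deployed. Pairing the review hostnames with the /system/... paths is an assumption based on the comments above, and the names `CHECKS`, `fetch_status`, and `run_checks` are made up for illustration.

```python
import urllib.error
import urllib.request

# Status code the endpoints are said to return on success (from the comment above).
EXPECTED_STATUS = 203

# Hypothetical check table: host/path pairing is an assumption, not confirmed config.
CHECKS = {
    "controller-review": "https://controller-review.apk.firefox.com/system/generator",
    "generator-review": "https://generator-review.apk.firefox.com/system/signer",
}


def fetch_status(url, timeout=10.0):
    """Return the HTTP status code for a GET of url, or None if unreachable."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        # 4xx/5xx responses still carry a status code worth reporting.
        return err.code
    except (urllib.error.URLError, OSError):
        # DNS failure, connection refused, timeout, and similar.
        return None


def run_checks(checks, fetch=fetch_status, expected=EXPECTED_STATUS):
    """Map each check name to (ok, status), where ok means status == expected."""
    results = {}
    for name, url in checks.items():
        status = fetch(url)
        results[name] = (status == expected, status)
    return results


if __name__ == "__main__":
    for name, (ok, status) in run_checks(CHECKS).items():
        print(f"{name}: {'OK' if ok else 'CRITICAL'} (status={status})")
```

The `fetch` parameter is injectable so the pass/fail logic can be exercised without touching the network.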
Assignee: server-ops-amo → whd
Some basic monitoring (controller sites and sentry http) is available here: https://opsview.shared.us-west-2.prod.mozaws.net/status/hostgroup?parentid=1 (web admin password is in hiera)
Priority: -- → P1
controller{,-review}.apk.firefox.com (which now respond to pings) have been added to PHX1 nagios. I'll be adding AWS monitoring presently for the other sites.
https://generator.apk.firefox.com
https://signer.apk.firefox.com
https://generator-review.apk.firefox.com
https://signer-review.apk.firefox.com
All these internal ELBs have AWS API ELB checks now that will alert if any hosts behind them are unhealthy, and the alerts should go to both pagerduty (APK escalation path) and amohubot.
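For reference, the "alert if any hosts behind them are unhealthy" rule can be sketched roughly as below. This is an illustration only, not the check that is actually running. It assumes the input has the shape of a classic ELB `describe_instance_health` response (an `InstanceStates` list with `InstanceId` and `State` fields); the function names and the sample instance IDs are hypothetical.

```python
def unhealthy_instances(response):
    """Return the IDs of instances whose State is anything other than InService."""
    return [
        state["InstanceId"]
        for state in response.get("InstanceStates", [])
        if state.get("State") != "InService"
    ]


def should_alert(response):
    """Alert when at least one host behind the ELB is not healthy."""
    return len(unhealthy_instances(response)) > 0


if __name__ == "__main__":
    # Made-up sample data in the assumed response shape.
    sample = {
        "InstanceStates": [
            {"InstanceId": "i-aaaa1111", "State": "InService"},
            {"InstanceId": "i-bbbb2222", "State": "OutOfService"},
        ]
    }
    print(unhealthy_instances(sample))
    print(should_alert(sample))
```

Treating every state other than InService as unhealthy (including Unknown) errs on the side of alerting.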
Stackdriver<->Pagerduty is set up now such that any "PROD - APK Factory" health check failure (something turning red) in Stackdriver will send an alert to Pagerduty. I can set up an SNS hook to add IRC notifications as well.
I have fixed the Opsview escalation problem. Everything spec'd here is implemented, but we can always add more and tweak as needed. Closing this.
Status: NEW → RESOLVED
Closed: 11 years ago
Resolution: --- → FIXED
Component: Server Operations: AMO Operations → Operations: Marketplace
Product: mozilla.org → Mozilla Services