Closed
Bug 1412372
Opened 7 years ago
Closed 6 years ago
Check/improve monitoring for wiki.mozilla.org
Categories
(Infrastructure & Operations :: MOC: Service Requests, task)
Tracking
(Not tracked)
RESOLVED
FIXED
People
(Reporter: ericz, Assigned: fauweh)
Attachments
(1 file: image/png, 124.99 KB)
wiki.mozilla.org just migrated to Nubis/AWS, so now is a good time to make sure we have some external monitoring on it (Pingdom?) and/or use the built-in tooling such as Prometheus.
Comment 1 (Assignee) • 7 years ago
Eric, we have a basic uptime check in NR Synthetics for wiki.m.o. See the attached image for performance over the past 3 days. NR has been seeing 503 errors fairly regularly since the cutover to Nubis. We can also turn this into a scripted browser test to perform a search or look for specific elements. P.S. I don't see that the alerts made it to PagerDuty, so I'll be reaching out to New Relic to see why we weren't alerted, to give you guys a heads up on these 503s.
Assignee: nobody → kferrando
Status: NEW → ASSIGNED
Comment 2 (Reporter) • 7 years ago
Thanks Keegan! We're increasing our instance sizes to fight those 503s; I'll reach out to you when that's done.
Comment 3 (Reporter) • 7 years ago
Those 503s should be gone now (I did some stress testing of the site this morning, so that may show up as an exception). The NR Synthetics basic uptime check seems good. Is there any other monitoring that we can do here?
Comment 4 (Assignee) • 7 years ago
The last error was at Oct 30th 2017, 7:57:16, so I think we're good there, Eric. We can turn this check into a true transaction test and have it perform some action, such as logging in or searching for a known page. Do you have any preference or recommendation here to ensure the service is truly functional?
Flags: needinfo?(eziegenhorn)
Comment 5 (Reporter) • 7 years ago
Yes, fetching a known page would be great. https://wiki.mozilla.org/WeeklyUpdates is the page for the weekly project meeting and is likely to continue to exist; perhaps that would work well? We could make an account for monitoring to log in with, though I don't know of a way to make it read-only, as would be ideal. It'd be a good additional step; I'll make a bug for that.
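For illustration only, a known-page check of the kind discussed here could be sketched in Python as below. This is not the NR Synthetics check itself. The URL comes from this comment; the marker text `WeeklyUpdates`, the 10-second timeout, and the function names `http_get` and `check_known_page` are assumptions made for the sketch.

```python
import urllib.error
import urllib.request
from typing import Callable, Tuple

# URL from the comment above. The marker text is an assumption: any string
# that reliably appears in the rendered page would do.
PAGE_URL = "https://wiki.mozilla.org/WeeklyUpdates"
EXPECTED_TEXT = "WeeklyUpdates"


def http_get(url: str, timeout: float = 10.0) -> Tuple[int, str]:
    """Return (status_code, body). Status 0 means no HTTP response was received."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status, resp.read().decode("utf-8", errors="replace")
    except urllib.error.HTTPError as err:
        # 4xx/5xx responses (such as the 503s above) are raised as HTTPError.
        return err.code, err.read().decode("utf-8", errors="replace")
    except OSError as err:
        # DNS failures, refused connections, and timeouts end up here.
        return 0, str(err)


def check_known_page(
    url: str = PAGE_URL,
    expected: str = EXPECTED_TEXT,
    fetch: Callable[[str], Tuple[int, str]] = http_get,
) -> Tuple[bool, str]:
    """Pass only if the page returns HTTP 200 and contains the expected text."""
    status, body = fetch(url)
    if status != 200:
        return False, f"unexpected HTTP status {status}"
    if expected not in body:
        return False, f"expected text {expected!r} not found"
    return True, "ok"
```

Checking the body as well as the status code catches the case where the site answers 200 with an error page. The `fetch` parameter is there so the logic can be exercised without network access.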
Flags: needinfo?(eziegenhorn)
Comment 6 (Reporter) • 7 years ago
How about I request a monitoring account, and you get the email and finish creating the account? Is moc@mozilla.com a good address for your group?
Comment 7 (Assignee) • 7 years ago
(In reply to Eric Ziegenhorn :ericz from comment #5)
> Yes fetching a known page would be great.
> https://wiki.mozilla.org/WeeklyUpdates is the page for the weekly project
> meeting and is likely to continue to exist, perhaps that'd work well?
I will modify the check to hit this page.
> We could make an account for monitoring to log in with though I don't know
> of a way to make it read-only as would be ideal. It'd be a good additional
> step, I'll make a bug for that.
Sounds good!
> How about I request an account for a monitoring account and you can get the
> email and finish creating the account? Is moc@mozilla.com a good address
> for your group?
That works. Yes, that's a good email address to use as well. Thanks Eric!
Comment 8 (Reporter) • 7 years ago
Great, moc@mozilla.com account requested. You should be able to click a link in the email to set your password.
Comment 9 • 7 years ago
Bulk change per https://bugzilla.mozilla.org/show_bug.cgi?id=1417607
QA Contact: lypulong → kferrando
Comment 10 • 6 years ago
Changing default QA contact per https://bugzilla.mozilla.org/show_bug.cgi?id=1431393
QA Contact: kferrando → mcristofi
Comment 11 (Assignee) • 6 years ago
Well, I apparently went to lunch after comment 8 and never came back. I was looking into this monitoring as part of the incident in bug 1442962. I have added the wiki.m.o check to test that login works and that we can navigate to the weekly updates page after logging in. Note: login is not required for the weekly updates page; the single check just validates both functions to keep the check cost overhead down.
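As a rough sketch of the "one check, two validations" idea in this comment, the Python below runs named steps in order and reports the first one that fails. It is not the actual Synthetics script: `run_combined_check` and the step names are hypothetical, and the stand-in lambdas in the usage lines mark where real login and page-fetch logic would go.

```python
from typing import Callable, List, Tuple

# A step is a (name, callable) pair; the callable returns True on success.
Step = Tuple[str, Callable[[], bool]]


def run_combined_check(steps: List[Step]) -> Tuple[bool, str]:
    """Run the steps in order and stop at the first failure, naming the step."""
    for name, step in steps:
        try:
            passed = step()
        except Exception as exc:
            # A step that raises counts as a failure of that step.
            return False, f"{name}: raised {exc!r}"
        if not passed:
            return False, f"{name}: failed"
    return True, "all steps passed"


# Hypothetical usage: the lambdas stand in for real login and page checks.
result = run_combined_check([
    ("login", lambda: True),
    ("weekly updates page", lambda: True),
])
```

Naming each step keeps a single check cheap to run while still showing whether login or the page fetch was the part that broke.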
Status: ASSIGNED → RESOLVED
Closed: 6 years ago
Resolution: --- → FIXED