Check/improve monitoring for wiki.mozilla.org

Infrastructure & Operations
MOC: Service Requests
ASSIGNED
24 days ago
2 days ago

People

(Reporter: ericz, Assigned: fauweh)

Attachments

(1 attachment)

(Reporter)

Description

24 days ago
wiki.mozilla.org just migrated to Nubis/AWS, so now is a good time to make sure we have some external monitoring on it (Pingdom?) and/or use the built-in tooling such as Prometheus.
(Assignee)

Comment 1

24 days ago
Created attachment 8922914 [details]
wiki.m.o-synthetics.png

Eric, we have a basic uptime check in New Relic (NR) Synthetics for wiki.m.o. See the attached image for performance over the past 3 days. NR has been seeing 503 errors pretty regularly since the cutover to Nubis.

We can also turn this into a scripted browser test to perform a search or look for specific elements.
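
To illustrate what "look for specific elements" could mean in practice, here is a minimal Python sketch rather than an actual NR Synthetics script. It treats a response as healthy only if it is a 200 and the page contains an element with a given id. The function name `page_looks_healthy` and the element id `firstHeading` are assumptions picked for illustration, not something taken from the existing check.

```python
from html.parser import HTMLParser


class ElementFinder(HTMLParser):
    """Records whether an element with a given id appears in the page."""

    def __init__(self, wanted_id):
        super().__init__()
        self.wanted_id = wanted_id
        self.found = False

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs from the opening tag.
        if dict(attrs).get("id") == self.wanted_id:
            self.found = True


def page_looks_healthy(status, body, wanted_id="firstHeading"):
    """Return True only for a 200 response that contains the wanted element.

    "firstHeading" is an assumed element id, used purely for illustration.
    """
    if status != 200:
        return False
    finder = ElementFinder(wanted_id)
    finder.feed(body)
    return finder.found
```

A check along these lines would flag both a 503 and a 200 that serves an error page without the expected element, which a plain status-code check would miss.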

P.S. I don't see that the alerts made it to PagerDuty, so I'll be reaching out to New Relic to see why we weren't alerted, to give you guys a heads-up on these 503s.
Assignee: nobody → kferrando
Status: NEW → ASSIGNED
(Reporter)

Comment 2

24 days ago
Thanks Keegan!  We're increasing our instance sizes to fight those 503s; I'll reach out to you when that's done.
(Reporter)

Comment 3

21 days ago
Those 503s should be gone now (I did some stress testing of the site this morning, so that may show up as an exception).  The NR Synthetics basic uptime check seems good.  Is there any other monitoring that we can do here?
(Assignee)

Comment 4

16 days ago
The last error was at Oct 30th 2017, 7:57:16, so I think we're good there, Eric.

We can turn this check into a true transaction test and have it perform some action such as a login, search for a known page, etc. Do you have any preference or recommendation here to ensure the service is truly functional?
Flags: needinfo?(eziegenhorn)
(Reporter)

Comment 5

14 days ago
Yes, fetching a known page would be great.  https://wiki.mozilla.org/WeeklyUpdates is the page for the weekly project meeting and is likely to continue to exist; perhaps that'd work well?

We could make an account for monitoring to log in with, though I don't know of a way to make it read-only, as would be ideal.  It'd be a good additional step; I'll make a bug for that.
Flags: needinfo?(eziegenhorn)
(Reporter)

Comment 6

13 days ago
How about I request a monitoring account and you can get the email and finish creating the account?  Is moc@mozilla.com a good address for your group?
(Assignee)

Comment 7

13 days ago
(In reply to Eric Ziegenhorn :ericz from comment #5)
> Yes, fetching a known page would be great.
> https://wiki.mozilla.org/WeeklyUpdates is the page for the weekly project
> meeting and is likely to continue to exist; perhaps that'd work well?

I will modify the check to hit this page.
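
For reference, a rough Python sketch of what hitting this page amounts to, separate from the actual NR Synthetics configuration. The helper names `fetch_status` and `is_up`, the 10-second timeout, and the injectable `opener` parameter (there so the logic can be exercised without network access) are all assumptions made for illustration.

```python
import urllib.error
import urllib.request

# The known page suggested in comment 5.
KNOWN_PAGE = "https://wiki.mozilla.org/WeeklyUpdates"


def fetch_status(url, opener=urllib.request.urlopen, timeout=10):
    """Return the HTTP status code for url, or None if no response arrived."""
    try:
        with opener(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # urlopen raises HTTPError for 4xx/5xx responses such as a 503.
        return err.code
    except (urllib.error.URLError, OSError):
        # DNS failure, refused connection, or timeout: no HTTP response.
        return None


def is_up(url=KNOWN_PAGE, opener=urllib.request.urlopen):
    """Treat anything other than a 200 on the known page as a failure."""
    return fetch_status(url, opener=opener) == 200
```

Returning the status code separately from the up/down decision keeps a 503 distinguishable from a site that did not answer at all.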

> We could make an account for monitoring to log in with, though I don't know
> of a way to make it read-only, as would be ideal.  It'd be a good additional
> step; I'll make a bug for that.

Sounds good!

> How about I request a monitoring account and you can get the email and
> finish creating the account?  Is moc@mozilla.com a good address for your
> group?

That works. Yes, that's a good email address to use as well.

Thanks Eric!
(Reporter)

Comment 8

13 days ago
Great, moc@mozilla.com account requested.  You should be able to click a link in the email to set your password.
Bulk change per https://bugzilla.mozilla.org/show_bug.cgi?id=1417607
QA Contact: lypulong → kferrando