Bug 1281977: auth checks in pingdom
Opened 8 years ago, closed 8 years ago
Status: RESOLVED FIXED
Category: Infrastructure & Operations :: MOC: Projects (task)
Tracking: Not tracked
People: Reporter: mdevney, Assigned: mdevney
Attachments: 1 file (141.33 KB, image/png)
The following auth checks are showing as down in Pingdom, but are not alerting in any other way. Is this expected behavior? Bad checks? Actionable alerts?
mxr.mozilla.org/addons
mxr.mozilla.org/amo-stage
passwordreset-dev.allizom.org
passwordrest.allizom.org
phonebook-dev.allizom.org
phonebook.allizom.org
phonebook.mozilla.org
plugins.allizom.org
pto.allizom.org
www-demo1.allizom.org
www-demo3.allizom.org
www-demo4.allizom.org
www-dev.allizom.org
By contrast, many other auth checks are fine, such as reps.mozilla.org, pto.mozilla.org, and netops-apps.allizom.org.
Comment 1 (Assignee) • 8 years ago
Tested phonebook.mozilla.org and intranet.mozilla.org. I can log in to both. I suspect the checks are wrong.
Comment 2 (Assignee) • 8 years ago
Both these checks are configured to go to https://<URL> and look for a 401 code.
In prod (plugins.mozilla.org) that happens:
$ curl --head https://plugins.mozilla.org
HTTP/1.1 401 Authorization Required
Server: Apache
X-Backend-Server: plugins2.webapp.phx1.mozilla.com
WWW-Authenticate: Basic realm="Use your ldap username/password"
Vary: User-Agent, Accept-Encoding
Cache-Control: max-age=3600
Content-Type: text/html; charset=iso-8859-1
Strict-Transport-Security: max-age=31536000; includeSubDomains
Date: Fri, 24 Jun 2016 03:04:03 GMT
Expires: Fri, 24 Jun 2016 4:04:03 GMT
Transfer-Encoding: chunked
Via: Moz-Cache-zlb1
Connection: Keep-Alive
X-Cache-Info: not cacheable; response code not cacheable
In plugins.allizom.org we get something completely different:
$ curl --head https://plugins.allizom.org
HTTP/1.1 303 See Other
Server: Apache
X-Backend-Server: plugins2.stage.webapp.phx1.mozilla.com
Cache-Control: private, must-revalidate
Content-Type: text/html; charset=iso-8859-1
Date: Fri, 24 Jun 2016 03:04:14 GMT
Location: https://plugins.allizom.org/mellon/login?ReturnTo=https%3A%2F%2Fplugins.allizom.org%2F&IdP=http%3A%2F%2Fwww.okta.com%2Fexk17zkmwil4G5FGu1d8
Transfer-Encoding: chunked
Connection: Keep-Alive
That backend server plugins2.stage.webapp.phx1.mozilla.com belongs to webops, so maybe ask them?
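A minimal sketch of the difference between the two responses above, in Python using only the standard library. The function name `classify_auth` and the labels it returns are made up for illustration; it only inspects pasted `curl --head` output and makes no network calls.

```python
def classify_auth(head_output):
    """Classify pasted `curl --head` output by the auth style it suggests.

    Returns "basic-auth" for a 401 with a Basic WWW-Authenticate header,
    "mellon-redirect" for a 303 whose Location points at /mellon/login,
    and "other" for anything else. The labels are illustrative only.
    """
    lines = [ln.strip() for ln in head_output.strip().splitlines() if ln.strip()]
    if not lines:
        return "other"

    # The status line looks like: HTTP/1.1 303 See Other
    parts = lines[0].split()
    status = int(parts[1]) if len(parts) > 1 and parts[1].isdigit() else None

    # Collect the remaining lines as case-insensitive headers.
    headers = {}
    for ln in lines[1:]:
        name, sep, value = ln.partition(":")
        if sep:
            headers[name.strip().lower()] = value.strip()

    if status == 401 and headers.get("www-authenticate", "").lower().startswith("basic"):
        return "basic-auth"
    if status == 303 and "/mellon/login" in headers.get("location", ""):
        return "mellon-redirect"
    return "other"


# Trimmed copies of the two responses shown in this comment.
PROD = """HTTP/1.1 401 Authorization Required
WWW-Authenticate: Basic realm="Use your ldap username/password"
"""
STAGE = """HTTP/1.1 303 See Other
Location: https://plugins.allizom.org/mellon/login?ReturnTo=https%3A%2F%2Fplugins.allizom.org%2F&IdP=http%3A%2F%2Fwww.okta.com%2Fexk17zkmwil4G5FGu1d8
"""

print(classify_auth(PROD))   # basic-auth
print(classify_auth(STAGE))  # mellon-redirect
```

A check that expects a 401 can only pass against the first kind of response, which matches what prod and stage show here.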
Comment 3 (Assignee) • 8 years ago
< gcox> mxr is dead, fwiw. see the tail of 1279952.
https://bugzilla.mozilla.org/show_bug.cgi?id=1279952
Removing mxr from pingdom.
Updated • 8 years ago
Assignee: nobody → mdevney
Component: MOC: Problems → MOC: Projects
Comment 4 • 8 years ago
Auth for multiple websites was moved from LDAP to 2FA/Okta/Mellon. This would have broken the checks, since the websites now do auth via a web login flow rather than HTTP auth.
Comment 5 (Assignee) • 8 years ago
Thanks Ashish. Should we remove these checks? Or update them for 2fa/etc.?
Comment 6 (Assignee) • 8 years ago
20:35 < atoll> if anything pingdom alarms about securitywiki, change it from expecting 401 to expect 303.
20:36 < jlaz> atoll: roger dodger
20:45 < jedi> atoll: does that go for any site that's been switched over from ldap auth to 2fa/okta ?
20:45 < atoll> yes
20:46 < jedi> :D
Comment 7 (Assignee) • 8 years ago
Updated pingdom checks from 401 to 303. Now they SHOULD work.
Example:
$ curl --head https://phonebook.mozilla.org
HTTP/1.1 303 See Other
Server: Apache
X-Backend-Server: generic3.webapp.phx1.mozilla.com
Vary: Accept-Encoding
Cache-Control: private, must-revalidate
Content-Type: text/html; charset=iso-8859-1
Date: Wed, 29 Jun 2016 05:14:04 GMT
Location: https://phonebook.mozilla.org/mellon/login?ReturnTo=https%3A%2F%2Fphonebook.mozilla.org%2F&IdP=http%3A%2F%2Fwww.okta.com%2Fexk17eqkpj9pJg4C11d8
Transfer-Encoding: chunked
Connection: Keep-Alive
Set-Cookie: X-Mapping-ddfpehom=A0CC3698F33F00F4F289BE6E35D1250A; path=/
X-Cache-Info: not cacheable; response code not cacheable
...but they don't.
Opened Pingdom support ticket 169759.
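For reference, the `Location` header in the example above can be decoded to see where the redirect goes. This is a small Python sketch with the standard library; the helper name `mellon_redirect_info` is hypothetical, and the URL is copied from the phonebook response in this comment.

```python
from urllib.parse import urlsplit, parse_qs


def mellon_redirect_info(location):
    """Split a Mellon login redirect URL into its interesting parts."""
    parts = urlsplit(location)
    query = parse_qs(parts.query)  # parse_qs also percent-decodes the values
    return {
        "host": parts.hostname,
        "path": parts.path,
        "return_to": query.get("ReturnTo", [None])[0],
        "idp": query.get("IdP", [None])[0],
    }


LOCATION = (
    "https://phonebook.mozilla.org/mellon/login"
    "?ReturnTo=https%3A%2F%2Fphonebook.mozilla.org%2F"
    "&IdP=http%3A%2F%2Fwww.okta.com%2Fexk17eqkpj9pJg4C11d8"
)

info = mellon_redirect_info(LOCATION)
print(info["path"])       # /mellon/login
print(info["return_to"])  # https://phonebook.mozilla.org/
print(info["idp"])        # http://www.okta.com/exk17eqkpj9pJg4C11d8
```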
Comment 8 (Assignee) • 8 years ago
Eric Masser (Pingdom)
Jun 29, 13:47 CEST
Hi Matthew,
Thank you for contacting us!
I am running a simple GET request from each of our probe servers and they are reporting a SSL/TLS issue.
Are you seeing the Transaction Monitors requests in the logs for your server?
Best regards,
Eric Masser
www.pingdom.com
Hi Eric,
What IP(s) or hostname(s) should I grep for?
Also, what ssl error do you see? I don't see any errors... Everything checks out with 'openssl s_client -connect host:443'
Please advise.
Hi again,
The TMS uses this user-agent string "(Mozilla/5.0 (Unknown; Linux x86_64) AppleWebKit/534.34 (KHTML, like Gecko) PingdomTMS/0.8.1 Safari/534.34)".
The issue is probably not a certificate issue, as our probes do not validate certificates.
I will have to ask our operations team to take a look as to what this TLS/SSL error might mean tomorrow.
Best regards,
Eric Masser
www.pingdom.com
Hi Eric,
Yes, we do see that user-agent hitting our hosts. See below.
Looks like that user-agent is properly getting served a 303 as expected.
access_2016-06-29-05.gz:64.120.6.122 - - [29/Jun/2016:05:27:19 +0000] "GET / HTTP/1.1" 303 352 "-" "Mozilla/5.0 (Unknown; Linux x86_64) AppleWebKit/534.34 (KHTML, like Gecko) PingdomTMS/0.8.5 PhantomJS/1.9.0 Safari/534.34"
access_2016-06-29-05.gz:64.120.6.122 - - [29/Jun/2016:05:27:19 +0000] "GET /mellon/login?ReturnTo=https%3A%2F%2Fintranet.mozilla.org%2F&IdP=http%3A%2F%2Fwww.okta.com%2Fexk17jns1dxOPXVPy1d8 HTTP/1.1" 303 1333 "-" "Mozilla/5.0 (Unknown; Linux x86_64) AppleWebKit/534.34 (KHTML, like Gecko) PingdomTMS/0.8.5 PhantomJS/1.9.0 Safari/534.34"
access_2016-06-29-05.gz:64.120.6.122 - - [29/Jun/2016:05:28:36 +0000] "GET / HTTP/1.1" 303 352 "-" "Mozilla/5.0 (Unknown; Linux x86_64) AppleWebKit/534.34 (KHTML, like Gecko) PingdomTMS/0.8.5 PhantomJS/1.9.0 Safari/534.34"
access_2016-06-29-05.gz:64.120.6.122 - - [29/Jun/2016:05:28:36 +0000] "GET /mellon/login?ReturnTo=https%3A%2F%2Fintranet.mozilla.org%2F&IdP=http%3A%2F%2Fwww.okta.com%2Fexk17jns1dxOPXVPy1d8 HTTP/1.1" 303 1337 "-" "Mozilla/5.0 (Unknown; Linux x86_64) AppleWebKit/534.34 (KHTML, like Gecko) PingdomTMS/0.8.5 PhantomJS/1.9.0 Safari/534.34"
access_2016-06-29-11.gz:64.120.6.122 - - [29/Jun/2016:11:41:49 +0000] "GET / HTTP/1.1" 303 352 "-" "Mozilla/5.0 (Unknown; Linux x86_64) AppleWebKit/534.34 (KHTML, like Gecko) PingdomTMS/0.8.5 PhantomJS/1.9.0 Safari/534.34"
access_2016-06-29-11.gz:64.120.6.122 - - [29/Jun/2016:11:41:49 +0000] "GET /mellon/login?ReturnTo=https%3A%2F%2Fintranet.mozilla.org%2F&IdP=http%3A%2F%2Fwww.okta.com%2Fexk17jns1dxOPXVPy1d8 HTTP/1.1" 303 1349 "-" "Mozilla/5.0 (Unknown; Linux x86_64) AppleWebKit/534.34 (KHTML, like Gecko) PingdomTMS/0.8.5 PhantomJS/1.9.0 Safari/534.34"
access_2016-06-29-11.gz:64.120.6.122 - - [29/Jun/2016:11:46:53 +0000] "GET / HTTP/1.1" 303 352 "-" "Mozilla/5.0 (Unknown; Linux x86_64) AppleWebKit/534.34 (KHTML, like Gecko) PingdomTMS/0.8.5 PhantomJS/1.9.0 Safari/534.34"
access_2016-06-29-11.gz:64.120.6.122 - - [29/Jun/2016:11:46:53 +0000] "GET /mellon/login?ReturnTo=https%3A%2F%2Fintranet.mozilla.org%2F&IdP=http%3A%2F%2Fwww.okta.com%2Fexk17jns1dxOPXVPy1d8 HTTP/1.1" 303 1345 "-" "Mozilla/5.0 (Unknown; Linux x86_64) AppleWebKit/534.34 (KHTML, like Gecko) PingdomTMS/0.8.5 PhantomJS/1.9.0 Safari/534.34"
Please advise what may be the matter. Thank you!
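A rough Python sketch of how the grep output above could be summarized by status code for the Pingdom user agent. The regular expression assumes the Apache combined log format with the `filename:` prefix that grep adds, as in the lines pasted here; `LOG_RE` and `pingdom_status_counts` are illustrative names, and the two sample lines are copied from this comment.

```python
import re
from collections import Counter

# Optional "file:" prefix from grep, then Apache combined log format.
LOG_RE = re.compile(
    r'^(?:(?P<file>[^:\s]+):)?(?P<ip>\d+\.\d+\.\d+\.\d+) \S+ \S+ '
    r'\[(?P<time>[^\]]+)\] "(?P<method>\S+) (?P<path>\S+) [^"]*" '
    r'(?P<status>\d{3}) (?P<size>\d+|-) "(?P<referer>[^"]*)" "(?P<agent>[^"]*)"$'
)


def pingdom_status_counts(lines):
    """Count response status codes for requests whose agent mentions PingdomTMS."""
    counts = Counter()
    for line in lines:
        match = LOG_RE.match(line.strip())
        if match and "PingdomTMS" in match.group("agent"):
            counts[int(match.group("status"))] += 1
    return counts


AGENT = (
    "Mozilla/5.0 (Unknown; Linux x86_64) AppleWebKit/534.34 (KHTML, like Gecko) "
    "PingdomTMS/0.8.5 PhantomJS/1.9.0 Safari/534.34"
)
SAMPLE = [
    'access_2016-06-29-05.gz:64.120.6.122 - - [29/Jun/2016:05:27:19 +0000] '
    '"GET / HTTP/1.1" 303 352 "-" "' + AGENT + '"',
    'access_2016-06-29-05.gz:64.120.6.122 - - [29/Jun/2016:05:27:19 +0000] '
    '"GET /mellon/login?ReturnTo=https%3A%2F%2Fintranet.mozilla.org%2F'
    '&IdP=http%3A%2F%2Fwww.okta.com%2Fexk17jns1dxOPXVPy1d8 HTTP/1.1" 303 1333 "-" "'
    + AGENT + '"',
]

counts = pingdom_status_counts(SAMPLE)
print(dict(counts))  # {303: 2}
```

Every logged request from that agent gets a 303, and none of the pasted lines shows a 200.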
Comment 9 • 8 years ago
From my reading of Pingdom's interactive UI for building a check, it does not offer the ability to check for 3XX codes, so you might explicitly confirm that understanding with Pingdom support.
I would also point out that the check does not pass even when we update it to look for the 200 that the redirect chain finally ends on.
Another thing that might be confusing Pingdom's check methods: there are multiple redirects before we even get to the 200 auth page.
curl -L -k -v https://phonebook.mozilla.org 2>&1|grep '< HTTP'
< HTTP/1.1 303 See Other
< HTTP/1.1 303 See Other
< HTTP/1.1 302 Moved Temporarily
< HTTP/1.1 200 OK
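The same chain can be pulled out of saved `curl -v` output programmatically. This is only a sketch in Python; `status_chain` and `final_status` are made-up helper names, and the sample text is the grep output shown above.

```python
import re

# Response status lines in `curl -v` output start with "< HTTP/".
STATUS_RE = re.compile(r"^< HTTP/\d(?:\.\d)? (\d{3})", re.MULTILINE)


def status_chain(verbose_output):
    """Return every response status code, in the order curl received them."""
    return [int(code) for code in STATUS_RE.findall(verbose_output)]


def final_status(verbose_output):
    """Return the last status in the chain, or None if there is none."""
    chain = status_chain(verbose_output)
    return chain[-1] if chain else None


OUTPUT = """< HTTP/1.1 303 See Other
< HTTP/1.1 303 See Other
< HTTP/1.1 302 Moved Temporarily
< HTTP/1.1 200 OK
"""

chain = status_chain(OUTPUT)
redirects = [code for code in chain if 300 <= code < 400]

print(chain)                 # [303, 303, 302, 200]
print(len(redirects))        # 3
print(final_status(OUTPUT))  # 200
```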
Comment 10 (Assignee) • 8 years ago
Eric Masser (Pingdom)
Jun 30, 16:26 CEST
Hi again,
It seems as though the responses from your API contain Location headers as well. This means that the Transaction Monitor will follow those redirects, just like a regular browser would.
Best regards,
Eric Masser
www.pingdom.com
Hi Eric,
There are 3 redirects here:
< HTTP/1.1 303 See Other
< HTTP/1.1 303 See Other
< HTTP/1.1 302 Moved Temporarily
< HTTP/1.1 200 OK
If any of those codes resulted in the check coming up green I think that would do it for us.
However, see screenshot attached (rca.png). "HTTP status code should be 303 but is unavailable." This error seems pretty clear that the agent just isn't seeing, or isn't reporting, the code 303.
The received header below seems to confirm this. Seems like it should show any/all HTTP headers that it saw, but it's showing none.
Please advise how to make this check work. Thanks!
Comment 11 (Assignee) • 8 years ago
Screenshot of pingdom's "Root Cause Analysis" pane showing no headers where headers should be.
Comment 12 (Assignee) • 8 years ago
Hi again,
The Transaction Monitor will only test the last response in a redirect chain, which means that you can't test the status code of the redirecting responses.
The logs you showed earlier only seem to contain 300 series status codes and not the final 200 OK which is seen if you load the URL in a browser.
It could be something advanced going on with the redirect to Okta. This does, however, work fine:
Go to URL https://intranet.mozilla.org/
Wait for element #user-signin to exist
Best regards,
Eric Masser
Comment 13 (Assignee) • 8 years ago
Eric's recommended check actually seems to work -- and is probably a more thorough check anyway. I've implemented it on most of the erroring auth checks.
Example check:
Go to URL https://intranet.mozilla.org/
Wait for element #user-signin to exist
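As a rough offline analogue of the "wait for element" step, here is a Python sketch that checks whether a saved HTML page contains an element with a given id. This is an assumption-laden stand-in, not how Pingdom implements it: the transaction monitor drives a real browser (the user agent in the logs mentions PhantomJS), so it can see elements created by JavaScript, while this static check cannot. The names `IdFinder` and `has_element_with_id` and the sample markup are invented for illustration.

```python
from html.parser import HTMLParser


class IdFinder(HTMLParser):
    """Record whether any start tag carries the wanted id attribute."""

    def __init__(self, wanted_id):
        super().__init__()
        self.wanted_id = wanted_id
        self.found = False

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs for the tag.
        if dict(attrs).get("id") == self.wanted_id:
            self.found = True


def has_element_with_id(html, element_id):
    """Return True if the static HTML contains an element with this id."""
    finder = IdFinder(element_id)
    finder.feed(html)
    finder.close()
    return finder.found


# Hypothetical markup; the real sign-in page is not reproduced in this bug.
SAMPLE_PAGE = '<html><body><form><input id="user-signin" type="text"></form></body></html>'

print(has_element_with_id(SAMPLE_PAGE, "user-signin"))     # True
print(has_element_with_id("<p>nothing here</p>", "user-signin"))  # False
```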
Comment 14 (Assignee) • 8 years ago
We're down to just 1 alert still alerting in Pingdom. I think we can close this.
Status: NEW → RESOLVED
Closed: 8 years ago
Resolution: --- → FIXED
Comment 15 • 8 years ago
So Pingdom indirectly confirmed you can't monitor any 3XX-level status codes, boo.
Also, they didn't seem to address why we don't see any headers in the "RCA" report, boo.