Closed
Bug 753728
Opened 12 years ago
Closed 12 years ago
browserid.org/verify is on occasion throwing 500 errors (being caught on the load balancer)
Categories
(Cloud Services :: Operations: Miscellaneous, task)
Tracking
(Not tracked)
VERIFIED
FIXED
People
(Reporter: boozeniges, Assigned: petef)
Details
(Whiteboard: [qa+])
On both the mozillaignite (https://github.com/rossbruniges/mozilla-ignite/tree/stage) and webmaker (https://github.com/rossbruniges/make.mozilla.org) projects we've recently been experiencing intermittent login failures, starting around yesterday afternoon (2pm GMT). After a bit of digging and adding extra logging we found the following error being thrown:

django_browserid.base:INFO Verification URL: https://browserid.org/verify :/projects/mozilla/ignite/mozilla-ignite/vendor-local/src/django-browserid/django_browserid/base.py:118
django_browserid.base:DEBUG Failed to decode JSON. Resp: 500, Content: <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">
<html> <head> <meta http-equiv="Content-Type" content="text/html;charset=utf-8"> <title>Service Unavailable</title> <style type="text/css"> body, p, h1 { font-family: Verdana, Arial, Helvetica, sans-serif; } h2 { font-family: Arial, Helvetica, sans-serif; color: #b10b29; } </style> </head> <body> <h2>Service Unavailable</h2> <p>The service is temporarily unavailable. Please try again later.</p> </body> </html>
projects/mozilla/ignite/mozilla-ignite/vendor-local/src/django-browserid/django_browserid/base.py:107

Both have recently been updated to use the new playdoh - so that django can be deployed out of it, fixing this issue (https://github.com/mozilla/playdoh/issues/107).

Setting as major because both affected projects are hoping to be deployed next week, and also because this may be an issue on other sites using BrowserID and django-browserid.
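For anyone hitting the same log lines: the failure comes from trying to decode an HTML error page as JSON. A minimal sketch of a more defensive check is below. It is not the django-browserid code; the function name `parse_verifier_response` is hypothetical, and the `{"status": "okay"}` body is an assumed example of a successful verifier reply.

```python
import json


def parse_verifier_response(status_code, body):
    """Return the decoded JSON object, or None if the reply is unusable.

    Hypothetical helper, not part of django-browserid. A non-200 status
    (such as the 500 above) is rejected before any JSON decoding is tried.
    """
    if status_code != 200:
        return None
    try:
        data = json.loads(body)
    except ValueError:
        # The body was not JSON, e.g. the load balancer's HTML error page.
        return None
    # Only a JSON object is a usable verifier reply.
    return data if isinstance(data, dict) else None


if __name__ == "__main__":
    html_error = "<html><body><h2>Service Unavailable</h2></body></html>"
    print(parse_verifier_response(500, html_error))
    print(parse_verifier_response(200, '{"status": "okay"}'))
```

With a check like this the caller can treat "verifier unavailable" as a distinct case from "assertion rejected", and log or retry accordingly.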
Comment 1•12 years ago
browserid.org is Mozilla Services. Moving...
Assignee: server-ops-infra → nobody
Component: Server Operations: Infrastructure → Operations
Product: mozilla.org → Mozilla Services
QA Contact: jdow → operations
Version: other → unspecified
Reporter
Comment 2•12 years ago
Thanks Shyam - that was the one thing that I wasn't sure about :)
Comment 3•12 years ago
You're welcome, Ross. I've poked the services ops folks on IRC as well; if this is incorrect, they'll move it to the right place and look at it.
Assignee
Comment 4•12 years ago
I think this is probably fixed now. The verifier pool in scl2 had all backends marked as draining for some odd reason, so when GSLB gave you scl2's IP address for browserid.org, /verify calls would fail. We need to start QAing the verifier service before we undrain a datacenter during a push.
Assignee: nobody → petef
Status: NEW → ASSIGNED
Comment 5•12 years ago
Yes. I agree. Seems like I need to update our Test Plan to add a section for Prod push specific stuff. Probably something that could be automated...
Whiteboard: [qa+]
Assignee
Comment 6•12 years ago
Filed bug 753828 to monitor zeus pools so we'll catch this before undraining a datacenter in the future.
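As a rough illustration of the kind of check such monitoring could perform, here is a minimal sketch that flags any pool with no active backends. It does not call the Zeus API; the input mapping, the pool names, and the state strings ("active", "draining") are assumptions made for the example.

```python
def pools_without_active_nodes(pools):
    """Return the sorted names of pools that have no active backend.

    `pools` maps a pool name to a list of backend states. The state
    strings are assumed for illustration, not taken from a real export.
    """
    return sorted(
        name
        for name, states in pools.items()
        if not any(state == "active" for state in states)
    )


if __name__ == "__main__":
    # Hypothetical snapshot: every verifier backend in scl2 is draining.
    snapshot = {
        "scl2-verifier": ["draining", "draining", "draining"],
        "scl2-webhead": ["active", "draining"],
    }
    print(pools_without_active_nodes(snapshot))
```

An empty pool is also reported, since it has no active backend either.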
Comment 7•12 years ago
Should be covered by a script that :jrgm has. Also, running it once per colo per Prod push should be enough to verify everything is working as expected (getting a 200 back rather than a 500).
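A per-colo check along those lines might look like the sketch below. This is not :jrgm's script: the function name, the per-colo URLs, and the `fetch_status` callable are all hypothetical. A real fetcher would need to POST a valid assertion and audience to the verifier and return the HTTP status code.

```python
def check_colos(colo_urls, fetch_status):
    """Return {colo: status} for every colo that did not answer 200.

    `colo_urls` maps a colo name to its verifier URL (hypothetical values).
    `fetch_status` is any callable that takes a URL and returns an HTTP
    status code; it is injected so the check can run without a network.
    """
    failures = {}
    for colo, url in colo_urls.items():
        try:
            status = fetch_status(url)
        except Exception as exc:
            # A connection failure counts as a failed check for that colo.
            status = "error: %s" % exc
        if status != 200:
            failures[colo] = status
    return failures


if __name__ == "__main__":
    # Hypothetical endpoints and canned responses, for illustration only.
    urls = {
        "scl2": "https://verifier.scl2.example/verify",
        "other-colo": "https://verifier.other.example/verify",
    }
    canned = {urls["scl2"]: 500, urls["other-colo"]: 200}
    print(check_colos(urls, canned.get))
```

An empty result means every colo returned 200; anything else names the colo that should stay drained.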
Assignee
Updated•12 years ago
Status: ASSIGNED → RESOLVED
Closed: 12 years ago
Resolution: --- → FIXED
Comment 8•12 years ago
Marking as Verified since we appear to have everything in place, including tests for Prod.
Status: RESOLVED → VERIFIED