Bug 581187 contained a couple of different failure modes, and the check-in there addresses one of them, namely preventing node assignment. From the original report (https://bugzilla.mozilla.org/show_bug.cgi?id=581187#c0):

"If the host is all the way down, instead of merely refusing MySQL connections, then the webheads run out of apache processes. This is because the MySQL connection timeout is long. (60s) We can shorten this, but even at 5s, I could see running out of apache processes at high load."

We need to be able to mark a back-end server as down so a webhead won't waste time and processes talking to a dead or dying host. This needs to be a flag, so the configuration info is preserved. (My current hack of setting the host IP to 127.0.0.1 is not a solution.)
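To illustrate what a flag that preserves the configuration could look like, here is a rough Python sketch. It is not taken from the server-storage code: the `[host:...]` section layout, the `down` option, and the names `load_hosts`, `address_for` and `BackendDown` are all made up for the example. The point is only that the real address stays in the config while the flag makes the webhead fail fast instead of waiting on a connection timeout.

```python
from configparser import ConfigParser

# Hypothetical config layout; section and option names are illustrative only.
SAMPLE_CONFIG = """
[host:db1]
address = 10.0.0.11
down = false

[host:db2]
address = 10.0.0.12
down = true
"""


class BackendDown(Exception):
    """Raised when a back-end host is flagged as down in the configuration."""


def load_hosts(text):
    """Parse the config text into {name: {"address": ..., "down": ...}}."""
    parser = ConfigParser()
    parser.read_string(text)
    hosts = {}
    for section in parser.sections():
        if section.startswith("host:"):
            name = section.split(":", 1)[1]
            hosts[name] = {
                # The real address is kept even while the host is down.
                "address": parser.get(section, "address"),
                "down": parser.getboolean(section, "down", fallback=False),
            }
    return hosts


def address_for(hosts, name):
    """Return the host address, or fail immediately if it is flagged down."""
    host = hosts[name]
    if host["down"]:
        # No connection attempt is made, so no process sits in a timeout.
        raise BackendDown(name)
    return host["address"]
```

Bringing a host back is then a one-word change (`down = false`) rather than restoring an overwritten IP.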
There's now a mechanism where a webhead server can be marked as down: https://hg.mozilla.org/services/server-storage/file/fe062ea80c05/syncstorage/wsgiapp.py#l130. But we could deal with this earlier in the stack (nginx).
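For readers who don't want to dig through the linked revision, the general idea at the WSGI layer can be sketched as below. This is an assumption-laden illustration, not the code in wsgiapp.py: `down_check_middleware`, `DOWN_HOSTS`, the host name, and the `Retry-After` value are all hypothetical.

```python
# Hypothetical set of host names currently marked as down.
DOWN_HOSTS = {"sync3.example.com"}


def down_check_middleware(app, is_down):
    """Wrap a WSGI app; answer 503 right away when is_down(environ) is true."""
    def wrapper(environ, start_response):
        if is_down(environ):
            body = b"server is down"
            start_response("503 Service Unavailable", [
                ("Content-Type", "text/plain"),
                ("Content-Length", str(len(body))),
                ("Retry-After", "600"),  # illustrative value only
            ])
            return [body]
        # Host is not flagged: hand the request to the real application.
        return app(environ, start_response)
    return wrapper


def host_is_down(environ):
    """Check the request's Host header against the hypothetical down list."""
    return environ.get("HTTP_HOST") in DOWN_HOSTS
```

A check like this still costs a process per request on the webhead, which is why handling it in the front-end proxy would be cheaper.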
I believe this is being done primarily at the Zeus level, and also through various downed mechanisms, so resolving.
Status: NEW → RESOLVED
Last Resolved: 7 years ago
Resolution: --- → FIXED