Closed Bug 993458 Opened 11 years ago Closed 11 years ago

self-serve running extremely slow and often getting "Service Unavailable" messages

Categories

(Infrastructure & Operations Graveyard :: WebOps: Other, task)

task
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: RyanVM, Unassigned)

All trees currently closed.
Bug 983125 landed last week and could be related. I'm just poking through the logs of our machines that run selfserve.
The selfserve agents seem to be running OK according to the logs. dustin pointed out that the web app is buildapi, and it looks to be the source of the problem. Restarting now.
Hm, lots of:
[Tue Apr 08 08:30:34 2014] [error] [client 10.22.81.208] Script timed out before returning headers: buildapi.wsgi, referer: https://secure.pub.build.mozilla.org/buildapi/self-serve/mozilla-inbound/rev/468b75c559b8
Restarting Apache on the webheads seems to have it responding faster, but of course that just means it will happen again :(
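For anyone wanting to gauge how often this timeout occurs, here is a minimal sketch that counts such entries in an Apache error log. The regular expression is inferred from the single log line quoted above, so the layout is an assumption, and `count_timeouts` and the second sample line are hypothetical names and data used only for illustration.

```python
import re
from collections import Counter

# Layout inferred from the one error-log line quoted in this bug (assumption).
TIMEOUT_RE = re.compile(
    r"^\[(?P<ts>[^\]]+)\] \[error\] \[client (?P<client>[^\]]+)\] "
    r"Script timed out before returning headers: (?P<script>[^,\s]+)"
)

def count_timeouts(lines):
    """Return a Counter mapping script name to number of timeout entries."""
    counts = Counter()
    for line in lines:
        match = TIMEOUT_RE.match(line)
        if match:
            counts[match.group("script")] += 1
    return counts

sample = [
    # Line copied from the comment above.
    "[Tue Apr 08 08:30:34 2014] [error] [client 10.22.81.208] "
    "Script timed out before returning headers: buildapi.wsgi, "
    "referer: https://secure.pub.build.mozilla.org/buildapi/self-serve/"
    "mozilla-inbound/rev/468b75c559b8",
    # Hypothetical non-matching line, included only to show it is skipped.
    "some unrelated log line",
]

print(count_timeouts(sample))
```

In practice the `lines` argument could be an open file handle for the error log, since the function only iterates over it.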
Seems to have gotten better thanks to dustin's probing. For the record (and my own): if it's an HTTP problem like request timeouts, it is a buildapi web app issue. If requests are not getting completed (from here: https://secure.pub.build.mozilla.org/buildapi/self-serve/jobs), it's probably due to the self-serve agents. Troubleshooting for both can be found here as of right now: https://wiki.mozilla.org/ReleaseEngineering/How_To/Restart_BuildAPI Closing. If it happens again we can reopen.
Status: NEW → RESOLVED
Closed: 11 years ago
Resolution: --- → FIXED
I'm fairly certain this will happen again. We'll see how quickly. I'm going to move this to webops since I'll likely be pinging them for help debugging when this happens again.
Assignee: nobody → server-ops-webops
Status: RESOLVED → REOPENED
Component: Buildduty → WebOps: Other
Product: Release Engineering → Infrastructure & Operations
QA Contact: armenzg → nmaul
Resolution: FIXED → ---
OK, we dialed mod_wsgi up to 6 processes, with one thread each, rather than the 2/2 it was configured for. I'm guessing here, but presumably we've reached the point where we get more than four simultaneous buildapi connections, at which point requests queue up. This had the side-effect of restarting Apache, so whether this medicine worked or not, it's back to normal now. Still, I'm going to optimistically call this done.
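The capacity guess in the comment above can be checked with quick arithmetic. This is a minimal sketch, assuming each mod_wsgi thread serves one request at a time so that concurrent capacity is processes times threads; the function names are hypothetical.

```python
def worker_slots(processes, threads):
    """Concurrent request capacity, assuming one request per thread."""
    return processes * threads

def queued(simultaneous_requests, slots):
    """Number of requests that must wait because every slot is busy."""
    return max(0, simultaneous_requests - slots)

old_slots = worker_slots(2, 2)  # previous 2 processes x 2 threads
new_slots = worker_slots(6, 1)  # new 6 processes x 1 thread

# Compare how many requests wait under each configuration.
for n in (4, 5, 6, 7):
    print(f"{n} simultaneous: old queues {queued(n, old_slots)}, "
          f"new queues {queued(n, new_slots)}")
```

Under this assumption the old configuration starts queueing at the fifth simultaneous request, while the new one holds out until the seventh, so the change raises the ceiling rather than removing it.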
Status: REOPENED → RESOLVED
Closed: 11 years ago
Resolution: --- → FIXED
Product: Infrastructure & Operations → Infrastructure & Operations Graveyard