Bug 993458
Opened 11 years ago
Closed 11 years ago
self-serve running extremely slow and often getting "Service Unavailable" messages
Categories
(Infrastructure & Operations Graveyard :: WebOps: Other, task)
Tracking
(Not tracked)
RESOLVED
FIXED
People
(Reporter: RyanVM, Unassigned)
All trees currently closed.
Comment 1•11 years ago
Bug 983125 landed last week and could be related. I'm checking the logs on the machines that run selfserve.
Comment 2•11 years ago
selfserve agents seem to be running OK according to logs.
dustin pointed out the web app is buildapi and looks to be the source of the problem.
restarting now
Comment 3•11 years ago
Hm, lots of
[Tue Apr 08 08:30:34 2014] [error] [client 10.22.81.208] Script timed out before returning headers: buildapi.wsgi, referer: https://secure.pub.build.mozilla.org/buildapi/self-serve/mozilla-inbound/rev/468b75c559b8
Restarting Apache on the webheads seems to have it responding faster, but of course that just means it will happen again :(
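To gauge how often this happens, the timeout entries can be counted per script. The following is only a minimal sketch: it assumes the error log lines have the same shape as the one quoted above, and the names TIMEOUT_RE and count_timeouts are hypothetical helpers, not part of buildapi or Apache.

```python
import re
from collections import Counter

# Matches Apache error-log lines shaped like the one quoted above.
# Assumption: all timeout entries follow this exact layout.
TIMEOUT_RE = re.compile(
    r"^\[(?P<timestamp>[^\]]+)\] \[error\] \[client (?P<client>[^\]]+)\] "
    r"Script timed out before returning headers: (?P<script>[^,\s]+)"
)


def count_timeouts(lines):
    """Count 'Script timed out' entries per script name (hypothetical helper)."""
    counts = Counter()
    for line in lines:
        match = TIMEOUT_RE.match(line.strip())
        if match:
            counts[match.group("script")] += 1
    return counts


sample = (
    "[Tue Apr 08 08:30:34 2014] [error] [client 10.22.81.208] "
    "Script timed out before returning headers: buildapi.wsgi, "
    "referer: https://secure.pub.build.mozilla.org/buildapi/self-serve/"
    "mozilla-inbound/rev/468b75c559b8"
)
timeouts = count_timeouts([sample, "some unrelated log line"])
```

Running this over a webhead's error log before and after a restart would show whether the timeouts are actually going away or just pausing.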
Comment 4•11 years ago
seems to have gotten better thanks to dustin's probing.
for the record (and my own): if it's an HTTP problem like request timeouts, it is a buildapi web app issue. if requests are not getting completed (from here: https://secure.pub.build.mozilla.org/buildapi/self-serve/jobs), it's probably due to the self-serve agents.
troubleshooting for both can be found here as of right now: https://wiki.mozilla.org/ReleaseEngineering/How_To/Restart_BuildAPI
closing. if it happens again we can reopen
Status: NEW → RESOLVED
Closed: 11 years ago
Resolution: --- → FIXED
Comment 5•11 years ago
I'm fairly certain this will happen again. We'll see how quickly. I'm going to move this to webops since I'll likely be pinging them for help debugging when this happens again.
Assignee: nobody → server-ops-webops
Status: RESOLVED → REOPENED
Component: Buildduty → WebOps: Other
Product: Release Engineering → Infrastructure & Operations
QA Contact: armenzg → nmaul
Resolution: FIXED → ---
Comment 6•11 years ago
OK, we dialed mod_wsgi up to 6 processes, with one thread each, rather than the 2/2 it was configured for. I'm guessing here, but presumably we've reached the point where we get more than four simultaneous buildapi connections, at which point requests queue up.
This had the side-effect of restarting Apache, so whether this medicine worked or not, it's back to normal now. Still, I'm going to optimistically call this done.
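The arithmetic behind that guess can be sketched as follows. This is an illustration only: it assumes each mod_wsgi worker thread serves one request at a time, so concurrent capacity is processes times threads, and the function names are hypothetical. In mod_wsgi daemon mode these counts are set with the processes= and threads= options of WSGIDaemonProcess; whether this deployment uses daemon mode is an assumption.

```python
def max_concurrent_requests(processes: int, threads: int) -> int:
    """Upper bound on requests served at once: one per worker thread."""
    return processes * threads


def queued_requests(simultaneous: int, processes: int, threads: int) -> int:
    """Requests left waiting when demand exceeds the worker capacity."""
    return max(0, simultaneous - max_concurrent_requests(processes, threads))


old_capacity = max_concurrent_requests(processes=2, threads=2)  # previous 2/2 setup
new_capacity = max_concurrent_requests(processes=6, threads=1)  # new 6/1 setup

# With five simultaneous buildapi connections (an illustrative number),
# the old setup queues one request and the new setup queues none.
old_queue = queued_requests(5, processes=2, threads=2)
new_queue = queued_requests(5, processes=6, threads=1)
```

The change raises capacity from four to six simultaneous requests, so it only helps while peak load stays at or below six.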
Status: REOPENED → RESOLVED
Closed: 11 years ago → 11 years ago
Resolution: --- → FIXED
Updated•6 years ago
Product: Infrastructure & Operations → Infrastructure & Operations Graveyard