Closed
Bug 909083
Opened 11 years ago
Closed 10 years ago
buildapi needs something to restart it
Categories
(Release Engineering :: General, defect)
Tracking
(Not tracked)
RESOLVED
INVALID
People
(Reporter: nthomas, Unassigned)
References
Details
(Keywords: buildapi)
Attachments
(4 files)
During today's maintenance, a DB connection used by buildapi01 went away while work was being done on the network. buildapi crashed at that point, and we don't have anything to bring it back up again (no active puppet, no supervisord).
Comment 1•11 years ago
Grabbing to apply some bandaids
Assignee: nobody → hwine
Status: NEW → ASSIGNED
Comment 2•11 years ago
BANDAID - until something better is done. Will email release@ if it finds buildapi down, and page hwine if it doesn't come up.
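The attached script itself is not reproduced in this thread. As a rough illustration of the check, restart, and escalate flow described above, here is a minimal sketch in Python. The function name `watchdog` and its four callables (`is_up`, `restart`, `notify`, `page`) are hypothetical stand-ins, not the names used in the attachment; in practice `notify` would mail release@ and `page` would page the on-call person.

```python
from typing import Callable


def watchdog(
    is_up: Callable[[], bool],
    restart: Callable[[], None],
    notify: Callable[[str], None],
    page: Callable[[str], None],
) -> str:
    """Run one check cycle and return 'ok', 'restarted', or 'failed'.

    All four callables are assumptions made for this sketch.
    """
    if is_up():
        return "ok"

    # Service is down: tell the list first, then try to bring it back.
    notify("buildapi is down; attempting restart")
    restart()

    if is_up():
        return "restarted"

    # Restart did not help: escalate to a page.
    page("buildapi did not come back after restart")
    return "failed"
```

Passing the health check and the side effects in as callables keeps the escalation logic separate from how the service is actually probed or restarted, so the same function could be driven from an hourly cron job.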
Comment 3•11 years ago
BANDAID - run the hourly check and email release@ if any issues
Comment 4•11 years ago
BANDAID - script to restart selfserve-agent if it is not running. This runs on buildbot-master36; this bug seemed the closest place to record that fact.
Comment 5•11 years ago
BANDAID - restart selfserve agent if not running
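For readers without access to the attachment, a "restart if not running" check can be sketched as follows. This is not the attached script: it assumes the agent writes a pidfile, which the thread does not confirm, and `ensure_running`, `pid_alive`, and the `start` callable are hypothetical names. It relies on POSIX behaviour, where sending signal 0 tests for a process without affecting it.

```python
import os
from pathlib import Path
from typing import Callable


def pid_alive(pid: int) -> bool:
    """Return True if a process with this pid exists (POSIX only)."""
    try:
        os.kill(pid, 0)  # signal 0: existence check, nothing is delivered
    except ProcessLookupError:
        return False
    except PermissionError:
        # The process exists but belongs to another user.
        return True
    return True


def ensure_running(pidfile: Path, start: Callable[[], None]) -> bool:
    """Call start() if the pidfile is missing, unreadable, or stale.

    Returns True if start() was invoked. The pidfile convention is an
    assumption made for this sketch.
    """
    try:
        pid = int(pidfile.read_text().strip())
    except (FileNotFoundError, ValueError):
        pid = None

    # pid <= 0 would address a process group, so treat it as invalid.
    if pid is not None and pid > 0 and pid_alive(pid):
        return False

    start()
    return True
```

A pidfile check can be fooled by pid reuse after a reboot, so a real script may prefer to match on the process name as well.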
Comment 6•11 years ago
bandaids applied -- please remove when proper solution is applied
Assignee: hwine → nobody
Status: ASSIGNED → NEW
Comment 7•10 years ago
moved to releng cluster
Status: NEW → RESOLVED
Closed: 10 years ago
Resolution: --- → INVALID
Reporter
Comment 8•10 years ago
So now that we have Apache + WSGI, making a request relaunches buildapi if required?
Comment 9•10 years ago
If it "crashes" that's caught either by mod_wsgi (Python exception) or by the Apache parent process (segfault), and restarted immediately. If it gets wedged somehow, I believe the request would eventually time out, again either at the mod_wsgi or Apache levels. But I haven't seen this happen so I'm not sure.
Assignee
Updated•7 years ago
Component: Tools → General