Closed Bug 909083 Opened 11 years ago Closed 10 years ago

buildapi needs something to restart it

Categories

(Release Engineering :: General, defect)

Platform: x86, All
Type: defect
Priority: Not set
Severity: normal

Tracking

(Not tracked)

RESOLVED INVALID

People

(Reporter: nthomas, Unassigned)

References

Details

(Keywords: buildapi)

Attachments

(4 files)

During today's maintenance, network work dropped a db connection used by buildapi01. buildapi crashed at that point, and we have nothing in place to bring it back up (no active puppet, no supervisord).
Blocks: 926246
Grabbing to apply some bandaids
Assignee: nobody → hwine
Status: NEW → ASSIGNED
BANDAID - until something better is done. Will email release@ if it finds buildapi down, and page hwine if it doesn't come up.
BANDAID - run the hourly check and email release@ if any issues
BANDAID - script to restart selfserve-agent if it is not running. It runs on buildbot-master36; this bug seemed the closest place to record that fact.
BANDAID - restart selfserve agent if not running
Bandaids applied -- please remove them once a proper solution is in place.
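For readers who want the shape of these bandaids, here is a minimal Python sketch of one watchdog pass: check, restart if down, email on failure, page if the restart did not help. It is not the attached scripts. The names `is_running` and `check_and_restart` are hypothetical, and the use of `pgrep` to detect the process is an assumption.

```python
import subprocess
from typing import Callable


def is_running(process_name: str) -> bool:
    """Return True if pgrep finds a process whose command line matches.

    Assumption: pgrep is installed on the host; the real bandaid scripts
    may detect the process differently.
    """
    # pgrep exits 0 when at least one process matches, 1 when none do.
    result = subprocess.run(
        ["pgrep", "-f", process_name],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def check_and_restart(
    check: Callable[[], bool],
    restart: Callable[[], None],
    notify: Callable[[str], None],
    escalate: Callable[[str], None],
) -> str:
    """Run one watchdog pass and return 'ok', 'restarted', or 'failed'."""
    if check():
        return "ok"
    # Service is down: tell the team (e.g. email), then try to bring it back.
    notify("service found down, attempting restart")
    restart()
    if check():
        return "restarted"
    # Restart did not help: escalate (e.g. page a person).
    escalate("service did not come back after restart")
    return "failed"
```

A cron job could call `check_and_restart` hourly, passing `lambda: is_running("selfserve-agent")` as `check` and site-specific callables for the restart, email, and paging steps.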
Assignee: hwine → nobody
Status: ASSIGNED → NEW
moved to releng cluster
Status: NEW → RESOLVED
Closed: 10 years ago
Resolution: --- → INVALID
So now that we have Apache + WSGI, making a request relaunches buildapi if required?
If it "crashes", that's caught either by mod_wsgi (Python exception) or by the Apache parent process (segfault), and it is restarted immediately.

If it gets wedged somehow, I believe the request would eventually time out, again at either the mod_wsgi or the Apache level. I haven't seen this happen, though, so I'm not sure.
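Since the wedged case is unconfirmed, one way to observe it from outside is a probe that separates a timeout from an ordinary error. The following is a sketch only: the function name `probe`, the 10-second default, and the three labels are assumptions, and any URL passed to it is site-specific.

```python
import socket
import urllib.error
import urllib.request
from typing import Callable


def probe(url: str, timeout_s: float = 10.0,
          opener: Callable = urllib.request.urlopen) -> str:
    """Classify one HTTP request as 'up', 'error', or 'wedged'."""
    try:
        with opener(url, timeout=timeout_s) as response:
            return "up" if 200 <= response.status < 300 else "error"
    except urllib.error.HTTPError:
        # The server answered, but with an error status such as 500.
        return "error"
    except (socket.timeout, TimeoutError):
        # The connection was made but no response arrived in time.
        return "wedged"
    except urllib.error.URLError as exc:
        # Timeouts during connect are wrapped in URLError.
        if isinstance(exc.reason, (socket.timeout, TimeoutError)):
            return "wedged"
        return "error"
    except OSError:
        # Any other network failure, e.g. a connection reset mid-read.
        return "error"
```

A "wedged" result here only means the client gave up waiting. It says nothing about whether mod_wsgi or Apache would eventually time the request out on the server side.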
Component: Tools → General