Closed Bug 1130369 Opened 11 years ago Closed 10 years ago

Errors restarting run_gunicorn using restart-jobs script

Categories

(Tree Management :: Treeherder: Infrastructure, defect, P1)

Tracking

(Not tracked)

RESOLVED DUPLICATE of bug 1130408

People

(Reporter: emorley, Unassigned)

Details

Seen this twice:

[2015-02-06 13:26:15] [localhost] finished: /root/bin/restart-jobs -p web (2312.519s)
[localhost] err: [2015-02-06 12:47:42] [treeherder1.webapp.scl3.mozilla.com] running: supervisorctl restart run_gunicorn
[localhost] err: [2015-02-06 12:47:42] [treeherder2.webapp.scl3.mozilla.com] running: supervisorctl restart run_gunicorn
[localhost] err: [2015-02-06 12:47:42] [treeherder3.webapp.scl3.mozilla.com] running: supervisorctl restart run_gunicorn
[localhost] err: [2015-02-06 12:47:43] [treeherder2.webapp.scl3.mozilla.com] finished: supervisorctl restart run_gunicorn (0.503s)
[localhost] err: [treeherder2.webapp.scl3.mozilla.com] out: run_gunicorn: ERROR (not running)
[localhost] err: [treeherder2.webapp.scl3.mozilla.com] out: run_gunicorn: ERROR (abnormal termination)
[localhost] err: [2015-02-06 12:47:44] [treeherder1.webapp.scl3.mozilla.com] finished: supervisorctl restart run_gunicorn (1.535s)
[localhost] err: [treeherder1.webapp.scl3.mozilla.com] out: run_gunicorn: stopped
[localhost] err: [treeherder1.webapp.scl3.mozilla.com] out: run_gunicorn: ERROR (abnormal termination)
[localhost] err: [2015-02-06 13:26:15] [treeherder3.webapp.scl3.mozilla.com] finished: supervisorctl restart run_gunicorn (2312.480s)
[localhost] err: [treeherder3.webapp.scl3.mozilla.com] out: run_gunicorn: stopped
[localhost] err: [treeherder3.webapp.scl3.mozilla.com] out: run_gunicorn: started

And later, via ssh:

[2015-02-06 13:26:04] [treeherder1.webapp.scl3.mozilla.com] running: supervisorctl restart run_gunicorn
[2015-02-06 13:26:04] [treeherder2.webapp.scl3.mozilla.com] running: supervisorctl restart run_gunicorn
[2015-02-06 13:26:04] [treeherder3.webapp.scl3.mozilla.com] running: supervisorctl restart run_gunicorn
[2015-02-06 13:26:04] [treeherder3.webapp.scl3.mozilla.com] finished: supervisorctl restart run_gunicorn (0.482s)
[treeherder3.webapp.scl3.mozilla.com] out: run_gunicorn: stopped
[treeherder3.webapp.scl3.mozilla.com] out: run_gunicorn: ERROR (already started)
[2015-02-06 13:26:15] [treeherder2.webapp.scl3.mozilla.com] finished: supervisorctl restart run_gunicorn (10.589s)
[treeherder2.webapp.scl3.mozilla.com] out: run_gunicorn: stopped
[treeherder2.webapp.scl3.mozilla.com] out: run_gunicorn: started
[2015-02-06 13:26:16] [treeherder1.webapp.scl3.mozilla.com] finished: supervisorctl restart run_gunicorn (12.091s)
[treeherder1.webapp.scl3.mozilla.com] out: run_gunicorn: stopped
[treeherder1.webapp.scl3.mozilla.com] out: run_gunicorn: started
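For triage, the per-host outcomes in output like the above could be summarised with a small script. This is only a rough sketch: it assumes the log has been split so that each entry sits on its own line, and the function names (statuses_by_host, failed_hosts) are made up for illustration and are not part of restart-jobs.

```python
import re
from collections import defaultdict

# Matches entries such as:
#   [treeherder2.webapp.scl3.mozilla.com] out: run_gunicorn: ERROR (not running)
# A leading "[localhost] err: " prefix is skipped because it is not followed by " out:".
OUT_RE = re.compile(r"\[(?P<host>[^\]\s]+)\] out: run_gunicorn: (?P<status>.+)$")


def statuses_by_host(lines):
    """Collect every run_gunicorn status reported for each host, in log order."""
    result = defaultdict(list)
    for line in lines:
        match = OUT_RE.search(line.rstrip())
        if match:
            result[match.group("host")].append(match.group("status").strip())
    return dict(result)


def failed_hosts(lines):
    """Return the sorted hosts that reported at least one ERROR status."""
    return sorted(
        host
        for host, statuses in statuses_by_host(lines).items()
        if any(status.startswith("ERROR") for status in statuses)
    )
```

Feeding it the first run above would flag treeherder1 and treeherder2, since both reported "ERROR (abnormal termination)".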
Flags: needinfo?(klibby)
If it's running restart-jobs once per host in the host group, then it's going to step all over itself. restart-jobs is designed to be run once per group, not per host.
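To illustrate the once-per-group point, here is a minimal sketch of how a deploy step might build its command list. The "-p web" form is copied from the log in this bug; treating the "-p" argument as the host group name is an assumption, and commands_per_group and hosts_by_group are hypothetical names, not part of the real deploy tooling.

```python
from typing import Dict, List


def commands_per_group(hosts_by_group: Dict[str, List[str]]) -> List[List[str]]:
    """Build one restart-jobs command per non-empty host group.

    The number of hosts in a group does not change the number of commands,
    so concurrent restarts never target the same hosts twice.
    """
    commands = []
    for group in sorted(hosts_by_group):
        if hosts_by_group[group]:  # skip groups that have no hosts
            # Assumption: "-p <group>" selects the host group, as in "-p web".
            commands.append(["/root/bin/restart-jobs", "-p", group])
    return commands
```

With the three web heads from this bug in a single "web" group, this yields one command rather than three.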
Flags: needinfo?(klibby)
This has been fine on the last few deploys; calling this fixed by bug 1130408.
Status: NEW → RESOLVED
Closed: 10 years ago
Resolution: --- → DUPLICATE