Closed Bug 853248 (talos-mtnlion-r5-016) Opened 12 years ago Closed 12 years ago

talos-mtnlion-r5-016 problem tracking

Categories

(Infrastructure & Operations Graveyard :: CIDuty, task, P3)

x86_64
macOS

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: armenzg, Unassigned)

Details

(Whiteboard: [buildduty][buildslaves][capacity])

The machine never rebooted after its reboot step. Rebooted through SSH.
2013-03-19 13:30:36-0700 [Broker,client] RunProcess._startCommand
2013-03-19 13:30:36-0700 [Broker,client] /tools/buildbot/bin/python scripts/external_tools/count_and_reboot.py -f ../reboot_count.txt -n 1 -z
2013-03-19 13:30:36-0700 [Broker,client] in dir /builds/slave/talos-slave/test/. (timeout 1200 secs)
2013-03-19 13:30:36-0700 [Broker,client] watching logfiles {}
2013-03-19 13:30:36-0700 [Broker,client] argv: ['/tools/buildbot/bin/python', 'scripts/external_tools/count_and_reboot.py', '-f', '../reboot_count.txt', '-n', '1', '-z']
2013-03-19 13:30:36-0700 [Broker,client] environment: {'XPCOM_DEBUG_BREAK': 'warn', 'SHELL': '/bin/bash', 'MOZ_NO_REMOTE': '1', 'MOZ_HIDE_RESULTS_TABLE': '1', 'VERSIONER_PYTHON_VERSION': '2.7', 'VERSIONER_PYTHON_PREFER_32_BIT': 'no', 'SSH_AUTH_SOCK': '/tmp/launch-Uy0SyV/Listeners', '__CF_USER_TEXT_ENCODING': '0x1C:0:0', 'Apple_Ubiquity_Message': '/tmp/launch-6oIGLQ/Apple_Ubiquity_Message', 'PWD': '/builds/slave/talos-slave/test', 'Apple_PubSub_Socket_Render': '/tmp/launch-AIgM5Z/Render', 'LOGNAME': 'cltbld', 'USER': 'cltbld', 'PATH': '/usr/bin:/bin:/usr/sbin:/sbin:/usr/local/bin:/usr/X11/bin', 'NO_EM_RESTART': '1', 'HOME': '/Users/cltbld', 'PROPERTIES_FILE': '/builds/slave/talos-slave/test/buildprops.json', 'NO_FAIL_ON_TEST_ERRORS': '1', 'TMPDIR': '/var/folders/lr/nwz2bgs53v1_nr5s75sk7lqh00000w/T/'}
2013-03-19 13:30:36-0700 [Broker,client] using PTY: False
2013-03-19 13:30:57-0700 [-] Received SIGTERM, shutting down.
2013-03-19 13:30:57-0700 [-] Received SIGTERM, shutting down.
2013-03-19 13:30:57-0700 [-] stopCommand: halting current command <buildslave.commands.shell.SlaveShellCommand instance at 0x101b94248>
2013-03-19 13:30:57-0700 [-] command interrupted, attempting to kill
2013-03-19 13:30:57-0700 [-] trying to kill process group 885
2013-03-19 13:30:57-0700 [-] failed to kill process group (ignored): [Errno 1] Operation not permitted
2013-03-19 13:30:57-0700 [-] trying process.signalProcess('KILL')
2013-03-19 13:30:57-0700 [-] signal KILL sent successfully
2013-03-19 13:30:57-0700 [Broker,client] lost remote
Rebooted via ssh; idle since March 21.
It took a couple of jobs before it stopped working again. I think it might need a re-image, but I rebooted it again for now, since the jobs it does manage to take end up green.
Back in production for now.
Status: NEW → RESOLVED
Closed: 12 years ago
Resolution: --- → FIXED
Product: mozilla.org → Release Engineering
Product: Release Engineering → Infrastructure & Operations
Product: Infrastructure & Operations → Infrastructure & Operations Graveyard