Closed Bug 853256 Opened 12 years ago Closed 12 years ago

Reboots on Mac machines can fail (failed to kill process group (ignored): [Errno 1] Operation not permitted)

Categories

(Infrastructure & Operations Graveyard :: CIDuty, task)

Hardware: x86
OS: macOS
Type: task
Priority: Not set
Severity: normal

Tracking

(Not tracked)

RESOLVED WORKSFORME

People

(Reporter: armenzg, Unassigned)

Details

(Whiteboard: [10.7])

After running count_and_reboot.py we get this message:

> 2013-03-19 22:22:33-0700 [-] failed to kill process group (ignored): [Errno 1] Operation not permitted

I have seen this in bug 853255, bug 767123 and bug 853254. Rebooting through SSH gets the machine in a working state.

2013-03-19 22:22:32-0700 [Broker,client] RunProcess._startCommand
2013-03-19 22:22:32-0700 [Broker,client] /tools/buildbot/bin/python scripts/external_tools/count_and_reboot.py -f ../reboot_count.txt -n 1 -z
2013-03-19 22:22:32-0700 [Broker,client] in dir /Users/cltbld/talos-slave/test/. (timeout 1200 secs)
2013-03-19 22:22:32-0700 [Broker,client] watching logfiles {}
2013-03-19 22:22:32-0700 [Broker,client] argv: ['/tools/buildbot/bin/python', 'scripts/external_tools/count_and_reboot.py', '-f', '../reboot_count.txt', '-n', '1', '-z']
2013-03-19 22:22:32-0700 [Broker,client] environment: {'XPCOM_DEBUG_BREAK': 'warn', 'SSH_AUTH_SOCK': '/tmp/launch-8nnQY9/Listeners', 'CVS_RSH': 'ssh', 'MOZ_HIDE_RESULTS_TABLE': '1', 'VERSIONER_PYTHON_VERSION': '2.7', 'PYTHONPATH': '/Library/Python/2.5/site-packages', 'VERSIONER_PYTHON_PREFER_32_BIT': 'no', 'PROPERTIES_FILE': '/Users/cltbld/talos-slave/test/buildprops.json', '__CF_USER_TEXT_ENCODING': '0x1F5:0:0', 'MOZ_NO_REMOTE': '1', 'PWD': '/Users/cltbld/talos-slave/test', 'SHELL': '/bin/bash', 'LOGNAME': 'cltbld', 'USER': 'cltbld', 'PATH': '/usr/bin:/bin:/usr/sbin:/sbin:/usr/local/bin:/usr/X11/bin', 'NO_EM_RESTART': '1', 'HOME': '/Users/cltbld', 'Apple_PubSub_Socket_Render': '/tmp/launch-0L2NpH/Render', 'NO_FAIL_ON_TEST_ERRORS': '1', 'DISPLAY': '/tmp/launch-0kQ2YP/org.x:0', 'TMPDIR': '/var/folders/qd/srwd5f710sj0fcl9z464lkj00000gn/T/'}
2013-03-19 22:22:32-0700 [Broker,client] using PTY: False
2013-03-19 22:22:33-0700 [-] Received SIGTERM, shutting down.
2013-03-19 22:22:33-0700 [-] stopCommand: halting current command <buildslave.commands.shell.SlaveShellCommand instance at 0x10a6ef8c0>
2013-03-19 22:22:33-0700 [-] command interrupted, attempting to kill
2013-03-19 22:22:33-0700 [-] trying to kill process group 454
2013-03-19 22:22:33-0700 [-] failed to kill process group (ignored): [Errno 1] Operation not permitted
2013-03-19 22:22:33-0700 [-] trying process.signalProcess('KILL')
2013-03-19 22:22:33-0700 [-] signal KILL sent successfully
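The end of the log above shows a two-step kill: the process group is signalled first, the EPERM failure is ignored, and the single process is then signalled directly. The following is a minimal sketch of that pattern, not the buildslave's actual code. The function name kill_with_fallback and its injectable killpg/kill parameters are hypothetical, added only so the EPERM path can be exercised without real processes, and the sketch assumes the group id equals the process id, as the log's "process group 454" suggests.

```python
import errno
import os
import signal


def kill_with_fallback(pid, killpg=os.killpg, kill=os.kill):
    """Send SIGKILL to the process group, then fall back to the process.

    Returns a list of (target, outcome) tuples describing each step.
    killpg and kill are hypothetical injection points for testing.
    """
    steps = []
    try:
        # Step 1: signal the whole group (assumes pgid == pid).
        killpg(pid, signal.SIGKILL)
        steps.append(("group", "killed"))
        return steps
    except OSError as exc:
        # Ignore the failure and record which errno it was,
        # e.g. EPERM for "[Errno 1] Operation not permitted".
        name = errno.errorcode.get(exc.errno, str(exc.errno))
        steps.append(("group", name))

    # Step 2: fall back to signalling the single process.
    kill(pid, signal.SIGKILL)
    steps.append(("process", "killed"))
    return steps
```

Called with a stand-in killpg that raises OSError(errno.EPERM, ...), the function records the ignored group failure and then signals the process itself, which matches the order of events in the log.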
Presumably kittenherder is rebooting these slaves after 6 hours now?
Whiteboard: [10.7]
Product: mozilla.org → Release Engineering
Status: NEW → RESOLVED
Closed: 12 years ago
Resolution: --- → WORKSFORME
Product: Release Engineering → Infrastructure & Operations
Product: Infrastructure & Operations → Infrastructure & Operations Graveyard