Closed Bug 1458792 Opened 7 years ago Closed 7 years ago

Problems with command_runner.py under dev-master2.bb.releng.use1.mozilla.com

Categories

(Infrastructure & Operations Graveyard :: CIDuty, task)

task
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: apop, Assigned: bhearsum)

Details

While doing daily monitoring on IRC, I saw an alert from Nagios:
Thu 03:05:12 UTC [7208] [] dev-master2.bb.releng.use1.mozilla.com:procs - command_runner is CRITICAL: PROCS CRITICAL: 0 processes with regex args 'command_runner.py' (http://m.mozilla.org/procs+-+command_runner)
The process is run through puppet >> https://dxr.mozilla.org/build-central/source/puppet/modules/buildmaster/templates/command_runner.cfg.erb#5
and this is the process >> https://hg.mozilla.org/build/tools/file/tip/buildbot-helpers/command_runner.py
This is the command that is run by Nagios > https://dxr.mozilla.org/build-central/source/puppet/modules/buildmaster/templates/command_runner.cfg.erb#5
a) This server hasn't been used for three years or so.
b) I thought that a machine reboot would be helpful.
c) Do we still need these machines now that 60+ releases rely solely on Taskcluster?
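For readers unfamiliar with this kind of alert, here is a rough sketch of what a process-count check like the one quoted above does: it counts the running processes whose argument string matches a regex and reports CRITICAL when the count drops below a threshold. The function name `check_procs`, the `critical_min` threshold, and the OK message wording are assumptions for illustration; this is not the actual Nagios plugin or the configuration used on dev-master2.

```python
import re


def check_procs(command_lines, pattern, critical_min=1):
    """Return a Nagios-style (exit_code, message) pair for a process-count check.

    command_lines: argument strings of running processes, one per process
                   (for example, the output lines of `ps -eo args`).
    pattern:       regex searched for in each argument string.
    critical_min:  hypothetical threshold; fewer matches than this is CRITICAL.
    """
    regex = re.compile(pattern)
    # Count every process whose argument string contains a match.
    count = sum(1 for line in command_lines if regex.search(line))
    if count < critical_min:
        # Exit code 2 is the conventional Nagios CRITICAL status.
        return 2, f"PROCS CRITICAL: {count} processes with regex args '{pattern}'"
    # Exit code 0 is the conventional Nagios OK status.
    return 0, f"PROCS OK: {count} processes with regex args '{pattern}'"
```

With no matching process in the list, `check_procs([], "command_runner.py")` yields the CRITICAL result, which corresponds to the state reported in the alert.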
This alert may also be related to this issue:
Wed May 02 16:36:33 -0700 2018 Puppet (err): Could not get latest version: Execution of '/usr/local/bin/python /usr/lib/ruby/site_ruby/1.8/puppet/provider/package/yumhelper.py' returned 1: <type 'exceptions.AttributeError'>
Wed May 02 16:36:33 -0700 2018 /Stage[main]/Packages::Nrpe/Package[nrpe]/ensure (err): change from 2.12-16.el6 to latest failed: Could not get latest version: Execution of '/usr/local/bin/python /usr/lib/ruby/site_ruby/1.8/puppet/provider/package/yumhelper.py' returned 1: <type 'exceptions.AttributeError'>
Wed May 02 16:36:35 -0700 2018
More details here: https://groups.google.com/a/mozilla.com/forum/#!topic/releng-puppet-mail/mVaKZqlsqog
This problem might be related to the Python upgrade made by bhearsum.
Yeah, this is my fault. I'm actively working on testing the latest Python with Buildbot. I should have it fixed up today.
Assignee: nobody → bhearsum
This is fixed with the patch from https://bugzilla.mozilla.org/show_bug.cgi?id=1458692, which is currently applied to dev-master2. Until that patch lands on production puppet, the puppet freshness check for dev-master2 will fail -- but all checks should be fine once that lands.
This is fixed via bug 1458692
Status: NEW → RESOLVED
Closed: 7 years ago
Resolution: --- → FIXED
Product: Release Engineering → Infrastructure & Operations
Product: Infrastructure & Operations → Infrastructure & Operations Graveyard