Closed Bug 786993 (talos-mtnlion-r5-080) Opened 12 years ago Closed 10 years ago

talos-mtnlion-r5-080 problem tracking

Categories

(Infrastructure & Operations Graveyard :: CIDuty, task, P3)

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: nthomas, Unassigned)

References

Details

(Whiteboard: [buildduty][buildslaves][capacity])

This slave failed many jobs like this:
hg clone http://hg.mozilla.org/build/tools tools
 in dir /builds/slave/talos-slave/test/. (timeout 1320 secs)
Upon execvpe hg ['hg', 'clone', 'http://hg.mozilla.org/build/tools', 'tools'] in environment id 4354299104
:Traceback (most recent call last):
  File "/tools/buildbot-0.8.4-pre-moz2/lib/python2.7/site-packages/twisted/internet/process.py", line 414, in _fork
    executable, args, environment)
  File "/tools/buildbot-0.8.4-pre-moz2/lib/python2.7/site-packages/twisted/internet/process.py", line 460, in _execChild
    os.execvpe(executable, args, environment)
  File "/tools/buildbot-0.8.4-pre-moz2/lib/python2.7/os.py", line 353, in execvpe
    _execvpe(file, args, env)
  File "/tools/buildbot-0.8.4-pre-moz2/lib/python2.7/os.py", line 380, in _execvpe
    func(fullname, *argrest)
OSError: [Errno 2] No such file or directory

PATH is set to /usr/bin:/bin:/usr/sbin:/sbin, but hg is in /usr/local/bin, so the exec fails with ENOENT. Is puppet not being applied properly?
Reimaging the machine now. I'm going to update the 10.8 reference documentation today to describe how to reimage these machines, along with other useful tasks.
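
For anyone triaging a similar failure, the lookup that failed can be reproduced with a short check. This is only a sketch: the helper names `resolve` and `with_extra_dir` are made up for illustration, and it assumes Python 3, whereas the slave itself ran the Python 2.7 shown in the traceback.

```python
import os
import shutil

# PATH as reported in this bug; hg lived in /usr/local/bin, which is not listed.
SLAVE_PATH = "/usr/bin:/bin:/usr/sbin:/sbin"


def resolve(program, search_path):
    """Return the full path of `program` found on `search_path`, or None."""
    return shutil.which(program, path=search_path)


def with_extra_dir(search_path, extra_dir):
    """Return `search_path` with `extra_dir` appended (hypothetical helper)."""
    return os.pathsep.join([search_path, extra_dir])


if __name__ == "__main__":
    for label, path in [
        ("slave PATH", SLAVE_PATH),
        ("slave PATH + /usr/local/bin", with_extra_dir(SLAVE_PATH, "/usr/local/bin")),
    ]:
        print("%s: hg -> %s" % (label, resolve("hg", path)))
```

A result of None for the first line corresponds to the `OSError: [Errno 2]` raised by `os.execvpe` above.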

/usr/sbin/bless --netboot --server bsdp://10.26.56.110 && reboot
Assignee: nobody → kmoir
Reimaged, re-enabled in slavealloc, and rebooted. Verified that a recent build ran without hitting the PATH issue mentioned above.
Status: NEW → RESOLVED
Closed: 12 years ago
Resolution: --- → FIXED
Product: mozilla.org → Release Engineering
It seems the slave had problems; disabled in slavealloc for investigation.

Errors received:
Traceback (most recent call last):
  File "/builds/slave/talos-slave/test/build/venv/bin/mozinstall", line 9, in <module>
    load_entry_point('mozInstall==1.8', 'console_scripts', 'mozinstall')()
  File "/builds/slave/talos-slave/test/build/venv/lib/python2.7/site-packages/mozinstall/mozinstall.py", line 300, in install_cli
    install_path = install(src, options.dest)
  File "/builds/slave/talos-slave/test/build/venv/lib/python2.7/site-packages/mozinstall/mozinstall.py", line 119, in install
    install_dir = _install_dmg(src, dest)
  File "/builds/slave/talos-slave/test/build/venv/lib/python2.7/site-packages/mozinstall/mozinstall.py", line 242, in _install_dmg
    raise InstallError('App bundle "%s" already exists.' % dest)
mozinstall.mozinstall.InstallError: Failed to install "/builds/slave/talos-slave/test/installer.dmg"
Return code: 1
Halting on failure while running ['/builds/slave/talos-slave/test/build/venv/bin/mozinstall', '/builds/slave/talos-slave/test/installer.dmg', '--destination', '/builds/slave/talos-slave/test/build/application']
Running post_fatal callback...
Exiting 1
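
The InstallError above is raised because an app bundle already exists at the destination. One possible manual cleanup, sketched here under assumptions, is to clear leftover `.app` directories from the destination before retrying. The function name `remove_stale_app_bundles` is hypothetical; this is not something the test harness or mozinstall is known to do, and the bug does not confirm a stale bundle was the root cause.

```python
import glob
import os
import shutil


def remove_stale_app_bundles(dest_dir):
    """Delete leftover *.app directories directly under dest_dir.

    Returns the list of removed paths. Symlinks and regular files are
    left alone so only real bundle directories are touched.
    """
    removed = []
    for bundle in sorted(glob.glob(os.path.join(dest_dir, "*.app"))):
        if os.path.isdir(bundle) and not os.path.islink(bundle):
            shutil.rmtree(bundle)
            removed.append(bundle)
    return removed


if __name__ == "__main__":
    # Destination directory taken from the mozinstall command line in the log.
    dest = "/builds/slave/talos-slave/test/build/application"
    if os.path.isdir(dest):
        for path in remove_stale_app_bundles(dest):
            print("removed %s" % path)
```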
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Depends on: 933805
Assignee: kmoir → nobody
The error from comment #3 looks like it may have been intermittent. I've put this machine back in production; we'll take further steps if it hits the issue again.
Status: REOPENED → RESOLVED
Closed: 12 years ago → 11 years ago
Resolution: --- → FIXED
Attempting SSH reboot...Failed.
Attempting PDU reboot...Failed.
Filed IT bug for reboot (bug 1025026)
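
The escalation order in this comment (SSH reboot, then PDU reboot, then an IT bug) can be sketched as a simple fallback chain. This is an illustration only, not the actual reboot tooling: the function name and the callables passed in are hypothetical stand-ins, and the real SSH and PDU steps are not shown in this bug.

```python
def reboot_with_escalation(steps):
    """Try each (name, action) pair in order until one succeeds.

    `action` is a zero-argument callable returning True on success.
    Returns (name_of_successful_step or None, log_lines). A result of
    None means every automated step failed and a human needs to file
    an IT bug, as was done here.
    """
    log = []
    for name, action in steps:
        try:
            ok = bool(action())
        except Exception:
            # Treat any error in a step as a failed attempt and move on.
            ok = False
        log.append("Attempting %s...%s" % (name, "Succeeded." if ok else "Failed."))
        if ok:
            return name, log
    return None, log


if __name__ == "__main__":
    # Stand-in actions that always fail, mirroring the outcome in this bug.
    winner, lines = reboot_with_escalation([
        ("SSH reboot", lambda: False),
        ("PDU reboot", lambda: False),
    ])
    print("\n".join(lines))
    if winner is None:
        print("All automated reboots failed; file an IT bug.")
```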
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Status: REOPENED → RESOLVED
Closed: 11 years ago → 10 years ago
Resolution: --- → FIXED
Attempting SSH reboot...Failed.
Attempting PDU reboot...Failed.
Filed IT bug for reboot (bug 1085075)
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
QA Contact: armenzg → bugspam.Callek
Status: REOPENED → RESOLVED
Closed: 10 years ago → 10 years ago
Resolution: --- → FIXED
Stopped taking jobs yesterday morning, hasn't restarted despite several reboots.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Something was definitely wrong here: enabled in slavealloc, but wasn't picking up a new tac file on reboot. A re-image worked though, so I've returned it to production.
Status: REOPENED → RESOLVED
Closed: 10 years ago → 10 years ago
Resolution: --- → FIXED
Product: Release Engineering → Infrastructure & Operations
Product: Infrastructure & Operations → Infrastructure & Operations Graveyard