Closed Bug 801503 (b-linux64-hp-0032) Opened 12 years ago Closed 10 years ago

b-linux64-hp-0032 problem tracking

Categories

(Infrastructure & Operations Graveyard :: CIDuty, task, P3)

Tracking

(Not tracked)

RESOLVED WONTFIX

People

(Reporter: nthomas, Unassigned)

References

Details

(Whiteboard: [buildduty][buildslaves][capacity])

Sitting at the MoCo Network Installer boot screen.
This machine is in the main slave pool. Closing, because these bugs track acute issues; bug 779487 tracks the chronic one with this class of machine.
Status: NEW → RESOLVED
Closed: 12 years ago
Resolution: --- → FIXED
Down, needs some help.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Depends on: 807960
Fixed the RAID config today.
Status: REOPENED → RESOLVED
Closed: 12 years ago → 12 years ago
Resolution: --- → FIXED
This machine sometimes hits
 Determining IP information for eth0 ... failed; no link present. Check cable ?
when booting, which shows up as PING being DOWN in Nagios. You can use the remote console to send CTRL-ALT-DEL for a graceful reboot.
Down; the OOB IP is unreachable.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Depends on: 890886
Taking jobs again
Status: REOPENED → RESOLVED
Closed: 12 years ago → 11 years ago
Resolution: --- → FIXED
Product: mozilla.org → Release Engineering
Was down for ~5 days according to ping, although iLO reported it as up. Rebooted it via iLO.
Disabled in slavealloc. This slave is failing to do any b2g builds; the error looks like this:

19:46:27     INFO - Running command: ['/builds/slave/b2g_b2g-in_leo_dep-00000000000/build/repo', 'init', '--repo-url', 'https://git.mozilla.org/external/google/gerrit/git-repo.git', '--no-repo-verify', '-q', '-u', '/builds/slave/b2g_b2g-in_leo_dep-00000000000/build/tmp_manifest', '-m', 'leo.xml', '-b', u'master'] in /builds/slave/b2g_b2g-in_leo_dep-00000000000/build
19:46:27     INFO - Copy/paste: /builds/slave/b2g_b2g-in_leo_dep-00000000000/build/repo init --repo-url https://git.mozilla.org/external/google/gerrit/git-repo.git --no-repo-verify -q -u /builds/slave/b2g_b2g-in_leo_dep-00000000000/build/tmp_manifest -m leo.xml -b master
19:46:27     INFO -  fatal: cannot make /builds/slave/b2g_b2g-in_leo_dep-00000000000/build/.repo/repo directory: File exists
19:46:27     INFO -  fatal: repo init failed; run without --quiet to see why
19:46:28     INFO -  Traceback (most recent call last):
19:46:28     INFO -    File "/builds/slave/b2g_b2g-in_leo_dep-00000000000/build/repo", line 758, in <module>
19:46:28     INFO -      main(sys.argv[1:])
19:46:28     INFO -    File "/builds/slave/b2g_b2g-in_leo_dep-00000000000/build/repo", line 731, in main
19:46:28     INFO -      os.rmdir(os.path.join(root, name))
19:46:28     INFO -  OSError: [Errno 20] Not a directory: '.repo/.repo/manifests/.git/info'
19:46:28    ERROR - Return code: 1

This happens on multiple different builders, so it isn't a case of one builder needing a clobber. Git repo corruption?
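
For reference, one general way `os.rmdir` can raise `[Errno 20] Not a directory` is a bottom-up `os.walk` cleanup that hits a symlink to a directory: `os.walk` lists such symlinks among the directory names, but `rmdir` refuses to remove them. Whether that is what happened on this slave is not established here. The sketch below only reproduces the errno in a scratch directory and shows a cleanup loop that unlinks symlinks instead; the names `remove_tree`, `demo`, `real_info` and `tree` are made up for illustration.

```python
import errno
import os
import shutil
import tempfile


def remove_tree(top):
    """Delete everything under top, unlinking symlinks rather than rmdir-ing them."""
    for root, dirs, files in os.walk(top, topdown=False):
        for name in files:
            os.remove(os.path.join(root, name))
        for name in dirs:
            path = os.path.join(root, name)
            if os.path.islink(path):
                # A symlink to a directory shows up in `dirs`, but rmdir()
                # on it fails with ENOTDIR, so remove the link itself.
                os.unlink(path)
            else:
                os.rmdir(path)
    os.rmdir(top)


def demo():
    """Reproduce the errno in a scratch dir, then clean up safely."""
    base = tempfile.mkdtemp()
    real = os.path.join(base, "real_info")   # hypothetical link target
    tree = os.path.join(base, "tree")        # hypothetical tree to delete
    os.mkdir(real)
    os.mkdir(tree)
    link = os.path.join(tree, "info")
    os.symlink(real, link)

    naive_errno = None
    try:
        os.rmdir(link)                       # what a naive cleanup would do
    except OSError as exc:
        naive_errno = exc.errno

    remove_tree(tree)
    result = (naive_errno, os.path.exists(tree), os.path.isdir(real))
    shutil.rmtree(base)
    return result


if __name__ == "__main__":
    print(demo())
```

The link target is left untouched by `remove_tree`, which matters when the target is a directory shared between builders.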
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Depends on: 933774
b2g_build.py creates a symlink from build/.repo to /builds/git-shared/repo

http://hg.mozilla.org/build/mozharness/file/default/scripts/b2g_build.py#l551

so perhaps something with the contents of the shared dir is messing this slave up?
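
A quick way to check that theory on the slave would be to look at what `build/.repo` actually is and where it points. This is a minimal sketch, not code from mozharness: the helper name `describe_repo_link` and the returned dictionary keys are made up, and the example build directory is the one from the log above.

```python
import os
import tempfile


def describe_repo_link(build_dir):
    """Report whether build_dir/.repo is a symlink and where it points."""
    repo_path = os.path.join(build_dir, ".repo")
    info = {
        "path": repo_path,
        "is_symlink": os.path.islink(repo_path),
        "target": None,
        "target_is_dir": False,
    }
    if info["is_symlink"]:
        info["target"] = os.readlink(repo_path)
        # isdir() follows the link, so this is False for a dangling link
        # or a link that points at a regular file.
        info["target_is_dir"] = os.path.isdir(repo_path)
    return info


if __name__ == "__main__":
    # Build directory taken from the failing log; adjust per builder.
    print(describe_repo_link("/builds/slave/b2g_b2g-in_leo_dep-00000000000/build"))
```

If `is_symlink` is true but `target_is_dir` is false, the shared directory itself is missing or damaged rather than the per-builder checkout.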
(In reply to Chris AtLee [:catlee] from comment #9)
> b2g_build.py creates a symlink from build/.repo to /builds/git-shared/repo
> 
> http://hg.mozilla.org/build/mozharness/file/default/scripts/b2g_build.py#l551
> 
> so perhaps something with the contents of the shared dir is messing this
> slave up?

Well, re-imaging would fix this. Unfortunately it hasn't puppetized properly after a re-image. Looking into that...
Back in production after a re-image.
Status: REOPENED → RESOLVED
Closed: 11 years ago → 11 years ago
Resolution: --- → FIXED
https://tbpl.mozilla.org/php/getParsedLog.php?id=39361950&full=1&branch=mozilla-central

07:36:27     INFO -  Error: unable to free 15.00 GB of space. Free space only 12.21 GB

Seems to have done a bunch of release- jobs recently, that's probably where all the space went. Disabled in slavealloc.
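
To see how close a slave is to that threshold before it starts failing jobs, the free space can be checked directly. This is a rough sketch and not the purge script the build uses: `check_free_space` is a made-up name, the message only imitates the log line above, and treating 1 GB as 1024**3 bytes is an assumption.

```python
import shutil

GB = 1024 ** 3  # assumption: the log's "GB" means 2**30 bytes


def check_free_space(path, required_gb):
    """Return (ok, message) for whether path has at least required_gb free."""
    free_gb = shutil.disk_usage(path).free / GB
    if free_gb >= required_gb:
        return True, "OK: %.2f GB free" % free_gb
    return False, "Error: need %.2f GB of space. Free space only %.2f GB" % (
        required_gb,
        free_gb,
    )


if __name__ == "__main__":
    # 15 GB is the amount the failing job asked for.
    print(check_free_space("/", 15.0)[1])
```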
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Some 140G of release builds, 10G non-release builds, 34G hg-shared, 14G git-shared, 8.5G non-shared, 6G mock_mozilla, which pretty much fills up the 226GB of / once you add in the operating system.

Removed rel*xr*, rel*source*, rel-m-beta* and rel-m-rel* older than a week. Re-enabled with 100G free.
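
The numbers above add up as follows, and the same cleanup could be scripted roughly like this. It is only a sketch under assumptions: the build directories are assumed to sit directly under one root (for example `/builds/slave`), directory mtime is used as a stand-in for "older than a week", and `stale_release_dirs` is a made-up helper that only lists candidates and deletes nothing.

```python
import fnmatch
import os
import time

# Usage figures quoted above, in GB.
USAGE_GB = {
    "release builds": 140,
    "non-release builds": 10,
    "hg-shared": 34,
    "git-shared": 14,
    "non-shared": 8.5,
    "mock_mozilla": 6,
}
ROOT_SIZE_GB = 226
TOTAL_GB = sum(USAGE_GB.values())        # 212.5
LEFT_FOR_OS_AND_FREE_GB = ROOT_SIZE_GB - TOTAL_GB  # 13.5

# Glob patterns from the comment above.
PATTERNS = ["rel*xr*", "rel*source*", "rel-m-beta*", "rel-m-rel*"]


def stale_release_dirs(root, max_age_days=7, now=None):
    """List dirs directly under root that match PATTERNS and have an old mtime."""
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400
    stale = []
    for entry in sorted(os.listdir(root)):
        path = os.path.join(root, entry)
        if not os.path.isdir(path):
            continue
        if not any(fnmatch.fnmatch(entry, pat) for pat in PATTERNS):
            continue
        if os.stat(path).st_mtime < cutoff:
            stale.append(path)
    return stale


if __name__ == "__main__":
    print("accounted for: %.1f GB of %d GB" % (TOTAL_GB, ROOT_SIZE_GB))
    for path in stale_release_dirs("/builds/slave"):
        print("candidate for removal:", path)
```

Listing first and deleting by hand keeps an in-progress release build from being removed by accident.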
Status: REOPENED → RESOLVED
Closed: 11 years ago → 10 years ago
Resolution: --- → FIXED
Alias: bld-centos6-hp-016 → b-linux64-hp-0032
Summary: bld-centos6-hp-016 problem tracking → b-linux64-hp-0032 problem tracking
Please do not re-enable this slave. We are retiring linux hardware build slaves in bug 1106922.
Blocks: 1106922
Resolution: FIXED → WONTFIX
Product: Release Engineering → Infrastructure & Operations
Product: Infrastructure & Operations → Infrastructure & Operations Graveyard