Closed Bug 803087 (b-linux64-hp-0035) Opened 12 years ago Closed 10 years ago

b-linux64-hp-0035 problem tracking

Categories

(Infrastructure & Operations Graveyard :: CIDuty, task, P3)

x86
Linux

Tracking

(Not tracked)

RESOLVED WONTFIX

People

(Reporter: armenzg, Unassigned)

Details

(Whiteboard: [buildduty][buildslaves][capacity])

No description provided.
I have tried rebooting through the IPMI interface and I have not yet seen it come back.

10:11 armenzg requests cold boot for bld-centos6-hp-019
Fixed the RAID config, back in production.
Status: NEW → RESOLVED
Closed: 12 years ago
Resolution: --- → FIXED
Fixed the RAID config today.
Depends on: 824755
Down since 12-24-2012 18:24:18 - needs attention
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Back in the production pool.
Status: REOPENED → RESOLVED
Closed: 12 years ago → 11 years ago
Resolution: --- → FIXED
Needs a reboot.
Status: RESOLVED → REOPENED
Depends on: 856689
Resolution: FIXED → ---
Back in production.
Status: REOPENED → RESOLVED
Closed: 11 years ago → 11 years ago
Resolution: --- → FIXED
Product: mozilla.org → Release Engineering
Reboot needed.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Depends on: 933784
It came back!
No longer depends on: 933784
running green builds
Status: REOPENED → RESOLVED
Closed: 11 years ago → 11 years ago
Resolution: --- → FIXED
No space left on device, disabled in slavealloc and rebooted.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Biggest culprits:

[root@bld-centos6-hp-019.build.scl1.mozilla.com ~]# df -h
Filesystem            Size  Used Avail Use% Mounted on
/dev/sda3             226G  214G  2.6M 100% /
tmpfs                 3.9G     0  3.9G   0% /dev/shm
/dev/sda1              97M   31M   62M  34% /boot

[root@bld-centos6-hp-019.build.scl1.mozilla.com ~]# du -ch --max-depth 1 /builds
33G     /builds/hg-shared
4.0K    /builds/tooltool_cache
140G    /builds/slave
7.0G    /builds/mock_mozilla
8.8G    /builds/ccache
25G     /builds/git-shared
213G    /builds
213G    total

[root@bld-centos6-hp-019.build.scl1.mozilla.com ~]# du -ch --max-depth 1 /builds/slave | sort -rn | head -n 100
326M    /builds/slave/rel-m-beta-lx_uv_6-00000000000
326M    /builds/slave/rel-m-beta-l64_uv_4-0000000000
280M    /builds/slave/rel-m-esr24-lx_uv_3-0000000000
277M    /builds/slave/rel-m-esr24-lx_uv_2-0000000000
270M    /builds/slave/tb-rel-c-esr24-lx_uv_5-0000000
268M    /builds/slave/tb-rel-c-esr24-l64_uv_5-000000
140G    total
140G    /builds/slave
86M     /builds/slave/rel-m-beta-postrelease-0000000
86M     /builds/slave/rel-m-beta-av-0000000000000000
85M     /builds/slave/tb-rel-c-esr24-psh_mrrrs-00000
85M     /builds/slave/rel-m-rel-av-00000000000000000
85M     /builds/slave/rel-m-beta-xr_psh_mrrrs-000000
55M     /builds/slave/rel-m-beta-bncr_sub-0000000000
54M     /builds/slave/rel-m-beta-xr_sums-00000000000
13G     /builds/slave/m-in-l64-nonunified-0000000000
13G     /builds/slave/b2g-in-l64_g-d-000000000000000
13G     /builds/slave/b2g_b2g-in_emu-jb-d_dep-000000
12G     /builds/slave/rel-m-beta-lx_bld-000000000000
10G     /builds/slave/m-in-l64-asan-d-00000000000000
9.7G    /builds/slave/fx-team-and-nonunified-0000000
9.4G    /builds/slave/fx-team-l64-000000000000000000
7.9G    /builds/slave/rel-m-beta-xr_lx_bld-000000000
6.9G    /builds/slave/fx-team-l64-asan-0000000000000
4.8G    /builds/slave/rel-m-rel-l64_rpk_2-0000000000
4.4G    /builds/slave/rel-m-rel-lx_rpk_2-00000000000
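
The listing above is out of order because `sort -rn` compares only the leading number and ignores the K/M/G suffix that `du -h` prints. A minimal sketch of one way around this, reporting sizes in KiB so that a plain numeric sort works (the function name `largest_dirs` is made up for illustration and is not existing tooling):

```shell
# List the largest immediate subdirectories of a directory, biggest first.
# Sizes are printed in KiB so that "sort -rn" orders them correctly.
# Usage: largest_dirs <parent_dir> [count]   (count defaults to 10)
largest_dirs() {
    du -sk "$1"/*/ 2>/dev/null | sort -rn | head -n "${2:-10}"
}
```

For example, `largest_dirs /builds/slave 20` would list the twenty biggest builder directories.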

what's the usual protocol for cleaning up space here?

bhearsum - asking you just cause I found: https://bugzilla.mozilla.org/show_bug.cgi?id=704545#c1
Depends on: 1001518
ben answered over irc.

he suggests:

short term: manually delete files to clear space
long term: we will need to either bump storage or optimize how we scrub old files (bug 1001518)
Flags: needinfo?(bhearsum)
Disabled in slavealloc until free space is manually purged
Will manually clean, and look at long-term fix via bug 1001518
Following the guide in https://bugzilla.mozilla.org/show_bug.cgi?id=1001518#c2, I found that the first build with result=5 was "b2g_mozilla-inbound_linux64_gecko-debug build". Previously it was "b2g_b2g-inbound_linux64_gecko-debug build", so it looks like linux64 gecko debug builds in general require more disk space. I will close this bug when the disk space is freed up, and close bug 1001518 when the disk space requirement is increased for linux64 gecko debug builds.
Ran:

[root@bld-centos6-hp-019.build.scl1.mozilla.com ~]# df -h; du -sk /builds/slave/* | sort -n | tail -5 | while read size dir; do echo "Removing '${dir}/build/*'..."; rm -rf "${dir}/build"/*; done; df -h
Filesystem            Size  Used Avail Use% Mounted on
/dev/sda3             226G  214G  6.2M 100% /
tmpfs                 3.9G     0  3.9G   0% /dev/shm
/dev/sda1              97M   31M   62M  34% /boot
Removing '/builds/slave/m-b30_14-linux32_g-d-000000000/build/*'...
Removing '/builds/slave/rel-m-rel-and_bld-000000000000/build/*'...
Removing '/builds/slave/m-cen-linux32_g-d-000000000000/build/*'...
Removing '/builds/slave/rel-m-beta-and-x86_bld-0000000/build/*'...
Removing '/builds/slave/b2g_m-b28_v1_3t_tko_eng_dep-00/build/*'...
Filesystem            Size  Used Avail Use% Mounted on
/dev/sda3             226G  152G   63G  71% /
tmpfs                 3.9G     0  3.9G   0% /dev/shm
/dev/sda1              97M   31M   62M  34% /boot

Note, this one-liner can be used in general to clear up builds from a build slave as a temporary solution.
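
A slightly more defensive variant of the one-liner above, as a sketch only. The function name `purge_largest_builds` and the `DRY_RUN` variable are hypothetical and not part of any existing tooling. It assumes a `find` that supports `-mindepth` and `-delete` (GNU findutils does). Unlike the `build/*` glob, it also removes hidden files, and it only prints what it would do unless `DRY_RUN=0` is set.

```shell
# Empty the build/ directory of the N largest builder directories.
# Usage: purge_largest_builds <slave_dir> [count]   (count defaults to 5)
# Only prints what it would do unless DRY_RUN=0 is set in the environment.
purge_largest_builds() (
    slave_dir="$1"
    count="${2:-5}"
    [ -d "$slave_dir" ] || { echo "not a directory: $slave_dir" >&2; exit 1; }
    # Sizes in KiB, smallest first, so "tail" keeps the largest directories.
    du -sk "$slave_dir"/*/ 2>/dev/null | sort -n | tail -n "$count" |
    while read -r size dir; do
        dir="${dir%/}"
        [ -d "$dir/build" ] || continue
        if [ "${DRY_RUN:-1}" = "0" ]; then
            echo "Removing contents of '$dir/build'..."
            # Delete everything below build/ but keep the directory itself.
            find "$dir/build" -mindepth 1 -delete
        else
            echo "Would remove contents of '$dir/build' (${size}K in builder dir)"
        fi
    done
)
```

Running `purge_largest_builds /builds/slave` first shows the candidates; `DRY_RUN=0 purge_largest_builds /builds/slave` then performs the deletion.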
Status: REOPENED → RESOLVED
Closed: 11 years ago → 10 years ago
Resolution: --- → FIXED
Re-enabled in slavealloc
(Please note: an even "longer term" solution is documented in bug 1007583.)
Only able to free 12.44GB, which is less than many of the jobs it tries to take require. Disabled in slavealloc.
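
A check like the following could be run before re-enabling a slave, to confirm that enough space is actually available. This is a sketch under assumptions: the function name `has_free_space` is hypothetical, and the threshold to use depends on what the jobs require, which this bug does not state.

```shell
# Succeed if the filesystem holding <path> has at least <min_gib> GiB available.
# Usage: has_free_space <path> <min_gib>
has_free_space() {
    # "df -Pk" prints one header line and one data line; field 4 is the
    # available space in KiB. With no data line (df failed), ok stays unset
    # and the function returns failure.
    df -Pk "$1" 2>/dev/null |
        awk -v need="$2" 'NR == 2 { ok = ($4 / 1048576 >= need) } END { exit !ok }'
}
```

For example, `has_free_space /builds 20 || echo "not enough space, leave disabled"`.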
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Alias: bld-centos6-hp-019 → b-linux64-hp-0035
Summary: bld-centos6-hp-019 problem tracking → b-linux64-hp-0035 problem tracking
QA Contact: armenzg → bugspam.Callek
seems to have been taking jobs since the move from scl1...
Status: REOPENED → RESOLVED
Closed: 10 years ago → 10 years ago
Resolution: --- → FIXED
Out of space and can't free enough to build. Disabled.
https://tbpl.mozilla.org/php/getParsedLog.php?id=45255872&tree=Mozilla-B2g32-v2.0
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Cleaned up, re-enabled.
Status: REOPENED → RESOLVED
Closed: 10 years ago → 10 years ago
Resolution: --- → FIXED
Out of space, disabled.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
should be clean enough now --> enabled.
Status: REOPENED → RESOLVED
Closed: 10 years ago → 10 years ago
Resolution: --- → FIXED
(In reply to Ryan VanderMeulen [:RyanVM UTC-4] from comment #27)
> Out of space, disabled.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Cleaned up for chemspills, re-enabled.
Status: REOPENED → RESOLVED
Closed: 10 years ago → 10 years ago
Resolution: --- → FIXED
Please do not re-enable this slave. We are retiring linux hardware build slaves in bug 1106922.
Blocks: 1106922
Resolution: FIXED → WONTFIX
Product: Release Engineering → Infrastructure & Operations
Product: Infrastructure & Operations → Infrastructure & Operations Graveyard