Bug 803087 (b-linux64-hp-0035)

b-linux64-hp-0035 problem tracking

RESOLVED WONTFIX

Status

Release Engineering
Buildduty
P3
normal
RESOLVED WONTFIX
5 years ago
3 years ago

People

(Reporter: armenzg, Unassigned)

Tracking

Firefox Tracking Flags

(Not tracked)

Details

(Whiteboard: [buildduty][buildslaves][capacity])

(Reporter)

Comment 1

5 years ago
I have tried rebooting through the IPMI interface and I have not yet seen it come back.

10:11 armenzg requests cold boot for bld-centos6-hp-019
Fixed the RAID config, back in production.
Status: NEW → RESOLVED
Last Resolved: 5 years ago
Resolution: --- → FIXED

Comment 3

5 years ago
Fixed the RAID config today.
Depends on: 824755
Down since 12-24-2012 18:24:18 - needs attention
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Back in the production pool.
Status: REOPENED → RESOLVED
Last Resolved: 5 years ago → 5 years ago
Resolution: --- → FIXED
Needs a reboot.
Status: RESOLVED → REOPENED
Depends on: 856689
Resolution: FIXED → ---
Back in production.
Status: REOPENED → RESOLVED
Last Resolved: 5 years ago → 5 years ago
Resolution: --- → FIXED
(Assignee)

Updated

4 years ago
Product: mozilla.org → Release Engineering
(Reporter)

Comment 8

4 years ago
Reboot needed.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
(Reporter)

Updated

4 years ago
Depends on: 933784
(Reporter)

Comment 9

4 years ago
It came back!
(Reporter)

Updated

4 years ago
No longer depends on: 933784
running green builds
Status: REOPENED → RESOLVED
Last Resolved: 5 years ago → 4 years ago
Resolution: --- → FIXED
No space left on device, disabled in slavealloc and rebooted.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Biggest culprits:

[root@bld-centos6-hp-019.build.scl1.mozilla.com ~]# df -h
Filesystem            Size  Used Avail Use% Mounted on
/dev/sda3             226G  214G  2.6M 100% /
tmpfs                 3.9G     0  3.9G   0% /dev/shm
/dev/sda1              97M   31M   62M  34% /boot

[root@bld-centos6-hp-019.build.scl1.mozilla.com ~]# du -ch --max-depth 1 /builds
33G     /builds/hg-shared
4.0K    /builds/tooltool_cache
140G    /builds/slave
7.0G    /builds/mock_mozilla
8.8G    /builds/ccache
25G     /builds/git-shared
213G    /builds
213G    total

[root@bld-centos6-hp-019.build.scl1.mozilla.com ~]# du -ch --max-depth 1 /builds/slave | sort -rn | head -n 100
326M    /builds/slave/rel-m-beta-lx_uv_6-00000000000
326M    /builds/slave/rel-m-beta-l64_uv_4-0000000000
280M    /builds/slave/rel-m-esr24-lx_uv_3-0000000000
277M    /builds/slave/rel-m-esr24-lx_uv_2-0000000000
270M    /builds/slave/tb-rel-c-esr24-lx_uv_5-0000000
268M    /builds/slave/tb-rel-c-esr24-l64_uv_5-000000
140G    total
140G    /builds/slave
86M     /builds/slave/rel-m-beta-postrelease-0000000
86M     /builds/slave/rel-m-beta-av-0000000000000000
85M     /builds/slave/tb-rel-c-esr24-psh_mrrrs-00000
85M     /builds/slave/rel-m-rel-av-00000000000000000
85M     /builds/slave/rel-m-beta-xr_psh_mrrrs-000000
55M     /builds/slave/rel-m-beta-bncr_sub-0000000000
54M     /builds/slave/rel-m-beta-xr_sums-00000000000
13G     /builds/slave/m-in-l64-nonunified-0000000000
13G     /builds/slave/b2g-in-l64_g-d-000000000000000
13G     /builds/slave/b2g_b2g-in_emu-jb-d_dep-000000
12G     /builds/slave/rel-m-beta-lx_bld-000000000000
10G     /builds/slave/m-in-l64-asan-d-00000000000000
9.7G    /builds/slave/fx-team-and-nonunified-0000000
9.4G    /builds/slave/fx-team-l64-000000000000000000
7.9G    /builds/slave/rel-m-beta-xr_lx_bld-000000000
6.9G    /builds/slave/fx-team-l64-asan-0000000000000
4.8G    /builds/slave/rel-m-rel-l64_rpk_2-0000000000
4.4G    /builds/slave/rel-m-rel-lx_rpk_2-00000000000
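Note that `sort -rn` only compares the leading number of each human-readable size, which is why `326M` is listed ahead of `13G` above. A minimal sketch of a listing ordered by actual size, assuming GNU `du` and `sort`; the function name `largest_dirs` is hypothetical and not part of any existing tooling:

```shell
#!/bin/bash
# List a directory and its immediate subdirectories, largest first.
# Sizes are reported in KiB so that a plain numeric sort orders them correctly.
# (largest_dirs is a hypothetical helper name used for illustration.)
largest_dirs() {
    local parent="$1" count="${2:-20}"
    # -k: sizes in KiB; --max-depth 1: the parent and its direct children only
    du -k --max-depth 1 "$parent" | sort -rn | head -n "$count"
}

# Example: largest_dirs /builds/slave 20
```

The first line of output is the parent directory itself, since its size includes all of its children.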

what's the usual protocol for cleaning up space here?

bhearsum - asking you just cause I found: https://bugzilla.mozilla.org/show_bug.cgi?id=704545#c1
WRT ^ https://bugzilla.mozilla.org/show_bug.cgi?id=803087#c12
Flags: needinfo?(bhearsum)

Updated

4 years ago
Depends on: 1001518
ben answered over irc.

he suggests:

short term: manually delete files to clear space
long term: we will need to either bump storage or optimize how we scrub old files (bug 1001518)
Flags: needinfo?(bhearsum)
Hitting this again...

https://tbpl.mozilla.org/php/getParsedLog.php?id=39258679&tree=Mozilla-Inbound
abort: No space left on device
Disabled in slavealloc until freespace manually purged
Will manually clean, and look at long-term fix via bug 1001518
Following the guide in https://bugzilla.mozilla.org/show_bug.cgi?id=1001518#c2, I found the first build with was "b2g_mozilla-inbound_linux64_gecko-debug build". Previously it was "b2g_b2g-inbound_linux64_gecko-debug build", so it looks like linux64 gecko debug builds in general require more disk space. I will close this bug when the disk space is freed up, and will close bug 1001518 when the disk space requirement is increased for linux64 gecko debug builds.
correction to comment 18:
"first build with was" -> "first build with result=5 was"
Ran:

[root@bld-centos6-hp-019.build.scl1.mozilla.com ~]# df -h; du -sk /builds/slave/* | sort -n | tail -5 | while read size dir; do echo "Removing '${dir}/build/*'..."; rm -rf "${dir}/build"/*; done; df -h
Filesystem            Size  Used Avail Use% Mounted on
/dev/sda3             226G  214G  6.2M 100% /
tmpfs                 3.9G     0  3.9G   0% /dev/shm
/dev/sda1              97M   31M   62M  34% /boot
Removing '/builds/slave/m-b30_14-linux32_g-d-000000000/build/*'...
Removing '/builds/slave/rel-m-rel-and_bld-000000000000/build/*'...
Removing '/builds/slave/m-cen-linux32_g-d-000000000000/build/*'...
Removing '/builds/slave/rel-m-beta-and-x86_bld-0000000/build/*'...
Removing '/builds/slave/b2g_m-b28_v1_3t_tko_eng_dep-00/build/*'...
Filesystem            Size  Used Avail Use% Mounted on
/dev/sda3             226G  152G   63G  71% /
tmpfs                 3.9G     0  3.9G   0% /dev/shm
/dev/sda1              97M   31M   62M  34% /boot

Note, this one-liner can be used in general to clear up builds from a build slave as a temporary solution.
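A slightly more defensive variant of the one-liner above, as a sketch only: it prints what it would remove unless told to proceed, and skips job directories that have no `build` subdirectory. The function name `purge_largest_builds` and the `DRY_RUN` variable are hypothetical, not part of any existing buildduty tooling; the `/builds/slave/*/build` layout is taken from the output above.

```shell
#!/bin/bash
# Empty the build/ subdirectory of the N largest job directories under a root.
# Set DRY_RUN=0 to actually delete; by default only the plan is printed.
# (purge_largest_builds and DRY_RUN are hypothetical names for illustration.)
purge_largest_builds() {
    local root="$1" count="${2:-5}"
    du -sk "$root"/* | sort -n | tail -n "$count" |
    while read -r size dir; do
        # Skip entries that have no build/ directory
        [ -d "$dir/build" ] || continue
        if [ "${DRY_RUN:-1}" = "0" ]; then
            echo "Removing contents of '$dir/build' (${size}K in job dir)"
            # Unlike rm -rf build/*, this also removes dotfiles inside build/
            find "$dir/build" -mindepth 1 -delete
        else
            echo "Would remove contents of '$dir/build' (${size}K in job dir)"
        fi
    done
}

# Example (dry run):  purge_largest_builds /builds/slave 5
# Example (delete):   DRY_RUN=0 purge_largest_builds /builds/slave 5
```

Running it once in dry-run mode first makes it easier to spot a release build directory that should not be purged.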
Status: REOPENED → RESOLVED
Last Resolved: 4 years ago → 4 years ago
Resolution: --- → FIXED
Re-enabled in slavealloc
(please note, even "longer term" solution documented in bug 1007583)
Only able to free 12.44GB, less than many of the jobs it tries to take require. Disabled in slavealloc.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
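For reference, the kind of check described above (is there enough free space for the jobs this slave takes?) can be sketched as follows. The function name `has_free_gb` and the 20 GB threshold in the example are assumptions for illustration; the actual per-job requirements are not stated in this bug.

```shell
#!/bin/bash
# Succeed if the filesystem holding a path has at least MIN_GB GiB available.
# (has_free_gb is a hypothetical helper name used for illustration.)
has_free_gb() {
    local path="$1" min_gb="$2" avail_kb
    # df -P: POSIX one-line-per-filesystem format; -k: 1K blocks;
    # column 4 of the second line is the available space
    avail_kb=$(df -Pk "$path" | awk 'NR==2 {print $4}')
    [ "$avail_kb" -ge $((min_gb * 1024 * 1024)) ]
}

# Example (threshold is an assumed value):
# has_free_gb /builds 20 || echo "not enough free space, leave disabled"
```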
Alias: bld-centos6-hp-019 → b-linux64-hp-0035
Summary: bld-centos6-hp-019 problem tracking → b-linux64-hp-0035 problem tracking
QA Contact: armenzg → bugspam.Callek
seems to have been taking jobs since the move from scl1...
Status: REOPENED → RESOLVED
Last Resolved: 4 years ago → 4 years ago
Resolution: --- → FIXED
Out of space and can't free enough to build. Disabled.
https://tbpl.mozilla.org/php/getParsedLog.php?id=45255872&tree=Mozilla-B2g32-v2.0
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Cleaned up, re-enabled.
Status: REOPENED → RESOLVED
Last Resolved: 4 years ago → 3 years ago
Resolution: --- → FIXED
Out of space, disabled.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
should be clean enough now --> enabled.
Status: REOPENED → RESOLVED
Last Resolved: 3 years ago → 3 years ago
Resolution: --- → FIXED
(In reply to Ryan VanderMeulen [:RyanVM UTC-4] from comment #27)
> Out of space, disabled.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Cleaned up for chemspills, re-enabled.
Status: REOPENED → RESOLVED
Last Resolved: 3 years ago → 3 years ago
Resolution: --- → FIXED
Please do not re-enable this slave. We are retiring linux hardware build slaves in bug 1106922.
Blocks: 1106922
Resolution: FIXED → WONTFIX