Closed Bug 728535 (t-snow-r4-0007) Opened 12 years ago Closed 10 years ago

t-snow-r4-0007 problem tracking

Categories

(Infrastructure & Operations Graveyard :: CIDuty, task, P3)

x86_64
macOS

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: philor, Unassigned)

Details

(Whiteboard: [badslave?][buildduty][capacity])

Sadly, I didn't say how it was broken in bug 716326, so I don't know whether that was also like https://tbpl.mozilla.org/php/getParsedLog.php?id=9433872&tree=Mozilla-Inbound, where both rm -rf tools and rm -rf build say "rm: tools/.hg/00changelog.i: Invalid argument" about every file, and then it craps out trying to download the build with "Cannot write to `firefox-13.0a1.en-US.mac.dmg' (Invalid argument)." I sort of think it was, though, and it probably needs its disk looked at rather than just another reimage. Maybe.

Anyway, it's once again chewing up jobs like crazy because it only takes a couple of seconds to fail to rm and then fail to save a downloaded build.
Disabled in slavealloc and on the buildbot master. It's refusing ssh access, so it will need some hands-on help.
Depends on: 729118
Priority: -- → P3
Putting it back in the pool. Will monitor.
Assignee: nobody → coop
Status: NEW → ASSIGNED
Priority: P3 → P2
I disabled it per Philor's request in IRC and didn't realize it already had a bug (teach me to not check slavealloc *before* filing bug...)
decommission?
(In reply to Armen Zambrano G. [:armenzg] - Release Engineer from comment #5)
> decommission?

rev4 machines are (essentially) brand-new, so I certainly hope not. Please file a server ops bug to get it fixed.
Depends on: 722081
Handing back to buildduty now that hands-on has been requested.
Alias: talos-r4-snow-007
Assignee: coop → nobody
Severity: major → normal
Status: ASSIGNED → NEW
Priority: P2 → P3
Summary: talos-r4-snow-007 is broken → talos-r4-snow-007
Whiteboard: [badslave?][buildduty] → [badslave?][buildduty][capacity]
Assignee: nobody → bhearsum
Summary: talos-r4-snow-007 → talos-r4-snow-007 problem tracking
Assignee: bhearsum → nobody
Component: Release Engineering → Release Engineering: Machine Management
QA Contact: release → armenzg
Status: NEW → RESOLVED
Closed: 12 years ago
Resolution: --- → FIXED
https://tbpl.mozilla.org/php/getParsedLog.php?id=11156650&tree=Mozilla-Inbound
https://tbpl.mozilla.org/php/getParsedLog.php?id=11156669&tree=Mozilla-Inbound

rm: tools/.hg: Device not configured
rm: tools/.hgignore: Invalid argument
rm: tools/.hgtags: Invalid argument
rm: tools/.pylintrc: Invalid argument
rm: tools/breakpad: Device not configured
rm: tools/buildbot-helpers: Device not configured
rm: tools/buildfarm: Device not configured
rm: tools/cdmaker: Device not configured
rm: tools/clobberer: Device not configured
rm: tools/graphserver_webapp: Device not configured
rm: tools/lib: Device not configured
rm: tools/MANIFEST.in: Invalid argument
rm: tools/misc: Device not configured
rm: tools/release: Device not configured
rm: tools/scripts: Device not configured
rm: tools/setup.py: Invalid argument
rm: tools/stage: Device not configured
rm: tools/sut_tools: Device not configured
rm: tools/trychooser: Device not configured
rm: tools: Directory not empty
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
https://tbpl.mozilla.org/php/getParsedLog.php?id=11156809&tree=Birch
https://tbpl.mozilla.org/php/getParsedLog.php?id=11156805&tree=Birch
https://tbpl.mozilla.org/php/getParsedLog.php?id=11156825&tree=Birch

rm: tools/.hg: Device not configured
rm: tools/.hgignore: Invalid argument
rm: tools/.hgtags: Invalid argument
rm: tools/.pylintrc: Invalid argument
rm: tools/breakpad: Device not configured
rm: tools/buildbot-helpers: Device not configured
rm: tools/buildfarm: Device not configured
rm: tools/cdmaker: Device not configured
rm: tools/clobberer: Device not configured
rm: tools/graphserver_webapp: Device not configured
rm: tools/lib: Device not configured
rm: tools/MANIFEST.in: Invalid argument
rm: tools/misc: Device not configured
rm: tools/release: Device not configured
rm: tools/scripts: Device not configured
rm: tools/setup.py: Invalid argument
rm: tools/stage: Device not configured
rm: tools/sut_tools: Device not configured
rm: tools/trychooser: Device not configured
rm: tools: Directory not empty
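The rm output above mixes several distinct error messages, and a per-message tally makes the pattern easier to see than reading the log line by line. The following is a minimal sketch, not part of the buildbot tooling: the function name `tally_rm_errors` and the regular expression are illustrative assumptions, and it only understands lines of the form `rm: <path>: <message>`.

```python
import re
from collections import Counter

# Matches lines such as "rm: tools/.hg: Device not configured".
# The path may itself contain colons, so the message is taken to be
# everything after the last ": " separator.
RM_ERROR = re.compile(r"^rm: (?P<path>.+): (?P<message>[^:]+)$")


def tally_rm_errors(log_text):
    """Count rm failures in log_text, grouped by error message."""
    counts = Counter()
    for line in log_text.splitlines():
        match = RM_ERROR.match(line.strip())
        if match:
            counts[match.group("message")] += 1
    return counts
```

Fed a saved copy of one of the logs, this returns a `Counter` keyed by message ("Device not configured", "Invalid argument", "Directory not empty"). Many failures spread across unrelated paths suggest a problem with the volume itself rather than with any one file.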
https://tbpl.mozilla.org/php/getParsedLog.php?id=11156872&tree=Birch
https://tbpl.mozilla.org/php/getParsedLog.php?id=11156764&tree=Birch

Connecting to build.mozilla.org|10.2.74.128|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 1,169 (1.1K) [application/x-sh]
installdmg.sh: Invalid argument

Cannot write to `installdmg.sh' (Invalid argument).
program finished with exit code 1
disabled in slavealloc and ssh'ing in to make sure it's offline
Depends on: 747725
No longer depends on: 722081, 729118
Something is wrong with the hard drive on this host. The activity above points to a bad or full drive, and df -h shows it is not full:

talos-r4-snow-007:~ cltbld$ df -h
Filesystem      Size   Used  Avail Capacity  Mounted on
/dev/disk0s2   298Gi  8.3Gi  289Gi     3%    /
devfs          106Ki  106Ki    0Bi   100%    /dev
map -hosts       0Bi    0Bi    0Bi   100%    /net
map auto_home    0Bi    0Bi    0Bi   100%    /home

Please pull this unit and check its hard drive.
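Since df reports plenty of free space while writes still fail, a direct write probe separates "full" from "bad" more reliably than df alone. The sketch below is an illustration only, assuming a POSIX host with Python available; `probe_writable` and `free_fraction` are hypothetical helper names, not existing slave tooling.

```python
import errno
import os
import tempfile


def probe_writable(directory):
    """Create, write, fsync, and delete a small file in directory.

    Returns (True, None) on success, or (False, errno_name) if any
    step raises OSError (for example "EINVAL" or "ENOSPC").
    """
    try:
        fd, path = tempfile.mkstemp(dir=directory, prefix="disk-probe-")
        try:
            os.write(fd, b"probe")
            os.fsync(fd)  # push the data to the device, not just the cache
        finally:
            os.close(fd)
            os.unlink(path)
    except OSError as exc:
        return False, errno.errorcode.get(exc.errno, str(exc.errno))
    return True, None


def free_fraction(directory):
    """Fraction of the filesystem's blocks available to unprivileged users."""
    stats = os.statvfs(directory)
    return stats.f_bavail / stats.f_blocks if stats.f_blocks else 0.0
```

A failed probe on a filesystem where `free_fraction` is still high matches what this host shows: the drive is not full, so the failure lies with the drive or filesystem.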
No longer depends on: 747725
Depends on: 722081
Depends on: 754234
No longer depends on: 754234
this slave has returned from repairs with a new hdd.  Re-imaged and back in SCL1.
Back in production
Status: REOPENED → RESOLVED
Closed: 12 years ago → 12 years ago
Resolution: --- → FIXED
Depends on: 808009
Depends on: 808011
Needs a reboot, and to be added back to nagios.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Status: REOPENED → RESOLVED
Closed: 12 years ago → 12 years ago
Resolution: --- → FIXED
Rebooted via PDU after nagios reported it down.
Product: mozilla.org → Release Engineering
Alias: talos-r4-snow-007 → t-snow-r4-0007
Summary: talos-r4-snow-007 problem tracking → t-snow-r4-0007 problem tracking
Attempting SSH reboot...Failed.
Attempting PDU reboot...Failed.
Filed IT bug for reboot (bug 1037831)
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Status: REOPENED → RESOLVED
Closed: 12 years ago → 10 years ago
Resolution: --- → FIXED
Product: Release Engineering → Infrastructure & Operations
Product: Infrastructure & Operations → Infrastructure & Operations Graveyard