Hands-on required for talos-r4-snow-007

RESOLVED FIXED

Status

Infrastructure & Operations
RelOps
P3
normal
RESOLVED FIXED
7 years ago
5 years ago

People

(Reporter: philor, Assigned: dividehex)

Tracking

Details

(Whiteboard: [Computer Care returned 5/10])

(Reporter)

Description

7 years ago
The two logs I've got handy are

https://tbpl.mozilla.org/php/getParsedLog.php?id=8911338&tree=Mozilla-Inbound
"builddir: 'bash: /usr/bin/basename: Operation not permitted'"
"rm: tools: Device not configured"

and

https://tbpl.mozilla.org/php/getParsedLog.php?id=8911324&tree=Mozilla-Inbound
rm:  : Invalid argument
...
Cannot write to `firefox-12.0a1.en-US.mac.dmg' (Invalid argument).

which sounds like its disk is no longer a disk.
(Reporter)

Comment 1

7 years ago
https://build.mozilla.org/buildapi/recent/talos-r4-snow-007 - turns out you can take quite a lot of jobs if they take you between 0 and a maximum of 12 seconds :)
Severity: normal → major
(Reporter)

Comment 2

7 years ago
Whether someone disabled it, or it ate its own liver, it seems to have stopped taking jobs.
Severity: major → normal

Comment 3

7 years ago
Disabled it in slavealloc.
Please reimage this slave.
Assignee: nobody → server-ops-releng
Component: Release Engineering → Server Operations: RelEng
QA Contact: release → zandr
Summary: talos-r4-snow-007 is quite broken → please reimage talos-r4-snow-007
Whiteboard: [badslave][buildduty]
Assignee: server-ops-releng → jwatkins
(Assignee)

Comment 5

7 years ago
Re-imaging is done.
Assignee: jwatkins → nobody
Component: Server Operations: RelEng → Release Engineering
QA Contact: zandr → release

Updated

6 years ago
Assignee: nobody → coop
Priority: -- → P3
Summary: please reimage talos-r4-snow-007 → Return talos-r4-snow-007 to the production pool
Whiteboard: [buildduty][loaned slave][capacity]

Comment 6

6 years ago
This slave is back in the pool but retains its note in slavealloc in case it starts acting up again.
Status: NEW → RESOLVED
Last Resolved: 6 years ago
Resolution: --- → FIXED
Whiteboard: [buildduty][loaned slave][capacity] → [buildduty][loaned slave][capacity][badslave?]
(Reporter)

Comment 7

6 years ago
https://tbpl.mozilla.org/php/getParsedLog.php?id=9677405&tree=Mozilla-Inbound
...
Cannot write to `firefox-13.0a1.en-US.mac64.dmg' (Invalid argument).
Status: RESOLVED → REOPENED
Priority: P3 → --
Resolution: FIXED → ---
Summary: Return talos-r4-snow-007 to the production pool → Please disable talos-r4-snow-007 and do a second level of fixing of it

Comment 8

6 years ago
Re-imaging doesn't seem to have worked. This mini will need some diagnostics and possibly repair.
Assignee: coop → server-ops-releng
Component: Release Engineering → Server Operations: RelEng
Priority: -- → P3
QA Contact: release → arich
Summary: Please disable talos-r4-snow-007 and do a second level of fixing of it → Hands-on required for talos-r4-snow-007

Updated

6 years ago
Blocks: 728535
Assignee: server-ops-releng → mlarrain
colo-trip: --- → scl1
I will pull it from the rack and get it sent for repair/diag.
Giving to desktop to send for repair.
Assignee: mlarrain → desktop-support
Component: Server Operations: RelEng → Server Operations: Desktop Issues
QA Contact: arich → tfairfield

Updated

6 years ago
Assignee: desktop-support → hlangi
Sent out with Computer Care.
Status: REOPENED → ASSIGNED
Whiteboard: [buildduty][loaned slave][capacity][badslave?] → [Computer Care 3/8]
Do we now have confirmation that this is covered under warranty and an ETA for the repaired machine to be returned to us?
Recovered from Computer Care; the logic board was replaced. Placing on Jake's desk.
Assignee: hlangi → jwatkins
Component: Server Operations: Desktop Issues → Server Operations: RelEng
QA Contact: tfairfield → arich
Whiteboard: [Computer Care 3/8] → [Computer Care returned 3/26]
(Assignee)

Comment 14

6 years ago
This slave has been returned to scl1 and has been reimaged.
Status: ASSIGNED → RESOLVED
Last Resolved: 6 years ago
Resolution: --- → FIXED

Updated

6 years ago
No longer blocks: 728535

Updated

6 years ago
Blocks: 728535

Comment 15

6 years ago
Reopening: this unit seems to be having disk-related trouble again.

See bug 728535 for examples.

Please pull it and take a look, thanks.

Updated

6 years ago
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
(Assignee)

Comment 16

6 years ago
I have pulled this slave and will send it to desktop for repairs.
Status: REOPENED → ASSIGNED
Whiteboard: [Computer Care returned 3/26] → [Computer Care deployed 5/02]
Returned from Computer Care; the hard drive was replaced.
Status: ASSIGNED → RESOLVED
Last Resolved: 6 years ago
Resolution: --- → FIXED
Whiteboard: [Computer Care deployed 5/02] → [Computer Care returned 5/10]
Component: Server Operations: RelEng → RelOps
Product: mozilla.org → Infrastructure & Operations