Closed
Bug 644991
Opened 14 years ago
Closed 14 years ago
disable masters on buildbot-master1 due to slow drive
Categories
(Release Engineering :: General, defect)
Tracking
(Not tracked)
RESOLVED
FIXED
People
(Reporter: catlee, Assigned: catlee)
References
Details
(Whiteboard: [buildmasters][buildduty])
Attachments
(1 file)
5.10 KB, patch (dustin: review+, catlee: checked-in+)
Depending on how jittery the drive is feeling, I get hdparm results ranging from 70-95MB/s on buildbot-master1. The machine has been up for 97 days, and has plenty of these messages in dmesg:
ata1: spurious interrupt (irq_stat 0x8 active_tag -84148995 sactive 0x8)
ata1: spurious interrupt (irq_stat 0x8 active_tag -84148995 sactive 0x2)
I will reboot and see what the machine feels like afterwards.
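For anyone wanting to compare before and after the reboot, here is a rough sketch of tallying those messages per ATA port. It assumes the dmesg output has been saved as text; the function name `count_spurious_interrupts` is made up for illustration and is not part of any RelEng tooling.

```python
import re
from collections import Counter

# Matches kernel lines such as:
#   ata1: spurious interrupt (irq_stat 0x8 active_tag -84148995 sactive 0x8)
# A leading timestamp like "[  12.345]" is tolerated because we search, not match.
SPURIOUS_RE = re.compile(r"\b(ata\d+): spurious interrupt\b")


def count_spurious_interrupts(dmesg_text):
    """Return a Counter mapping ATA port name to the number of spurious interrupt lines."""
    counts = Counter()
    for line in dmesg_text.splitlines():
        match = SPURIOUS_RE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


if __name__ == "__main__":
    # The two lines quoted in this bug, used as sample input.
    sample = (
        "ata1: spurious interrupt (irq_stat 0x8 active_tag -84148995 sactive 0x8)\n"
        "ata1: spurious interrupt (irq_stat 0x8 active_tag -84148995 sactive 0x2)\n"
    )
    for port, n in count_spurious_interrupts(sample).items():
        print(port, n)
```

Running it on a capture taken before the reboot and another taken afterwards would give a crude indication of whether the reboot changed anything.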
Assignee
Comment 1•14 years ago
Running a SMART self test causes a whole ton of those spurious interrupt messages
Comment 2•14 years ago
Before we can take this out of service, we need to move the slaves off of it. Moving to RelEng; throw it back when the slaves have been moved.
Assignee: server-ops-releng → nobody
Component: Server Operations: RelEng → Release Engineering
QA Contact: zandr → release
Updated•14 years ago
Priority: -- → P3
Whiteboard: [buildmasters][slaveduty]
Comment 3•14 years ago
The slaves are off the master.
I wasn't aware that we are lacking so many masters.
I have bumped the severity to critical as it affects our current capacity.
Assignee: nobody → server-ops-releng
Severity: normal → critical
Component: Release Engineering → Server Operations: RelEng
Priority: P3 → --
QA Contact: release → zandr
Whiteboard: [buildmasters][slaveduty] → [buildmasters][slaveduty][buildduty]
Comment 4•14 years ago
Reducing the severity, as there is not much more that you can do besides getting a healthy master.
Severity: critical → normal
Comment 5•14 years ago
Sorry for the noise. There are still some slaves connected.
Assignee: server-ops-releng → nobody
Component: Server Operations: RelEng → Release Engineering
QA Contact: zandr → release
Comment 6•14 years ago
Found in triage:
1) Before we can hand this over to IT, we have to power off all the master instances running on this box.
2) Per triage right now, the try and build master instances are still running. The test master instance is already off.
3) Because of load on other buildbot masters, we cannot power off this master just yet, no matter how sick it is. Once the other new masters (04,06) are fully online, and sharing load, we can revisit.
Pushing this bug to catlee, who is working on setting up 04,06, and therefore will know when it's safe to power off buildbot-master1.
Assignee: nobody → catlee
Comment 7•14 years ago
Any update on this? Would like to get this shut down so we can send it out for repair.
Comment 8•14 years ago
zandr: we are waiting to disable the masters on this host until we have finished setting up more masters in SJC (bug 656413).
Adding dependency.
Depends on: 656413
Summary: buildbot-master1 has slow drive → disable masters on buildbot-master1 due to slow drive
Whiteboard: [buildmasters][slaveduty][buildduty] → [buildmasters][slaveduty][buildduty] waiting on setup of other masters
Updated•14 years ago
Whiteboard: [buildmasters][slaveduty][buildduty] waiting on setup of other masters → [buildmasters][buildduty][waiting on setup of other masters]
Assignee
Comment 9•14 years ago
This is ready to go back to IX.
No longer depends on: 656413
Whiteboard: [buildmasters][buildduty][waiting on setup of other masters] → [buildmasters][buildduty]
Assignee
Updated•14 years ago
Status: NEW → RESOLVED
Closed: 14 years ago
Resolution: --- → FIXED
Assignee
Comment 10•14 years ago
Attachment #536417 -
Flags: review?(dustin)
Comment 11•14 years ago
In which bug is this host tracked to be sent back to IX?
Comment 12•14 years ago
Comment on attachment 536417 [details] [diff] [review]
Remove buildbot-master1 from masters json
Remove them from slavealloc, too?
Attachment #536417 -
Flags: review?(dustin) → review+
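For readers without access to the patch, the change amounts to dropping this host's entries from the masters JSON list. The sketch below is only an illustration: the `name` and `hostname` fields, the `example.org` domain, and the `remove_host` helper are assumptions, since the real file layout is not shown in this bug.

```python
import json


def remove_host(masters, host):
    """Return the entries whose hostname's first DNS label is not `host`.

    Comparing the first label exactly avoids also dropping hosts such as
    buildbot-master10 when removing buildbot-master1.
    """
    return [m for m in masters if m.get("hostname", "").split(".")[0] != host]


if __name__ == "__main__":
    # Hypothetical entries; the real masters JSON is not reproduced here.
    masters = json.loads("""
    [
      {"name": "bm01-build", "hostname": "buildbot-master1.example.org"},
      {"name": "bm01-try",   "hostname": "buildbot-master1.example.org"},
      {"name": "bm04-build", "hostname": "buildbot-master04.example.org"}
    ]
    """)
    print(json.dumps(remove_host(masters, "buildbot-master1"), indent=2))
```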
Assignee
Comment 13•14 years ago
(In reply to comment #12)
> Comment on attachment 536417 [details] [diff] [review]
> Remove buildbot-master1 from masters json
>
> Remove them from slavealloc, too?
Yeah, might as well. Is that doable easily?
Assignee
Updated•14 years ago
Attachment #536417 -
Flags: checked-in+
Comment 14•14 years ago
Via mysql, yes - I took care of it.
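The exact statements used are not recorded in this bug. As a loose sketch of what such a cleanup could look like, the example below uses Python's `sqlite3` as a stand-in for MySQL, with an entirely hypothetical schema (`masters` and `slaves` tables, `masterid` and `host` columns); slavealloc's real tables may differ.

```python
import sqlite3


def make_demo_db():
    """Build an in-memory database with a hypothetical slavealloc-like schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE masters (masterid INTEGER PRIMARY KEY, nickname TEXT, host TEXT);
        CREATE TABLE slaves (name TEXT, masterid INTEGER);
    """)
    conn.executemany("INSERT INTO masters VALUES (?, ?, ?)", [
        (1, "bm01-build", "buildbot-master1"),
        (2, "bm01-try", "buildbot-master1"),
        (3, "bm04-build", "buildbot-master04"),
    ])
    conn.executemany("INSERT INTO slaves VALUES (?, ?)", [
        ("slave-a", 1), ("slave-b", 2), ("slave-c", 3),
    ])
    conn.commit()
    return conn


def remove_masters_for_host(conn, host):
    """Detach slaves from every master on `host`, delete those masters, return the count."""
    cur = conn.cursor()
    ids = [row[0] for row in cur.execute(
        "SELECT masterid FROM masters WHERE host = ?", (host,))]
    for masterid in ids:
        # Clear references first so no slave points at a deleted master.
        cur.execute("UPDATE slaves SET masterid = NULL WHERE masterid = ?", (masterid,))
        cur.execute("DELETE FROM masters WHERE masterid = ?", (masterid,))
    conn.commit()
    return len(ids)


if __name__ == "__main__":
    db = make_demo_db()
    print("masters removed:", remove_masters_for_host(db, "buildbot-master1"))
```

Clearing the slave references before deleting the master rows keeps the two tables consistent if the run is interrupted partway.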
Updated•11 years ago
Product: mozilla.org → Release Engineering