Closed
Bug 650335
Opened 14 years ago
Closed 14 years ago
bring up slaves from 3/17 IX repair trip
Categories
(Infrastructure & Operations :: RelOps: General, task)
Tracking
(Not tracked)
RESOLVED
FIXED
People
(Reporter: dustin, Assigned: zandr)
The following slaves came back from IX yesterday, and should be racked up now (from bug 596366 comment 102):
linux-ix-slave01: bug 624371 A1-16072 4620 scl1
linux-ix-slave06: bug 624210 A1-16077 4625 scl1
linux-ix-slave13: bug 619624 A1-16084 4632 scl1
linux-ix-slave16: comment 8 A1-16087 4635 scl1
linux-ix-slave17: comment 55 A1-16088 4636 scl1
linux-ix-slave33: bug 620124 A1-16163 4674 scl1
linux-ix-slave34: comment 58 A1-16164 4675 scl1
linux-ix-slave35: bug 620124 A1-16165 4676 scl1
linux-ix-slave42: bug 624207 A1-16172 4773 scl1
linux64-ix-slave04: comment 78 A1-16176 4777 scl1
linux64-ix-slave10: comment 78 A1-16182 4783 scl1
linux64-ix-slave11: comment 78 A1-16183 4784 scl1
linux64-ix-slave12: comment 83 A1-16184 4785 scl1
linux64-ix-slave13: comment 83 A1-16185 4786 scl1
linux64-ix-slave16: comment 78 A1-16188 4789 scl1
linux64-ix-slave40: comment 102 A1-16212 4813 scl1
linux64-ix-slave41: comment 101 A1-16213 4814 scl1
mv-moz2-linux-ix-slave12: A1-14132 3121 mtv1
w32-ix-slave07: A1-16053 4601 mtv1
w32-ix-slave08: bug 635416#c31 A1-16054 4602 mtv1
w32-ix-slave23: comment 51 A1-16069 4617 scl1
w32-ix-slave41: bug 615744 A1-16104 4705 scl1
w64-ix-slave02: bug 638814 A1-16107 4708 scl1
w64-ix-slave06: bug 639628#c22 A1-16111 4712 scl1
w64-ix-slave07: comment 78 A1-16112 4713 scl1
w64-ix-slave11: comment 78 A1-16116 4717 scl1
This bug will track bringing them back into production, including adding them to slavealloc.
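For a quick overview of where each machine landed, the list above can be grouped by its last column. This is only a sketch: it assumes the last field on each line is the datacenter (scl1 or mtv1), which the bug does not state explicitly, and the function name `hosts_by_site` is made up for illustration.

```python
from collections import defaultdict


def hosts_by_site(lines):
    """Group host names by the last whitespace-separated field of each line.

    Assumption: the last field is the datacenter; the bug itself does not
    label the columns.
    """
    groups = defaultdict(list)
    for line in lines:
        line = line.strip()
        if not line or ":" not in line:
            continue  # skip blank lines and anything that is not "host: ..."
        host, rest = line.split(":", 1)
        fields = rest.split()
        if fields:
            groups[fields[-1]].append(host.strip())
    return dict(groups)


# A few lines copied from the list above.
SAMPLE = [
    "linux-ix-slave01: bug 624371 A1-16072 4620 scl1",
    "mv-moz2-linux-ix-slave12: A1-14132 3121 mtv1",
    "w32-ix-slave07: A1-16053 4601 mtv1",
    "w64-ix-slave02: bug 638814 A1-16107 4708 scl1",
]

if __name__ == "__main__":
    for site, hosts in sorted(hosts_by_site(SAMPLE).items()):
        print(site, ", ".join(hosts))
```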
Comment 1 (Reporter) • 14 years ago
It looks like these aren't re-imaged yet. I don't know how to do that via IPMI, so I guess that's step one for someone else - zandr?
Comment 2 • 14 years ago
I am working on reimaging the linux-ix-slave* machines now, minus the ones that are not responding via IPMI (bug 651178).
The w64 reimages are waiting on the new ref image in bug 645024.
The linux64 images are waiting on the new ref image in bug 648342.
Comment 3 • 14 years ago
The following servers are not yet back from IX:
** linux-ix-slave01
** linux-ix-slave17
** linux-ix-slave33
** linux64-ix-slave10
** linux64-ix-slave16
** linux64-ix-slave41
** w32-ix-slave23
** w32-ix-slave41
** w64-ix-slave07
The following hosts have been reimaged and have had their hostnames set. They have not been added back into puppet, per Dustin.
Built from linux-ix-ref-20110204:
linux-ix-slave06
linux-ix-slave16
linux-ix-slave35
linux-ix-slave42
linux-ix-slave13 is installing slowly and had block errors the first time I tried to image it. It should be done by morning, and I'll check on it then.
linux-ix-slave34 is having issues getting to the boot menu from IPMI and needs physical intervention: bug 651306
Built from linux64-ix-ref-20110419:
linux64-ix-slave04
linux64-ix-slave11
linux64-ix-slave13
linux64-ix-slave40
linux64-ix-slave12 is still in the process of installing (quite slowly) and should be finished by morning.
The following hosts are on hold pending a new win64 image:
w64-ix-slave02 10.12.48.154
w64-ix-slave06 10.12.48.158
w64-ix-slave11 10.12.48.163
The following host is back in MTV waiting to be powered back on:
mv-moz2-linux-ix-slave12
I'd like some naming clarification on the following hosts before I image them, since they switched datacenters:
w32-ix-slave07
w32-ix-slave08
Should the records be for w32-ix-slaveNN or win32-ix-slaveNN? The new ones were set up as w32, but it appears that the records for the old ones are win32.
Status: NEW → ASSIGNED
Depends on: 651306
Comment 4 • 14 years ago
linux64-ix-slave12 is up as well.
linux-ix-slave13 failed after the second reimage, so I've opened up a hardware bug for it.
Comment 5 (Reporter) • 14 years ago
w32-ix-slave07
w32-ix-slave08
should be scl1.build.mozilla.org, and should keep the same names. Any records matching /win32-ix-slave.*/ are an error and should be deleted.
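As a rough illustration of the cleanup rule above, the quoted pattern can be applied to a list of record names to pick out the ones to delete. The record names below are hypothetical short names, and `stale_records` is an illustrative helper, not part of any existing inventory or DNS tooling.

```python
import re

# The pattern quoted in the comment: any record it matches is considered an error.
# re.match anchors at the start of the string, so "w32-ix-slave07" does not match.
STALE_RE = re.compile(r"win32-ix-slave.*")


def stale_records(names):
    """Return the record names that match the stale win32-ix-slave pattern."""
    return [name for name in names if STALE_RE.match(name)]


# Hypothetical record names, for illustration only.
records = [
    "w32-ix-slave07",
    "w32-ix-slave08",
    "win32-ix-slave07",
    "win32-ix-slave08",
]

if __name__ == "__main__":
    for name in stale_records(records):
        print("delete:", name)
```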
Comment 6 (Reporter) • 14 years ago
I have bad news:
[root@linux-ix-slave06 ~]# hdparm -tT /dev/sda
Timing cached reads: 29336 MB in 1.99 seconds = 14738.16 MB/sec
Timing buffered disk reads: 148 MB in 3.00 seconds = 49.28 MB/sec
[root@linux-ix-slave16 ~]# hdparm -tT /dev/sda
Timing cached reads: 29336 MB in 1.99 seconds = 14736.58 MB/sec
Timing buffered disk reads: 214 MB in 3.02 seconds = 70.97 MB/sec
[root@linux-ix-slave42 ~]# hdparm -tT /dev/sda
Timing cached reads: 29336 MB in 1.99 seconds = 14736.70 MB/sec
Timing buffered disk reads: 186 MB in 3.01 seconds = 61.86 MB/sec
[root@linux64-ix-slave04 ~]# hdparm -tT /dev/sda
Timing cached reads: 23900 MB in 2.00 seconds = 11964.91 MB/sec
Timing buffered disk reads: 258 MB in 3.00 seconds = 85.91 MB/sec
[root@linux64-ix-slave11 ~]# hdparm -tT /dev/sda
Timing cached reads: 23964 MB in 2.00 seconds = 11995.92 MB/sec
Timing buffered disk reads: 266 MB in 3.01 seconds = 88.41 MB/sec
[root@linux64-ix-slave12 ~]# hdparm -tT /dev/sda
Timing cached reads: 21128 MB in 2.00 seconds = 10576.90 MB/sec
Timing buffered disk reads: 46 MB in 3.07 seconds = 14.97 MB/sec
[root@linux64-ix-slave13 ~]# hdparm -tT /dev/sda
Timing cached reads: 23908 MB in 2.00 seconds = 11969.06 MB/sec
Timing buffered disk reads: 242 MB in 3.01 seconds = 80.44 MB/sec
[root@linux64-ix-slave40 ~]# hdparm -tT /dev/sda
Timing cached reads: 23924 MB in 2.00 seconds = 11977.16 MB/sec
Timing buffered disk reads: 162 MB in 3.00 seconds = 53.96 MB/sec
All but linux64-ix-slave12 are idle (booted in multiuser mode, but nothing running). All of them have a good bit of variance across multiple runs, but I only saw *one* check over 90MB/s, on linux64-ix-slave11. If 90MB/s is our send-it-back-to-IX threshold, then all of these systems need to go back. Zandr, what do you think?
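The comparison against the tentative 90MB/s threshold can be done mechanically from a pasted transcript like the one above. The following is a minimal sketch, assuming the prompt and output layout shown in this comment; the function names and the `THRESHOLD_MB_S` constant are illustrative. Given the run-to-run variance mentioned above, a single reading per host is only a rough indicator.

```python
import re

# Matches the shell prompt that precedes each hdparm run in the transcript.
PROMPT_RE = re.compile(r"\[root@([\w-]+) ~\]# hdparm")
# Matches the buffered-read result line and captures the MB/sec figure.
BUFFERED_RE = re.compile(
    r"Timing buffered disk reads:\s+\d+ MB in\s+[\d.]+ seconds =\s+([\d.]+) MB/sec"
)

THRESHOLD_MB_S = 90.0  # tentative send-it-back threshold discussed in the comment


def buffered_read_rates(transcript):
    """Return {hostname: buffered read rate in MB/sec} from a pasted transcript."""
    rates = {}
    host = None
    for line in transcript.splitlines():
        prompt = PROMPT_RE.search(line)
        if prompt:
            host = prompt.group(1)
            continue
        result = BUFFERED_RE.search(line)
        if result and host is not None:
            rates[host] = float(result.group(1))
    return rates


def below_threshold(rates, threshold=THRESHOLD_MB_S):
    """Return the hosts whose buffered read rate is under the threshold, sorted."""
    return sorted(host for host, rate in rates.items() if rate < threshold)


# Two of the measurements from the comment above.
SAMPLE = """\
[root@linux-ix-slave06 ~]# hdparm -tT /dev/sda
Timing cached reads: 29336 MB in 1.99 seconds = 14738.16 MB/sec
Timing buffered disk reads: 148 MB in 3.00 seconds = 49.28 MB/sec
[root@linux64-ix-slave12 ~]# hdparm -tT /dev/sda
Timing cached reads: 21128 MB in 2.00 seconds = 10576.90 MB/sec
Timing buffered disk reads: 46 MB in 3.07 seconds = 14.97 MB/sec
"""

if __name__ == "__main__":
    rates = buffered_read_rates(SAMPLE)
    for host in below_threshold(rates):
        print(f"{host}: {rates[host]:.2f} MB/sec")
```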
Comment 7 (Assignee) • 14 years ago
(In reply to comment #6)
> Zandr, what do you think?
I think I'm going to email this comment to iX and see what they have to say.
Comment 8 (Reporter) • 14 years ago
Idle measurement of linux64-ix-slave12:
[root@linux64-ix-slave12 ~]# hdparm -tT /dev/sda
Timing cached reads: 21432 MB in 2.00 seconds = 10727.27 MB/sec
Timing buffered disk reads: 36 MB in 3.10 seconds = 11.62 MB/sec
Comment 9 • 14 years ago
(In reply to comment #5)
> w32-ix-slave07
> w32-ix-slave08
I've reimaged these servers. I didn't see any post-imaging steps to take, so they're just as they booted up.
Comment 10 (Assignee) • 14 years ago
I stopped by iX systems last night and chatted with them a bit about what we've been seeing here.
They're going to package up their burnin script so we can run the same tests they do for production qualification.
Stay tuned.
Comment 11 • 14 years ago
FYI, w32-ix-slave08 seems to autologin the cltbld user, but w32-ix-slave07 tries to autologin Administrator and fails.
Comment 12 • 14 years ago
linux-ix-slave34 has now been reimaged as well, but it kernel panics, saying that it cannot find /dev/root.
The only machine back from IX for which a reimage has not yet been attempted is mv-moz2-linux-ix-slave12, which is still waiting to be powered on.
Updated • 14 years ago
Assignee: arich → zandr
Comment 13 (Reporter) • 14 years ago
I don't think there's anything left to do here - these systems will all get batched and sent to iX as part of bug 655304.
Status: ASSIGNED → RESOLVED
Closed: 14 years ago
Resolution: --- → FIXED
Updated • 11 years ago
Component: Server Operations: RelEng → RelOps
Product: mozilla.org → Infrastructure & Operations