Closed
Bug 1036509
Opened 11 years ago
Closed 11 years ago
Put 11 new mtnlion test machines into production
Categories
(Infrastructure & Operations Graveyard :: CIDuty, task)
Tracking
(Not tracked)
RESOLVED
FIXED
People
(Reporter: coop, Assigned: coop)
References
Details
Attachments
(3 files, 1 obsolete file)
863 bytes, patch | jlund: review+, coop: checked-in+
10.03 KB, patch | jlund: review+, coop: checked-in+
2.53 KB, patch | coop: review+, coop: checked-in+
11 builders were re-imaged as testers in bug 1034715. Let's get them into production.
Comment 1•11 years ago (Assignee)
The machines are:
talos-mtnlion-r5-090.test.releng.scl3.mozilla.com
talos-mtnlion-r5-091.test.releng.scl3.mozilla.com
talos-mtnlion-r5-092.test.releng.scl3.mozilla.com
talos-mtnlion-r5-093.test.releng.scl3.mozilla.com
talos-mtnlion-r5-094.test.releng.scl3.mozilla.com
talos-mtnlion-r5-095.test.releng.scl3.mozilla.com
talos-mtnlion-r5-096.test.releng.scl3.mozilla.com
talos-mtnlion-r5-097.test.releng.scl3.mozilla.com
talos-mtnlion-r5-098.test.releng.scl3.mozilla.com
talos-mtnlion-r5-099.test.releng.scl3.mozilla.com
talos-mtnlion-r5-100.test.releng.scl3.mozilla.com
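For anyone scripting against this pool, the list above can be regenerated rather than typed by hand. This is a minimal sketch; the helper name `mtnlion_hosts` is hypothetical, and the domain suffix and the 090-100 numbering are simply copied from the list above.

```python
# Hypothetical helper: rebuilds the 11 hostnames listed in this comment.
# DOMAIN and the 90..100 range are taken from the list above, nothing else.
DOMAIN = "test.releng.scl3.mozilla.com"


def mtnlion_hosts(first=90, last=100):
    """Return the FQDNs for talos-mtnlion-r5-<first> through <last>, inclusive."""
    return ["talos-mtnlion-r5-%03d.%s" % (n, DOMAIN) for n in range(first, last + 1)]


hosts = mtnlion_hosts()
for host in hosts:
    print(host)
```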
Comment 2•11 years ago (Assignee)
Attachment #8453236 - Flags: review?(jlund)
Comment 3•11 years ago (Assignee)
These are missing completely from the default data list right now, but have already been added in staging and production. Not awesome, I know, but we should get them back in sync.
Attachment #8453245 - Flags: review?(jlund)
Comment 4•11 years ago
Comment on attachment 8453245 [details] [diff] [review]
Add mtnlion slaves to graph server
Review of attachment 8453245 [details] [diff] [review]:
-----------------------------------------------------------------
I think we want "talos-mtnlion-r5-XXX" here no?
I assume we either didn't put these in staging/production or they would have blown up as dups?
Attachment #8453245 - Flags: review?(jlund) → review-
Comment 5•11 years ago
Comment on attachment 8453236 [details] [diff] [review]
Add 11 mtnlion slaves to test pool
Review of attachment 8453236 [details] [diff] [review]:
-----------------------------------------------------------------
Just a sanity check: bug 1034715 specifies 10 machines: 90-99. This bug here specifies 11: 90-100.
I'm guessing we just had another machine that buildbot didn't know about, but something also seems wrong with hostname resolution (look at 090):
> ssh cltbld@talos-mtnlion-r5-090
[cltbld@bld-lion-r5-038.try.releng.scl3.mozilla.com ~]$ exit
> ssh cltbld@talos-mtnlion-r5-091
[cltbld@talos-mtnlion-r5-091.test.releng.scl3.mozilla.com ~]$ exit
> ssh cltbld@talos-mtnlion-r5-100
[cltbld@talos-mtnlion-r5-100.test.releng.scl3.mozilla.com ~]$
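The manual ssh check above can be expressed as a small comparison over (name connected to, name the machine reports) pairs. This is only a sketch: the function names are hypothetical, and the `observed` data is copied by hand from the three prompts shown above rather than collected over the network.

```python
# Hypothetical sketch: flag machines whose self-reported hostname does not
# match the name we connected to. No network access; data is hard-coded.

def short_name(hostname):
    """Strip the domain part: 'a.b.c' -> 'a'."""
    return hostname.split(".", 1)[0]


def find_mismatches(reported):
    """reported maps the name we ssh'd to -> the FQDN shown in the prompt."""
    return {
        asked: actual
        for asked, actual in reported.items()
        if short_name(actual) != short_name(asked)
    }


# Values copied from the ssh prompts in this comment.
observed = {
    "talos-mtnlion-r5-090": "bld-lion-r5-038.try.releng.scl3.mozilla.com",
    "talos-mtnlion-r5-091": "talos-mtnlion-r5-091.test.releng.scl3.mozilla.com",
    "talos-mtnlion-r5-100": "talos-mtnlion-r5-100.test.releng.scl3.mozilla.com",
}

mismatches = find_mismatches(observed)
for asked, actual in mismatches.items():
    print("%s reports itself as %s" % (asked, actual))
```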
Comment 6•11 years ago (Assignee)
(In reply to Jordan Lund (:jlund) from comment #5)
> just a sanity check Bug 1034715 specifies 10 machines: 90-99. this bug here
> specifies 11: 90-100
>
> I'm guessing we just had another machine that buildbot didn't know about but
> something also seems wrong with hostname resolve (look at 090)
Yes, Amy returned bld-lion-r5-087 on loan in bug 992378 as a mtnlion tester at my request.
Comment 7•11 years ago (Assignee)
(In reply to Jordan Lund (:jlund) from comment #4)
> I think we want "talos-mtnlion-r5-XXX" here no?
>
> I assume we either didn't put these in staging/production or they would have
> blown up as dups?
Yes, sorry. Cut-n-paste fail here on my part.
These aren't getting added to production/stage because they're already there. I'm just trying to back-fill the graph server bootstrapping SQL.
Attachment #8453245 - Attachment is obsolete: true
Attachment #8453457 - Flags: review?(jlund)
Updated•11 years ago
Attachment #8453457 - Flags: review?(jlund) → review+
Comment 8•11 years ago
Comment on attachment 8453236 [details] [diff] [review]
Add 11 mtnlion slaves to test pool
Review of attachment 8453236 [details] [diff] [review]:
-----------------------------------------------------------------
Ah OK. And I assume you are confident that it won't be an issue if talos-mtnlion-r5-090 thinks its host name is different than its fqdn as you didn't touch on that part of the review comment. My primitive understanding thought maybe this machine was imaged incorrectly or it would have puppetizing/buildbot issues.
Sorry to comment on this again, but looking at this slave, my spidey sense says it was not re-imaged, which may warrant confirmation from Amy or whoever did it.
jlund@Hastings163:~BUDM (*)
> ssh cltbld@talos-mtnlion-r5-090 build-cfg-integration [47aabad] modified
[cltbld@bld-lion-r5-038.try.releng.scl3.mozilla.com ~]$ ls /builds/slave/
buildbot.tac try-osx64-00000000000000000000 twistd.log twistd.log.3 twistd.log.7
reboot_count.txt try-osx64-d-000000000000000000 twistd.log.1 twistd.log.4 twistd.log.8
tb-try-c-cen-osx64-00000000000 try-osx64_g-000000000000000000 twistd.log.10 twistd.log.5 twistd.log.9
tb-try-c-cen-osx64-d-000000000 twistd.hostname twistd.log.2 twistd.log.6
tail /var/log/system.log
Jul 9 21:39:04 bld-lion-r5-038 com.apple.launchd[1] (org.collectd.collectd): Throttling respawn: Will start in 10 seconds
Jul 9 21:39:14 bld-lion-r5-038 collectd[32465]: Looking up "bld-lion-r5-038.try.releng.scl3.mozilla.com" failed. You have set the "FQDNLookup" option, but I cannot resolve my hostname to a fully qualified domain name. Please fix you network configuration.
Jul 9 21:39:14 bld-lion-r5-038 com.apple.launchd[1] (org.collectd.collectd[32465]): Exited with code: 1
Jul 9 21:39:14 bld-lion-r5-038 com.apple.launchd[1] (org.collectd.collectd): Throttling respawn: Will start in 10 seconds
Jul 9 21:39:24 bld-lion-r5-038 collectd[32478]: Looking up "bld-lion-r5-038.try.releng.scl3.mozilla.com" failed. You have set the "FQDNLookup" option, but I cannot resolve my
/var/log/system.log.7.bz2 goes back to July 1st.
Either way r+ for what this patch is trying to achieve.
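The system.log excerpt above already shows which name the machine uses for itself, since the hostname is the fourth whitespace-separated field of each syslog line. A rough sketch of pulling that out, assuming the classic "Mon D HH:MM:SS host ..." layout seen above; the function name is hypothetical and the sample lines are copied from this comment.

```python
# Hypothetical sketch: collect the hostnames a syslog excerpt was written under.
# Assumes each line starts with "Mon D HH:MM:SS host ..." as in the excerpt above.

def syslog_hosts(lines):
    """Return the set of hostnames found in the fourth field of each line."""
    hosts = set()
    for line in lines:
        parts = line.split()
        if len(parts) >= 4:
            hosts.add(parts[3])
    return hosts


# Two lines copied from the tail of /var/log/system.log shown in this comment.
sample = [
    "Jul 9 21:39:04 bld-lion-r5-038 com.apple.launchd[1] (org.collectd.collectd): "
    "Throttling respawn: Will start in 10 seconds",
    "Jul 9 21:39:14 bld-lion-r5-038 com.apple.launchd[1] "
    "(org.collectd.collectd[32465]): Exited with code: 1",
]

found = syslog_hosts(sample)
print(found)
```

If every line reports the old builder name, that is consistent with the machine not having been re-imaged, though it does not prove it.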
Attachment #8453236 - Flags: review?(jlund) → review+
Comment 9•11 years ago (Assignee)
(In reply to Jordan Lund (:jlund) from comment #8)
> Ah OK. And I assume you are confident that it won't be an issue if
> talos-mtnlion-r5-090 thinks its host name is different than its fqdn as you
> didn't touch on that part of the review comment. My primitive understanding
> thought maybe this machine was imaged incorrectly or it would have
> puppetizing/buildbot issues.
The fact that it's still displaying its old hostname means something is amiss here. Maybe it just got missed in the re-imaging?
Since the old version of this machine (bld-lion-r5-038) is disabled in slavealloc, it doesn't really matter for now. I've kicked off a re-image for this slave using the test cluster commands from the mana page to see if that fixes it. I'm obviously not going to enable the other slaves now. I'll do that and check the result of the re-image in the morning.
Comment 10•11 years ago (Assignee)
Comment on attachment 8453236 [details] [diff] [review]
Add 11 mtnlion slaves to test pool
Review of attachment 8453236 [details] [diff] [review]:
-----------------------------------------------------------------
https://hg.mozilla.org/build/buildbot-configs/rev/4695ddab8666
Attachment #8453236 - Flags: checked-in+
Comment 11•11 years ago (Assignee)
Comment on attachment 8453457 [details] [diff] [review]
Add mtnlion slaves to graph server, v2
Review of attachment 8453457 [details] [diff] [review]:
-----------------------------------------------------------------
http://hg.mozilla.org/graphs/rev/f3b5552dba34
Attachment #8453457 - Flags: checked-in+
Comment 12•11 years ago (Assignee)
Attachment #8453953 - Flags: review?(jlund)
Comment 13•11 years ago
Comment on attachment 8453953 [details] [diff] [review]
Move 3 preprod lion build slaves into production
Review of attachment 8453953 [details] [diff] [review]:
-----------------------------------------------------------------
lgtm with one inline check.
::: mozilla/production_config.py
@@ +1,1 @@
> +MAC_LION_MINIS = ['bld-lion-r5-%03d' % x for x in range(1,16) + range(41,69) + range(70,87) + range(89,92) + range(93,95)]
Sanity check: do we want to remove bld-lion-r5-088? I still see it in slavealloc, and it says it's enabled.
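The question about 088 can be checked mechanically by expanding the ranges in the quoted line. A minimal sketch follows; the quoted config concatenates `range()` results with `+`, which only works on Python 2, so this version uses `itertools.chain` to run on Python 3. The variable names other than `MAC_LION_MINIS` are hypothetical.

```python
from itertools import chain

# Ranges copied from the quoted line of the patch under review.
ranges = [range(1, 16), range(41, 69), range(70, 87), range(89, 92), range(93, 95)]

# chain() stands in for the Python 2 "range(...) + range(...)" list concatenation.
ids = list(chain(*ranges))
MAC_LION_MINIS = ['bld-lion-r5-%03d' % x for x in ids]

# Is the slave the reviewer asked about covered by the list as written?
has_088 = 'bld-lion-r5-088' in MAC_LION_MINIS
print("bld-lion-r5-088 present:", has_088)
print("total slaves:", len(MAC_LION_MINIS))
```

As written, no range covers 87 or 88, which matches the reviewer's observation that 088 would drop out of the list.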
Comment 14•11 years ago (Assignee)
Comment on attachment 8453953 [details] [diff] [review]
Move 3 preprod lion build slaves into production
Review of attachment 8453953 [details] [diff] [review]:
-----------------------------------------------------------------
Got an r+ from jlund in IRC, provided I add back in 088.
https://hg.mozilla.org/build/buildbot-configs/rev/d9062d7fa837
Attachment #8453953 - Flags: review?(jlund)
Attachment #8453953 - Flags: review+
Attachment #8453953 - Flags: checked-in+
Comment 15•11 years ago (Assignee)
Merged to production, and deployed.
Comment 16•11 years ago (Assignee)
I've enabled these slaves in slavealloc, and have rebooted them. They should begin taking jobs shortly.
Comment 17•11 years ago (Assignee)
These are all taking jobs now.
Take that, wait times!
Status: ASSIGNED → RESOLVED
Closed: 11 years ago
Resolution: --- → FIXED
Updated•7 years ago
Component: Platform Support → Buildduty
Product: Release Engineering → Infrastructure & Operations
Updated•6 years ago
Product: Infrastructure & Operations → Infrastructure & Operations Graveyard