Closed
Bug 848885
Opened 11 years ago
Closed 11 years ago
Move staging test minis to production
Categories
(Infrastructure & Operations Graveyard :: CIDuty, task, P3)
Tracking
(Not tracked)
RESOLVED
FIXED
People
(Reporter: armenzg, Assigned: armenzg)
References
Details
Attachments
(5 files, 1 obsolete file)
4.47 KB, patch | nthomas: review+ | armenzg: checked-in+
20.32 KB, patch | nthomas: review+ | armenzg: checked-in+
507 bytes, patch | nthomas: review+ | armenzg: checked-in+
679 bytes, patch | nthomas: review+ | armenzg: checked-in+
1.67 KB, patch | nthomas: review+ | armenzg: checked-in+
Adding a few machines back to the pool will slightly help the wait times.
Let's take all of our preprod-test machines and make them take production jobs. Whenever we need one of these machines, we can pull it back from production. If we need to modify or loan them, we will need to use our usual re-imaging.
For the puppet slaves, I assume we will sync with production and put them on staging when we need to test a new package. It will be a little harder to test new packages, but I believe that is better than finding discrepancies when taking production jobs.
We will need to remove the network rules that we added when we wanted to determine whether we could run tests without reaching external networks. AFAIK that project is halted.
Once we have all the patches ready, we should re-image all of these machines just to be sure that we start from a clean state.
talos-mtnlion-r5-001
talos-mtnlion-r5-002
talos-mtnlion-r5-003
talos-mtnlion-r5-010
talos-r3-fed-001
talos-r3-fed-002
talos-r3-fed-010
talos-r3-fed64-001
talos-r3-fed64-002
talos-r3-fed64-010
talos-r3-w7-001
talos-r3-w7-002
talos-r3-w7-003
talos-r3-w7-010
talos-r3-xp-001
talos-r3-xp-002
talos-r3-xp-003
talos-r3-xp-010
talos-r4-lion-001
talos-r4-lion-002
talos-r4-lion-003
talos-r4-lion-010
talos-r4-snow-001
talos-r4-snow-002
talos-r4-snow-003
<strike>talos-r4-snow-046</strike> - it misbehaves
Assignee | ||
Updated•11 years ago
|
Blocks: talos-r3-fed64-001
Assignee | ||
Updated•11 years ago
|
Blocks: talos-r3-w7-003
Comment 1•11 years ago
|
||
(In reply to Armen Zambrano G. [:armenzg] from comment #0)
> <strike>talos-r4-snow-046</strike> - it misbehaves
Yes, we'll need to make sure the slavealloc comment is preserved for this slave (and any others in the same state), and that it doesn't get enabled by accident.
Assignee | ||
Comment 2•11 years ago
|
||
poolids determined through this query:
mysql> select * from pools where name like 'tests-scl1%';
+--------+--------------------+
| poolid | name               |
+--------+--------------------+
|     22 | tests-scl1-linux   |
|     11 | tests-scl1-macosx  |
|     29 | tests-scl1-panda   |
|     19 | tests-scl1-windows |
+--------+--------------------+
4 rows in set (0.01 sec)
This is the last patch to land. Everything else has to happen first.
Attachment #722429 -
Flags: review?(nthomas)
Assignee | ||
Comment 3•11 years ago
|
||
Attachment #722440 -
Flags: review?(nthomas)
Assignee | ||
Comment 4•11 years ago
|
||
Graphs seems to have those machines in production.
Assignee | ||
Comment 5•11 years ago
|
||
Armens-MacBook-Air puppet-manifests hg:[default!] $ for i in `cat list_machines | grep -v "w7|xp|mtn"`; do grep "$i" scl-production.pp; done | wc -l
13
Armens-MacBook-Air puppet-manifests hg:[default!] $ for i in `cat list_machines`; do grep "$i" scl-production.pp; done | wc -l
13
Armens-MacBook-Air puppet-manifests hg:[default!] $ for i in `cat list_machines`; do grep "$i" scl-production.pp; done
node "talos-r3-fed-001" inherits "fedora12-i686-test" {
node "talos-r3-fed-002" inherits "fedora12-i686-test" {
node "talos-r3-fed-010" inherits "fedora12-i686-test" {
node "talos-r3-fed64-001" inherits "fedora12-x86_64-test" {
node "talos-r3-fed64-002" inherits "fedora12-x86_64-test" {
node "talos-r3-fed64-010" inherits "fedora12-x86_64-test" {
node "talos-r4-lion-001" inherits "darwin11-x86_64-test" {
node "talos-r4-lion-002" inherits "darwin11-x86_64-test" {
node "talos-r4-lion-003" inherits "darwin11-x86_64-test" {
node "talos-r4-lion-010" inherits "darwin11-x86_64-test" {
node "talos-r4-snow-001" inherits "darwin10-i386-test" {
node "talos-r4-snow-002" inherits "darwin10-i386-test" {
node "talos-r4-snow-003" inherits "darwin10-i386-test" {
Armens-MacBook-Air puppet-manifests hg:[default!] $ for i in `cat list_machines`; do grep "$i" staging.pp; done | wc -l
0
Armens-MacBook-Air puppet-manifests hg:[default!] $ grep "talos-r4-snow-046" staging.pp
node "talos-r4-snow-046" inherits "darwin10-i386-test" {
Armens-MacBook-Air puppet-manifests hg:[default!] $ grep "talos-r4-snow-046" scl-production.pp
Attachment #722445 -
Flags: review?(nthomas)
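The manifest check in the shell session above can also be sketched in Python. This is a minimal illustration, not part of any attached patch: the helper names and the sample manifest text are hypothetical, and only the `node "..." inherits "..."` line format is taken from the grep output.

```python
import re

# Matches lines such as: node "talos-r3-fed-001" inherits "fedora12-i686-test" {
# (format taken from the grep output in this comment).
NODE_RE = re.compile(r'^node\s+"([^"]+)"\s+inherits\s+"([^"]+)"', re.MULTILINE)


def declared_nodes(manifest_text):
    """Return {hostname: parent class} for every node declaration found."""
    return dict(NODE_RE.findall(manifest_text))


def split_by_manifest(machines, manifest_text):
    """Partition machines into (declared, missing) for one manifest."""
    nodes = declared_nodes(manifest_text)
    declared = [m for m in machines if m in nodes]
    missing = [m for m in machines if m not in nodes]
    return declared, missing


# Hypothetical stand-in for the contents of a manifest such as scl-production.pp.
sample_manifest = '''
node "talos-r3-fed-001" inherits "fedora12-i686-test" {
}
node "talos-r4-snow-001" inherits "darwin10-i386-test" {
}
'''

machines = ["talos-r3-fed-001", "talos-r3-w7-001", "talos-r4-snow-001"]
declared, missing = split_by_manifest(machines, sample_manifest)
print("declared:", declared)
print("missing:", missing)
```

Matching on the exact hostname avoids substring hits that a plain grep pattern could produce. Note also that in POSIX basic regular expressions `|` is an ordinary character, so `grep -v "w7|xp|mtn"` probably filtered nothing; the count of 13 is the same either way because no w7, xp or mtnlion host appears in the production manifest output.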
Assignee | ||
Comment 6•11 years ago
|
||
Do I need to make any changes for mtnlion slaves?
Assignee | ||
Comment 7•11 years ago
|
||
Attachment #722449 -
Flags: review?(nthomas)
Comment 8•11 years ago
|
||
Comment on attachment 722429 [details] [diff] [review]
move staging machines to production (slavealloc)
You should use poolid=28 for the talos-mtnlion slaves, as they talk to scl3 masters. Otherwise OK.
Attachment #722429 -
Flags: review?(nthomas) → review-
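A rough sketch of the pool assignment this review asks for, in Python. The scl1 pool IDs come from the query in comment 2 and the value 28 from this review; the prefix-to-pool mapping and the function name are assumptions made for illustration, not the contents of the slavealloc patch.

```python
# Pool IDs: 22, 11 and 19 are from the query in comment 2; 28 is from this review.
# Which hostname prefix belongs to which pool is an assumption based on the OS.
POOL_BY_PREFIX = [
    ("talos-mtnlion-", 28),  # talks to scl3 masters (per this review)
    ("talos-r3-fed", 22),    # tests-scl1-linux, covers fed and fed64
    ("talos-r3-w7-", 19),    # tests-scl1-windows
    ("talos-r3-xp-", 19),    # tests-scl1-windows
    ("talos-r4-lion-", 11),  # tests-scl1-macosx
    ("talos-r4-snow-", 11),  # tests-scl1-macosx
]


def pool_for(slave_name):
    """Return the poolid for a slave, or raise if the name is not recognised."""
    for prefix, poolid in POOL_BY_PREFIX:
        if slave_name.startswith(prefix):
            return poolid
    raise ValueError("no pool known for %s" % slave_name)


for name in ["talos-mtnlion-r5-002", "talos-r3-fed64-010", "talos-r3-xp-001"]:
    print(name, pool_for(name))
```

Raising on an unknown prefix, rather than falling back to a default pool, keeps a mistyped hostname from being silently assigned somewhere.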
Comment 9•11 years ago
|
||
Comment on attachment 722440 [details] [diff] [review]
buildbot-configs - move staging test slaves to production
>diff --git a/mozilla-tests/production_config.py b/mozilla-tests/production_config.py
>- 'fedora64' : dict([("talos-r3-fed64-%03i" % x, {}) for x in range (3,10) + range(11,35) + range(36,72)]),
>+ 'fedora64' : dict([("talos-r3-fed64-%03i" % x, {}) for x in range (1,72)]),
talos-r3-fed64-035 got decommissioned but you're adding it back here.
>- 'win7': dict([("talos-r3-w7-%03i" % x, {}) for x in range(4,10) + range(11,17) + range(18,105)]),
>+ 'win7': dict([("talos-r3-w7-%03i" % x, {}) for x in range(1,105)]),
talos-r3-w7-018 also got decommissioned
>- 'snowleopard': dict([("talos-r4-snow-%03i" % x, {}) for x in range(4,46) + range(47,81) + [82,84]]),
>+ 'snowleopard': dict([("talos-r4-snow-%03i" % x, {}) for x in range(1,84) \
>+ if x not in [46]]), # bug 824754 - This machine is not suitable for production
We don't have a talos-r4-snow-081, apparently, so let's not add it to the config. r+ if you fix that up.
Attachment #722440 -
Flags: review?(nthomas) → review+
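The effect of the range changes quoted in this review can be checked mechanically. The sketch below is written for Python 3 (where `range` objects must be converted to lists before concatenation) and uses hypothetical helper names; the ranges and name templates are copied from the quoted diff.

```python
def span(start, stop):
    """List form of range(), so spans can be concatenated with +."""
    return list(range(start, stop))


def diff_pool(template, old_numbers, new_numbers):
    """Return (added, removed) slave names between two sets of numbers."""
    old = {template % n for n in old_numbers}
    new = {template % n for n in new_numbers}
    return sorted(new - old), sorted(old - new)


changes = {
    "fedora64": diff_pool(
        "talos-r3-fed64-%03i",
        span(3, 10) + span(11, 35) + span(36, 72),
        span(1, 72),
    ),
    "win7": diff_pool(
        "talos-r3-w7-%03i",
        span(4, 10) + span(11, 17) + span(18, 105),
        span(1, 105),
    ),
    "snowleopard": diff_pool(
        "talos-r4-snow-%03i",
        span(4, 46) + span(47, 81) + [82, 84],
        [x for x in span(1, 84) if x not in [46]],
    ),
}

for platform, (added, removed) in changes.items():
    print(platform, "added:", added, "removed:", removed)
```

Run against the quoted ranges, this reports that the patch adds fed64 001, 002, 010 and 035; w7 001, 002, 003, 010 and 017; and snow 001, 002, 003, 081 and 083, while dropping snow 084. The win7 slave that the new range re-adds is therefore 017 rather than 018, which matches the correction made later in comment 20.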
Updated•11 years ago
|
Attachment #722449 -
Flags: review?(nthomas) → review+
Updated•11 years ago
|
Attachment #722445 -
Flags: review?(nthomas) → review+
Comment 10•11 years ago
|
||
Comment on attachment 722440 [details] [diff] [review] buildbot-configs - move staging test slaves to production Does this still pass buildbot-configs/mozilla/test/test_slave_allocation.py ?
Updated•11 years ago
|
OS: Mac OS X → All
Summary: Make staging tests machines to take production jobs → Move staging test minis to production
Assignee | ||
Comment 11•11 years ago
|
||
(In reply to Nick Thomas [:nthomas] from comment #10)
> Comment on attachment 722440 [details] [diff] [review]
> buildbot-configs - move staging test slaves to production
>
> Does this still pass buildbot-configs/mozilla/test/test_slave_allocation.py ?
Yes, it still does.
Assignee | ||
Updated•11 years ago
|
Attachment #722440 -
Flags: checked-in+
Assignee | ||
Updated•11 years ago
|
Attachment #722445 -
Flags: checked-in+
Assignee | ||
Updated•11 years ago
|
Attachment #722449 -
Flags: checked-in+
Assignee | ||
Comment 12•11 years ago
|
||
These machines are waiting for a reconfig, after which the slavealloc patch will be run:
> talos-mtnlion-r5-002 - no puppet changes are needed
> talos-mtnlion-r5-003 - no puppet changes are needed
> talos-r3-w7-001 - hostname changed
> talos-r3-w7-002 - hostname changed
> talos-r3-w7-003 - hostname changed
> talos-r3-w7-010 - hostname changed
> talos-r3-xp-001 - hostname changed and added to OPSI production
> talos-r3-xp-010 - hostname changed and added to OPSI production
> talos-r4-lion-001 - booked
> talos-r4-lion-002 - IT working on it
> talos-r4-lion-003 - puppetized
> talos-r4-lion-010 - puppetized
> talos-r4-snow-001 - puppetized
> talos-r4-snow-002 - puppetized
> talos-r4-snow-003 - puppetized
These are waiting for various reasons:
> talos-mtnlion-r5-001 - booked
> talos-mtnlion-r5-010 - booked
> talos-r3-fed-001 - booked
> talos-r3-fed-002 - IT working on it
> talos-r3-fed-010 - IT working on it
> talos-r3-fed64-001 - IT working on it
> talos-r3-fed64-002 - IT working on it
> talos-r3-fed64-010 - booked
> talos-r3-xp-002 - IT working on it
> talos-r3-xp-003 - IT working on it
Assignee | ||
Comment 13•11 years ago
|
||
These machines are still being used by other relengers and might cause them trouble if they don't sync with staging. I have added them back. This patch is for when we have moved them to production.
Attachment #722860 -
Flags: review?(nthomas)
Assignee | ||
Comment 14•11 years ago
|
||
Attachment #722429 -
Attachment is obsolete: true
Attachment #722861 -
Flags: review?(nthomas)
Assignee | ||
Updated•11 years ago
|
Priority: -- → P2
Updated•11 years ago
|
Attachment #722860 -
Attachment description: add back few staging machines → remove last few staging machines
Attachment #722860 -
Flags: review?(nthomas) → review+
Updated•11 years ago
|
Attachment #722861 -
Flags: review?(nthomas) → review+
Assignee | ||
Comment 15•11 years ago
|
||
Merged and reconfiguration completed.
Assignee | ||
Comment 16•11 years ago
|
||
I've put these slaves into production:
https://build.mozilla.org/buildapi/recent/talos-mtnlion-r5-002
https://build.mozilla.org/buildapi/recent/talos-mtnlion-r5-003
https://build.mozilla.org/buildapi/recent/talos-r3-w7-001
https://build.mozilla.org/buildapi/recent/talos-r3-w7-002
https://build.mozilla.org/buildapi/recent/talos-r3-w7-003
https://build.mozilla.org/buildapi/recent/talos-r3-w7-010
https://build.mozilla.org/buildapi/recent/talos-r3-xp-001
https://build.mozilla.org/buildapi/recent/talos-r3-xp-010
https://build.mozilla.org/buildapi/recent/talos-r4-lion-003
https://build.mozilla.org/buildapi/recent/talos-r4-lion-010
https://build.mozilla.org/buildapi/recent/talos-r4-snow-001
https://build.mozilla.org/buildapi/recent/talos-r4-snow-002
https://build.mozilla.org/buildapi/recent/talos-r4-snow-003
Ready to be put in production:
* talos-r3-xp-003
> These are waiting for various reasons:
> > talos-mtnlion-r5-001 - booked
> > talos-mtnlion-r5-010 - booked
> > talos-r3-fed-001 - booked
> > talos-r3-fed-002 - IT working on it
> > talos-r3-fed-010 - IT working on it
> > talos-r3-fed64-001 - IT working on it
> > talos-r3-fed64-002 - IT working on it
> > talos-r3-fed64-010 - booked
> > talos-r3-xp-002 - IT working on it
> > talos-r3-xp-003 - IT working on it
> > talos-r4-lion-001 - booked
> > talos-r4-lion-002 - IT working on it
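The "recent jobs" links in this comment follow one pattern, so they can be generated from the slave names. A small sketch, assuming only the base URL exactly as it appears in this comment; the function name is hypothetical.

```python
# Base URL as it appears in the links of this comment.
BUILDAPI_RECENT = "https://build.mozilla.org/buildapi/recent/"


def recent_jobs_url(slave_name):
    """Build the buildapi 'recent' link for one slave."""
    return BUILDAPI_RECENT + slave_name


for name in ["talos-r3-w7-001", "talos-r4-snow-003"]:
    print(recent_jobs_url(name))
```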
Assignee | ||
Comment 17•11 years ago
|
||
(In reply to Nick Thomas [:nthomas] from comment #9)
> talos-r3-w7-018 also got decommissioned
Are you sure this machine got decommissioned? I see it in DNS and it had been taking jobs happily:
https://secure.pub.build.mozilla.org/buildapi/recent/talos-r3-w7-018
FTR, I removed one more snow machine than I needed to. I've added it back to default:
http://hg.mozilla.org/build/buildbot-configs/rev/5b79bdab398a
Assignee | ||
Comment 18•11 years ago
|
||
* talos-r3-xp-003
https://secure.pub.build.mozilla.org/buildapi/recent/talos-r3-xp-003
I need to figure out these two:
https://build.mozilla.org/buildapi/recent/talos-mtnlion-r5-002
https://build.mozilla.org/buildapi/recent/talos-mtnlion-r5-003
and these two:
> Reimaged talos-mtnlion-r5-010 and talos-mtnlion-r5-001 (talked to kmoir).
The following still need dcops intervention now:
talos-r4-lion-002
talos-r3-xp-002
talos-r3-fed-002
talos-r3-fed-010
talos-r3-fed64-001
talos-r3-fed64-002
talos-r3-fed64-010
This is still booked:
* talos-r3-fed-001
Assignee | ||
Comment 19•11 years ago
|
||
nthomas, where did you get the info that talos-r3-w7-019 was to be decommissioned? I can't find any reference to it.
The mtnlion slaves are now taking jobs.
talos-r3-xp-002 - taking jobs
talos-r3-fed-002 - taking jobs
These are connected to production masters but I'm still waiting on them:
https://secure.pub.build.mozilla.org/buildapi/recent/talos-r3-fed-010
https://secure.pub.build.mozilla.org/buildapi/recent/talos-r3-fed64-001
https://secure.pub.build.mozilla.org/buildapi/recent/talos-r3-fed64-002
https://secure.pub.build.mozilla.org/buildapi/recent/talos-r3-fed64-010
The following still need dcops intervention now:
talos-r4-lion-002 - IT still working on it
This is still booked:
* talos-r3-fed-001
Flags: needinfo?(nthomas)
Comment 20•11 years ago
|
||
(In reply to Armen Zambrano G. [:armenzg] from comment #19) > nthomas, where did you get the info that talos-r3-w7-019 was to be > decommissioned? I can't find any reference to it. Sorry, should have said talos-r3-w7-017 (bug 747734).
Flags: needinfo?(nthomas)
Assignee | ||
Comment 21•11 years ago
|
||
(In reply to Nick Thomas [:nthomas] from comment #20)
> (In reply to Armen Zambrano G. [:armenzg] from comment #19)
> > nthomas, where did you get the info that talos-r3-w7-019 was to be
> > decommissioned? I can't find any reference to it.
>
> Sorry, should have said talos-r3-w7-017 (bug 747734).
I landed a fix for it and I will put it back after our next reconfig.
Assignee | ||
Comment 22•11 years ago
|
||
Status
######
* talos-r4-lion-002 - IT still working on it
* talos-r3-fed-001 - booked
* talos-r3-w7-018 - put back in production after reconfig
All other slaves have taken jobs on production.
Comment 23•11 years ago
|
||
This is in production.
Assignee | ||
Updated•11 years ago
|
Attachment #722860 -
Flags: checked-in+
Assignee | ||
Updated•11 years ago
|
Attachment #722861 -
Flags: checked-in+
Assignee | ||
Comment 24•11 years ago
|
||
Waiting on:
* talos-r4-lion-002
* talos-r3-fed-001
armenzg has to follow up:
* talos-r3-w7-001
* talos-r3-w7-002
* talos-r3-w7-003
* talos-r3-w7-010
Whiteboard: status on comment 23
Assignee | ||
Comment 25•11 years ago
|
||
talos-r4-lion-002 is running jobs.
Waiting on talos-r3-fed-001.
We will deal with the win7 slaves on bug 850531.
Whiteboard: status on comment 23 → status on comment 25
Assignee | ||
Updated•11 years ago
|
Priority: P2 → P3
Whiteboard: status on comment 25 → waiting on talos-r3-fed-001
Assignee | ||
Comment 26•11 years ago
|
||
I got fed-001 done.
Status: NEW → RESOLVED
Closed: 11 years ago
Resolution: --- → FIXED
Whiteboard: waiting on talos-r3-fed-001
Updated•11 years ago
|
Product: mozilla.org → Release Engineering
Updated•6 years ago
|
Product: Release Engineering → Infrastructure & Operations
Updated•4 years ago
|
Product: Infrastructure & Operations → Infrastructure & Operations Graveyard