Bug 985625
Opened 12 years ago
Closed 12 years ago
Releng Config/Automation support of staging run of scl1->scl3 move
Categories
(Infrastructure & Operations Graveyard :: CIDuty, task)
Tracking
(Not tracked)
RESOLVED
FIXED
People
(Reporter: Callek, Assigned: Callek)
Attachments
(4 files, 1 obsolete file)
2.69 KB, patch | armenzg: review+
5.05 KB, patch | armenzg: review+
1.37 KB, patch | armenzg: review+, Callek: checked-in+
2.91 KB, patch | Callek: review+, Callek: checked-in+
On 2014-03-18 12:43 , Amy Rich wrote:
> Here are the machines I would like to move from scl1 -> scl3 this week:
>
> b-linux64-hp-001.build.scl1.mozilla.com -> b-linux64-hp-0001.try.releng.scl3.mozilla.com (scl3-releng-vlan264)
> bld-linux64-ix-027.build.scl1.mozilla.com -> b-linux64-ix-0001.build.releng.scl3.mozilla.com (scl3-releng-vlan252)
> w64-ix-slave03.winbuild.scl1.mozilla.com -> b-2008-ix-0018.wintry.releng.scl3.mozilla.com (scl3-releng-vlan244)
>
> Hal, can those be removed from the pool and new configs set up in slavealloc/buildbot to accommodate them on the other side? If so, I'll open up a bug to make this happen. If these machines are not feasible, please suggest machines from the same pools.
>
...
On 2014-03-18 13:25 , Amy Rich wrote:
> I have one more to add, since I just ran a preliminary image of an r4 mini in scl3 and it seemed to succeed:
>
> talos-r4-snow-001.build.scl1.mozilla.com -> t-snow-r4-0001.test.releng.scl3.mozilla.com (scl3-releng-vlan256)
Comment 1 • 12 years ago (Assignee)
(In reply to Justin Wood (:Callek) from comment #0)
> > w64-ix-slave03.winbuild.scl1.mozilla.com -> b-2008-ix-0018.wintry.releng.scl3.mozilla.com (scl3-releng-vlan244)
We'll use the currently-in-production w64-ix-slave07 instead (since 03 is currently on loan), so that's:
w64-ix-slave07.winbuild.scl1.mozilla.com -> b-2008-ix-0066.winbuild.releng.scl3.mozilla.com (scl3-releng-vlan236)
If you really needed to test the wintry not winbuild VLAN, let me know and I'll grab you a try one as well.
Comment 2 • 12 years ago (Assignee)
(In reply to Justin Wood (:Callek) from comment #1)
Ignore that whole comment...
w64-ix-slave31.winbuild.scl1.mozilla.com --> b-2008-ix-0019.wintry.releng.scl3.mozilla.com (scl3-releng-vlan244)
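For anyone scripting against the host list, here is a minimal Python sketch that parses the "old -> new (vlan)" lines used in this bug into short hostnames. The name `parse_move` and the regex are illustrative assumptions and are not part of slavealloc or any releng tooling; the four lines in `STAGED` are the moves listed in the comments above, with the Windows host taken from comment 2.

```python
import re

# Matches lines such as
#   old.fqdn -> new.fqdn (vlan-name)
# allowing leading "> " quote markers and either "->" or "-->".
MOVE_RE = re.compile(r"^(?:>\s*)*(\S+)\s+-+>\s+(\S+)\s+\((\S+)\)\s*$")


def parse_move(line):
    """Return (old_short, new_short, vlan), or None if the line is not a move."""
    m = MOVE_RE.match(line.strip())
    if m is None:
        return None
    old_fqdn, new_fqdn, vlan = m.groups()
    # The short name is everything before the first dot of the FQDN.
    return old_fqdn.split(".")[0], new_fqdn.split(".")[0], vlan


# Copied from the comments above (hypothetical variable name).
STAGED = [
    "b-linux64-hp-001.build.scl1.mozilla.com -> b-linux64-hp-0001.try.releng.scl3.mozilla.com (scl3-releng-vlan264)",
    "bld-linux64-ix-027.build.scl1.mozilla.com -> b-linux64-ix-0001.build.releng.scl3.mozilla.com (scl3-releng-vlan252)",
    "w64-ix-slave31.winbuild.scl1.mozilla.com --> b-2008-ix-0019.wintry.releng.scl3.mozilla.com (scl3-releng-vlan244)",
    "talos-r4-snow-001.build.scl1.mozilla.com -> t-snow-r4-0001.test.releng.scl3.mozilla.com (scl3-releng-vlan256)",
]

MOVES = [parse_move(line) for line in STAGED]
```

Each entry of `MOVES` is an (old short name, new short name, VLAN) tuple, which is the shape the slavealloc renames later in this bug need.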
Comment 3 • 12 years ago (Assignee)
All hosts have now been disabled in slavealloc; still waiting for bld-linux64-ix-027 to drain its last job.
Comment 4 • 12 years ago (Assignee)
Armen, r? on the following SQL:
mysql> update slaves SET dcid=(select dcid from datacenters where name="scl3") AND name="b-2008-ix-0019" WHERE name="w64-ix-slave31";
update slaves SET dcid=(select dcid from datacenters where name="scl3") AND name="b-linux64-hp-0001" WHERE name="b-linux64-hp-001";
update slaves SET dcid=(select dcid from datacenters where name="scl3") AND name="b-linux64-ix-0001" WHERE name="bld-linux64-ix-027";
update slaves SET dcid=(select dcid from datacenters where name="scl3") AND name="t-snow-r4-0001" WHERE name="talos-r4-snow-001";
Flags: needinfo?(armenzg)
Comment 5 • 12 years ago (Assignee)
Armen gave me r+ on IRC, but when I tried to apply it I got:
ERROR 1452 (23000): Cannot add or update a child row: a foreign key constraint fails (`buildslaves`.`slaves`, CONSTRAINT `slaves_ibfk_6` FOREIGN KEY (`dcid`) REFERENCES `datacenters` (`dcid`))
Popping into #db I got help from :cyborgshadow which led me to use the following commands:
mysql> Update slaves s JOIN datacenters dc on dc.name='scl3' set s.dcid = dc.dcid, s.name='b-2008-ix-0019' where s.name = 'w64-ix-slave31';
Query OK, 1 row affected, 1 warning (0.01 sec)
Rows matched: 1 Changed: 1 Warnings: 0
mysql> Update slaves s JOIN datacenters dc on dc.name='scl3' set s.dcid = dc.dcid, s.name='b-linux64-hp-0001' where s.name = 'b-linux64-hp-001';
Query OK, 1 row affected, 1 warning (0.00 sec)
Rows matched: 1 Changed: 1 Warnings: 0
mysql> Update slaves s JOIN datacenters dc on dc.name='scl3' set s.dcid = dc.dcid, s.name='b-linux64-ix-0001' where s.name = 'bld-linux64-ix-027';
Query OK, 1 row affected, 1 warning (0.00 sec)
Rows matched: 1 Changed: 1 Warnings: 0
mysql> Update slaves s JOIN datacenters dc on dc.name='scl3' set s.dcid = dc.dcid, s.name='t-snow-r4-0001' where s.name = 'talos-r4-snow-001';
Query OK, 1 row affected, 1 warning (0.00 sec)
Rows matched: 1 Changed: 1 Warnings: 0
Flags: needinfo?(armenzg)
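A likely reading of the error in this comment: in `SET dcid=(select ...) AND name="..."` the `AND` does not separate two assignments. It makes the whole right-hand side one boolean expression, so `dcid` is set to 0 or 1 rather than to the scl3 id, and the foreign key check rejects that value. The sketch below reproduces the effect with Python's `sqlite3` module. It is an assumption-laden illustration: it uses SQLite rather than the production MySQL server, and the two-table schema and the dcid values 5 and 7 are invented for the example.

```python
import sqlite3


def make_db():
    """Build a tiny in-memory stand-in for the slavealloc tables (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces FKs when asked
    conn.executescript("""
        CREATE TABLE datacenters (dcid INTEGER PRIMARY KEY, name TEXT UNIQUE);
        CREATE TABLE slaves (
            slaveid INTEGER PRIMARY KEY,
            name TEXT UNIQUE,
            dcid INTEGER NOT NULL REFERENCES datacenters(dcid)
        );
        INSERT INTO datacenters VALUES (5, 'scl1'), (7, 'scl3');
        INSERT INTO slaves VALUES (1, 'w64-ix-slave31', 5);
    """)
    return conn


# "AND" form: a single assignment, dcid = (<scl3 id> AND name = '...').
AND_FORM = """
    UPDATE slaves
    SET dcid = (SELECT dcid FROM datacenters WHERE name = 'scl3')
               AND name = 'b-2008-ix-0019'
    WHERE name = 'w64-ix-slave31'
"""

# Comma form: two separate assignments, dcid and name.
COMMA_FORM = """
    UPDATE slaves
    SET dcid = (SELECT dcid FROM datacenters WHERE name = 'scl3'),
        name = 'b-2008-ix-0019'
    WHERE name = 'w64-ix-slave31'
"""


def and_form_fails(conn):
    """Return True if the AND form is rejected by the foreign key constraint."""
    try:
        conn.execute(AND_FORM)
    except sqlite3.IntegrityError:
        return True
    return False


conn = make_db()
failed = and_form_fails(conn)
conn.execute(COMMA_FORM)
row = conn.execute("SELECT name, dcid FROM slaves").fetchone()
```

The `UPDATE ... JOIN` statements in this comment achieve the same thing as the comma form: both columns are assigned independently, and `dcid` receives a value that actually exists in `datacenters`.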
Comment 6 • 12 years ago (Assignee)
This makes the config changes for the pools from which we're moving a single host this week.
At a glance, it looks like we're missing some entries from this list.
I based these values on https://docs.google.com/a/mozilla.com/spreadsheets/d/1vq3dcMGvwbW1sug0lipYj2-P_SRovf8q-_pEOxmvhFg/edit#gid=947240444
using only the following sheets there:
"talos-r4-snow", "linux build", "windows try", "linux try - thunderbird and fuzzing".
All other sheets were not referenced for this patch.
If any of those 4 sheets change, I will need to know. I also want to avoid changing the mappings at this stage.
Attachment #8393781 - Flags: review?(hwine)
Comment 7 • 12 years ago (Assignee)
needinfo to :arr on c#6. She says she'll look into the missing hosts tomorrow. I'm setting the flag because I need to know if those sheets change, and especially if we're changing the mappings of any machines currently specified in them.
Flags: needinfo?(arich)
Comment 8 • 12 years ago (Assignee)
See c#6 for patch details.
I initially attached a patch for the wrong repo by accident.
Attachment #8393781 - Attachment is obsolete: true
Attachment #8393781 - Flags: review?(hwine)
Attachment #8393785 - Flags: review?(hwine)
Comment 9 • 12 years ago (Assignee)
Attachment #8393816 - Flags: review?(armenzg)
Updated • 12 years ago
Attachment #8393816 - Flags: review?(armenzg) → review+
Updated • 12 years ago
Attachment #8393785 - Flags: review?(hwine) → review+
Comment 10 • 12 years ago (Assignee)
To document a snippet of a conversation I just had:
[12:04:00] Callek jmaher: this is re: Bug 985482 --- (and Bug 985625) --- the physical machine is not changing, so choose 1 --- (a) update graphserver DB to say the new slave name instead of the old slave name -- or (b) insert a new row into graph server to accommodate.
....
[12:39:00] jmaher Callek: I would vote for adding new machine names to the database
So that's the plan of record.
Comment 11 • 12 years ago
Changes to https://docs.google.com/a/mozilla.com/spreadsheets/d/1vq3dcMGvwbW1sug0lipYj2-P_SRovf8q-_pEOxmvhFg after data validation of the etherpad:
The second w64-ix-slave144 has been removed from the "windows build - esr and b2g18" tab, replaced with w64-ix-slave156, and put in the windows build tab.
The following have been added to the wintry pool (windows try tab) and renamed:
w64-ix-slave32.winbuild.scl1.mozilla.com
w64-ix-slave33.winbuild.scl1.mozilla.com
w64-ix-slave34.winbuild.scl1.mozilla.com
w64-ix-slave35.winbuild.scl1.mozilla.com
w64-ix-slave36.winbuild.scl1.mozilla.com
w64-ix-slave37.winbuild.scl1.mozilla.com
w64-ix-slave38.winbuild.scl1.mozilla.com
w64-ix-slave39.winbuild.scl1.mozilla.com
w64-ix-slave40.winbuild.scl1.mozilla.com
w64-ix-slave04.winbuild.scl1.mozilla.com
w64-ix-slave05.winbuild.scl1.mozilla.com
w64-ix-slave157.winbuild.scl1.mozilla.com
The following have been added to "linux try - thunderbird and fuzzing":
b-linux64-ix-049.build.scl1.mozilla.com
b-linux64-ix-050.build.scl1.mozilla.com
Updated • 12 years ago
Flags: needinfo?(arich)
Comment 12 • 12 years ago
You should be able to test these two machines now:
b-linux64-hp-0001.try.releng.scl3.mozilla.com
b-linux64-ix-0001.build.releng.scl3.mozilla.com
We're still working out some issues with the deployment for b-2008-ix-0019.wintry.releng.scl3.mozilla.com, and we're waiting for the attachment of an EDID for t-snow-r4-0001.test.releng.scl3.mozilla.com.
Comment 13 • 12 years ago
t-snow-r4-0001.test.releng.scl3.mozilla.com is now available for testing as well.
Comment 14 • 12 years ago
b-2008-ix-0019.wintry.releng.scl3.mozilla.com is now being installed.
Comment 15 • 12 years ago
something checked-in went into production :)
Comment 16 • 12 years ago (Assignee)
Did the graph server changes (for the machines in this bug only):
mysql> select * from machines WHERE name IN ("b-linux64-hp-001", "bld-linux64-ix-027", "w64-ix-slave31", "talos-r4-snow-001");
+------+-------+---------------+-----------+-------------------+-----------+------------+
| id | os_id | is_throttling | cpu_speed | name | is_active | date_added |
+------+-------+---------------+-----------+-------------------+-----------+------------+
| 1455 | 21 | 0 | 2.4 | talos-r4-snow-001 | 1 | 1317830270 |
| 6585 | 27 | 0 | 1.0 | w64-ix-slave31 | 1 | 1358380982 |
+------+-------+---------------+-----------+-------------------+-----------+------------+
2 rows in set (0.00 sec)
mysql> select * from machines WHERE name LIKE "b-linux%";
Empty set (0.01 sec)
mysql> select * from machines WHERE name LIKE "bld-%";
Empty set (0.00 sec)
mysql> start transaction;
Query OK, 0 rows affected (0.00 sec)
mysql> insert into machines values (null, 21, 0, "2.4", "t-snow-r4-0001", 1, unix_timestamp());
Query OK, 1 row affected (0.00 sec)
mysql> insert into machines values (null, 27, 0, "1.0", "b-2008-ix-0019", 1, unix_timestamp());
Query OK, 1 row affected (0.00 sec)
mysql> commit;
Query OK, 0 rows affected (0.00 sec)
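The two inserts above repeat the `os_id` and `cpu_speed` values of the old rows by hand. As a hedged alternative, the same "add a new machine name" step could copy those columns from the old row with `INSERT ... SELECT`. The sketch below uses Python's `sqlite3` in place of the graph server's MySQL database; the function name `add_renamed_machine`, the column types, and the in-memory table are assumptions made for illustration, with column names taken from the `select *` output in this comment.

```python
import sqlite3
import time


def add_renamed_machine(conn, old_name, new_name):
    """Insert a row for new_name, copying os_id, is_throttling, cpu_speed and
    is_active from the existing row for old_name. Returns the cursor rowcount."""
    cur = conn.execute(
        """
        INSERT INTO machines
            (os_id, is_throttling, cpu_speed, name, is_active, date_added)
        SELECT os_id, is_throttling, cpu_speed, ?, is_active, ?
        FROM machines
        WHERE name = ?
        """,
        (new_name, int(time.time()), old_name),
    )
    return cur.rowcount


# Hypothetical stand-in for the graph server "machines" table.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE machines (
        id INTEGER PRIMARY KEY AUTOINCREMENT,
        os_id INTEGER,
        is_throttling INTEGER,
        cpu_speed TEXT,
        name TEXT,
        is_active INTEGER,
        date_added INTEGER
    )
    """
)
conn.executemany(
    "INSERT INTO machines VALUES (?, ?, ?, ?, ?, ?, ?)",
    [
        (1455, 21, 0, "2.4", "talos-r4-snow-001", 1, 1317830270),
        (6585, 27, 0, "1.0", "w64-ix-slave31", 1, 1358380982),
    ],
)

# One transaction for both renames, mirroring the start transaction / commit above.
with conn:
    inserted_snow = add_renamed_machine(conn, "talos-r4-snow-001", "t-snow-r4-0001")
    inserted_win = add_renamed_machine(conn, "w64-ix-slave31", "b-2008-ix-0019")

new_snow = conn.execute(
    "SELECT os_id, cpu_speed, is_active FROM machines WHERE name = ?",
    ("t-snow-r4-0001",),
).fetchone()
total = conn.execute("SELECT COUNT(*) FROM machines").fetchone()[0]
```

Copying from the old row avoids retyping values, and it inserts nothing when the old name is absent, which matches the situation here where the two Linux hosts had no graph server rows to copy.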
Comment 17 • 12 years ago (Assignee)
This is done; all staged machines are taking jobs in production and working fine.
needinfo @ myself for c#11 touchups though.
Flags: needinfo?(bugspam.Callek)
Comment 18 • 12 years ago (Assignee)
Per c#11.
Attachment #8398282 - Flags: review?(armenzg)
Flags: needinfo?(bugspam.Callek)
Updated • 12 years ago
Attachment #8398282 - Flags: review?(armenzg) → review+
Comment 19 • 12 years ago (Assignee)
Comment on attachment 8398282 [details] [diff] [review]
[buildbot-configs] part 2
https://hg.mozilla.org/build/buildbot-configs/rev/080b223a13d0
Attachment #8398282 - Flags: checked-in+
Comment 20 • 12 years ago
in production.
Comment 21 • 12 years ago (Assignee)
The buildbot-configs part 2 patch broke the slave health crons.
This patch should fix it.
Attachment #8399594 - Flags: review?(armenzg)
Comment 22 • 12 years ago (Assignee)
Comment on attachment 8399594 [details] [diff] [review]
[slave_health] Part 2 - fix for buildbot-configs part 2
Review of attachment 8399594 [details] [diff] [review]:
-----------------------------------------------------------------
r+ = jhopkins over IRC
Attachment #8399594 - Flags: review?(armenzg)
Attachment #8399594 - Flags: review+
Attachment #8399594 - Flags: checked-in+
Updated • 12 years ago (Assignee)
Status: NEW → RESOLVED
Closed: 12 years ago
Resolution: --- → FIXED
Comment 23 • 11 years ago
Note for future readers: we didn't specifically test a b2g build on a linux box, so we missed a builder -> cruncher flow. It was discovered and fixed during move train A; see bug 1014221.
Updated • 7 years ago
Component: Platform Support → Buildduty
Product: Release Engineering → Infrastructure & Operations
Updated • 6 years ago
Product: Infrastructure & Operations → Infrastructure & Operations Graveyard