Closed Bug 527994 Opened 15 years ago Closed 15 years ago

Setup 10 new Try Talos slaves with buildbot

Categories

(Release Engineering :: General, defect, P2)

x86
All
defect

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: joduinn, Assigned: anodelman)

Attachments

(2 files)

Once the cloned machines are online, this bug tracks the rest of the manual setup needed.

Note: these slaves will be in MV, so they will need firewall changes to connect to the Try Talos server in the colo.
Summary: Setup 11 new Try Talos slaves with buildbot → Setup 10 new Try Talos slaves with buildbot
Assignee: nobody → joduinn
Attachment #412310 - Flags: review?(anodelman) → review+
Attachment #412344 - Flags: review?(anodelman) → review+
Pushed to staging graph server:

mysql> insert into machines values (NULL,4,0,"1.83","qm-pleopard-try12",1,unix_timestamp());
Query OK, 1 row affected (0.02 sec)

mysql> insert into machines values (NULL,4,0,"1.83","qm-pleopard-try13",1,unix_timestamp());
Query OK, 1 row affected (0.00 sec)

mysql> insert into machines values (NULL,4,0,"1.83","qm-pleopard-try14",1,unix_timestamp());
Query OK, 1 row affected (0.00 sec)

mysql> insert into machines values (NULL,4,0,"1.83","qm-pleopard-try15",1,unix_timestamp());
Query OK, 1 row affected (0.01 sec)

mysql> insert into machines values (NULL,4,0,"1.83","qm-pleopard-try16",1,unix_timestamp());
Query OK, 1 row affected (0.01 sec)

mysql> insert into machines values (NULL,5,0,"1.83","qm-pubuntu-try12",1,unix_timestamp());
Query OK, 1 row affected (0.07 sec)

mysql> insert into machines values (NULL,5,0,"1.83","qm-pubuntu-try13",1,unix_timestamp());
Query OK, 1 row affected (0.01 sec)

mysql> insert into machines values (NULL,5,0,"1.83","qm-pubuntu-try14",1,unix_timestamp());
Query OK, 1 row affected (0.00 sec)

mysql> insert into machines values (NULL,5,0,"1.83","qm-pubuntu-try15",1,unix_timestamp());
Query OK, 1 row affected (0.00 sec)

mysql> insert into machines values (NULL,5,0,"1.83","qm-pubuntu-try16",1,unix_timestamp());
Query OK, 1 row affected (0.00 sec)
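
The ten inserts above differ only in the second value and the machine name, so they could be generated rather than typed by hand. A minimal sketch in Python, assuming the value order stays exactly as in the session above. The meaning of each column is not stated in this bug, so the code only reproduces the literals; `machine_inserts` and `platform_value` are hypothetical names used for illustration.

```python
def machine_inserts(prefix, platform_value, numbers):
    """Build one INSERT per machine, mirroring the statements pushed to staging.

    The column meanings are not given in the bug; the literal values
    (NULL, <platform_value>, 0, "1.83", <name>, 1, unix_timestamp())
    are copied as-is from the mysql session.
    """
    template = (
        'insert into machines values '
        '(NULL,{value},0,"1.83","{name}",1,unix_timestamp());'
    )
    return [
        template.format(value=platform_value, name="{}{}".format(prefix, n))
        for n in numbers
    ]


# try12 through try16 for both platforms, as in the session above.
statements = (
    machine_inserts("qm-pleopard-try", 4, range(12, 17))
    + machine_inserts("qm-pubuntu-try", 5, range(12, 17))
)

if __name__ == "__main__":
    for statement in statements:
        print(statement)
```

The output could be reviewed and then pasted into the mysql prompt, which keeps the machine names from drifting between the graph server and the slaves.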
Comment on attachment 412310
[checked in]include new slaves

changeset:   1754:75db33cb6953
Attachment #412310 - Attachment description: include new slaves → [checked in]include new slaves
Attachment #412310 - Flags: checked-in? → checked-in+
Manual post-clone steps for OSX are now done. These slaves should be able to connect to the production talos-master once the DNS entries are untangled in the dependent bug.

Still need to do the post-clone steps for the Linux slaves.
Assignee: joduinn → anodelman
Priority: -- → P2
qm-pleopard-try12/13/14/15/16 up and connected.  Had to update the buildbot.tac files to point at the IP instead of the machine name.
qm-pubuntu-try12/14/15/16 up and connected.  qm-pubuntu-try13 unreachable, probably just needs a kick.

Installed xrestop on all slaves, updated buildbot.tac to use the IP instead of the master machine name.
qm-pubuntu-try13 is dead; phong is replacing it with another mini.
qm-pubuntu-try13 replaced, up and configured.
All done here.
Status: NEW → RESOLVED
Closed: 15 years ago
Resolution: --- → FIXED
Comment on attachment 412344
[checked in]add new machines to graphserver

changeset:   256:e41e4d3f5d8d
Attachment #412344 - Attachment description: add new machines to graphserver → [checked in]add new machines to graphserver
Attachment #412344 - Flags: checked-in? → checked-in+
(In reply to comment #6)
> qm-pleopard-try12/13/14/15/16 up and connected.  Had to update the buildbot.tac
> files to point at the IP instead of the machine name.

(In reply to comment #7)
> Installed xrestop on all slaves, updated buildbot.tac to use the IP instead of
> the master machine name.

Using the IP address instead of the hostname concerns me, and I'd like to fix that. If the DNS entry is still not working after a machine reboot, we should reopen blocking bug 529196.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
It appears that the slaves can reach the master using the proper name (I pinged from one of the slaves) - it might have been a temporary problem that I hit.  I don't see the point in changing the machines' configuration as they are all up and connected - but good to know for future setups.
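
For future setups, the name lookup could be checked from a slave before falling back to an IP. A minimal sketch in Python, assuming the master is known by the name `talos-master` as used elsewhere in this bug. `master_resolves` is a hypothetical helper, and it only checks name resolution, not reachability, so it is not a substitute for the ping above.

```python
import socket


def master_resolves(hostname, resolver=socket.gethostbyname):
    """Return the address that hostname resolves to, or None if lookup fails.

    The resolver argument exists only so the check can be exercised
    without touching real DNS.
    """
    try:
        return resolver(hostname)
    except OSError:
        # socket.gaierror (raised on a failed lookup) is an OSError subclass.
        return None


if __name__ == "__main__":
    address = master_resolves("talos-master")
    if address is None:
        print("talos-master does not resolve from this slave")
    else:
        print("talos-master resolves to", address)
```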
Status: REOPENED → RESOLVED
Closed: 15 years ago
Resolution: --- → FIXED
(In reply to comment #13)
> It appears that the slaves can reach the master using the proper name (I
> pinged from one of the slaves) - it might have been a temporary problem that
> I hit.  I don't see the point in changing the machines' configuration as they
> are all up and connected - but good to know for future setups.

Please do change the buildbot.tac files on these new slaves from using IPs to using the hostname, as we do on other Talos slaves. We've been bitten by small differences in config files in the past, and I'd rather not set ourselves up to hit problems like this again.

Using "graceful shutdown" should make this a quick (and boring) fix to rollout.
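
To catch this kind of config drift, the contents of each buildbot.tac could be scanned for an IP literal in the master setting. A minimal sketch in Python, assuming the slave's buildbot.tac assigns the master with a line of the form `buildmaster_host = '...'` (that variable name is an assumption about these files, not something shown in this bug); `master_host` and `uses_ip_literal` are hypothetical helpers.

```python
import ipaddress
import re

# Assumption: the master is set by a line like
#   buildmaster_host = 'talos-master'
HOST_LINE = re.compile(
    r"""^\s*buildmaster_host\s*=\s*['"]([^'"]+)['"]""", re.MULTILINE
)


def master_host(tac_text):
    """Return the configured master host string, or None if no such line."""
    match = HOST_LINE.search(tac_text)
    return match.group(1) if match else None


def uses_ip_literal(tac_text):
    """True if the configured master host parses as an IPv4/IPv6 address."""
    host = master_host(tac_text)
    if host is None:
        return False
    try:
        ipaddress.ip_address(host)
    except ValueError:
        # Not an address, so it is treated as a hostname.
        return False
    return True
```

Running `uses_ip_literal` over the text of each slave's buildbot.tac would list the machines that still need the hostname change.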
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
(In reply to comment #14)
> Using "graceful shutdown" should make this a quick (and boring) fix to rollout.

That doesn't actually work for machines which reboot as the last step of the build - the slave reboots before the shutdown gets a chance to run, and the subsequent reconnect to the master resets the shutdown request.
(In reply to comment #15)
> (In reply to comment #14)
> > Using "graceful shutdown" should make this a quick (and boring) fix to rollout.
> 
> That doesn't actually work for machines which reboot as the last step of the
> build - the slave reboots before the shutdown gets a chance to run, and the
> subsequent reconnect to the master resets the shutdown request.

Ah, very true, Nick. A simple ssh to edit buildbot.tac would take effect on next start. Or we could wait until they are idle. Whatever is easiest.

Anyway, the original point remains: let's keep the config files on these new slaves consistent with the existing slaves by using hostnames as usual.
buildbot.tac files fixed to point to 'talos-master' instead of the IP; the change will be picked up on next reboot.

All done here.
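
The edit described above can be sketched as a small text rewrite. This is a minimal sketch in Python, not the procedure actually used here. It assumes the master is set in buildbot.tac by a `buildmaster_host = '...'` line (an assumption about these files); `point_at_master` is a hypothetical helper.

```python
import re

# Assumption: the master is set by a line like
#   buildmaster_host = '192.0.2.10'
# Group 1 keeps the left-hand side, group 2 the quote character used.
HOST_ASSIGNMENT = re.compile(
    r"""^(\s*buildmaster_host\s*=\s*)(['"])[^'"]*\2""", re.MULTILINE
)


def point_at_master(tac_text, master="talos-master"):
    """Return tac_text with the master host value replaced by `master`.

    Only the quoted value changes; every other line is left untouched.
    """
    def replace(match):
        quote = match.group(2)
        return match.group(1) + quote + master + quote

    return HOST_ASSIGNMENT.sub(replace, tac_text)
```

Because the rewrite is idempotent, it is safe to run on slaves that already use the hostname. As noted above, the new value only takes effect the next time the slave starts.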
Status: REOPENED → RESOLVED
Closed: 15 years ago
Resolution: --- → FIXED
Product: mozilla.org → Release Engineering