Closed
Bug 999435
Opened 11 years ago
Closed 11 years ago
Setup new Ubuntu 14.04 nodes for Mozmill CI in qa.scl3.mozilla.com
Categories
(Infrastructure & Operations :: Virtualization, task)
Tracking
(Not tracked)
RESOLVED
FIXED
People
(Reporter: whimboo, Assigned: cknowles)
References
Details
(Whiteboard: [vm-create:18][vm-delete:8][qa-automation-blocked])
Ubuntu 14.04 has been released, and we want to upgrade our production machines to it. Given that it is an LTS release, we can replace all of the existing Ubuntu nodes, which means 12.04 (32/64) and 13.10 (32/64).
So can we please get new Ubuntu 14.04 VM templates generated? Thanks.
We want to use that version to get started with Puppet on bug 973535.
Reporter | ||
Comment 1•11 years ago
|
||
Machines we would need to be created are:
mm-ub-1404-32-1.qa.scl3.mozilla.com
mm-ub-1404-32-2.qa.scl3.mozilla.com
mm-ub-1404-32-3.qa.scl3.mozilla.com
mm-ub-1404-32-4.qa.scl3.mozilla.com
mm-ub-1404-64-1.qa.scl3.mozilla.com
mm-ub-1404-64-2.qa.scl3.mozilla.com
mm-ub-1404-64-3.qa.scl3.mozilla.com
mm-ub-1404-64-4.qa.scl3.mozilla.com
Pre-installation instructions can be found here:
https://mana.mozilla.org/wiki/display/websites/QA+Automation+ESX+Service#QAAutomationESXService-Linux%28Ubuntu%29
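The production hostnames requested in this comment follow a regular pattern, so the list can be generated rather than typed by hand, which avoids copy-and-paste slips. A minimal sketch in Python, assuming the scheme mm-ub-<release>-<arch>-<index> with four nodes per architecture; the function name production_hosts is made up for illustration:

```python
def production_hosts(release="1404", arches=("32", "64"), count=4,
                     domain="qa.scl3.mozilla.com"):
    """Return the Mozmill CI hostnames for one Ubuntu release."""
    return [
        f"mm-ub-{release}-{arch}-{index}.{domain}"
        for arch in arches
        for index in range(1, count + 1)
    ]


if __name__ == "__main__":
    # Print one hostname per line, 32-bit nodes first.
    for host in production_hosts():
        print(host)
```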
Comment 2•11 years ago
|
||
These will need to be kickstarted into PuppetAgain. This may become a much more common request, from people without easy vCenter console access. Should we have a quick meeting to talk about how to do this, and work out any kinks?
The 14.04 kickstart process isn't set up yet, so we can't KS these VMs yet anyway.
Reporter | ||
Comment 3•11 years ago
|
||
(In reply to Dustin J. Mitchell [:dustin] from comment #2)
> These will need to be kickstarted into PuppetAgain. This may become a much
> more common request, from people without easy vCenter console access.
> Should we have a quick meeting to talk about how to do this, and work out
> any kinks?
That would be fine with me, but we would first need feedback from Adrian or whoever else will set up those VMs. It's better if we arrange a good time via IRC or email then.
> The 14.04 kickstart process isn't set up yet, so we can't KS these VMs yet
> anyway.
How long do you think this will take?
Comment 4•11 years ago
|
||
Sorry, that was for a meeting with the virtualization folks.
We're planning to use Ubuntu-14.04 for OpenStack as well, so I'm working on the KS process now.
Assignee | ||
Comment 5•11 years ago
|
||
Dustin - so sorry for the delay - I'd be happy to meet up with you. How's the puppetagain work for the 14.04 coming?
My schedule is likely more wide open than yours, feel free to ping me on irc whenever.
CJK
Assignee | ||
Comment 6•11 years ago
|
||
OK, upshot from the brief meeting - kickstarting puppetagain is understood reasonably well... from a "click here, do that" level. So that's not in the way.
However work on 14.04 is ongoing, and "not quite ready" for kickstarting yet. Let me know when we can move forward on it.
Thanks for the time today.
Assignee | ||
Comment 7•11 years ago
|
||
Alright, I see that the blocking bug for the 14.04 is now closed - starting on this.
:whimboo - can you add these to the puppetagain node definitions so that the puppetagain kickstart can fully complete?
CJK
Assignee: server-ops-virtualization → cknowles
Reporter | ||
Comment 8•11 years ago
|
||
Chris, on bug 1020659 I'm currently working on QA specific node definitions. With the patch attached there we will be able to recognize Ubuntu 14.04 for staging machines. The hosts which I pointed out in the initial comment are for production. So what we indeed also need are the machines for staging.
Those would be:
mm-ub-1404-32.qa.scl3.mozilla.com
mm-ub-1404-64.qa.scl3.mozilla.com
Once the patch on the other bug has landed, both machines will be able to pull their configuration from our QA puppetmaster. So I would suggest we start with those 2 machines and test how it works. Does that sound good?
Status: NEW → ASSIGNED
Assignee | ||
Comment 9•11 years ago
|
||
sounds fine, other than that I just kickstarted 6 of the 8 machines. :/ (my timing problem, not yours)
I'll power them back down, and setup to be ready to kickstart the two you suggest.
Let me know when I'm clear to begin.
Reporter | ||
Comment 10•11 years ago
|
||
I have noticed that! So I went ahead and also added all the production nodes to the config which has been pushed to production now. If you want, you can start them all! Nothing should fail at the moment.
The only problem which persists is bug 1006891, which also installed Apache. We have to get this fixed. So adding as dependency.
Depends on: 1006891
Assignee | ||
Comment 11•11 years ago
|
||
Alright, I've spun up mm-ub-1404-[32,64].qa.scl3 and kickstarted them. However, these are stuck at the post boot splash screen - which I had been informed implied a problem with puppetization - let me know if I'm clear to proceed on the rest, or how I should modify the procedure to work better...
CJK
Comment 12•11 years ago
|
||
Logging into these, it looks like they need to have the releng hg repo merged, as the current QA repo still specifies puppet-3.4.2, which doesn't exist for trusty. They'll continue retrying until that's done (and probably sending you email?)
Reporter | ||
Comment 13•11 years ago
|
||
I did the merge from build/puppet now. So let's see how it works.
Sadly I still get tons of emails for:
Error 400 on SERVER: Could not find default node or by name with 'mm-ub-1404-64.qa.scl3.mozilla.com' on node mm-ub-1404-64.qa.scl3.mozilla.com
Not sure why it doesn't fetch the node. Maybe something is wrong in the node regex.
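One way to check the regex theory is to run a candidate pattern against every hostname that should match. The sketch below is in Python rather than Puppet, and both patterns are hypothetical (the actual node definition in the QA repo is not shown in this bug). It only illustrates how a pattern written for the numbered production hosts can miss the unnumbered staging hosts:

```python
import re

# Hypothetical patterns, NOT copied from the QA puppet repo.
PRODUCTION_ONLY = re.compile(r"^mm-ub-1404-(32|64)-\d+\.qa\.scl3\.mozilla\.com$")
WITH_STAGING = re.compile(r"^mm-ub-1404-(32|64)(-\d+)?\.qa\.scl3\.mozilla\.com$")


def unmatched(pattern, hostnames):
    """Return the hostnames that the pattern fails to match."""
    return [name for name in hostnames if not pattern.match(name)]


hosts = [
    "mm-ub-1404-64.qa.scl3.mozilla.com",    # staging
    "mm-ub-1404-64-1.qa.scl3.mozilla.com",  # production
]
print(unmatched(PRODUCTION_ONLY, hosts))
print(unmatched(WITH_STAGING, hosts))
```

Any hostname reported by unmatched() would fall through to the default node, and with no default node defined the server answers with the "Could not find default node or by name" error.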
Reporter | ||
Comment 14•11 years ago
|
||
I totally messed up this merge. Chris, I'm sorry, but can you please shut down all machines? Otherwise I will have a full inbox on Tuesday when I come back.
Comment 15•11 years ago
|
||
No need to shut down the machines, I think - I killed the puppetize.sh processes on them so they shouldn't spam over the weekend.
Assignee | ||
Comment 16•11 years ago
|
||
Given the "I think" I decided to shut them off.
Let me know when I'm clear to start powering them up again.
CJK
Assignee | ||
Comment 17•11 years ago
|
||
Per our IRC conversation, I just started the kickstart of mm-ub-1404-[32,64].qa.scl3. I'll let you know what I see.
Assignee | ||
Comment 18•11 years ago
|
||
Alright, the kickstarts have gone off, and are again sitting at the splashscreen - none of my credentials are working for SSHing in as root, so I can't see what's going on in the logs - let me know what you'd like me to try next.
CJK
Reporter | ||
Comment 19•11 years ago
|
||
I have a problem in connecting to the machines given that on my Linux machine I cannot resolve the DNS names. I have to wait until I'm back at home for further inspection. But what I see so far is promising. All went fine this time, except a single error for apt-get right after puppetizing the machine. I will file a separate bug for that, at least for investigation.
Reporter | ||
Comment 20•11 years ago
|
||
Ok, so I can log in via SSH with the root account, but not with my own username. So something went wrong with the initial puppet run. I will have to check what we actually do when adding those users. Maybe I can find what's wrong. Not sure if bug 1024938 has any effect here.
Reporter | ||
Comment 21•11 years ago
|
||
So the admin_users we define in qa-config.pp are those only for the puppetmaster?
https://hg.mozilla.org/qa/puppet/file/2dea7e8f1dcc/manifests/qa-config.pp#l39
If yes, what needs to be done to also have them available on the slave nodes?
Assignee | ||
Comment 22•11 years ago
|
||
These are questions I think that are best directed towards Dustin, as I have to poke him with questions when things go awry.
Dustin, do you have any input on this?
Reporter | ||
Comment 23•11 years ago
|
||
Chris, sorry, that was my fault. Those questions were meant for Dustin. I revisited the current bugs and figured out that we should move this discussion over to bug 973535. Everything that could be done on this bug has been done.
I have added those 8 new nodes to our ESX documentation on Mana:
https://mana.mozilla.org/wiki/display/websites/QA+Automation+ESX+Service
Chris also showed me on Thursday how to kickstart a machine with Ubuntu 14.04. I did that for a 32bit and a 64bit one, and all works fine. So I think all the work that could be done here has been finished, and we can close the bug.
Status: ASSIGNED → RESOLVED
Closed: 11 years ago
Resolution: --- → FIXED
Whiteboard: [qa-automation-blocked]
Comment 24•11 years ago
|
||
Just to follow up, admin users are present on all toplevel::server nodes, which does *not* include slaves.
Assignee | ||
Updated•11 years ago
|
Whiteboard: [vm-create:8]
Reporter | ||
Comment 25•11 years ago
|
||
Chris, I will have to re-open this bug given that the PuppetAgain process is still ongoing and we weren't able to finish it off yet. As I read last week, Ubuntu has even released 14.10, and we still don't have 14.04 live on our machines! We talked about that in our team and decided that we cannot wait until PuppetAgain is ready for us on Ubuntu. So let us do the remaining steps here:
1. I need new templates for both 14.04 releases (32/64). Please ensure that these are installed fresh and not upgraded from a former Ubuntu release template.
2. I will do the necessary customizations for the templates.
3. Once customization is done, we have to re-create the already existing 14.04 CI machines based on the templates.
Severity: normal → major
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Whiteboard: [vm-create:8] → [vm-create:8][qa-automation-blocked]
Assignee | ||
Comment 26•11 years ago
|
||
So, let me rephrase, and ask a few questions to make sure I understand the request.
You'd like me to delete mm-ub-1404-32-{1..4} and mm-ub-1404-64-{1..4} - and recreate them from infrastructure templates. - which are installs of 1404. do you want these puppeted?
If this is far from correct, perhaps a new bug with a full, clear request, and without all the ancient history that is in here would better serve.
Reporter | ||
Comment 27•11 years ago
|
||
(In reply to Chris Knowles [:cknowles] from comment #26)
> You'd like me to delete mm-ub-1404-32-{1..4} and mm-ub-1404-64-{1..4} - and
> recreate them from infrastructure templates. - which are installs of 1404.
> do you want these puppeted?
What do the infrastructure templates contain? Are those bare installations, or have they already received customizations? If they are plain, we can duplicate them to templates we could use. For all the other releases and also for Windows we have our own templates. You might be able to find them in vSphere. I was not able to with my privileges. Those will not get any PuppetAgain related customization! That will happen later when I'm done with that for our purposes. It may still take a bit.
Not sure if you have to delete the existing mm-ub-1404-* hosts. Do whatever you think is best so that we can later create the hosts from the customized template.
> If this is far from correct, perhaps a new bug with a full, clear request,
> and without all the ancient history that is in here would better serve.
Comment 0 and comment 1 still apply, simply without puppet. Our estimate to get it directly running with puppetagain was kinda too optimistic. Sorry.
Assignee | ||
Comment 28•11 years ago
|
||
Well, if comment 1 applies, you *do* want new boxes to replace the mm-ub-1404-32-{1..4} and mm-ub-1404-64-{1..4} ones that were created earlier. Please confirm that removal.
The infra templates do have some added packages for our puppetizing pleasure - also keys and other access related elements for the datacenters and to allow IT access to the created VMs.
However, puppet is not applied out of the box.
So, I'm still a little confused - is this the request to create mm-ub-1404-32-{1..4} and mm-ub-1404-64-{1..4} - or is this a request to create a 32 and 64 bit template - to create those ?
Reporter | ||
Comment 29•11 years ago
|
||
(In reply to Chris Knowles [:cknowles] from comment #28)
> Well, if comment1 applies, you *do* want new boxes to replace the
> mm-ub-1404-32-{1..4} and mm-ub-1404-64-{1..4} ones that were created
> earlier. Please confirm that remove.
Alright! Then let's do that when the template is ready to be rolled out to the hosts that are to be replaced. None of those nodes are in use. We will not replace mm-ub-1404-32 and mm-ub-1404-64, which are hosts in our staging instance and which I use for testing puppet.
> The infra templates do have some added packages for our puppetizing pleasure
> - also keys and other access related elements for the datacenters and to
> allow IT access to the created VMs.
>
> However, puppet is not applied out of the box.
I think that should be ok for now.
> So, I'm still a little confused - is this the request to create
> mm-ub-1404-32-{1..4} and mm-ub-1404-64-{1..4} - or is this a request to
> create a 32 and 64 bit template - to create those ?
First we need the templates to be created and customized before we can create the nodes.
Assignee | ||
Comment 30•11 years ago
|
||
Alright, per our conversation this morning, I'll spin up two VMs - mm-ub-1404-32-template.qa.scl3.mozilla.com and mm-ub-1404-64-template.qa.scl3.mozilla.com - and let you do all your customizations to them.
Once that's done, we'll convert those into templates in the QA space, and then we can spin out the actual worker VMs.
Comment 31•11 years ago
|
||
Releng is running 14.04 hosts in production .. what's still to do?
Assignee | ||
Comment 32•11 years ago
|
||
:whimboo, the machines : ubuntu-14.04-64-template.qa.scl3.mozilla.com and ubuntu-14.04-32-template.qa.scl3.mozilla.com
Only changes I've made to them from the default desktop setup is:
1) set apt-proxy to the dc proxies - per your docs
2) installed open-vm-tools-desktop for proper vm support
3) added standard IT keys to the mozauto ssh account.
4) installed and enabled ssh server on there - so you can SSH to them as mozauto, with the password we discussed on IRC earlier.
Let me know what else I can do.
Whiteboard: [vm-create:8][qa-automation-blocked] → [vm-create:10][qa-automation-blocked]
Assignee | ||
Comment 33•11 years ago
|
||
My early morning brain on that day made a mistake - the '.' in 14.04 was causing inventory to misinterpret things as a subdomain.
So, with permission, I took these down and renamed them, simply removing the '.' - feeling that a '-' would be weird.
ubuntu-1404-32-template.qa.scl3.mozilla.com
ubuntu-1404-64-template.qa.scl3.mozilla.com
Any problems or concerns, let me know.
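The failure mode is easy to reproduce: anything that splits a fully qualified name on '.' treats the text before the first dot as the host and the rest as the domain. A short Python sketch of that split (the helper name split_fqdn is invented here, and the real inventory code may work differently):

```python
def split_fqdn(fqdn):
    """Split a fully qualified name into (host label, parent domain)."""
    host, _, domain = fqdn.partition(".")
    return host, domain


# Original name with the dot in "14.04", then the renamed host.
print(split_fqdn("ubuntu-14.04-64-template.qa.scl3.mozilla.com"))
print(split_fqdn("ubuntu-1404-64-template.qa.scl3.mozilla.com"))
```

With the dotted name the host label comes out as ubuntu-14 and 04-64-template becomes part of the domain. After the rename the host label is the full ubuntu-1404-64-template.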
Reporter | ||
Comment 34•11 years ago
|
||
Chris, I'm not able to connect to those machines via SSH. Nor can I find them in vSphere in our VLAN. So can you please install an SSH server? Thanks.
Status: REOPENED → ASSIGNED
Flags: needinfo?(cknowles)
Assignee | ||
Comment 35•11 years ago
|
||
Per the following, they've already got SSH servers installed, and are on the QA vlan (VLAN73).
Last login: Mon Nov 3 16:31:05 on ttys000
cknowles-20405:~ cknowles$ ssh mozauto@ubuntu-1404-32-template.qa.scl3.mozilla.com
Welcome to Ubuntu 14.04.1 LTS (GNU/Linux 3.13.0-32-generic i686)
* Documentation: https://help.ubuntu.com/
225 packages can be updated.
99 updates are security updates.
Last login: Mon Nov 3 04:32:30 2014 from 10-22-248-146.vpn.scl3.mozilla.com
mozauto@ubuntu-14:~$ uptime
03:35:21 up 23:04, 1 user, load average: 0.00, 0.01, 0.05
mozauto@ubuntu-14:~$ exit
logout
Connection to ubuntu-1404-32-template.qa.scl3.mozilla.com closed.
cknowles-20405:~ cknowles$ ssh mozauto@ubuntu-1404-64-template.qa.scl3.mozilla.com
Welcome to Ubuntu 14.04.1 LTS (GNU/Linux 3.13.0-32-generic x86_64)
* Documentation: https://help.ubuntu.com/
224 packages can be updated.
98 updates are security updates.
Last login: Mon Nov 3 04:31:52 2014 from 10-22-248-146.vpn.scl3.mozilla.com
mozauto@ubuntu-14:~$ uptime
03:35:34 up 23:04, 1 user, load average: 0.00, 0.01, 0.05
mozauto@ubuntu-14:~$ exit
logout
Connection to ubuntu-1404-64-template.qa.scl3.mozilla.com closed.
cknowles-20405:~ cknowles$
Flags: needinfo?(cknowles)
Reporter | ||
Comment 36•11 years ago
|
||
Oops, totally my fault. I got the IP address via my people SSH connection, but then also tried to SSH into the above VMs from that location. That clearly fails. I can connect now.
Reporter | ||
Comment 37•11 years ago
|
||
Alright. Both VMs have been updated and customized for our needs. Chris, you can now convert them back to templates, and replace our existing 4 machines each for 32bit and 64bit with ones from the new templates. Thanks.
Assignee | ||
Comment 38•11 years ago
|
||
Alright will be shutting down:
mm-ub-1404-32-1.qa.scl3.mozilla.com
mm-ub-1404-32-2.qa.scl3.mozilla.com
mm-ub-1404-32-3.qa.scl3.mozilla.com
mm-ub-1404-32-4.qa.scl3.mozilla.com
mm-ub-1404-64-1.qa.scl3.mozilla.com
mm-ub-1404-64-2.qa.scl3.mozilla.com
mm-ub-1404-64-3.qa.scl3.mozilla.com
mm-ub-1404-64-4.qa.scl3.mozilla.com
Destroying and redeploying from template - will let you know when that's complete.
Assignee | ||
Comment 39•11 years ago
|
||
Also, re-reading the bug history - will you be needing the staging boxes as well?
mm-ub-1404-32.qa.scl3.mozilla.com
mm-ub-1404-64.qa.scl3.mozilla.com
Assignee | ||
Comment 40•11 years ago
|
||
Alright the 8 have been respun from your template.
mm-ub-1404-32-1.qa.scl3.mozilla.com
mm-ub-1404-32-2.qa.scl3.mozilla.com
mm-ub-1404-32-3.qa.scl3.mozilla.com
mm-ub-1404-32-4.qa.scl3.mozilla.com
mm-ub-1404-64-1.qa.scl3.mozilla.com
mm-ub-1404-64-2.qa.scl3.mozilla.com
mm-ub-1404-64-3.qa.scl3.mozilla.com
mm-ub-1404-64-4.qa.scl3.mozilla.com
They're all responding to SSH, and seem to be healthy, let me know of any concerns. And let me know if you need those staging ones respun as well.
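A quick way to repeat the "responding to SSH" check for a whole batch of hosts is to test whether TCP port 22 accepts a connection. This is a minimal Python sketch using only the standard library; the function names are made up for illustration, and it only proves the port is open, not that a login would succeed:

```python
import socket


def ssh_port_open(host, port=22, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and DNS lookup failures.
        return False


def unreachable(hosts, port=22):
    """Return the subset of hosts whose SSH port does not answer."""
    return [host for host in hosts if not ssh_port_open(host, port)]
```

Calling unreachable() with the eight hostnames from inside the QA VLAN would return an empty list if every VM is up.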
Flags: needinfo?(hskupin)
Reporter | ||
Comment 41•11 years ago
|
||
(In reply to Chris Knowles [:cknowles] from comment #39)
> Also, re-reading the bug history - will you be needing the staging boxes as
> well?
Nope, we can keep them. No need to re-deploy them. They are based on Puppet and will be used for testing. Thanks!
(In reply to Chris Knowles [:cknowles] from comment #40)
> Alright the 8 have been respun from your template.
Great. I will check that soonish and reply back if I see something suspicious.
Assignee | ||
Updated•11 years ago
|
Whiteboard: [vm-create:10][qa-automation-blocked] → [vm-create:18][vm-delete:8][qa-automation-blocked]
Reporter | ||
Comment 42•11 years ago
|
||
Chris, for those VMs we do not have the checkbox for auto-upgrading VMware tools enabled. Can you please make sure to enable it for the templates?
Assignee | ||
Comment 43•11 years ago
|
||
It's not needed, and in fact may cause problems.
Longer:
Starting with the modern versions of the Linuxes (RHEL and CentOS 7, as well as Ubuntu 14.04), the VMware-provided tools are deprecated, and the open-vm-tools that are included with the distros are considered the canonical source. So for Ubuntu, an "apt-get update;apt-get dist-upgrade" will set you up with the latest open-vm-tools, as they're already installed on there.
The checkbox tries to mount the tools CD image and install the vmware version - which may or may not cause issues.
Also, note that for the linuxes, we have scripting in place that can and should manage tools upgrades for the environment and should keep them reasonably up to date.
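For anyone who wants to script the manual upgrade path mentioned above, here is a small Python sketch that wraps the two apt-get commands. It defaults to a dry run that only returns the command lines; the "-y" flag and the function name dist_upgrade are assumptions added for unattended use, and this is not the environment-wide tools scripting referred to in this comment:

```python
import subprocess

# The two commands quoted in this comment; "-y" is an added assumption
# so the upgrade can run without prompting.
UPGRADE_COMMANDS = [
    ["apt-get", "update"],
    ["apt-get", "-y", "dist-upgrade"],
]


def dist_upgrade(dry_run=True):
    """Run (or, by default, only return) the upgrade commands in order."""
    if not dry_run:
        for command in UPGRADE_COMMANDS:
            # Needs root on the VM; raises CalledProcessError on failure.
            subprocess.run(command, check=True)
    return [" ".join(command) for command in UPGRADE_COMMANDS]


print(dist_upgrade())
```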
Reporter | ||
Comment 44•11 years ago
|
||
Oh! That is news to me, and really good to hear. So ok, I will get the checkbox disabled again first thing tomorrow morning. Thanks for the info, Chris!
Flags: needinfo?(hskupin)
Reporter | ||
Comment 45•11 years ago
|
||
So what I did so far:
* Updated all the machines for auto-upgrade of VMware tools
* Connected all 32bit machines
* Connected all 64bit machines
Todo:
* I still see problems with the Flash installer not being able to download the real binaries. This is all due to proxy settings. This needs to be fixed, so I will take care of it tomorrow.
Reporter | ||
Comment 46•11 years ago
|
||
I removed the auto-upgrade check from all Ubuntu 14.04 VMs. So this is clean now. Further, I investigated the Flash issue a bit more, and given that 14.04 is not the only release that suffers from it, I will take care of it on bug 949427.
So all is done on that bug. Thanks a lot Chris for all the help!
Status: ASSIGNED → RESOLVED
Closed: 11 years ago → 11 years ago
Resolution: --- → FIXED
Updated•11 years ago
|
Product: mozilla.org → Infrastructure & Operations