Closed Bug 720549 Opened 12 years ago Closed 12 years ago

migrate cg-ecmascript01 to scl3

Categories

(Infrastructure & Operations :: Virtualization, task)

task
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: dustin, Assigned: afernandez)

I can't tell from bug 400784 what the status of this VM is.  If it's no longer in use, let's verify that and shut it down.

Otherwise, we'll need to schedule a way to migrate it to scl3.  This will involve some amount of downtime, as well as a new IP address.

:dherman, do you know the current status of this system?  Thanks!
Blocks: scl3-move
Depends on: 720553
I'm a bit confused about what's hosted where at the moment, but we definitely don't want to lose any of the ecmascript.org subdomains -- they're in active use. The difficult one to port has been the wiki. I'm out of town this week but can help out next week if we want to try to do the upgrade and move.

Dave
I know fox2mike was involved with some of those moves, so perhaps he can provide some advice on whether it's easier to move this VM, or finish getting services off of it?
Assignee: dherman → shyam
Just move it as is, please. It's too messy otherwise, unless Dave has made progress since the last time we spoke :) (and I guess that's not happened).
No, no progress. :-/

Dave
Punting back to dustin, please move to scl3 as is.
Assignee: shyam → dustin
OK, sounds good.  Dave, this host will need a new IP address.  I see ecmascript.org in our DNS zone files, so I can make the necessary changes there.  Will a new IP require any other changes?

We will likely have the VMware infrastructure set up in late February, so this move will probably occur in early March.
> OK, sounds good.  Dave, this host will need a new IP address.  I see
> ecmascript.org in our DNS zone files, so I can make the necessary changes
> there.  Will a new IP require any other changes?

I don't think so.

> We will likely have the VMware infrastructure set up in late February, so
> this move will probably occur in early March.

Thanks for the heads up. If you could ping me when the process starts so I can babysit from the application side of things, that'd be helpful. Thanks!

Dave
Dan, can you give us some parameters on this move?  This will go to ecmascript1.community.scl3.mozilla.com.  I'd like to do this in the next week or so.

Dave, if there are any particular parameters around downtime, let us know.
Assignee: dustin → server-ops
Component: Server Operations → Server Operations: Virtualization
QA Contact: cshields → dparsons
Probably not going to be done within the next week, unfortunately. Too much ultra-high-priority work on my shoulders.
OK, this can slide to another train - the next hard-stop would be 4/23
Assignee: server-ops → rbryce
rbryce, can you pick a downtime to move this and let us know?  Give us a few days' warning, and I'll work with dherman to make sure I can take care of any loose ends.

:dherman - do you know of any changes aside from DNS that I'll need to make?  Also, can you communicate login details to me and rbryce out of band, so we can make sure the host comes back up at its new IP?

The new IP and hostname are:
  ecmascript1.community.scl3.mozilla.com
  63.245.223.13/25 (community VLAN)
dustin: I am going to be out of town and unavailable until 4-24.  I have asked dumitru to handle the migration.  is there a time that would be better than another?
(In reply to Rick Bryce [:rbryce] from comment #12)
> dustin: I am going to be out of town and unavailable until 4-24.  I have
> asked dumitru to handle the migration.  is there a time that would be better
> than another?

I will see if I can squeeze that in. On Monday we are also moving the tape library and the backup server and there are a lot of things to do after its physical move to make sure all the backups work.
Let's say next Wednesday, during PST daytime - say, start around 11am?

Dave, OK?  Dumitru?
That should be fine. I'll send a heads-up to relevant parties so they know to stay off the wiki. Do we have a rough sense of how long people should lay off the site?

Thanks,
Dave
VM migrations have been pretty slow - they always seem to hit some un-anticipated snag.  So I'd estimate most of the day, with the option to be pleasantly surprised.
We are good to go at 11 AM PST today for the migration.  If anyone has any last-minute notes to add, please do so ASAP.
At ~40% complete, the migration converter errored with:

FAILED: A general system error occurred:
No connection could be made because the target machine actively refused it.

I've restarted the migration. It is expected to be done in ~30 minutes.
How are we doing?
:rbryce? :mburns?
dustin:

We had to bail out of this migration.  After 3 tries, the file copy still fails at 41%.  mburns is still working on another plan.
(In reply to Dustin J. Mitchell [:dustin] from comment #16)
> VM migrations have been pretty slow - they always seem to hit some
> un-anticipated snag.  So I'd estimate most of the day, with the option to be
> pleasantly surprised.

:dherman, are you OK with continued downtime here?  Please find me or mburns if this becomes a pressing concern.
It could become a problem if it goes much longer, but I actually seem to be able to reach bugs.ecmascript.org and hg.ecmascript.org -- am I confused about which subdomains are affected?

Dave
Indeed, the old host is up, so this is less pressing than I thought.  We just need to get the new version of it up in scl3 before the lights go out in sjc1 (which is only 20 days away, so don't get too relaxed).  Sorry for not seeing that!
Assignee: rbryce → mburns
mburns, any thoughts on how to move forward on this one?
Proceeding with attempting migration.
:dherman gave green light.
Assignee: mburns → afernandez
Migration in progress, currently at 5% complete.
Migration failed at 39%; working on a resolution.
2nd attempt cancelled at :dherman's request.

Reason:
13:20 <dherman> some partners are doing a presentation with test262.ecmascript.org


Please update this bug (and, to expedite, contact me via IRC) to start the migration attempt again.
So, before you guys try to migrate this again :

yum clean all
yum -y update

Reboot the machine and then give it a go (after stopping all the services). I've seen this fail multiple times now, and the above steps helped me the last time I had a machine that was failing in a similar fashion.
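For reference, the prep steps above could be scripted roughly as follows. This is only a sketch: the `run` helper, the `DRY_RUN` switch, and the `SERVICES` list (defaulting to `httpd`) are assumptions for illustration, since the comment does not name the services to stop. The update and reboot are split from the service stop because the services are stopped after the machine comes back up.

```shell
#!/bin/sh
# Sketch of the suggested pre-migration prep.
# DRY_RUN=1 (the default) only prints each command; set DRY_RUN=0 to run as root.
# SERVICES is a hypothetical list; the thread does not say which services run here.
DRY_RUN="${DRY_RUN:-1}"
SERVICES="${SERVICES:-httpd}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

# Step 1: refresh yum metadata, update all packages, then reboot.
pre_migration_update() {
    run yum clean all
    run yum -y update
    run reboot
}

# Step 2 (after the reboot): stop services so the disk is quiet during the copy.
stop_services() {
    for svc in $SERVICES; do
        run service "$svc" stop
    done
}
```

With the default `DRY_RUN=1`, calling `pre_migration_update` and then `stop_services` just lists what would be executed, which is a cheap way to review the plan before a downtime window.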
Also since this is already a VM :

[root@cg-ecmascript01 ~]# dmidecode | grep Prod
	Product Name: VMware Virtual Platform
	Product Name: 440BX Desktop Reference Platform

We can maybe shut it down and copy it (if everything else fails).

Also, make sure vmware-tools is up to date on the box before trying the next migration?
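The `dmidecode` check above can be wrapped in a small helper if it needs to be repeated on other hosts. A minimal sketch, assuming the "Product Name: VMware" string shown in the output above is what identifies a VMware guest; `is_vmware_guest` is a hypothetical name, not an existing tool.

```shell
# Read `dmidecode` output on stdin; succeed if it reports a VMware guest.
is_vmware_guest() {
    grep -q 'Product Name: VMware'
}

# Usage on the host itself (dmidecode needs root):
#   dmidecode | is_vmware_guest && echo "already a VM: cold copy is an option"
```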
The migration fails while the transfer is running. This could possibly be fixed, but to avoid further failed attempts we will just move the actual vmdk files (i.e. with the VM shut down).

:dherman, please let me know when the VM can be taken offline again.

Thank you.
:dherman, ping?
OK, I've got an all-clear for any time Friday through Monday from everyone but one of the people I asked. So let's go for it.

Thanks for your patience,
Dave
VM will be migrated today.

Will update when it actually goes down for migration.
VM coming down for migration.
VM Migration running.

Status: 2% Complete
ETA: ~ 23 minutes (will fluctuate)
VM Migration completed, getting server online.

The previous A record(s) for ecmascript.org have been updated as follows:
63.245.210.149 -> 63.245.223.13
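One way to confirm the change from a client is to compare what a resolver returns against the new address. A minimal sketch; `has_a_record` is a hypothetical helper, and resolvers may keep returning the old address until the record's TTL expires.

```shell
# Succeed if the expected address appears among the resolved addresses
# (one per line on stdin, which is how `dig +short` prints them).
has_a_record() {
    grep -Fxq "$1"
}

# Usage (needs network access and dig):
#   dig +short A ecmascript.org | has_a_record 63.245.223.13 && echo "DNS updated"
```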
The VM is online, but the disk needs to grow (the original disk is 10G) because there is not enough free space to install/upgrade VMware Tools.

Increasing disk (lvm) to 20G.
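The exact commands used for the resize are not recorded here. As a rough sketch of one common way to do it, assuming the virtual disk (and, if applicable, the partition holding the physical volume) has already been enlarged and the filesystem is ext3/ext4: the device and volume names in the usage line are hypothetical, and `run`/`DRY_RUN` are illustration helpers.

```shell
# Sketch only: the names passed in are placeholders, not taken from this bug.
# DRY_RUN=1 (the default) prints the commands instead of running them.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

# Usage: grow_lv PV_DEVICE LV_PATH
grow_lv() {
    pv="$1"
    lv="$2"
    run pvresize "$pv"                 # let LVM see the enlarged device
    run lvextend -l +100%FREE "$lv"    # give all free extents to the LV
    run resize2fs "$lv"                # grow an ext3/ext4 filesystem to fit
}

# Example (hypothetical names): grow_lv /dev/sda2 /dev/VolGroup00/LogVol00
```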
VM is fully online.

It seems httpd.conf was configured to listen on the previous IP; I've updated it to use the new one.

Please verify that all is well.

If no issues are reported, bug will be closed after 24 hours.
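For anyone hitting the same hard-coded-IP problem after a move, a sketch of the kind of edit involved. The comment does not say which directive held the old address (a `Listen` or `VirtualHost` line is a guess), and `repoint_ip` is a hypothetical helper; it keeps a `.bak` copy of the file it rewrites.

```shell
# Replace every occurrence of OLD_IP with NEW_IP in FILE, keeping FILE.bak.
# Usage: repoint_ip FILE OLD_IP NEW_IP
repoint_ip() {
    file="$1"
    old="$(printf '%s' "$2" | sed 's/\./\\./g')"   # escape dots for the regex
    new="$3"
    cp "$file" "$file.bak"
    sed "s/$old/$new/g" "$file.bak" > "$file"
}

# Example (the path is an assumption; adjust to the host's layout):
#   repoint_ip /etc/httpd/conf/httpd.conf 63.245.210.149 63.245.223.13
#   apachectl configtest && service httpd restart
```

Running `grep -rn 63.245.210.149 /etc` beforehand is a quick way to find other files that still reference the old address.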
Everything looks good!

Thanks,
Dave
Cool! Thanks Aj, Dave.
Status: NEW → RESOLVED
Closed: 12 years ago
Resolution: --- → FIXED
Product: mozilla.org → Infrastructure & Operations