Closed Bug 731583 Opened 12 years ago Closed 12 years ago

hg in SCL3

Categories

(Developer Services :: General, task)

x86
Linux
task
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: cshields, Assigned: bkero)

References

Details

(Whiteboard: SCL3 [tracker])

Tracker bug for moving hg to SCL3.  Once hg has migrated we will move the old hg physical gear to SCL3 as spares.
Summary: HG in SCL3 → hg in SCL3
Whiteboard: [tracker] → SCL3 [tracker]
Depends on: 731973
summary of discussions w/cshields, arr and bkero:

1) this switchover *may* be ready to do Friday (06apr). bkero will know sometime tomorrow (Wed) morning if we are go/no-go for this. If we can do the switchover this Friday, it's one less thing to worry about next week! (This Friday is also a day off in some countries, so less disruptive to developers - nice, but not a requirement.) If it means waiting until next week to do it right, that's fine also.

2) coop is standing by to schedule downtime for this after talking with bkero Wed morning. catlee already announced *possible* downtime in today's platform meeting, just in case.

3) bkero is doing the late-night work on this, so he should get the fame/glory/assignment.
Assignee: server-ops-devservices → bkero
[8:58pm] coop: bkero: which hosts should i be attempting to clone from in scl3? is there a balancer setup or should i be hitting the individual nodes?
[8:58pm] bkero: coop: There is a balancer set up, 63.245.215.25
[8:59pm] bkero: coop: ssh checkouts to that will work fine, but http checkouts will have to have a http host header set
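
The Host-header point can be illustrated with a minimal sketch. It assumes the balancer routes HTTP by virtual host and that hg.mozilla.org is the name it expects; neither is confirmed in the chat, and build_request is a hypothetical helper. It only prepares a plain HTTP probe against the balancer IP and is not a substitute for an actual hg clone.

```python
import urllib.request

BALANCER_IP = "63.245.215.25"  # balancer address quoted in the chat above
VHOST = "hg.mozilla.org"       # assumption: virtual host name the balancer expects


def build_request(repo_path: str,
                  ip: str = BALANCER_IP,
                  vhost: str = VHOST) -> urllib.request.Request:
    """Prepare an HTTP request to the balancer IP with an explicit Host header.

    Hypothetical helper: the URL targets the raw IP, while the Host header
    carries the site name so that name-based routing can still work.
    """
    url = f"http://{ip}/{repo_path.lstrip('/')}"
    return urllib.request.Request(url, headers={"Host": vhost})


if __name__ == "__main__":
    request = build_request("mozilla-central")
    # Sending the request needs network access to the balancer.
    with urllib.request.urlopen(request, timeout=10) as response:
        print(response.status)
```

urllib only fills in a Host header when the request does not already carry one, so the value set here is what goes on the wire.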

Xylophone:~ ccooper$ hg -v --debug clone ssh://63.245.215.25/mozilla-central
running ssh 63.245.215.25 "hg -R mozilla-central serve --stdio"
sending hello command
sending between command
abort: no suitable response from remote hg!
bkero got things unblocked. I can now clone from 63.245.215.25.

I currently have 11 dev slaves all cloning mozilla-central from 63.245.215.25 concurrently.
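
For anyone wanting to repeat a similar smoke test from a single machine, here is a rough sketch. It is not what was run above (that used 11 separate slaves); clone_command, run_clone and clone_concurrently are hypothetical names, and the destination directories are made up. The hg clone invocation itself is the one from the transcript, minus the debug flags.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Sequence

REPO_URL = "ssh://63.245.215.25/mozilla-central"  # URL used in the transcript above


def clone_command(dest: str, repo_url: str = REPO_URL) -> List[str]:
    """Build the argv for one clone into the directory `dest`."""
    return ["hg", "clone", repo_url, dest]


def run_clone(argv: Sequence[str]) -> int:
    """Run one clone and return its exit code (0 means success)."""
    return subprocess.run(list(argv), check=False).returncode


def clone_concurrently(dests: Sequence[str],
                       runner: Callable[[Sequence[str]], int] = run_clone) -> List[int]:
    """Start one clone per destination at the same time and collect exit codes."""
    commands = [clone_command(dest) for dest in dests]
    with ThreadPoolExecutor(max_workers=max(1, len(commands))) as pool:
        return list(pool.map(runner, commands))


if __name__ == "__main__":
    # 11 concurrent clones, mirroring the number of slaves mentioned above.
    exit_codes = clone_concurrently([f"mozilla-central-{i}" for i in range(11)])
    print(exit_codes)
```

The runner is a parameter so the fan-out logic can be exercised without touching the network.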
bkero - 

What is the outline of the move for this window?  Is it a server move or data syncing?  I'm trying to map out what pre and post prep I need to do and I'm also getting asked why 4 hours.

thanks
The timeline for this is as follows:

1) Break snapmirror and make hg netapp volume unmountable in sjc1, but mountable in scl3 (this will take about an hour)

2) Alter network settings to point hg.mozilla.org, hgpvt.mozilla.org, and hg.ecmascript.org to the new global VIPs (this is a DNS change, so it takes time to propagate/expire)

The 4-hour window covers the case where we need to wait for the netapp volume to sync back to sjc1 and roll back the deployment.
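
Since step 2 depends on DNS propagating, a small check like the following could be run from a client machine to see whether the three names already resolve to the new addresses. This is only a sketch: the VIP addresses are not given in this bug, so they have to be supplied by the caller (the example uses a documentation-range placeholder), and cutover_status and resolve_ipv4 are hypothetical names.

```python
import socket
from typing import Callable, Dict, Iterable, List

# Hostnames listed in step 2 of the timeline above.
HOSTNAMES = ["hg.mozilla.org", "hgpvt.mozilla.org", "hg.ecmascript.org"]


def resolve_ipv4(hostname: str) -> List[str]:
    """Return the IPv4 addresses the local resolver currently gives for hostname."""
    _name, _aliases, addresses = socket.gethostbyname_ex(hostname)
    return addresses


def cutover_status(hostnames: Iterable[str],
                   expected_vips: Dict[str, str],
                   resolver: Callable[[str], List[str]] = resolve_ipv4) -> Dict[str, bool]:
    """Map each hostname to True if it already resolves to its expected VIP."""
    status = {}
    for name in hostnames:
        try:
            addresses = resolver(name)
        except OSError:
            # Resolution failures count as "not switched over yet".
            addresses = []
        status[name] = expected_vips[name] in addresses
    return status


if __name__ == "__main__":
    # Placeholder address (TEST-NET-1); replace with the real global VIPs.
    expected = {name: "192.0.2.10" for name in HOSTNAMES}
    for name, switched in cutover_status(HOSTNAMES, expected).items():
        print(name, "new VIP" if switched else "old address or unresolved")
```

A result from one client only reflects that client's resolver cache, so old answers can linger elsewhere until the record TTLs expire.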
thanks for the answer.  please take this next question as friendly ;)...

if the snapmirror step takes about an hour can the DNS part be done in parallel so we could reduce the window down to 3 (or even 2?)

I'm asking only because of the amount of time over the next couple of weeks we are going to be having planned (and unplanned) downtimes.  If the answer is "no" so be it.

thanks again
(In reply to Mike Taylor [:bear] from comment #6)
> thanks for the answer.  please take this next question as friendly ;)...
> 
> if the snapmirror step takes about an hour can the DNS part be done in
> parallel so we could reduce the window down to 3 (or even 2?)

The long window is to allow for rollback procedures.  I'm pretty sure we can have it flipped over in an hour or so, but you'll far outrun that window (even 2 or 3) if something goes wrong.

So, in other words, the window should not be indicative of absolute time down, unless shit happens.  If shit happens, the window gives fair warning.


> I'm asking only because of the amount of time over the next couple of weeks
> we are going to be having planned (and unplanned) downtimes.  If the answer
> is "no" so be it.


Yeah, moving out of MPT is a pain  :(  Thankfully this move will be a huge piece of the pie.
We're going with a 3-hour window, 6am-9am Thursday.
hg is now in SCL3, this bug is complete. Please file new bugs for issues from here on.

Great job bkero!
Status: NEW → RESOLVED
Closed: 12 years ago
Resolution: --- → FIXED
Component: Server Operations: Developer Services → General
Product: mozilla.org → Developer Services