Closed Bug 391058 Opened 13 years ago Closed 13 years ago

Move MDC (devmo) from giles to new php5 cluster

Categories

(mozilla.org Graveyard :: Server Operations, task)

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: reed, Assigned: oremj)

Details

MDC (devmo) on one machine is a Bad Idea(tm). Now that we have a php5 cluster, we should move MDC to the cluster. This will better protect and stabilize MDC by adding redundancy.
Yes, please.
This is a great idea, and needs to be done, but just to be clear up front: you'll lose all login access to the box(es), as they are running other production applications.  Not sure if this is still an issue, but it was last time we tried to migrate this app.  Hopefully it isn't and we can get this moved asap!
Well, as an aside to this issue, we need to create a proper way for the submission and maintenance of uploaded files that work in tandem with MDC.  In the past it's generally involved emailing the file to me or someone else with giles access to post it up, but it would be nice to provide a more flexible system for file management than that.

Code samples and other small items we could technically do using the image upload feature of MediaWiki (although I don't like doing it that way personally, we do it some already).  However, for larger items like videos and the like, we probably need another system for file posting and administration.
Most other sites have all their code in svn or cvs.  They maintain a live copy for dev, and once tested, just file an IT ticket to update the site to a tag or head.  Would this method work?  If so, just let us know where to pull the code from.
Yeah, that would probably work for us.
I don't think we want videos in svn -- this isn't just about code, but also about often-large assets.  Can we use a different directory on stage.m.o for this, and then rsync into place?

Otherwise, a chrooted sftp to the right directory on the cluster seems like it's pretty low-risk, and would be much more direct.
Yes, I was thinking about just for sample code and stuff, but yes, for videos we'll still need a solution other than svn or cvs.
(In reply to comment #6)
> I don't think we want videos in svn -- this isn't just about code, but also
> about often-large assets.  Can we use a different directory on stage.m.o for
> this, and then rsync into place?
> 
> Otherwise, a chrooted sftp to the right directory on the cluster seems like
> it's pretty low-risk, and would be much more direct.
> 

Yea, we have this same situation with air.mozilla.com.  We have a separate video server which does only large files.  We'll rsync those over to that box - you'll just need to update your links.  I agree, large files in svn = bad :-)

More to come with the details...
Assignee: server-ops → oremj
Before we do this I think we should figure out a way to reduce the application to one mediawiki directory instead of one per language.

We could create one mediawiki instance in dir foo and have one symlink to that directory per language, for example en -> foo, then we could dynamically set the language in LocalSettings.php off that symlink.

Ideas? 
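
A minimal sketch of the symlink idea above, written in Python purely to illustrate the logic (the real change would live in LocalSettings.php). It assumes each language is a symlink such as en -> foo and that the language code is the name of the directory a script was invoked through. The function name, the default, and the set of language codes are hypothetical; none of them come from this bug.

```python
import os

# Hypothetical set of language codes; the real list of MDC languages is not given here.
KNOWN_LANGUAGES = {"en", "fr", "ja"}


def language_from_path(script_path, default="en"):
    """Return the language code implied by the directory a script was invoked through.

    The path is deliberately not passed through os.path.realpath: resolving the
    symlink would always yield the shared directory (foo) and lose the language.
    """
    directory = os.path.basename(os.path.dirname(os.path.abspath(script_path)))
    return directory if directory in KNOWN_LANGUAGES else default
```

Because the language comes from the invocation path rather than from a web server rewrite, the same lookup would also work for scripts run from the command line, provided they are started through the per-language symlink.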
We shouldn't need symlinks -- should be able to do it with rewrites (which are easier to keep straight via SVN to various machines, I think).  I had that working with the old pre-upgrade MDC on a previous laptop, though without any care for the cache/upload/etc. directory correctness.

We'll need to make sure that they all get the right upload/image/etc. paths, though, and don't stomp on each other.
My original thought was rewrites as well, but scripts that are run on the command line would no longer work.  Any idea how to solve that problem?
I would guess that symlinks are probably the safest way to do it and keep the command line stuff happy.  Unless you want to fix all the command line stuff to require a language as a parameter...
There is a copy on the cluster pointing at the live database at developer-cluster.mozilla.org
Wow, that's fast.  What sort of issues should we be looking for?  Any suggestions on things to test?
Mostly just making sure things seem the same, uploads work, etc.
If the cluster points at the live database, are the uploads pointing to a shared filesystem as well, or will people doing uploads end up with a phantom on the live MDC (database entries for the upload, but files not present)?  I fear the latter, would love to be wrong!
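
One way to check for the "phantom" uploads described above is to compare the file names recorded in the database against what is actually on disk. The sketch below is in Python for illustration only and assumes a flat upload directory and a list of names already fetched from the database. Both are simplifications: the function name and layout are hypothetical, and the real wiki's upload directory structure is not described in this bug.

```python
import os


def find_phantom_uploads(recorded_names, upload_dir):
    """Return the recorded upload names that have no matching file in upload_dir."""
    return sorted(
        name
        for name in recorded_names
        if not os.path.isfile(os.path.join(upload_dir, name))
    )
```

Running something like this on each machine that serves the wiki would show which hosts are missing files that the shared database already knows about.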
Saving changes to articles seems to be sluggish on the cluster, and occasionally I'm getting odd errors about some kind of sync problem.  I'll get the exact message and add it here once it happens again.  Had a brain malfunction and forgot to grab it.
(In reply to comment #16)
> If the cluster points at the live database, are the uploads pointing to a
> shared filesystem as well, or will people doing uploads end up with a phantom
> on the live MDC (database entries for the upload, but files not present)?  I
> fear the latter, would love to be wrong!
> 

Unfortunately, this is the case.

(In reply to comment #17)
> Saving changes to articles seems to be sluggish on the cluster, and
> occasionally I'm getting odd errors about some kind of sync problem.  I'll get
> the exact message and add it here once it happens again.  Had a brain
> malfunction and forgot to grab it.
> 

I checked the error logs and didn't see anything.  Let me know if you see this again.
Hm.  I haven't seen the error again, but I've had cases where trying to save changes had no effect.  I've had to switch back to using the deployment version of the wiki to get things done.
(In reply to comment #18)
> (In reply to comment #16)
> > If the cluster points at the live database, are the uploads pointing to a
> > shared filesystem as well, or will people doing uploads end up with a phantom
> > on the live MDC (database entries for the upload, but files not present)?  I
> > fear the latter, would love to be wrong!
> > 
> 
> Unfortunately, this is the case.

Can't we just replicate the db for testing, and point it to the replica?
I can just create another db, but we would have to take downtime to set up a replica.
(In reply to comment #19)
> Hm.  I haven't seen the error again, but I've had cases where trying to save
> changes had no effect.  I've had to switch back to using the deployment version
> of the wiki to get things done.
> 
I just realized that for this application, sessions are stored on local disk. I put those on NFS. This problem should now be solved.
Just dump from giles and load on a DB for developer-cluster?  That'd be enough to let us test without desyncing or anything, and it's what we do for developer-stage.  Probably make it easier to isolate problems, too, since the DB won't be changing as much underneath us.
I guess we'll need to get the file-space syncing sorted out between the cluster machines anyway, though, at which point giles is just another one of those clients talking to the shared FS, right?

That would let sheppy keep working on the cluster, which will give us much more satisfying test coverage.
File-space syncing is already sorted out for the cluster. Giles could be another client talking to the storage, but right now it isn't on the same vlan and isn't configured to be on the storage net.

I'll just dump the db for now.  When we are ready to go live I'll switch the config back to use the live db and rsync the wiki-images directory over.
developer-cluster is now running off a dump.
OK, so real work needs to be done on developer.mozilla.org instead now, correct?
Correct, developer-cluster is just for testing.
Any new issues?  How much longer should this be tested before we go live?
I need to spend more time testing on this.  I'll put some time in on it this afternoon and evening.
It's been noticed that the nutch searching is broken on developer-cluster.  Can you guys figure out what's up with that?
Nutch probably just isn't pointing at it.  I have no idea where nutch is or how it is configured.
Sancus, any idea what's up with nutch on developer-cluster?
aravind set up nutch and probably has a good idea how to fix it...
If nutch is pointed ($mzNutchURL in GlobalSettings.php) at nutch.developer.mozilla.org, then yeah, it won't work, because it uses wgServerName to search, and so it'll be searching for records with developer-cluster.mozilla.org in them, instead of developer.mozilla.org.

The only way to change this for testing purposes is to uncomment:
//for testing
$wgServerName="developer.mozilla.org";

in /extensions/SpecialNutch_Class.php

I should probably add in a config variable to override that, but it hasn't come up before!
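
A rough sketch of what such a config override could look like, in Python for illustration only (the real change would be made in the PHP extension). The setting name nutch_server_name_override and the function name are hypothetical. The idea is simply to search under an explicitly configured host name when one is set, and to fall back to the server's own name otherwise.

```python
def search_server_name(settings, actual_server_name):
    """Pick the host name whose indexed records should be searched.

    settings is a plain dict of configuration values; the key
    "nutch_server_name_override" is an assumed name, not an existing option.
    """
    override = settings.get("nutch_server_name_override")
    return override if override else actual_server_name
```

With this in place, a test host such as developer-cluster.mozilla.org could set the override to developer.mozilla.org in its own configuration, without anyone editing the extension source.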
(In reply to comment #35)
> If nutch is pointed ($mzNutchURL in GlobalSettings.php) at
> nutch.developer.mozilla.org, then yeah, it won't work, because it uses
> wgServerName to search, and so it'll be searching for records with
> developer-cluster.mozilla.org in them, instead of developer.mozilla.org.
> 
> The only way to change this for testing purposes is to uncomment:
> //for testing
> $wgServerName="developer.mozilla.org";
> 
> in /extensions/SpecialNutch_Class.php
> 
> I should probably add in a config variable to override that, but it hasn't come
> up before!
> 

I made that change.
Status: NEW → RESOLVED
Closed: 13 years ago
Resolution: --- → FIXED
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Do you think we could do this tomorrow?
Works for me.
Where should we be putting files that need to be synced to the MDC web site?  Is there a directory on stage?  Should people be getting accounts to upload them?  Where am I?
How is that working right now?  Which directory are we talking about?  
We've been scping them to giles; Justin may have plans, as per comment 8.  I think there are a handful of directories, not sure off-hand which ones.
Once you know what directories are needed I could create an NFS mount that you all could write to.
I like the idea of the NFS mount.  Right now, here are the directories we have:

es4
presentations
samples

I'd like to add a few more:

video
audio
demos
Devmo has been migrated.
Status: REOPENED → RESOLVED
Closed: 13 years ago
Resolution: --- → FIXED
Let us know when we have the NFS share for file uploads, and how to get to it, please.
I have files to upload, so how do I do that? :)
I have mounted these directories on mpt-vpn.mozilla.com:/data/devmo
After spending the last hour trying to get this to mount without any luck, it's time to ask.  How do I get this NFS volume to mount successfully on my Mac?  Connecting to nfs://mpt-vpn.mozilla.com/data/devmo doesn't work.  I've tried using NFS Manager, and other utilities, and nothing seems to do the trick.  What am I missing here?
(In reply to comment #48)
> What am I missing here?

Don't try doing it over NFS, as that won't work. Use something like scp.
Reed is correct; you will want to scp, or ssh in to look around.
Transmit and Interarchy both do sftp if you want a GUI to look at them.  There's also an sshfs filesystem driver for MacFUSE that'll let you mount it over ssh and browse from the Finder - http://code.google.com/p/macfuse/
OK, this translates to "don't tell Sheppy more than he needs to know." All the talk about setting up an NFS share just confused me. :)
I keep getting permission denied errors trying to connect to this, so apparently I'm still not sure what I'm doing.  I've tried my LDAP username and password, plus both my regular usernames and my passphrase for RSA.
Right after posting that last comment, I got it to work.
Except that I don't appear to have write access.
It should work for you now.
Product: mozilla.org → mozilla.org Graveyard