Bug 738381: Setup memcache for bedrock/mozilla.org
Status: RESOLVED FIXED
Opened 12 years ago; closed 12 years ago
Component: Infrastructure & Operations Graveyard :: WebOps: Other (task)
Tracking: not tracked
Reporter: jlong; Assignee: nmaul
We could really use memcache on mozilla.org to cache things that should only be generated once. Is it possible to set it up by next Thursday, when we are hoping to push a bunch of bedrock stuff live?
Assignee | Comment 1 • 12 years ago
How much cache do you think you might need? Seamicro Atom nodes and/or VMs should be able to do this pretty well, and we have plenty of them available right now, but the total RAM might be somewhat limited. That might be enough to get you started, though. We can do a "real" memcache node (several GB of usable cache space), but that might involve ordering hardware.

We also have some ready-to-go "generic" memcache nodes, but using them would violate the idea that www.mozilla.org should be somewhat isolated from other things, so as to avoid resource contention causing issues with it (very important site, don't want random-thing.mozilla.org breaking it). So that's non-ideal, IMO. Any comments, cshields?

I don't think the Seamicro Xeons will be ready any time soon, since that's still blocked on so many power/network things. So it seems to me like we should either spin up some Atoms or VMs, or allocate/buy some blades.
Reporter | Comment 2 • 12 years ago
At least at first, we'll just use it to cache the output of RSS feeds from blogs and things like that. I can't imagine needing very much RAM. It will be a global cache, the exact same for every single user. It will be refreshing every 30 minutes or so. I would bet even 256 or 512MB would be enough, but I can't say for sure until we profile it.
Reporter | Comment 3 • 12 years ago
Any word on getting this done by Thursday? We'd like to use it to cache a blog feed.
Assignee | Comment 4 • 12 years ago
Hardware for this is allocated: https://inventory.mozilla.org/en-US/systems/show/5042/

@phong, can you have someone kickstart/puppetize this? RHEL6, x86-64... I can get its puppet manifests straightened out once it's online. Thanks!
Assignee: server-ops → phong
Reporter | Comment 5 • 12 years ago
Thanks Jake! I'll assume it'll be ready by Thursday. If not, I think I can switch to a filesystem-based cache.
Comment 6 • 12 years ago
Kickstarting right now.
Comment 7 • 12 years ago
bedrock-memcache1.webapp.phx1.mozilla.com is kickstarted, puppetized and added in Nagios with generic checks. Reassigning to Jakem.
Assignee: phong → nmaul
Reporter | Comment 8 • 12 years ago
We're not relying on this for tomorrow's release, though it would be great to get up and running soon.
No longer blocks: 736338
Assignee | Comment 9 • 12 years ago
Working on this today.
Assignee | Comment 10 • 12 years ago
This system is up and should be usable, at least for prod (need to open a netops ACL bug for dev/stage). Let me know what settings you need in place to start using this.
Comment 11 • 12 years ago
Jake: Were you able to hook up memcache for dev/stage yet too? Those don't need to be too beefy, they just need to work :)

jlongster: Can you tell Jake what settings need to be put in place for memcache? Standard Django stuff, I presume.
Reporter | Comment 12 • 12 years ago
I think this is all you need, right?

CACHES = {
    'default': {
        'BACKEND': 'django.core.cache.backends.memcached.MemcachedCache',
        'LOCATION': '127.0.0.1:11211',
    }
}

Replace the LOCATION with the appropriate thing, of course.
Comment 13 • 12 years ago
Can LOCATION be a list?
Reporter | Comment 14 • 12 years ago
(In reply to Fred Wenzel [:wenzel] from comment #13)
> can LOCATION be a list?

I don't see anything about that:
https://docs.djangoproject.com/en/dev/ref/settings/#std:setting-CACHES-LOCATION
Comment 15 • 12 years ago
(In reply to James Long (:jlongster) from comment #14)
> (In reply to Fred Wenzel [:wenzel] from comment #13)
> > can LOCATION be a list?
>
> I don't see anything about that:
> https://docs.djangoproject.com/en/dev/ref/settings/#std:setting-CACHES-LOCATION

It should; we use it on a variety of sites, e.g.

CACHES = {
    'default': {
        'BACKEND': 'django.core.cache.backends.memcached.MemcachedCache',
        'LOCATION': [
            'memcache-generic01:11211',
            'memcache-generic02:11211',
        ],
        'KEY_PREFIX': 'mozillalabs_stage'
    }
}
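When LOCATION is a list, the memcached client decides which server stores each key by hashing the key, so every web node agrees on where a value lives without any coordination. A simplified illustration of the idea (this is a stand-in, not python-memcached's actual sharding code, which uses crc32-based server hashing):

```python
from binascii import crc32

# Simplified sketch: map a cache key to one server in the pool by
# hashing the key. Every client with the same server list computes
# the same mapping, so reads and writes for a key hit the same node.
def pick_server(key, servers):
    return servers[crc32(key.encode()) % len(servers)]

servers = ['memcache-generic01:11211', 'memcache-generic02:11211']
# The same key always lands on the same server.
assert pick_server('feed:hacks', servers) == pick_server('feed:hacks', servers)
```

One consequence worth noting: adding or removing a server changes the modulus, which remaps most keys, so resizing the pool effectively flushes the cache for a simple scheme like this.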
Comment 16 • 12 years ago
Yeah, seems like a documentation fail. A list should work indeed:
http://stackoverflow.com/questions/6876250/how-does-django-handle-multiple-memcached-servers

All right, use a list then! :)
Reporter | Comment 17 • 12 years ago
How should we go about testing this? Should we add the settings to www-dev.allizom.org and test it there, and then roll the code to production? Should www-dev continue using the production memcache (probably not)?
Comment 18 • 12 years ago
I'm not sure on the status of memcache servers for bedrock itself, given that it's served out of two DCs, etc. 302 jake (and maybe corey) on where that's at.
Comment 19 • 12 years ago
(In reply to James Long (:jlongster) from comment #17)
> Should www-dev continue using the production memcache (probably not)?

God no :)
Assignee | Comment 20 • 12 years ago
I'm actually okay with dev using the prod memcache, because comment 15 indicates you can specify a KEY_PREFIX, which should effectively prevent any collisions. I've never known memcache to be noticeably slower under extra load (at least until you have kernel-level problems), so the only real concern is cache size. There's around 10GB of cache space, and we can always add more if needed.

In any case, there seems to be some sort of issue with this. Once this setting is in place, manage.py jobs no longer work. This is for www-dev.allizom.org:

[root@bedrockadm.private.phx1 bedrock]# python manage.py compress_assets 2>&1 1> /dev/null | grep -v 'old-style Playdoh layout'
Error: No module named memcache

Any ideas?
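The reason KEY_PREFIX prevents collisions: Django builds the key it actually stores in memcached from the prefix, a version number, and the caller's key. A sketch of that default key function (mirroring Django's documented "prefix:version:key" scheme):

```python
# Sketch of Django's default cache key construction: the stored key is
# "<KEY_PREFIX>:<version>:<key>". Two sites sharing one memcached pool
# never see each other's entries as long as their prefixes differ.
def make_key(key, key_prefix, version=1):
    return '%s:%s:%s' % (key_prefix, version, key)

# Same logical key, different deployments -> different stored keys.
assert make_key('feed:hacks', 'bedrock_dev') != make_key('feed:hacks', 'bedrock_prod')
```

So dev and prod pointed at the same cluster only contend for cache space, not for keys.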
Assignee | Comment 21 • 12 years ago
Prod will be a bit more of a pain; we'll still need to spin up a memcache node there and use puppet to set an /etc/hosts entry that points to the right place.

This means memcache would not be suitable for session info, as users will bounce between DCs. Is that okay, or do we need to come up with something different?
Assignee | Comment 22 • 12 years ago
I disabled this on www-dev.allizom.org to stop the emails. Let us know when the module is installed (something in vendor I guess) and we'll uncomment this setting.
Reporter | Comment 23 • 12 years ago
(In reply to Jake Maul [:jakem] from comment #20)
>
> [root@bedrockadm.private.phx1 bedrock]# python manage.py compress_assets
> 2>&1 1> /dev/null | grep -v 'old-style Playdoh layout'
> Error: No module named memcache
>
> Any ideas?

The memcache package wasn't installed. I just installed it and pushed to dev.
Assignee | Comment 24 • 12 years ago
Heh... well, I just installed python-memcached package as well (RPM). I don't know which of our fixes fixed it, but it's fixed now and turned back on.
Reporter | Comment 25 • 12 years ago
(In reply to Jake Maul [:jakem] from comment #21)
> Prod will be a bit more of a pain, we'll need to spin up a memcache node
> there still and use puppet to set a /etc/hosts entry to use that goes to the
> right place.
>
> This means memcache would not be suitable for session info, as users will
> bounce between DC's. Is that okay or do we need to come up with something
> different?

Yep, for now at least. We usually can't depend on sessions because that inherently means we can't cache the page.

(In reply to Jake Maul [:jakem] from comment #22)
> I disabled this on www-dev.allizom.org to stop the emails. Let us know when
> the module is installed (something in vendor I guess) and we'll uncomment
> this setting.

Sorry about that, I was trying to get to it earlier but some other stuff came up.
Assignee | Comment 26 • 12 years ago
For stage (www.allizom.org, which still needs to be moved over to the bedrock cluster), we can use the same memcache node as here. No problems; stage is only in PHX1, like dev.

For prod, we'll need to set up another memcache node in SCL3. Then we'll set the configs up to look at a simple/short hostname (just "memcache" or something), and use puppet to deploy a proper /etc/hosts record for that to each cluster. That'll work just fine for the web nodes accessing memcache normally.

This brings up a problem with the cron job, though... it will need to be able to import things to *both* sets of memcache nodes. I'd rather not set up a separate admin node for bedrock in SCL3, just because that seems likely to make things more complicated than they need to be (they might get out of sync). So let's open this up for discussion: how can we have the one admin node write to both sets of memcache nodes, and yet still have the settings_local.py file only point to the "local" set of memcache nodes? Lots of ideas come to mind:

1) Maintain separate settings for the update_feeds cron, so that it knows about both sets of memcache nodes, and can set keys on both of them.

2) Perhaps the update_feeds cron could take an argument indicating which set of nodes to update, and we could just call it twice... once with each argument. This would still require some setting somewhere so that it would know which nodes are which (or perhaps the argument could be the whole node list for a cluster).

3) Make a separate cron that builds on the update_feeds job... fetch the keys from the PHX1 memcache and insert them in SCL3. This could potentially be done entirely outside of Django as a shell script or something. Same settings problem as #1 and #2, though.

4) Put the update_feeds cron on one or more of the web nodes, instead of the admin node. Pretty strange organization (generally our web nodes have only system-level crons on them). Also, either some nodes become "special", or they all get the cron and we do extra work.

5) Ultimately eschew memcache and put these records in MySQL, which should eventually be a master/master cluster, in which case writes anywhere will be available in both places. Obviously this would be dependent on having MySQL up and running.

6) Find some way to mirror data between the 2 memcache clusters. Maybe there's a generic solution that will work.

7) Don't do this in cron. Instead, have the nodes pull down this information if it's not already there, and also set a 1-hour timestamp key, and automatically refresh when it's hit and the value is past (or nearing) TTL. Basically, treat it more like a traditional memcache installation.

Thoughts?
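Option 7 is the standard cache-aside pattern with a refresh-ahead margin: each web node fetches the feed itself on a miss or when the cached copy is nearing expiry, so no cross-DC cron is needed at all. A minimal sketch of the idea; `DictCache` and `fetch_feed` are stand-ins for the real memcache client and feed fetcher, and the names are made up for illustration:

```python
import time

TTL = 3600            # cache feeds for 1 hour, per comment 26
REFRESH_MARGIN = 300  # re-fetch when within 5 minutes of expiry

def get_feed(cache, key, fetch_feed):
    """Cache-aside read: serve from cache, refreshing on miss or near-expiry."""
    entry = cache.get(key)
    now = time.time()
    if entry is None or now >= entry['expires'] - REFRESH_MARGIN:
        value = fetch_feed()                       # fetch the RSS feed itself
        entry = {'value': value, 'expires': now + TTL}
        cache.set(key, entry)                      # repopulate local cache
    return entry['value']

# A dict stands in for the memcache client in this sketch.
class DictCache(dict):
    def set(self, k, v):
        self[k] = v

cache = DictCache()
calls = []
def fetch_feed():
    calls.append(1)
    return 'rss-data'

assert get_feed(cache, 'feed:hacks', fetch_feed) == 'rss-data'  # miss: fetches
assert get_feed(cache, 'feed:hacks', fetch_feed) == 'rss-data'  # hit: cached
assert len(calls) == 1
```

The trade-off versus the cron approaches is that some unlucky user requests pay the fetch latency, and concurrent misses can trigger duplicate fetches unless a lock key is added.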
Comment 27 • 12 years ago
(In reply to Jake Maul [:jakem] from comment #26)
> 1) Maintain separate settings for the update_feeds cron, so that it knows
> about both sets of memcache nodes, and can set keys on both of them.
>
> 2) Perhaps the update_feeds cron could take an argument indicating which set
> of nodes to update, and we could just call it twice... once with each
> argument. This would still require some setting somewhere so that it would
> know which nodes are which (or perhaps the argument could be the whole node
> list for a cluster).

manage.py does indeed have a --settings option (sometimes used to run tests, for example):

  --settings=SETTINGS   The Python path to a settings module, e.g.
                        "myproject.settings.main". If this isn't provided, the
                        DJANGO_SETTINGS_MODULE environment variable will be used.

We could use that to feed two different sets of memcache settings into the settings file (all else being equal), then call the cron job twice?
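The two-pass idea could look something like this as a crontab fragment. The path and the per-DC settings-module names here are hypothetical, purely for illustration; update_feeds is the management command named in comment 26:

```shell
# Hypothetical crontab fragment: run the feed update once per DC, each
# pass pointed at a settings module listing only that DC's memcache
# nodes (module names are made up for illustration).
0 * * * * cd /data/bedrock && python manage.py update_feeds --settings=settings_memcache_phx1
5 * * * * cd /data/bedrock && python manage.py update_feeds --settings=settings_memcache_scl3
```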
Assignee | Comment 29 • 12 years ago
I have bedrock ready to go for prod... not counting the cron job, which is a separate issue to be handled in bug 753566.

Puppet manages a "memcache1-prod" entry in /etc/hosts, which correctly points to the local memcache node in SCL3 and PHX1. There is no synchronization between PHX1/SCL3 memcache, but this should pose no issues for normal memcache usage patterns. If it becomes a problem we can investigate Couchbase licensing, which theoretically implements this.

Please let me know if I should enable the new CACHES block. It looks like this:

#CACHES = {
#    'default': {
#        'BACKEND': 'django.core.cache.backends.memcached.MemcachedCache',
#        'LOCATION': [
#            'memcache1-prod:11211',
#        ],
#        'KEY_PREFIX': 'bedrock_prod'
#    }
#}

The current CACHES block looks like this:

CACHES = {
    'default': {
        'BACKEND': 'django.core.cache.backends.locmem.LocMemCache',
        'LOCATION': 'translations'
    }
}
Comment 30 • 12 years ago
We should enable it on stage (www.allizom.org) first. Please ping me before doing it since we don't want www.allizom.org to be broken as we are preparing for the Firefox 14 launch.
Assignee | Comment 31 • 12 years ago
This is deployed for stage, and we confirmed that keys are being set properly. I have PTO tomorrow and Monday, and can enable this for prod on Tuesday... or you can have someone else from webops take care of it. The new CACHES block is already in the settings/local.py file, just commented out. Simply remove the CACHES block for locmem and uncomment the one for memcache.

Note that this bug is purely about making memcache work and enabling it in Django... it's not about setting up any crons to write to memcache. There are other bugs for that (notably 759564 and 753566, at least).
Assignee | Comment 32 • 12 years ago
This is completed! So far everything seems fine.
Status: NEW → RESOLVED
Closed: 12 years ago
Resolution: --- → FIXED
Updated • 11 years ago
Component: Server Operations: Web Operations → WebOps: Other
Product: mozilla.org → Infrastructure & Operations

Updated • 5 years ago
Product: Infrastructure & Operations → Infrastructure & Operations Graveyard