Closed Bug 984624 Opened 11 years ago Closed 11 years ago

Explore options to optimize apache config for bedrock

Categories

(Infrastructure & Operations Graveyard :: WebOps: Product Delivery, task)

x86_64
Linux
task
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: jgmize, Assigned: nmaul)

Details

No description provided.
Assignee: server-ops → nmaul
Component: Server Operations → WebOps: Product Delivery
Product: mozilla.org → Infrastructure & Operations
QA Contact: shyam → nmaul
So far: We know we want to get away from threading in WSGIDaemonProcess and use only processes. There are libraries used by Bedrock (and other playdoh apps) that are not thread-safe and definitely cause problems. Doing this means using more RAM. We can only go so high on VMs, and I'd prefer not to explode the number of VMs we have to run, if we can help it.

Here is a collection of things we can do, roughly in order:

1) Set the stack-size parameter on WSGIDaemonProcess. Should trim a bit of fat off the mod_wsgi procs.
2) Switch PHP processing to fastcgi (or FPM, if we can get it to work). This is mainly to support #3, but also starts edging towards a design where the PHP and Python code could run on separate systems more easily.
3) Switch Apache to the Worker MPM. With PHP and Python both running in external processes, this should become totally safe, and cut down on Apache's own memory usage.
4) Tune the Apache Worker MPM "ThreadStackSize" parameter. Not usable until #3 is done, but should give gains similar to #1.

#2 and #3 have been tested on bedrock-stage, and seem to work fine. There is some work still to do with New Relic / PHP and PHP-APC, but the core seems fine. I got this working with fastcgi and the fastcgi-supplied "php-wrapper" script. I didn't have any luck with php-fpm, but that would be preferred. There are myriad guides on how to do it, all slightly different. :)

More radically:

5) Explore replacing Apache with Nginx. This will be tricky due to all the crazy .htaccess rules and whatnot, but if we can pull it off (for either PHP or Python), it might save us yet more RAM, due to Nginx's event-based model instead of Apache's process/thread model.
6) Explore replacing mod_wsgi with uWSGI or Gunicorn. I hold little hope this would significantly reduce our memory usage, *but* it would likely gain us other things, notably better status data on the wsgi processes and faster restart times.
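As a rough sketch of what items 1 and 4 could look like in the Apache config: the daemon process group name `bedrock`, the process count, and the stack sizes below are placeholder assumptions, not values taken from this bug or from the production config. Both `stack-size` and `ThreadStackSize` are given in bytes.

```apache
# Hypothetical values throughout; tune against measured memory use.

# Item 1: processes only (threads=1), with a reduced per-thread stack
# for the threads mod_wsgi creates inside each daemon process.
WSGIDaemonProcess bedrock processes=8 threads=1 stack-size=524288
WSGIProcessGroup bedrock

# Item 4: only meaningful once Apache itself runs a threaded MPM (item 3).
<IfModule mpm_worker_module>
    ThreadStackSize 524288
</IfModule>
```

The idea in both cases is the same: on Linux the default thread stack is typically much larger than these apps need, so shrinking it reduces the memory each process reserves.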
I should note, we can stop at pretty much any point along that list (or other things I didn't list) once we've reduced memory usage to the point where we can eliminate threading in the WSGI layer. :)
Need to test if PHX1 can handle the full load by disabling (or lowering the weight of) SCL3 in Dynect.
Test successful! PHX1 was able to take the load without much issue. Load averages did go up, but not unacceptably so. There was a slight spike in response time at first (as measured by New Relic), but within a couple of minutes this went away.

I'm actually just going to stop here without making any changes at all. We seem to have found a happy medium where we can use multi-process without multi-threading, and still have sufficient RAM to make everything work nicely. Items #2 and #3 are still interesting, but I'd rather stick with the known scenario for the time being.
Status: NEW → RESOLVED
Closed: 11 years ago
Resolution: --- → FIXED
Product: Infrastructure & Operations → Infrastructure & Operations Graveyard