Closed Bug 825766 Opened 12 years ago Closed 12 years ago

configure Mozpool Apache frontend to not limit connection count

Categories

(Infrastructure & Operations :: RelOps: General, task)

RESOLVED FIXED

People

(Reporter: dustin, Assigned: dustin)

I'm not entirely sure what's going on, but when using Mozpool via Apache, things move a bit slower. If I talk directly to the mozpool daemon, things are faster. Apache seems to add some delay - perhaps because its backend connection queue is full, or as a DDoS protection, or something. This is also causing some 502s for the mozharness script - see bug 825727.
I was able to replicate this with

while true; do curl http://mobile-imaging-002.p2.releng.scl1.mozilla.com/api/device/panda-0101/status/; date; done

It's very clear that it runs at full speed for about a half-dozen requests, then takes a few seconds for each subsequent request. I was able to "fix" this by reconfiguring Apache with `ProxyPass .. disablereuse=On`, but then I *reverted* this setting and it was still fixed. So there's still some mystery here.
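To make the slowdown in the loop above easier to see, each request can be timed individually. This is only a sketch: the function name `time_requests` and the default count of 20 are made up for illustration, and it relies on curl's `-w '%{time_total}'` write-out variable to report the duration of each request.

```shell
#!/bin/sh
# Hypothetical helper (not part of Mozpool): issue COUNT sequential requests
# to URL and print "<request number> <total time in seconds>" for each one.
time_requests() {
    url=$1
    count=${2:-20}   # arbitrary default, chosen for illustration
    i=1
    while [ "$i" -le "$count" ]; do
        # -s: no progress output; -o /dev/null: discard the response body;
        # -w '%{time_total}': print the total request time in seconds
        t=$(curl -s -o /dev/null -w '%{time_total}' "$url")
        echo "$i $t"
        i=$((i + 1))
    done
}
```

Calling `time_requests` with the status URL above and a count of 20 should show the first handful of requests completing quickly and later ones taking noticeably longer, if the behavior described here is present.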
OK, I can replicate this as follows. On a clean httpd, this runs all requests in the low hundreds of ms:

ab -n 2000 -c 10 http://mobile-imaging-002.p2.releng.scl1.mozilla.com/api/device/panda-0101/state/

but running

ab -n 2000 -c 20 http://mobile-imaging-002.p2.releng.scl1.mozilla.com/api/device/panda-0101/state/

will put Apache in a mode where some requests take >10s, and after that even the `-c 10` version will show these long requests.

NOTE: I used /state/ because it caches data locally and doesn't hit the DB for every request.

Adding 'disablereuse=On' to the proxy worker config prevents this terrible behavior. Instead, up to a "reasonable" concurrency (4), I get <30ms maximum. With high concurrency (20), the median is still in the low tens of ms, but with a few 3000ms requests. I assume that this is a 3s timeout in either Apache or web.py when the connection pool is exhausted.

Another fix is to add

SetEnv force-proxy-request-1.0 1

as suggested in http://httpd.apache.org/docs/2.2/mod/mod_proxy.html. This "fixes" the problem the same way disablereuse=On does. So, I'm guessing that web.py has a bug in its implementation of HTTP/1.1 and connection re-use.
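For reference, the two workarounds described above might look roughly like this in an Apache 2.2 config. This is a sketch under assumptions, not the contents of the attached patch: the `/api/` path prefix and the backend address `http://127.0.0.1:8010/` are hypothetical placeholders, and the fragment assumes mod_proxy, mod_proxy_http, and mod_env are loaded.

```apache
# Option A: do not reuse backend connections for this proxy worker.
# The path and backend URL are placeholders, not the real Mozpool settings.
ProxyPass /api/ http://127.0.0.1:8010/api/ disablereuse=On
ProxyPassReverse /api/ http://127.0.0.1:8010/api/

# Option B (alternative to A): make mod_proxy speak HTTP/1.0 to the backend,
# as described in the mod_proxy documentation.
# SetEnv force-proxy-request-1.0 1
```

Either option on its own was reported above to avoid the slow requests; both work by keeping Apache from reusing HTTP/1.1 connections to the backend.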
Attached patch bug825766.patch
Attachment #697170 - Flags: review?(jwatkins)
Attachment #697170 - Flags: review?(jwatkins) → review+
Status: NEW → RESOLVED
Closed: 12 years ago
Resolution: --- → FIXED
Component: Server Operations: RelEng → RelOps
Product: mozilla.org → Infrastructure & Operations