Closed
Bug 786569
Opened 13 years ago
Closed 13 years ago
sync: server-storage stage deploy: server_storage -> 1.13-7, server_core -> 2.10-7
Categories
(Cloud Services :: Operations: Deployment Requests - DEPRECATED, task)
x86_64
Linux
Tracking
(Not tracked)
VERIFIED
FIXED
People
(Reporter: rfkelly, Unassigned)
Details
(Whiteboard: [qa+])
Please deploy server-storage 1.13-7 and server-core 2.10-7 to stage sync server environments. Build command:
make build PYPI=http://pypi.build.mtv1.svc.mozilla.com/simple PYPIEXTRAS=http://pypi.build.mtv1.svc.mozilla.com/extras PYPISTRICT=1 SERVER_STORAGE=rpm-1.13-7 SERVER_CORE=rpm-2.10-7 CHANNEL=prod RPM_CHANNEL=prod build_rpms
This includes an experimental umemcache-based backend so that we can try to do some basic connection pooling, and is otherwise unchanged from 1.13-6.
Please deploy it with the MozSvcWorker gunicorn worker enabled as described in Bug 786479 Comment 1.
Comment 1•13 years ago
Built and deployed.
Status: NEW → RESOLVED
Closed: 13 years ago
Resolution: --- → FIXED
Comment 2•13 years ago (Reporter)
Light load is running; there are a couple of failures, which I will dig into.
Interestingly, the greenlet-blocking detector is reporting lots of instances of blocking in os-level functions like this:
  File "/usr/lib/python2.6/site-packages/gunicorn/arbiter.py", line 442, in spawn_worker
    worker.init_process()
  File "/usr/lib/python2.6/site-packages/services/gunicorn_worker.py", line 93, in init_process
    super(MozSvcWorker, self).init_process()
  File "/usr/lib/python2.6/site-packages/gunicorn/workers/ggevent.py", line 105, in init_process
    super(GeventWorker, self).init_process()
  File "/usr/lib/python2.6/site-packages/gunicorn/workers/base.py", line 102, in init_process
    self.run()
  File "/usr/lib/python2.6/site-packages/gunicorn/workers/ggevent.py", line 77, in run
    self.notify()
  File "/usr/lib/python2.6/site-packages/gunicorn/workers/base.py", line 66, in notify
    self.tmp.notify()
  File "/usr/lib/python2.6/site-packages/gunicorn/workers/workertmp.py", line 34, in notify
    os.fchmod(self._tmp.fileno(), self.spinner)
There is not much we can do about these, and they don't seem to be causing any problems; we may just have to raise the checking interval enough to exclude them.
Comment 3•13 years ago (Reporter)
Testing with no connection pooling produced very similar results to the ones encountered with python-memcached, as reported in Bug 786536 Comment 2. This is as expected since it would be using approximately the same number of connections to couchbase.
In addition I am seeing some requests time out after approx 40 seconds. This is new, and at a first guess I'd say it's probably a bug in the connection manager I implemented on top of umemcache.
Next I will try some connection pooling to see if I can remove the "proxy downstream timeout" errors.
Comment 4•13 years ago (Reporter)
With things limited to one memcache connection per worker process, this version has sustained 10 minutes of loadtest, serving ~300 qps with no errors. That's a very good start...
Comment 5•13 years ago (Reporter)
With three connections per worker process I also see no errors. With ten connections per worker process, I see regular occurrences of the "proxy downstream timeout" error.
This seems to support our hypothesis that the loadtest errors from yesterday were caused by too many concurrent connections to couchbase, especially since the graphs of couchbase activity are pretty much identical between these runs.
(This seems like a very low number of total connections: 10 connections x 4 workers x 4 webheads. But it's hard to argue with the results.)
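For reference, the idea of capping connections per worker process can be sketched roughly as below. This is only an illustration, not the connection manager that shipped in 1.13-7: BoundedPool, reserve, and total_connections are hypothetical names, the connection factory is a stand-in for whatever opens a umemcache connection, and the code uses the standard-library queue module (under a gevent worker it would need gevent's monkey-patching or an equivalent gevent-aware queue). The last function just spells out the connection arithmetic from this comment.

```python
import queue
from contextlib import contextmanager


class BoundedPool(object):
    """Hypothetical pool that hands out at most max_size connections at once."""

    def __init__(self, factory, max_size, timeout=None):
        self._factory = factory      # callable that opens a new connection
        self._timeout = timeout      # seconds to wait for a free slot (None = forever)
        self._slots = queue.LifoQueue(maxsize=max_size)
        for _ in range(max_size):
            self._slots.put(None)    # None marks a free slot with no connection yet

    @contextmanager
    def reserve(self):
        # Blocks while all slots are in use; raises queue.Empty on timeout.
        conn = self._slots.get(timeout=self._timeout)
        try:
            if conn is None:
                conn = self._factory()   # connect lazily, on first use of the slot
            yield conn
        except Exception:
            conn = None                  # drop a possibly broken connection
            raise
        finally:
            self._slots.put(conn)        # always give the slot back


def total_connections(per_worker, workers_per_webhead, webheads):
    """Upper bound on simultaneous connections across the whole tier."""
    return per_worker * workers_per_webhead * webheads


# The three configurations tried above, with 4 workers on each of 4 webheads.
totals = dict((n, total_connections(n, 4, 4)) for n in (1, 3, 10))
```

With this shape, the per-worker limit is a single constructor argument, so trying 1, 3, or 10 connections per worker is a config change rather than a code change. By the arithmetic above, those three settings cap the tier at 16, 48, and 160 connections respectively.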
Comment 6•13 years ago
Reopening this one....
:bobm and :rfkelly: to be consistent, we should be deploying to Dev as well...
Dev Sync webheads (sync{2..5}.web.mtv1.dev) are running the following:
python26-services-2.10-6
python26-syncstorage-1.13-6
or older (inconsistent)!
Also, looks like sync1.web.mtv1.dev.svc is down?
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Comment 7•13 years ago
Alternately (rather than follow the ts/aitc model), I can open a separate bug for Sync Dev Env...
Whiteboard: [qa+]
Comment 8•13 years ago (Reporter)
IIRC the sync dev env is currently blocked due to some dependency problems, pending a rebuild with a new OS. In any case, I'm about to prep a new deploy request, so there's no point hassling :bobm to get this one into dev :-)
Comment 9•13 years ago
Sounds good. Let's go with a fresh ticket then.
Status: REOPENED → RESOLVED
Closed: 13 years ago → 13 years ago
Resolution: --- → FIXED
Updated•13 years ago
Status: RESOLVED → VERIFIED