Bug 866568 (Closed) - Opened 11 years ago, Closed 11 years ago

Deploy server-whoami into staging environment

Categories

(Cloud Services :: Operations: Deployment Requests - DEPRECATED, task)

Platform: x86_64 Windows 7
Type: task
Priority: Not set
Severity: normal

Tracking

(Not tracked)

VERIFIED FIXED

People

(Reporter: rfkelly, Assigned: bobm)

References

Details

(Whiteboard: [qa+])

Attachments

(2 files)

To support our experiment in standing up storage nodes in AWS, we need to deploy the new server-whoami service into stage.  It should work similarly to other sync-related products, but will require new VMs, a custom config file, and so on.

Build command:

make build PYPI=http://pypi.build.mtv1.svc.mozilla.com/simple PYPIEXTRAS=http://pypi.build.mtv1.svc.mozilla.com/extras PYPISTRICT=1 SERVER_WHOAMI=rpm-1.0-1 SERVER_CORE=rpm-2.13-1 CHANNEL=prod RPM_CHANNEL=prod build_rpms


This is a new service, described in Bug 865936.  It requires a read-only connection to LDAP through which it will authenticate users.  No MySQL access, no memcache.  I will attach proposed /etc/sync/ config files to this bug, which are based on the syncstorage config files with the [storage] sections removed.

Nginx, gunicorn, etc. should be set up as they are for server-storage.  The gunicorn runline will need to reference the syncwhoami application rather than syncstorage, like so:

  exec /usr/bin/gunicorn -k gevent -w 4 -b 127.0.0.1:8000 syncwhoami.run:application


You can test that the server is running by visiting it in the browser; the root URL should serve an "It Works!" page.
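For a quick scripted check from a host itself (using gunicorn's bind address from the runline above), something like:

  curl -i http://127.0.0.1:8000/

should come back 200 with the "It Works!" body.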

We also need to make sure this service can be reached from whatever instances we stand up in AWS.  Preferably it would not be open to the whole internet, but that's not super important for stage/testing purposes.
Blocks: 866573
Our PyPI mirror is currently down, so the above build command won't work.  If it's still down come build time, remove the PYPI=, PYPIEXTRAS=, and PYPISTRICT= arguments and try again.
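That is, the fallback command would be:

  make build SERVER_WHOAMI=rpm-1.0-1 SERVER_CORE=rpm-2.13-1 CHANNEL=prod RPM_CHANNEL=prod build_rpms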
Depends on: 866576
It looks like our Ganeti VM host in mtv (vm2-*.mtv1.svc.mozilla.com) crashed some time ago.  I've power cycled both hosts, including the one running the PyPI mirror/build VM.
Whiteboard: [qa+]
Assignee: nobody → bobm
Status: NEW → ASSIGNED
Puppet issues resolved for the Server-Whoami VMs, and the hosts have been puppetized.
RPMs built.  Removed the "--allow-hosts" option in the Makefile and added "--force" to the build options to get it to build.  Will deploy and test in Stage.
:bobm nice to hear.
Add a note to this ticket with a block of time for tomorrow (Tuesday) when QA can jump on and try a simple load test...
Configuration added to puppet.  Just need to add sync::app::whoami to the node classification script (node-info.pl) for server-whoami hosts.
Added a whoami class to our puppet node classification script, and the whoami servers have been puppetized and configured to run.

Gunicorn starts as long as gevent isn't specified on the command line.  However, scrypt wasn't built into the RPM bundle, and the service won't start without it.  Going to look for a copy on r6.
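A quick way to check whether the module is present on a host:

  python -c "import scrypt"   # exits non-zero with an ImportError if it's missing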
Hold that thought.  It built.  Fixing RPM bundle and trying again.
Added the module, and it's still a problem.
(In reply to Bob Micheletto [:bobm] from comment #10)
> Hold that thought.  It built.  Fixing RPM bundle and trying again.

It shouldn't have, I didn't copy across the scrypt-rpm-building logic from the syncstorage tag.  I've done so now, tag rpm-1.0-2, please try again with this updated build command:

make build PYPI=http://pypi.build.mtv1.svc.mozilla.com/simple PYPIEXTRAS=http://pypi.build.mtv1.svc.mozilla.com/extras PYPISTRICT=1 SERVER_WHOAMI=rpm-1.0-2 SERVER_CORE=rpm-2.13-1 CHANNEL=prod RPM_CHANNEL=prod build_rpms
That fixed it.  It's now returning "It works!".  The configuration needs to be cleaned up on these hosts, and the ZLB configuration needs to be done before load tests can start.
:bobm ok thanks for the update.
:bobm and :telliott
So, we need to come up with a sound test schedule.
Today is OK for starting some load testing in Stage.
Friday is travel day.
Monday onward is also good.
I think Sunday evening/night would be a bit too difficult given Ryan's travel schedule.
Ideas?
Starting serious testing on Monday seems fine. It shouldn't take long, honestly - there's only one function to test!
ZLB configuration complete.  Initial load balance testing results in 200s for the service.  However, load testing is failing.
Hitting http://whoami.services.mozilla.com/whoami and providing my sync credentials via basic auth, I see what I know to be the correct node-assignment and what looks like a plausible userid.  So this seems to be working correctly.
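For reference, the equivalent command-line check (credentials elided):

  curl -u <username>:<password> http://whoami.services.mozilla.com/whoami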

BTW, should we make this one "stage-whoami.s.m.c" since it's talking to the stage ldap?
It would be good to change to stage-whoami, but it's okay for present testing.  The whoami servers are accessible in Stage now, and a load test resulted in entries like the following:

From the Nginx access log:
10.14.214.201 - cuser226164 [10/May/2013:17:32:43 -0700] "GET /whoami HTTP/1.1" 500 73 "-" "Python-urllib/2.6" XFF="-" TIME=0.021

From the Gunicorn application.log:
[2013/May/10:17:32:40 -0700] 77971e5daaeacc79df3f87feafe3a731
[2013/May/10:17:32:40 -0700] Uncaught exception while processing request:
GET /whoami
  File "/usr/lib/python2.6/site-packages/services/util.py", line 324, in __call__
    return self.app(environ, start_response)
  File "/usr/lib/python2.6/site-packages/paste/translogger.py", line 68, in __call__
    return self.application(environ, replacement_start_response)
  File "/usr/lib/python2.6/site-packages/webob/dec.py", line 147, in __call__
    resp = self.call_func(req, *args, **self.kwargs)
  File "/usr/lib/python2.6/site-packages/webob/dec.py", line 208, in call_func
    return self.func(req, *args, **kwargs)
  File "/usr/lib/python2.6/site-packages/services/baseapp.py", line 225, in __notified
    response = func(self, request)
  File "/usr/lib/python2.6/site-packages/services/baseapp.py", line 259, in __call__
    response = self._dispatch_request(request)
  File "/usr/lib/python2.6/site-packages/services/baseapp.py", line 315, in _dispatch_request
    self.auth.check(request, match)
  File "/usr/lib/python2.6/site-packages/services/wsgiauth.py", line 92, in check
    match.get('username'))
  File "/usr/lib/python2.6/site-packages/services/wsgiauth.py", line 194, in authenticate_user
    attrs)
  File "/usr/lib/python2.6/site-packages/syncwhoami/__init__.py", line 39, in _authenticate_user
    return self._orig_authenticate_user(user, credentials, attrs)
  File "/usr/lib/python2.6/site-packages/services/user/__init__.py", line 231, in wrapped_method
    return func(self, user, credentials, *args, **kwds)
  File "/usr/lib/python2.6/site-packages/services/user/mozilla_ldap.py", line 188, in authenticate_user
    user[attr] = result[attr][0]
<type 'exceptions.KeyError'>
KeyError('syncNode',)
(In reply to Bob Micheletto [:bobm] from comment #19)
>
> From the Gunicorn application.log:
> [2013/May/10:17:32:40 -0700] Uncaught exception while processing request:
> GET /whoami
>
> ..snip..
>
>     user[attr] = result[attr][0]
> <type 'exceptions.KeyError'>
> KeyError('syncNode',)

Ugh.  I bet the user accounts associated with the loadtest don't have a "syncNode" attribute set in LDAP.  These user accounts are like "cuser1" through "cuser99999" - Bob, can you please check one in LDAP and confirm whether it is missing the syncNode attribute?

We have two options:

  * assign them syncNode attributes, despite this not being necessary for the loadtest
  * fix server-whoami so it doesn't fail if syncNode is missing, e.g. by returning None rather than raising an error

Toby, thoughts?  This problem is specific to the loadtest user accounts, all real user accounts will have a "syncNode" attribute since we depend on it in production.
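Something like the following should show it (the stage LDAP host is a placeholder; the binduser DN is the one our services bind with):

  ldapsearch -x -H ldap://<stage-ldap-host> -D "uid=binduser,ou=logins,dc=mozilla" -W \
      -b "dc=mozilla" "(uid=cuser1)" syncNode

An entry coming back with no syncNode line would confirm the theory.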
(In reply to Ryan Kelly [:rfkelly] from comment #20)
>   * fix server-whoami so it doesn't fail if syncNode is missing, e.g. by
> returning None rather than raising an error

I implemented this, but I don't particularly like it.  We've got plenty of code that assumes the syncNode attribute will exist, and it's ugly to work around this assumption.
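For context, the workaround amounts to something like this (a minimal sketch, not the actual patch):

  # Tolerate a missing attribute instead of raising KeyError in the
  # LDAP result handling (cf. mozilla_ldap.py in the traceback above).
  for attr in attrs:
      if attr in result:
          user[attr] = result[attr][0]
      else:
          user[attr] = None  # e.g. loadtest accounts with no syncNode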

IMHO it would be better to fix the LDAP data by assigning an arbitrary syncNode.  It's safe to put any value in there, and not necessary to distribute the users across the stage storage nodes - the loadtest sends them to a random node regardless of their syncNode attribute.
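For a single account, that would look something like the following (hypothetical LDIF; host, DN form, and the node value are assumptions, and the real change would be scripted across all the cuser accounts):

  ldapmodify -x -H ldap://<stage-ldap-host> -D "uid=binduser,ou=logins,dc=mozilla" -W <<EOF
  dn: uid=cuser1,ou=users,dc=mozilla
  changetype: modify
  add: syncNode
  syncNode: sync1.stage.placeholder
  EOF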
A syncNode attribute with the value of '-' was added to uids cuser{1..1000000}.  The load test was restarted, and the application.log file on the AWS Sync node is now giving the following type of error:

[2013/May/11:15:50:21 -0400] 7892aabdfaf1c482754466283d336ded
[2013/May/11:15:50:21 -0400] Uncaught exception while processing request:
GET /1.1/cuser536756/info/collections
  File "/usr/lib/python2.6/site-packages/services/util.py", line 324, in __call__
    return self.app(environ, start_response)
  File "/usr/lib/python2.6/site-packages/paste/translogger.py", line 68, in __call__
    return self.application(environ, replacement_start_response)
  File "/usr/lib/python2.6/site-packages/webob/dec.py", line 147, in __call__
    resp = self.call_func(req, *args, **self.kwargs)
  File "/usr/lib/python2.6/site-packages/webob/dec.py", line 208, in call_func
    return self.func(req, *args, **kwargs)
  File "/usr/lib/python2.6/site-packages/services/baseapp.py", line 225, in __notified
    response = func(self, request)
  File "/usr/lib/python2.6/site-packages/services/baseapp.py", line 259, in __call__
    response = self._dispatch_request(request)
  File "/usr/lib/python2.6/site-packages/services/baseapp.py", line 315, in _dispatch_request
    self.auth.check(request, match)
  File "/usr/lib/python2.6/site-packages/services/wsgiauth.py", line 92, in check
    match.get('username'))
  File "/usr/lib/python2.6/site-packages/services/wsgiauth.py", line 194, in authenticate_user
    attrs)
  File "/usr/lib/python2.6/site-packages/services/user/__init__.py", line 231, in wrapped_method
    return func(self, user, credentials, *args, **kwds)
  File "/usr/lib/python2.6/site-packages/services/user/proxycache.py", line 172, in authenticate_user
    if not self._cache.create_user(**new_user_data):
<type 'exceptions.TypeError'>
TypeError('create_user() takes exactly 4 non-keyword arguments (3 given)',)
:bobm ok...dang...
So we need :rfkelly to look at this before I start any further testing in Stage.
Bleh, a stupid typo bug in ProxyCacheUser: Bug 871333.  I'll need to roll a new server-core release once the fix lands.
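For anyone puzzled by the message: it's the classic Python 2 signature-mismatch pattern.  A hypothetical illustration, not the actual ProxyCacheUser code:

  class Cache(object):
      def create_user(self, username, password, node, **extra):
          return True

  cache = Cache()
  # The dict being **-expanded lacks a key matching the required
  # positional arg ("node"), so Python 2 reports:
  # TypeError: create_user() takes exactly 4 non-keyword arguments (3 given)
  new_user_data = {"username": "cuser1", "password": "x", "syncNode": "-"}
  cache.create_user(**new_user_data)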
Depends on: 871333
The fix for this needs to go into the AWS storage nodes, per Bug 866573 Comment 18.  No need to roll out an updated server-core on the whoami nodes, but you can if you want to keep it all consistent:


make build PYPI=http://pypi.build.mtv1.svc.mozilla.com/simple PYPIEXTRAS=http://pypi.build.mtv1.svc.mozilla.com/extras PYPISTRICT=1 SERVER_WHOAMI=rpm-1.0-2 SERVER_CORE=rpm-2.13-3 CHANNEL=prod RPM_CHANNEL=prod build_rpms
The RPMs have been built and installed on the AWS storage node.  A load test was attempted, and it now appears that the nodes are truncating or not including the proper table information in the SQL queries.  

See following traceback:
[2013/May/13:18:07:31 -0400] f4f5f484a0553ef9728af981eed6447b
[2013/May/13:18:07:31 -0400] Uncaught exception while processing request:
GET /1.1/cuser622375/info/collections
  File "/usr/lib/python2.6/site-packages/services/util.py", line 324, in __call__
    return self.app(environ, start_response)
  File "/usr/lib/python2.6/site-packages/paste/translogger.py", line 68, in __call__
    return self.application(environ, replacement_start_response)
  File "/usr/lib/python2.6/site-packages/webob/dec.py", line 147, in __call__
    resp = self.call_func(req, *args, **self.kwargs)
  File "/usr/lib/python2.6/site-packages/webob/dec.py", line 208, in call_func
    return self.func(req, *args, **kwargs)
  File "/usr/lib/python2.6/site-packages/services/baseapp.py", line 225, in __notified
    response = func(self, request)
  File "/usr/lib/python2.6/site-packages/services/baseapp.py", line 259, in __call__
    response = self._dispatch_request(request)
  File "/usr/lib/python2.6/site-packages/services/baseapp.py", line 317, in _dispatch_request
    response = self._dispatch_request_with_match(request, match)
  File "/usr/lib/python2.6/site-packages/services/baseapp.py", line 345, in _dispatch_request_with_match
    result = function(request, **params)
  File "/usr/lib/python2.6/site-packages/metlog/decorators/base.py", line 154, in __call__
    return self._real_call(*args, **kwargs)
  File "/usr/lib/python2.6/site-packages/services/metrics.py", line 101, in metlog_call
    return self._fn(*args, **kwargs)
  File "/usr/lib/python2.6/site-packages/metlog/decorators/base.py", line 154, in __call__
    return self._real_call(*args, **kwargs)
  File "/usr/lib/python2.6/site-packages/metlog/decorators/stats.py", line 50, in metlog_call
    result = self._fn(*args, **kwargs)
  File "/usr/lib/python2.6/site-packages/metlog/decorators/base.py", line 154, in __call__
    return self._real_call(*args, **kwargs)
  File "/usr/lib/python2.6/site-packages/services/metrics.py", line 114, in metlog_call
    result = self._fn(*args, **kwargs)
  File "/usr/lib/python2.6/site-packages/syncstorage/controller.py", line 114, in get_collections
    collections = storage.get_collection_timestamps(user_id)
  File "/usr/lib/python2.6/site-packages/syncstorage/storage/sql.py", line 610, in get_collection_timestamps
    bigint2time(stamp)) for coll_id, stamp in res])
  File "/usr/lib/python2.6/site-packages/syncstorage/storage/sql.py", line 423, in _do_query_fetchall
    res = timed_safe_execute(self._engine, *args, **kwds)
  File "/usr/lib/python2.6/site-packages/metlog/decorators/base.py", line 154, in __call__
    return self._real_call(*args, **kwargs)
  File "/usr/lib/python2.6/site-packages/metlog/decorators/stats.py", line 36, in metlog_call
    return self._fn(*args, **kwargs)
  File "/usr/lib/python2.6/site-packages/services/util.py", line 388, in safe_execute
    return engine.execute(*args, **kwargs)
  File "/usr/lib64/python2.6/site-packages/sqlalchemy/engine/base.py", line 2446, in execute
    return connection.execute(statement, *multiparams, **params)
  File "/usr/lib64/python2.6/site-packages/sqlalchemy/engine/base.py", line 1449, in execute
    params)
  File "/usr/lib64/python2.6/site-packages/sqlalchemy/engine/base.py", line 1584, in _execute_clauseelement
    compiled_sql, distilled_params
  File "/usr/lib64/python2.6/site-packages/sqlalchemy/engine/base.py", line 1698, in _execute_context
    context)
  File "/usr/lib64/python2.6/site-packages/sqlalchemy/engine/base.py", line 1691, in _execute_context
    context)
  File "/usr/lib64/python2.6/site-packages/sqlalchemy/engine/default.py", line 331, in do_execute
    cursor.execute(statement, parameters)
  File "/usr/lib/python2.6/site-packages/pymysql/cursors.py", line 117, in execute
    self.errorhandler(self, exc, value)
  File "/usr/lib/python2.6/site-packages/pymysql/connections.py", line 189, in defaulterrorhandler
    raise errorclass, errorvalue
<class 'sqlalchemy.exc.ProgrammingError'>
ProgrammingError('(ProgrammingError) (1146, u"Table \'weave2.wbo\' doesn\'t exist")',)
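For reference, a quick check from the MySQL side (a sketch; assumes shell access to the node's database):

  mysql -h <db-host> -e "SHOW TABLES FROM weave2;"

If no wbo table shows up, the schema on the AWS node simply wasn't created the way the code expects.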
Depends on: 871842
New tag made with latest fixes, to be deployed to AWS storage nodes per Bug 866573 Comment 19.
Deployment was made. Test updates are going into https://bugzilla.mozilla.org/show_bug.cgi?id=866573
Per Bug 866573 Comment 25, under load we're seeing tracebacks in the server-whoami log that look like this:


[2013/May/16:00:57:32 -0700] 800498c7c59d984c1af0accacfb91796
[2013/May/16:00:57:32 -0700] Uncaught exception while processing request:
GET 
  File "/usr/lib/python2.6/site-packages/services/util.py", line 324, in __call__
    return self.app(environ, start_response)
  File "/usr/lib/python2.6/site-packages/paste/translogger.py", line 68, in __call__
    return self.application(environ, replacement_start_response)
  File "/usr/lib/python2.6/site-packages/webob/dec.py", line 147, in __call__
    resp = self.call_func(req, *args, **self.kwargs)
  File "/usr/lib/python2.6/site-packages/webob/dec.py", line 208, in call_func
    return self.func(req, *args, **kwargs)
  File "/usr/lib/python2.6/site-packages/services/baseapp.py", line 225, in __notified
    response = func(self, request)
  File "/usr/lib/python2.6/site-packages/services/baseapp.py", line 259, in __call__
    response = self._dispatch_request(request)
  File "/usr/lib/python2.6/site-packages/services/baseapp.py", line 303, in _dispatch_request
    match = self.mapper.routematch(url=request.path_info)
  File "/usr/lib/python2.6/site-packages/routes/mapper.py", line 688, in routematch
    raise RoutesException('URL or environ must be provided')
<class 'routes.util.RoutesException'>
RoutesException('URL or environ must be provided',)


That appears to be a GET request with no URL specified.  Unclear whether that's being generated by the ProxyUserCache code, or is some corruption in the vein of the "OST" bugs we've seen previously.  In either case, if it's an invalid request then it should be getting filtered out at some higher level.
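For reference, the exception is easy to provoke directly (a hypothetical repro against the routes library, not our app code):

  from routes import Mapper
  from routes.util import RoutesException

  mapper = Mapper()
  mapper.connect("whoami", "/whoami", controller="whoami")

  try:
      mapper.routematch(url=None)  # no URL and no environ
  except RoutesException as exc:
      print exc  # URL or environ must be provided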
server-whoami loadtest script is in Bug 873342.

Running this on a couple of client machines shows the server not even breaking a sweat.  So we need to try this from within AWS to see if that makes a difference.
OK, so once the review is done, we can make use of that script in the previous load test scenario and go from there...
Depends on: 873342
OK, I will work on this using the new script with :rfkelly Sunday evening PDT....
During the load test, we're seeing errors like the following in the LDAP log:
May 20 15:15:11 slave1 slapd2.4[1951]: conn=1593189 op=25 SRCH base="dc=mozilla" scope=2 deref=0 filter="(uid=cuser815097)"
May 20 15:15:11 slave1 slapd2.4[1951]: conn=1593189 op=25 SRCH attr=uidNumber
May 20 15:15:11 slave1 slapd2.4[1951]: conn=1593092 op=1 UNBIND
May 20 15:15:11 slave1 slapd2.4[1951]: conn=1593103 op=1 UNBIND
May 20 15:15:11 slave1 slapd2.4[1951]: conn=1593092 fd=136 closed
May 20 15:15:11 slave1 slapd2.4[1951]: conn=1593103 fd=185 closed
May 20 15:15:11 slave1 slapd2.4[1951]: conn=1593189 op=25 SEARCH RESULT tag=101 err=0 nentries=1 text=
May 20 15:15:11 slave1 slapd2.4[1951]: conn=1593114 op=15 UNBIND
May 20 15:15:11 slave1 slapd2.4[1951]: conn=1593114 fd=202 closed
May 20 15:15:11 slave1 slapd2.4[1951]: connection_read(202): no connection!
Corresponding log entries from the entire 1593114 connection:
May 20 15:15:10 slave1 slapd2.4[1951]: conn=1593114 fd=202 ACCEPT from IP=10.14.212.201:21951 (IP=0.0.0.0:389)
May 20 15:15:10 slave1 slapd2.4[1951]: conn=1593114 op=0 BIND dn="uid=binduser,ou=logins,dc=mozilla" method=128
May 20 15:15:10 slave1 slapd2.4[1951]: conn=1593114 op=0 BIND dn="uid=binduser,ou=logins,dc=mozilla" mech=SIMPLE ssf=0
May 20 15:15:10 slave1 slapd2.4[1951]: conn=1593114 op=0 RESULT tag=97 err=0 text=
May 20 15:15:10 slave1 slapd2.4[1951]: conn=1593114 op=1 SRCH base="dc=mozilla" scope=2 deref=0 filter="(uid=cuser268230)"
May 20 15:15:10 slave1 slapd2.4[1951]: conn=1593114 op=1 SRCH attr=uidNumber
May 20 15:15:10 slave1 slapd2.4[1951]: conn=1593114 op=1 SEARCH RESULT tag=101 err=0 nentries=1 text=
May 20 15:15:10 slave1 slapd2.4[1951]: conn=1593114 op=2 BIND anonymous mech=implicit ssf=0
May 20 15:15:10 slave1 slapd2.4[1951]: conn=1593114 op=2 BIND dn="uidNumber=268230,ou=users,dc=mozilla" method=128
May 20 15:15:10 slave1 slapd2.4[1951]: conn=1593114 op=2 BIND dn="uidNumber=268230,ou=users,dc=mozilla" mech=SIMPLE ssf=0
May 20 15:15:10 slave1 slapd2.4[1951]: conn=1593114 op=2 RESULT tag=97 err=0 text=
May 20 15:15:10 slave1 slapd2.4[1951]: conn=1593114 op=3 SRCH base="uidNumber=268230,ou=users,dc=mozilla" scope=0 deref=0 filter="(objectClass=*)"
May 20 15:15:10 slave1 slapd2.4[1951]: conn=1593114 op=3 SRCH attr=syncNode account-enabled
May 20 15:15:10 slave1 slapd2.4[1951]: conn=1593114 op=3 SEARCH RESULT tag=101 err=0 nentries=1 text=
May 20 15:15:10 slave1 slapd2.4[1951]: conn=1593114 op=4 BIND anonymous mech=implicit ssf=0
May 20 15:15:10 slave1 slapd2.4[1951]: conn=1593114 op=4 BIND dn="uid=binduser,ou=logins,dc=mozilla" method=128
May 20 15:15:10 slave1 slapd2.4[1951]: conn=1593114 op=4 BIND dn="uid=binduser,ou=logins,dc=mozilla" mech=SIMPLE ssf=0
May 20 15:15:10 slave1 slapd2.4[1951]: conn=1593114 op=4 RESULT tag=97 err=0 text=
May 20 15:15:10 slave1 slapd2.4[1951]: conn=1593114 op=5 SRCH base="dc=mozilla" scope=2 deref=0 filter="(uid=cuser445338)"
May 20 15:15:10 slave1 slapd2.4[1951]: conn=1593114 op=5 SRCH attr=uidNumber
May 20 15:15:10 slave1 slapd2.4[1951]: conn=1593114 op=5 SEARCH RESULT tag=101 err=0 nentries=1 text=
May 20 15:15:10 slave1 slapd2.4[1951]: conn=1593114 op=6 BIND anonymous mech=implicit ssf=0
May 20 15:15:10 slave1 slapd2.4[1951]: conn=1593114 op=6 BIND dn="uidNumber=445338,ou=users,dc=mozilla" method=128
May 20 15:15:10 slave1 slapd2.4[1951]: conn=1593114 op=6 BIND dn="uidNumber=445338,ou=users,dc=mozilla" mech=SIMPLE ssf=0
May 20 15:15:10 slave1 slapd2.4[1951]: conn=1593114 op=6 RESULT tag=97 err=0 text=
May 20 15:15:10 slave1 slapd2.4[1951]: conn=1593114 op=7 SRCH base="uidNumber=445338,ou=users,dc=mozilla" scope=0 deref=0 filter="(objectClass=*)"
May 20 15:15:10 slave1 slapd2.4[1951]: conn=1593114 op=7 SRCH attr=syncNode account-enabled
May 20 15:15:10 slave1 slapd2.4[1951]: conn=1593114 op=7 SEARCH RESULT tag=101 err=0 nentries=1 text=
May 20 15:15:10 slave1 slapd2.4[1951]: conn=1593114 op=8 BIND anonymous mech=implicit ssf=0
May 20 15:15:10 slave1 slapd2.4[1951]: conn=1593114 op=8 BIND dn="uid=binduser,ou=logins,dc=mozilla" method=128
May 20 15:15:10 slave1 slapd2.4[1951]: conn=1593114 op=8 BIND dn="uid=binduser,ou=logins,dc=mozilla" mech=SIMPLE ssf=0
May 20 15:15:10 slave1 slapd2.4[1951]: conn=1593114 op=8 RESULT tag=97 err=0 text=
May 20 15:15:10 slave1 slapd2.4[1951]: conn=1593114 op=9 SRCH base="dc=mozilla" scope=2 deref=0 filter="(uid=cuser222534)"
May 20 15:15:10 slave1 slapd2.4[1951]: conn=1593114 op=9 SRCH attr=uidNumber
May 20 15:15:10 slave1 slapd2.4[1951]: conn=1593114 op=9 SEARCH RESULT tag=101 err=0 nentries=1 text=
May 20 15:15:10 slave1 slapd2.4[1951]: conn=1593114 op=10 BIND anonymous mech=implicit ssf=0
May 20 15:15:10 slave1 slapd2.4[1951]: conn=1593114 op=10 BIND dn="uidNumber=222534,ou=users,dc=mozilla" method=128
May 20 15:15:10 slave1 slapd2.4[1951]: conn=1593114 op=10 BIND dn="uidNumber=222534,ou=users,dc=mozilla" mech=SIMPLE ssf=0
May 20 15:15:10 slave1 slapd2.4[1951]: conn=1593114 op=10 RESULT tag=97 err=0 text=
May 20 15:15:10 slave1 slapd2.4[1951]: conn=1593114 op=11 SRCH base="uidNumber=222534,ou=users,dc=mozilla" scope=0 deref=0 filter="(objectClass=*)"
May 20 15:15:10 slave1 slapd2.4[1951]: conn=1593114 op=11 SRCH attr=syncNode account-enabled
May 20 15:15:10 slave1 slapd2.4[1951]: conn=1593114 op=11 SEARCH RESULT tag=101 err=0 nentries=1 text=
May 20 15:15:10 slave1 slapd2.4[1951]: conn=1593114 op=12 BIND anonymous mech=implicit ssf=0
May 20 15:15:10 slave1 slapd2.4[1951]: conn=1593114 op=12 BIND dn="uid=binduser,ou=logins,dc=mozilla" method=128
May 20 15:15:10 slave1 slapd2.4[1951]: conn=1593114 op=12 BIND dn="uid=binduser,ou=logins,dc=mozilla" mech=SIMPLE ssf=0
May 20 15:15:10 slave1 slapd2.4[1951]: conn=1593114 op=12 RESULT tag=97 err=0 text=
May 20 15:15:10 slave1 slapd2.4[1951]: conn=1593114 op=13 SRCH base="dc=mozilla" scope=2 deref=0 filter="(uid=cuser751915)"
May 20 15:15:10 slave1 slapd2.4[1951]: conn=1593114 op=13 SRCH attr=uidNumber
May 20 15:15:10 slave1 slapd2.4[1951]: conn=1593114 op=13 SEARCH RESULT tag=101 err=0 nentries=1 text=
May 20 15:15:10 slave1 slapd2.4[1951]: conn=1593114 op=14 BIND anonymous mech=implicit ssf=0
May 20 15:15:10 slave1 slapd2.4[1951]: conn=1593114 op=14 BIND dn="uidNumber=751915,ou=users,dc=mozilla" method=128
May 20 15:15:10 slave1 slapd2.4[1951]: conn=1593114 op=14 RESULT tag=97 err=49 text=
May 20 15:15:11 slave1 slapd2.4[1951]: conn=1593114 op=15 UNBIND
May 20 15:15:11 slave1 slapd2.4[1951]: conn=1593114 fd=202 closed
May 20 15:15:11 slave1 slapd2.4[1951]: connection_read(202): no connection!
Given that Bug 866573 has identified the cause of the strange errors, and that this server stood up to a nice bit of load in isolation, I think it's safe to call this deployed and close it.  Bob and James, agree?
I agree.
If :bobm can RESOLVE it, I can VERIFY it.
Setting ticket to resolved.
Status: ASSIGNED → RESOLVED
Closed: 11 years ago
Resolution: --- → FIXED
OK. We have sign-off from QA, Dev, and Ops.
Status: RESOLVED → VERIFIED