Closed Bug 693893 Opened 13 years ago Closed 12 years ago

Key error u'meta' when PUTs with X-If-Unmodified-Since when memcache 'meta' key is not yet set

Categories

(Cloud Services Graveyard :: Server: Sync, defect)

defect
Not set
normal

Tracking

(Not tracked)

VERIFIED FIXED

People

(Reporter: philikon, Assigned: rfkelly)

References

Details

(Whiteboard: [sync 2.0][qa?])

Attachments

(2 files)

Starting with bug 684798 we're using X-If-Unmodified-Since to add some transactionality for uploads to meta/global by passing in the timestamp from the previous DELETE. This works fine when running a local instance of server-full, but it breaks on stage and prod, returning a 503 that's a converted 500:

1318384894745	Sync.Resource	TRACE	HTTP Header x-if-unmodified-since: 1318384895.55
1318384894745	Sync.Resource	TRACE	HTTP Header authorization: ***** (suppressed)
1318384894746	Sync.Resource	DEBUG	PUT Length: 78
1318384894746	Sync.Resource	TRACE	PUT Body: {"payload":"{\"syncID\":\"g3BTSjZgFxx9\",\"storageVersion\":5}","id":"global"}
1318384895017	Sync.Resource	TRACE	onStartRequest called for channel [xpconnect wrapped nsIRequest @ 0x12d2b1ac0 (native @ 0x129191a00)].
1318384895018	Sync.Resource	TRACE	onStartRequest: PUT https://stage-sync71.services.mozilla.com/1.1/jbr7wdjnlkqikno5wajt6c7tzianicqc/storage/meta/global
1318384895019	Sync.Resource	TRACE	Request status is 0
1318384895019	Sync.Resource	TRACE	Channel for PUT https://stage-sync71.services.mozilla.com/1.1/jbr7wdjnlkqikno5wajt6c7tzianicqc/storage/meta/global: isSuccessCode(0)? true, isSuccessCode(0)? true
1318384895019	Sync.Resource	TRACE	Channel: flags = 640, URI = https://stage-sync71.services.mozilla.com/1.1/jbr7wdjnlkqikno5wajt6c7tzianicqc/storage/meta/global, HTTP success? false
1318384895019	Sync.Resource	TRACE	In _onComplete. Error is null.
1318384895019	Sync.Resource	TRACE	Channel: [xpconnect wrapped (nsISupports, nsIChannel, nsIRequest, nsIHttpChannel, nsIUploadChannel) @ 0x14acde2f0 (native @ 0x129191a00)]
1318384895019	Sync.Resource	TRACE	Action: PUT
1318384895020	Sync.Resource	TRACE	Status: 503
1318384895020	Sync.Resource	TRACE	Success: false
1318384895020	Sync.Resource	DEBUG	mesg: PUT fail 503 https://stage-sync71.services.mozilla.com/1.1/jbr7wdjnlkqikno5wajt6c7tzianicqc/storage/meta/global
1318384895020	Sync.Resource	DEBUG	PUT fail 503 https://stage-sync71.services.mozilla.com/1.1/jbr7wdjnlkqikno5wajt6c7tzianicqc/storage/meta/global
1318384895020	Sync.Resource	TRACE	PUT body: "server error: converted 500"
stacktrace identifies storage server failure to handle memcache contents, moving to backend.  are any XIUS tests failing right now?

details follow: list of python26 rpms installed, python stacktrace, contents of memcache.

python26-1.0-1.el6.x86_64
python26-amqplib-0.6.1-1.noarch
python26-cef-0.2-1.noarch
python26-followtail-0.1-1.noarch
python26-gevent-0.13.6-1.x86_64
python26-greenlet-0.3.1-1.x86_64
python26-ldap-2.3.13pre-1moz.x86_64
python26-mako-0.4.1-1.noarch
python26-markupsafe-0.12-1.x86_64
python26-memcached-1.47-1.noarch
python26-mysql-python-1.2.3-1.x86_64
python26-paste-1.7.5.1-1.noarch
python26-pastedeploy-1.3.4-1.noarch
python26-pastescript-1.7.3-1.noarch
python26-pylibmc-1.1.1-1.x86_64
python26-pymysql-0.4.2-1.noarch
python26-pymysql_sa-1.0-1.noarch
python26-recaptcha-client-1.0.6-1.noarch
python26-routes-1.12.3-1.noarch
python26-services-2.6.3-1.noarch
python26-setuptools-1.0-1.el6.x86_64
python26-simplejson-2.1.6-1.x86_64
python26-sqlalchemy-0.6.6-1.noarch
python26-syncstorage-1.10-1.noarch
python26-twisted-10.2.0-1.x86_64
python26-txamqp-0.4-1.noarch
python26-webob-1.0.7-1.noarch
python26-wsgiproxy-0.2.2-1.noarch
python26-zope.interface-3.6.1-1.x86_64

2011-10-11 19:01:35,879 ERROR [syncserver] cf01c183bebe9467def5ceb566527c89
2011-10-11 19:01:35,880 ERROR [syncserver] Traceback (most recent call last):
  File "/usr/lib/python2.6/site-packages/services/util.py", line 495, in __call__
    return self.app(environ, start_response)
  File "/usr/lib/python2.6/site-packages/paste/translogger.py", line 68, in __call__
    return self.application(environ, replacement_start_response)
  File "/usr/lib/python2.6/site-packages/webob/dec.py", line 147, in __call__
    resp = self.call_func(req, *args, **self.kwargs)
  File "/usr/lib/python2.6/site-packages/webob/dec.py", line 208, in call_func
    return self.func(req, *args, **kwargs)
  File "/usr/lib/python2.6/site-packages/services/baseapp.py", line 191, in __notified
    response = func(self, request)
  File "/usr/lib/python2.6/site-packages/services/baseapp.py", line 270, in __call__
    result = function(request, **params)
  File "/usr/lib/python2.6/site-packages/syncstorage/controller.py", line 272, in set_item
    if self._was_modified(request, user_id, collection_name):
  File "/usr/lib/python2.6/site-packages/syncstorage/controller.py", line 80, in _was_modified
    collection_name)
  File "/usr/lib/python2.6/site-packages/syncstorage/storage/memcachedsql.py", line 376, in get_collection_max_timestamp
    return stamps[collection_name]
KeyError: u'meta'

# echo 1100048 | python26 549513.py localhost:11222
1100048:tabs	None
1100048:meta:global	None
1100048:stamps	{}
1100048:size	None
Component: Operations → Server: Sync
QA Contact: operations → sync-server
Revised description, since philikon created a new acccount with no data, and the error appears to be "failure to check for empty meta in memcache".
Summary: PUTs with X-If-Unmodified-Since fail with a 503 (converted 500) → Key error u'meta' when PUTs with X-If-Unmodified-Since when memcache 'meta' key is not yet set
Thank you, Richard!

Some background for the client's behaviour here: This is during an initial sync. This could happen for instance after a node reallocation. The first thing we do is DELETE the user's entire storage. We then take the timestamp resulting from that and PUT a new meta/global record with X-If-Unmodified-Since referring to that timestamp. This is to ensure that there wasn't another client racing us that uploaded a new meta/global record before we could.

I expect the DELETE to update the collection's internal timestamp just the same as a POST/PUT would. Using it as a reference point in uploads via X-If-Unmodified-Since would be vital for improving race condition behaviour. See also bug 692700.

Full client log excerpt showing first the DELETE, then the PUT to meta/global:

1318384894487	Sync.Resource	TRACE	HTTP Header x-confirm-delete: 1
1318384894487	Sync.Resource	TRACE	HTTP Header authorization: ***** (suppressed)
1318384894737	Sync.Resource	TRACE	onStartRequest called for channel [xpconnect wrapped nsIRequest @ 0x14acd42d0 (native @ 0x12752d600)].
1318384894738	Sync.Resource	TRACE	onStartRequest: DELETE https://stage-sync71.services.mozilla.com/1.1/jbr7wdjnlkqikno5wajt6c7tzianicqc/storage
1318384894739	Sync.Resource	TRACE	Request status is 0
1318384894740	Sync.Resource	TRACE	Channel for DELETE https://stage-sync71.services.mozilla.com/1.1/jbr7wdjnlkqikno5wajt6c7tzianicqc/storage: isSuccessCode(0)? true, isSuccessCode(0)? true
1318384894740	Sync.Resource	TRACE	Channel: flags = 640, URI = https://stage-sync71.services.mozilla.com/1.1/jbr7wdjnlkqikno5wajt6c7tzianicqc/storage, HTTP success? true
1318384894740	Sync.Resource	TRACE	In _onComplete. Error is null.
1318384894740	Sync.Resource	TRACE	Channel: [xpconnect wrapped (nsISupports, nsIChannel, nsIRequest, nsIHttpChannel) @ 0x127e6ee60 (native @ 0x12752d600)]
1318384894740	Sync.Resource	TRACE	Action: DELETE
1318384894741	Sync.Resource	TRACE	Status: 200
1318384894741	Sync.Resource	TRACE	Success: true
1318384894741	Sync.Resource	DEBUG	mesg: DELETE success 200 https://stage-sync71.services.mozilla.com/1.1/jbr7wdjnlkqikno5wajt6c7tzianicqc/storage
1318384894741	Sync.Resource	DEBUG	DELETE success 200 https://stage-sync71.services.mozilla.com/1.1/jbr7wdjnlkqikno5wajt6c7tzianicqc/storage
1318384894741	Sync.Resource	TRACE	DELETE body: true
1318384894742	Sync.Resource	TRACE	Processing response headers.
1318384894742	Sync.Service	TRACE	Event: weave:service:wipe-server:finish
1318384894743	Sync.Service	DEBUG	New metadata record: {"syncID":"g3BTSjZgFxx9","storageVersion":5}
1318384894744	Sync.Resource	TRACE	In _doRequest.
1318384894745	Sync.Resource	TRACE	HTTP Header x-if-unmodified-since: 1318384895.55
1318384894745	Sync.Resource	TRACE	HTTP Header authorization: ***** (suppressed)
1318384894746	Sync.Resource	DEBUG	PUT Length: 78
1318384894746	Sync.Resource	TRACE	PUT Body: {"payload":"{\"syncID\":\"g3BTSjZgFxx9\",\"storageVersion\":5}","id":"global"}
1318384895017	Sync.Resource	TRACE	onStartRequest called for channel [xpconnect wrapped nsIRequest @ 0x12d2b1ac0 (native @ 0x129191a00)].
1318384895018	Sync.Resource	TRACE	onStartRequest: PUT https://stage-sync71.services.mozilla.com/1.1/jbr7wdjnlkqikno5wajt6c7tzianicqc/storage/meta/global
1318384895019	Sync.Resource	TRACE	Request status is 0
1318384895019	Sync.Resource	TRACE	Channel for PUT https://stage-sync71.services.mozilla.com/1.1/jbr7wdjnlkqikno5wajt6c7tzianicqc/storage/meta/global: isSuccessCode(0)? true, isSuccessCode(0)? true
1318384895019	Sync.Resource	TRACE	Channel: flags = 640, URI = https://stage-sync71.services.mozilla.com/1.1/jbr7wdjnlkqikno5wajt6c7tzianicqc/storage/meta/global, HTTP success? false
1318384895019	Sync.Resource	TRACE	In _onComplete. Error is null.
1318384895019	Sync.Resource	TRACE	Channel: [xpconnect wrapped (nsISupports, nsIChannel, nsIRequest, nsIHttpChannel, nsIUploadChannel) @ 0x14acde2f0 (native @ 0x129191a00)]
1318384895019	Sync.Resource	TRACE	Action: PUT
1318384895020	Sync.Resource	TRACE	Status: 503
1318384895020	Sync.Resource	TRACE	Success: false
1318384895020	Sync.Resource	DEBUG	mesg: PUT fail 503 https://stage-sync71.services.mozilla.com/1.1/jbr7wdjnlkqikno5wajt6c7tzianicqc/storage/meta/global
1318384895020	Sync.Resource	DEBUG	PUT fail 503 https://stage-sync71.services.mozilla.com/1.1/jbr7wdjnlkqikno5wajt6c7tzianicqc/storage/meta/global
1318384895020	Sync.Resource	TRACE	PUT body: "server error: converted 500"
(To be clear, no Firefox client out there use the XIUS header yet. I was hoping to use it for bug 684798 but I guess I can't now.)
(In reply to Richard Soderberg [:atoll] from comment #1)
> stacktrace identifies storage server failure to handle memcache contents,
> moving to backend.  are any XIUS tests failing right now?

You say "tests"...

  server-storage $ ack -i "x-if-unmodified"
  syncstorage/controller.py
  73:        """Checks the X-If-Unmodified-Since header."""
  74:        unmodified = request.headers.get('X-If-Unmodified-Since')
  
  syncstorage/tests/functional/test_storage.py
  617:                     headers=[('X-If-Unmodified-Since', str(ts))],
Assignee: nobody → rkelly
Whiteboard: [sync 2.0]
Whiteboard: [sync 2.0] → [sync 2.0][qa?]
Blocks: 727761
Indeed, this was a simple case of mis-handling the modified time on empty collections.
Attachment #610799 - Flags: review?(telliott)
Blocks: 740695
Attachment #610799 - Flags: review?(telliott) → review+
https://github.com/mozilla-services/server-syncstorage/commit/c3ac711b786a9d3005f51a3017475cf8a2f46883
Status: NEW → RESOLVED
Closed: 12 years ago
Resolution: --- → FIXED
Oops, that patch gives a spurious test failure when memcached is not running.  New patch fixes it to bail out if memcached is not running.
Attachment #611057 - Flags: review?(telliott)
Attachment #611057 - Flags: review?(telliott) → review+
Verified through code and "make test" with 4/16 build.
Going to mark this as Verified.
Status: RESOLVED → VERIFIED
Backported to 1.10 release branch: http://hg.mozilla.org/services/server-storage/rev/ee4d972a68fa
Product: Cloud Services → Cloud Services Graveyard
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: