Closed Bug 970763 Opened 11 years ago Closed 11 years ago

Key error tracebacks in Sync server storage 1.5.1-2

Categories

(Cloud Services Graveyard :: Server: Sync, defect, P3)

defect

Tracking

(Not tracked)

VERIFIED FIXED

People

(Reporter: bobm, Assigned: rfkelly)

References

Details

(Whiteboard: [qa+])

Attachments

(1 file, 1 obsolete file)

Trace backs like the following have been showing up in the production sync.err log: 014-02-07 09:29:02,249 CRITI [root][Dummy-54] Uncaught exception while processing request: DELETE https://sync-2-us-east-1.sync.services.mozilla.com/1.5/XXXX/storage/tabs/XXXXXXXXXXXX File "/media/ephemeral0/server-syncstorage/lib/python2.6/site-packages/mozsvc/tweens.py", line 70, in log_uncaught_exceptions_tween return handler(request) File "/media/ephemeral0/server-syncstorage/lib/python2.6/site-packages/mozsvc/tweens.py", line 36, in catch_backend_errors_tween return handler(request) File "/media/ephemeral0/server-syncstorage/lib/python2.6/site-packages/pyramid/tweens.py", line 21, in excview_tween response = handler(request) File "/media/ephemeral0/server-syncstorage/lib/python2.6/site-packages/pyramid/router.py", line 163, in handle_request response = view_callable(context, request) File "/media/ephemeral0/server-syncstorage/lib/python2.6/site-packages/pyramid/config/views.py", line 596, in __call__ return view(context, request) File "/media/ephemeral0/server-syncstorag e/lib/python2.6/site-packages/pyramid/config/views.py", line 329, in attr_view return view(context, request) File "/media/ephemeral0/server-syncstorage/lib/python2.6/site-packages/pyramid/config/views.py", line 305, in predicate_wrapper return view(context, request) File "/media/ephemeral0/server-syncstorage/lib/python2.6/site-packages/pyramid/config/views.py", line 245, in _secured_view return view(context, request) File "/media/ephemeral0/server-syncstorage/lib/python2.6/site-packages/pyramid/config/views.py", line 377, in rendered_view context) File "/media/ephemeral0/server-syncstorage/lib/python2.6/site-packages/pyramid/renderers.py", line 416, in render_view return self.render_to_response(response, system, request=request) File "/media/ephemeral0/server-syncstorage/lib/python2.6/site-packages/pyramid/renderers.py", line 439, in render_to_response result = self.render(value, system_values, request=request) File "/media/ephemeral0/server-syncstorage/lib/python2.6/site-p ackages/pyramid/renderers.py", line 435, in render result = renderer(value, system_values) File "/media/ephemeral0/server-syncstorage/lib/python2.6/site-packages/SyncStorage-1.5.0-py2.6.egg/syncstorage/views/renderers.py", line 22, in __call__ self.adjust_response(value, request, response) File "/media/ephemeral0/server-syncstorage/lib/python2.6/site-packages/SyncStorage-1.5.0-py2.6.egg/syncstorage/views/renderers.py", line 41, in adjust_response super(JsonRenderer, self).adjust_response(value, request, response) File "/media/ephemeral0/server-syncstorage/lib/python2.6/site-packages/SyncStorage-1.5.0-py2.6.egg/syncstorage/views/renderers.py", line 30, in adjust_response ts = get_resource_timestamp(request) File "/media/ephemeral0/server-syncstorage/lib/python2.6/site-packages/SyncStorage-1.5.0-py2.6.egg/syncstorage/views/util.py", line 86, in get_resource_timestamp return storage.get_item_timestamp(userid, collection, item) File "/media/ephemeral0/server-syncstorage/lib/python2.6 /site-packages/SyncStorage-1.5.0-py2.6.egg/syncstorage/storage/memcached.py", line 354, in get_item_timestamp return colmgr.get_item_timestamp(userid, item) File "/media/ephemeral0/server-syncstorage/lib/python2.6/site-packages/SyncStorage-1.5.0-py2.6.egg/syncstorage/storage/memcached.py", line 820, in get_item_timestamp File "/media/ephemeral0/server-syncstorage/lib/python2.6/site-packages/SyncStorage-1.5.0-py2.6.egg/syncstorage/storage/memcached.py", line 814, in get_item it through to the underlying store. It manages its own timestamps File "/media/ephemeral0/server-syncstorage/lib/python2.6/site-packages/SyncStorage-1.5.0-py2.6.egg/syncstorage/storage/memcached.py", line 780, in get_items except ValueError: File "/media/ephemeral0/server-syncstorage/lib/python2.6/site-packages/SyncStorage-1.5.0-py2.6.egg/syncstorage/storage/memcached.py", line 776, in <genexpr> # Note that we defaulted it to zero above. File "/media/ephemeral0/server-syncstorage/lib/python2.6/site-packages/SyncS torage-1.5.0-py2.6.egg/syncstorage/storage/memcached.py", line 767, in <genexpr> bsos = list(bsos) <type 'exceptions.KeyError'>
Hardware: x86 → All
Ugh, more under-tested edge-cases for the memcached storage layer. This patch adds two new test runs to the functional test suite: one with the memcached layer where all collections are cache-only, and one with the memcached layer where all collections are write-through-cache. It also fixes four issues uncovered by the extra testing, including the one causing the above traceback: * get_items() would erroring out if you asked it for a non-existing item id * get_items() would not accept an "ids=" filter from calling code * "payload" was not defaulted to zero if missing at upload time * filtering for expired ttls was not working correctly
Attachment #8373829 - Flags: review?(telliott)
Comment on attachment 8373829 [details] [diff] [review] sync15-test-all-the-memcached-things.diff Review of attachment 8373829 [details] [diff] [review]: ----------------------------------------------------------------- ::: syncstorage/storage/memcached.py @@ +698,5 @@ > except KeyError: > num_created += 1 > + # Set default payload on newly-created items. > + if "payload" not in bso: > + bso["payload"] = "" This seems a little alarming. If there's a bso without a payload, that implies an update (which further implies the previous existence of an item). I don't feel like we should be caching something that's almost certainly erroneous.
> If there's a bso without a payload, that implies an update (which further implies the previous existence of an item). Not necessarily. Per spec for the "PUT" operation: """ If the target BSO does not exist, then fields that are not provided in the request body will be set to their default value by the server. """ IIRC we explicitly called this out for 1.5, but it was implicit in the behaviour of the 1.1 servers. Allowing the payload to default to an empty string makes several operations much simpler, in particular our use of "UPDATE ON DUPLICATE KEY" for bulk updates. So this is basically just bringing the memcached backend into line with what the SQL backend already does.
Right, but this is a key error on a cache. That should imply that we don't have the original record handy there and should either always fall through (until the next time it gets written) or grab the data from the backend to populate it correctly. Or am I misunderstanding that flow there?
Basically, I want to avoid the scenario where we have a record in the db, but not in the cache, an update gets called, and the cache subsequently has a live record with an empty payload.
> That should imply that we don't have the original record handy there There's no original record in this case, it's a brand new record that happens to be uploaded with no payload. We may be talking across each other here, let's chat about it IRL later today.
Whiteboard: [qa+]
So, we chatted about this one. The concern is that a record with an empty payload that isn't in cache is dangerous if there's been a cache failure. It would mask the actual result in the DB. *If* the client is never expected to send a record with an empty payload as the first time that record shows up (adding rnewman to verify that), we think the correct behavior is to blow away the writethrough memcache for that record type and let it repopulate from db. That's bad if empty-first-payload is a common semantic, but fine if it just happens occasionally.
Flags: needinfo?(rnewman)
Clients should never be sending an empty record *ever*, in reality. That part of the API only exists to allow for TTL extension. I would perhaps suggest: * If we get an empty payload * If it's not in the cache * Repopulate from DB * If the record wasn't in the DB, return a 400.
Flags: needinfo?(rnewman)
> Clients should never be sending an empty record *ever*, in reality. > That part of the API only exists to allow for TTL extension. Do we ever do TTL extension in practice? Maybe we can just yank it from the API entirely and insist that all uploads are full BSO objects, which sure simplify a not of stuff on the server.
Flags: needinfo?(rnewman)
Historical context: Bug 784599, which came about because of Bug 768663. I'm quite confident that Android Sync never does a refreshing-POST. Reading desktop code it appears that we don't currently do so there, either. Generally speaking, none of our records are so big that reuploading them is a problem.
Flags: needinfo?(rnewman)
Patch updated based on above discussion. For cache-only collections, we default to an empty payload if it's not specified on upload. For write-through-cache collections, we don't repopulate the cache after a write. This will cause it to be lazily refreshed on the next read request.
Attachment #8376937 - Flags: review?(telliott)
Depends on: 972190
Attachment #8373829 - Attachment is obsolete: true
Attachment #8373829 - Flags: review?(telliott)
Attachment #8376937 - Flags: review?(telliott) → review+
Worth noting: this keeps the partial-update semantics of the current protocol, including the ability to send records with an empty payload to bump their ttl.
Status: NEW → RESOLVED
Closed: 11 years ago
Resolution: --- → FIXED
Priority: -- → P3
Verified in code in by running the unit tests locally and against Stage.
Status: RESOLVED → VERIFIED
Product: Cloud Services → Cloud Services Graveyard
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: