Closed
Bug 970763
Opened 11 years ago
Closed 11 years ago
Key error tracebacks in Sync server storage 1.5.1-2
Categories
(Cloud Services Graveyard :: Server: Sync, defect, P3)
Cloud Services Graveyard
Server: Sync
Tracking
(Not tracked)
VERIFIED
FIXED
People
(Reporter: bobm, Assigned: rfkelly)
References
Details
(Whiteboard: [qa+])
Attachments
(1 file, 1 obsolete file)
|
9.67 KB,
patch
|
telliott
:
review+
|
Details | Diff | Splinter Review |
Trace backs like the following have been showing up in the production sync.err log:
014-02-07 09:29:02,249 CRITI [root][Dummy-54] Uncaught exception while processing request:
DELETE https://sync-2-us-east-1.sync.services.mozilla.com/1.5/XXXX/storage/tabs/XXXXXXXXXXXX
File "/media/ephemeral0/server-syncstorage/lib/python2.6/site-packages/mozsvc/tweens.py", line 70, in log_uncaught_exceptions_tween
return handler(request)
File "/media/ephemeral0/server-syncstorage/lib/python2.6/site-packages/mozsvc/tweens.py", line 36, in catch_backend_errors_tween
return handler(request)
File "/media/ephemeral0/server-syncstorage/lib/python2.6/site-packages/pyramid/tweens.py", line 21, in excview_tween
response = handler(request)
File "/media/ephemeral0/server-syncstorage/lib/python2.6/site-packages/pyramid/router.py", line 163, in handle_request
response = view_callable(context, request)
File "/media/ephemeral0/server-syncstorage/lib/python2.6/site-packages/pyramid/config/views.py", line 596, in __call__
return view(context, request)
File "/media/ephemeral0/server-syncstorag
e/lib/python2.6/site-packages/pyramid/config/views.py", line 329, in attr_view
return view(context, request)
File "/media/ephemeral0/server-syncstorage/lib/python2.6/site-packages/pyramid/config/views.py", line 305, in predicate_wrapper
return view(context, request)
File "/media/ephemeral0/server-syncstorage/lib/python2.6/site-packages/pyramid/config/views.py", line 245, in _secured_view
return view(context, request)
File "/media/ephemeral0/server-syncstorage/lib/python2.6/site-packages/pyramid/config/views.py", line 377, in rendered_view
context)
File "/media/ephemeral0/server-syncstorage/lib/python2.6/site-packages/pyramid/renderers.py", line 416, in render_view
return self.render_to_response(response, system, request=request)
File "/media/ephemeral0/server-syncstorage/lib/python2.6/site-packages/pyramid/renderers.py", line 439, in render_to_response
result = self.render(value, system_values, request=request)
File "/media/ephemeral0/server-syncstorage/lib/python2.6/site-p
ackages/pyramid/renderers.py", line 435, in render
result = renderer(value, system_values)
File "/media/ephemeral0/server-syncstorage/lib/python2.6/site-packages/SyncStorage-1.5.0-py2.6.egg/syncstorage/views/renderers.py", line 22, in __call__
self.adjust_response(value, request, response)
File "/media/ephemeral0/server-syncstorage/lib/python2.6/site-packages/SyncStorage-1.5.0-py2.6.egg/syncstorage/views/renderers.py", line 41, in adjust_response
super(JsonRenderer, self).adjust_response(value, request, response)
File "/media/ephemeral0/server-syncstorage/lib/python2.6/site-packages/SyncStorage-1.5.0-py2.6.egg/syncstorage/views/renderers.py", line 30, in adjust_response
ts = get_resource_timestamp(request)
File "/media/ephemeral0/server-syncstorage/lib/python2.6/site-packages/SyncStorage-1.5.0-py2.6.egg/syncstorage/views/util.py", line 86, in get_resource_timestamp
return storage.get_item_timestamp(userid, collection, item)
File "/media/ephemeral0/server-syncstorage/lib/python2.6
/site-packages/SyncStorage-1.5.0-py2.6.egg/syncstorage/storage/memcached.py", line 354, in get_item_timestamp
return colmgr.get_item_timestamp(userid, item)
File "/media/ephemeral0/server-syncstorage/lib/python2.6/site-packages/SyncStorage-1.5.0-py2.6.egg/syncstorage/storage/memcached.py", line 820, in get_item_timestamp
File "/media/ephemeral0/server-syncstorage/lib/python2.6/site-packages/SyncStorage-1.5.0-py2.6.egg/syncstorage/storage/memcached.py", line 814, in get_item
it through to the underlying store. It manages its own timestamps
File "/media/ephemeral0/server-syncstorage/lib/python2.6/site-packages/SyncStorage-1.5.0-py2.6.egg/syncstorage/storage/memcached.py", line 780, in get_items
except ValueError:
File "/media/ephemeral0/server-syncstorage/lib/python2.6/site-packages/SyncStorage-1.5.0-py2.6.egg/syncstorage/storage/memcached.py", line 776, in <genexpr>
# Note that we defaulted it to zero above.
File "/media/ephemeral0/server-syncstorage/lib/python2.6/site-packages/SyncS
torage-1.5.0-py2.6.egg/syncstorage/storage/memcached.py", line 767, in <genexpr>
bsos = list(bsos)
<type 'exceptions.KeyError'>
| Assignee | ||
Updated•11 years ago
|
Hardware: x86 → All
| Assignee | ||
Comment 1•11 years ago
|
||
Ugh, more under-tested edge-cases for the memcached storage layer.
This patch adds two new test runs to the functional test suite: one with the memcached layer where all collections are cache-only, and one with the memcached layer where all collections are write-through-cache. It also fixes four issues uncovered by the extra testing, including the one causing the above traceback:
* get_items() would erroring out if you asked it for a non-existing item id
* get_items() would not accept an "ids=" filter from calling code
* "payload" was not defaulted to zero if missing at upload time
* filtering for expired ttls was not working correctly
| Assignee | ||
Comment 2•11 years ago
|
||
Attachment #8373829 -
Flags: review?(telliott)
Comment 3•11 years ago
|
||
Comment on attachment 8373829 [details] [diff] [review]
sync15-test-all-the-memcached-things.diff
Review of attachment 8373829 [details] [diff] [review]:
-----------------------------------------------------------------
::: syncstorage/storage/memcached.py
@@ +698,5 @@
> except KeyError:
> num_created += 1
> + # Set default payload on newly-created items.
> + if "payload" not in bso:
> + bso["payload"] = ""
This seems a little alarming. If there's a bso without a payload, that implies an update (which further implies the previous existence of an item). I don't feel like we should be caching something that's almost certainly erroneous.
| Assignee | ||
Comment 4•11 years ago
|
||
> If there's a bso without a payload, that implies an update (which further implies the previous existence of an item).
Not necessarily. Per spec for the "PUT" operation:
"""
If the target BSO does not exist, then fields that are not provided in the request body will be set to their default value by the server.
"""
IIRC we explicitly called this out for 1.5, but it was implicit in the behaviour of the 1.1 servers. Allowing the payload to default to an empty string makes several operations much simpler, in particular our use of "UPDATE ON DUPLICATE KEY" for bulk updates.
So this is basically just bringing the memcached backend into line with what the SQL backend already does.
Comment 5•11 years ago
|
||
Right, but this is a key error on a cache. That should imply that we don't have the original record handy there and should either always fall through (until the next time it gets written) or grab the data from the backend to populate it correctly. Or am I misunderstanding that flow there?
Comment 6•11 years ago
|
||
Basically, I want to avoid the scenario where we have a record in the db, but not in the cache, an update gets called, and the cache subsequently has a live record with an empty payload.
| Assignee | ||
Comment 7•11 years ago
|
||
> That should imply that we don't have the original record handy there
There's no original record in this case, it's a brand new record that happens to be uploaded with no payload. We may be talking across each other here, let's chat about it IRL later today.
Updated•11 years ago
|
Whiteboard: [qa+]
Comment 8•11 years ago
|
||
So, we chatted about this one. The concern is that a record with an empty payload that isn't in cache is dangerous if there's been a cache failure. It would mask the actual result in the DB.
*If* the client is never expected to send a record with an empty payload as the first time that record shows up (adding rnewman to verify that), we think the correct behavior is to blow away the writethrough memcache for that record type and let it repopulate from db. That's bad if empty-first-payload is a common semantic, but fine if it just happens occasionally.
Flags: needinfo?(rnewman)
Comment 9•11 years ago
|
||
Clients should never be sending an empty record *ever*, in reality. That part of the API only exists to allow for TTL extension.
I would perhaps suggest:
* If we get an empty payload
* If it's not in the cache
* Repopulate from DB
* If the record wasn't in the DB, return a 400.
Flags: needinfo?(rnewman)
| Assignee | ||
Comment 10•11 years ago
|
||
> Clients should never be sending an empty record *ever*, in reality.
> That part of the API only exists to allow for TTL extension.
Do we ever do TTL extension in practice? Maybe we can just yank it from the API entirely and insist that all uploads are full BSO objects, which sure simplify a not of stuff on the server.
Flags: needinfo?(rnewman)
Comment 11•11 years ago
|
||
Historical context: Bug 784599, which came about because of Bug 768663.
I'm quite confident that Android Sync never does a refreshing-POST.
Reading desktop code it appears that we don't currently do so there, either.
Generally speaking, none of our records are so big that reuploading them is a problem.
Flags: needinfo?(rnewman)
| Assignee | ||
Comment 12•11 years ago
|
||
Patch updated based on above discussion. For cache-only collections, we default to an empty payload if it's not specified on upload. For write-through-cache collections, we don't repopulate the cache after a write. This will cause it to be lazily refreshed on the next read request.
Attachment #8376937 -
Flags: review?(telliott)
| Assignee | ||
Updated•11 years ago
|
Attachment #8373829 -
Attachment is obsolete: true
Attachment #8373829 -
Flags: review?(telliott)
Updated•11 years ago
|
Attachment #8376937 -
Flags: review?(telliott) → review+
| Assignee | ||
Comment 13•11 years ago
|
||
Worth noting: this keeps the partial-update semantics of the current protocol, including the ability to send records with an empty payload to bump their ttl.
| Assignee | ||
Comment 14•11 years ago
|
||
Status: NEW → RESOLVED
Closed: 11 years ago
Resolution: --- → FIXED
Updated•11 years ago
|
Priority: -- → P3
Comment 15•11 years ago
|
||
Verified in code in by running the unit tests locally and against Stage.
Status: RESOLVED → VERIFIED
Updated•2 years ago
|
Product: Cloud Services → Cloud Services Graveyard
You need to log in
before you can comment on or make changes to this bug.
Description
•