Closed
Bug 988134
Opened 11 years ago
Closed 11 years ago
X-Client-State tracking prevents proper node-reassignment
Categories
(Cloud Services Graveyard :: Server: Token, defect)
Cloud Services Graveyard
Server: Token
Tracking
(Not tracked)
VERIFIED
FIXED
People
(Reporter: rfkelly, Assigned: rfkelly)
References
Details
(Whiteboard: [qa+])
Attachments
(2 files)
3.55 KB,
patch
|
telliott
:
review+
|
Details | Diff | Splinter Review |
4.03 KB,
patch
|
telliott
:
review+
|
Details | Diff | Splinter Review |
(I suspect this to be the server-side problem underlying Bug 985145)
The tokenserver database currently has a unique constraint on (email, service, client_state) in order to help enforce the "client-state values must never be re-used" rule. Unfortunately this will interact badly with node-reassignments.
There are two ways we could try to node-reassign a user:
1. Change the value of the "node" column in their current assignment row.
(Or equivalently, delete their current assignment row).
2. Mark their existing assignment row as "replaced", so that the next
time they ask for a token a new assignment row will be created.
Doing (1) means losing the history of where that user's data has been, which means we can't properly clean it up for Bug 971907 and friends.
But doing (2) means that the next time they show up, we must create them a new row with their current (unchanged) value of X-Client-State. This will fail due to the unique index.
Bleh.
I think the right thing to do is (2) but without the unique constraint on client_state. We can still check for old values by slurping in the old rows and doing a lookup, and indeed we already do this in a number of places. But it may have some interesting implications for concurrency/race-conditions which I need to dig in to.
Assignee | ||
Comment 1•11 years ago
|
||
This patch removes the unique index on client-state, and the corresponding places in the code where this might previously have raised an IntegrityError.
This opens up the possibility of two clients racing and creating two different active rows for a single user. This seems unlikely in practice and I don't think it will actually *break* anything. Future attempts to read the user's active record will consistently select whichever one sorts highest during the select. So the only outcome of the race would be that one client syncs to the wrong place for a while.
An alternative is to take a transaction, and check that there's no existing active record before inserting a new one. At first glance it doesn't seem worth it to me.
Attachment #8396908 -
Flags: review?(telliott)
Assignee | ||
Comment 2•11 years ago
|
||
And an Ops/QA note to self: this needs corresponding DB migration before it can go live.
No longer blocks: 988137
Updated•11 years ago
|
Whiteboard: [qa+]
Comment 3•11 years ago
|
||
Comment on attachment 8396908 [details] [diff] [review]
ts-no-unique-index-on-client-state.diff
Review of attachment 8396908 [details] [diff] [review]:
-----------------------------------------------------------------
We could catch this in the get_node query - if 2 nodes come back, we trigger a subprocess that removes one of them.
Attachment #8396908 -
Flags: review?(telliott) → review+
Assignee | ||
Comment 4•11 years ago
|
||
> We could catch this in the get_node query - if 2 nodes come back,
> we trigger a subprocess that removes one of them.
This is a very good idea, we iterate through the old records in get_user() already, so it costs basically nothing to check that they're marked as replaced and update them if not.
Assignee | ||
Comment 5•11 years ago
|
||
Here is a quick follow-up to recover from racy record creation by marking them as replaced at runtime.
Note that _GET_USER_RECORDS will always fetch in a fully consistent order, since it sorts by (created_at, uid).
Attachment #8397456 -
Flags: review?(telliott)
Updated•11 years ago
|
Attachment #8397456 -
Flags: review?(telliott) → review+
Assignee | ||
Comment 6•11 years ago
|
||
https://github.com/mozilla-services/wimms/commit/ee59b851369dd873a15eb0e35a27874164683baa
https://github.com/mozilla-services/wimms/commit/e1dada719b6e0b4d078bfaffe871fc6964b572ce
Resolving fixed, but let's not verify this until we get it rolled to prod with appropriate schema changes.
Status: NEW → RESOLVED
Closed: 11 years ago
Resolution: --- → FIXED
Comment 7•11 years ago
|
||
Actually, we need to verify this in Stage as part of the fix for bug 985145
Comment 8•11 years ago
|
||
I am wondering if we can add this to the Verifier deployment and test: bug 996763
Just to catch two important changes in one bit of work?...
Assignee | ||
Comment 9•11 years ago
|
||
This is now in prod, I think we can call it verified.
Updated•2 years ago
|
Product: Cloud Services → Cloud Services Graveyard
You need to log in
before you can comment on or make changes to this bug.
Description
•