Closed
Bug 1002898
Opened 11 years ago
Closed 11 years ago
Normalize storage of node names in tokenserver db?
Categories
(Cloud Services Graveyard :: Server: Token, defect)
Cloud Services Graveyard
Server: Token
Tracking
(Not tracked)
RESOLVED
FIXED
People
(Reporter: rfkelly, Assigned: rfkelly)
References
Details
Attachments
(1 file)
|
8.60 KB,
patch
|
telliott
:
review+
|
Details | Diff | Splinter Review |
The addition of an integer-id-based index for nodes in Bug 988643 has left us with denormalized storage of name nodes in the tokenserver db. The "users" table has both an integer "nodeid" and a varchar "node" column to refer to the user's assigned node both by id and by name.
One one hand, this denormalization of node name is helpful because we don't have to do a join to look up the node name on each query. On the other, it's duplication of data which will bloat the clustered index for the "users" table, and a join onto the very small "nodes" table is probably very fast in practice.
We should consider dropping the varchar "node" column and looking up the node-name from the node id.
| Assignee | ||
Comment 1•11 years ago
|
||
Here's a quick stab at this. It updates the code to use a join when loading the users rows, and to stop reading/writing the "node" column in the "users" table. It also adds a migration to let the column be nullable. A future deployment could then safely drop the column.
One trick is that removing a node may produce orphaned rows in the "users" table - expired rows which have not been cleaned up yet, but which do not have a matching nodeid entry in the "nodes" table. Using an outer join means these rows report a node-name of None, which seems like a sensible solution and allows the row-purging script to easily skip over them.
Toby, Benson, any thoughts on whether this is actually a good idea?
Assignee: nobody → rfkelly
Attachment #8414187 -
Flags: review?(telliott)
Attachment #8414187 -
Flags: feedback?(bwong)
| Assignee | ||
Updated•11 years ago
|
Summary: Normalize storage of node names in tokenserver db → Normalize storage of node names in tokenserver db?
| Assignee | ||
Comment 2•11 years ago
|
||
Potential extension of this: slurp in the "down" and "backoff" columns from the nodes table and use them to generate errors for the client, to prevent a second round-trip if their node is known to be down.
Comment 3•11 years ago
|
||
(In reply to Ryan Kelly [:rfkelly] from comment #1)
> One trick is that removing a node may produce orphaned rows in the "users"
> table - expired rows which have not been cleaned up yet, but which do not
> have a matching nodeid entry in the "nodes" table. Using an outer join
> means these rows report a node-name of None, which seems like a sensible
> solution and allows the row-purging script to easily skip over them.
I don't have a problem using an outer join to find users with no actual node association.
I don't know the reasoning for deleting records from the nodes table vs marking them as "gone/decommissioned". Either way, I don't think either solution will affect regular performance much.
Comment 4•11 years ago
|
||
Comment on attachment 8414187 [details] [diff] [review]
ts-stop-using-node-column.diff
Review of attachment 8414187 [details] [diff] [review]:
-----------------------------------------------------------------
I like the cleanup effect on the code overall. One minor concern.
::: tokenserver/assignment/sqlnode/sql.py
@@ +305,3 @@
> if client_state in user['old_client_states']:
> raise BackendError('previously seen client-state string')
> # need to create a new record for new client_state.
It occurs to me that this is a minor attack surface. If I keep giving you certs with new generation numbers, I can create quite a lot of rows here. Now that it's an outer join, I'm sensitive to this possibility :)
Attachment #8414187 -
Flags: review?(telliott) → review+
| Assignee | ||
Comment 5•11 years ago
|
||
> It occurs to me that this is a minor attack surface.
> If I keep giving you certs with new generation numbers, I can create quite a lot of rows here.
Interesting. We do a "LIMIT 20" on the query, I wonder if mysql is smart enough to apply the limit before doing the join.
Updated•11 years ago
|
Attachment #8414187 -
Flags: feedback?(bwong)
| Assignee | ||
Comment 6•11 years ago
|
||
Committed in:
https://github.com/mozilla-services/tokenserver/commit/39afced940487a6a3f5d07bf5e4c2f844ec06239
Filed Bug 1008081 as a follow-up to remove the now-unused "node" column.
Status: NEW → RESOLVED
Closed: 11 years ago
Resolution: --- → FIXED
Updated•2 years ago
|
Product: Cloud Services → Cloud Services Graveyard
You need to log in
before you can comment on or make changes to this bug.
Description
•