Closed
Bug 988643
Opened 11 years ago
Closed 11 years ago
Add index for looking up assignment records by node name.
Categories
(Cloud Services Graveyard :: Server: Token, defect, P1)
Cloud Services Graveyard
Server: Token
Tracking
(Not tracked)
VERIFIED
FIXED
People
(Reporter: rfkelly, Assigned: rfkelly)
References
Details
(Whiteboard: [qa+])
Attachments
(3 files, 2 obsolete files)
4.27 KB,
patch
|
telliott
:
review+
|
Details | Diff | Splinter Review |
1.35 KB,
patch
|
telliott
:
review+
|
Details | Diff | Splinter Review |
8.17 KB,
patch
|
telliott
:
review+
|
Details | Diff | Splinter Review |
Per Bug 985145 Comment 26, we should add an index to simplify the "find all rows assigned to this node" operation that's required when doing node migrations.
Sadly, the "node" column is currently a VARCHAR(64) rather than an integer reference into the "nodes" table. It may be worth refactoring this, but the alterations it would imply on the live data in prod database scare me...
Updated•11 years ago
|
Whiteboard: [qa+]
Comment 1•11 years ago
|
||
Gimme the SQL to run and I'll make it happen. :) We're using MySQL 5.6 and it has online DDL (alter table) operations.
Assignee | ||
Comment 2•11 years ago
|
||
Let's not normalize the node name out into a separate table for now, unless profiling proves we need to. It will pessimize the read case into requiring a join or a second query.
This patch to WIMMS adds the index declaration and some interface methods for using it. I can them roll a tokenserver management script on top of these methods.
The corresponding DDL for modifying the production databases is:
CREATE INDEX node_idx ON users (service, node);
Adding this will increase our write load on the tokenserver db, we'll have to see how it fares in practise. We could consider indexing only a prefix of the node name in order to decrease the size of the index.
Attachment #8398334 -
Flags: review?(telliott)
Comment 3•11 years ago
|
||
Comment on attachment 8398334 [details] [diff] [review]
wimms-clear-node-assignment.diff
Review of attachment 8398334 [details] [diff] [review]:
-----------------------------------------------------------------
::: wimms/sql.py
@@ +440,5 @@
> res.close()
>
> + def remove_node(self, service, node, timestamp=None):
> + """Remove definition for a node."""
> + self.unassign_node(service, node, timestamp)
Race condition? Shouldn't we delete the node, then unassign the users? Otherwise you run into a danger that someone gets assigned in between.
Assignee | ||
Comment 4•11 years ago
|
||
> Shouldn't we delete the node, then unassign the users?
> Otherwise you run into a danger that someone gets assigned in between.
Yes, this makes sense, I'll switch those around. I was implicitly assuming that the node would be markd as "down" before doing this, but there's no such facility in this patch so I shouldn't assume that.
Assignee | ||
Comment 5•11 years ago
|
||
Updated with the ordering reversed, otherwise unchanged.
Attachment #8398334 -
Attachment is obsolete: true
Attachment #8398334 -
Flags: review?(telliott)
Attachment #8402889 -
Flags: review?(telliott)
Updated•11 years ago
|
Attachment #8402889 -
Flags: review?(telliott) → review+
Assignee | ||
Comment 6•11 years ago
|
||
Status: NEW → RESOLVED
Closed: 11 years ago
Resolution: --- → FIXED
Assignee | ||
Comment 7•11 years ago
|
||
Re-opening this as :mostlygeek has expressed strong concerns about the size of this new index, and would prefer referencing the node by its id
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Comment 8•11 years ago
|
||
Yes, I think it'll just take up too much RAM to keep the index in memory. I would much prefer the index is as small as possible w/ two ints.
Assignee | ||
Comment 9•11 years ago
|
||
Note that this index is not used by the webapp itself, only be backend node-management scripts that have to operate on "all users assigned to a node". Which might not make any difference in the scheme of things, but worth pointing out explicitly.
Updated•11 years ago
|
Priority: -- → P1
Assignee | ||
Comment 10•11 years ago
|
||
OK, here's another take on this based on :mostlygeek concerns about the previous approach. It adds a "nodeid" column to the users table and indexes on it for use by the backend cleanup scripts. The acutal look-up-the-node-for-a-user query still uses the string-base "node" column. Once we've got the new column successfully deployed we can do another migration to drop the old one.
The migration is the trickiest part here, as we need to populate the column for existing users. There may be a small window where newly-created user records get a mismatched nodeid/nodename pair; we can work around this by either (1) running the same update again after we have finished the deployment, or (2) eating some downtime and disabling the service during the update.
Thoughts?
Attachment #8405899 -
Flags: review?(telliott)
Attachment #8405899 -
Flags: feedback?(bwong)
Assignee | ||
Comment 11•11 years ago
|
||
Actually, let's take a more nuanced and downtime-avoiding approach here. I'm obsoleting the previous patch with a two-part process.
The first is a migration that just adds the "nodeid" column. It doesn't do anything to it or with it, so all nodeids will initially be NULL.
The second patch updates the code to actually write the "nodeid" column, as well as to use it for the desired node-clearing operation. It includes a migration to set the nodeid column for existing users and then add the index on it.
The idea is that we can deploy as follows:
1) deploy and apply the first migration, so the column is available for use
2) deploy the second patch but don't apply the migration; get it running on
all webheads and correctly writing to the nodeid column.
3) apply the second migration, which will fixup the existing data and created
the index.
We must do any node migrations between (2) and (3) but other than that this should be a safe update.
Attachment #8405899 -
Attachment is obsolete: true
Attachment #8405899 -
Flags: review?(telliott)
Attachment #8405899 -
Flags: feedback?(bwong)
Attachment #8405910 -
Flags: review?(telliott)
Attachment #8405910 -
Flags: feedback?(bwong)
Assignee | ||
Comment 12•11 years ago
|
||
Attaching part 2.
Additional improvement: there's no need to index on (service, nodeid) since each nodeid is specific to a service already. We can just index the nodeid column in isolation.
We can consider dropping the string "node" column once this is live, but it gets a bit fiddly....
Attachment #8405911 -
Flags: review?(telliott)
Updated•11 years ago
|
Attachment #8405910 -
Flags: review?(telliott) → review+
Updated•11 years ago
|
Attachment #8405911 -
Flags: review?(telliott) → review+
Assignee | ||
Comment 13•11 years ago
|
||
https://github.com/mozilla-services/tokenserver/commit/a6f6c95e11c29be85bb512a63a3bc60f34e3dec1
https://github.com/mozilla-services/tokenserver/commit/71a590f85a27eff6ee4cddfe25aaeb33d210baa2
:jbonacci in order to verify this, we need to get them all the way to prod and see that the new index works ok. See Bug 996915 and an as-yet-unfiled deployment bug for the second step of the procedure.
Status: REOPENED → RESOLVED
Closed: 11 years ago → 11 years ago
Resolution: --- → FIXED
Assignee | ||
Comment 14•11 years ago
|
||
Comment on attachment 8405910 [details] [diff] [review]
ts-add-nodeid-column.diff
chatted briefly with benson IRL, clearing feedback flag
Attachment #8405910 -
Flags: feedback?(bwong)
Comment 15•11 years ago
|
||
Verified via load testing in Stage: bug 1002295
Status: RESOLVED → VERIFIED
Updated•2 years ago
|
Product: Cloud Services → Cloud Services Graveyard
You need to log in
before you can comment on or make changes to this bug.
Description
•