AFAICT the tokenserver doesn't update the current_load after it assigns a node. This means that all nodes appear to be unloaded, and all assignments wind up going to the same node due to our deterministic selection algorithm. The old node-assignment server updated this field after each node assignment: http://hg.mozilla.org/services/server-node-assignment/file/1a94108c67ad/mozsvcnodes/storage.py#l183 Should we do the same, or do we plan to update this in some other way e.g. with a periodic maintenance script?
To the best of my knowledge, the node should get a +1 any time it does an update. Among other things, it means we can do round-robin during particularly busy parts. If that's missing, it seems like an oversight.
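For illustration, here is a minimal sketch of why the +1 matters. It is not the tokenserver code: the node names, the dict layout, and the `assign_node` helper are all hypothetical, and the tie-break by node name just stands in for whatever deterministic ordering the real query uses.

```python
def assign_node(nodes, update_load=True):
    """Pick the node with the lowest load/capacity ratio (ties broken by name)."""
    node = min(nodes, key=lambda n: (n["current_load"] / n["capacity"], n["node"]))
    if update_load:
        # The +1 under discussion: record the assignment so the next
        # selection sees this node as slightly more loaded.
        node["current_load"] += 1
    return node["node"]


def make_nodes():
    # Hypothetical pool of three empty nodes with equal capacity.
    return [
        {"node": name, "current_load": 0, "capacity": 100}
        for name in ("node-a", "node-b", "node-c")
    ]


# Without the update, every node keeps looking unloaded, so the
# deterministic selection returns the same node every time.
stale = make_nodes()
without_update = [assign_node(stale, update_load=False) for _ in range(6)]

# With the update, assignments rotate through the pool.
fresh = make_nodes()
with_update = [assign_node(fresh) for _ in range(6)]

print(without_update)
print(with_update)
```

With equal capacities the second list cycles through the nodes in order, which is the round-robin behaviour described above.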
Note to self: we should also be using log() for the order-by, per Bug 688098.
Hmm, it does seem to be in there though, need to dig deeper: https://github.com/mozilla-services/wimms/blob/master/wimms/sql.py#L324
Created attachment 8366375: tokenserver_load_update.diff
This functionality stopped working during the recent refactor because we didn't have a test for it. I've added one and fixed two underlying bugs:
* we were updating the counts without translating the service name into a service id, hence the query wasn't hitting any rows
* sqlite was doing integer arithmetic, truncating load/capacity results to zero and hence returning things in row order
This patch fixes them and adds in the log() hack for MySQL while we're there.
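Both bugs are easy to reproduce in isolation. The sketch below uses an in-memory sqlite database with a made-up schema (the `services` and `nodes` tables, their columns, and the `sync-1.0` service name are assumptions for illustration, not the actual wimms schema), and it uses a CAST as one possible way to force floating-point division rather than whatever the patch itself does.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE services (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute(
    "CREATE TABLE nodes "
    "(service INTEGER, node TEXT, current_load INTEGER, capacity INTEGER)"
)
conn.execute("INSERT INTO services VALUES (1, 'sync-1.0')")
conn.executemany(
    "INSERT INTO nodes VALUES (1, ?, ?, 100)",
    [("node-a", 90), ("node-b", 10), ("node-c", 50)],
)

UPDATE = (
    "UPDATE nodes SET current_load = current_load + 1 "
    "WHERE service = ? AND node = ?"
)

# Bug 1: passing the service *name* where the column holds the service *id*
# matches no rows, so the load is silently never updated.
rows_with_name = conn.execute(UPDATE, ("sync-1.0", "node-b")).rowcount

# Fix: translate the name into an id first.
service_id = conn.execute(
    "SELECT id FROM services WHERE name = ?", ("sync-1.0",)
).fetchone()[0]
rows_with_id = conn.execute(UPDATE, (service_id, "node-b")).rowcount

# Bug 2: INTEGER / INTEGER truncates in sqlite, so every ratio is 0
# and ordering by it cannot distinguish the nodes.
truncated = [
    row[0] for row in conn.execute("SELECT current_load / capacity FROM nodes")
]

# Fix (one option): force real division before ordering.
least_loaded = conn.execute(
    "SELECT node FROM nodes "
    "ORDER BY CAST(current_load AS REAL) / capacity LIMIT 1"
).fetchone()[0]

print(rows_with_name, rows_with_id, truncated, least_loaded)
```

Checking the affected row count after the UPDATE is the kind of assertion that would have caught the first bug in a test.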
Verified in code and in TS load testing...