Bug 882525: Grab some item-count metrics from production dbs
Status: RESOLVED FIXED (opened 12 years ago, closed 12 years ago)
Categories: Cloud Services :: Operations: Miscellaneous, task
Tracking: Not tracked
People: Reporter: rfkelly, Assigned: bobm
Details: Whiteboard: [qa-]
Attachments: 5 files
For PiCL planning purposes, I'm interested in the distribution of the per-user item count for a couple of collections. Can you please pick a couple of representative databases and grab me the count of items for each user in the "clients", "bookmarks", "passwords" and "history" collections?
Something like the following:
echo 'SELECT COUNT(*) FROM <wboX> WHERE collection=<CID> GROUP BY username ORDER BY 1 desc' | mysql | uniq -c
The collection ids of interest are:
bookmarks: 7
passwords: 10
history: 4
clients: 100
Sadly, the "clients" collection doesn't have a fixed id, thanks to Bug 688623, but it should almost always be 100. Please double-check that by doing a quick count on the "collections" table:
SELECT collectionid, count(collectionid) FROM collections WHERE name = 'clients' GROUP BY collectionid ORDER BY 2 DESC;
This should show collection 100 as the top result by far.
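The `uniq -c` step in the pipeline above collapses the sorted per-user counts into "number of users, item count" pairs. A minimal sketch of that same step in Python, assuming the per-user counts have already been read into a list (the function names here are hypothetical, not part of any existing tooling):

```python
from collections import Counter


def count_distribution(per_user_counts):
    """Map each item count to the number of users with exactly that many items."""
    return Counter(per_user_counts)


def format_like_uniq(dist):
    """Render the distribution roughly the way `uniq -c` would print it.

    Highest item count comes first, mirroring ORDER BY 1 DESC in the query.
    """
    return [
        f"{users:7d} {items}"
        for items, users in sorted(dist.items(), reverse=True)
    ]
```

Read this way, an output line of the form `1 N` means that one user has N items, not that N users have one item.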
Comment 1 • Reporter • 12 years ago
Also of interest is the distribution of individual record sizes across all users. Something like the following query:
SELECT payload_size, COUNT(payload_size) FROM <wboX> WHERE collection=<CID> GROUP BY payload_size;
(Obviously we'll want to break these down into a much smaller number of buckets for manual digestion, but I'm happy to crunch that raw data myself - unless there's a way to get MySQL to do it for us more efficiently...?)
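One way the raw `(payload_size, count)` rows could be reduced to a smaller number of buckets is sketched below. The fixed 1024-byte bucket width and the function name are assumptions made for illustration; the thread does not specify a bucketing scheme.

```python
from collections import defaultdict


def bucket_payload_sizes(rows, bucket_width=1024):
    """Sum record counts into fixed-width payload-size buckets.

    rows: iterable of (payload_size, record_count) pairs, as returned by the
    GROUP BY payload_size query. The result maps each bucket's lower bound
    to the total number of records whose size falls in that bucket.
    """
    buckets = defaultdict(int)
    for payload_size, record_count in rows:
        lower_bound = (payload_size // bucket_width) * bucket_width
        buckets[lower_bound] += record_count
    return dict(sorted(buckets.items()))
```

The same grouping could presumably be pushed into MySQL by grouping on an expression such as `FLOOR(payload_size / 1024)` instead of on `payload_size` itself, though that variant is untested here.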
Updated • 12 years ago
Whiteboard: [qa-]
Comment 2 • Assignee • 12 years ago
Attachment #765209 - Flags: review?(rfkelly)
Updated • Assignee • 12 years ago
Assignee: nobody → bobm
Status: NEW → ASSIGNED
Comment 3 • Reporter • 12 years ago
Thanks Bob! I'm a bit apprehensive, though, as this seems like too little data for each node.
For example, in the Bookmarks counts data for sync1-collections.txt, I see 18 lines with a count of 1 each. Does this mean there are only 18 users on this DB who have bookmark records? That seems low to me. Is it just a subset of the users on each database? Or are we sharding *really* well at this level?
(It's entirely possible that I stuffed up with my little command-line uniq piping thing there and it has mangled the data too.)
Comment 4 • Assignee • 12 years ago
Everything I grabbed is from a single table on a single DB. Perhaps that isn't sufficient data to extrapolate from. I'll run it through a loop and grab all of the tables for a database, unless you'd prefer the entire DB host.
Comment 5 • Reporter • 12 years ago
(In reply to Bob Micheletto [:bobm] from comment #4)
> Everything I grabbed is from a single table on a single DB.
Wow. I guess I just didn't realize quite how much we were sharding these users out!
> Perhaps that isn't sufficient data to extrapolate from. I'll run it through a
> loop and grab all of the tables for a database
All tables from a single database would be awesome - IIRC that should equate to 100 times more data, which seems sufficient. Thanks mate.
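The loop over all tables discussed above might look something like the following sketch, which only builds the query strings. It assumes the tables in one database are named `wbo0` through `wbo99`; the `wbo` prefix comes from the query in the description, but the numbering scheme and the default of 100 tables are assumptions based on the "100 times more data" estimate, and the function name is hypothetical.

```python
def build_count_queries(collection_id, num_tables=100, table_prefix="wbo"):
    """Build one per-user item-count query for each table in a database.

    Assumes tables are named <table_prefix>0 .. <table_prefix>(num_tables - 1).
    """
    return [
        f"SELECT COUNT(*) FROM {table_prefix}{n} WHERE collection={collection_id} "
        "GROUP BY username ORDER BY 1 DESC"
        for n in range(num_tables)
    ]
```

Each generated string could then be fed to `mysql` in the same way as the single-table command in the description.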
Comment 6 • Reporter • 12 years ago
Comment on attachment 765209 [details]
Requested stats information
r+ after the discussion above; it will be good to have the same results across all databases on a host.
Attachment #765209 - Flags: review?(rfkelly) → review+
Comment 7 • Assignee • 12 years ago
Attachment #776605 - Flags: review?(rfkelly)
Comment 8 • Assignee • 12 years ago
Just noticed that the clients collection is missing, collecting that now.
Comment 9 • Reporter • 12 years ago
Comment on attachment 776605 [details]
Total collection counts for sync40.db
Awesome, thanks Bob. I will crunch these numbers into buckets and add them to the user modelling.
Any thoughts on the payload-size metrics described in Comment 1? (Which probably should have been a separate bug rather than a pile-on to this one...)
Attachment #776605 - Flags: review?(rfkelly) → review+
Comment 10 • Assignee • 12 years ago
Raw clients collection counts from sync64.db in PHX1.
Attachment #779382 - Flags: review?(rfkelly)
Comment 11 • Assignee • 12 years ago
Comment 12 • 12 years ago
used the sync40 numbers to get:
collection    mean  median  min     max  mode
history       3107    1294    1  299300     1
bookmarks      463     137    1  234656    12
passwords       66      31    1    3594     1
clients          0       1    1      74     1
Comment 13 • 12 years ago
sorry, last one should be addons
Comment 14 • 12 years ago
(In reply to Toby Elliott [:telliott] from comment #13)
> sorry, last one should be addons
or no, clients, thanks to that old bug
Revised values (with better rounding):
collection       mean  median  min     max  mode
history       3110.59    1294    1  299300     1
bookmarks      468.51     137    1  234656    12
passwords       69.19      31    1    3594     1
clients          1.46       1    1      74     1
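For reference, the summary columns above (mean, median, min, max, mode) can be computed from a list of per-user item counts along the following lines. This is only a sketch under stated assumptions: it is not the script that produced the figures in this bug, the function name is hypothetical, and picking the smallest value when several are tied for the mode is a choice made here.

```python
import statistics
from collections import Counter


def summarize(per_user_counts):
    """Return mean, median, min, max and mode for a list of per-user item counts."""
    counts = sorted(per_user_counts)
    # most_common keeps first-seen order among ties, so with sorted input
    # the smallest of any tied values is reported as the mode.
    mode_value, _ = Counter(counts).most_common(1)[0]
    return {
        "mean": round(statistics.mean(counts), 2),
        "median": statistics.median(counts),
        "min": counts[0],
        "max": counts[-1],
        "mode": mode_value,
    }
```

A mean far above the median, as in the history and bookmarks rows, is what a small number of very heavy users would produce.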
Comment 15 • Assignee • 12 years ago
Final collections payload information.
Comment 16 • Assignee • 12 years ago
Closing this ticket, please re-open if additional information is required.
Status: ASSIGNED → RESOLVED
Closed: 12 years ago
Resolution: --- → FIXED
Updated • Reporter • 12 years ago
Attachment #779382 - Flags: review?(rfkelly) → review+