Closed Bug 882525 Opened 12 years ago Closed 12 years ago

Grab some item-count metrics from production dbs

Categories

(Cloud Services :: Operations: Miscellaneous, task)

x86_64
Windows 7
task
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: rfkelly, Assigned: bobm)

Details

(Whiteboard: [qa-])

Attachments

(5 files)

For PiCL planning purposes, I'm interested in the distribution of the per-user item count for a couple of collections. Can you please pick a couple of representative databases and grab me the count of items for each user in the "clients", "bookmarks", "passwords" and "history" collections? Something like the following:

    echo 'SELECT COUNT(*) FROM <wboX> WHERE collection=<CID> GROUP BY username ORDER BY 1 desc' | mysql | uniq -c

The collection ids of interest are:

    bookmarks: 7
    passwords: 10
    history: 4
    clients: 100

Sadly, the "clients" collection doesn't have a fixed id, thanks to Bug 688623, but it should almost always be 100. Please double-check that by doing a quick count on the "collections" table:

    SELECT collectionid, count(collectionid) FROM collections WHERE name = 'clients' GROUP BY collectionid ORDER BY 2 DESC;

This should show collection 100 as the top result by far.
Also of interest is the distribution of individual record sizes across all users. Something like the following query:

    SELECT payload_size, COUNT(payload_size) FROM <wboX> WHERE collection=<CID> GROUP BY payload_size;

(Obviously we'll want to break these down into a much smaller number of buckets for manual digestion, but I'm happy to crunch that raw data myself - unless there's a way to get MySQL to do it for us more efficiently...?)
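For the "crunch that raw data myself" route, a minimal sketch of the bucketing step might look like the following. It assumes the query output has already been parsed into (payload_size, count) pairs; the function name and the power-of-two bucket boundaries are illustrative choices, not something specified in this bug.

```python
from collections import Counter


def bucket_payload_sizes(rows):
    """Collapse (payload_size, count) pairs into power-of-two buckets.

    `rows` is assumed to be the parsed output of the GROUP BY payload_size
    query. Each bucket is keyed by its lower bound: the largest power of
    two that is <= payload_size. A size of 0 gets its own bucket.
    """
    buckets = Counter()
    for size, count in rows:
        lower = 0 if size <= 0 else 1 << (size.bit_length() - 1)
        buckets[lower] += count
    # Return the buckets in ascending order of lower bound.
    return dict(sorted(buckets.items()))


if __name__ == "__main__":
    # Hypothetical sample rows, not real production data.
    sample = [(0, 5), (1, 2), (3, 1), (1000, 4), (1024, 1)]
    for lower, total in bucket_payload_sizes(sample).items():
        print(f">= {lower} bytes: {total} records")
```

Power-of-two buckets keep the number of rows small even when sizes span several orders of magnitude; fixed-width buckets would work just as well if the sizes turn out to be tightly clustered.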
Whiteboard: [qa-]
Attachment #765209 - Flags: review?(rfkelly)
Assignee: nobody → bobm
Status: NEW → ASSIGNED
Thanks Bob! I'm a bit apprehensive though, as this seems like too little data for each node. For example, in the Bookmarks counts data for sync1-collections.txt, I see 18 lines with a count of 1 each. Does this mean there are only 18 users on this DB who have bookmark records? That seems low to me. Is it just a subset of the users on each database? Or are we sharding *really* well at this level? (Entirely possible I stuffed up with my little command-line uniq piping thing there and it's mangled the data too.)
Everything I grabbed is from a single table on a single DB. Perhaps that isn't sufficient data to extrapolate from. I'll run it through a loop and grab all of the tables for a database, unless you'd prefer the entire DB host.
(In reply to Bob Micheletto [:bobm] from comment #4)
> Everything I grabbed is from a single table on a single DB.

Wow. I guess I just didn't realize quite how much we were sharding these users out!

> Perhaps that isn't sufficient data to extrapolate from. I'll run it through a
> loop and grab all of the tables for a database

All tables from a single database would be awesome - IIRC that should equate to 100 times more data, which seems sufficient. Thanks mate.
Comment on attachment 765209
Requested stats information

r+ after discussion above, will be good to have the same results across all databases on a host.
Attachment #765209 - Flags: review?(rfkelly) → review+
Attachment #776605 - Flags: review?(rfkelly)
Just noticed that the clients collection is missing, collecting that now.
Comment on attachment 776605
Total collection counts for sync40.db

Awesome, thanks Bob. I will crunch these numbers into buckets and add them to the user modelling. Any thoughts on the payload-size metrics described in Comment 1? (Which probably should have been a separate bug rather than a pile-on to this one...)
Attachment #776605 - Flags: review?(rfkelly) → review+
Attached file sync64-clients.txt.gz
Raw clients collection counts from sync64.db in PHX1.
Attachment #779382 - Flags: review?(rfkelly)
used the sync40 numbers to get:

collection    mean  median  min  max     mode
history       3107  1294    1    299300  1
bookmarks     463   137     1    234656  12
passwords     66    31      1    3594    1
clients       0     1       1    74      1
sorry, last one should be addons
(In reply to Toby Elliott [:telliott] from comment #13)
> sorry, last one should be addons

or no, clients, thanks to that old bug

Revised values (with better rounding):

collection    mean     median  min  max     mode
history       3110.59  1294    1    299300  1
bookmarks     468.51   137     1    234656  12
passwords     69.19    31      1    3594    1
clients       1.46     1       1    74      1
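For anyone wanting to reproduce these columns from the attached per-user counts, a minimal sketch follows. It assumes the counts have been loaded as a list of integers, one per user; the function name and the tie-breaking rule for the mode are illustrative assumptions, not necessarily how the figures above were computed.

```python
import statistics
from collections import Counter


def summarize(counts):
    """Return mean, median, min, max and mode for a list of per-user counts.

    Ties for the mode are broken by taking the smallest value, which is an
    assumption made here for determinism.
    """
    frequency = Counter(counts)
    top = max(frequency.values())
    mode = min(value for value, seen in frequency.items() if seen == top)
    return {
        "mean": round(statistics.mean(counts), 2),
        "median": statistics.median(counts),
        "min": min(counts),
        "max": max(counts),
        "mode": mode,
    }


if __name__ == "__main__":
    # Hypothetical counts, not taken from the attachments.
    print(summarize([1, 1, 2, 5, 74]))
```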
Final collections payload information.
Closing this ticket, please re-open if additional information is required.
Status: ASSIGNED → RESOLVED
Closed: 12 years ago
Resolution: --- → FIXED
Attachment #779382 - Flags: review?(rfkelly) → review+