Closed Bug 776777 Opened 13 years ago Closed 9 years ago

Desired batch size and item size for POST requests in sync loadtests?

Categories

(Cloud Services Graveyard :: Server: Sync, defect, P4)

Tracking

(Not tracked)

RESOLVED WONTFIX

People

(Reporter: rfkelly, Assigned: bobm)

References

Details

(Whiteboard: [qa+])

Attachments

(1 file)

The sync loadtests include the use of "POST /storage/collection" to store batches of items. We need to know an appropriate number of items to send per batch, and an appropriate size per item, so that this emulates what would be sent by a real FF sync session.

The old grinder-based sync loadtests send batches of 10 records, each with a random payload size of approximately 500 to 2000 bytes. I feel like this is too few records to be realistic. But if it's in the original loadtests then it was based on real data at some stage, right?

The new funkload-based sync loadtests send batches of 100 records, each with a random payload size of approximately 500 to 2000 bytes. This seems more realistic to me, but I don't have hard data to back that up.

So: to generate realistic load, how many items should we send per batch, and how big should each item payload be?

Some details on client behaviour from IRC:

------
<nalexander> rfkelly: android sync also uploads in batches of 100 records.
<nalexander> rfkelly: sorry, 50 records, max 1mb per upload.
<nagios-svc-scl2> [127] mon1.scl2.svc:cepmon_sync_503 is OK: (null)
<rfkelly> nalexander: any idea of total records uploaded per session?
<rfkelly> e.g. does it usually take multiple batches to get things done?
<nalexander> rfkelly: on average? We batch per engine, so generally I expect one upload for form history, bookmarks, passwords and probably several uploads for history.
<rfkelly> nalexander thanks
<nalexander> Since we are syncing every 5 minutes. But on phones, we may not get too many shots to sync, so we might have a very large sync every few days.
------
------ <atoll> rnewman: POST batch size still 100? <rnewman> 50 <rnewman> on Androuid <atoll> oh, interesting <rnewman> and 1MB ------
Dammit, my typo immortalized in Bugzilla. My research in February suggested that the average history record was 450 bytes. Batches of 100 items, with a curve from, say, 350 bytes up to several KB and a mean of 450, sound about right to me.
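For illustration only (this helper is not part of any existing loadtest code, and the distribution parameters are rough guesses rather than values derived from production data): one way to approximate that curve in the Python loadtests would be a clamped log-normal size generator along these lines:

    import random

    def generate_payload_size(min_size=350):
        # Hypothetical sketch: a log-normal gives mostly small records with
        # a long tail up to a few KB.  The mu/sigma values are eyeballed to
        # put the clamped mean somewhere around 450 bytes; they are NOT
        # fitted to real sync data and would need tuning.
        size = int(random.lognormvariate(5.8, 0.6))
        return max(min_size, size)

Sampling from something shaped like this, rather than a flat uniform range, is what would actually reproduce "mostly a few hundred bytes, occasionally several KB".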
Whiteboard: [qa+]
No longer blocks: 776767
Blocks: 875552
Anything we want to do here for Sync 1.5 load tests?
Priority: -- → P3
OS: Linux → All
Hardware: x86 → All
Priority: P3 → P2
Adding Bob, who did some modelling based on current sync data usage and may be able to suggest a more realistic payload breakdown for each collection type.
Bob, new relic might be a good place to check for per-collection distribution sizes.
So here's what we actually currently do. Despite comments in the code and in this bug saying it's batches of 100 small items, it's really batches of 10 big items:

    items_per_batch = 10
    for i in range(items_per_batch):
        id = base64.urlsafe_b64encode(os.urandom(10)).rstrip("=")
        id += str(int((time.time() % 100) * 100000))
        payload = self.auth_token * random.randint(50, 200)
        wbo = {'id': id, 'payload': payload}
        data.append(wbo)
    data = json.dumps(data)
    response = self.session.post(url, data=data, **reqkwds)

The self.auth_token string is 200 bytes long, so the payload here is uniformly distributed between 10000 and 40000 bytes in length. So yeah, that probably skews a little big!

We should target something closer to Richard's Comment 2 unless Bob has more precise requirements. Bob, it's also feasible to use a different distribution for different collections if there are patterns that show up in the data.
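To make the contrast concrete, here is a rough sketch of the kind of adjustment being discussed (an assumption for illustration, not the contents of the attached patch): the same batch-building loop, but with 100 smaller items whose sizes come from the hypothetical generate_payload_size() helper sketched earlier. It assumes the same surrounding test-method context (self.session, url, reqkwds) as the excerpt above.

    # Hypothetical adjustment, not actual loadtest code: 100 items per batch,
    # payload sizes drawn from a skewed distribution instead of repeating the
    # 200-byte auth token 50-200 times.
    items_per_batch = 100
    data = []
    for i in range(items_per_batch):
        id = base64.urlsafe_b64encode(os.urandom(10)).rstrip("=")
        id += str(int((time.time() % 100) * 100000))
        payload = "X" * generate_payload_size()   # ~350 bytes up to a few KB
        wbo = {'id': id, 'payload': payload}
        data.append(wbo)
    data = json.dumps(data)
    response = self.session.post(url, data=data, **reqkwds)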
Attached patch would bring us in line with what Richard suggests above. Putting it out there as a starting point.
Attachment #8385820 - Flags: review?(telliott)
(In reply to Mark Mayo [:mmayo] from comment #5)
> Bob, new relic might be a good place to check for per-collection
> distribution sizes.

Ignore this. I looked and couldn't get any useful info out of New Relic re: query and/or response sizes.
Comment on attachment 8385820 [details] [diff] [review]
sync-loadtest-payload-size-skew.diff

Review of attachment 8385820 [details] [diff] [review]:
-----------------------------------------------------------------

LGTM
Attachment #8385820 - Flags: review+
Contents of original email thread (read top down):

Tarek,
Chris and Mark need some input from you on transaction sizes for the loads test/loads cluster... ie how do we vary the body/size of the transaction for TS/Sync load tests...
Thanks for getting back to us during your PTO :-)
James

---------------------------------------------------------------------------------------------------

Varying payload specifically for the sync tests - there's some concern that without trying to resemble the distribution of record sizes from real-world users we see in production sync, we end up with a loads workload on the staging servers that's too far from the real world to tell us much about how a given ec2 instance type will stand up in the wild.

I hope that made sense?

-Mark

---------------------------------------------------------------------------------------------------

On 5/03/2014 7:41 AM, Mark Mayo wrote:
> Varying payload specifically for the sync tests - there's some concern
> that without trying to resemble the distribution of record sizes from
> real-world users we see in production sync, we end up with a loads
> workload on the staging servers that's too far from the real world to
> tell us much about how a given ec2 instance type will stand up in the wild.

Is this essentially the same question as Bug 776777?
https://bugzilla.mozilla.org/show_bug.cgi?id=776777

Ryan

---------------------------------------------------------------------------------------------------

I believe that is the same ticket that bobm mentioned in the meeting.

james

---------------------------------------------------------------------------------------------------

> Le 04/03/14 23:12, James Bonacci a écrit :
> > Tarek,
> > Chris and Mark need some input from you on transaction sizes for the
> > loads test/loads cluster...
> > ie how do we vary the body/size of the transaction for TS/Sync load
> > tests...

It's already the case - payload sizes vary randomly between a min and a max size. This behavior was copied over from the initial grinder load tests we had a few years ago.
So the above tweak should be a good start, but we can continue to refine based on more detailed modelling of production payload size distribution.
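One possible shape for that refinement (purely a sketch; the file name and helpers below are assumptions, and no such sampling exists in the loadtests today): resample payload sizes directly from a dump of observed production record sizes, so the loadtest tracks whatever distribution production actually has, without having to fit a parametric model.

    import random

    def load_observed_sizes(path="observed_payload_sizes.txt"):
        # Hypothetical input file: one observed payload size in bytes per
        # line, exported from production logs or a database sample.
        with open(path) as f:
            return [int(line) for line in f if line.strip()]

    OBSERVED_SIZES = load_observed_sizes()

    def empirical_payload_size():
        # Sampling with replacement from real observations reproduces the
        # production size distribution directly.
        return random.choice(OBSERVED_SIZES)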
Should we change this to Resolved and get some load tests going?
We can get some load tests going, but I think we should leave this open pending further analysis of prod data patterns.
Fair enough.
Status: NEW → ASSIGNED
Assigning to :bobm for further triage and/or closing depending on what ops wants out of this
Assignee: nobody → bobm
Priority: P2 → P4
Please reopen if this is still needed.
Status: ASSIGNED → RESOLVED
Closed: 9 years ago
Resolution: --- → WONTFIX
Product: Cloud Services → Cloud Services Graveyard