Closed Bug 1319872 Opened 8 years ago Closed 7 years ago

Go vs Python Sync 1.6.0 Latency Comparison

Categories

(Cloud Services Graveyard :: Server: Sync, defect)

defect
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: mostlygeek, Assigned: mostlygeek)

Details

This bug tracks nginx latency counts over time of the gosync#2 (gosync) box vs python1.6.0#1 (pysync). Both servers have approximately the same number of users and traffic so it is easier to directly compare their performance. 

gosync server: 
 - r3.large
 - 16GB RAM
 - 1TB EBS volume

pysync server: 
 - c3.2xlarge
 - 16GB RAM
 - 1TB EBS volume
 - more ECU and more IO network capacity

Methodology: 
============

We are only interested in comparing nginx reported latency. Both servers have the same log format so we can use the latency-buckets-pro.awk [1] script to generate output. We care about the type of requests. Separate reports will be generated for: 

 1. POST, PUT, DELETE requests 
 2. GET info/(configuration|collections) 
 3. GET (not info/*) requests

To keep things fair we will use the same nginx log archive to generate statistics.
Assignee: nobody → bwong
[1] latency-buckets-pro.awk: https://gist.github.com/mostlygeek/d3d0cc1ab08923933e672329fa494914

Script to extract data: 

FILE=access.log-20161123.gz; \
    printf "=============\nGET info/*\n=============\n"; \
    gunzip -c $FILE | grep '[0-9]*/info/' | latency-buckets-pro.awk; \
    printf "\n=============\nGET ! info/*\n=============\n"; \
    gunzip -c $FILE | grep GET | grep -v '[0-9]*/info/' | latency-buckets-pro.awk; \
    printf "\n=============\nPOST, PUT, DELETE/*\n=============\n"; \
    gunzip -c $FILE | grep -E '(POST|PUT|DELETE)' | latency-buckets-pro.awk

---------------------------------------------
pysync access.log-20161123.gz

=============
GET info/*
=============
Total rows: 1156484
   bucket    count pct
<     5ms  1019359 88.14%
<    10ms   129618 11.21%
<    25ms     6693 0.58%
<    50ms      562 0.05%
<   100ms      177 0.02%
<   500ms       67 0.01%
<  1000ms        3 0.00%
< 60000ms        5 0.00%

=============
GET ! info/*
=============
Total rows: 1666517
   bucket    count pct
<     5ms   710662 42.64%
<    10ms   852189 51.14%
<    25ms    84641 5.08%
<    50ms    10941 0.66%
<   100ms     5022 0.30%
<   500ms     2668 0.16%
<  1000ms      356 0.02%
< 60000ms       38 0.00%

=============
POST, PUT, DELETE/*
=============
Total rows: 1636266
   bucket    count pct
<     5ms   236571 14.46%
<    10ms   340071 20.78%
<    25ms    18197 1.11%
<    50ms   344339 21.04%
<   100ms   660560 40.37%
<   500ms    36415 2.23%
<  1000ms       98 0.01%
< 60000ms       15 0.00%
gosync access.log-20161123.gz

=============
GET info/*
=============
Total rows: 1264556
   bucket    count pct
<     5ms   876292 69.30%
<    10ms    86636 6.85%
<    25ms   231455 18.30%
<    50ms    67031 5.30%
<   100ms     2504 0.20%
<   500ms      637 0.05%
<  1000ms        1 0.00%
< 60000ms        0 0.00%

=============
GET ! info/*
=============
Total rows: 1718603
   bucket    count pct
<     5ms  1642112 95.55%
<    10ms    37540 2.18%
<    25ms    30148 1.75%
<    50ms     7472 0.43%
<   100ms     1020 0.06%
<   500ms      309 0.02%
<  1000ms        2 0.00%
< 60000ms        0 0.00%

=============
POST, PUT, DELETE/*
=============
Total rows: 1590402
   bucket    count pct
<     5ms   332817 20.93%
<    10ms   309783 19.48%
<    25ms   774099 48.67%
<    50ms   164415 10.34%
<   100ms     7365 0.46%
<   500ms     1916 0.12%
<  1000ms        5 0.00%
< 60000ms        2 0.00%
This format makes it a little easier to compare: 

gosync                        pysync
=============                 =============
GET info/*                    GET info/*
=============                 =============
Total rows: 1264556           Total rows: 1156484
   bucket    count pct           bucket    count pct
<     5ms   876292 69.30%     <     5ms  1019359 88.14%
<    10ms    86636 6.85%      <    10ms   129618 11.21%
<    25ms   231455 18.30%     <    25ms     6693 0.58%
<    50ms    67031 5.30%      <    50ms      562 0.05%
<   100ms     2504 0.20%      <   100ms      177 0.02%
<   500ms      637 0.05%      <   500ms       67 0.01%
<  1000ms        1 0.00%      <  1000ms        3 0.00%
< 60000ms        0 0.00%      < 60000ms        5 0.00%

=============                 =============
GET ! info/*                  GET ! info/*
=============                 =============
Total rows: 1718603           Total rows: 1666517
   bucket    count pct           bucket    count pct
<     5ms  1642112 95.55%     <     5ms   710662 42.64%
<    10ms    37540 2.18%      <    10ms   852189 51.14%
<    25ms    30148 1.75%      <    25ms    84641 5.08%
<    50ms     7472 0.43%      <    50ms    10941 0.66%
<   100ms     1020 0.06%      <   100ms     5022 0.30%
<   500ms      309 0.02%      <   500ms     2668 0.16%
<  1000ms        2 0.00%      <  1000ms      356 0.02%
< 60000ms        0 0.00%      < 60000ms       38 0.00%

=============                 =============
POST, PUT, DELETE/*           POST, PUT, DELETE/*
=============                 =============
Total rows: 1590402           Total rows: 1636266
   bucket    count pct           bucket    count pct
<     5ms   332817 20.93%     <     5ms   236571 14.46%
<    10ms   309783 19.48%     <    10ms   340071 20.78%
<    25ms   774099 48.67%     <    25ms    18197 1.11%
<    50ms   164415 10.34%     <    50ms   344339 21.04%
<   100ms     7365 0.46%      <   100ms   660560 40.37%
<   500ms     1916 0.12%      <   500ms    36415 2.23%
<  1000ms        5 0.00%      <  1000ms       98 0.01%
< 60000ms        2 0.00%      < 60000ms       15 0.00%
Component: Firefox Sync: Backend → Server: Sync
Change comparison of go-sync server to a py-sync server with batch uploads enabled. 
A lot of the py-sync batch upload appends (http 202) are between 2000ms to 4000ms. 

FILE=access.log-20161130.gz; \
    printf "=============\nGET info/*\n=============\n"; \
    gunzip -c $FILE | grep '[0-9]*/info/' | latency-buckets-pro.awk; \
    printf "\n=============\nGET ! info/*\n=============\n"; \
    gunzip -c $FILE | grep GET | grep -v '[0-9]*/info/' | latency-buckets-pro.awk; \
    printf "\n=============\nPOST, PUT, DELETE\n=============\n"; \
    gunzip -c $FILE | grep -E '(POST|PUT|DELETE)' | latency-buckets-pro.awk; \
    printf "\n=============\nBatch Uploads (202)\n=============\n"; \
    gunzip -c $FILE | grep 'HTTP/1.1" 202' | latency-buckets-pro.awk;

go-sync                       py-sync
=============                 =============
GET info/*                    GET info/*
=============                 =============
Total rows: 1507439           Total rows: 1171137
   bucket    count pct           bucket    count pct
<     5ms  1029964 68.33%     <     5ms  1073194 91.64%
<    10ms   106029 7.03%      <    10ms    87466 7.47%
<    25ms   272038 18.05%     <    25ms     9792 0.84%
<    50ms    92807 6.16%      <    50ms      523 0.04%
<   100ms     5289 0.35%      <   100ms      114 0.01%
<   500ms     1307 0.09%      <   500ms       47 0.00%
<  1000ms        4 0.00%      <  1000ms        1 0.00%
< 60000ms        1 0.00%      < 60000ms        0 0.00%

=============                 =============
GET ! info/*                  GET ! info/*
=============                 =============
Total rows: 2038526           Total rows: 1697874
   bucket    count pct           bucket    count pct
<     5ms  1926924 94.53%     <     5ms  1004366 59.15%
<    10ms    53847 2.64%      <    10ms   627329 36.95%
<    25ms    44277 2.17%      <    25ms    52487 3.09%
<    50ms    11372 0.56%      <    50ms     8217 0.48%
<   100ms     1710 0.08%      <   100ms     3444 0.20%
<   500ms      388 0.02%      <   500ms     1734 0.10%
<  1000ms        7 0.00%      <  1000ms      239 0.01%
< 60000ms        1 0.00%      < 60000ms       58 0.00%

=============                 =============
POST, PUT, DELETE             POST, PUT, DELETE
=============                 =============
Total rows: 1812890           Total rows: 1622974
   bucket    count pct           bucket    count pct
<     5ms   339299 18.72%     <     5ms   333783 20.57%
<    10ms   349686 19.29%     <    10ms   235133 14.49%
<    25ms   892142 49.21%     <    25ms    15418 0.95%
<    50ms   217264 11.98%     <    50ms   377890 23.28%
<   100ms    11200 0.62%      <   100ms   621244 38.28%
<   500ms     3289 0.18%      <   500ms    36207 2.23%
<  1000ms       10 0.00%      <  1000ms     1339 0.08%
< 60000ms        0 0.00%      < 60000ms     1960 0.12%

=============                 =============
Batch Uploads (202)           Batch Uploads (202)
=============                 =============
Total rows: 1829              Total rows: 1115
   bucket    count pct           bucket    count pct
<     5ms        0 0.00%      <     5ms        0 0.00%
<    10ms      372 20.34%     <    10ms        0 0.00%
<    25ms     1088 59.49%     <    25ms        0 0.00%
<    50ms      342 18.70%     <    50ms        0 0.00%
<   100ms       25 1.37%      <   100ms        0 0.00%
<   500ms        2 0.11%      <   500ms        1 0.09%
<  1000ms        0 0.00%      <  1000ms        0 0.00%
< 60000ms        0 0.00%      < 60000ms     1114 99.91%
> A lot of the py-sync batch upload appends (http 202) are between 2000ms to 4000ms. 

That's not awesome, I would hope to see the distribution more closely match that of the non-batch uploads.  I wonder if this is a symptom of e.g. contention on the batch_id index in the batch upload table, making for more lock contention between users.
Crunched Dec 11th to Dec 17th worth of nginx logs. 
Stats seem to be holding fairly consistently with the pysync's batch uploads still taking around 2000ms to 4000ms. 

> printf "=============\nGET info/*\n=============\n"; \
    gunzip -c access.log-2016121{1,2,3,4,5,6,7}* | grep '[0-9]*/info/' | latency-buckets-pro.awk; \
    printf "\n=============\nGET ! info/*\n=============\n"; \
    gunzip -c access.log-2016121{1,2,3,4,5,6,7}* | grep GET | grep -v '[0-9]*/info/' | latency-buckets-pro.awk; \
    printf "\n=============\nPOST, PUT, DELETE\n=============\n"; \
    gunzip -c access.log-2016121{1,2,3,4,5,6,7}* | grep -E '(POST|PUT|DELETE)' | latency-buckets-pro.awk; \
    printf "\n=============\nBatch Uploads (202)\n=============\n"; \
    gunzip -c access.log-2016121{1,2,3,4,5,6,7}* | grep 'HTTP/1.1" 202' | latency-buckets-pro.awk;

go-sync                        py-sync
=============                  =============
GET info/*                     GET info/*
=============                  =============
Total rows: 8631237            Total rows: 6797211
   bucket    count pct            bucket    count pct
<     5ms  6006648 69.59%      <     5ms  6308172 92.81%
<    10ms   624595 7.24%       <    10ms   441381 6.49%
<    25ms  1467515 17.00%      <    25ms    43720 0.64%
<    50ms   500721 5.80%       <    50ms     3156 0.05%
<   100ms    25463 0.30%       <   100ms      556 0.01%
<   500ms     6254 0.07%       <   500ms      213 0.00%
<  1000ms       30 0.00%       <  1000ms       12 0.00%
< 60000ms       11 0.00%       < 60000ms        1 0.00%

=============                  =============
GET ! info/*                   GET ! info/*
=============                  =============
Total rows: 11561268           Total rows: 9197735
   bucket    count pct            bucket    count pct
<     5ms 10831921 93.69%      <     5ms  5757775 62.60%
<    10ms   332292 2.87%       <    10ms  3047325 33.13%
<    25ms   297601 2.57%       <    25ms   319492 3.47%
<    50ms    83658 0.72%       <    50ms    44001 0.48%
<   100ms    10386 0.09%       <   100ms    18698 0.20%
<   500ms     5346 0.05%       <   500ms     9369 0.10%
<  1000ms       46 0.00%       <  1000ms      944 0.01%
< 60000ms       18 0.00%       < 60000ms      131 0.00%

=============                  =============
POST, PUT, DELETE              POST, PUT, DELETE
=============                  =============
Total rows: 8835251            Total rows: 7545509
   bucket    count pct            bucket    count pct
<     5ms  1110749 12.57%      <     5ms  1713752 22.71%
<    10ms  1653158 18.71%      <    10ms   910177 12.06%
<    25ms  4687798 53.06%      <    25ms    41771 0.55%
<    50ms  1288501 14.58%      <    50ms  2504290 33.19%
<   100ms    77926 0.88%       <   100ms  2211853 29.31%
<   500ms    17072 0.19%       <   500ms   151583 2.01%
<  1000ms       44 0.00%       <  1000ms     6134 0.08%
< 60000ms        3 0.00%       < 60000ms     5949 0.08%

=============                  =============
Batch Uploads (202)            Batch Uploads (202)
=============                  =============
Total rows: 7986               Total rows: 2700
   bucket    count pct            bucket    count pct
<     5ms        0 0.00%       <     5ms        0 0.00%
<    10ms     1986 24.87%      <    10ms        0 0.00%
<    25ms     4489 56.21%      <    25ms        0 0.00%
<    50ms     1365 17.09%      <    50ms        0 0.00%
<   100ms      139 1.74%       <   100ms        0 0.00%
<   500ms        7 0.09%       <   500ms        0 0.00%
<  1000ms        0 0.00%       <  1000ms        0 0.00%
< 60000ms        0 0.00%       < 60000ms     2700 100.00%
> pysync's batch uploads still taking around 2000ms to 4000ms. 

:bobm noted that the batch-upload tables are current standard InnoDB rather than TokuDB tables, and he was going to try with TokuDB to see if this made any different.  ni? :bobm for an update on that front.

The queries that are taking a long time here, are the ones that do an `INSERT OR UPDATE` into the `batch_upload_items` table.  We should also see if we can get more detailed stats on what they're waiting for, e.g. IO throughput or lock contention.
Flags: needinfo?(bobm)
(In reply to Ryan Kelly [:rfkelly] from comment #7)
> :bobm noted that the batch-upload tables are current standard InnoDB rather
> than TokuDB tables, and he was going to try with TokuDB to see if this made
> any different.  ni? :bobm for an update on that front.
Testing for this is still pending.

> The queries that are taking a long time here, are the ones that do an
> `INSERT OR UPDATE` into the `batch_upload_items` table.  We should also see
> if we can get more detailed stats on what they're waiting for, e.g. IO
> throughput or lock contention.
I will dig into this further.  We did some initial digging at the all-hands and ran into a limitation using EXPLAIN for these types of queries in MySQL 5.5.x.
Flags: needinfo?(bobm)
Latency Stats for 2017-01-17

A few changes: 

 - the gosync server has started running large history purge jobs. 
   60 TTL has passed for a lot of records and we're seeing millions 
   being purged. These jobs usually take between 800ms and 4000ms
 - tweaked the latency buckets to better show where pysync's batch
   times are being spent

go-sync                        py-sync
=============                  =============
GET info/*                     GET info/*
=============                  =============
Total rows: 1162051            Total rows: 810173
   bucket    count pct            bucket    count pct
<     5ms   789411 67.93%      <     5ms   762927 94.17%
<    10ms    90557 7.79%       <    10ms    42156 5.20%
<    25ms   206508 17.77%      <    25ms     4712 0.58%
<    50ms    68680 5.91%       <    50ms      272 0.03%
<   100ms     4282 0.37%       <   100ms       85 0.01%
<   500ms     2380 0.20%       <   500ms       21 0.00%
<  1000ms      181 0.02%       <  1000ms        0 0.00%
<  2000ms       46 0.00%       <  2000ms        0 0.00%
<  4000ms        5 0.00%       <  4000ms        0 0.00%
< 10000ms        1 0.00%       < 10000ms        0 0.00%
< 60000ms        0 0.00%       < 60000ms        0 0.00%

=============                  =============
GET ! info/*                   GET ! info/*
=============                  =============
Total rows: 1567346            Total rows: 1110159
   bucket    count pct            bucket    count pct
<     5ms  1487140 94.88%      <     5ms   672544 60.58%
<    10ms    40233 2.57%       <    10ms   378558 34.10%
<    25ms    29293 1.87%       <    25ms    48786 4.39%
<    50ms     8008 0.51%       <    50ms     5338 0.48%
<   100ms     1539 0.10%       <   100ms     2899 0.26%
<   500ms     1117 0.07%       <   500ms     1827 0.16%
<  1000ms       10 0.00%       <  1000ms      203 0.02%
<  2000ms        6 0.00%       <  2000ms        3 0.00%
<  4000ms        0 0.00%       <  4000ms        1 0.00%
< 10000ms        0 0.00%       < 10000ms        0 0.00%
< 60000ms        0 0.00%       < 60000ms        0 0.00%

=============                  =============
POST, PUT, DELETE              POST, PUT, DELETE
=============                  =============
Total rows: 1179560            Total rows: 895708
   bucket    count pct            bucket    count pct
<     5ms   139941 11.86%      <     5ms   195351 21.81%
<    10ms   222647 18.88%      <    10ms   115719 12.92%
<    25ms   630283 53.43%      <    25ms     3710 0.41%
<    50ms   171488 14.54%      <    50ms   304844 34.03%
<   100ms    12523 1.06%       <   100ms   248701 27.77%
<   500ms     2643 0.22%       <   500ms    26282 2.93%
<  1000ms        1 0.00%       <  1000ms      661 0.07%
<  2000ms       24 0.00%       <  2000ms      268 0.03%
<  4000ms       10 0.00%       <  4000ms      172 0.02%
< 10000ms        0 0.00%       < 10000ms        0 0.00%
< 60000ms        0 0.00%       < 60000ms        0 0.00%

=============                  =============
Batch Uploads (202)            Batch Uploads (202)
=============                  =============
Total rows: 146                Total rows: 96
   bucket    count pct            bucket    count pct
<     5ms        0 0.00%       <     5ms        0 0.00%
<    10ms       35 23.97%      <    10ms        0 0.00%
<    25ms      100 68.49%      <    25ms        0 0.00%
<    50ms        8 5.48%       <    50ms        0 0.00%
<   100ms        1 0.68%       <   100ms        0 0.00%
<   500ms        2 1.37%       <   500ms        0 0.00%
<  1000ms        0 0.00%       <  1000ms        0 0.00%
<  2000ms        0 0.00%       <  2000ms        0 0.00%
<  4000ms        0 0.00%       <  4000ms       96 100.00
< 10000ms        0 0.00%       < 10000ms        0 0.00%
< 60000ms        0 0.00%       < 60000ms        0 0.00%
> tweaked the latency buckets to better show where pysync's batch
> times are being spent

I've spun this out into a separate bug in Bug 1332987, because *ugh*, that's just terrible, and it's so much worse than the non-batched upload case that it's definitely a sign of some unexpected pathological behaviour.
Looks like Bug 1332987 had a huge difference in latency times. Stats for Feb 9 and Feb 10: 

gosync                         pysync
=============                  =============
GET info/*                     GET info/*
=============                  =============
Total rows: 3992843            Total rows: 2660721
   bucket    count pct            bucket    count pct
<     5ms  3259872 81.64%      <     5ms  1158419 43.54%
<    10ms   188462 4.72%       <    10ms  1184221 44.51%
<    25ms   378769 9.49%       <    25ms   311842 11.72%
<    50ms   151163 3.79%       <    50ms     5356 0.20%
<   100ms    10999 0.28%       <   100ms      763 0.03%
<   500ms     3282 0.08%       <   500ms      114 0.00%
<  1000ms      254 0.01%       <  1000ms        6 0.00%
<  2000ms       37 0.00%       <  2000ms        0 0.00%
<  4000ms        4 0.00%       <  4000ms        0 0.00%
< 10000ms        1 0.00%       < 10000ms        0 0.00%
< 60000ms        0 0.00%       < 60000ms        0 0.00%

=============                  =============
GET ! info/*                   GET ! info/*
=============                  =============
Total rows: 2970407            Total rows: 2349025
   bucket    count pct            bucket    count pct
<     5ms  2805532 94.45%      <     5ms   403971 17.20%
<    10ms    84269 2.84%       <    10ms  1363373 58.04%
<    25ms    59210 1.99%       <    25ms   528131 22.48%
<    50ms    18017 0.61%       <    50ms    37674 1.60%
<   100ms     2567 0.09%       <   100ms    10016 0.43%
<   500ms      753 0.03%       <   500ms     5505 0.23%
<  1000ms       49 0.00%       <  1000ms      336 0.01%
<  2000ms        9 0.00%       <  2000ms       15 0.00%
<  4000ms        1 0.00%       <  4000ms        2 0.00%
< 10000ms        0 0.00%       < 10000ms        1 0.00%
< 60000ms        0 0.00%       < 60000ms        1 0.00%

=============                  =============
POST, PUT, DELETE              POST, PUT, DELETE
=============                  =============
Total rows: 2192313            Total rows: 1646607
   bucket    count pct            bucket    count pct
<     5ms    36447 1.66%       <     5ms    44762 2.72%
<    10ms   205171 9.36%       <    10ms   236969 14.39%
<    25ms  1125496 51.34%      <    25ms   284334 17.27%
<    50ms   756971 34.53%      <    50ms   275660 16.74%
<   100ms    59672 2.72%       <   100ms   754487 45.82%
<   500ms     8505 0.39%       <   500ms    50290 3.05%
<  1000ms       36 0.00%       <  1000ms       96 0.01%
<  2000ms       15 0.00%       <  2000ms        9 0.00%
<  4000ms        0 0.00%       <  4000ms        0 0.00%
< 10000ms        0 0.00%       < 10000ms        0 0.00%
< 60000ms        0 0.00%       < 60000ms        0 0.00%

=============                  =============
Batch Uploads (202)            Batch Uploads (202)
=============                  =============
Total rows: 28354              Total rows: 16612
   bucket    count pct            bucket    count pct
<     5ms        0 0.00%       <     5ms        0 0.00%
<    10ms     7063 24.91%      <    10ms        0 0.00%
<    25ms    19016 67.07%      <    25ms        0 0.00%
<    50ms     2047 7.22%       <    50ms    16088 96.85%
<   100ms      188 0.66%       <   100ms      517 3.11%
<   500ms       38 0.13%       <   500ms        6 0.04%
<  1000ms        0 0.00%       <  1000ms        1 0.01%
<  2000ms        2 0.01%       <  2000ms        0 0.00%
<  4000ms        0 0.00%       <  4000ms        0 0.00%
< 10000ms        0 0.00%       < 10000ms        0 0.00%
< 60000ms        0 0.00%       < 60000ms        0 0.00%
Feb 2017 stats. Pysync from Feb 09 to Feb 28 (due to log rotation).

go-sync (c4.large)              py-sync (c3.2xlarge)
=============                   =============
GET info/*                      GET info/*
=============                   =============
Total rows: 52064935            Total rows: 25131756
   bucket    count pct             bucket    count pct
<     5ms 42724563 82.06%       <     5ms 10616359 42.24%
<    10ms  2443276 4.69%        <    10ms 10917868 43.44%
<    25ms  4898779 9.41%        <    25ms  3521216 14.01%
<    50ms  1819441 3.49%        <    50ms    64826 0.26%
<   100ms   133513 0.26%        <   100ms     9476 0.04%
<   500ms    41548 0.08%        <   500ms     1921 0.01%
<  1000ms     3283 0.01%        <  1000ms       79 0.00%
<  2000ms      454 0.00%        <  2000ms        4 0.00%
<  4000ms       64 0.00%        <  4000ms        3 0.00%
< 10000ms       13 0.00%        < 10000ms        4 0.00%
< 60000ms        1 0.00%        < 60000ms        0 0.00%

=============                   =============
GET ! info/*                    GET ! info/*
=============                   =============
Total rows: 38767732            Total rows: 22855172
   bucket    count pct             bucket    count pct
<     5ms 36675022 94.60%       <     5ms  3403051 14.89%
<    10ms  1054561 2.72%        <    10ms 12821105 56.10%
<    25ms   766919 1.98%        <    25ms  5613880 24.56%
<    50ms   227604 0.59%        <    50ms   812392 3.55%
<   100ms    32802 0.08%        <   100ms   134337 0.59%
<   500ms     9988 0.03%        <   500ms    66245 0.29%
<  1000ms      697 0.00%        <  1000ms     3799 0.02%
<  2000ms      131 0.00%        <  2000ms      181 0.00%
<  4000ms        6 0.00%        <  4000ms      117 0.00%
< 10000ms        2 0.00%        < 10000ms       63 0.00%
< 60000ms        0 0.00%        < 60000ms        2 0.00%

=============                   =============
POST, PUT, DELETE               POST, PUT, DELETE
=============                   =============
Total rows: 27955597            Total rows: 15487603
   bucket    count pct             bucket    count pct
<     5ms   464401 1.66%        <     5ms   338624 2.19%
<    10ms  2727571 9.76%        <    10ms  1887511 12.19%
<    25ms 14592456 52.20%       <    25ms  2983013 19.26%
<    50ms  9380939 33.56%       <    50ms  2663402 17.20%
<   100ms   692794 2.48%        <   100ms  7038629 45.45%
<   500ms    96335 0.34%        <   500ms   575014 3.71%
<  1000ms      824 0.00%        <  1000ms     1209 0.01%
<  2000ms      256 0.00%        <  2000ms      164 0.00%
<  4000ms       20 0.00%        <  4000ms       28 0.00%
< 10000ms        1 0.00%        < 10000ms        8 0.00%
< 60000ms        0 0.00%        < 60000ms        1 0.00%

=============                   =============
Batch Uploads (202)             Batch Uploads (202)
=============                   =============
Total rows: 348310              Total rows: 195298
   bucket    count pct             bucket    count pct
<     5ms        1 0.00%        <     5ms        0 0.00%
<    10ms    88820 25.50%       <    10ms        1 0.00%
<    25ms   222858 63.98%       <    25ms       97 0.05%
<    50ms    29831 8.56%        <    50ms   188265 96.40%
<   100ms     5221 1.50%        <   100ms     6794 3.48%
<   500ms     1568 0.45%        <   500ms      139 0.07%
<  1000ms        8 0.00%        <  1000ms        2 0.00%
<  2000ms        3 0.00%        <  2000ms        0 0.00%
<  4000ms        0 0.00%        <  4000ms        0 0.00%
< 10000ms        0 0.00%        < 10000ms        0 0.00%
< 60000ms        0 0.00%        < 60000ms        0 0.00%
Between comment #9 and #11 the stats for the pysync node shifted quite a bit. 
Decided to compare the newer pysync node to the oldest one we still have around. 

The old one seems to have a lot better performance across the board. There are some differences between them: 

 - pysync newer has a working BSO purge script running
 - this purge script would have a lot of work to do ~60 days after
   the first users were assigned to it. 60 days = history BSO TTL
   .. about the time between data in comment #9 and comment #10
 - could it be the purge background job bringing down overall performance?
 - POST/PUT/DELETE request performance do seem to degrade over time on pysync-old


Feb 2017 stats
-------------------

pysync (newer)                 psync (old, 652 days run time)
=============                  =============
GET info/*                     GET info/*
=============                  =============
Total rows: 25131756           Total rows: 27549489
   bucket    count pct            bucket    count pct
<     5ms 10616359 42.24%      <     5ms 26595148 96.54%
<    10ms 10917868 43.44%      <    10ms   903059 3.28%
<    25ms  3521216 14.01%      <    25ms    24642 0.09%
<    50ms    64826 0.26%       <    50ms     9683 0.04%
<   100ms     9476 0.04%       <   100ms     1807 0.01%
<   500ms     1921 0.01%       <   500ms     8430 0.03%
<  1000ms       79 0.00%       <  1000ms     3835 0.01%
<  2000ms        4 0.00%       <  2000ms     2874 0.01%
<  4000ms        3 0.00%       <  4000ms        6 0.00%
< 10000ms        4 0.00%       < 10000ms        4 0.00%
< 60000ms        0 0.00%       < 60000ms        1 0.00%

=============                  =============
GET ! info/*                   GET ! info/*
=============                  =============
Total rows: 22855172           Total rows: 20090066
   bucket    count pct            bucket    count pct
<     5ms  3403051 14.89%      <     5ms 12636551 62.90%
<    10ms 12821105 56.10%      <    10ms  4345336 21.63%
<    25ms  5613880 24.56%      <    25ms  1415897 7.05%
<    50ms   812392 3.55%       <    50ms  1362022 6.78%
<   100ms   134337 0.59%       <   100ms   190355 0.95%
<   500ms    66245 0.29%       <   500ms   121749 0.61%
<  1000ms     3799 0.02%       <  1000ms    13607 0.07%
<  2000ms      181 0.00%       <  2000ms     3429 0.02%
<  4000ms      117 0.00%       <  4000ms      627 0.00%
< 10000ms       63 0.00%       < 10000ms      290 0.00%
< 60000ms        2 0.00%       < 60000ms      203 0.00%

=============                  =============
POST, PUT, DELETE              POST, PUT, DELETE
=============                  =============
Total rows: 15487603           Total rows: 13371412
   bucket    count pct            bucket    count pct
<     5ms   338624 2.19%       <     5ms  2334711 17.46%
<    10ms  1887511 12.19%      <    10ms  3066168 22.93%
<    25ms  2983013 19.26%      <    25ms    19153 0.14%
<    50ms  2663402 17.20%      <    50ms  1832860 13.71%
<   100ms  7038629 45.45%      <   100ms  4720077 35.30%
<   500ms   575014 3.71%       <   500ms  1396970 10.45%
<  1000ms     1209 0.01%       <  1000ms     1048 0.01%
<  2000ms      164 0.00%       <  2000ms      197 0.00%
<  4000ms       28 0.00%       <  4000ms      105 0.00%
< 10000ms        8 0.00%       < 10000ms       95 0.00%
< 60000ms        1 0.00%       < 60000ms       28 0.00%

=============                  (api not enabled)
Batch Uploads (202)
=============
Total rows: 195298
   bucket    count pct
<     5ms        0 0.00%
<    10ms        1 0.00%
<    25ms       97 0.05%
<    50ms   188265 96.40%
<   100ms     6794 3.48%
<   500ms      139 0.07%
<  1000ms        2 0.00%
<  2000ms        0 0.00%
<  4000ms        0 0.00%
< 10000ms        0 0.00%
< 60000ms        0 0.00%
The pysync node is no longer available due to migration. We have the data we need. Closing this issue.
Status: NEW → RESOLVED
Closed: 7 years ago
Resolution: --- → FIXED
Product: Cloud Services → Cloud Services Graveyard
You need to log in before you can comment on or make changes to this bug.