Closed Bug 1014568 Opened 11 years ago Closed 11 years ago

[ceph] performance issue in stage - 2+ minutes to store crashes

Categories

(Socorro :: Infra, task)

Priority: Not set
Severity: normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: selenamarie, Assigned: selenamarie)

Details

Attachments

(1 file)

Typically, we've seen something around 2-30 seconds to store a crash in our development ceph cluster. Today, however:

2014-05-22 06:59:46,547 DEBUG - Thread-9 - ceph save_raw_crash 0:00:54.768174
2014-05-22 07:00:04,210 DEBUG - Thread-20 - ceph save_raw_crash 0:01:11.300586
2014-05-22 07:00:38,695 DEBUG - Thread-2 - ceph save_raw_crash 0:01:47.006469
2014-05-22 07:00:41,102 DEBUG - Thread-8 - ceph save_raw_crash 0:01:48.302644
2014-05-22 07:00:42,742 DEBUG - Thread-21 - ceph save_raw_crash 0:01:49.856423
2014-05-22 07:00:45,897 DEBUG - Thread-12 - ceph save_raw_crash 0:01:54.132093
2014-05-22 07:00:52,739 DEBUG - Thread-22 - ceph save_raw_crash 0:01:59.861903
2014-05-22 07:00:53,699 DEBUG - Thread-11 - ceph save_raw_crash 0:02:00.836320
2014-05-22 07:00:55,099 DEBUG - Thread-13 - ceph save_raw_crash 0:02:03.311690
2014-05-22 07:01:03,074 DEBUG - Thread-10 - ceph save_raw_crash 0:02:11.293695
2014-05-22 07:01:03,269 DEBUG - Thread-3 - ceph save_raw_crash 0:02:11.543504
2014-05-22 07:01:03,380 DEBUG - Thread-1 - ceph save_raw_crash 0:02:10.575387
2014-05-22 07:01:06,331 DEBUG - Thread-15 - ceph save_raw_crash 0:02:14.523498
2014-05-22 07:01:09,292 DEBUG - Thread-14 - ceph save_raw_crash 0:02:17.507183
2014-05-22 07:01:13,199 DEBUG - Thread-6 - ceph save_raw_crash 0:02:21.444210
2014-05-22 07:01:18,406 DEBUG - Thread-18 - ceph save_raw_crash 0:02:26.603336

We need to figure out why this is happening and see if there's a way to make it a bit faster!
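For context, timing lines like the above come from wrapping the storage call and logging the elapsed time. A minimal sketch of that pattern follows; the class and method names are illustrative only, not Socorro's actual code:

# Minimal sketch of the timing instrumentation that produces log lines like
# the ones above. CephCrashStorage and its save_raw_crash signature are
# assumptions for illustration, not Socorro's real implementation.
import logging
from datetime import datetime

logging.basicConfig(
    level=logging.DEBUG,
    format='%(asctime)s %(levelname)s - %(threadName)s - %(message)s',
)
logger = logging.getLogger(__name__)

def timed(label):
    """Decorator that logs how long the wrapped call took."""
    def decorator(func):
        def wrapper(*args, **kwargs):
            start = datetime.now()
            try:
                return func(*args, **kwargs)
            finally:
                # prints an elapsed time like 0:02:26.603336
                logger.debug('%s %s', label, datetime.now() - start)
        return wrapper
    return decorator

class CephCrashStorage(object):
    @timed('ceph save_raw_crash')
    def save_raw_crash(self, raw_crash, dumps, crash_id):
        pass  # the actual boto/radosgw PUTs would happen here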
Assignee: nobody → sdeckelmann
Basics -- verified we're using xfs with the noatime mount option:

/dev/sdb1 /var/lib/ceph/osd/ceph-0 xfs rw,noatime 0 0
/dev/sdc1 /var/lib/ceph/osd/ceph-1 xfs rw,noatime 0 0
/dev/sdd1 /var/lib/ceph/osd/ceph-2 xfs rw,noatime 0 0
/dev/sde1 /var/lib/ceph/osd/ceph-3 xfs rw,noatime 0 0

(ceph did this automatically when we set everything up)

Looking through https://software.intel.com/en-us/blogs/2013/10/25/measure-ceph-rbd-performance-in-a-quantitative-way-part-i

Checked the readahead buffer (the tuning guides suggest 2048 KB; blockdev --getra reports 512-byte sectors, so the 256 here is only 128 KB):

[root@ceph1.dev.phx1 ~]# /sbin/blockdev --getra /dev/sdb1
256

I added some "big bytes" settings from http://ceph.com/community/ceph-bobtail-jbod-performance-tuning/ to see what would happen. Not much changed :D

I found that we had some stale fcgi processes, so I restarted apache. This seemed to improve things a bit. When I first looked, we were hovering around 30 crashes per minute; now we're hovering around 70 crashes per minute. Still not the 300 we were doing a few days ago.

I suspect the 'scrubbing' is the root cause of the problems here. Still digging into how to determine that.
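A quick way to check (and optionally raise) readahead on every OSD data disk is via sysfs. A minimal sketch, assuming the OSD disks are sdb through sde as in the mount output above and that 2048 KB is the value we want to try:

# Report (and optionally raise) readahead for the OSD data disks.
# /sys/block/<dev>/queue/read_ahead_kb is the kernel's readahead in KB.
# Disk names and the 2048 KB target are assumptions for this environment.
OSD_DISKS = ['sdb', 'sdc', 'sdd', 'sde']
TARGET_KB = 2048

for dev in OSD_DISKS:
    path = '/sys/block/%s/queue/read_ahead_kb' % dev
    with open(path) as f:
        current_kb = int(f.read().strip())
    print('%s: read_ahead_kb=%d' % (dev, current_kb))
    if current_kb < TARGET_KB:
        # Uncomment to actually change it (needs root; not persistent across
        # reboots -- use rc.local or a udev rule for that).
        # with open(path, 'w') as f:
        #     f.write(str(TARGET_KB))
        print('  would raise to %d' % TARGET_KB)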
Ok, here's some more promising logs:

/var/log/ceph/ceph.log:2014-05-22 14:06:34.478149 osd.7 10.8.100.69:6814/30109 314 : [INF] 9.b8 deep-scrub ok
/var/log/ceph/ceph.log:2014-05-22 14:10:25.312681 osd.2 10.8.100.68:6805/31485 200 : [INF] 9.5e deep-scrub ok
/var/log/ceph/ceph.log:2014-05-22 14:10:53.953597 osd.8 10.8.100.70:6825/11117 756 : [INF] 9.17 deep-scrub ok
/var/log/ceph/ceph.log:2014-05-22 14:13:18.068502 osd.2 10.8.100.68:6805/31485 201 : [INF] 9.4e deep-scrub ok
/var/log/ceph/ceph.log:2014-05-22 14:22:38.551132 osd.7 10.8.100.69:6814/30109 325 : [INF] 9.4 deep-scrub ok
/var/log/ceph/ceph.log:2014-05-22 14:25:06.499355 osd.5 10.8.100.69:6800/29102 774 : [INF] 9.b3 deep-scrub ok
/var/log/ceph/ceph.log:2014-05-22 14:25:06.913611 osd.2 10.8.100.68:6805/31485 202 : [INF] 8.6c deep-scrub ok
/var/log/ceph/ceph.log:2014-05-22 14:34:16.511165 osd.7 10.8.100.69:6814/30109 326 : [INF] 9.a4 deep-scrub ok
/var/log/ceph/ceph.log:2014-05-22 14:39:51.150544 osd.11 10.8.100.70:6810/10511 453 : [INF] 5.8f deep-scrub ok
/var/log/ceph/ceph.log:2014-05-22 14:40:10.095819 osd.9 10.8.100.70:6805/10068 429 : [INF] 9.d3 deep-scrub ok
/var/log/ceph/ceph.log:2014-05-22 14:42:52.805000 osd.7 10.8.100.69:6814/30109 327 : [INF] 9.d4 deep-scrub ok
/var/log/ceph/ceph.log:2014-05-22 15:02:32.651190 osd.8 10.8.100.70:6825/11117 768 : [INF] 9.94 deep-scrub ok
/var/log/ceph/ceph.log:2014-05-22 15:15:22.852645 osd.1 10.8.100.68:6812/18633 1 : [INF] 9.14 deep-scrub ok
/var/log/ceph/ceph.log:2014-05-22 15:17:27.175100 osd.10 10.8.100.70:6800/9535 540 : [INF] 9.3a deep-scrub ok
/var/log/ceph/ceph.log:2014-05-22 15:19:01.927371 osd.9 10.8.100.70:6805/10068 430 : [INF] 9.2e deep-scrub ok
/var/log/ceph/ceph.log:2014-05-22 15:19:03.283057 osd.5 10.8.100.69:6800/29102 775 : [INF] 8.c0 deep-scrub ok
/var/log/ceph/ceph.log:2014-05-22 15:27:00.743415 osd.11 10.8.100.70:6810/10511 454 : [INF] 5.16 deep-scrub ok
/var/log/ceph/ceph.log:2014-05-22 15:29:49.799919 osd.7 10.8.100.69:6814/30109 328 : [INF] 9.25 deep-scrub ok
/var/log/ceph/ceph.log:2014-05-22 15:32:27.309071 osd.5 10.8.100.69:6800/29102 776 : [INF] 9.a6 deep-scrub ok
Corresponding # of objects inserted from ceph logs during 15:00 (left column = number of objects, right column = minute of the hour):

[root@ceph1.dev.phx1 ~]# tail -10000 /var/log/ceph/client.radosgw.gateway.log | grep object | awk '{print $2}' | awk -F':' '{print $2}' | uniq -c
      5 13
     45 14
     30 15
     35 16
     56 17
     55 18
     95 19
    103 20
     99 21
    100 22
    108 23
     96 24
    102 25
     96 26
    107 27
    111 28
    106 29
     90 30
     92 31
     85 32
     48 33
     30 34
     40 35
     55 36
     48 37
     42 38
     61 39
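A small sketch of doing the same correlation in one pass -- counting radosgw "object" lines and ceph.log deep-scrub completions per minute, to eyeball whether the throughput dips line up with scrubbing. The log paths come from the excerpts above; the timestamp regex is an assumption about those line formats:

# Bucket radosgw "object" lines and deep-scrub completions by minute.
import re
from collections import Counter

ts_re = re.compile(r'(\d{4}-\d{2}-\d{2} \d{2}:\d{2})')  # YYYY-MM-DD HH:MM

def per_minute(path, must_contain):
    counts = Counter()
    with open(path) as f:
        for line in f:
            if must_contain in line:
                m = ts_re.search(line)
                if m:
                    counts[m.group(1)] += 1
    return counts

objects = per_minute('/var/log/ceph/client.radosgw.gateway.log', 'object')
scrubs = per_minute('/var/log/ceph/ceph.log', 'deep-scrub ok')

for minute in sorted(set(objects) | set(scrubs)):
    print('%s  objects=%3d  deep-scrubs=%d'
          % (minute, objects.get(minute, 0), scrubs.get(minute, 0)))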
Ceph does scheduled scrubbing to verify data integrity, but I believe it's fairly tweakable, so adjusting the scrub settings might be what you need to do. (The relevant ceph.conf options are sketched below.)
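For reference, these are the main scrub-related knobs in the [osd] section of ceph.conf. The values below are just the documented defaults of that era, not a recommendation, and are worth double-checking against the version we're actually running:

[osd]
; maximum number of simultaneous scrub operations per OSD (default 1)
osd max scrubs = 1
; skip scheduled (light) scrubs while system load is above this (default 0.5)
osd scrub load threshold = 0.5
; lower/upper bounds on how often each PG gets a light scrub, in seconds
osd scrub min interval = 86400
osd scrub max interval = 604800
; how often each PG gets a deep scrub, in seconds (default 604800 = weekly)
osd deep scrub interval = 604800

If our ceph version supports the flag, "ceph osd set nodeep-scrub" / "ceph osd unset nodeep-scrub" can also pause deep scrubbing cluster-wide while testing whether it's really the culprit.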
On the socorro side:

* :lars worked out that bucket creation was taking a long time, so we're reducing our total number of buckets to a single bucket per day.
* :lars also determined that we were spending a lot of time re-creating connections with Boto that we could cache. He made all those connections cache-able, and we're back up to around 200 crashes per minute, except for when we're scrubbing. (A sketch of the caching/bucketing idea is below.)

Now digging into the scrubbing configs.
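Roughly, the idea behind the fix looks like this. This is a hedged sketch only -- the real implementation is in the commit linked below; the endpoint, credentials, and bucket-name scheme here are placeholders, not Socorro's actual ones:

# Keep one boto S3 connection and one bucket object per (thread, day) instead
# of rebuilding them for every crash, and use a single bucket per day.
import datetime
import threading

import boto
import boto.s3.connection

_local = threading.local()  # boto connections aren't thread-safe; cache per thread

def _connection():
    if getattr(_local, 'conn', None) is None:
        _local.conn = boto.connect_s3(
            aws_access_key_id='ACCESS_KEY',          # placeholder
            aws_secret_access_key='SECRET_KEY',      # placeholder
            host='ceph-gateway.example.com',         # radosgw endpoint (placeholder)
            is_secure=False,
            calling_format=boto.s3.connection.OrdinaryCallingFormat(),
        )
        _local.buckets = {}
    return _local.conn

def _daily_bucket():
    """One bucket per day, looked up or created at most once per thread per day."""
    name = datetime.datetime.utcnow().strftime('crashes-%Y%m%d')  # assumed scheme
    conn = _connection()
    if name not in _local.buckets:
        _local.buckets[name] = conn.lookup(name) or conn.create_bucket(name)
    return _local.buckets[name]

def save_raw_crash(crash_id, raw_crash_json):
    key = _daily_bucket().new_key(crash_id)
    key.set_contents_from_string(raw_crash_json)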
Just eyeballing stats on all three ceph systems, it appears we're maxing out IO on 1-2 disks per system: utilization is pegged at 100%, and await is hovering around 400-500 ms, sometimes around 1500 ms, peaking around 5000 ms. The busy disks appear to be handling a disproportionate number of reads per second compared to the other disks, while the *size* of those reads doesn't appear to be very large.

A note for future monitoring: we probably want to get detailed iostat info graphed. (A collection sketch is below.)
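A minimal sketch of the kind of per-disk collection we'd want graphed, using psutil (assumed to be available on the ceph hosts); in practice this would feed graphite/collectd rather than print:

# Sample per-disk IO counters and report reads/s, average read size, and a
# rough await, computed from deltas between samples. Disk names are the OSD
# data disks on these hosts (an assumption for this environment).
import time
import psutil

INTERVAL = 5  # seconds between samples
DISKS = ('sdb', 'sdc', 'sdd', 'sde')

prev = psutil.disk_io_counters(perdisk=True)
while True:
    time.sleep(INTERVAL)
    cur = psutil.disk_io_counters(perdisk=True)
    for disk in DISKS:
        if disk not in cur:
            continue
        p, c = prev[disk], cur[disk]
        d_reads = c.read_count - p.read_count
        d_writes = c.write_count - p.write_count
        d_rbytes = c.read_bytes - p.read_bytes
        d_time = (c.read_time - p.read_time) + (c.write_time - p.write_time)
        ops = d_reads + d_writes
        print('%s: %.1f reads/s, avg read %.1f KB, ~await %.1f ms' % (
            disk,
            d_reads / float(INTERVAL),
            (d_rbytes / 1024.0 / d_reads) if d_reads else 0.0,
            (d_time / float(ops)) if ops else 0.0,
        ))
    prev = cur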
Commit pushed to master at https://github.com/mozilla/socorro

https://github.com/mozilla/socorro/commit/c3aa6b28dfcf79e7b1a4dee9efce0733a21e6ee4
Merge pull request #2061 from twobraids/ceph-cache

fixed Bug 1014568 - added BotoS3 cache of connection & bucket - single bucket days
Status: NEW → RESOLVED
Closed: 11 years ago
Resolution: --- → FIXED
Target Milestone: --- → 87
Target Milestone: 87 → 88