[openstack-dev] [Ceph] Why performance of benchmarks with small blocks is extremely small?

Gregory Farnum greg at inktank.com
Tue Sep 30 18:42:12 UTC 2014


On Sat, Sep 27, 2014 at 8:14 AM, Timur Nurlygayanov
<tnurlygayanov at mirantis.com> wrote:
> Hello all,
>
> I installed OpenStack with Glance + Ceph OSD with replication factor 2 and
> now I can see that write operations are extremely slow.
> For example, I can see only 0.04 MB/s write speed when I run rados bench
> with 512b blocks:
>
> rados bench -p test 60 write --no-cleanup -t 1 -b 512
>
>  Maintaining 1 concurrent writes of 512 bytes for up to 60 seconds or 0
> objects
>  Object prefix: benchmark_data_node-17.domain.tld_15862
>    sec Cur ops   started  finished   avg MB/s   cur MB/s   last lat    avg lat
>      0       0         0         0          0          0          -          0
>      1       1        83        82  0.0400341  0.0400391   0.008465  0.0120985
>      2       1       169       168  0.0410111  0.0419922   0.080433  0.0118995
>      3       1       240       239  0.0388959   0.034668   0.008052  0.0125385
>      4       1       356       355  0.0433309  0.0566406    0.00837  0.0112662
>      5       1       472       471  0.0459919  0.0566406   0.008343  0.0106034
>      6       1       550       549  0.0446735  0.0380859   0.036639  0.0108791
>      7       1       581       580  0.0404538  0.0151367   0.008614  0.0120654
>
>
> My test environment configuration:
> Hardware servers with 1 Gbit network interfaces, 64 GB RAM and 16 CPU cores
> per node, WDC WD5003ABYX-01WERA0 HDDs.
> OpenStack with 1 controller, 1 compute and 2 Ceph nodes (Ceph runs on
> separate nodes).
> CentOS 6.5, kernel 2.6.32-431.el6.x86_64.
>
> I tested several config options for optimizations, like in
> /etc/ceph/ceph.conf:
>
> [default]
> ...
> osd_pool_default_pg_num = 1024
> osd_pool_default_pgp_num = 1024
> osd_pool_default_flag_hashpspool = true
> ...
> [osd]
> osd recovery max active = 1
> osd max backfills = 1
> filestore max sync interval = 30
> filestore min sync interval = 29
> filestore flusher = false
> filestore queue max ops = 10000
> filestore op threads = 16
> osd op threads = 16
> ...
> [client]
> rbd_cache = true
> rbd_cache_writethrough_until_flush = true
>
> and in /etc/cinder/cinder.conf:
>
> [DEFAULT]
> volume_tmp_dir=/tmp
>
> but as a result performance increased by only ~30%, which does not look like
> a huge success.
>
> Non-default mount options and TCP optimizations increase the speed by only
> about 1%:
>
> [root@node-17 ~]# mount | grep ceph
> /dev/sda4 on /var/lib/ceph/osd/ceph-0 type xfs
> (rw,noexec,nodev,noatime,nodiratime,user_xattr,data=writeback,barrier=0)
>
> [root@node-17 ~]# cat /etc/sysctl.conf
> net.core.rmem_max = 16777216
> net.core.wmem_max = 16777216
> net.ipv4.tcp_rmem = 4096 87380 16777216
> net.ipv4.tcp_wmem = 4096 65536 16777216
> net.ipv4.tcp_window_scaling = 1
> net.ipv4.tcp_timestamps = 1
> net.ipv4.tcp_sack = 1
>
>
> Do we have other ways to significantly improve Ceph storage performance?
> Any feedback and comments are welcome!

This is entirely latency-dominated, and OpenStack configuration changes
aren't going to be able to do much: you're getting about 80 sequential ops
a second out of a system that has to do two round trips over the network
and hit two hard drives on every operation. You might want to spend
some time looking at how latency, bandwidth, and concurrency are
(often not) related. :)
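
To make the arithmetic concrete, here is a rough back-of-envelope sketch.
With -t 1 the next write cannot start until the previous one completes, so
throughput is capped at (1 / per-op latency) * block size. The only measured
input below is the ~0.012 s average latency from your output; the awk
one-liner is just illustrative arithmetic (MB here means 2^20 bytes, which
is how the bench output above counts it):

    awk 'BEGIN {
        lat = 0.012;               # seconds per 512-byte write ("avg lat" column above)
        ops = 1 / lat;             # ~83 ops/s, matching the "started" column
        printf "%.0f ops/s -> %.3f MB/s\n", ops, ops * 512 / 1048576
    }'

If you want to see how much of this is concurrency rather than the cluster
itself, re-running the same bench with more parallel writers or with larger
objects (both knobs are already on your command line) should show a much
higher aggregate MB/s even though per-op latency stays roughly the same,
for example:

    rados bench -p test 60 write --no-cleanup -t 16 -b 512      # more concurrent writes
    rados bench -p test 60 write --no-cleanup -t 1 -b 4194304   # 4 MB objects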
-Greg
Software Engineer #42 @ http://inktank.com | http://ceph.com


