[Openstack-operators] Openstack Ceph Backend and Performance Information Sharing

Warren Wang warren at wangspeed.com
Fri Feb 17 15:41:56 UTC 2017


@Vahric, FYI, if you use direct I/O instead of sync writes (which is what
a database is configured for by default), you will just be measuring the
RBD cache. Look at the latency in your numbers: it is lower than is
possible for a packet to traverse the network. You'll need to use sync=1
if you want to see what the performance is like for sync writes. You can
reduce that latency with higher CPU frequencies (change the governor),
disabling C-states, a better network, the right NVMe for the journal, and
other things. In the end, we're happy to see even 500-600 IOPS for sync
writes with numjobs=1, iodepth=1 (an iodepth of 256 is unreasonable).
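
Untested sketch of what I mean (the filename and values are placeholders):

  # pin cores to their top frequency first (requires cpupower)
  sudo cpupower frequency-set -g performance

  # single job, queue depth 1, sync writes: every write waits on the OSDs
  fio --name=synctest --filename=test --ioengine=libaio --direct=1 \
      --sync=1 --bs=4k --readwrite=randwrite --numjobs=1 --iodepth=1 \
      --size=1G --group_reporting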

@Luis, since this is an OpenStack list, I assume he is accessing it via
Cinder.

Warren

On Fri, Feb 17, 2017 at 7:11 AM, Luis Periquito <periquito at gmail.com> wrote:

> There is quite a bit of information missing: how much RAM do the nodes
> have? What SSDs? What kernel (there have been complaints of a
> performance regression on 4.4+)?
>
> You also never state how you have configured the OSDs, their journals,
> filestore or bluestore, etc...
>
> You never specify how you're accessing the RBD device...
>
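> If you are going through librbd, fio's rbd engine can take the guest
> and kernel client out of the path for comparison. A rough sketch (the
> pool, image and client names are made up, and the image must already
> exist):
>
>   # drive the image directly through librbd
>   fio --name=rbdtest --ioengine=rbd --clientname=admin --pool=rbd \
>       --rbdname=testimage --bs=4k --readwrite=randwrite --numjobs=1 \
>       --iodepth=1 --size=1G
>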
> To achieve high IOPS you need higher-frequency CPUs. Also remember
> that Ceph's scale-out architecture means the more nodes you add, the
> better the performance you'll get.
>
> On Thu, Feb 16, 2017 at 4:26 PM, Vahric Muhtaryan <vahric at doruk.net.tr>
> wrote:
> > Hello All ,
> >
> > For a long time we have been testing Ceph, from Firefly to Kraken, and
> > we have tried to optimise many of the usual things: tcmalloc versions
> > 2.1 and 2.4, jemalloc, setting debug levels to 0/0, tuning the
> > op_tracker, and so on. I believe that with our hardware we have almost
> > reached the end of the road.
> >
> > Some vendor tests confused us a lot, like the one from Samsung
> > http://www.samsung.com/semiconductor/support/tools-utilities/All-Flash-Array-Reference-Design/downloads/Samsung_NVMe_SSDs_and_Red_Hat_Ceph_Storage_CS_20160712.pdf
> > , the Dell PowerEdge R730xd Performance and Sizing Guide for Red Hat …,
> > and the one from Intel
> > http://www.flashmemorysummit.com/English/Collaterals/Proceedings/2015/20150813_S303E_Zhang.pdf
> >
> > In the end we used 3 replicas (actually most vendors test with 2, but
> > I believe that is very much the wrong way to do it: when a failure
> > happens you have to wait 300 sec, which is configurable, and from blogs
> > we understood that OSDs can sometimes go down and come up again, so I
> > believe it is very important to set that number carefully, as we do not
> > want instances to freeze) with the config below, 4K, random, fully
> > write-only.
> >
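> > (The 300 sec knob we mean here is, we assume, mon_osd_down_out_interval;
> > a sketch of the settings involved, with example values only:)
> >
> >   # ceph.conf sketch -- values are examples, not recommendations
> >   [mon]
> >   # seconds before a down OSD is automatically marked out
> >   mon osd down out interval = 300
> >   # require multiple reporters before marking an OSD down, to
> >   # dampen flapping from a single bad heartbeat
> >   mon osd min down reporters = 2
> >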
> > I read a lot about the OSD process eating huge amounts of CPU; yes, it
> > does, and we know very well that we could not get the total IOPS
> > capacity of each raw SSD drive.
> >
> > My question is: can you please share test or production results from
> > the same or a similar config? The key is write, not 70% read / 30%
> > write or read-only workloads …
> >
> > Hardware :
> >
> > 6 x nodes
> > Each node has :
> > 2-socket CPUs at 1.8 GHz, 16 cores in total
> > 3 SSDs + 12 HDDs (the SSDs hold the journals, 4 HDDs per SSD)
> > RAID cards configured as RAID 0
> > We did not see any performance difference with the RAID card's JBOD
> > mode, so we continued with RAID 0
> > The RAID card's write-back cache is also in use, since it adds extra
> > IOPS too !
> >
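> > (For reference, a journal-on-SSD OSD in this kind of layout is
> > typically prepared like this; a sketch only, device names are made up:)
> >
> >   # HDD /dev/sdd as OSD data, SSD partition /dev/sda1 as its journal
> >   ceph-disk prepare --fs-type xfs /dev/sdd /dev/sda1
> >   ceph-disk activate /dev/sdd1
> >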
> > Achieved IOPS : 35 K (single client)
> > We tested with up to 10 clients, and Ceph shared the load fairly
> > evenly, at almost 4K IOPS each
> >
> > Test command : fio --randrepeat=1 --ioengine=libaio --direct=1
> > --gtod_reduce=1 --name=test --filename=test --bs=4k --iodepth=256
> > --size=1G --numjobs=8 --readwrite=randwrite --group_reporting
> >
> >
> > Regards
> > Vahric Muhtaryan
> >