[nova] local ssd disk performance
Donny Davis
donny at fortnebula.com
Mon Aug 5 14:01:46 UTC 2019
If it were me, I would use cinder, for the following reasons:
1. DB data doesn't belong on an ephemeral disk; it probably belongs on block
storage.
2. When you remove the requirement for data to be stored on ephemeral disks,
live migration is less of an issue.
3. You can tune the c-vol node providing the block storage to meet the
performance requirements of the end user's application.
To start with, I would look at the local disk performance on the block
storage node. In my case, raid5/6 with nvme disks put the cpu at 100%
because mdadm isn't multithreaded. Raid 10 provides enough performance, but
only because I created separate raid1 volumes and then put raid0 across them.
This way I get more threads from mdadm.
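
As a rough illustration of the layout described above (separate raid1 volumes with raid0 across them), here is a minimal dry-run sketch in Python. It only builds and prints mdadm command lines and does not run them. The disk names, the md device names, and the helper name nested_raid10_commands are assumptions for illustration, not taken from the setup described in this mail.

```python
# Dry-run sketch: build (but do not execute) mdadm commands for RAID 1 pairs
# with a RAID 0 stripe across them. All device names are hypothetical.
from typing import List


def nested_raid10_commands(disks: List[str], stripe_dev: str = "/dev/md100") -> List[str]:
    """Return mdadm commands: one RAID 1 array per disk pair, then RAID 0 across them."""
    if len(disks) < 4 or len(disks) % 2 != 0:
        raise ValueError("need an even number of disks, at least four")

    commands = []
    mirrors = []
    for i in range(0, len(disks), 2):
        mirror = f"/dev/md{i // 2 + 1}"  # hypothetical names: /dev/md1, /dev/md2, ...
        mirrors.append(mirror)
        commands.append(
            f"mdadm --create {mirror} --level=1 --raid-devices=2 {disks[i]} {disks[i + 1]}"
        )

    if stripe_dev in mirrors:
        raise ValueError("stripe device name collides with a mirror device name")

    # Stripe (RAID 0) across all of the mirrors created above.
    commands.append(
        f"mdadm --create {stripe_dev} --level=0 --raid-devices={len(mirrors)} "
        + " ".join(mirrors)
    )
    return commands


if __name__ == "__main__":
    example_disks = ["/dev/nvme0n1", "/dev/nvme1n1", "/dev/nvme2n1", "/dev/nvme3n1"]
    for cmd in nested_raid10_commands(example_disks):
        print(cmd)
```

Review the printed commands before running any of them by hand, since mdadm --create destroys existing data on the member devices.
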
Once local performance meets expectations, I would create a test
volume and mount it on the hypervisor manually, then tune network
performance to meet your goals. Finally, test performance inside the vm to
make sure it's doing what you want.
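
To compare the stages above, it can help to pull the aggregate bandwidth out of each fio run and express one stage as a fraction of another. The following is a small sketch under stated assumptions: the function name parse_fio_summary is made up for this example, the regex only handles summary lines that report MiB/s, and the sample text is the fio output quoted further down in this thread.

```python
import re
from typing import Dict

# Matches the aggregate bandwidth in fio's "Run status" summary lines, e.g.
#   READ: bw=2960MiB/s (3103MB/s), ...
# Assumption: only MiB/s summaries are handled; other units are ignored.
_SUMMARY = re.compile(r"(READ|WRITE):\s*bw=([\d.]+)MiB/s")


def parse_fio_summary(text: str) -> Dict[str, float]:
    """Return aggregate bandwidth in MiB/s keyed by 'READ' and 'WRITE'."""
    return {kind: float(bw) for kind, bw in _SUMMARY.findall(text)}


# Summary lines as quoted later in this thread.
local = parse_fio_summary("""
   READ: bw=2960MiB/s (3103MB/s), 185MiB/s-189MiB/s (194MB/s-198MB/s)
  WRITE: bw=2963MiB/s (3107MB/s), 185MiB/s-191MiB/s (194MB/s-200MB/s)
""")
in_vm = parse_fio_summary("""
   READ: bw=441MiB/s (463MB/s)
  WRITE: bw=441MiB/s (463MB/s)
""")

for kind in ("READ", "WRITE"):
    ratio = in_vm[kind] / local[kind]
    print(f"{kind}: in-vm {in_vm[kind]:.0f} MiB/s is {ratio:.0%} of local {local[kind]:.0f} MiB/s")
```

The same helper can be pointed at the output of a run on the hypervisor against the manually mounted test volume, which gives a third data point between local and in-vm.
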
Donny Davis
c: 805 814 6800
irc: donnyd
On Mon, Aug 5, 2019, 9:45 AM Budai Laszlo <laszlo.budai at gmail.com> wrote:
> Thank you for the info. Ours is a generic openstack deployment with its main
> storage on Ceph. One tenant required very fast storage for a NoSQL
> database, so the idea came up to add some nvme storage to a few compute
> nodes and to provide that storage to the specific tenant.
>
> We have investigated different options in providing this.
>
> 1. The ssd managed by nova as LVM
> 2. the ssd managed by cinder and use the instance locality filter
> 3. the ssd mounted on /var/lib/nova/instances and the ephemeral disk
> managed by nova.
>
> Kind regards,
> Laszlo
>
> On 8/5/19 3:08 PM, Donny Davis wrote:
> > I am happy to share numbers from my iscsi setup. However, these numbers
> probably won't mean much for your workloads. I tuned my openstack to
> perform as well as possible for a specific workload (Openstack CI), so some
> of the things I have put my efforts into are for CI work and not really
> relevant to general-purpose use. Also, your cinder performance hinges
> greatly on your network's capabilities. I use a dedicated nic for iscsi
> traffic, and the MTU is set to 9000 for every device in the iscsi path.
> *Only* that nic is set to MTU 9000, because if the rest of the openstack
> network is, it can create more problems than it solves. My network spine is
> 40G, and each compute node has 4 10G nics. I only use one nic for iscsi
> traffic. The block storage node has two 40G nics.
> >
> > With that said, I use the fio tool to benchmark performance on linux
> systems.
> > Here is the command I use to run the benchmark:
> >
> > fio --numjobs=16 --randrepeat=1 --ioengine=libaio --direct=1
> --gtod_reduce=1 --name=test --filename=test --bs=64k --iodepth=32
> --size=10G --readwrite=randrw --rwmixread=50
> >
> > From the block storage node locally
> >
> > Run status group 0 (all jobs):
> > READ: bw=2960MiB/s (3103MB/s), 185MiB/s-189MiB/s (194MB/s-198MB/s),
> io=79.9GiB (85.8GB), run=26948-27662msec
> > WRITE: bw=2963MiB/s (3107MB/s), 185MiB/s-191MiB/s (194MB/s-200MB/s),
> io=80.1GiB (85.0GB), run=26948-27662msec
> >
> > From inside a vm
> >
> > Run status group 0 (all jobs):
> > READ: bw=441MiB/s (463MB/s), 73.4MiB/s-73.0MiB/s (76.0MB/s-77.6MB/s),
> io=30.0GiB (32.2GB), run=69242-69605msec
> > WRITE: bw=441MiB/s (463MB/s), 73.4MiB/s-73.0MiB/s (76.0MB/s-77.6MB/s),
> io=29.0GiB (32.2GB), run=69242-69605msec
> >
> > The vm side of the test is able to push pretty close to the limits of
> the nic. My cloud also currently has a full workload on it; as I have
> learned while working to optimize a cloud for CI, it does matter whether
> there is a workload or not.
> >
> >
> > Are you using raid for your ssds? If so, what type?
> >
> > Do you mind sharing what workload will go on your Openstack deployment?
> > Is it DB, web, general purpose, etc.
> >
> > ~/Donny D
> >
> > On Mon, Aug 5, 2019 at 4:54 AM Budai Laszlo <laszlo.budai at gmail.com>
> wrote:
> >
> > Hi,
> >
> > well, we used the same command to measure the different storage
> possibilities (sudo iozone -e -I -t 32 -s 100M -r 4k -i 0 -i 1 -i 2), we
> have measured the disk mounted directly on the host, and we have used the
> same command to measure the performance in the guests using different ways
> to attach the storage to the VM.
> >
> > for instance on the host we were able to measure 408MB/s initial
> writes, 420MB/s rewrites, 397MB/s Random writes, and 700MB/s random reads,
> on the guest we got the following, using the different technologies:
> >
> > 1. Ephemeral served by nova (SSD mounted on /var/lib/nova/instances,
> images_type = raw, without image preallocation)
> > Initial writes 60MB/s, rewrites 70MB/s, random writes 73MB/s, random
> reads 427MB/s.
> >
> > 2. Ephemeral served by nova (images_type = lvm, without image
> preallocation)
> > Initial writes 332MB/s, rewrites 416MB/s, random writes 417MB/s,
> random reads 550MB/s.
> >
> > 3. Cinder attached LVM with instance locality
> > Initial writes 148MB/s, rewrites 151MB/s, random writes 149MB/s,
> random reads 160MB/s.
> >
> > 4. Cinder attached LVM without instance locality
> > Initial writes 103MB/s, rewrites 109MB/s, random writes 103MB/s,
> random reads 105MB/s.
> >
> > 5. Ephemeral served by nova (SSD mounted on /var/lib/nova/instances,
> images_type = raw, with image preallocation)
> > Initial writes 348MB/s, rewrites 400MB/s, random writes 393MB/s,
> random reads 553MB/s.
> >
> >
> > So points 3 and 4 are using iSCSI. As you can see, those numbers are far
> below the local volume based case or the local file based case with
> preallocated images.
> >
> > Could you share some numbers about the performance of your iSCSI
> based setup? That would allow us to see whether we are doing something
> wrong related to iSCSI. Thank you.
> >
> > Kind regards,
> > Laszlo
> >
> >
> > On 8/3/19 8:41 PM, Donny Davis wrote:
> > > I am using the cinder-lvm backend right now and performance is
> quite good. My situation is similar without the migration parts. Prior to
> this arrangement I was using iscsi to mount a disk in
> /var/lib/nova/instances and that also worked quite well.
> > >
> > > If you don't mind me asking, what kind of i/o performance are
> you looking for?
> > >
> > > On Fri, Aug 2, 2019 at 12:25 PM Budai Laszlo <
> laszlo.budai at gmail.com> wrote:
> > >
> > > Thank you Daniel,
> > >
> > > My colleague found the same solution in the meantime. And that
> helped us as well.
> > >
> > > Kind regards,
> > > Laszlo
> > >
> > > On 8/2/19 6:50 PM, Daniel Speichert wrote:
> > > > For the case of simply using local disk mounted for
> /var/lib/nova and raw disk image type, you could try adding to nova.conf:
> > > >
> > > > preallocate_images = space
> > > >
> > > > This implicitly changes the I/O method in libvirt from
> "threads" to "native", which in my case improved performance a lot (10
> times) and generally is the best performance I could get.
> > > >
> > > > Best Regards
> > > > Daniel
> > > >
> > > > On 8/2/2019 10:53, Budai Laszlo wrote:
> > > >> Hello all,
> > > >>
> > > >> we have a problem with the performance of the disk IO in a
> KVM instance.
> > > >> We are trying to provision VMs with high performance SSDs.
> We have investigated different possibilities with different results ...
> > > >>
> > > >> 1. configure Nova to use local LVM storage (images_type =
> lvm) - provided the best performance, but we could not migrate our
> instances (seems to be a bug).
> > > >> 2. use cinder with lvm backend and instance locality. We
> could migrate the instances, but the performance is less than half of the
> previous case.
> > > >> 3. mount the ssd on /var/lib/nova/instances and use
> images_type = raw in nova. We could migrate, but the write performance
> dropped to ~20% of the images_type = lvm performance and read performance
> is ~65% of the lvm case.
> > > >>
> > > >> Do you have any ideas for improving the performance of
> cases 2 or 3, which allow migration?
> > > >>
> > > >> Kind regards,
> > > >> Laszlo
> > > >>
> > >
> > >
> >
>
>