If it were me, I would use Cinder. My reasons are as follows:

DB data doesn't belong on an ephemeral disk; it probably belongs on block storage.

When you remove the requirements for data to be stored on ephemeral disks, live migration is less of an issue. 

You can tune the c-vol node providing the block storage to meet the performance requirements of the end user's application.

To start with, I would look at the local disk performance on the block storage node. In my case, RAID 5/6 with NVMe disks put the CPU at 100% because mdadm isn't multithreaded. RAID 10 provides enough performance, but only because I created separate RAID 1 volumes and then put RAID 0 across them. This way I get more threads from mdadm.
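
As a rough sketch of that layout (RAID 1 pairs with one RAID 0 striped across them), the Python helper below only builds the mdadm command lines so they can be reviewed; it does not run anything. The function name, the device names and the md numbering are made up for illustration, so adjust them to your hardware.

```python
from typing import List


def nested_raid10_commands(disks: List[str], first_md: int = 1) -> List[str]:
    """Build mdadm commands for RAID 1 pairs with one RAID 0 stripe on top.

    Nothing is executed; the commands are returned as strings for review.
    """
    if len(disks) < 4 or len(disks) % 2 != 0:
        raise ValueError("need an even number of disks, at least four")

    commands = []
    mirrors = []
    # Pair the disks up: each pair becomes its own RAID 1 array.
    for i in range(0, len(disks), 2):
        md = f"/dev/md{first_md + i // 2}"
        mirrors.append(md)
        commands.append(
            f"mdadm --create {md} --level=1 --raid-devices=2 "
            f"{disks[i]} {disks[i + 1]}"
        )
    # Stripe across the mirrors with a single RAID 0 array.
    stripe = f"/dev/md{first_md + len(mirrors)}"
    commands.append(
        f"mdadm --create {stripe} --level=0 "
        f"--raid-devices={len(mirrors)} {' '.join(mirrors)}"
    )
    return commands


if __name__ == "__main__":
    # Hypothetical NVMe device names.
    for cmd in nested_raid10_commands(
        ["/dev/nvme0n1", "/dev/nvme1n1", "/dev/nvme2n1", "/dev/nvme3n1"]
    ):
        print(cmd)
```

With four disks this yields two mirror arrays and one stripe, i.e. three separate md arrays instead of a single RAID 10 array.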

Once local performance meets expectations, I would create a test volume and mount it on the hypervisor manually. Then tune network performance to meet your goals. Finally, test performance inside the VM to make sure it's doing what you want.
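
To keep track of where throughput is lost between those stages, a small script like the one below can help. It is only a sketch: the function and stage names are mine, and the example figures are the fio READ numbers quoted further down this thread (2960 MiB/s locally on the block storage node, 441 MiB/s inside a VM). It also converts each figure to Gbit/s so it can be set against the NIC speed; protocol overhead is ignored.

```python
MIB = 1024 * 1024  # bytes in one MiB


def mib_per_s_to_gbit(mib_per_s: float) -> float:
    """Convert MiB/s to Gbit/s (decimal gigabits, as NIC speeds are quoted)."""
    return mib_per_s * MIB * 8 / 1e9


def report(stages: dict, baseline: str) -> list:
    """Return one summary line per stage, relative to the baseline stage."""
    base = stages[baseline]
    lines = []
    for name, mib_per_s in stages.items():
        lines.append(
            f"{name}: {mib_per_s:.0f} MiB/s, "
            f"{100 * mib_per_s / base:.1f}% of {baseline}, "
            f"{mib_per_s_to_gbit(mib_per_s):.2f} Gbit/s"
        )
    return lines


if __name__ == "__main__":
    # READ bandwidth from the fio runs quoted below in this thread.
    measured = {"local": 2960, "vm": 441}
    for line in report(measured, baseline="local"):
        print(line)
```

Adding an entry for the volume mounted manually on the hypervisor would show whether the drop happens on the network path or inside the guest.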



Donny Davis
c: 805 814 6800
irc: donnyd

On Mon, Aug 5, 2019, 9:45 AM Budai Laszlo <laszlo.budai@gmail.com> wrote:
Thank you for the info. Ours is a generic OpenStack deployment with its main storage on Ceph. One tenant required very fast storage for a NoSQL database, so the idea came up to add some NVMe storage to a few compute nodes and to provide that storage to the specific tenant.

We have investigated different options in providing this.

1. The SSD managed by nova as LVM
2. The SSD managed by cinder, using the instance locality filter
3. The SSD mounted on /var/lib/nova/instances, with the ephemeral disk managed by nova

Kind regards,
Laszlo

On 8/5/19 3:08 PM, Donny Davis wrote:
> I am happy to share numbers from my iSCSI setup. However, these numbers probably won't mean much for your workloads. I tuned my OpenStack to perform as well as possible for a specific workload (OpenStack CI), so some of the things I have put my efforts into are for CI work and not really relevant to general-purpose use. Also, your Cinder performance hinges greatly on your network's capabilities. I use a dedicated NIC for iSCSI traffic, and the MTU is set to 9000 on every device in the iSCSI path. *Only* that NIC is set to MTU 9000, because setting it on the rest of the OpenStack network can create more problems than it solves. My network spine is 40G, and each compute node has four 10G NICs. I only use one NIC for iSCSI traffic. The block storage node has two 40G NICs.
>
> With that said, I use the fio tool to benchmark performance on Linux systems.
> Here is the command I use to run the benchmark:
>
> fio --numjobs=16 --randrepeat=1 --ioengine=libaio --direct=1 --gtod_reduce=1 --name=test --filename=test --bs=64k --iodepth=32 --size=10G --readwrite=randrw --rwmixread=50
>
> From the block storage node locally
>
> Run status group 0 (all jobs):
>    READ: bw=2960MiB/s (3103MB/s), 185MiB/s-189MiB/s (194MB/s-198MB/s), io=79.9GiB (85.8GB), run=26948-27662msec
>   WRITE: bw=2963MiB/s (3107MB/s), 185MiB/s-191MiB/s (194MB/s-200MB/s), io=80.1GiB (85.0GB), run=26948-27662msec
>
> From inside a vm
>
> Run status group 0 (all jobs):
>    READ: bw=441MiB/s (463MB/s), 73.4MiB/s-73.0MiB/s (76.0MB/s-77.6MB/s), io=30.0GiB (32.2GB), run=69242-69605msec
>   WRITE: bw=441MiB/s (463MB/s), 73.4MiB/s-73.0MiB/s (76.0MB/s-77.6MB/s), io=29.0GiB (32.2GB), run=69242-69605msec
>
> The VM side of the test is able to push pretty close to the limits of the NIC. My cloud also currently has a full workload on it; as I have learned while working to optimize the cloud for CI, it does matter whether there is a workload or not.
>
>
> Are you using RAID for your SSDs? If so, what type?
>
> Do you mind sharing what workload will go on your OpenStack deployment?
> Is it DB, web, general purpose, etc.?
>
> ~/Donny D
>
> On Mon, Aug 5, 2019 at 4:54 AM Budai Laszlo <laszlo.budai@gmail.com> wrote:
>
>     Hi,
>
>     Well, we used the same command to measure the different storage options (sudo iozone -e -I -t 32 -s 100M -r 4k -i 0 -i 1 -i 2). We measured the disk mounted directly on the host, and then ran the same command in the guests, attaching the storage to the VM in different ways.
>
>     For instance, on the host we measured 408MB/s initial writes, 420MB/s rewrites, 397MB/s random writes, and 700MB/s random reads. On the guest we got the following with the different technologies:
>
>     1. Ephemeral served by nova (SSD mounted on /var/lib/nova/instances, images_type = raw, without preallocate_images)
>     Initial writes 60MB/s, rewrites 70MB/s, random writes 73MB/s, random reads 427MB/s.
>
>     2. Ephemeral served by nova (images_type = lvm, without preallocate_images)
>     Initial writes 332MB/s, rewrites 416MB/s, random writes 417MB/s, random reads 550MB/s.
>
>     3. Cinder attached LVM with instance locality
>     Initial writes 148MB/s, rewrites 151MB/s, random writes 149MB/s, random reads 160MB/s.
>
>     4. Cinder attached LVM without instance locality
>     Initial writes 103MB/s, rewrites 109MB/s, random writes 103MB/s, random reads 105MB/s.
>
>     5. Ephemeral served by nova (SSD mounted on /var/lib/nova/instances, images_type = raw, with preallocate_images)
>     Initial writes 348MB/s, rewrites 400MB/s, random writes 393MB/s, random reads 553MB/s.
>
>
>     Points 3 and 4 use iSCSI. As you can see, those numbers are far below the local volume-based setup and the local file-based setup with preallocated images.
>
>     Could you share some numbers about the performance of your iSCSI-based setup? That would allow us to see whether we are doing something wrong with iSCSI. Thank you.
>
>     Kind regards,
>     Laszlo
>
>
>     On 8/3/19 8:41 PM, Donny Davis wrote:
>     > I am using the cinder-lvm backend right now and performance is quite good. My situation is similar without the migration parts. Prior to this arrangement I was using iscsi to mount a disk in /var/lib/nova/instances and that also worked quite well. 
>     >
>     > If you don't mind me asking, what kind of i/o performance are you looking for?
>     >
>     > On Fri, Aug 2, 2019 at 12:25 PM Budai Laszlo <laszlo.budai@gmail.com> wrote:
>     >
>     >     Thank you Daniel,
>     >
>     >     My colleague found the same solution in the meantime. And that helped us as well.
>     >
>     >     Kind regards,
>     >     Laszlo
>     >
>     >     On 8/2/19 6:50 PM, Daniel Speichert wrote:
>     >     > For the case of simply using local disk mounted for /var/lib/nova and raw disk image type, you could try adding to nova.conf:
>     >     >
>     >     >     preallocate_images = space
>     >     >
>     >     > This implicitly changes the I/O method in libvirt from "threads" to "native", which in my case improved performance a lot (10 times) and generally is the best performance I could get.
>     >     >
>     >     > Best Regards
>     >     > Daniel
>     >     >
>     >     > On 8/2/2019 10:53, Budai Laszlo wrote:
>     >     >> Hello all,
>     >     >>
>     >     >> we have a problem with the performance of the disk IO in a KVM instance.
>     >     >> We are trying to provision VMs with high performance SSDs. We have investigated different possibilities with different results ...
>     >     >>
>     >     >> 1. Configure Nova to use local LVM storage (images_type = lvm). This provided the best performance, but we could not migrate our instances (seems to be a bug).
>     >     >> 2. Use cinder with the LVM backend and instance locality. We could migrate the instances, but the performance is less than half of the previous case.
>     >     >> 3. Mount the SSD on /var/lib/nova/instances and use images_type = raw in nova. We could migrate, but the write performance dropped to ~20% of the images_type = lvm performance, and read performance is ~65% of the LVM case.
>     >     >>
>     >     >> Do you have any ideas for improving the performance of case 2 or 3, both of which allow migration?
>     >     >>
>     >     >> Kind regards,
>     >     >> Laszlo
>     >     >>
>     >
>     >
>