Hi,

Thank you once again for your valuable suggestions. I conducted another round of tests with C-states disabled. I checked the BIOS settings and added the suggested line to the grub startup. After rebooting my compute node, I observed an improvement in performance, reaching around 20,000 IOPS. Although there was a modest performance boost, it wasn't as substantial as I had anticipated.
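As a sanity check, the cpuidle sysfs entries can confirm that the deeper states really are off after the grub change. The snippet below is purely illustrative: it mimics the sysfs layout in a temporary directory (on the real host the glob would be /sys/devices/system/cpu/cpu*/cpuidle/state*), since the actual hardware state can't be assumed here.

```shell
# Illustrative sketch only: mimic the sysfs cpuidle layout in a temp dir and
# check that every C-state deeper than C1/C1E reports disable=1.
sysfs=$(mktemp -d)
mkdir -p "$sysfs/state2"
echo "C6" > "$sysfs/state2/name"     # a deep state
echo "1"  > "$sysfs/state2/disable"  # 1 = disabled, 0 = enabled
verdict=ok
for d in "$sysfs"/state*; do
  name=$(cat "$d/name")
  case "$name" in
    POLL|C1|C1E) continue ;;         # shallow states may stay enabled
  esac
  if [ "$(cat "$d/disable")" != "1" ]; then
    verdict="$name still enabled"
  fi
done
echo "$verdict"
```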
Additionally, I configured a ramdisk to establish a baseline for comparison. The results were significant: the ramdisk achieved approximately 72,000 IOPS [1][2], although I had initially expected even higher figures. Regardless, numbers like that would be highly beneficial for my NVMe-backed virtual machines.
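For reference, the baseline was measured with an fio job along these lines; the exact parameters are in [2], so the file path, size, and queue depth below are assumptions rather than the real values:

```shell
# Hedged sketch of the benchmark invocation (actual parameters are in [2]);
# 4k random reads with O_DIRECT against the instances directory.
fio --name=randread --filename=/var/lib/nova/instances/fio.test \
    --rw=randread --bs=4k --iodepth=32 --numjobs=1 --direct=1 \
    --ioengine=libaio --size=4G --runtime=60 --time_based --group_reporting
```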
Nonetheless, I'm at a loss regarding further optimizations. I've explored resources such as https://docs.openstack.org/nova/rocky/user/flavors.html, which outlines IO limits. However, my impression is that these limits can only restrict performance rather than enhance it. Could you kindly confirm whether my understanding is accurate?
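Assuming the flavor quotas work as documented, the quota:disk_* properties translate into libvirt <iotune> throttles, which can only cap IOPS, never raise them. A quick way to check whether such a cap is active is to look for <iotune> in the instance's domain XML; the snippet below demonstrates the check against a hand-written, hypothetical throttled domain:

```shell
# Sketch: flavor quota:disk_* extra specs surface as <iotune> elements in the
# libvirt domain XML, so their absence means Nova imposes no IOPS cap.
# The XML here is a fabricated example of a throttled disk.
cat > /tmp/example-domain.xml <<'EOF'
<domain>
  <devices>
    <disk type='file' device='disk'>
      <iotune><read_iops_sec>20000</read_iops_sec></iotune>
    </disk>
  </devices>
</domain>
EOF
if grep -q '<iotune>' /tmp/example-domain.xml; then
  throttled=yes
else
  throttled=no
fi
echo "throttled=$throttled"
```

On a real host the XML would come from `virsh dumpxml <instance>` instead of the hand-written example.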
I extend my gratitude in advance for any forthcoming suggestions. It's possible that I might be searching in the wrong places for solutions.
/Jan Wasilewski

References:
[2] fio results of a VM where the ramdisk is the main disk: https://paste.openstack.org/show/bdII56cavmVwNAIq3axQ/

On Thu, 10 Aug 2023 at 14:05, Damian Pietras <damian.pietras@hardit.pl> wrote:
Hi,

You wrote "/sys/devices/system/cpu/*/cpuidle/state*/disable output is 0 for all cpus". That means all C-states (power-saving states) are _enabled_.
This may cause lower and inconsistent results. I would repeat the test with deeper C-states disabled. I think the simplest way to do that is to boot
the system (nova compute node) with "intel_idle.max_cstate=1" added to the kernel command line parameters. I had
similar issues with I/O performance inside VMs (but with a lower disk
queue depth), and power saving / frequency scaling had the greatest influence
on the results and also caused variations in the results between test
runs. If you are out of ideas, you could also rule out disk / filesystem
/ RAID configuration influence by temporarily mounting tmpfs on
/var/lib/nova/instances so the instances will have RAM-backed volumes.
You need enough RAM for that, of course.
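The tmpfs experiment could look roughly like this (a sketch only: the service name assumes a stock Ubuntu nova-compute install, the size is just an example, and root is required):

```shell
# Sketch: temporarily back instance disks with RAM to rule out disk/RAID.
systemctl stop nova-compute          # or migrate the instances off first
mount -t tmpfs -o size=64G tmpfs /var/lib/nova/instances   # size is an example
systemctl start nova-compute
# create a fresh instance, run the fio benchmark inside it, then undo:
umount /var/lib/nova/instances
```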

On 10/08/2023 13:35, Jan Wasilewski wrote:
> Hi,
>
> I wanted to express my sincere gratitude for all the help and advice
> you've given me. I followed your suggestions and carried out a bunch
> of tests, but unfortunately, the performance boost I was hoping for
> hasn't materialized.
>
> Let me break down the configurations I've tried and the results I've
> got. Just to give you some context, all my tests were done using two
> INTEL SSDPE2MD400G4 NVMe disks and Ubuntu 20.04LTS as the OS on the
> compute node. You can find all the nitty-gritty details in [1] and
> [2]. Additionally, I've shared the results of the fio tests directly
> executed on the RAID directory within the compute node in [3].
>
> Then, I expanded my testing to instances, and here's what I found:
>
>  1. I tested things out with the default settings and Ubuntu 22.04 LTS
>     image. The IOPS results were hovering around 18-18.5k. Check out
>     [4] and [5] for the specifics.
>  2. I tweaked the nova.conf file with two changes: force_raw_images =
>     true and images_type = flat. Unfortunately, this only brought the
>     IOPS down a bit, to just under 18k. You can see more in [6] and [7].
>  3. I made an extra change in nova.conf by switching the cpu_model
>     from SandyBridge to IvyBridge. This change dropped the IOPS
>     further, to around 17k. Details are in [8] and [9].
>  4. Lastly, I played around with image properties, setting
>     hw_scsi_model=virtio-scsi and hw_disk_bus=scsi. However, this also
>     resulted in around 17k IOPS. You can find out more in [10] and [11].
>
> It's a bit disheartening that none of these changes seemed to have the
> impact I was aiming for. So, I'm starting to think there might be a
> crucial piece of the puzzle that I'm missing here. If you have any
> ideas or insights, I'd be incredibly grateful for your input.
>
> Thanks once more for all your help and support.
>
> /Jan Wasilewski
>
>
> References:
> [1] Disk details and raid details:
> https://paste.openstack.org/show/bRyLPZ6TDHpIEKadLC7z/
> [2] Compute node and nova details:
> https://paste.openstack.org/show/bcGw3Glm6U0r1kUsg8nU/
> [3] fio results executed in raid directory inside compute node:
> https://paste.openstack.org/show/bN0EkBjoAP2Ig5PSSfy3/
> [4] dumpxml of instance from test 1:
> https://paste.openstack.org/show/bVSq8tz1bSMdiYXcF3IP/
> [5] fio results from test 1:
> https://paste.openstack.org/show/bKlxom8Yl7NtHO8kO53a/
> [6] dumpxml of instance from test 2:
> https://paste.openstack.org/show/bN2JN9DXT4DGKNZnzkJ8/
> [7] fio results from test 2:
> https://paste.openstack.org/show/b7GXIVI2Cv0qkVLQaAF3/
> [8] dumpxml of instance from test 3:
> https://paste.openstack.org/show/b0821V4IUq8N7YPb73sg/
> [9] fio results from test 3:
> https://paste.openstack.org/show/bT1Erfxq4XTj0ubTTgdj/
> [10] dumpxml of instance from test 4:
> https://paste.openstack.org/show/bjTXM0do1xgzmVZO02Q7/
> [11] fio results from test 4:
> https://paste.openstack.org/show/bpbVJntkR5aNke3trtRd/
>
>
> On Wed, 9 Aug 2023 at 19:56, Damian Pietras <damian.pietras@hardit.pl>
> wrote:
>
>     I would suggest:
>
>     - make sure that the "none" I/O scheduler is used inside the VM (e.g.
>     /sys/block/sda/queue/scheduler). I assume a fairly recent kernel;
>     otherwise use "noop".
>
>     - make sure that the host has CPU C-states deeper than C1 disabled (check
>     the values of all /sys/devices/system/cpu/*/cpuidle/state*/disable for
>     states whose [..]/name is different from "POLL", C1, or C1E) or use a
>     tool that disables them.
>
>     - Use raw images instead of qcow2: in [libvirt] section of
>     nova.conf set force_raw_images=True and images_type=flat and
>     recreate the instance
>
>     Is the difference so big also when you lower I/O depth (for
>     example to 1) or increase the block size (for example to 64k)?
>
>
>     On 09/08/2023 10:02, Jan Wasilewski wrote:
>>     Hi,
>>
>>     I am reaching out to inquire about the performance of our local
>>     storage setup. Currently, I am conducting tests using NVMe disks;
>>     however, the results appear to be underwhelming.
>>
>>     In terms of my setup, I have recently incorporated two NVMe disks
>>     into my compute node. These disks have been configured as RAID1
>>     under md127 and subsequently mounted at /var/lib/nova/instances
>>     [1]. During benchmarking using the fio tool within this
>>     directory, I am achieving approximately 160,000 IOPS [2]. This
>>     figure serves as a satisfactory baseline and reference point for
>>     upcoming VM tests.
>>
>>     As the next phase, I have established a flavor that employs a
>>     root disk for my virtual machine [3]. Regrettably, the resulting
>>     performance yields around 18,000 IOPS, which is nearly ten times
>>     poorer than the compute node results [4]. While I expected some
>>     degradation, a tenfold decrease seems excessive. Realistically, I
>>     anticipated no more than a twofold reduction compared to the
>>     compute node's performance. Hence, I am led to ask: what should
>>     be configured to enhance performance?
>>
>>     I have already experimented with the settings recommended on the
>>     Ceph page for image properties [5]; however, these changes did
>>     not yield the desired improvements. In addition, I attempted to
>>     modify the CPU architecture within the nova.conf file, switching
>>     to Cascade Lake architecture, yet this endeavor also proved
>>     ineffective. For your convenience, I have included a link to my
>>     current dumpxml results [6].
>>
>>     Your insights and guidance would be greatly appreciated. I am
>>     confident that there is a solution to this performance disparity
>>     that I may have overlooked. Thank you in advance for your help.
>>
>>     /Jan Wasilewski
>>
>>     References:
>>     [1] nvme allocation and raid configuration:
>>     https://paste.openstack.org/show/bMMgGqu5I6LWuoQWV7TV/
>>     [2] fio performance inside compute node:
>>     https://paste.openstack.org/show/bcMi4zG7QZwuJZX8nyct/
>>     [3] Flavor configuration:
>>     https://paste.openstack.org/show/b7o9hCKilmJI3qyXsP5u/
>>     [4] fio performance inside VM:
>>     https://paste.openstack.org/show/bUjqxfU4nEtSFqTlU8oH/
>>     [5] image properties:
>>     https://docs.ceph.com/en/pacific/rbd/rbd-openstack/#image-properties
>>     [6] dumpxml of vm:
>>     https://paste.openstack.org/show/bRECcaSMqa8TlrPp0xrT/
>
>     --
>     Damian Pietras
>
--
Damian Pietras