Hi,

You wrote "/sys/devices/system/cpu/*/cpuidle/state*/disable output is 0 for all cpus". That means all C-states (power-saving states) are _enabled_. This can cause lower and inconsistent results. I would repeat the test with deeper C-states disabled. I think the simplest way to do that is to boot the system (the Nova compute node) with "intel_idle.max_cstate=1" added to the kernel command-line parameters.

I had similar issues with I/O performance inside VMs (though with a lower disk queue depth), and power saving / frequency scaling had the greatest influence on the results; it also caused variation between test runs.

If you are out of ideas, you could also rule out disk / filesystem / RAID configuration influence by temporarily mounting tmpfs at /var/lib/nova/instances so the instances have RAM-backed volumes. You need enough RAM for that, of course.

On 10/08/2023 13:35, Jan Wasilewski wrote:
Hi,
I wanted to express my sincere gratitude for all the help and advice you've given me. I followed your suggestions and carried out a bunch of tests, but unfortunately, the performance boost I was hoping for hasn't materialized.
Let me break down the configurations I've tried and the results I've got. Just to give you some context, all my tests were done using two INTEL SSDPE2MD400G4 NVMe disks, with Ubuntu 20.04 LTS as the OS on the compute node. You can find all the nitty-gritty details in [1] and [2]. Additionally, I've shared the results of the fio tests executed directly on the RAID directory within the compute node in [3].
Then, I expanded my testing to instances, and here's what I found:
1. I tested things out with the default settings and an Ubuntu 22.04 LTS image. The IOPS results were hovering around 18-18.5k. Check out [4] and [5] for the specifics.
2. I tweaked the nova.conf file with two changes: force_raw_images = true and images_type = flat. Unfortunately, this only brought the IOPS down a bit, to just under 18k. You can see more in [6] and [7].
3. I made an extra change in nova.conf by switching the cpu_model from SandyBridge to IvyBridge. This dropped the IOPS further, to around 17k. Details are in [8] and [9].
4. Lastly, I played around with image properties, setting hw_scsi_model=virtio-scsi and hw_disk_bus=scsi. However, this also resulted in around 17k IOPS. You can find out more in [10] and [11].
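For reference, the nova.conf changes from tests 2 and 3 would look roughly like this. Section placement follows the usual Nova configuration layout, and option names can vary between Nova releases (e.g. cpu_model vs cpu_models), so treat this as a sketch rather than a verified config:

```ini
# nova.conf (compute node) - settings tried in tests 2 and 3
[DEFAULT]
# Convert downloaded qcow2 images to raw so instance disks are raw
force_raw_images = true

[libvirt]
# Store instance disks as flat raw files instead of qcow2
images_type = flat
# CPU model exposed to guests (test 3 switched SandyBridge -> IvyBridge)
cpu_mode = custom
cpu_model = IvyBridge
```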
It's a bit disheartening that none of these changes seemed to have the impact I was aiming for. So, I'm starting to think there might be a crucial piece of the puzzle that I'm missing here. If you have any ideas or insights, I'd be incredibly grateful for your input.
Thanks once more for all your help and support.
Jan Wasilewski
References:
[1] Disk details and raid details: https://paste.openstack.org/show/bRyLPZ6TDHpIEKadLC7z/
[2] Compute node and nova details: https://paste.openstack.org/show/bcGw3Glm6U0r1kUsg8nU/
[3] fio results executed in raid directory inside compute node: https://paste.openstack.org/show/bN0EkBjoAP2Ig5PSSfy3/
[4] dumpxml of instance from test 1: https://paste.openstack.org/show/bVSq8tz1bSMdiYXcF3IP/
[5] fio results from test 1: https://paste.openstack.org/show/bKlxom8Yl7NtHO8kO53a/
[6] dumpxml of instance from test 2: https://paste.openstack.org/show/bN2JN9DXT4DGKNZnzkJ8/
[7] fio results from test 2: https://paste.openstack.org/show/b7GXIVI2Cv0qkVLQaAF3/
[8] dumpxml of instance from test 3: https://paste.openstack.org/show/b0821V4IUq8N7YPb73sg/
[9] fio results from test 3: https://paste.openstack.org/show/bT1Erfxq4XTj0ubTTgdj/
[10] dumpxml of instance from test 4: https://paste.openstack.org/show/bjTXM0do1xgzmVZO02Q7/
[11] fio results from test 4: https://paste.openstack.org/show/bpbVJntkR5aNke3trtRd/
Wed, 9 Aug 2023 at 19:56 Damian Pietras <damian.pietras@hardit.pl> wrote:
I would suggest the following:
- make sure that the "none" I/O scheduler is used inside the VM (e.g. check /sys/block/sda/queue/scheduler). I assume a fairly recent kernel; on older kernels the equivalent is "noop".
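Inside the guest, the scheduler can be inspected and switched at runtime through sysfs. A minimal sketch, assuming the disk shows up as sda (use vda or similar for a virtio-blk device):

```shell
# Inside the VM: inspect and switch the I/O scheduler
dev=/sys/block/sda/queue/scheduler
if [ -e "$dev" ]; then
  cat "$dev"                                  # active scheduler is shown in [brackets]
  { echo none > "$dev"; } 2>/dev/null \
    || echo "run as root to change the scheduler"
fi
# Note: this resets on reboot; persist it with a udev rule or kernel parameter.
```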
- make sure the host has CPU C-states deeper than C1 disabled (check the values of all /sys/devices/system/cpu/*/cpuidle/state*/disable where the corresponding [..]/name is anything other than "POLL", "C1", or "C1E"), or use a tool that disables them.
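The check-and-disable step can be scripted as a loop over the cpuidle sysfs interface. A sketch, to be run as root on the host; the change takes effect immediately but does not survive a reboot (booting with intel_idle.max_cstate=1 caps the states persistently):

```shell
# Disable every C-state deeper than C1 on all host CPUs
disabled=0
for st in /sys/devices/system/cpu/cpu*/cpuidle/state*; do
  [ -d "$st" ] || continue
  case "$(cat "$st/name")" in
    POLL|C1|C1E) ;;                                   # leave shallow states enabled
    *) { echo 1 > "$st/disable"; } 2>/dev/null && disabled=$((disabled + 1)) ;;
  esac
done
echo "disabled $disabled deeper C-states"
```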
- use raw images instead of qcow2: in nova.conf set force_raw_images=True (in the [DEFAULT] section) and images_type=flat (in the [libvirt] section), then recreate the instance.
Is the difference as large when you lower the I/O depth (for example, to 1) or increase the block size (for example, to 64k)?
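The comparison runs could look like the fio invocations below. The job names, file path, size, and runtime are illustrative, and the RUN_FIO guard is only there so the snippet is safe to paste without kicking off a benchmark; drop it when running for real:

```shell
# Guard so the commands only run when explicitly requested
[ "${RUN_FIO:-0}" = "1" ] || { echo "set RUN_FIO=1 to run the benchmarks"; exit 0; }

# Queue depth 1: measures per-request latency rather than parallelism
fio --name=qd1 --filename="${TMPDIR:-/tmp}/fio.test" --size=256M \
    --rw=randwrite --bs=4k --ioengine=libaio --direct=1 --iodepth=1 \
    --runtime=30 --time_based --group_reporting

# Larger blocks: shifts the bottleneck from IOPS toward bandwidth
fio --name=bs64k --filename="${TMPDIR:-/tmp}/fio.test" --size=256M \
    --rw=randwrite --bs=64k --ioengine=libaio --direct=1 --iodepth=32 \
    --runtime=30 --time_based --group_reporting
```

If queue depth 1 shows a much smaller gap, the bottleneck is per-request overhead (virtualization latency, C-state wakeups) rather than raw device throughput.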
On 09/08/2023 10:02, Jan Wasilewski wrote:
Hi,
I am reaching out to inquire about the performance of our local storage setup. Currently, I am conducting tests using NVMe disks; however, the results appear to be underwhelming.
In terms of my setup, I have recently incorporated two NVMe disks into my compute node. These disks have been configured as RAID1 under md127 and subsequently mounted at /var/lib/nova/instances [1]. During benchmarking using the fio tool within this directory, I am achieving approximately 160,000 IOPS [2]. This figure serves as a satisfactory baseline and reference point for upcoming VM tests.
As the next phase, I created a flavor that uses a root disk for my virtual machine [3]. Regrettably, the resulting performance is around 18,000 IOPS, nearly ten times worse than the compute node results [4]. While I expected some degradation, a tenfold decrease seems excessive; realistically, I anticipated no more than a twofold reduction compared to the compute node's performance. Hence, I am led to ask: what should be configured to enhance performance?
I have already experimented with the settings recommended on the Ceph page for image properties [5]; however, these changes did not yield the desired improvements. In addition, I attempted to modify the CPU architecture within the nova.conf file, switching to Cascade Lake architecture, yet this endeavor also proved ineffective. For your convenience, I have included a link to my current dumpxml results [6].
Your insights and guidance would be greatly appreciated. I am confident that there is a solution to this performance disparity that I may have overlooked. Thank you in advance for your help.
Jan Wasilewski
References:
[1] nvme allocation and raid configuration: https://paste.openstack.org/show/bMMgGqu5I6LWuoQWV7TV/
[2] fio performance inside compute node: https://paste.openstack.org/show/bcMi4zG7QZwuJZX8nyct/
[3] Flavor configuration: https://paste.openstack.org/show/b7o9hCKilmJI3qyXsP5u/
[4] fio performance inside VM: https://paste.openstack.org/show/bUjqxfU4nEtSFqTlU8oH/
[5] image properties: https://docs.ceph.com/en/pacific/rbd/rbd-openstack/#image-properties
[6] dumpxml of vm: https://paste.openstack.org/show/bRECcaSMqa8TlrPp0xrT/
-- Damian Pietras