[nova] Slow nvme performance for local storage instances

Jan Wasilewski finarffin at gmail.com
Fri Aug 11 08:08:31 UTC 2023


Hi,

Thank you once again for your valuable suggestions. I conducted another
round of tests with C-states disabled. I checked the BIOS settings and
added the suggested line to the grub startup. After rebooting my compute
node, I observed an improvement in performance, reaching around 20,000
IOPS. Although there was a modest performance boost, it wasn't as
substantial as I had anticipated.
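
For completeness, the grub change was essentially the parameter Damian
suggested (Ubuntu file locations; the last line is only a sanity check I
ran after the reboot):

  # /etc/default/grub  (keeping the existing parameters in place)
  GRUB_CMDLINE_LINUX_DEFAULT="... intel_idle.max_cstate=1"
  update-grub
  # after reboot:
  cat /sys/module/intel_idle/parameters/max_cstate   # should print 1
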
Additionally, I configured a ramdisk to establish a baseline for
comparison. The results were quite significant, with the ramdisk achieving
approximately 72,000 IOPS [1] [2]. However, I had initially expected even
higher figures. Regardless, such outcomes would be highly beneficial for my
NVMe virtual machines.
Nonetheless, I'm at a loss regarding potential further optimizations. I've
explored some resources, such as
https://docs.openstack.org/nova/rocky/user/flavors.html, which outlines IO
limits. However, I am under the impression that these limits can only
restrict performance rather than enhance it. Could you kindly confirm
whether my understanding is accurate?
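
If I read that page correctly, the properties in question are flavor extra
specs along the lines below (FLAVOR_NAME is just a placeholder), and they
appear to be rate caps, i.e. they can throttle a guest but not speed it up:

  openstack flavor set FLAVOR_NAME \
      --property quota:disk_read_iops_sec=20000 \
      --property quota:disk_write_iops_sec=20000
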
I extend my gratitude in advance for any forthcoming suggestions. It's
possible that I might be searching in the wrong places for solutions.
/Jan Wasilewski


References:
[1] dumpxml config for ramdisk vm:
https://paste.openstack.org/show/b7AgTZBjvSpWMioJzmoA/
[2] fio results of vm where ramdisk is a main disk:
https://paste.openstack.org/show/bdII56cavmVwNAIq3axQ/

Thu, 10 Aug 2023 at 14:05 Damian Pietras <damian.pietras at hardit.pl>
wrote:

> Hi,
>
> You wrote "/sys/devices/system/cpu/*/cpuidle/state*/disable output is 0
> for all cpus". That means all C-states (power saving states) are _enabled_.
> This may cause lower and inconsistent results. I would repeat the test
> with deeper C-states disabled. I think the simplest way to do that is to
> boot the system (the nova compute node) with "intel_idle.max_cstate=1"
> added to the kernel command line parameters. I had similar issues with
> I/O performance inside VMs (but with a lower disk queue depth), and power
> saving / frequency scaling had the greatest influence on the results and
> also caused variations between test runs. If you are out of ideas, you
> could also rule out disk / filesystem / RAID configuration influence by
> temporarily mounting tmpfs on /var/lib/nova/instances so the instances
> will have RAM-backed volumes. You need enough RAM for that, of course.
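>
> Something along these lines, for example (the size is only an example and
> must be big enough to hold the instance disks):
>
> mount -t tmpfs -o size=32G tmpfs /var/lib/nova/instances
> # then recreate the test instance so its disk is created on the tmpfs mount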
>
> On 10/08/2023 13:35, Jan Wasilewski wrote:
> > Hi,
> >
> > I wanted to express my sincere gratitude for all the help and advice
> > you've given me. I followed your suggestions and carried out a bunch
> > of tests, but unfortunately, the performance boost I was hoping for
> > hasn't materialized.
> >
> > Let me break down the configurations I've tried and the results I've
> > got. Just to give you some context, all my tests were done using two
> > INTEL SSDPE2MD400G4 NVMe disks and Ubuntu 20.04LTS as the OS on the
> > compute node. You can find all the nitty-gritty details in [1] and
> > [2]. Additionally, I've shared the results of the fio tests directly
> > executed on the RAID directory within the compute node in [3].
> >
> > Then, I expanded my testing to instances, and here's what I found:
> >
> >  1. I tested things out with the default settings and the Ubuntu 22.04
> >     LTS image. The IOPS results were hovering around 18-18.5k. Check
> >     out [4] and [5] for the specifics.
> >  2. I tweaked the nova.conf file with two changes: force_raw_images =
> >     true and images_type = flat (roughly as sketched just after this
> >     list). Unfortunately, this only brought the IOPS down a bit, to
> >     just under 18k. You can see more in [6] and [7].
> >  3. I made an extra change in nova.conf by switching the cpu_model
> >     from SandyBridge to IvyBridge. This change dropped the IOPS
> >     further, to around 17k. Details are in [8] and [9].
> >  4. Lastly, I played around with image properties, setting
> >     hw_scsi_model=virtio-scsi and hw_disk_bus=scsi. However, this also
> >     resulted in around 17k IOPS. You can find out more in [10] and [11].
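> >
> > For reference, this is roughly where those settings live (force_raw_images
> > sits under [DEFAULT] on my install, I believe, and the exact option names
> > can differ between releases):
> >
> > # nova.conf on the compute node
> > [DEFAULT]
> > force_raw_images = true
> > [libvirt]
> > images_type = flat
> > cpu_mode = custom
> > cpu_model = IvyBridge
> >
> > # image properties used in test 4 (IMAGE_ID is a placeholder)
> > openstack image set --property hw_scsi_model=virtio-scsi \
> >     --property hw_disk_bus=scsi IMAGE_ID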
> >
> > It's a bit disheartening that none of these changes seemed to have the
> > impact I was aiming for. So, I'm starting to think there might be a
> > crucial piece of the puzzle that I'm missing here. If you have any
> > ideas or insights, I'd be incredibly grateful for your input.
> >
> > Thanks once more for all your help and support.
> >
> > /Jan Wasilewski
> >
> >
> > References:
> > [1] Disk details and raid details:
> > https://paste.openstack.org/show/bRyLPZ6TDHpIEKadLC7z/
> > [2] Compute node and nova details:
> > https://paste.openstack.org/show/bcGw3Glm6U0r1kUsg8nU/
> > [3] fio results executed in raid directory inside compute node:
> > https://paste.openstack.org/show/bN0EkBjoAP2Ig5PSSfy3/
> > [4] dumpxml of instance from test 1:
> > https://paste.openstack.org/show/bVSq8tz1bSMdiYXcF3IP/
> > [5] fio results from test 1:
> > https://paste.openstack.org/show/bKlxom8Yl7NtHO8kO53a/
> > [6] dumpxml of instance from test 2:
> > https://paste.openstack.org/show/bN2JN9DXT4DGKNZnzkJ8/
> > [7] fio results from test 2:
> > https://paste.openstack.org/show/b7GXIVI2Cv0qkVLQaAF3/
> > [8] dumpxml of instance from test 3:
> > https://paste.openstack.org/show/b0821V4IUq8N7YPb73sg/
> > [9] fio results from test 3:
> > https://paste.openstack.org/show/bT1Erfxq4XTj0ubTTgdj/
> > [10] dumpxml of instance from test 4:
> > https://paste.openstack.org/show/bjTXM0do1xgzmVZO02Q7/
> > [11] fio results from test 4:
> > https://paste.openstack.org/show/bpbVJntkR5aNke3trtRd/
> >
> >
> > Wed, 9 Aug 2023 at 19:56 Damian Pietras <damian.pietras at hardit.pl>
> > wrote:
> >
> >     I would suggest the following:
> >
> >     - make sure that the "none" I/O scheduler is used inside the VM
> >     (e.g. /sys/block/sda/queue/scheduler); I assume a fairly recent
> >     kernel, otherwise "noop" (a quick check is sketched after this
> >     list).
> >
> >     - make sure that the host has CPU C-states above C1 disabled
> >     (check the values of all
> >     /sys/devices/system/cpu/*/cpuidle/state*/disable where the
> >     corresponding [..]/name is different from "POLL", C1 or C1E), or
> >     use some tool that disables them.
> >
> >     - Use raw images instead of qcow2: in the [libvirt] section of
> >     nova.conf set force_raw_images=True and images_type=flat and
> >     recreate the instance.
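> >
> >     A quick way to check the first two points from a shell (the device
> >     name is just an example, adjust to your setup):
> >
> >     cat /sys/block/sda/queue/scheduler                        # inside the VM, expect [none]
> >     grep . /sys/devices/system/cpu/cpu0/cpuidle/state*/name   # on the host, list C-state names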
> >
> >     Is the difference as big when you lower the I/O depth (for
> >     example, to 1) or increase the block size (for example, to 64k)?
> >
> >
> >     On 09/08/2023 10:02, Jan Wasilewski wrote:
> >>     Hi,
> >>
> >>     I am reaching out to inquire about the performance of our local
> >>     storage setup. Currently, I am conducting tests using NVMe disks;
> >>     however, the results appear to be underwhelming.
> >>
> >>     In terms of my setup, I have recently incorporated two NVMe disks
> >>     into my compute node. These disks have been configured as RAID1
> >>     under md127 and subsequently mounted at /var/lib/nova/instances
> >>     [1]. During benchmarking using the fio tool within this
> >>     directory, I am achieving approximately 160,000 IOPS [2]. This
> >>     figure serves as a satisfactory baseline and reference point for
> >>     upcoming VM tests.
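> >>
> >>     (For anyone reproducing this: the fio job had roughly the following
> >>     shape; the exact options are in [2] and the values below are only
> >>     illustrative.)
> >>
> >>     fio --name=randtest --directory=/var/lib/nova/instances \
> >>         --rw=randread --bs=4k --iodepth=32 --numjobs=4 \
> >>         --ioengine=libaio --direct=1 --size=4G --runtime=60 \
> >>         --time_based --group_reporting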
> >>
> >>     As the next phase, I created a flavor that uses a local root disk
> >>     for my virtual machine [3]. Regrettably, the resulting performance
> >>     is around 18,000 IOPS, which is nearly ten times worse than the
> >>     compute node results [4]. While I expected some degradation, a
> >>     tenfold decrease seems excessive; realistically, I anticipated no
> >>     more than a twofold reduction compared to the compute node's
> >>     performance. Hence, my question: what should be configured to
> >>     improve performance?
> >>
> >>     I have already experimented with the settings recommended on the
> >>     Ceph page for image properties [5]; however, these changes did
> >>     not yield the desired improvements. In addition, I attempted to
> >>     modify the CPU architecture within the nova.conf file, switching
> >>     to Cascade Lake architecture, yet this endeavor also proved
> >>     ineffective. For your convenience, I have included a link to my
> >>     current dumpxml results [6].
> >>
> >>     Your insights and guidance would be greatly appreciated. I am
> >>     confident that there is a solution to this performance disparity
> >>     that I may have overlooked. Thank you in advance for your help.
> >>
> >>     /Jan Wasilewski
> >>
> >>     References:
> >>     [1] nvme allocation and raid configuration:
> >>     https://paste.openstack.org/show/bMMgGqu5I6LWuoQWV7TV/
> >>     [2] fio performance inside compute node:
> >>     https://paste.openstack.org/show/bcMi4zG7QZwuJZX8nyct/
> >>     [3] Flavor configuration:
> >>     https://paste.openstack.org/show/b7o9hCKilmJI3qyXsP5u/
> >>     [4] fio performance inside VM:
> >>     https://paste.openstack.org/show/bUjqxfU4nEtSFqTlU8oH/
> >>     [5] image properties:
> >>     https://docs.ceph.com/en/pacific/rbd/rbd-openstack/#image-properties
> >>     [6] dumpxml of vm:
> >>     https://paste.openstack.org/show/bRECcaSMqa8TlrPp0xrT/
> >
> >     --
> >     Damian Pietras
> >
> --
> Damian Pietras
>
>