[nova] Slow nvme performance for local storage instances

smooney at redhat.com
Tue Aug 15 14:21:29 UTC 2023


On Mon, 2023-08-14 at 17:29 +0200, Sven Kieske wrote:
> Hi,
> 
> Am Montag, dem 14.08.2023 um 14:37 +0200 schrieb Jan Wasilewski:
> > [2] fio results of OpenStack managed instance with "vdb" attached:
> > https://paste.openstack.org/show/bViUpJTf7UYpsRyGCAt9/
> > [3] dumpxml of Libvirt managed instance with "vdb" attached:
> > https://paste.openstack.org/show/bGv8dT1l2QaTiAybYrJi/

Looking at this xml, you attach the qcow2 file via ide and pass the nvme device
through directly via virtio-blk:

<disk type='file' device='disk'>
      <driver name='qemu' type='qcow2' cache='none' io='native' discard='unmap'/>
      <source file='/var/lib/nova/instances/test/disk' index='1'/>
      <backingStore type='file' index='2'>
        <format type='raw'/>
        <source file='/var/lib/nova/instances/_base/78f03ab8f57b6e53f615f89f7ca212c729cb2f29'/>
        <backingStore/>
      </backingStore>
      <target dev='hda' bus='ide'/>
      <alias name='ide0-0-0'/>
      <address type='drive' controller='0' bus='0' target='0' unit='0'/>
    </disk>

    <disk type='block' device='disk'>
      <driver name='qemu' type='raw'/>
      <source dev='/dev/nvme1n1p1' index='4'/>
      <backingStore/>
      <target dev='vdb' bus='virtio'/>
      <alias name='virtio-disk1'/>
      <address type='pci' domain='0x0000' bus='0x00' slot='0x05' function='0x0'/>
    </disk>

That is not a fair comparison, as ide will also bottleneck the performance;
you should use the same bus for both.
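For a like-for-like test the qcow2 disk would be attached via virtio-blk as well,
roughly like this (a sketch only; the target dev shown here is illustrative):

    <disk type='file' device='disk'>
      <driver name='qemu' type='qcow2' cache='none' io='native' discard='unmap'/>
      <source file='/var/lib/nova/instances/test/disk'/>
      <!-- virtio-blk instead of ide; libvirt will pick a free pci slot if none is given -->
      <target dev='vda' bus='virtio'/>
    </disk>

In nova the disk bus is normally controlled by the hw_disk_bus image property
(e.g. hw_disk_bus=virtio), so it is worth checking whether the image used here
sets hw_disk_bus=ide.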

> > [4] fio results of Libvirt managed instance with "vdb" attached:
> > https://paste.openstack.org/show/bOzYXkbco0oDfgaD0co8/
> > [5] xml configuration of vdb drive:
> > https://paste.openstack.org/show/bAJ9MyEWEGOteeJnH5D8/
> 
> one difference I can see in the fio results is that the openstack
> provided vm does a lot more context switches and has a different cpu
> usage profile in general:
> 
> Openstack Instance:
> 
>   cpu          : usr=27.16%, sys=62.24%, ctx=3246653, majf=0, minf=14
> 
> plain libvirt instance:
> 
>   cpu          : usr=15.75%, sys=56.31%, ctx=2860657, majf=0, minf=15

One thing this might be related to is that the libvirt-created vm does not have
the virtual performance monitoring unit (vPMU) enabled.
I added the ability to turn that off a few releases ago
(https://specs.openstack.org/openstack/nova-specs/specs/train/implemented/libvirt-pmu-configuration.html)
via a boolean image metadata key hw_pmu=True|False and a corresponding flavor extra spec hw:pmu=True|False,
so you could try disabling that and see if it helps with the context switching.
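As a concrete example (the image and flavor names below are placeholders):

# disable the vPMU via the image metadata key ...
openstack image set --property hw_pmu=False <image>
# ... or via the flavor extra spec
openstack flavor set --property hw:pmu=False <flavor>

The new setting only takes effect when an instance is (re)built or resized onto
the updated image/flavor.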
> 
> this indicates that some other workload is running there, or that work
> is at least scheduled in a different way than on the plain libvirt
> machine. one example to check might be the irq balancing on different
> cores, but I can't remember atm if this is already fixed on this
> kernel release (iirc in the past you used to run the irqbalance daemon,
> which became obsolete after kernel 4.19 according to
> https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=926967)
> 
> how many other vms are running on that openstack hypervisor?
> 
> I hope the hypervisor is not oversubscribed? You can easily see this
> in a modern variant of "top", which reports stolen cpu cycles; if you
> see cpu steal, your cpu is oversubscribed.
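
A quick way to check for steal on the hypervisor (assuming the sysstat package
is installed for mpstat) is something like:

# the "st" value in top's %Cpu(s) line is steal time
top -b -n 1 | grep -i '%cpu'
# per-core view; a non-zero %steal means the host is oversubscribed
mpstat -P ALL 1 1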
> 
> depending on the deployment, you will of course also incur additional
> overhead from other openstack services - beginning with nova, which
> might account for the additional context switches on the hypervisor.
> 
> In general 3 million context switches is not that much and should not
> impact performance by much, but it's still a noticeable difference
> between the two systems.
> 
> are the cpu models on the hypervisors exactly the same? I can't tell
> from the libvirt dumps, but I notice that certain cpu flags are
> explicitly set for the libvirt-managed instance, which might affect the
> end result.
> 
> What's more concerning is that the libvirt-provided VM
> has a total cpu usage of roughly 70%, whereas the openstack-provided
> one is closer to 90%.
> 
> this leads me to believe that either one of the following is true:
> 
> - the hypervisor cpus differ in a meaningful way, performance wise.
> - the hypervisor is somehow oversubscribed / has more work to do for
> the openstack deployed server, which results in worse benchmarks/more
> cpu being burnt by constantly evicting the task from the lower level
> l1/l2 cpu caches.
> - the context switches eat up significant cpu performance on the
> openstack instance (least likely imho).
> 
> what would be interesting to know is whether mq-deadline and multi-queue
> are enabled in the plain libvirt machine (are the libvirt and qemu
> versions the same as in the openstack deployment?).
> 
> you can check this like it is described here:
> 
> https://bugzilla.redhat.com/show_bug.cgi?id=1827722
> 
> But I don't see "num_queues" or "queues" mentioned anywhere, so I
> assume it's turned off. Enabling it could also boost your performance
> by a lot.
We do not support multi-queue for virtio-blk or virtio-scsi in nova.
It's on our todo list but not available in any current release:
https://review.opendev.org/c/openstack/nova-specs/+/878066
The person who was proposing this is no longer working on OpenStack, so if
people are interested, feel free to get involved.
Otherwise it will likely get enabled in a release or two when we find time to work on it.
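
For reference, on a plain libvirt guest (outside of nova) virtio-blk multi-queue
is set with the queues attribute on the disk driver element, roughly like this
(the queue count below is arbitrary and usually matched to the guest vcpu count):

    <disk type='block' device='disk'>
      <driver name='qemu' type='raw' queues='4'/>
      <source dev='/dev/nvme1n1p1'/>
      <target dev='vdb' bus='virtio'/>
    </disk>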
> 
> Another thing to check - especially since I noticed the cpu differences
> - would be the numa layout of the hypervisor and how the VM is affected
> by it.
> 
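As a starting point for the numa question, something like this on the hypervisor
shows the layout (assuming numactl is installed):

# numa nodes with their cpus and memory
numactl --hardware
# quick summary of the numa/cpu mapping
lscpu | grep -i numa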



