[nova] Slow nvme performance for local storage instances
smooney at redhat.com
Tue Aug 15 14:21:29 UTC 2023
On Mon, 2023-08-14 at 17:29 +0200, Sven Kieske wrote:
> Hi,
>
> Am Montag, dem 14.08.2023 um 14:37 +0200 schrieb Jan Wasilewski:
> > *[2] fio results of OpenStack managed instance with "vdb" attached:
> > https://paste.openstack.org/show/bViUpJTf7UYpsRyGCAt9/
> > <https://paste.openstack.org/show/bViUpJTf7UYpsRyGCAt9/>*
> > *[3] dumpxml of Libvirt managed instance with "vdb" attached:
> > https://paste.openstack.org/show/bGv8dT1l2QaTiAybYrJi/
> > <https://paste.openstack.org/show/bGv8dT1l2QaTiAybYrJi/>*
looking at this xml, you attach the qcow2 file via ide and pass through the nvme device
directly via virtio-blk:
<disk type='file' device='disk'>
  <driver name='qemu' type='qcow2' cache='none' io='native' discard='unmap'/>
  <source file='/var/lib/nova/instances/test/disk' index='1'/>
  <backingStore type='file' index='2'>
    <format type='raw'/>
    <source file='/var/lib/nova/instances/_base/78f03ab8f57b6e53f615f89f7ca212c729cb2f29'/>
    <backingStore/>
  </backingStore>
  <target dev='hda' bus='ide'/>
  <alias name='ide0-0-0'/>
  <address type='drive' controller='0' bus='0' target='0' unit='0'/>
</disk>
<disk type='block' device='disk'>
  <driver name='qemu' type='raw'/>
  <source dev='/dev/nvme1n1p1' index='4'/>
  <backingStore/>
  <target dev='vdb' bus='virtio'/>
  <alias name='virtio-disk1'/>
  <address type='pci' domain='0x0000' bus='0x00' slot='0x05' function='0x0'/>
</disk>
that is not a fair comparison, as ide will also bottleneck the performance;
you should use the same bus for both.
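for the plain libvirt guest you can just switch the qcow2 disk to virtio as well; roughly something
like this (untested sketch, and you would drop the old ide <address> so libvirt regenerates a pci one):

<disk type='file' device='disk'>
  <driver name='qemu' type='qcow2' cache='none' io='native' discard='unmap'/>
  <source file='/var/lib/nova/instances/test/disk'/>
  <target dev='vda' bus='virtio'/>
</disk>

on the nova side the root disk bus is controlled by the hw_disk_bus image property, e.g.

openstack image set --property hw_disk_bus=virtio <your-image>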
> > *[4] fio results of Libvirt managed instance with "vdb" attached:
> > https://paste.openstack.org/show/bOzYXkbco0oDfgaD0co8/
> > <https://paste.openstack.org/show/bOzYXkbco0oDfgaD0co8/>*
> > *[5] xml configuration of vdb drive:
> > https://paste.openstack.org/show/bAJ9MyEWEGOteeJnH5D8/
> > <https://paste.openstack.org/show/bAJ9MyEWEGOteeJnH5D8/>*
>
> one difference I can see in the fio results is that the openstack
> provided vm does a lot more context switches and has a different cpu
> usage profile in general:
>
> Openstack Instance:
>
> cpu : usr=27.16%, sys=62.24%, ctx=3246653, majf=0, minf=14
>
> plain libvirt instance:
>
> cpu : usr=15.75%, sys=56.31%, ctx=2860657, majf=0, minf=15
one thing this might be related to is that the libvirt-created vm does not have the
virtual performance monitoring unit (vPMU) enabled.
i added the ability to turn that off a few releases ago
https://specs.openstack.org/openstack/nova-specs/specs/train/implemented/libvirt-pmu-configuration.html
via a boolean image metadata key hw_pmu=True|False and a corresponding flavor extra spec hw:pmu=True|False,
so you could try disabling that and see if it helps with the context switching.
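for example (adjust the flavor/image names to your own; untested):

openstack flavor set --property hw:pmu=false <flavor>
# or, via the image instead
openstack image set --property hw_pmu=false <image>

note that for an existing instance you would need to resize to the updated flavor (or rebuild with
the updated image) for it to take effect; the resulting domain xml should then have
<pmu state='off'/> under <features>.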
>
> this indicates that some other workload is running there or work is
> scheduled at least in a different way than on the plain libvirt
> machine, one example to check might be the irq balancing on different
> cores, but I can't remember atm, if this is fixed already on this
> kernel release (iirc in the past you used to run the irq-balance daemon
> which got obsolete after kernel 4.19 according to
> https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=926967 )
>
> how many other vms are running on that openstack hypervisor?
>
> I hope the hypervisor is not oversubscribed? You can easily see this
> in a modern variant of "top" which reports stolen cpu cycles; if you
> see cpu steal, your cpu is oversubscribed.
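for what it's worth, a quick way to check that from inside the instance is something like:

vmstat 1 5              # the "st" column is steal time
top -b -n 1 | grep Cpu  # look at the %st field

anything consistently above zero means the guest vcpus are being preempted on the host.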
>
> depending on the deployment, you will of course also incur additional
> overhead from other openstack services - beginning with nova, which
> might account for the additional context switches on the hypervisor.
>
> In general 3 million context switches is not that much and should not
> impact performance by much, but it's still a noticeable difference
> between the two systems.
>
> are the cpu models on the hypervisors exactly the same? I can't tell
> from the libvirt dumps, but I notice that certain cpu flags are
> explicitly set for the libvirt managed instance, which might affect the
> end result.
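an easy way to compare that is to look at the host cpu and at the cpu mode the guest actually
gets, e.g.:

lscpu | grep 'Model name'                  # on both hypervisors
virsh dumpxml <domain> | grep -A3 '<cpu'   # cpu mode/model handed to the guest

on the nova side the guest cpu model comes from the [libvirt] cpu_mode / cpu_models options in
nova.conf (iirc host-model by default with kvm), so it can legitimately differ from a hand-made
libvirt guest.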
>
> What's more bothersome is that the libvirt provided VM
> has a total cpu usage of around 70% whereas the openstack provided
> one is closer to 90%.
>
> this leads me to believe that either one of the following is true:
>
> - the hypervisor cpus differ in a meaningful way, performance wise.
> - the hypervisor is somehow oversubscribed / has more work to do for
> the openstack deployed server, which results in worse benchmarks/more
> cpu being burnt by constantly evicting the task from the lower level
> l1/l2 cpu caches.
> - the context switches eat up significant cpu performance on the
> openstack instance (least likely imho).
>
> it would be interesting to know whether mq-deadline and multi
> queue are enabled in the plain libvirt machine (are libvirt and qemu
> versions the same as in the openstack deployment?).
>
> you can check this like it is described here:
>
> https://bugzilla.redhat.com/show_bug.cgi?id=1827722
>
> But I don't see "num_queues" or "queues" mentioned anywhere, so I
> assume it's turned off. Enabling it could also boost your performance
> by a lot.
we do not support multi-queue for virtio-blk or virtio-scsi in nova.
it's on our todo list but not available in any current release.
https://review.opendev.org/c/openstack/nova-specs/+/878066
the person that was proposing this is no longer working on openstack, so if
people are interested feel free to get involved.
otherwise it will likely get enabled in a release or two when we find time to work on it.
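that said, you can confirm what the plain libvirt guest is actually using from inside it, e.g.
(paths assuming the disk shows up as vdb in the guest):

cat /sys/block/vdb/queue/scheduler   # e.g. [mq-deadline] none
ls /sys/block/vdb/mq/                # one directory per hardware queue

and on the host, virsh dumpxml <domain> | grep queues will show whether the disk was given a
queues= attribute at all.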
>
> Another thing to check - especially since I noticed the cpu differences
> - would be the numa layout of the hypervisor and how the VM is affected
> by it.
>
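for the numa side, comparing the host topology with what the guest gets is usually enough, e.g.:

numactl --hardware                       # host numa nodes and memory
virsh dumpxml <domain> | grep -A5 numa   # any numa/pinning config in the guest

by default nova does not give the guest a numa topology or pin it unless you ask for that via
hw:numa_nodes or hw:cpu_policy=dedicated in the flavor, so a floating guest can end up running on
a different numa node than the one the nvme is attached to.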