- Is there any isolation done at the Kernel level of the compute OS?

No, there are no changes made in the kernel.

- What does your flavor look like right now? Is the behavior different if you remove the numa constraint?

My flavor currently has the specs below set.

| dec2e31d-e1e8-4ff2-90d5-955be7e9efd3 | RC-16G | 16384 | 5 | 0 | 8 | True | 0 | 1.0 | hw:cpu_cores='2', hw:cpu_sockets='4' |

But I have set cpu pinning in compute node.

[compute]
cpu_shared_set = 2-7,10-15,18-23,26-31

If I remove hw:cpu_* from the flavor and remove the above config from nova.conf on the compute node, instance deployment works fine.

You seem to have 4 NUMA as well, but only two physical sockets (8CPUx2threads - 16 vCPUs per socket x 2 = 32)

This is my test compute node; it has 4 CPU sockets and 4 NUMA nodes.

root@kvm10-a1-khi01:~# numactl -H
available: 4 nodes (0-3)
node 0 cpus: 0 1 2 3 4 5 6 7
node 0 size: 31675 MB
node 0 free: 30135 MB
node 1 cpus: 8 9 10 11 12 13 14 15
node 1 size: 64510 MB
node 1 free: 63412 MB
node 2 cpus: 16 17 18 19 20 21 22 23
node 2 size: 64510 MB
node 2 free: 63255 MB
node 3 cpus: 24 25 26 27 28 29 30 31
node 3 size: 64485 MB
node 3 free: 63117 MB
node distances:
node   0   1   2   3
  0:  10  16  16  16
  1:  16  10  16  16
  2:  16  16  10  16
  3:  16  16  16  10

The generated XML seems to be set for a 4 socket topology?

Yes, I am testing this on my test compute node first.

On Wed, Jun 30, 2021 at 7:15 AM Laurent Dumont <laurentfdumont@gmail.com> wrote:
- Is there any isolation done at the Kernel level of the compute OS?
- What does your flavor look like right now? Is the behavior different if you remove the numa constraint?
You seem to have 4 NUMA as well, but only two physical sockets (8CPUx2threads - 16 vCPUs per socket x 2 = 32)
The generated XML seems to be set for a 4 socket topology?
<cpu mode='custom' match='exact' check='partial'>
  <model fallback='allow'>Opteron_G5</model>
  <topology sockets='4' cores='2' threads='1'/>
</cpu>
On Tue, Jun 29, 2021 at 12:41 PM Ammad Syed <syedammad83@gmail.com> wrote:
Hi Stephen,
I have checked that all CPUs are online.
root@kvm10-a1-khi01:/etc/nova# lscpu
Architecture:        x86_64
CPU op-mode(s):      32-bit, 64-bit
Byte Order:          Little Endian
Address sizes:       48 bits physical, 48 bits virtual
CPU(s):              32
On-line CPU(s) list: 0-31
Thread(s) per core:  2
Core(s) per socket:  8
Socket(s):           2
NUMA node(s):        4
Vendor ID:           AuthenticAMD
CPU family:          21
I have made the configuration below in nova.conf.
[compute]
cpu_shared_set = 2-7,10-15,18-23,26-31
Below is the XML from the nova logs for the domain that nova is trying to create.
2021-06-29 16:30:56.576 2819 ERROR nova.virt.libvirt.guest [req-c76c6809-1775-43a8-bfb1-70f6726cad9d 2af528fdf3244e15b4f3f8fcfc0889c5 890eb2b7d1b8488aa88de7c34d08817a - default default] Error launching a defined domain with XML: <domain type='kvm'> <name>instance-0000026d</name> <uuid>06ff4fd5-b21f-4f64-9dde-55e86dd15da6</uuid> <metadata> <nova:instance xmlns:nova=" http://openstack.org/xmlns/libvirt/nova/1.1"> <nova:package version="23.0.0"/> <nova:name>cpu</nova:name> <nova:creationTime>2021-06-29 16:30:50</nova:creationTime> <nova:flavor name="RC-16G"> <nova:memory>16384</nova:memory> <nova:disk>5</nova:disk> <nova:swap>0</nova:swap> <nova:ephemeral>0</nova:ephemeral> <nova:vcpus>8</nova:vcpus> </nova:flavor> <nova:owner> <nova:user uuid="2af528fdf3244e15b4f3f8fcfc0889c5">admin</nova:user> <nova:project uuid="890eb2b7d1b8488aa88de7c34d08817a">admin</nova:project> </nova:owner> <nova:root type="image" uuid="1024317b-0db6-418c-bc39-de9b61d8ce59"/> <nova:ports> <nova:port uuid="dccf68a2-ec48-4985-94ce-b3487cfc99f3"> <nova:ip type="fixed" address="192.168.100.106" ipVersion="4"/> </nova:port> </nova:ports> </nova:instance> </metadata> <memory unit='KiB'>16777216</memory> <currentMemory unit='KiB'>16777216</currentMemory> <vcpu placement='static' cpuset='2-7,10-15,18-23,26-31'>8</vcpu> <cputune> <shares>8192</shares> </cputune> <sysinfo type='smbios'> <system> <entry name='manufacturer'>OpenStack Foundation</entry> <entry name='product'>OpenStack Nova</entry> <entry name='version'>23.0.0</entry> <entry name='serial'>06ff4fd5-b21f-4f64-9dde-55e86dd15da6</entry> <entry name='uuid'>06ff4fd5-b21f-4f64-9dde-55e86dd15da6</entry> <entry name='family'>Virtual Machine</entry> </system> </sysinfo> <os> <type arch='x86_64' machine='pc-i440fx-4.2'>hvm</type> <boot dev='hd'/> <smbios mode='sysinfo'/> </os> <features> <acpi/> <apic/> </features> <cpu mode='custom' match='exact' check='partial'> <model fallback='allow'>Opteron_G5</model> <topology sockets='4' cores='2' 
threads='1'/> </cpu> <clock offset='utc'> <timer name='pit' tickpolicy='delay'/> <timer name='rtc' tickpolicy='catchup'/> <timer name='hpet' present='no'/> </clock> <on_poweroff>destroy</on_poweroff> <on_reboot>restart</on_reboot> <on_crash>destroy</on_crash> <devices> <emulator>/usr/bin/qemu-system-x86_64</emulator> <disk type='file' device='disk'> <driver name='qemu' type='qcow2' cache='none'/> <source file='/var/lib/nova/instances/06ff4fd5-b21f-4f64-9dde-55e86dd15da6/disk'/> <target dev='sda' bus='scsi'/> <address type='drive' controller='0' bus='0' target='0' unit='0'/> </disk> <controller type='scsi' index='0' model='virtio-scsi'> <address type='pci' domain='0x0000' bus='0x00' slot='0x04' function='0x0'/> </controller> <controller type='usb' index='0' model='piix3-uhci'> <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x2'/> </controller> <controller type='pci' index='0' model='pci-root'/> <interface type='bridge'> <mac address='fa:16:3e:f5:46:7d'/> <source bridge='br-int'/> <virtualport type='openvswitch'> <parameters interfaceid='dccf68a2-ec48-4985-94ce-b3487cfc99f3'/> </virtualport> <target dev='tapdccf68a2-ec'/> <model type='virtio'/> <driver name='vhost' queues='8'/> <mtu size='1442'/> <address type='pci' domain='0x0000' bus='0x00' slot='0x03' function='0x0'/> </interface> <serial type='pty'> <log file='/var/lib/nova/instances/06ff4fd5-b21f-4f64-9dde-55e86dd15da6/console.log' append='off'/> <target type='isa-serial' port='0'> <model name='isa-serial'/> </target> </serial> <console type='pty'> <log file='/var/lib/nova/instances/06ff4fd5-b21f-4f64-9dde-55e86dd15da6/console.log' append='off'/> <target type='serial' port='0'/> </console> <input type='tablet' bus='usb'> <address type='usb' bus='0' port='1'/> </input> <input type='mouse' bus='ps2'/> <input type='keyboard' bus='ps2'/> <graphics type='spice' autoport='yes' listen='0.0.0.0'> <listen type='address' address='0.0.0.0'/> </graphics> <video> <model type='qxl' ram='65536' 
vram='65536' vgamem='16384' heads='1' primary='yes'/> <address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x0'/> </video> <memballoon model='virtio'> <stats period='10'/> <address type='pci' domain='0x0000' bus='0x00' slot='0x05' function='0x0'/> </memballoon> <rng model='virtio'> <backend model='random'>/dev/urandom</backend> <address type='pci' domain='0x0000' bus='0x00' slot='0x06' function='0x0'/> </rng> </devices> </domain> : libvirt.libvirtError: Unable to write to '/sys/fs/cgroup/cpuset/machine.slice/machine-qemu\x2d1\x2dinstance\x2d0000026d.scope/emulator/cpuset.cpus': Permission denied
Ammad
On Tue, Jun 29, 2021 at 8:29 PM Stephen Finucane <stephenfin@redhat.com> wrote:
On Tue, 2021-06-29 at 17:44 +0500, Ammad Syed wrote:
Thanks, the information is really helpful. I have set the properties below on the flavor according to my NUMA policies.

    --property hw:numa_nodes=FLAVOR-NODES \
    --property hw:numa_cpus.N=FLAVOR-CORES \
    --property hw:numa_mem.N=FLAVOR-MEMORY

I am getting the error below in the compute logs. Any advice?

libvirt.libvirtError: Unable to write to '/sys/fs/cgroup/cpuset/machine.slice/machine-qemu\x2d48\x2dinstance\x2d0000026b.scope/emulator/cpuset.cpus': Permission denied
2021-06-29 12:33:10.144 1310945 ERROR nova.virt.libvirt.guest Traceback (most recent call last):
2021-06-29 12:33:10.144 1310945 ERROR nova.virt.libvirt.guest   File "/usr/lib/python3/dist-packages/nova/virt/libvirt/guest.py", line 155, in launch
2021-06-29 12:33:10.144 1310945 ERROR nova.virt.libvirt.guest     return self._domain.createWithFlags(flags)
2021-06-29 12:33:10.144 1310945 ERROR nova.virt.libvirt.guest   File "/usr/lib/python3/dist-packages/eventlet/tpool.py", line 193, in doit
2021-06-29 12:33:10.144 1310945 ERROR nova.virt.libvirt.guest     result = proxy_call(self._autowrap, f, *args, **kwargs)
2021-06-29 12:33:10.144 1310945 ERROR nova.virt.libvirt.guest   File "/usr/lib/python3/dist-packages/eventlet/tpool.py", line 151, in proxy_call
2021-06-29 12:33:10.144 1310945 ERROR nova.virt.libvirt.guest     rv = execute(f, *args, **kwargs)
2021-06-29 12:33:10.144 1310945 ERROR nova.virt.libvirt.guest   File "/usr/lib/python3/dist-packages/eventlet/tpool.py", line 132, in execute
2021-06-29 12:33:10.144 1310945 ERROR nova.virt.libvirt.guest     six.reraise(c, e, tb)
2021-06-29 12:33:10.144 1310945 ERROR nova.virt.libvirt.guest   File "/usr/lib/python3/dist-packages/six.py", line 703, in reraise
2021-06-29 12:33:10.144 1310945 ERROR nova.virt.libvirt.guest     raise value
2021-06-29 12:33:10.144 1310945 ERROR nova.virt.libvirt.guest   File "/usr/lib/python3/dist-packages/eventlet/tpool.py", line 86, in tworker
2021-06-29 12:33:10.144 1310945 ERROR nova.virt.libvirt.guest     rv = meth(*args, **kwargs)
2021-06-29 12:33:10.144 1310945 ERROR nova.virt.libvirt.guest   File "/usr/lib/python3/dist-packages/libvirt.py", line 1265, in createWithFlags
2021-06-29 12:33:10.144 1310945 ERROR nova.virt.libvirt.guest     if ret == -1: raise libvirtError ('virDomainCreateWithFlags() failed', dom=self)
2021-06-29 12:33:10.144 1310945 ERROR nova.virt.libvirt.guest libvirt.libvirtError: Unable to write to '/sys/fs/cgroup/cpuset/machine.slice/machine-qemu\x2d48\x2dinstance\x2d0000026b.scope/emulator/cpuset.cpus': Permission denied
2021-06-29 12:33:10.144 1310945 ERROR nova.virt.libvirt.guest
2021-06-29 12:33:10.146 1310945 ERROR nova.virt.libvirt.driver [req-4f6fc6aa-04d6-4dc0-921f-2913b40a76a9 2af528fdf3244e15b4f3f8fcfc0889c5 890eb2b7d1b8488aa88de7c34d08817a - default default] [instance: ed87bf68-b631-4a00-9eb5-22d32ec37402] Failed to start libvirt guest: libvirt.libvirtError: Unable to write to '/sys/fs/cgroup/cpuset/machine.slice/machine-qemu\x2d48\x2dinstance\x2d0000026b.scope/emulator/cpuset.cpus': Permission denied
2021-06-29 12:33:10.150 1310945 INFO os_vif [req-4f6fc6aa-04d6-4dc0-921f-2913b40a76a9 2af528fdf3244e15b4f3f8fcfc0889c5 890eb2b7d1b8488aa88de7c34d08817a - default default] Successfully unplugged vif VIFOpenVSwitch(active=False,address=fa:16:3e:ba:3d:c8,bridge_name='br-int',has_traffic_filtering=True,id=a991cd33-2610-4823-a471-62171037e1b5,network=Network(a0d85af2-a991-4102-8453-ba68c5e10b65),plugin='ovs',port_profile=VIFPortProfileOpenVSwitch,preserve_on_delete=False,vif_name='tapa991cd33-26')
2021-06-29 12:33:10.151 1310945 INFO nova.virt.libvirt.driver [req-4f6fc6aa-04d6-4dc0-921f-2913b40a76a9 2af528fdf3244e15b4f3f8fcfc0889c5 890eb2b7d1b8488aa88de7c34d08817a - default default] [instance: ed87bf68-b631-4a00-9eb5-22d32ec37402] Deleting instance files /var/lib/nova/instances/ed87bf68-b631-4a00-9eb5-22d32ec37402_del
2021-06-29 12:33:10.152 1310945 INFO nova.virt.libvirt.driver [req-4f6fc6aa-04d6-4dc0-921f-2913b40a76a9 2af528fdf3244e15b4f3f8fcfc0889c5 890eb2b7d1b8488aa88de7c34d08817a - default default] [instance: ed87bf68-b631-4a00-9eb5-22d32ec37402] Deletion of /var/lib/nova/instances/ed87bf68-b631-4a00-9eb5-22d32ec37402_del complete
2021-06-29 12:33:10.258 1310945 ERROR nova.compute.manager [req-4f6fc6aa-04d6-4dc0-921f-2913b40a76a9 2af528fdf3244e15b4f3f8fcfc0889c5 890eb2b7d1b8488aa88de7c34d08817a - default default] [instance: ed87bf68-b631-4a00-9eb5-22d32ec37402] Instance failed to spawn: libvirt.libvirtError: Unable to write to '/sys/fs/cgroup/cpuset/machine.slice/machine-qemu\x2d48\x2dinstance\x2d0000026b.scope/emulator/cpuset.cpus': Permission denied
Any advice on how to fix this permission issue?
I have manually created the directory machine-qemu in /sys/fs/cgroup/cpuset/machine.slice/ but I still get the same error.
I have also tried setting [compute] cpu_shared_set and [compute] cpu_dedicated_set; both give the same error.
There are quite a few bugs about this [1][2]. It seems most of them are caused by CPUs being offlined. Have you offlined any CPUs? Are the CPUs listed in the mask all available?
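One way to answer the question about the mask is to expand it and compare it with the host's online CPU list. The sketch below is only an illustration: the helper names `parse_cpu_list` and `missing_cpus` are made up for this example, and the two strings are the `cpu_shared_set` value and the "On-line CPU(s) list" reported elsewhere in this thread. On a live host the online list could instead be read from /sys/devices/system/cpu/online.

```python
def parse_cpu_list(spec):
    """Expand a CPU list such as '2-7,10-15' into a set of CPU ids."""
    cpus = set()
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue
        if "-" in part:
            # A range like '2-7' is inclusive on both ends.
            start, end = part.split("-", 1)
            cpus.update(range(int(start), int(end) + 1))
        else:
            cpus.add(int(part))
    return cpus


def missing_cpus(mask, online):
    """Return the CPUs named in the mask that are not in the online list."""
    return parse_cpu_list(mask) - parse_cpu_list(online)


# Values quoted in this thread (hypothetical for any other host).
shared = "2-7,10-15,18-23,26-31"
online = "0-31"

# An empty list means every CPU in the mask is online.
print(sorted(missing_cpus(shared, online)))
```

If the printed list is non-empty, those CPUs are in the mask but offline, which is the situation the bugs referenced above describe.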
Stephen
[1] https://bugzilla.redhat.com/show_bug.cgi?id=1609785
[2] https://bugzilla.redhat.com/show_bug.cgi?id=1842716
Using Ubuntu 20.04 and qemu-kvm 4.2.
Ammad
On Fri, 2021-06-25 at 10:02 +0500, Ammad Syed wrote:
Hi,
I am using OpenStack Wallaby on Ubuntu 20.04 with KVM. I am working on optimized flavor properties that should provide optimal performance. I was reviewing the document below.
https://docs.openstack.org/nova/wallaby/admin/cpu-topologies.html
I have a two-socket AMD compute node. The workloads running on the nodes are mixed.
My question is: should I use the default nova CPU topology and NUMA node placement that nova applies to instances, or should I use hw:cpu_sockets='2' and hw:numa_nodes='2'?

The latter: hw:cpu_sockets='2' and hw:numa_nodes='2' should give you better performance. However, you should also set hw:mem_page_size=small or hw:mem_page_size=any. When you enable virtual NUMA policies, we affinitize the guest memory to host NUMA nodes. This can lead to out-of-memory events on the host NUMA nodes, which can result in VMs being killed by the host kernel memory reaper if you do not enable NUMA-aware memory tracking in nova, which is done by setting hw:mem_page_size. Setting hw:mem_page_size has the side effect of disabling memory overcommit, so you have to bear that in mind. If you are using a NUMA topology you should almost always also use hugepages, which are enabled using hw:mem_page_size=large; this however requires you to configure hugepages on the host at boot.
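As a rough illustration of the properties discussed above, the following sketch assembles an `openstack flavor set` command line from a dictionary of extra specs. The helper `flavor_set_command` is hypothetical, the flavor name RC-16G is borrowed from elsewhere in this thread, and the choice of `small` for hw:mem_page_size is just one of the options mentioned; it only prints the command and does not run it.

```python
import shlex


def flavor_set_command(flavor, properties):
    """Build an 'openstack flavor set' command string from extra specs."""
    args = ["openstack", "flavor", "set"]
    for key, value in properties.items():
        args += ["--property", f"{key}={value}"]
    args.append(flavor)
    # shlex.join quotes each argument so the result is safe to paste in a shell.
    return shlex.join(args)


# Extra specs named in the reply above; the values are illustrative.
numa_props = {
    "hw:cpu_sockets": 2,
    "hw:numa_nodes": 2,
    "hw:mem_page_size": "small",
}

print(flavor_set_command("RC-16G", numa_props))
```

Swapping `small` for `large` would request hugepages, which, as noted above, only works if hugepages are configured on the host at boot.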
Which one of the above provides the best instance performance? Or is there any other tuning I should do?
In the libvirt driver, the default CPU topology we generate is 1 thread per core, 1 core per socket and 1 socket per flavor.vcpu. (Technically this is an undocumented implementation detail that you should not rely on; we have the hw:cpu_* elements if you care about the topology.)
This was more efficient in the early days of QEMU/OpenStack but has many issues when software is charged per socket or when operating systems, such as Windows, have a limit on the number of sockets supported.
Generally I advise that you set hw:cpu_sockets to the typical number of sockets on the underlying host. Similarly, if the flavor will only be run on hosts with SMT/hyperthreading enabled, you should set hw:cpu_threads=2.
flavor.vcpus must be divisible by the product of hw:cpu_sockets, hw:cpu_cores and hw:cpu_threads if they are set.
So if you have hw:cpu_threads=2 it must be divisible by 2; if you have hw:cpu_threads=2 and hw:cpu_sockets=2, flavor.vcpus must be a multiple of 4.
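The divisibility rule and the default topology described above can be sketched in a few lines. This is not nova's actual validation code, which is more involved; `default_topology` and `topology_fits` are hypothetical helpers that only restate the rule as given in this message.

```python
def default_topology(vcpus):
    """Default described above: one socket per vCPU, 1 core, 1 thread."""
    return {"sockets": vcpus, "cores": 1, "threads": 1}


def topology_fits(vcpus, sockets=None, cores=None, threads=None):
    """Check that vcpus is divisible by the product of the values that are set."""
    factor = 1
    for value in (sockets, cores, threads):
        if value is not None:
            factor *= value
    return vcpus % factor == 0


# hw:cpu_threads=2 and hw:cpu_sockets=2 require a multiple of 4.
print(topology_fits(8, sockets=2, threads=2))
print(topology_fits(6, sockets=2, threads=2))

# The flavor from this thread: 8 vCPUs with hw:cpu_sockets=4, hw:cpu_cores=2.
print(topology_fits(8, sockets=4, cores=2))

# With the default topology an 8-vCPU guest would see 8 sockets.
print(default_topology(8))
```

The last call shows why the default can be a problem for guests with a socket limit: the socket count grows with the vCPU count.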
The note in the URL (CPU topology section) suggests that I should stay with the default options that nova provides.

In general, no: you should align it to the host topology if you have a similar topology across your data center.
The default should always just work, but it is not necessarily optimal, and Windows guests might not boot if you have too many sockets. Windows 10, for example, only supports 2 sockets, so you could only have 2 flavor.vcpus if you used the default topology.
Currently it also works with libvirt/QEMU driver but we don’t recommend it in production use cases. This is because vCPUs are actually running in one thread on host in qemu TCG (Tiny Code Generator), which is the backend for libvirt/QEMU driver. Work to enable full multi-threading support for TCG (a.k.a. MTTCG) is on going in QEMU community. Please see this MTTCG project <http://wiki.qemu.org/Features/tcg-multithread> page for detail.

On Fri, Jun 25, 2021 at 10:54 AM Sean Mooney <smooney@redhat.com> wrote:
We do not generally recommend using QEMU without KVM in production. The MTTCG backend is useful in cases where you want to emulate other platforms, but that use case is not currently supported in nova. For your deployment you should use libvirt with KVM, and you should also consider whether you want to support nested virtualisation or not.
Ammad
-- Regards,
Syed Ammad Ali