numa affinity question
Dear openstack user group,

I have a server with 2 numa nodes and I am trying to set up nova numa affinity.

[root@zeus-53 ~]# numactl -H
available: 2 nodes (0-1)
node 0 cpus: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 28 29 30 31 32 33 34 35 36 37 38 39 40 41
node 0 size: 262029 MB
node 0 free: 2536 MB
node 1 cpus: 14 15 16 17 18 19 20 21 22 23 24 25 26 27 42 43 44 45 46 47 48 49 50 51 52 53 54 55
node 1 size: 262144 MB
node 1 free: 250648 MB
node distances:
node   0   1
  0:  10  21
  1:  21  10

openstack flavor create --public xlarge.numa.perf \
  --ram 250000 --disk 700 --vcpus 25 \
  --property hw:cpu_policy=dedicated \
  --property hw:emulator_threads_policy=isolate \
  --property hw:numa_nodes='1' \
  --property pci_passthrough:alias='nvme:4'

openstack server create --network hpc --flavor xlarge.numa.perf \
  --image centos7.6-kudu-image \
  --availability-zone nova:zeus-53.localdomain \
  --key-name mykey kudu-1

But for some reason the second VM fails to create with an error. This is the xml dump for the created vm:

<name>instance-00000108</name>
<uuid>5d278c90-27ab-4ee4-aeea-e1bf36ac246a</uuid>
<metadata>
  <nova:instance xmlns:nova="http://openstack.org/xmlns/libvirt/nova/1.0">
    <nova:package version="18.2.2-1.el7"/>
    <nova:name>kudu-4</nova:name>
    <nova:creationTime>2019-09-25 07:20:32</nova:creationTime>
    <nova:flavor name="xlarge.numa.perf">
      <nova:memory>250000</nova:memory>
      <nova:disk>700</nova:disk>
      <nova:swap>0</nova:swap>
      <nova:ephemeral>0</nova:ephemeral>
      <nova:vcpus>25</nova:vcpus>
    </nova:flavor>
    <nova:owner>
      <nova:user uuid="91e83343e9834c8ba0172ff369c8acac">admin</nova:user>
      <nova:project uuid="b91520cff5bd45c59a8de07c38641582">admin</nova:project>
    </nova:owner>
    <nova:root type="image" uuid="ff9e09ac-86d5-4698-9883-cf3e6579caec"/>
  </nova:instance>
</metadata>
<memory unit='KiB'>256000000</memory>
<currentMemory unit='KiB'>256000000</currentMemory>
<vcpu placement='static'>25</vcpu>
<cputune>
  <shares>25600</shares>
  <vcpupin vcpu='0' cpuset='27'/>
  <vcpupin vcpu='1' cpuset='55'/>
  <vcpupin vcpu='2' cpuset='50'/>
  <vcpupin vcpu='3' cpuset='22'/>
  <vcpupin vcpu='4' cpuset='49'/>
  <vcpupin vcpu='5' cpuset='21'/>
  <vcpupin vcpu='6' cpuset='48'/>
  <vcpupin vcpu='7' cpuset='20'/>
  <vcpupin vcpu='8' cpuset='25'/>
  <vcpupin vcpu='9' cpuset='53'/>
  <vcpupin vcpu='10' cpuset='18'/>
  <vcpupin vcpu='11' cpuset='46'/>
  <vcpupin vcpu='12' cpuset='51'/>
  <vcpupin vcpu='13' cpuset='23'/>
  <vcpupin vcpu='14' cpuset='19'/>
  <vcpupin vcpu='15' cpuset='47'/>
  <vcpupin vcpu='16' cpuset='26'/>
  <vcpupin vcpu='17' cpuset='54'/>
  <vcpupin vcpu='18' cpuset='42'/>
  <vcpupin vcpu='19' cpuset='14'/>
  <vcpupin vcpu='20' cpuset='17'/>
  <vcpupin vcpu='21' cpuset='45'/>
  <vcpupin vcpu='22' cpuset='43'/>
  <vcpupin vcpu='23' cpuset='15'/>
  <vcpupin vcpu='24' cpuset='24'/>
  <emulatorpin cpuset='14-15,17-27,42-43,45-51,53-55'/>
</cputune>
<numatune>
  <memory mode='strict' nodeset='1'/>
  <memnode cellid='0' mode='strict' nodeset='1'/>
</numatune>
<sysinfo type='smbios'>
  <system>
    <entry name='manufacturer'>RDO</entry>
    <entry name='product'>OpenStack Compute</entry>
    <entry name='version'>18.2.2-1.el7</entry>
    <entry name='serial'>00000000-0000-0000-0000-0cc47aa482cc</entry>
    <entry name='uuid'>5d278c90-27ab-4ee4-aeea-e1bf36ac246a</entry>
    <entry name='family'>Virtual Machine</entry>
  </system>
</sysinfo>
<os>
  <type arch='x86_64' machine='pc-i440fx-rhel7.6.0'>hvm</type>
  <boot dev='hd'/>
  <smbios mode='sysinfo'/>
</os>
<features>
  <acpi/>
  <apic/>
</features>
<cpu mode='host-model' check='partial'>
  <model fallback='allow'/>
  <topology sockets='25' cores='1' threads='1'/>
  <numa>
    <cell id='0' cpus='0-24' memory='256000000' unit='KiB'/>
  </numa>
</cpu>
<clock offset='utc'>
  <timer name='pit' tickpolicy='delay'/>
  <timer name='rtc' tickpolicy='catchup'/>
  <timer name='hpet' present='no'/>
</clock>
<on_poweroff>destroy</on_poweroff>
<on_reboot>restart</on_reboot>
<on_crash>destroy</on_crash>
<devices>
  <emulator>/usr/libexec/qemu-kvm</emulator>
  <disk type='file' device='disk'>
    <driver name='qemu' type='qcow2' cache='none'/>
    <source file='/var/lib/nova/instances/5d278c90-27ab-4ee4-aeea-e1bf36ac246a/disk'/>
    <target dev='vda' bus='virtio'/>
    <address type='pci' domain='0x0000' bus='0x00' slot='0x04' function='0x0'/>
  </disk>
  <controller type='usb' index='0' model='piix3-uhci'>
    <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x2'/>
  </controller>
  <controller type='pci' index='0' model='pci-root'/>
  <interface type='bridge'>
    <mac address='fa:16:3e:d6:1d:78'/>
    <source bridge='qbr16cb6622-01'/>
    <target dev='tap16cb6622-01'/>
    <model type='virtio'/>
    <mtu size='1500'/>
    <address type='pci' domain='0x0000' bus='0x00' slot='0x03' function='0x0'/>
  </interface>
  <serial type='pty'>
    <log file='/var/lib/nova/instances/5d278c90-27ab-4ee4-aeea-e1bf36ac246a/console.log' append='off'/>
    <target type='isa-serial' port='0'>
      <model name='isa-serial'/>
    </target>
  </serial>
  <console type='pty'>
    <log file='/var/lib/nova/instances/5d278c90-27ab-4ee4-aeea-e1bf36ac246a/console.log' append='off'/>
    <target type='serial' port='0'/>
  </console>
  <input type='tablet' bus='usb'>
    <address type='usb' bus='0' port='1'/>
  </input>
  <input type='mouse' bus='ps2'/>
  <input type='keyboard' bus='ps2'/>
  <graphics type='vnc' port='-1' autoport='yes' listen='192.168.1.53'>
    <listen type='address' address='192.168.1.53'/>
  </graphics>
  <video>
    <model type='cirrus' vram='16384' heads='1' primary='yes'/>
    <address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x0'/>
  </video>
  <hostdev mode='subsystem' type='pci' managed='yes'>
    <source>
      <address domain='0x0000' bus='0x87' slot='0x00' function='0x0'/>
    </source>
    <address type='pci' domain='0x0000' bus='0x00' slot='0x05' function='0x0'/>
  </hostdev>
  <hostdev mode='subsystem' type='pci' managed='yes'>
    <source>
      <address domain='0x0000' bus='0x86' slot='0x00' function='0x0'/>
    </source>
    <address type='pci' domain='0x0000' bus='0x00' slot='0x06' function='0x0'/>
  </hostdev>
  <hostdev mode='subsystem' type='pci' managed='yes'>
    <source>
      <address domain='0x0000' bus='0x85' slot='0x00' function='0x0'/>
    </source>
    <address type='pci' domain='0x0000' bus='0x00' slot='0x07' function='0x0'/>
  </hostdev>
  <hostdev mode='subsystem' type='pci' managed='yes'>
    <source>
      <address domain='0x0000' bus='0x84' slot='0x00' function='0x0'/>
    </source>
    <address type='pci' domain='0x0000' bus='0x00' slot='0x08' function='0x0'/>
  </hostdev>
  <memballoon model='virtio'>
    <stats period='10'/>
    <address type='pci' domain='0x0000' bus='0x00' slot='0x09' function='0x0'/>
  </memballoon>
</devices>
</domain>

QUESTIONS:

Are my vcpus in the same numa node? If not, why?

How can I tell from the xml dump that all vcpus are assigned to the same numa node?

Thank you very much
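For reference, the node-to-cpu mapping that numactl -H prints can also be read straight from sysfs, which makes checks like the ones asked about above scriptable. A minimal sketch (standard Linux sysfs paths; nothing openstack-specific is assumed):

# A rough sketch, not part of the thread: rebuild the node-to-CPU view of
# numactl -H by reading the kernel's sysfs topology files.
import glob
import os
import re

def parse_cpulist(text):
    """Expand a kernel cpulist string such as '0-13,28-41' into a set of ints."""
    cpus = set()
    for part in text.strip().split(','):
        if '-' in part:
            lo, hi = part.split('-')
            cpus.update(range(int(lo), int(hi) + 1))
        elif part:
            cpus.add(int(part))
    return cpus

def host_numa_topology():
    """Return {node_id: set of cpu ids} for every NUMA node in sysfs."""
    topology = {}
    for node_dir in glob.glob('/sys/devices/system/node/node[0-9]*'):
        node_id = int(re.search(r'node(\d+)$', node_dir).group(1))
        with open(os.path.join(node_dir, 'cpulist')) as f:
            topology[node_id] = parse_cpulist(f.read())
    return topology

if __name__ == '__main__':
    for node, cpus in sorted(host_numa_topology().items()):
        print('node %d cpus: %s' % (node, ' '.join(str(c) for c in sorted(cpus))))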
On Wed, 2019-09-25 at 07:49 +0000, Manuel Sopena Ballesteros wrote:
Dear openstack user group,
I have a server with 2 numa nodes and I am trying to set up nova numa affinity.
[snip]
<vcpu placement='static'>25</vcpu>
<cputune>
  <shares>25600</shares>
  <vcpupin vcpu='0' cpuset='27'/>
  <vcpupin vcpu='1' cpuset='55'/>
  <vcpupin vcpu='2' cpuset='50'/>
  <vcpupin vcpu='3' cpuset='22'/>
  <vcpupin vcpu='4' cpuset='49'/>
  <vcpupin vcpu='5' cpuset='21'/>
  <vcpupin vcpu='6' cpuset='48'/>
  <vcpupin vcpu='7' cpuset='20'/>
  <vcpupin vcpu='8' cpuset='25'/>
  <vcpupin vcpu='9' cpuset='53'/>
  <vcpupin vcpu='10' cpuset='18'/>
  <vcpupin vcpu='11' cpuset='46'/>
  <vcpupin vcpu='12' cpuset='51'/>
  <vcpupin vcpu='13' cpuset='23'/>
  <vcpupin vcpu='14' cpuset='19'/>
  <vcpupin vcpu='15' cpuset='47'/>
  <vcpupin vcpu='16' cpuset='26'/>
  <vcpupin vcpu='17' cpuset='54'/>
  <vcpupin vcpu='18' cpuset='42'/>
  <vcpupin vcpu='19' cpuset='14'/>
  <vcpupin vcpu='20' cpuset='17'/>
  <vcpupin vcpu='21' cpuset='45'/>
  <vcpupin vcpu='22' cpuset='43'/>
  <vcpupin vcpu='23' cpuset='15'/>
  <vcpupin vcpu='24' cpuset='24'/>
  <emulatorpin cpuset='14-15,17-27,42-43,45-51,53-55'/>
</cputune>
[snip]

This is what you want. See those 'cpuset' values? All of them are taken from cores on host NUMA node #1, as you noted previously:
node 1 cpus: 14 15 16 17 18 19 20 21 22 23 24 25 26 27 42 43 44 45 46 47 48 49 50 51 52 53 54 55
So to answer your questions:
Are my vcpus in the same numa node? If not, why?
Yes.
How can I tell from the xml dump that all vcpus are assigned to the same numa node?
For a single guest topology, look at the 'cpuset' values and ensure they all map to cores from a single host NUMA node.

Stephen
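That check is easy to script as well. A minimal sketch, assuming the domain XML has been saved locally (e.g. virsh dumpxml instance-00000108 > dom.xml; the file name is illustrative) and hardcoding the node-to-cpu map from the numactl -H output above:

# A rough sketch, not part of the thread: verify that every <vcpupin>
# in a saved domain XML lands on a single host NUMA node.
import xml.etree.ElementTree as ET

# Host topology hardcoded from the numactl -H output above.
NODE_CPUS = {
    0: set(range(0, 14)) | set(range(28, 42)),
    1: set(range(14, 28)) | set(range(42, 56)),
}

def pinned_host_cpus(xml_path):
    """Collect every host CPU referenced by a <vcpupin> element."""
    root = ET.parse(xml_path).getroot()
    # With hw:cpu_policy=dedicated each vcpupin carries a single CPU id.
    return {int(pin.get('cpuset')) for pin in root.iter('vcpupin')}

def nodes_used(cpus):
    """Host NUMA nodes that the given CPUs belong to."""
    return {node for node, node_cpus in NODE_CPUS.items() if cpus & node_cpus}

if __name__ == '__main__':
    cpus = pinned_host_cpus('dom.xml')  # illustrative file name
    print('pinned host cpus:', sorted(cpus))
    print('host numa nodes used:', sorted(nodes_used(cpus)))  # [1] for this dump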
ok, but numa node 0 has much less free memory, so it looks like the vm memory is assigned to node 0 while the vcpus are in node 1. Is that right? How can I make openstack tell kvm to keep memory and vcpus local?

thank you

________________________________________
From: Stephen Finucane [sfinucan@redhat.com]
Sent: Wednesday, 25 September 2019 19:08
To: Manuel Sopena Ballesteros; openstack-discuss@lists.openstack.org
Subject: Re: numa affinity question

On Wed, 2019-09-25 at 07:49 +0000, Manuel Sopena Ballesteros wrote:
[snip]
On Wed, 2019-09-25 at 15:02 +0000, Manuel Sopena Ballesteros wrote:
ok, but numa node 0 has much less free memory, so it looks like the vm memory is assigned to node 0 while the vcpus are in node 1.
Is that right?

no

How can I make openstack tell kvm to keep memory and vcpus local?

if you have a guest with a numa topology, nova will require that the guest memory is allocated from the same numa nodes as the vcpus. you cannot disable this behavior. so your vm will have its memory also allocated from host numa node 1, where its vcpus are pinned.
you can see that from the numatune element:

<numatune>
  <memory mode='strict' nodeset='1'/>
  <memnode cellid='0' mode='strict' nodeset='1'/>
</numatune>

in this case guest numa node 0 is mapped to host numa node 1, so all guest memory and cpus are mapped from host numa node 1.
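The memory side can be checked with the same kind of sketch as the vcpu check earlier in the thread: pull the nodeset values out of <numatune> and compare them with the node the vcpus are pinned to. Same illustrative dom.xml assumption as before:

# A rough sketch, not part of the thread: list the host NUMA nodes that
# <numatune> binds guest memory to.
import xml.etree.ElementTree as ET

def expand_nodeset(text):
    """Expand a nodeset string such as '1' or '0-1' into a set of ints."""
    nodes = set()
    for part in text.split(','):
        if '-' in part:
            lo, hi = part.split('-')
            nodes.update(range(int(lo), int(hi) + 1))
        else:
            nodes.add(int(part))
    return nodes

def memory_nodesets(xml_path):
    """Union of the nodesets on the children of <numatune>."""
    root = ET.parse(xml_path).getroot()
    nodes = set()
    for numatune in root.iter('numatune'):
        for elem in numatune:
            if elem.get('nodeset') is not None:
                nodes |= expand_nodeset(elem.get('nodeset'))
    return nodes

if __name__ == '__main__':
    # For the dump above this prints [1], the same node as the vcpu pins,
    # i.e. memory and vcpus are local to host numa node 1.
    print('guest memory bound to host nodes:', sorted(memory_nodesets('dom.xml')))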
[snip]
participants (3)
- Manuel Sopena Ballesteros
- Sean Mooney
- Stephen Finucane