What is the difference between 4 and 8 virtual sockets on a host with 4 physical sockets?
Hello Team,
What is the difference between creating an instance with *4* or *8 virtual sockets*, given that the hypervisor has only *4 physical sockets*? My question is where virtual sockets, cores and threads fit into the physical hardware. I think this question is not specific to OpenStack, but applies to any virtualization.
My hypervisor configuration is as follows:
CPU(s):              192
Online CPU(s) list:  0-191
Thread(s) per core:  2
Core(s) per socket:  24
Socket(s):           4
NUMA node(s):        4
Do you have any documentation that I can read and understand better?
Have a nice week!
In general, the only parameter you want to align with the physical host is the number of threads.
So if the physical host has 2 threads per physical core, then it's best to also do that in the VM.
We generally recommend setting the number of virtual sockets equal to the number of virtual NUMA nodes. If the VM has no explicit NUMA topology then you should set sockets=1, otherwise hw:cpu_sockets == hw:numa_nodes is our recommendation.
For Windows in particular the default generated config is suboptimal, as Windows client editions only support 1-2 sockets and Windows Server maxes out at 4, I believe.
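As a rough sketch of how that maps to flavor extra specs (the flavor name and sizes here are made up for illustration), for a VM with an explicit 2-node NUMA topology:

    # hypothetical 8 vCPU flavor
    openstack flavor create --vcpus 8 --ram 16384 --disk 40 m1.numa
    # give the guest 2 virtual NUMA nodes, align the virtual sockets to them,
    # and match the host's 2 threads per core (2 sockets x 2 cores x 2 threads = 8 vCPUs)
    openstack flavor set m1.numa \
        --property hw:numa_nodes=2 \
        --property hw:cpu_sockets=2 \
        --property hw:cpu_cores=2 \
        --property hw:cpu_threads=2

For a VM with no explicit NUMA topology you would set hw:cpu_sockets=1 instead.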
Yes, yes, that I understand.
I know that, for example, if I want to use host_passthrough then I must use cpu_sockets == numa_nodes. *My question is more conceptual, so that I can understand.*
*For example:* If I have a physical host with 1 physical processor (1 socket), can I define my instance with 2, 4, 8+ sockets? I mean, is it good practice? Is it correct to define the instance with 1 socket and increase the number of cores per socket instead?
I'm not sure I explained my question well... In short, what is the relationship between virtual sockets, cores and threads and physical sockets, cores and threads? PS: I'm not asking about the passthrough question here.
On Mon, 2023-09-04 at 18:13 -0300, Jorge Visentini wrote:
Yes, yes, that I understand.
I know that, for example, if I want to use host_passthrough then I must use cpu_sockets == numa_nodes.
No, even without cpu_mode=host-passthrough you should use cpu_sockets == numa_nodes. The reason is that the Windows scheduler, and to a lesser degree the Linux scheduler, tends to make better scheduling decisions when the NUMA nodes and sockets are aligned.
Intel has supported cluster-on-die since around Sandy Bridge/Ivy Bridge, but it was not until quite recently, with the AMD EPYC platform and the move to chiplet designs, that the Windows and Linux kernel schedulers really got optimised to properly handle multiple NUMA nodes per socket.
In the very early days of OpenStack, qemu/libvirt strongly preferred generating topologies with 1 socket, 1 core and 1 thread per vCPU. That is still what we default to today in nova, because 10+ years ago that outperformed other topologies in some tests done in a non-OpenStack environment, i.e. with just libvirt/qemu.
When I got involved in OpenStack 10 years ago the cracks were already forming in that analysis, and when we started to look at adding CPU pinning and NUMA support, the initial benchmarks we did showed no benefit to 1 socket per vCPU; in fact, if you had hyperthreading on the host it actually performed worse.
For legacy reasons we did not change the default topology, but our recommendation has been 1 socket per NUMA node and 1 thread per host thread by default.
So an 8 vCPU, normal floating VM with no special NUMA topology should be 1 socket, 2 threads and 4 cores if the host has SMT/hyperthreading enabled, or 1 socket, 1 thread and 8 cores otherwise; not 8 sockets, 1 thread and 1 core, which is the default topology.
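Expressed as flavor extra specs, that recommended topology would look something like this (the flavor name is illustrative, assuming a host with SMT enabled):

    # 1 socket x 4 cores x 2 threads = 8 vCPUs
    openstack flavor set m1.example8 \
        --property hw:cpu_sockets=1 \
        --property hw:cpu_cores=4 \
        --property hw:cpu_threads=2

On a host without SMT you would use hw:cpu_cores=8 and hw:cpu_threads=1 instead.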
*My question is more conceptual, so that I can understand.*
No worries.
*For example:* If I have a physical host with 1 physical processor (1 socket), can I define my instance with 2, 4, 8+ sockets? I mean, is it good practice? Is it correct to define the instance with 1 socket and increase the number of cores per socket instead?
You do not need to align the virtual topology to the host topology in any way.
So on a host with 1 socket, 1 thread per core and 64 cores, you can create a VM with 16 sockets, 2 threads per core and 1 core per socket.
Largely this won't impact performance versus 1 socket, 1 thread per core and 32 cores per socket, which would be our recommended topology.
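To make that concrete (a 32 vCPU flavor with a made-up name, on that hypothetical 64-core non-SMT host), either of these is accepted and behaves roughly the same:

    # 16 sockets x 1 core x 2 threads = 32 vCPUs
    openstack flavor set m1.32c \
        --property hw:cpu_sockets=16 \
        --property hw:cpu_cores=1 \
        --property hw:cpu_threads=2

    # the recommended layout: 1 socket x 32 cores x 1 thread = 32 vCPUs
    openstack flavor set m1.32c \
        --property hw:cpu_sockets=1 \
        --property hw:cpu_cores=32 \
        --property hw:cpu_threads=1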
What you are trying to do with the virtual topology is create a config that enables the guest kernel scheduler to make good choices.
Historically, kernel schedulers were largely not NUMA-aware, but they did consider moving a process between sockets to be very costly and as a result avoided it. That meant that for old kernels (the Windows 2012 or CentOS 6/Linux 2.x era) it was more efficient to have 1 socket per vCPU.
That changed on the Linux side many years ago, and on Windows a lot more recently. Now it is more important for the scheduler to know about the NUMA topology and thread topology. I.e. in the old model of 1 socket per vCPU, if you had hyperthreading enabled the guest kernel would expect that it can run n processes in parallel when in fact it can really only run n/2, so matching the thread count to the host thread count allows the guest kernel to make better choices.
Setting threads=2 when the host has threads=1 also tends to have little downside, for what it's worth.
On the Linux side, it's my understanding that the kernel also does some extra work per socket that can be eliminated if you use 1 socket instead of 32, but that is negligible unless you're dealing with realtime workloads, where it might matter.
I'm not sure I explained my question well... In short, what is the relationship between virtual sockets, cores and threads and physical sockets, cores and threads? PS: I'm not asking about the passthrough question here.
Yep. Conceptually, as a user of OpenStack you should have no awareness of the host platform. Specifically, as an unprivileged user you are not even meant to know whether it's libvirt/KVM or VMware from an API perspective.
From an OpenStack admin point of view it's in your interest to craft your flavors and default images to be as power efficient as possible. The best way to achieve power efficiency is often to make the workload perform better so that it can idle sooner, and you can take one step in that direction by using some of your knowledge of the hardware to tune your images.
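For instance (the image name is hypothetical), the Windows socket limits mentioned earlier can be handled on the image side by capping the number of sockets nova will generate, so extra vCPUs get packed into cores instead:

    # Windows Server guests: never generate more than 4 sockets,
    # and match the host's 2 threads per core
    openstack image set windows-server-2022 \
        --property hw_cpu_max_sockets=4 \
        --property hw_cpu_threads=2

The hw_cpu_* image properties are the image-side equivalents of the hw:cpu_* flavor extra specs.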
As an end user, because you are not meant to know the topology of the host systems, you generally should benchmark your workload, but you should not really expect a large delta regardless of what you choose today.
OpenStack is not a virtualisation platform, it's a cloud platform, and while we support some enhanced platform awareness features like CPU pinning, this is largely intended to be something the cloud admin configures and the end user just selects, rather than something an end user should have to deeply understand.
So, tl;dr: virtual CPU topologies are about optimising the VM for the workload, not the host. Optimising for the host can give a minimal performance uplift for the VM, but the virtual CPU topology is intentionally decoupled from the host topology.
The virtual NUMA topology feature is purely a performance enhancement feature and is implicitly tied to a host. I.e. if you ask for 2 NUMA nodes, we will pin each guest NUMA node to a separate host NUMA node. Virtual NUMA topologies and CPU topologies had two entirely different design goals. NUMA is tied to the host topology to optimise for hardware constraints; CPU topologies are not tied to the host hardware, as they are optimising for guest software constraints (i.e. Windows Server only supports 4 sockets, so if you need more than 4 vCPUs you cannot have 1 socket per vCPU).
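As a small sketch of that split (the flavor name is made up): hw:numa_nodes is what gets tied to the host, while the hw:cpu_* topology properties only describe what the guest OS sees:

    # pin guest vCPUs to dedicated host cores and give the guest 2 NUMA nodes;
    # each guest NUMA node is placed on a separate host NUMA node
    openstack flavor set m1.pinned \
        --property hw:cpu_policy=dedicated \
        --property hw:numa_nodes=2 \
        --property hw:cpu_sockets=2 \
        --property hw:cpu_threads=2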
participants (2)
- Jorge Visentini
- smooney@redhat.com