What is the difference between 4 and 8 virtual sockets to physical sockets?

smooney at redhat.com
Tue Sep 5 08:48:52 UTC 2023


On Mon, 2023-09-04 at 18:13 -0300, Jorge Visentini wrote:
> Yes, yes, that I understand.
> 
> I know that for example if I want to use host_passthrough then I must use
> cpu_sockets == numa_nodes.
No, even without cpu_mode=host-passthrough you should use cpu_sockets == numa_nodes.
The reason is that the Windows scheduler, and to a lesser degree the Linux scheduler,
tends to make better scheduling decisions when the NUMA nodes and sockets are aligned.
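
For example, a minimal sketch of a flavor that keeps them aligned (the flavor name and
sizes are made up purely for illustration; hw:numa_nodes and hw:cpu_sockets are the
standard flavor extra specs):

    # hypothetical 8 vCPU flavor spread over 2 guest NUMA nodes,
    # with the virtual sockets aligned to the virtual NUMA nodes
    openstack flavor create --vcpus 8 --ram 16384 --disk 40 example.numa
    openstack flavor set example.numa \
        --property hw:numa_nodes=2 \
        --property hw:cpu_sockets=2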

Intel has supported cluster-on-die since around Sandy Bridge/Ivy Bridge, but it was not until
very recently, with the AMD EPYC platform and the move to chiplet designs, that the Windows
and Linux kernel schedulers really got optimised to properly handle multiple NUMA nodes
per socket.

In the very early days of OpenStack, QEMU/libvirt strongly preferred generating topologies with 1
socket, 1 core and 1 thread per vCPU. That is still what we default to today in Nova, because
10+ years ago that outperformed other topologies in some tests done in a non-OpenStack environment, i.e. with just libvirt/QEMU.

When I got involved in OpenStack 10 years ago the cracks were already forming in that analysis, and when we
started to look at adding CPU pinning and NUMA support, the initial benchmarks we did showed no benefit to 1 socket per
vCPU; in fact, if you had hyperthreading on the host it actually performed worse.

For legacy reasons we did not change the default topology, but our recommendation has been
1 socket per NUMA node and 1 thread per host thread by default.

So an 8 vCPU flavor for a normal floating VM with no special NUMA topology should be:

1 socket, 2 threads and 4 cores per socket if the host has SMT/hyperthreading enabled, or
1 socket, 1 thread and 8 cores per socket; not 8 sockets, 1 core and 1 thread, which is the default topology.
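
As a concrete sketch (the flavor names are invented; hw:cpu_sockets, hw:cpu_cores and
hw:cpu_threads are the standard flavor extra specs):

    # 8 vCPU flavor for hosts with SMT/hyperthreading enabled
    openstack flavor set example.8vcpu.smt \
        --property hw:cpu_sockets=1 \
        --property hw:cpu_cores=4 \
        --property hw:cpu_threads=2

    # 8 vCPU flavor for hosts without SMT
    openstack flavor set example.8vcpu \
        --property hw:cpu_sockets=1 \
        --property hw:cpu_cores=8 \
        --property hw:cpu_threads=1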

> *My question is more conceptual, for me to understand.*
No worries.
> 
> *For example:* If I have a physical host with 1 physical processor (1
> socket), can I define my instance with 2, 4, 8+ sockets? I mean, is it good
> practice? Is it correct to define the instance with 1 socket and increase
> the amount of socket cores?

You do not need to align the virtual topology to the host topology in any way.

So on a host with 1 socket, 1 thread per core and 64 cores,
you can create a VM with 16 sockets, 2 threads per core and 1 core per socket.

Largely this won't impact performance vs

1 socket, 1 thread per core and 32 cores per socket,

which would be our recommended topology.
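
i.e. for a 32 vCPU flavor, either of the following would work, with the second being the
recommended shape (flavor names are made up for illustration):

    # 16 sockets x 1 core x 2 threads = 32 vCPUs
    openstack flavor set example.32vcpu.wide \
        --property hw:cpu_sockets=16 \
        --property hw:cpu_cores=1 \
        --property hw:cpu_threads=2

    # 1 socket x 32 cores x 1 thread = 32 vCPUs (recommended)
    openstack flavor set example.32vcpu \
        --property hw:cpu_sockets=1 \
        --property hw:cpu_cores=32 \
        --property hw:cpu_threads=1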

What you are trying to do with the virtual topology is create a config that enables the guest
kernel scheduler to make good choices.

Historically, kernel schedulers were largely not NUMA-aware but did consider moving a process
between sockets to be very costly, and as a result they avoided it.
That meant for old kernels (Windows 2012, or Linux of the CentOS 6/Linux 2.x era) it was more
efficient to have 1 socket per vCPU.

That changed on the Linux side many years ago, and on Windows a lot more recently.
Now it is more important for the scheduler to know about the NUMA topology and thread
topology. I.e. in the old model of 1 socket per vCPU, if you had hyperthreading enabled,
the guest kernel would expect that it can run n processes in parallel when in fact
it can really only run n/2, so matching the thread count to the host thread count allows
the guest kernel to make better choices.

Setting threads=2 when the host has threads=1 also tends to have little downside, for what it's worth.

On the Linux side, it is my understanding that the kernel also does some extra work per socket
that can be eliminated if you use 1 socket instead of 32, but that is negligible unless you are dealing
with realtime workloads, where it might matter.


> 
> Not sure if I could explain my question... In short, what is the
> relationship between the socket, cores and virtual thread and the socket,
> cores and physical thread.
> PS: I'm not into the issue of passthrough or not.

Yep, so conceptually, as a user of OpenStack you should have no awareness of the host platform.
Specifically, as an unprivileged user you are not even meant to know whether it is libvirt/KVM or VMware
from an API perspective.

From an OpenStack admin point of view, it is in your interest to craft your flavors and default
images to be as power efficient as possible. The best way to achieve power efficiency is often
to make the workload perform better so that it can idle sooner, and you can take one
step in that direction by using some of your knowledge of the hardware to tune your images.
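
For example, a rough sketch of an image-level hint for a fleet you know has SMT enabled
(the image name is made up; hw_cpu_sockets and hw_cpu_threads are the image-metadata
equivalents of the flavor extra specs):

    # hint the preferred virtual topology on the image itself
    openstack image set my-guest-image \
        --property hw_cpu_sockets=1 \
        --property hw_cpu_threads=2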

As an end user, because you are not meant to know the topology of the host systems, you
generally should benchmark your workload, but you should not really expect a large delta
regardless of what you choose today.


OpenStack is not a virtualisation platform, it is a cloud platform, and while we support
some enhanced platform awareness features like CPU pinning, this is largely intended to be
something the cloud admin configures and the end user just selects, rather than something
an end user should have to deeply understand.

So, tl;dr: virtual CPU topologies are about optimising the VM for the workload, not the host.
Optimising for the host can give a minimal performance uplift for the VM, but the virtual
CPU topology is intentionally decoupled from the host topology.

The virtual NUMA topology feature is purely a performance enhancement feature
and is implicitly tied to a host, i.e. if you ask for 2 NUMA nodes we will pin
each guest NUMA node to a separate host NUMA node. Virtual NUMA topologies
and CPU topologies had two entirely different design goals. NUMA is tied to the
host topology to optimise for hardware constraints; CPU topologies are not tied
to the host hardware, as they are optimising for guest software constraints
(i.e. Windows Server only supports 4 sockets, so if you need more than 4 vCPUs you cannot
have 1 socket per vCPU).
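
e.g. a sketch of a 16 vCPU Windows Server flavor (name made up) that stays within the guest
socket limit while keeping the virtual sockets aligned with the virtual NUMA nodes:

    # 2 sockets x 8 cores x 1 thread = 16 vCPUs, sockets aligned to NUMA nodes
    openstack flavor set example.win.16vcpu \
        --property hw:numa_nodes=2 \
        --property hw:cpu_sockets=2 \
        --property hw:cpu_cores=8 \
        --property hw:cpu_threads=1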

> 
> On Mon, Sep 4, 2023 at 13:10, <smooney at redhat.com> wrote:
> 
> > In general, the only parameter you want to align to the physical host is
> > the number of threads.
> > 
> > So if the physical host has 2 threads per physical core, then it is best to
> > also do that in the VM.
> > 
> > We generally recommend setting the number of virtual sockets equal to the
> > number of virtual NUMA nodes.
> > If the VM has no explicit NUMA topology then you should set sockets=1,
> > else hw:cpu_sockets == hw:numa_nodes is our recommendation.
> > 
> > For Windows in particular, the default config generated is suboptimal, as
> > Windows client only supports 1-2 sockets
> > and Windows Server maxes out at 4, I believe.
> > 
> > On Mon, 2023-09-04 at 11:51 -0300, Jorge Visentini wrote:
> > > Hello Team,
> > > 
> > > What is the difference between creating an instance with *4* or *8
> > virtual
> > > sockets*, since the hypervisor has only *4 physical sockets*.
> > > My question is where do sockets, cores and virtual threads fit into the
> > > physical hardware. I think this question is not just related to
> > Openstack,
> > > but with any virtualization.
> > > 
> > > My hypervisor configuration is as follows:
> > > 
> > > CPU(s): 192
> > > Online CPU(s) list: 0-191
> > > Thread(s) per core: 2
> > > Core(s) per socket: 24
> > > Socket(s): 4
> > > NUMA node(s): 4
> > > 
> > > Do you have any documentation that I can read and understand better?
> > > 
> > > That we have a nice week!
> > 
> > 
> 



