CPU Topology confusion

Satish Patel satish.txt at gmail.com
Thu Mar 5 17:11:37 UTC 2020


Eddie,

I have tried everything to match or fix the CPU topology layout, but it
never comes out correctly, as shown in the screenshot. I also checked on
Alicloud: they run KVM as well, and their virtual machines' lstopo
output closely matches the physical machine, including the L1i / L1d
cache layout.

If you look at the following output, it seems strange: I am using the
"-cpu host" option, but there are still a lot of flags missing from my
virtual machine's /proc/cpuinfo. Is that normal?

This is the output from my VM (virtual machine):

# grep flags /proc/cpuinfo | uniq
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov
pat pse36 clflush mmx fxsr sse sse2 ss ht syscall nx pdpe1gb rdtscp lm
constant_tsc arch_perfmon rep_good nopl xtopology eagerfpu pni
pclmulqdq ssse3 fma cx16 pcid sse4_1 sse4_2 x2apic movbe popcnt
tsc_deadline_timer aes xsave avx f16c rdrand hypervisor lahf_lm abm
arat fsgsbase tsc_adjust bmi1 avx2 smep bmi2 erms invpcid xsaveopt

This is the compute machine (physical host):

# grep flags /proc/cpuinfo | uniq
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov
pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx
pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl
xtopology nonstop_tsc aperfmperf eagerfpu pni pclmulqdq dtes64 monitor
ds_cpl vmx smx est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid dca sse4_1
sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c
rdrand lahf_lm abm epb invpcid_single intel_ppin ssbd ibrs ibpb stibp
tpr_shadow vnmi flexpriority ept vpid fsgsbase tsc_adjust bmi1 avx2
smep bmi2 erms invpcid cqm xsaveopt cqm_llc cqm_occup_llc dtherm ida
arat pln pts md_clear spec_ctrl intel_stibp flush_l1d
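
To see exactly which flags differ, here is a quick sketch: run the first
command on the compute node, the second inside the VM, copy one of the
files across, then diff them (the file names are just placeholders):

# grep -m1 '^flags' /proc/cpuinfo | tr ' ' '\n' | sort > host-flags.txt   # on the compute node
# grep -m1 '^flags' /proc/cpuinfo | tr ' ' '\n' | sort > guest-flags.txt  # inside the VM
# diff host-flags.txt guest-flags.txt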

On Thu, Mar 5, 2020 at 11:26 AM Eddie Yen <missile0407 at gmail.com> wrote:
>
> Hi Satish,
>
> Since you already set "cpu_mode = host-passthrough", there's no need
> to set cpu_model.
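>
> For reference, a minimal nova.conf sketch of that on the compute node
> (as far as I know, cpu_model is only read when cpu_mode = custom, so it
> can simply stay unset):
>
>     [libvirt]
>     cpu_mode = host-passthrough
>     # cpu_model is only used when cpu_mode = custom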
>
> BTW, we don't know a lot about CPU topology. But in my experience we
> always set "hw_cpu_sockets = 2" in the image or flavor metadata when
> running Windows instances. By default, KVM allocates every vCPU as its
> own socket in the CPU topology, and this hurts Windows VM performance
> since Windows only supports a maximum of 2 CPU sockets.
>
> Perhaps you can try limiting the number of sockets by setting
> hw_cpu_sockets in the image metadata (or hw:cpu_sockets in the flavor
> metadata).
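>
> A rough sketch of how to set that (flavor and image names below are just
> placeholders):
>
>     $ openstack flavor set --property hw:cpu_sockets=2 my-flavor
>     $ openstack image set --property hw_cpu_sockets=2 my-windows-image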
>
> On Thu, Mar 5, 2020 at 10:46 PM Satish Patel <satish.txt at gmail.com> wrote:
>>
>>
>> cpu_mode = cpu-passthrough
>> cpu_model = none
>>
>> Do you think cpu_model makes a difference?
>>
>>
>> Sent from my iPhone
>>
>> On Mar 5, 2020, at 7:18 AM, Satish Patel<satish.txt at gmail.com> wrote:
>>
>>
>>
>> cpu-passthrough
>>
>> Sent from my iPhone
>>
>> On Mar 4, 2020, at 9:24 PM, rui zang <rui.zang at yandex.com> wrote:
>>
>>
>> Hi,
>>
>> What is the value for the "cpu_mode" configuration option?
>> https://docs.openstack.org/mitaka/config-reference/compute/hypervisor-kvm.html
>>
>> Thanks,
>> Zang, Rui
>>
>>
>> 05.03.2020, 01:24, "Satish Patel" <satish.txt at gmail.com>:
>>
>> Folks,
>>
>> We are running OpenStack with KVM, and I have noticed that KVM presents
>> the wrong CPU topology to the VM; because of that we are seeing poor
>> performance in our application.
>>
>> This is the OpenStack compute node:
>>
>> # lstopo-no-graphics --no-io
>> Machine (64GB total)
>>   NUMANode L#0 (P#0 32GB) + Package L#0 + L3 L#0 (25MB)
>>     L2 L#0 (256KB) + L1d L#0 (32KB) + L1i L#0 (32KB) + Core L#0
>>       PU L#0 (P#0)
>>       PU L#1 (P#20)
>>     L2 L#1 (256KB) + L1d L#1 (32KB) + L1i L#1 (32KB) + Core L#1
>>       PU L#2 (P#1)
>>       PU L#3 (P#21)
>>
>> This is the VM running on the above compute node:
>>
>> # lstopo-no-graphics --no-io
>> Machine (59GB total)
>>   NUMANode L#0 (P#0 29GB) + Package L#0 + L3 L#0 (16MB)
>>     L2 L#0 (4096KB) + Core L#0
>>       L1d L#0 (32KB) + L1i L#0 (32KB) + PU L#0 (P#0)
>>       L1d L#1 (32KB) + L1i L#1 (32KB) + PU L#1 (P#1)
>>     L2 L#1 (4096KB) + Core L#1
>>       L1d L#2 (32KB) + L1i L#2 (32KB) + PU L#2 (P#2)
>>       L1d L#3 (32KB) + L1i L#3 (32KB) + PU L#3 (P#3)
>>
>> If you notice, P#0 and P#1 each have their own 32KB cache per thread,
>> which is the wrong presentation if you compare it with the physical CPU.
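>>
>> One way to check what topology libvirt actually handed to QEMU is to
>> look at the generated domain XML on the compute node, roughly like this
>> (the instance name below is just a placeholder):
>>
>> # virsh dumpxml instance-00000001 | grep -E '<cpu|topology|cache'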
>>
>> This is a screenshot of AWS vs. OpenStack CPU topology; looking at the
>> OpenStack one, its presentation is a little odd. Is that normal?
>>
>> https://imgur.com/a/2sPwJVC
>>
>> I am running CentOS 7.6 with KVM (QEMU 2.12).
>>


