[nova] [nova-scheduler] Performance Issue with CPU Pinning Scheduling and Duplicate IPs
Hi,

In our environment (Kolla-Ansible Ussuri, 1200+ hypervisors), we are experiencing performance issues during bulk VM deployments. Launching around 100 VMs takes approximately *15 minutes*, and *nova-scheduler* spends almost all of that time *calculating available pinned CPUs*. In our Yoga production deployment, the same operation is about *10 times faster*: around *1–2 minutes* for 100 VMs.

Additionally, after the batch completes, some instances have failed to be created, and some of the remaining instances have received *multiple (2–3) IP addresses* from Neutron instead of one.

Our Nova services run in a *NUMA-aware, CPU-pinning* configuration; the other settings are almost identical to the defaults.

Below is an excerpt from the nova-scheduler logs during VM creation (this message is repeated thousands of times within the 15 minutes):

Computed NUMA topology CPU pinning: usable pCPUs: [[26, 90], [89, 25], [24, 88], [27, 91], [29, 93], [10, 74], [86, 22], [28, 92], [73, 9], [23, 87], [8, 72], [18, 82], [94, 30], [81, 17], [31, 95]], vCPUs mapping: [(0, 26), (1, 90), (2, 89), (3, 25), (4, 24), (5, 88), (6, 27), (7, 91), (8, 29), (9, 93), (10, 10), (11, 74), (12, 86), (13, 22), (14, 28)]

Do you have any suggestions on what might be causing:

1. The scheduler to spend so much time calculating pinned CPUs in this environment, and
2. Some VMs to get multiple IPs from Neutron after creation?

Any guidance would be greatly appreciated.

Best regards,
İzzettin
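To put a number on how often the pinning message repeats, a minimal Python sketch along the following lines could be run over the scheduler log. It assumes the message text matches the excerpt above exactly and ignores whatever prefix or suffix the logger adds around it. `parse_pinning` and `summarize` are hypothetical helper names for illustration, not part of Nova.

```python
import ast
import re
from collections import Counter

# Matches only the message text quoted in the excerpt; any timestamp,
# request ID, or trailing source location on the line is ignored.
PINNING_RE = re.compile(
    r"Computed NUMA topology CPU pinning: usable pCPUs: (\[\[.*?\]\]), "
    r"vCPUs mapping: (\[.*?\])"
)


def parse_pinning(line):
    """Return (usable_pcpus, vcpu_mapping) for a pinning line, else None."""
    match = PINNING_RE.search(line)
    if match is None:
        return None
    usable = ast.literal_eval(match.group(1))   # list of [pCPU, pCPU] sets
    mapping = ast.literal_eval(match.group(2))  # list of (vCPU, pCPU) tuples
    return usable, mapping


def summarize(lines):
    """Return (total pinning messages, Counter keyed by distinct mapping)."""
    counts = Counter()
    for line in lines:
        parsed = parse_pinning(line)
        if parsed is not None:
            _, mapping = parsed
            counts[tuple(mapping)] += 1
    return sum(counts.values()), counts
```

Passing an open log file to `summarize` yields the total number of pinning messages and how many times each distinct vCPU-to-pCPU mapping was computed, which may help show whether the scheduler keeps recomputing the same result.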
İzzettin Erdem