[openstack-dev] realtime kvm cpu affinities
chris.friesen at windriver.com
Wed Jun 21 17:40:14 UTC 2017
On 06/21/2017 10:46 AM, Henning Schild wrote:
> On Wed, 21 Jun 2017 10:04:52 -0600
> Chris Friesen <chris.friesen at windriver.com> wrote:
> I guess you are talking about that section from :
>>>> We could use a host level tunable to just reserve a set of host
>>>> pCPUs for running emulator threads globally, instead of trying to
>>>> account for it per instance. This would work in the simple case,
>>>> but when NUMA is used, it is highly desirable to have more fine
>>>> grained config to control emulator thread placement. When real-time
>>>> or dedicated CPUs are used, it will be critical to separate
>>>> emulator threads for different KVM instances.
Yes, that's the relevant section.
> I know it has been considered, but I would like to bring the topic up
> again, because doing it that way allows for many more rt-VMs on a host
> and I am not sure I fully understood why the idea was discarded in the
> I do not really see the influence of NUMA here. Say the
> emulator_pin_set is used only for realtime VMs, we know that the
> emulators and IOs can be "slow" so crossing numa-nodes should not be an
> issue. Or you could say the set needs to contain at least one core per
> numa-node and schedule emulators next to their vcpus.
> As we know from our setup, and as Luiz confirmed - it is _not_ "critical
> to separate emulator threads for different KVM instances".
> They have to be separated from the vcpu-cores but not from each other.
> At least not on a "cpuset" basis; maybe via "blkio" and cgroups like that.
I'm reluctant to say conclusively that we don't need to separate emulator
threads since I don't think we've considered all the cases. For example, what
happens if one or more of the instances are being live-migrated? The migration
thread for those instances will be very busy scanning for dirty pages, which
could delay the emulator threads for other instances and also cause significant
cross-NUMA traffic unless we ensure at least one core per NUMA-node.
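To make the "at least one core per NUMA-node" idea concrete, here is a minimal sketch of how a shared emulator set could be narrowed to the cores local to an instance. The function name, the `cpu_to_node` mapping and the 8-pCPU host layout are hypothetical, chosen only for illustration; this is not how Nova or libvirt implement emulator pinning.

```python
def emulator_cpus_for_instance(emulator_pin_set, cpu_to_node, vcpu_pins):
    """Pick the shared emulator pCPUs on the same NUMA node(s) as the vCPUs.

    emulator_pin_set: host pCPUs reserved for emulator threads (hypothetical tunable)
    cpu_to_node:      dict mapping pCPU id -> NUMA node id
    vcpu_pins:        pCPUs the instance's vCPUs are pinned to
    """
    # Emulator threads must never share a core with a realtime vCPU.
    overlap = set(emulator_pin_set) & set(vcpu_pins)
    if overlap:
        raise ValueError("pCPUs used for both vCPUs and emulators: %s" % sorted(overlap))

    instance_nodes = {cpu_to_node[cpu] for cpu in vcpu_pins}
    local = {cpu for cpu in emulator_pin_set if cpu_to_node[cpu] in instance_nodes}

    # Fall back to the whole set if no emulator core is local to the instance,
    # accepting cross-NUMA traffic in that case.
    return sorted(local) if local else sorted(emulator_pin_set)


# Hypothetical host: pCPUs 0-3 on node 0, pCPUs 4-7 on node 1,
# with one emulator core reserved per node.
cpu_to_node = {cpu: (0 if cpu < 4 else 1) for cpu in range(8)}
emulator_pin_set = {0, 4}

print(emulator_cpus_for_instance(emulator_pin_set, cpu_to_node, [5, 6]))
```

An instance whose vCPUs span both nodes would get both emulator cores back. Nothing here addresses the live-migration load concern; it only covers placement.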
Also, I don't think we've determined how much CPU time is needed for the
emulator threads. If we have ~60 CPUs available for instances split across two
NUMA nodes, can we safely run the emulator threads of 30 instances all together
on a single CPU? If not, how much "emulator overcommit" is allowable?
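The overcommit question can be framed as simple arithmetic once a per-instance figure is measured. The sketch below assumes each instance's emulator threads need a fixed percentage of one pCPU; that percentage (2% here) and the 50% budget are made-up placeholders, not measurements, and bursts such as live migration are not modeled.

```python
def emulator_load_pct(instances, emulator_cpus, per_instance_pct):
    """Average load, in percent, on each shared emulator pCPU."""
    return instances * per_instance_pct / emulator_cpus


def max_instances(emulator_cpus, per_instance_pct, budget_pct):
    """Largest instance count that keeps each emulator pCPU within budget_pct."""
    return (budget_pct * emulator_cpus) // per_instance_pct


# Placeholder assumption: each instance's emulator threads use 2% of a pCPU.
PER_INSTANCE_PCT = 2

# 30 instances sharing a single emulator pCPU.
print(emulator_load_pct(30, 1, PER_INSTANCE_PCT))

# How many instances fit if each emulator pCPU may be at most 50% busy?
print(max_instances(1, PER_INSTANCE_PCT, 50))
print(max_instances(2, PER_INSTANCE_PCT, 50))
```

With real numbers in place of the placeholders, the same calculation would give an upper bound on "emulator overcommit" per reserved core.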