[openstack-dev] realtime kvm cpu affinities

Henning Schild henning.schild at siemens.com
Wed Jun 21 16:46:07 UTC 2017

On Wed, 21 Jun 2017 10:04:52 -0600,
Chris Friesen <chris.friesen at windriver.com> wrote:

> On 06/21/2017 09:45 AM, Chris Friesen wrote:
> > On 06/21/2017 02:42 AM, Henning Schild wrote:  
> >> On Tue, 20 Jun 2017 10:41:44 -0600,
> >> Chris Friesen <chris.friesen at windriver.com> wrote:  
> >  
> >>>> Our goal is to reach a high packing density of realtime VMs. Our
> >>>> pragmatic first choice was to run all non-vcpu-threads on a
> >>>> shared set of pcpus where we also run best-effort VMs and host
> >>>> load. Now the OpenStack guys are not too happy with that because
> >>>> that is load outside the assigned resources, which leads to
> >>>> quota and accounting problems.  
> >>>
> >>> If you wanted to go this route, you could just edit the
> >>> "vcpu_pin_set" entry in nova.conf on the compute nodes so that
> >>> nova doesn't actually know about all of the host vCPUs.  Then you
> >>> could run host load and emulator threads on the pCPUs that nova
> >>> doesn't know about, and there will be no quota/accounting issues
> >>> in nova.  
> >>
> >> Exactly that is the idea but OpenStack currently does not allow
> >> that. No thread will ever end up on a core outside the
> >> vcpu_pin_set and emulator/io-threads are controlled by
> >> OpenStack/libvirt.  
> >
> > Ah, right.  This will isolate the host load from the guest load,
> > but it will leave the guest emulator work running on the same pCPUs
> > as one or more vCPU threads.
> >
> > Your emulator_pin_set idea is interesting...it might be worth
> > proposing in nova.  
> Actually, based on [1] it appears they considered it and decided that
> it didn't provide enough isolation between realtime VMs.

Hey Chris,

I guess you are talking about this section from [1]:

>>> We could use a host level tunable to just reserve a set of host
>>> pCPUs for running emulator threads globally, instead of trying to
>>> account for it per instance. This would work in the simple case,
>>> but when NUMA is used, it is highly desirable to have more fine
>>> grained config to control emulator thread placement. When real-time
>>> or dedicated CPUs are used, it will be critical to separate
>>> emulator threads for different KVM instances.

I know it has been considered, but I would like to bring the topic up
again, because doing it that way allows for many more rt-VMs on a host,
and I am not sure I fully understood why the idea was discarded.

I do not really see the influence of NUMA here. Say the
emulator_pin_set is used only for realtime VMs: we know that the
emulator and I/O threads can be "slow", so crossing NUMA nodes should
not be an issue. Alternatively, the set could be required to contain at
least one core per NUMA node, with emulators scheduled next to their
vCPUs.
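To make the second option concrete, libvirt can already express this
per domain today. A rough sketch of what I mean (the cpuset values are
made up for illustration; <vcpupin>, <emulatorpin> and <iothreadpin>
are standard libvirt <cputune> elements):

```xml
<!-- hypothetical rt guest: 2 vCPUs pinned to dedicated cores on
     NUMA node 0, emulator and iothreads on a shared "slow" core
     of the same node -->
<cputune>
  <vcpupin vcpu='0' cpuset='2'/>
  <vcpupin vcpu='1' cpuset='3'/>
  <!-- emulator threads land on the shared core, not the rt cores -->
  <emulatorpin cpuset='1'/>
  <!-- likewise for iothreads, if the guest has any -->
  <iothreadpin iothread='1' cpuset='1'/>
</cputune>
```

So NUMA-local placement of the emulator threads is a per-instance
scheduling decision, not something that rules out a host-level set.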

As we know from our setup, and as Luiz confirmed, it is _not_ "critical
to separate emulator threads for different KVM instances".
They have to be separated from the vcpu cores, but not from each other.
At least not on a "cpuset" basis; maybe with "blkio" and similar cgroup
controllers.
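On a plain cgroup level, sharing one cpuset between the emulator
threads of all rt guests would look roughly like this (cgroup-v1
paths, made-up core numbers, needs root on a live host; libvirt does
the per-domain equivalent via <emulatorpin>):

```shell
# hypothetical shared "emulators" cpuset for all rt guests
mkdir /sys/fs/cgroup/cpuset/emulators
echo 0-1 > /sys/fs/cgroup/cpuset/emulators/cpuset.cpus
echo 0   > /sys/fs/cgroup/cpuset/emulators/cpuset.mems

# move each guest's emulator (main) thread into the shared set;
# writing a PID to "tasks" moves only that thread, so the vcpu
# threads keep their dedicated-core pinning
for pid in $(pidof qemu-system-x86_64); do
    echo "$pid" > /sys/fs/cgroup/cpuset/emulators/tasks
done
```

If I/O interference between guests ever became a problem, that is
where a blkio controller on top would come in, not the cpuset.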


> Chris
> [1] 
> https://specs.openstack.org/openstack/nova-specs/specs/ocata/approved/libvirt-emulator-threads-policy.html
