[openstack-dev] realtime kvm cpu affinities

Sahid Orentino Ferdjaoui sferdjao at redhat.com
Sun Jun 25 08:09:10 UTC 2017


On Fri, Jun 23, 2017 at 10:34:26AM -0600, Chris Friesen wrote:
> On 06/23/2017 09:35 AM, Henning Schild wrote:
> > Am Fri, 23 Jun 2017 11:11:10 +0200
> > schrieb Sahid Orentino Ferdjaoui <sferdjao at redhat.com>:
> 
> > > In the Linux RT context, and as you mentioned, a non-RT vCPU can
> > > acquire a guest kernel lock and then be pre-empted by the emulator
> > > thread while holding that lock. This blocks the RT vCPUs from doing
> > > their work. That is why we implemented [2]. For DPDK I don't think
> > > we have such problems because it runs in userland.
> > > 
> > > So for the DPDK context I think we could have a mask like we have
> > > for RT, basically considering vCPU0 to handle best-effort work
> > > (emulator threads, SSH...). I think that is the pattern currently
> > > used by DPDK users.
> > 
> > DPDK is just a library, and one can imagine an application with
> > cross-core communication/synchronisation needs where the emulator
> > slowing down vCPU0 will also slow down vCPU1. Your DPDK application
> > would have to know which of its cores did not get a full pCPU.
> > 
> > I am not sure what the DPDK example is doing in this discussion;
> > would that not just be cpu_policy=dedicated? I guess the normal
> > behaviour of dedicated is that emulators and I/O happily share pCPUs
> > with vCPUs, and you are looking for a way to restrict emulators/I/O
> > to a subset of pCPUs because you can live with some of them not
> > being 100% dedicated.
> 
> Yes.  A typical DPDK-using VM might look something like this:
> 
> vCPU0: non-realtime, housekeeping and I/O, handles all virtual interrupts
> and "normal" Linux stuff, emulator runs on same pCPU
> vCPU1: realtime, runs in tight loop in userspace processing packets
> vCPU2: realtime, runs in tight loop in userspace processing packets
> vCPU3: realtime, runs in tight loop in userspace processing packets
> 
> In this context, vCPUs 1-3 don't really ever enter the kernel, and we've
> offloaded as much kernel work as possible from them onto vCPU0.  This works
> pretty well with the current system.
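> 
> As a concrete sketch of that layout using the extra specs that exist
> today (the flavor name is made up), one could do:
> 
>   openstack flavor set dpdk-flavor \
>     --property hw:cpu_policy=dedicated \
>     --property hw:cpu_realtime=yes \
>     --property hw:cpu_realtime_mask=^0
> 
> Here vCPU0 is excluded from the realtime set and keeps the
> housekeeping/I/O role, while vCPUs 1-3 stay realtime.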
> 
> > > For RT we have to isolate the emulator threads to an additional
> > > pCPU per guest or, as you are suggesting, to a set of pCPUs shared
> > > by all the running guests.
> > > 
> > > I think we should introduce a new option:
> > > 
> > >    - hw:cpu_emulator_threads_mask=^1
> > > 
> > > If set in 'nova.conf', that mask will be applied to the set of all
> > > host CPUs Nova can use (vcpu_pin_set), to basically pack the
> > > emulator threads of all the VMs running there (useful for the RT
> > > context).
> > 
> > That would allow modelling exactly what we need.
> > In nova.conf we are talking about absolute, known values; there is
> > no need for a mask, and a set is much easier to read. Using the same
> > name for both does not sound like a good idea either. The name
> > vcpu_pin_set clearly suggests what kind of load runs there; if a
> > mask is applied to it, the option should rather be called pin_set.
> 
> I agree with Henning.
> 
> In nova.conf we should just use a set, something like
> "rt_emulator_vcpu_pin_set" which would be used for running the emulator/io
> threads of *only* realtime instances.
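> 
> A minimal nova.conf sketch of that idea (the option name is just the
> proposal above, not something that exists today):
> 
>   [DEFAULT]
>   vcpu_pin_set = 4-15
>   # hypothetical: pCPUs running the emulator/io threads of all
>   # realtime instances on this host
>   rt_emulator_vcpu_pin_set = 2-3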

I don't agree with you. We have a set of pCPUs and we want to
subtract some of them for the emulator threads, so what we need is a
mask. The only set we need is the one selecting which pCPUs Nova can
use (vcpu_pin_set).
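
To illustrate with a sketch (the option name and semantics are only
this proposal, nothing that exists in Nova today):

  [DEFAULT]
  vcpu_pin_set = 2-15
  # hypothetical: subtract pCPU2 from the set and pack the emulator
  # threads of all the VMs running on this host onto it
  cpu_emulator_threads_mask = ^2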

> We may also want to have "rt_emulator_overcommit_ratio" to control how many
> threads/instances we allow per pCPU.

I'm not sure I understand this point. If it is to indicate that we
want X guest emulator threads per isolated pCPU, the same behavior is
achieved by the mask. A host for realtime is dedicated to realtime,
with no overcommitment, and since the operators know the number of
host CPUs they can easily deduce a ratio and thus the corresponding
mask.
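
For example, with a vcpu_pin_set of 16 pCPUs and a mask subtracting 2
of them for emulator threads, an operator placing 8 RT guests on the
host knows that the emulator threads of 4 guests will share each of
those 2 pCPUs.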

> > > If set in the flavor extra specs, it will be applied to the vCPUs
> > > dedicated to the guest (useful for the DPDK context).
> > 
> > And if both are present the flavor wins and nova.conf is ignored?
> 
> In the flavor I'd like to see it be a full bitmask, not an exclusion mask
> with an implicit full set.  Thus the end-user could specify
> "hw:cpu_emulator_threads_mask=0" and get the emulator threads to run
> alongside vCPU0.
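> 
> In other words (hypothetical syntax, per the proposal above):
> 
>   openstack flavor set dpdk-flavor \
>     --property hw:cpu_emulator_threads_mask=0
> 
> i.e. only bit 0 is set, so the emulator threads run on the pCPU of
> vCPU0 and the other vCPUs keep their pCPUs fully dedicated.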

Same here, I don't agree: the only set is the vCPUs of the guest, and
we want a mask to subtract some of them for the emulator threads.
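
For example, under this proposal a 4-vCPU DPDK guest with
hw:cpu_emulator_threads_mask=^0 would have vCPU0 subtracted from the
dedicated set, with its pCPU hosting the emulator threads, while
vCPUs 1-3 remain fully dedicated.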

> Henning, there is no conflict, the nova.conf setting and the flavor setting
> are used for two different things.
> 
> Chris
> 


