[openstack-dev] realtime kvm cpu affinities

Henning Schild henning.schild at siemens.com
Thu Jul 6 15:16:03 UTC 2017


Stephen,

Thanks for summing it all up! I am guessing that a blueprint, or updates
to an existing blueprint, will be next. We currently have a patch that
introduces a second pin_set to nova.conf and solves problems 1 and 2 in
Ocata. It might overlook a couple of cases that we do not care about or
have not come across yet.
Alongside the text, that patch could serve as a basis for discussing
what will eventually be implemented.

I am happy because the two problems were acknowledged, the placement
strategy of the threads was discussed/reviewed with some input from kvm,
and we already talked about possible solutions.
So things are moving ;)

regards,
Henning

On Thu, 29 Jun 2017 17:59:41 +0100,
<sfinucan at redhat.com> wrote:

> On Tue, 2017-06-20 at 09:48 +0200, Henning Schild wrote:
> > Hi,
> > 
> > We are using OpenStack for managing realtime guests. We modified
> > it and contributed to discussions on how to model the realtime
> > feature. More recent versions of OpenStack have support for
> > realtime, and there are a few proposals on how to improve that
> > further.
> > 
> > ...  
> 
> I'd put off working my way through this thread until I'd time to sit
> down and read it in full. Here's what I'm seeing by way of summaries
> _so far_.
> 
> # Current situation
> 
> I think this tree (sans 'hw' prefixes for brevity) represents the
> current situation around flavor extra specs and image meta. Pretty
> much everything hangs off cpu_policy=dedicated. Correct me if I'm
> wrong.
> 
>   cpu_policy
>   ╞═> shared
>   ╘═> dedicated
>       ├─> cpu_thread_policy  
>       │   ╞═> prefer
>       │   ╞═> isolate
>       │   ╘═> require
>       ├─> emulator_threads_policy (*)
>       │   ╞═> share  
>       │   ╘═> isolate
>       └─> cpu_realtime
>           ╞═> no
>           ╘═> yes
>               └─> cpu_realtime_mask
>                   ╘═> (a mask of guest cores)  
> 
> (*) this one isn't configurable via images. I never really got why
> but meh.
> 
> There are also some host-level configuration options:
> 
>   vcpu_pin_set
>   ╘═> (a list of host cores that nova can use)
> 
> Finally, there's some configuration you can do with your choice of
> kernel and kernel options (e.g. 'isolcpus').
> 
> For real time workloads, the expectation would be that you would set:
> 
>   cpu_policy
>   ╘═> dedicated
>       ├─> cpu_thread_policy
>       │   ╘═> isolate
>       ├─> emulator_threads_policy
>       │   ╘═> isolate
>       └─> cpu_realtime
>           ╘═> yes
>               └─> cpu_realtime_mask
>                   ╘═> (a mask of guest cores)  
> 
> That would result in an instance that uses N+1 host cores, where N
> is the number of instance cores and the extra core runs the emulator
> threads. Of the N cores, the set masked by 'cpu_realtime_mask' will
> be non-realtime. The remainder will be realtime.
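
The N+1 arithmetic above can be sketched in Python. This is an illustration only: `realtime_layout` and its arguments are hypothetical names, not nova code, and the masked cores are passed as a plain set rather than parsed from a 'cpu_realtime_mask' string.

```python
def realtime_layout(guest_cores, masked_cores, isolate_emulator=True):
    """Return (host_cores_used, realtime, non_realtime) for one guest.

    guest_cores: number of instance cores (N in the text).
    masked_cores: guest core indexes masked by cpu_realtime_mask; these
        stay non-realtime.
    isolate_emulator: True models emulator_threads_policy=isolate, which
        costs one extra host core for the emulator threads.
    """
    all_cores = set(range(guest_cores))
    non_realtime = set(masked_cores)
    if not non_realtime <= all_cores:
        raise ValueError("mask refers to cores the guest does not have")
    realtime = all_cores - non_realtime
    host_cores_used = guest_cores + (1 if isolate_emulator else 0)
    return host_cores_used, sorted(realtime), sorted(non_realtime)


used, rt, non_rt = realtime_layout(guest_cores=4, masked_cores={0})
print(used, rt, non_rt)  # 5 [1, 2, 3] [0]
```

A 4-core guest with core 0 masked therefore occupies 5 host cores, 3 of them running realtime vCPUs.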
> 
> # The Problem(s)
> 
> I'm going to thread this to capture the arguments and counter
> arguments:
> 
> ## Problem 1
> 
> henning.schild suggested that the current implementation of
> 'emulator_threads_policy' is too resource intensive, as the one extra
> core generally carries a minimal workload for an entire guest. This
> can significantly limit the number of guests that can be booted per
> host, particularly for guests with smaller numbers of cores. Instead,
> he has implemented an 'emulator_pin_set' host-level option, which
> complements 'vcpu_pin_set'. This allows us to "pool" emulator
> threads, similar to how vCPU threads behave with 'cpu_policy=shared'.
> He suggests this be adopted by nova.
> 
>   sahid seconded this, but suggests 'emulator_pin_set' be renamed
>   'cpu_emulator_threads_mask' and work as a mask of 'vcpu_pin_set'.
>   He also suggested making a similarly-named flavor property, that
>   would allow the user to use one of their cores for non-realtime
>   work.
> 
>     henning.schild suggested a set would still be better, but that
>     'vcpu_pin_set' be renamed to 'pin_set', as it would no longer be
>     for only vCPUs.
> 
>       cfriesen seconded henning.schild's position but was not entirely
>       convinced that sharing emulator threads on a single pCPU is
>       guaranteed to be safe, for example if one instance starts
>       seriously hammering on I/O or does live migration or something.
>       He suggested that an additional option,
>       'rt_emulator_overcommit_ratio', be added to make overcommitting
>       explicit. In addition, he suggested making the flavor property a
>       bitmask.
> 
>         sahid questioned the need for an overcommit ratio, given that
>         there is no overcommit of the hosts. An operator could
>         synthesize a suitable value for
>         'emulator_pin_set'/'cpu_emulator_threads_mask'. He also
>         disagreed with the suggestion that the flavor property be a
>         bitmask as the only set is that of the vCPUs.
> 
>           cfriesen clarifies to point out how a few instances with
>           many vCPUs will have more overhead requirements than many
>           instances with few vCPUs. We need to be able to fail
>           scheduling if the emulator thread cores are oversubscribed.
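
A rough Python sketch of the kind of check cfriesen describes, failing scheduling when the pooled emulator cores are oversubscribed. Everything here is an assumption for illustration: the function name, the notion of per-instance "load units", and the way the ratio is applied are not taken from nova or from the patch discussed in this thread.

```python
def emulator_pool_fits(emulator_pin_set, overcommit_ratio,
                       current_loads, new_load):
    """Return True if a new instance fits on the shared emulator cores.

    emulator_pin_set: host cores reserved for emulator threads.
    overcommit_ratio: units of emulator load one host core may carry
        (hypothetical meaning of an emulator overcommit ratio).
    current_loads: load units of the instances already on the host.
    new_load: load units the new instance would add.
    """
    capacity = len(emulator_pin_set) * overcommit_ratio
    return sum(current_loads) + new_load <= capacity


pool = {0, 1}  # two host cores set aside for emulator threads
print(emulator_pool_fits(pool, 4.0, [1, 1, 2], 2))  # True: 6 <= 8
print(emulator_pool_fits(pool, 4.0, [3, 3], 3))     # False: 9 > 8
```

Passing the load per instance in, rather than counting instances, leaves room for the point that instances with many vCPUs may need more emulator overhead than instances with few.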
> 
> ## Problem 2
> 
> henning.schild suggests that hosts should be able to handle both RT
> and non-RT instances. This could be achieved through multiple
> instances of nova.
> 
>   sahid points out that the recommendation is to use host aggregates
>   to separate the two.
> 
>     henning.schild states that hosts with RT kernels can manage
>     non-RT guests just fine. However, if using host aggregates is the
>     recommendation then it should be possible to run multiple nova
>     instances on a host, because dedicating an entire machine is not
>     viable for smaller operations. cfriesen seconds this perspective,
>     though not this solution.
> 
> # Solutions
> 
> Thus far, we've no clear conclusions on directions to go, so I've
> taken a stab below. Henning, Sahid, Chris: does the above/below make
> sense, and is there anything we need to further clarify?
> 
> ## Problem 1
> 
> From the above, there are 3-4 work items:
> 
> - Add an 'emulator_pin_set' or 'cpu_emulator_threads_mask'
>   configuration option
> 
>   - If using a mask, rename 'vcpu_pin_set' to 'pin_set' (or, better,
>     'usable_cpus')
> 
> - Add an 'emulator_overcommit_ratio', which will do for emulator
>   threads what the other ratios do for vCPUs and memory
> 
> - Deprecate 'hw:emulator_threads_policy'???
> 
> ## Problem 2
> 
> No clear conclusions yet?
> 
> ---
> 
> Cheers,
> Stephen
