[nova][dev] vCPU Pinning for L1/L2 cache side-channel vulnerability mitigation

Robert Donovan rob at cleansafecloud.com
Tue Jan 15 11:02:49 UTC 2019

Thanks, these are all very interesting points, particularly the notes on the nature of floating guest cores, as this does indeed go some way towards mitigation. I can certainly see that you could quite easily, without intent, make the situation worse if you stopped supporting floating instances and there was some issue with the pinning algorithm that was used. We want to go a step further in terms of mitigation, ideally without turning off SMT and taking that ~20% performance hit, but accept that our methods are specific to our company’s size, services and infrastructure. We’ll endeavour to share any experiences that may be useful to the wider community if we do proceed with any sort of implementation around this.


> On 10 Jan 2019, at 19:02, Sean Mooney <smooney at redhat.com> wrote:
> On Thu, 2019-01-10 at 17:56 +0000, Stephen Finucane wrote:
>> On Thu, 2019-01-10 at 11:05 -0500, Jay Pipes wrote:
>>> On 01/10/2019 10:49 AM, Robert Donovan wrote:
>>>> Hello Nova folks,
>>>> I spoke to some of you very briefly about this in Berlin (thanks
>>>> again for your time), and we were resigned to turning off SMT to
>>>> fully protect against future CPU cache side-channel attacks as I
>>>> know many others have done. However, we have stubbornly done a bit
>>>> of last-resort research and testing into using vCPU pinning on a
>>>> per-tenant basis as an alternative and I’d like to lay it out in
>>>> more detail for you to make sure there are no legs in the idea
>>>> before abandoning it completely.
>>>> The idea is to use libvirt’s vcpupin ability to ensure that two
>>>> different tenants never share the same physical CPU core, so they
>>>> cannot theoretically steal each other’s data via an L1 or L2 cache
>>>> side-channel. The pinning would be optimised to make use of as many
>>>> logical cores as possible for any given tenant. We would also
>>>> isolate other key system processes to a separate range of physical
>>>> cores. After discussions in Berlin, we ran some tests with live
>>>> migration, as this is key to our maintenance activities and would
>>>> be a show-stopper if it didn’t work. We found that removing any
>>>> pinning restrictions immediately prior to migration resulted in
>>>> them being completely reset on the target host, which could then be
>>>> optimised accordingly post-migration. Unfortunately, there would be
>>>> a small window of time where we couldn’t prevent tenants from
>>>> sharing a physical core on the target host after a migration, but
>>>> we think this is an acceptable risk given the nature of these
>>>> attacks.
>>>> Obviously, this approach may not be appropriate in many
>>>> circumstances, such as if you have many tenants who just run single
>>>> VMs with one vCPU, or if over-allocation is in use. We have also
>>>> only looked at KVM and libvirt. I would love to know what people
>>>> think of this approach however. Are there any other clear issues
>>>> that you can think of which we may not have considered? If it seems
>>>> like a reasonable idea, is it something that could fit into Nova
>>>> and, if so, where in the architecture is the best place for it to
>>>> sit? I know you can currently specify per-instance CPU pinning via
>>>> flavor parameters, so a similar approach could be taken for this
>>>> strategy. Alternatively, we can look at implementing it as an
>>>> external plugin of some kind for use by those with a similar setup.
>>> IMHO, if you're going to go through all the hassle of pinning guest vCPU 
>>> threads to distinct logical host processors, you might as well just use 
>>> dedicated CPU resources for everything. As you mention above, you can't 
>>> have overcommit anyway if you're concerned about this problem. Once you 
>>> have a 1.0 cpu_allocation_ratio, you're essentially limiting your CPU 
>>> resources to a dedicated host CPU -> guest CPU situation so you might as 
>>> well just use CPU pinning and deal with all the headaches that brings 
>>> with it.
>> Indeed. My initial answer to this was "use CPU thread policies"
>> (specifically, the 'require' policy) to ensure each instance owns its
>> entire core, thinking you were using dedicated/pinned CPUs. 
> The isolate policy should address this.
> The require policy would work for an even number of cores and a single NUMA node.
> The require policy does not address this if you have multiple NUMA nodes,
> e.g. 14 cores spread across 2 NUMA nodes with require will have one free
> HT sibling on each NUMA node when pinned, unless we have a check for that which I missed.
>> For shared
>> CPUs, I'm not sure how we could ever do something like you've proposed
>> in a manner that would result in less than the ~20% or so performance
>> degradation I usually see quoted when turning off SMT. Far too much
>> second guessing of the expected performance requirements of the guest
>> would be necessary.
> For shared CPUs the assumption is that, as the guest cores are floating,
> your victim and payload VMs would not remain running on the same core/hyperthread
> for a protracted period of time. If both are actively using CPU cycles then the
> kernel scheduler will schedule them to different threads/cores to allow them to
> execute without contention. Note that I'm not saying there is no risk, but
> tenant-aware scheduling for shared CPUs effectively means we would have to stop supporting
> floating instances entirely and only allow oversubscription to happen between VMs from
> the same tenant, which first is unlikely to ever happen in a cloud environment, as
> tenant VMs typically are not colocated on a single host, and second is not desirable in all
> environments.
>> Stephen
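
For anyone following the thread, the per-tenant pinning idea from the original message (no two tenants on the same physical core, host processes kept on their own cores) can be sketched roughly as below. This is only an illustrative sketch, not Nova code and not a libvirt API: the function name plan_tenant_pinning, the topology dictionary and the reserved_cores argument are all hypothetical, and it assumes no CPU over-allocation and a single NUMA node.

```python
def plan_tenant_pinning(core_siblings, tenant_vcpus, reserved_cores=()):
    """Give each tenant whole physical cores so siblings are never shared.

    core_siblings:  {physical core id: [logical CPU ids on that core]}
    tenant_vcpus:   {tenant name: total vCPUs the tenant runs on this host}
    reserved_cores: physical cores kept back for host/system processes

    Returns {tenant name: sorted logical CPUs its vCPUs may be pinned to}.
    Raises ValueError when the host has too few free cores (no overcommit).
    All names here are hypothetical and exist only for illustration.
    """
    reserved = set(reserved_cores)
    free = [core for core in sorted(core_siblings) if core not in reserved]

    plan = {}
    # Place the largest tenants first; ties are broken by tenant name.
    ordered = sorted(tenant_vcpus.items(), key=lambda kv: (-kv[1], kv[0]))
    for tenant, vcpus in ordered:
        threads = []
        while len(threads) < vcpus:
            if not free:
                raise ValueError("not enough free physical cores for " + tenant)
            # Take the whole core, i.e. every sibling thread on it.
            threads.extend(core_siblings[free.pop(0)])
        plan[tenant] = sorted(threads)
    return plan


if __name__ == "__main__":
    # Hypothetical host: 4 physical cores with 2 hyperthreads each.
    topology = {0: [0, 4], 1: [1, 5], 2: [2, 6], 3: [3, 7]}
    print(plan_tenant_pinning(topology, {"a": 3, "b": 1}, reserved_cores=[0]))
```

Each tenant's resulting CPU list could then be applied to its guests' vCPUs with libvirt's vcpupin. Note that a tenant with an odd vCPU count leaves a sibling thread idle in this sketch, which is the inefficiency mentioned above for tenants running single-vCPU VMs.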
