[Openstack-operators] Migrating an instance to a host with less cores fails
chris.friesen at windriver.com
Fri Sep 25 18:41:10 UTC 2015
This is a long-standing issue. Nikola has been working on it in Liberty for the
CPU pinning case, not sure about the non-pinned case. And of course patching
back to Kilo hasn't been done yet.
Aubrey, what you're seeing is definitely a bug. There is an existing bug
https://bugs.launchpad.net/nova/+bug/1417667 but that is specifically for
dedicated CPUs, which doesn't apply in your case. Please feel free to open a new
bug.
On 09/25/2015 12:16 PM, Kris G. Lindgren wrote:
> I believe TWC (medberry on IRC) was lamenting to me about cpusets, differing hypervisor hardware configs, and unassigned vCPUs in NUMA nodes.
> The problem is that the migration does not re-define the domain.xml, specifically the vCPU mapping, to match what makes sense on the new host. I believe the issue is more pronounced when you go from a compute node with more cores to one with fewer cores. I believe the opposite migration works, but the vCPU/NUMA node assignments are all wrong.
> CC'ing him as well.
> Kris Lindgren
> Senior Linux Systems Engineer
> On 9/25/15, 11:53 AM, "Steve Gordon" <sgordon at redhat.com> wrote:
>> Adding Nikola as he has been working on this.
>> ----- Original Message -----
>>> From: "Aubrey Wells" <awells at digiumcloud.com>
>>> To: openstack-operators at lists.openstack.org
>>> Trying to decide if this is a bug or just a config option that I can't
>>> find. The setup I'm currently testing in my lab with is two compute nodes
>>> running Kilo, one has 40 cores (2x 10c with HT) and one has 16 cores (2x 4c
>>> + HT). I don't have any CPU pinning enabled in my nova config, which seems
>>> to have the effect of setting in libvirt.xml a vcpu cpuset element like (if
>>> created on the 40c node):
>>> And then if I migrate that instance to the 16c node, it will bomb out with
>>> an exception:
>>> Live Migration failure: Invalid value
>>> '0,2,4,6,8,10,12,14,16,18,20,22,24,26,28,30,32,34,36,38' for 'cpuset.cpus':
>>> Invalid argument
>>> Which makes sense, since that node doesn't have any CPUs numbered above 15 (0-15).
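
To illustrate why the destination rejects that value, here is a minimal sketch that parses a cpuset string and lists the CPU ids that do not exist on a host with a given CPU count. The helper names parse_cpuset and invalid_cpus are hypothetical (they are not nova or libvirt functions), the host is assumed to number its CPUs contiguously from 0, and the "^" exclusion syntax that libvirt accepts is not handled.

```python
def parse_cpuset(spec):
    """Parse a cpuset string such as '0-3,8,10-11' into a set of ints.

    Hypothetical helper for illustration; does not handle '^' exclusions.
    """
    cpus = set()
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue
        if "-" in part:
            low, high = part.split("-", 1)
            cpus.update(range(int(low), int(high) + 1))
        else:
            cpus.add(int(part))
    return cpus


def invalid_cpus(spec, host_cpu_count):
    """Return the sorted CPU ids in spec that a host with
    host_cpu_count CPUs (assumed numbered 0..N-1) does not have."""
    return sorted(c for c in parse_cpuset(spec) if c >= host_cpu_count)


if __name__ == "__main__":
    source_cpuset = "0,2,4,6,8,10,12,14,16,18,20,22,24,26,28,30,32,34,36,38"
    # On the 16-CPU node every id from 16 upward is out of range.
    print(invalid_cpus(source_cpuset, 16))
```

Any non-empty result means the cpuset written on the source host cannot be applied as-is on the destination.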
>>> I can fix the symptom by commenting out a line in
>>> nova/virt/libvirt/config.py (circa line 1831) so it always has an empty
>>> cpuset and thus doesn't write that line to libvirt.xml:
>>> # vcpu.set("cpuset", hardware.format_cpu_spec(self.cpuset))
>>> And the instance will happily migrate to the host with fewer CPUs, but this
>>> loses some of the benefit of OpenStack trying to evenly spread out the core
>>> usage on the host, at least that's what I think the purpose of that is.
>>> I'd rather fix it the right way if there's a config option I don't see or
>>> file a bug if it's a bug.
>>> What I think should be happening is that when it creates the libvirt
>>> definition on the destination compute node, it writes out the correct cpuset
>>> per the specs of the hardware it's going onto.
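
As a rough sketch of that idea (not what nova actually does), the snippet below rewrites the cpuset attribute of the <vcpu> element in a libvirt domain XML string so that it only references CPUs present on the destination. The function name rewrite_vcpu_cpuset and the policy (intersect with the destination's CPUs, and drop the attribute if nothing is left) are assumptions made for illustration; a real fix would recompute placement from the destination's topology.

```python
import xml.etree.ElementTree as ET


def _parse_cpuset(spec):
    """Parse '0-3,8' style strings into a set of ints ('^' not handled)."""
    cpus = set()
    for part in (p.strip() for p in spec.split(",")):
        if not part:
            continue
        if "-" in part:
            low, high = part.split("-", 1)
            cpus.update(range(int(low), int(high) + 1))
        else:
            cpus.add(int(part))
    return cpus


def rewrite_vcpu_cpuset(domain_xml, dest_cpus):
    """Return domain XML whose <vcpu cpuset=...> only names CPUs in dest_cpus.

    Hypothetical policy: keep the intersection; if it is empty, remove the
    attribute so the guest is not restricted to any particular host CPUs.
    """
    root = ET.fromstring(domain_xml)
    vcpu = root.find("vcpu")
    if vcpu is None or "cpuset" not in vcpu.attrib:
        return domain_xml  # nothing to rewrite
    usable = sorted(_parse_cpuset(vcpu.attrib["cpuset"]) & set(dest_cpus))
    if usable:
        vcpu.set("cpuset", ",".join(str(c) for c in usable))
    else:
        del vcpu.attrib["cpuset"]
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    xml_in = '<domain><vcpu cpuset="0,2,16,18">2</vcpu></domain>'
    # Destination is assumed to have CPUs 0..15.
    print(rewrite_vcpu_cpuset(xml_in, range(16)))
```

Simple intersection can leave the guest with fewer host CPUs to float over than intended, which is why this is only a stand-in for a proper recalculation on the destination.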
>>> If it matters, in my nova-compute.conf file, I also have cpu mode and model
>>> defined to allow me to migrate between the two different architectures to
>>> begin with (the 40c is Sandy Bridge and the 16c is Westmere, so I set it to
>>> the lowest common denominator of Westmere):
>>> Any help is appreciated.
>>> OpenStack-operators mailing list
>>> OpenStack-operators at lists.openstack.org
>> Steve Gordon, RHCE
>> Sr. Technical Product Manager,
>> Red Hat Enterprise Linux OpenStack Platform