[Openstack-operators] Migrating an instance to a host with less cores fails

Kris G. Lindgren klindgren at godaddy.com
Fri Sep 25 18:16:56 UTC 2015


I believe TWC (medberry on IRC) was lamenting to me about cpusets, differing hypervisor hardware configs, and unassigned vCPUs in NUMA nodes.

The problem is that the migration does not re-define the domain.xml, specifically the vCPU mapping, to match what makes sense on the new host. I believe the issue is more pronounced when you go from a compute node with more cores to one with fewer cores. I believe the opposite migration works, but the vCPU/NUMA node assignments end up all wrong.
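
To make the idea concrete, here is a rough sketch of what "re-defining the vCPU mapping for the new host" could look like. This is not what Nova does, and the helper names (parse_cpuset, fit_vcpu_cpuset) are made up for illustration. It assumes a plain comma/range cpuset string (no "^" exclusions), assumes the destination's CPUs are numbered 0..N-1, and simply drops any CPU IDs the destination doesn't have; a real fix would more likely recompute the set from the destination's NUMA topology.

```python
import xml.etree.ElementTree as ET


def parse_cpuset(spec):
    """Parse a cpuset string such as "0-3,8,10" into a set of ints.

    Assumption: only comma-separated IDs and "a-b" ranges are handled.
    """
    cpus = set()
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue
        if "-" in part:
            lo, hi = part.split("-", 1)
            cpus.update(range(int(lo), int(hi) + 1))
        else:
            cpus.add(int(part))
    return cpus


def fit_vcpu_cpuset(domain_xml, host_cpu_count):
    """Return domain XML whose <vcpu cpuset=...> names only CPUs on the host.

    Hypothetical helper: CPUs numbered >= host_cpu_count are removed.
    If nothing valid remains, the cpuset attribute is dropped entirely,
    which leaves the guest unrestricted (same effect as the workaround
    of never writing the attribute).
    """
    root = ET.fromstring(domain_xml)
    vcpu = root.find("vcpu")
    if vcpu is None or "cpuset" not in vcpu.attrib:
        return domain_xml

    wanted = parse_cpuset(vcpu.get("cpuset"))
    valid = sorted(c for c in wanted if c < host_cpu_count)
    if valid:
        vcpu.set("cpuset", ",".join(str(c) for c in valid))
    else:
        del vcpu.attrib["cpuset"]
    return ET.tostring(root, encoding="unicode")
```

For example, passing a domain whose cpuset is the odd CPUs 1 through 39 together with host_cpu_count=16 would leave only the odd CPUs 1 through 15 in the element, which the 16-core host can accept.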

CC'ing him as well.
___________________________________________________________________
Kris Lindgren
Senior Linux Systems Engineer
GoDaddy

On 9/25/15, 11:53 AM, "Steve Gordon" <sgordon at redhat.com> wrote:

>Adding Nikola as he has been working on this.
>
>----- Original Message -----
>> From: "Aubrey Wells" <awells at digiumcloud.com>
>> To: openstack-operators at lists.openstack.org
>> 
>> Greetings,
>> I'm trying to decide if this is a bug or just a config option that I can't
>> find. The setup I'm currently testing in my lab is two compute nodes
>> running Kilo: one has 40 cores (2x 10c with HT) and one has 16 cores (2x 4c
>> with HT). I don't have any CPU pinning enabled in my nova config, which seems
>> to result in libvirt.xml getting a vcpu cpuset element like this (if
>> created on the 40c node):
>> 
>> <vcpu
>> cpuset="1,3,5,7,9,11,13,15,17,19,21,23,25,27,29,31,33,35,37,39">1</vcpu>
>> 
>> And then if I migrate that instance to the 16c node, it will bomb out with
>> an exception:
>> 
>> Live Migration failure: Invalid value
>> '0,2,4,6,8,10,12,14,16,18,20,22,24,26,28,30,32,34,36,38' for 'cpuset.cpus':
>> Invalid argument
>> 
>> Which makes sense, since that node doesn't have any CPUs past 15 (it only has 0-15).
>> 
>> I can fix the symptom by commenting out a line in
>> nova/virt/libvirt/config.py (circa line 1831) so it always has an empty
>> cpuset and thus doesn't write that line to libvirt.xml:
>> # vcpu.set("cpuset", hardware.format_cpu_spec(self.cpuset))
>> 
>> And the instance will happily migrate to the host with fewer CPUs, but this
>> loses some of the benefit of OpenStack trying to evenly spread out the core
>> usage on the host; at least that's what I think the purpose of that is.
>> 
>> I'd rather fix it the right way if there's a config option I don't see, or
>> file a bug if it's a bug.
>> 
>> What I think should be happening is that when it creates the libvirt
>> definition on the destination compute node, it writes out the correct cpuset
>> per the specs of the hardware it's going onto.
>> 
>> If it matters, in my nova-compute.conf file, I also have cpu mode and model
>> defined to allow me to migrate between the two different architectures to
>> begin with (the 40c is Sandybridge and the 16c is Westmere so I set it to
>> the lowest common denominator of Westmere):
>> 
>> cpu_mode=custom
>> cpu_model=Westmere
>> 
>> Any help is appreciated.
>> 
>> ---------------------
>> Aubrey
>> 
>> _______________________________________________
>> OpenStack-operators mailing list
>> OpenStack-operators at lists.openstack.org
>> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators
>> 
>
>-- 
>Steve Gordon, RHCE
>Sr. Technical Product Manager,
>Red Hat Enterprise Linux OpenStack Platform
>

