[nova] NUMA live migration - mostly how it's tested

Sean Mooney smooney at redhat.com
Sun Mar 3 13:18:22 UTC 2019


On Sat, 2019-03-02 at 04:38 +0000, Zhong, Luyao wrote:
> 
> 
> > On 1 March 2019, at 9:30 PM, Sean Mooney <smooney at redhat.com> wrote:
> > 
> > > On Fri, 2019-03-01 at 17:30 +0800, Luyao Zhong wrote:
> > > Hi all,
> > > 
> > > There was something wrong with the live migration when using the 'dedicated' 
> > > cpu_policy in my test. The attached file contains the details.
> > > 
> > > The message body will be so big that it will be held if I attach the 
> > > debug info, so I will send another email.
> > 
> > 
> > Looking at your original email, you stated:
> > 
> > # VM server_on_host1 cpu pinning info
> >  <cputune>
> >    <shares>4096</shares>
> >    <vcpupin vcpu='0' cpuset='43'/>
> >    <vcpupin vcpu='1' cpuset='7'/>
> >    <vcpupin vcpu='2' cpuset='16'/>
> >    <vcpupin vcpu='3' cpuset='52'/>
> >    <emulatorpin cpuset='7,16,43,52'/>
> >  </cputune>
> >  <numatune>
> >    <memory mode='strict' nodeset='0'/>
> >    <memnode cellid='0' mode='strict' nodeset='0'/>
> >  </numatune>
> > 
> > 
> > # VM server_on_host2 cpu pinning info (before migration)
> >  <cputune>
> >    <shares>4096</shares>
> >    <vcpupin vcpu='0' cpuset='43'/>
> >    <vcpupin vcpu='1' cpuset='7'/>
> >    <vcpupin vcpu='2' cpuset='16'/>
> >    <vcpupin vcpu='3' cpuset='52'/>
> >    <emulatorpin cpuset='7,16,43,52'/>
> >  </cputune>
> >  <numatune>
> >    <memory mode='strict' nodeset='0'/>
> >    <memnode cellid='0' mode='strict' nodeset='0'/>
> >  </numatune>
> > 
> > However, looking at the full domain XML, server_on_host1 was:
> > <cputune>
> >    <shares>4096</shares>
> >    <vcpupin vcpu='0' cpuset='43'/>
> >    <vcpupin vcpu='1' cpuset='7'/>
> >    <vcpupin vcpu='2' cpuset='16'/>
> >    <vcpupin vcpu='3' cpuset='52'/>
> >    <emulatorpin cpuset='7,16,43,52'/>
> >  </cputune>
> > 
> > but server_on_host2 was:
> > <cputune>
> >    <shares>4096</shares>
> >    <vcpupin vcpu='0' cpuset='2'/>
> >    <vcpupin vcpu='1' cpuset='38'/>
> >    <vcpupin vcpu='2' cpuset='8'/>
> >    <vcpupin vcpu='3' cpuset='44'/>
> >    <emulatorpin cpuset='2,8,38,44'/>
> >  </cputune>
> > 
> > Assuming the full XML attached for server_on_host2 is correct,
> > this shows that the code is working correctly, as the CPU pinning no longer overlaps.
> > 
> 
> I use virsh dumpxml to get the full xml, and I usually use virsh edit to see the xml, so I didn’t notice this
> difference before. When using virsh edit, I can see the overlaps. I’m not sure why I get different results
> between “virsh edit” and “virsh dumpxml”.
At first I thought that sounded like a libvirt bug, however the descriptions of the two commands differ slightly:

 dumpxml domain [--inactive] [--security-info] [--update-cpu] [--migratable]
           Output the domain information as an XML dump to stdout, this format can be used by the
           create command. Additional options affecting the XML dump may be used. --inactive tells
           virsh to dump domain configuration that will be used on next start of the domain as
           opposed to the current domain configuration. Using --security-info will also include
           security sensitive information in the XML dump. --update-cpu updates domain CPU
           requirements according to host CPU. With --migratable one can request an XML that is
           suitable for migrations, i.e., compatible with older libvirt releases and possibly
           amended with internal run-time options. This option may automatically enable other
           options (--update-cpu, --security-info, ...) as necessary.

  edit domain
           Edit the XML configuration file for a domain, which will affect the next boot of the guest.

           This is equivalent to:

            virsh dumpxml --inactive --security-info domain > domain.xml
            vi domain.xml (or make changes with your other text editor)
            virsh define domain.xml

           except that it does some error checking.

           The editor used can be supplied by the $VISUAL or $EDITOR environment variables, and defaults to "vi".

I think virsh edit is showing you the state the VM will have on its next reboot,
which appears to be the original XML, not the migration XML that was used to move the instance.

Since OpenStack destroys the XML and recreates it from scratch every time, the output of virsh edit can be ignored.
virsh dumpxml will show the current state of the domain. If you did virsh dumpxml --inactive it would likely match
virsh edit. It's possible the modified XML we use when migrating a domain is considered transient, but that is my best
guess as to why they are different.

For evaluating this we should use the values from virsh dumpxml.
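
For example, on the destination host you can compare the two views directly. A quick sketch (the domain name below
is illustrative, use whatever "virsh list" reports for your instance):

    # live definition: the cputune the running guest (and the migration XML) is actually using
    virsh dumpxml instance-00000002 | grep -A 8 '<cputune>'

    # persistent definition: what virsh edit shows, i.e. libvirt's saved config for the next boot
    virsh dumpxml --inactive instance-00000002 | grep -A 8 '<cputune>'

If the overlap only shows up in the second command then it is just the stale persistent definition, not what the
guest is actually running with.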
 

> > 
> > > 
> > > Regards,
> > > Luyao
> > > 
> > > 
> > > > On 2019/2/28 9:28 PM, Sean Mooney wrote:
> > > > > On Wed, 2019-02-27 at 21:33 -0500, Artom Lifshitz wrote:
> > > > > 
> > > > > 
> > > > > > On Wed, Feb 27, 2019, 21:27 Matt Riedemann, <mriedemos at gmail.com> wrote:
> > > > > > > On 2/27/2019 7:25 PM, Artom Lifshitz wrote:
> > > > > > > What I've been using for testing is this: [3]. It's a series of
> > > > > > > patches to whitebox_tempest_plugin, a Tempest plugin used by a bunch
> > > > > > > of us Nova Red Hatters to automate testing that's outside of Tempest's
> > > > > > > scope.
> > > > > > 
> > > > > > And where is that pulling in your nova series of changes and posting
> > > > > > test results (like a 3rd party CI) so anyone can see it? Or do you mean
> > > > > > here are tests, but you need to provide your own environment if you want
> > > > > > to verify the code prior to merging it.
> > > > > 
> > > > > Sorry, wasn't clear. It's the latter. The test code exists, and has run against my devstack environment with
> > > > > my patches checked out, but there's no CI or public posting of test results. Getting CI coverage for these
> > > > > NUMA things (like the old Intel one) is a whole other topic.
> > > > 
> > > > On the CI front, I resolved the nested virt issue on the server I bought to set up a personal CI for NUMA
> > > > testing. That set me back a few weeks in setting up that CI, but I hope to run Artom's whitebox tests, among
> > > > others, in it at some point. Vexxhost also provides nested virt on the gate VMs; I'm going to see if we can
> > > > actually create a non-voting job using the ubuntu-bionic-vexxhost nodeset. If OVH or one of the other providers
> > > > of CI resources re-enables nested virt then we can maybe make that job voting and not need third-party CI
> > > > anymore.
> > > > > > Can we really not even have functional tests with the fake libvirt
> > > > > > driver and fake numa resources to ensure the flow doesn't blow up?
> > > > > 
> > > > > That's something I have to look into. We have live migration functional tests, and we have NUMA functional
> > > > > tests, but I'm not sure how we can combine the two.
> > > > 
> > > > Just as an additional proof point, I am planning to do a bunch of migration and live migration testing in the
> > > > next 2-4 weeks.
> > > > 
> > > > My current backlog, in no particular order, is:
> > > > sriov migration
> > > > numa migration
> > > > vtpm migration
> > > > cross-cell migration
> > > > cross-neutron backend migration (ovs<->linuxbridge)
> > > > cross-firewall migration (iptables<->conntrack) (previously tested and worked at the end of Queens)
> > > > 
> > > > Narrowing in on the NUMA migration, the current set of test cases I plan to manually verify is as follows:
> > > > 
> > > > Note: assume all flavors will have 256MB of RAM and 4 cores unless otherwise stated. (A sketch of the flavor
> > > > commands for two of these test cases is included after the lists.)
> > > > 
> > > > basic tests
> > > > pinned guests (hw:cpu_policy=dedicated)
> > > > pinned-isolated guests (hw:cpu_policy=dedicated hw:cpu_thread_policy=isolate)
> > > > pinned-prefer guests (hw:cpu_policy=dedicated hw:cpu_thread_policy=prefer)
> > > > unpinned-single-numa guest (hw:numa_nodes=1)
> > > > unpinned-dual-numa guest (hw:numa_nodes=2)
> > > > unpinned-dual-numa-unbalanced guest (hw:numa_nodes=2 hw:numa_cpu.0=0 hw:numa_cpu.1=1-3
> > > > hw:numa_mem.0=64 hw:numa_mem.1=192)
> > > > unpinned-hugepage-implicit-numa guest (hw:mem_page_size=large)
> > > > unpinned-hugepage-multi-numa guest (hw:mem_page_size=large hw:numa_nodes=2)
> > > > pinned-hugepage-multi-numa guest (hw:mem_page_size=large hw:numa_nodes=2 hw:cpu_policy=dedicated)
> > > > realtime guest (hw:cpu_policy=dedicated hw:cpu_realtime=yes hw:cpu_realtime_mask=^0-1)
> > > > emulator-thread-isolated guest (hw:cpu_policy=dedicated hw:emulator_threads_policy=isolate)
> > > > 
> > > > advanced tests (require extra nova.conf changes)
> > > > emulator-thread-share guest (hw:cpu_policy=dedicated hw:emulator_threads_policy=share); note: cpu_shared_set
> > > > configured
> > > > unpinned-single-numa-heterogeneous-host guest (hw:numa_nodes=1); note: vcpu_pin_set adjusted so that host 1 only
> > > > has cpus on numa node 1 and host 2 only has cpus on numa node 2
> > > > super-optimised-guest (hw:numa_nodes=2 hw:numa_cpu.0=0 hw:numa_cpu.1=1-3 hw:numa_mem.0=64 hw:numa_mem.1=192
> > > > hw:cpu_realtime=yes hw:cpu_realtime_mask=^0-1 hw:emulator_threads_policy=isolate)
> > > > super-optimised-guest-2 (hw:numa_nodes=2 hw:numa_cpu.0=0 hw:numa_cpu.1=1-3 hw:numa_mem.0=64 hw:numa_mem.1=192
> > > > hw:cpu_realtime=yes hw:cpu_realtime_mask=^0-1 hw:emulator_threads_policy=share)
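> > > > 
> > > > As a reference for the extra specs above, a rough sketch of how two of these flavors could be created with the
> > > > openstack client (the flavor, image and network names are illustrative, not the exact ones from my env):
> > > > 
> > > >     # pinned guest: 256MB of ram and 4 dedicated cores
> > > >     openstack flavor create --ram 256 --vcpus 4 --disk 1 pinned-guest
> > > >     openstack flavor set pinned-guest --property hw:cpu_policy=dedicated
> > > > 
> > > >     # unpinned dual-numa unbalanced guest: 1 vcpu and 64MB on node 0, 3 vcpus and 192MB on node 1
> > > >     openstack flavor create --ram 256 --vcpus 4 --disk 1 unpinned-dual-numa-unbalanced
> > > >     openstack flavor set unpinned-dual-numa-unbalanced \
> > > >       --property hw:numa_nodes=2 \
> > > >       --property hw:numa_cpu.0=0 --property hw:numa_cpu.1=1-3 \
> > > >       --property hw:numa_mem.0=64 --property hw:numa_mem.1=192
> > > > 
> > > >     # boot from the flavor and live migrate, letting the scheduler pick the destination
> > > >     openstack server create --flavor pinned-guest --image cirros-0.4.0 --network private --wait pinned-vm
> > > >     nova live-migration pinned-vm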
> > > > 
> > > > 
> > > > For each of these tests I'll provide a test-command file with the commands I used to run the tests, and a
> > > > results file with a summary at the top plus the XMLs before and after the migration, showing that initially the
> > > > resources would conflict on migration and then the updated XMLs after the migration.
> > > > I will also provide the local.conf for the devstack deployment and some details about the env like
> > > > distro/qemu/libvirt versions.
> > > > 
> > > > Eventually I hope all those test cases can be added to the whitebox plugin and verified in a CI.
> > > > We could also try and validate them in functional tests.
> > > > 
> > > > I have attached the XML for the pinned guest as an example of what to expect, but I will be compiling this
> > > > slowly as I go and will zip everything up in an email to the list.
> > > > This will take some time to complete, and honestly I had planned to do most of this testing after feature
> > > > freeze, when we can focus on testing more.
> > > > 
> > > > regards
> > > > sean
> > > > 
> > > > 



