On 2019/3/3 9:18 PM, Sean Mooney wrote:
On Sat, 2019-03-02 at 04:38 +0000, Zhong, Luyao wrote:
On 2019/3/1, 9:30 PM, Sean Mooney <smooney@redhat.com> wrote:
On Fri, 2019-03-01 at 17:30 +0800, Luyao Zhong wrote:
Hi all,
Something went wrong with live migration when using the 'dedicated' cpu_policy in my test. The attached file contains the details.
The message body would be so big that it would be held for moderation if I attached the debug info, so I will send it in another email.
Looking at your original email, you stated that:
# VM server_on_host1 cpu pinning info
<cputune>
  <shares>4096</shares>
  <vcpupin vcpu='0' cpuset='43'/>
  <vcpupin vcpu='1' cpuset='7'/>
  <vcpupin vcpu='2' cpuset='16'/>
  <vcpupin vcpu='3' cpuset='52'/>
  <emulatorpin cpuset='7,16,43,52'/>
</cputune>
<numatune>
  <memory mode='strict' nodeset='0'/>
  <memnode cellid='0' mode='strict' nodeset='0'/>
</numatune>

# VM server_on_host2 cpu pinning info (before migration)
<cputune>
  <shares>4096</shares>
  <vcpupin vcpu='0' cpuset='43'/>
  <vcpupin vcpu='1' cpuset='7'/>
  <vcpupin vcpu='2' cpuset='16'/>
  <vcpupin vcpu='3' cpuset='52'/>
  <emulatorpin cpuset='7,16,43,52'/>
</cputune>
<numatune>
  <memory mode='strict' nodeset='0'/>
  <memnode cellid='0' mode='strict' nodeset='0'/>
</numatune>
However, looking at the full domain XML attached, server_on_host1 was:
<cputune>
  <shares>4096</shares>
  <vcpupin vcpu='0' cpuset='43'/>
  <vcpupin vcpu='1' cpuset='7'/>
  <vcpupin vcpu='2' cpuset='16'/>
  <vcpupin vcpu='3' cpuset='52'/>
  <emulatorpin cpuset='7,16,43,52'/>
</cputune>
but server_on_host2 was:
<cputune>
  <shares>4096</shares>
  <vcpupin vcpu='0' cpuset='2'/>
  <vcpupin vcpu='1' cpuset='38'/>
  <vcpupin vcpu='2' cpuset='8'/>
  <vcpupin vcpu='3' cpuset='44'/>
  <emulatorpin cpuset='2,8,38,44'/>
</cputune>
Assuming the full XML attached for server_on_host2 is correct, this shows that the code is working correctly, as the pinning no longer overlaps.
I used virsh dumpxml to get the full XML, but I usually use virsh edit to view the XML, so I didn't notice this difference before. When using virsh edit, I can still see the overlaps. I'm not sure why I get different results between "virsh edit" and "virsh dumpxml"?
At first I thought that sounded like a libvirt bug, however the descriptions of the two commands differ slightly:
dumpxml domain [--inactive] [--security-info] [--update-cpu] [--migratable]
    Output the domain information as an XML dump to stdout; this format can be used by the create command. Additional options affecting the XML dump may be used. --inactive tells virsh to dump the domain configuration that will be used on the next start of the domain, as opposed to the current domain configuration. Using --security-info will also include security sensitive information in the XML dump. --update-cpu updates domain CPU requirements according to host CPU. With --migratable one can request an XML that is suitable for migrations, i.e., compatible with older libvirt releases and possibly amended with internal run-time options. This option may automatically enable other options (--update-cpu, --security-info, ...) as necessary.
edit domain
    Edit the XML configuration file for a domain, which will affect the next boot of the guest.
This is equivalent to:
    virsh dumpxml --inactive --security-info domain > domain.xml
    vi domain.xml (or make changes with your other text editor)
    virsh define domain.xml
except that it does some error checking.
The editor used can be supplied by the $VISUAL or $EDITOR environment variables, and defaults to "vi".
I think virsh edit is showing you the state the VM will have on its next reboot, which appears to be the original XML, not the migration XML that was used to move the instance.
Since OpenStack destroys the XML and recreates it from scratch every time, the output of virsh edit can be ignored. virsh dumpxml will show the current state of the domain. If you did virsh dumpxml --inactive, it would likely match virsh edit. It's possible the modified XML we use when migrating a domain is considered transient, but that is my best guess as to why they are different.
For evaluating this we should use the values from virsh dumpxml.
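To confirm that on the destination host, comparing the live and persistent definitions side by side should show the difference; roughly something like this (server_on_host2 is just the label from the earlier mails, substitute the actual libvirt domain name, and grep is only used to trim the output):

    # live (running) definition, i.e. what the migration actually applied
    virsh dumpxml server_on_host2 | grep -E 'vcpupin|emulatorpin'
    # persistent definition, i.e. what virsh edit shows and what the next boot would use
    virsh dumpxml --inactive server_on_host2 | grep -E 'vcpupin|emulatorpin'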
I rebooted the migrated VM, and then virsh dumpxml gave the same XML as virsh edit did before the reboot. Artom Lifshitz mentioned that the instance_numa_topology wasn't updated in the database in the changes from the Wind River folks. So I guess virsh edit shows the XML produced by OpenStack according to the DB; maybe this is why we get different XMLs with 'edit' and 'dumpxml'. This is a little complex to me and still needs more test and debug work. Thank you so much for giving these details. Regards, Luyao
Regards, Luyao
On 2019/2/28 9:28 PM, Sean Mooney wrote:
On Wed, 2019-02-27 at 21:33 -0500, Artom Lifshitz wrote:
> On Wed, Feb 27, 2019, 21:27 Matt Riedemann, <mriedemos@gmail.com> wrote:
>> On 2/27/2019 7:25 PM, Artom Lifshitz wrote:
>> What I've been using for testing is this: [3]. It's a series of patches to whitebox_tempest_plugin, a Tempest plugin used by a bunch of us Nova Red Hatters to automate testing that's outside of Tempest's scope.
>
> And where is that pulling in your nova series of changes and posting test results (like a 3rd party CI) so anyone can see it? Or do you mean here are tests, but you need to provide your own environment if you want to verify the code prior to merging it.
Sorry, wasn't clear. It's the latter. The test code exists, and has run against my devstack environment with my patches checked out, but there's no CI or public posting of test results. Getting CI coverage for these NUMA things (like the old Intel one) is a whole other topic.
On the CI front, I resolved the nested virt issue on the server I bought to set up a personal CI for NUMA testing. That set me back a few weeks in setting up that CI, but I hope to run Artom's whitebox tests, among others, in it at some point. Vexxhost has also provided nested virt on the gate VMs. I'm going to see if we can actually create a non-voting job using the ubuntu-bionic-vexxhost nodeset. If OVH or one of the other providers of CI resources re-enable nested virt, then we can maybe make that job voting and not need third-party CI anymore.
> Can we really not even have functional tests with the fake libvirt driver and fake numa resources to ensure the flow doesn't blow up?
That's something I have to look into. We have live migration functional tests, and we have NUMA functional tests, but I'm not sure how we can combine the two.
Just as an additional proof point, I am planning to do a bunch of migration and live migration testing in the next 2-4 weeks.
My current backlog, in no particular order, is:
- SR-IOV migration
- NUMA migration
- vTPM migration
- cross-cell migration
- cross-neutron-backend migration (ovs<->linuxbridge)
- cross-firewall migration (iptables<->conntrack) (previously tested and worked at the end of Queens)
Narrowing in on the NUMA migration, the current set of test cases I plan to manually verify is as follows:
Note: assume all flavors will have 256MB of RAM and 4 cores unless otherwise stated.
Basic tests:
- pinned guests (hw:cpu_policy=dedicated)
- pinned-isolated guests (hw:cpu_policy=dedicated hw:cpu_thread_policy=isolate)
- pinned-prefer guests (hw:cpu_policy=dedicated hw:cpu_thread_policy=prefer)
- unpinned-single-numa guest (hw:numa_nodes=1)
- unpinned-dual-numa guest (hw:numa_nodes=2)
- unpinned-dual-numa-unbalanced guest (hw:numa_nodes=2 hw:numa_cpus.0=1 hw:numa_cpus.1=1-3 hw:numa_mem.0=64 hw:numa_mem.1=192)
- unpinned-hugepage-implicit-numa guest (hw:mem_page_size=large)
- unpinned-hugepage-multi-numa guest (hw:mem_page_size=large hw:numa_nodes=2)
- pinned-hugepage-multi-numa guest (hw:mem_page_size=large hw:numa_nodes=2 hw:cpu_policy=dedicated)
- realtime guest (hw:cpu_policy=dedicated hw:cpu_realtime=yes hw:cpu_realtime_mask=^0-1)
- emulator-thread-isolated guest (hw:cpu_policy=dedicated hw:emulator_threads_policy=isolate)
A sketch of how one of these flavors could be created follows this list.
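For illustration, one of the flavors above could be created along these lines (the flavor name and disk size are made up; the extra specs are the ones from the list):

    # hypothetical flavor for the basic "pinned guests" case: 256MB RAM, 4 vCPUs
    openstack flavor create pinned.small --ram 256 --vcpus 4 --disk 1
    openstack flavor set pinned.small --property hw:cpu_policy=dedicated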
Advanced tests (require extra nova.conf changes, sketched after this list):
- emulator-thread-shared guest (hw:cpu_policy=dedicated hw:emulator_threads_policy=share); note: cpu_shared_set configured
- unpinned-single-numa-heterogeneous-host guest (hw:numa_nodes=1); note: vcpu_pin_set adjusted so that host 1 only has CPUs on NUMA node 1 and host 2 only has CPUs on NUMA node 2
- super-optimised guest (hw:numa_nodes=2 hw:numa_cpus.0=1 hw:numa_cpus.1=1-3 hw:numa_mem.0=64 hw:numa_mem.1=192 hw:cpu_realtime=yes hw:cpu_realtime_mask=^0-1 hw:emulator_threads_policy=isolate)
- super-optimised guest 2 (hw:numa_nodes=2 hw:numa_cpus.0=1 hw:numa_cpus.1=1-3 hw:numa_mem.0=64 hw:numa_mem.1=192 hw:cpu_realtime=yes hw:cpu_realtime_mask=^0-1 hw:emulator_threads_policy=share)
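The nova.conf changes those advanced cases need could look roughly like this on the relevant compute nodes (a sketch only: crudini is just one convenient way to edit the file, the CPU ranges are made up, and the service name assumes a devstack deployment):

    # shared CPU set for the emulator-thread-shared case ([compute] section)
    crudini --set /etc/nova/nova.conf compute cpu_shared_set 0-1
    # heterogeneous-host case: restrict guest CPUs per host, e.g. on host 1 use only NUMA node 1's CPUs
    crudini --set /etc/nova/nova.conf DEFAULT vcpu_pin_set 2-19
    # restart nova-compute so the new values are picked up
    sudo systemctl restart devstack@n-cpu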
For each of these tests I'll provide a test-command file with the commands I used to run the test, and a results file with a summary at the top plus the XMLs before and after the migration, showing that initially the resources would conflict on migration and then the updated XMLs after the migration. I will also provide the local.conf for the devstack deployment and some details about the environment like distro/qemu/libvirt versions.
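To give a sense of what one of those test-command files might contain for the pinned case, the workflow is roughly the following (server, image, and host names are placeholders, and the libvirt domain name would need to be looked up on each host):

    # boot a pinned guest, record its pinning, live migrate it, record it again
    openstack server create --flavor pinned.small --image cirros-0.4.0-x86_64-disk --network private vm-pinned
    virsh dumpxml instance-00000001 | grep -E 'vcpupin|emulatorpin' > pinned-before.txt   # on the source host
    openstack server migrate --live host2 vm-pinned
    virsh dumpxml instance-00000001 | grep -E 'vcpupin|emulatorpin' > pinned-after.txt    # on the destination host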
Eventually I hope all those test cases can be added to the whitebox plugin and verified in a CI. We could also try and validate them in functional tests.
I have attached the XML for the pinned guest as an example of what to expect, but I will be compiling this slowly as I go and will zip everything up in an email to the list. This will take some time to complete, and honestly I had planned to do most of this testing after feature freeze when we can focus more on testing.
Regards, Sean