[nova] NUMA live migration - mostly how it's tested
Luyao Zhong
luyao.zhong at intel.com
Fri Mar 1 09:35:54 UTC 2019
Attached is the debug log.
On 2019/3/1 5:30 PM, Luyao Zhong wrote:
> Hi all,
>
> There was something wrong with live migration when using the 'dedicated'
> cpu_policy in my test. The attached file contains the details.
>
> The message body would be too big and would be held for moderation if I
> attached the debug info, so I will send the debug log in a separate email.
>
> Regards,
> Luyao
>
>
> On 2019/2/28 9:28 PM, Sean Mooney wrote:
>> On Wed, 2019-02-27 at 21:33 -0500, Artom Lifshitz wrote:
>>>
>>>
>>> On Wed, Feb 27, 2019, 21:27 Matt Riedemann, <mriedemos at gmail.com> wrote:
>>>> On 2/27/2019 7:25 PM, Artom Lifshitz wrote:
>>>>> What I've been using for testing is this: [3]. It's a series of
>>>>> patches to whitebox_tempest_plugin, a Tempest plugin used by a bunch
>>>>> of us Nova Red Hatters to automate testing that's outside of Tempest's
>>>>> scope.
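>>>>>
>>>>> Roughly, a whitebox test boils down to something like this sketch
>>>>> (not the plugin's actual helpers): SSH to the compute host, dump the
>>>>> domain XML, and assert on it:
>>>>>
>>>>>   import subprocess
>>>>>   from xml.etree import ElementTree
>>>>>
>>>>>   def get_vcpu_pins(compute_host, domain):
>>>>>       # each <vcpupin> maps one guest vCPU to a host pCPU set
>>>>>       xml = subprocess.check_output(
>>>>>           ['ssh', compute_host, 'virsh', 'dumpxml', domain]).decode()
>>>>>       root = ElementTree.fromstring(xml)
>>>>>       return {pin.get('vcpu'): pin.get('cpuset')
>>>>>               for pin in root.findall('./cputune/vcpupin')}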
>>>>
>>>> And where is that pulling in your nova series of changes and posting
>>>> test results (like a third-party CI) so anyone can see it? Or do you
>>>> mean: here are the tests, but you need to provide your own environment
>>>> if you want to verify the code prior to merging it?
>>>
>>> Sorry, wasn't clear. It's the latter. The test code exists, and has
>>> run against my devstack environment with my
>>> patches checked out, but there's no CI or public posting of test
>>> results. Getting CI coverage for these NUMA things
>>> (like the old Intel one) is a whole other topic.
>> On the CI front, I resolved the nested virt issue on the server I
>> bought to set up a personal CI for NUMA testing.
>> That set me back a few weeks in setting up that CI, but I hope to run
>> Artom's whitebox tests, among others, on it at some point. Vexxhost
>> also provides nested virt on the gate VMs, so I'm going to see if we
>> can create a non-voting job using the ubuntu-bionic-vexxhost nodeset.
>> If OVH or one of the other providers of CI resources re-enables nested
>> virt, then we can maybe make that job voting and not need third-party
>> CI anymore.
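>>
>> As a sketch, that non-voting job would be roughly the following (the
>> job name and parent here are placeholders, not real definitions):
>>
>>   - job:
>>       name: nova-numa-live-migration
>>       parent: tempest-multinode-full
>>       nodeset: ubuntu-bionic-vexxhost
>>       voting: false
>>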
>>>> Can we really not even have functional tests with the fake libvirt
>>>> driver and fake numa resources to ensure the flow doesn't blow up?
>>>
>>> That's something I have to look into. We have live migration
>>> functional tests, and we have NUMA functional tests, but
>>> I'm not sure how we can combine the two.
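>>>
>>> A very rough sketch of what combining them might look like (fixture
>>> names from memory; actually wiring up the two computes is the open
>>> question):
>>>
>>>   from nova.tests.functional.libvirt import base
>>>   from nova.tests.unit.virt.libvirt import fakelibvirt
>>>
>>>   class NUMALiveMigrationTest(base.ServersTestBase):
>>>       def test_live_migrate_pinned_guest(self):
>>>           # two NUMA cells of 2 cores x 2 threads per fake host
>>>           host_info = fakelibvirt.HostInfo(
>>>               cpu_nodes=2, cpu_sockets=1, cpu_cores=2, cpu_threads=2,
>>>               kB_mem=15740000)
>>>           # wiring up two computes with this topology, booting a
>>>           # pinned guest, and driving the live migration flow is the
>>>           # part we'd still need to figure out
>>>           ...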
>>
>> Just as an additional proof point, I am planning to do a bunch of
>> migration and live migration testing in the next 2-4 weeks.
>>
>> My current backlog, in no particular order, is:
>> SR-IOV migration
>> NUMA migration
>> vTPM migration
>> cross-cell migration
>> cross-Neutron-backend migration (OVS <-> Linux Bridge)
>> cross-firewall migration (iptables <-> conntrack) (previously tested
>> and working at the end of Queens)
>>
>> Narrowing in on NUMA migration, the current set of test cases I plan
>> to verify manually is as follows:
>>
>> Note: assume all flavors have 256MB of RAM and 4 cores unless
>> otherwise stated.
>>
>> Basic tests (an example flavor command for the first case follows
>> this list):
>> pinned guest (hw:cpu_policy=dedicated)
>> pinned-isolated guest (hw:cpu_policy=dedicated
>> hw:cpu_thread_policy=isolate)
>> pinned-prefer guest (hw:cpu_policy=dedicated
>> hw:cpu_thread_policy=prefer)
>> unpinned-single-numa guest (hw:numa_nodes=1)
>> unpinned-dual-numa guest (hw:numa_nodes=2)
>> unpinned-dual-numa-unbalanced guest (hw:numa_nodes=2 hw:numa_cpu.0=0
>> hw:numa_cpu.1=1-3 hw:numa_mem.0=64 hw:numa_mem.1=192)
>> unpinned-hugepage-implicit-numa guest (hw:mem_page_size=large)
>> unpinned-hugepage-multi-numa guest (hw:mem_page_size=large
>> hw:numa_nodes=2)
>> pinned-hugepage-multi-numa guest (hw:mem_page_size=large
>> hw:numa_nodes=2 hw:cpu_policy=dedicated)
>> realtime guest (hw:cpu_policy=dedicated hw:cpu_realtime=yes
>> hw:cpu_realtime_mask=^0-1)
>> emulator-thread-isolated guest (hw:cpu_policy=dedicated
>> hw:emulator_threads_policy=isolate)
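>>
>> As an example, the pinned guest flavor above can be created like this
>> (the flavor name and disk size are my choices, not part of the plan):
>>
>>   openstack flavor create --ram 256 --disk 1 --vcpus 4 \
>>     --property hw:cpu_policy=dedicated pinned-guest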
>>
>> Advanced tests (require extra nova.conf changes; a sketch of those
>> settings follows this list):
>> emulator-thread-shared guest (hw:cpu_policy=dedicated
>> hw:emulator_threads_policy=share) -- note: cpu_shared_set configured
>> unpinned-single-numa-heterogeneous-host guest (hw:numa_nodes=1) --
>> note: vcpu_pin_set adjusted so that host 1 only has CPUs on NUMA
>> node 1 and host 2 only has CPUs on NUMA node 2
>> super-optimised guest (hw:numa_nodes=2 hw:numa_cpu.0=0
>> hw:numa_cpu.1=1-3 hw:numa_mem.0=64 hw:numa_mem.1=192
>> hw:cpu_realtime=yes hw:cpu_realtime_mask=^0-1
>> hw:emulator_threads_policy=isolate)
>> super-optimised guest 2 (hw:numa_nodes=2 hw:numa_cpu.0=0
>> hw:numa_cpu.1=1-3 hw:numa_mem.0=64 hw:numa_mem.1=192
>> hw:cpu_realtime=yes hw:cpu_realtime_mask=^0-1
>> hw:emulator_threads_policy=share)
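>>
>> For reference, a minimal sketch of the extra nova.conf bits for these
>> tests (the CPU ranges are illustrative for my hosts, not a
>> recommendation):
>>
>>   [DEFAULT]
>>   # restrict which host CPUs guests may use (set per host so each
>>   # host only exposes one NUMA node's CPUs)
>>   vcpu_pin_set = 0-3
>>
>>   [compute]
>>   # host CPUs dedicated to emulator threads for the 'share' policy
>>   cpu_shared_set = 4-5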
>>
>>
>> For each of these tests I'll provide a test-command file with the
>> commands I used to run the tests, and a results file with a summary at
>> the top plus the XMLs from before and after the migration, showing
>> that initially the resources would conflict on migration, followed by
>> the updated XMLs after the migration.
>> I will also provide the local.conf for the devstack deployment and
>> some details about the environment, like the distro/qemu/libvirt
>> versions.
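>>
>> To give a sense of what to look for in those XMLs, the interesting bit
>> is the <cputune> element; the cpusets here are illustrative:
>>
>>   <cputune>
>>     <vcpupin vcpu='0' cpuset='2'/>
>>     <vcpupin vcpu='1' cpuset='3'/>
>>   </cputune>
>>
>> Initially the source and destination pinnings would conflict; after
>> the migration, the destination XML should show recomputed,
>> non-conflicting cpusets.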
>>
>> Eventually I hope all those test cases can be added to the whitebox
>> plugin and verified in a CI.
>> We could also try to validate them in functional tests.
>>
>> I have attached the XML for the pinned guest as an example of what to
>> expect, but I will be compiling this slowly as I go and will zip
>> everything up in an email to the list.
>> This will take some time to complete, and honestly I had planned to do
>> most of this testing after feature freeze, when we can focus more on
>> testing.
>>
>> regards
>> sean
>>
>>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: debug.log
Type: text/x-log
Size: 21050 bytes
Desc: not available
URL: <http://lists.openstack.org/pipermail/openstack-discuss/attachments/20190301/2e2ae364/attachment.bin>