[nova] NUMA live migration - mostly how it's tested

Artom Lifshitz alifshit at redhat.com
Fri Mar 1 15:44:18 UTC 2019


On Wed, Feb 27, 2019 at 8:25 PM Artom Lifshitz <alifshit at redhat.com> wrote:
>
> Hey all,
>
> There won't be much new here for those who've reviewed the patches [1]
> already, but I wanted to address the testing situation.
>
> Until recently, the last patch was WIP because I had functional tests
> but no unit tests. Even without NUMA anywhere, the claims part of the
> new code could be tested in functional tests. With the new and
> improved implementation proposed by Dan Smith [2], this is no longer
> the case. Any test more involved than unit testing will need "real"
> NUMA instances on "real" NUMA hosts to trigger the new code. Because
> of that, I've dropped functional testing altogether, have added unit
> tests, and have taken the WIP tag off.

Replying to myself here to address the functional tests situation.
I've explored this a bit, and while it's probably doable (it's code,
everything is doable), I'm wondering whether functional tests would be
worth it.

The problem arises from artificially forcing an overlap of the pin
mappings. In my integration tests, using CPU pinning as an example, I
set vcpu_pin_set to 0,1 on both compute hosts, boot two instances
(making sure they're on different hosts by using the
DifferentHostFilter and the appropriate scheduler hint); then change
vcpu_pin_set to 0-3 on host A, live migrate the instance from host B
onto host A, and assert that they don't end up with overlapping pins.
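
A minimal sketch of that final assertion, in Python. The helper names
and the "guest vCPU -> host CPU" dict representation are hypothetical
(they are not taken from the whitebox plugin), and the parser handles
only comma lists and ranges, ignoring the ^ exclusion syntax that
vcpu_pin_set also accepts:

```python
def parse_cpu_spec(spec):
    """Parse a CPU list such as "0,1" or "0-3" into a set of ints."""
    cpus = set()
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue
        if "-" in part:
            # Inclusive range, e.g. "0-3" means CPUs 0, 1, 2 and 3.
            start, end = part.split("-", 1)
            cpus.update(range(int(start), int(end) + 1))
        else:
            cpus.add(int(part))
    return cpus


def overlapping_pins(pins_a, pins_b):
    """Return the host CPUs that both instances are pinned to.

    Each argument maps a guest vCPU to the host CPU it is pinned to.
    An empty result means the two instances do not share any host CPU.
    """
    return set(pins_a.values()) & set(pins_b.values())


# Host A after the config change: vcpu_pin_set = 0-3.
host_a_cpus = parse_cpu_spec("0-3")

# Instance already on host A, and the instance migrated in from host B.
# The second mapping is what a correct migration would produce: the
# incoming instance is moved off CPUs 0 and 1.
resident = {0: 0, 1: 1}
migrated = {0: 2, 1: 3}

assert set(migrated.values()) <= host_a_cpus
assert not overlapping_pins(resident, migrated)
```

If the pins were carried over unchanged from host B (0 and 1), the
last assertion would fail, which is the overlap the test is meant to
catch.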

Applying the same strategy to functional tests isn't straightforward
because the CONF object is global to the test process, so we can't
have different config values for different services in the same test.
One basic functional test we could have is just asserting that the
live migration is refused if both hosts are "full" with instances.
Today that migration goes through just fine, except that it results
in overlapping pin mappings.

For more advanced testing, I'm proposing that we shelve functional
tests for now and push on setting up some sort of CI job using OpenLab
hardware. I've already opened a request [1]. If this doesn't pan out,
we can revisit what it would take to have functional tests.

Thoughts?

[1] https://github.com/theopenlab/openlab/issues/200

> What I've been using for testing is this: [3]. It's a series of
> patches to whitebox_tempest_plugin, a Tempest plugin used by a bunch
> of us Nova Red Hatters to automate testing that's outside of Tempest's
> scope. Same idea as the intel-nfv-ci plugin [4]. The tests I currently
> have check that:
>
> * CPU pin mapping is updated if the destination has an instance pinned
> to the same CPUs as the incoming instance
> * emulator thread pins are updated if the destination has a different
> cpu_shared_set value and the instance has the
> hw:emulator_threads_policy set to `share`
> * NUMA node pins are updated for a hugepages instance if the
> destination has a hugepages instance consuming the same NUMA node as
> the incoming instance
>
> It's not exhaustive by any means, but I've made sure that all
> iterations pass those 3 tests. It should be fairly easy to add new
> tests, as most of the necessary scaffolding is already in place.
>
> [1] https://review.openstack.org/#/c/634606/
> [2] https://review.openstack.org/#/c/634828/28/nova/virt/driver.py@1147
> [3] https://review.rdoproject.org/r/#/c/18832/
> [4] https://github.com/openstack/intel-nfv-ci-tests/



-- 
Artom Lifshitz
Software Engineer, OpenStack Compute DFG


