[Openstack-operators] [nova] Queens PTG recap - placement
mriedemos at gmail.com
Mon Sep 18 15:28:14 UTC 2017
Placement-related items came up a lot at the Queens PTG: some on Tuesday, some on Wednesday, some on Thursday, and some on Friday.
Priorities for Queens
The priorities for placement/scheduler related items in Queens are:
1. Migration allocations - we realized late in Pike that the way we
were tracking allocations across source and destination nodes during a
move operation (cold migrate, live migrate, resize, evacuate) was
confusing and error prone, and we had to "double up" allocations for
the instance during the move. The idea here is to simplify the resource
allocation modeling during a move by having the migration record be a
consumer of resource allocations, so we can keep the source/dest node
allocations separate using the instance and migration records. This is
mostly internal technical debt reduction to simplify our accounting,
which should mean fewer bugs.
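To make the "migration record as consumer" idea concrete, here is a minimal illustrative sketch (not the actual Nova code; the resource provider names and values are made up) of how allocations could be kept separate per consumer during a move:

```python
# Sketch of the proposed model: during a move, the migration record and
# the instance are each a separate consumer of allocations in Placement,
# instead of doubling up everything on the instance. All names/values
# here are hypothetical.
import uuid

instance_uuid = str(uuid.uuid4())
migration_uuid = str(uuid.uuid4())

allocations = {
    # The migration record "owns" the source node's resources...
    migration_uuid: {"source-node-rp": {"VCPU": 2, "MEMORY_MB": 2048}},
    # ...while the instance owns the destination node's resources.
    instance_uuid: {"dest-node-rp": {"VCPU": 2, "MEMORY_MB": 2048}},
}

def confirm(allocs, migration):
    """On confirming the move, drop the migration's (source) allocations."""
    allocs.pop(migration, None)
    return allocs
```

A revert would do the mirror image: delete the instance's destination allocations and move the migration's source allocations back to the instance.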
2. Alternate hosts - this is the work to have the scheduler determine a
set of alternative hosts for reschedules. This is important for cells v2
where the cell conductor and nova-compute services can't reach the API
database or scheduler, so reschedules need to happen within the cell
given a list of pre-determined hosts chosen by the scheduler at the top.
Ed Leafe has already started on some of this.
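As a rough illustration of the idea (not the actual implementation, which is still in progress), the scheduler hands each cell a primary host plus pre-computed alternates, so a reschedule can proceed inside the cell without calling back up to the scheduler:

```python
# Illustrative sketch only: the scheduler picks a host and bundles
# alternates with it; on failure the cell conductor moves to the next
# alternate locally. Function and host names are hypothetical.
def build_selections(candidates, num_alternates=2):
    """Return the chosen host followed by its alternates."""
    return candidates[: 1 + num_alternates]

def reschedule(selections):
    """Drop the failed host and return the next alternate, if any."""
    selections.pop(0)
    return selections[0] if selections else None

hosts = ["compute1", "compute2", "compute3", "compute4"]
selections = build_selections(hosts)
```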
3. Nested resource providers - this has been around for a while now
but hasn't had the proper reviewer focus due to other priorities. We are
making this a priority in Queens as it enables a lot of other use cases,
like bandwidth-aware scheduling and being able to eventually remove
major chunks of the claims code in the ResourceTracker in the compute
service. We agreed that in Queens we want to try to keep the scope of
this small and focus on being able to model a simple SR-IOV PF/VF
relationship. Modeling NUMA use cases will be post-Queens. We will need
quite a bit of functional testing work done along with this, so that
we have fixtures and/or fake virt drivers in place to model things
like CPU pinning, huge pages, NUMA, and SR-IOV, which also verify
allocations in Placement so we know we are doing things correctly from
the client perspective, similar to the functional tests added for
verifying allocations during move operations in Pike.
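The simple SR-IOV case above can be pictured as a small provider tree; this is just a sketch of the shape (provider names are made up, though SRIOV_NET_VF is a standard resource class): a compute node parents one child provider per physical function (PF), each with an inventory of virtual functions (VFs).

```python
# Hypothetical nested resource provider tree for the simple SR-IOV
# PF/VF case: the compute node holds CPU/RAM inventory, and each PF
# child provider holds VF inventory.
provider_tree = {
    "name": "compute-node-1",
    "inventories": {"VCPU": 16, "MEMORY_MB": 32768},
    "children": [
        {"name": "pf-eth0", "inventories": {"SRIOV_NET_VF": 8}, "children": []},
        {"name": "pf-eth1", "inventories": {"SRIOV_NET_VF": 8}, "children": []},
    ],
}

def total_vfs(node):
    """Sum VF inventory across the whole provider tree."""
    return node["inventories"].get("SRIOV_NET_VF", 0) + sum(
        total_vfs(child) for child in node["children"]
    )
```

The point of nesting is that a VF allocation lands on the specific PF provider it came from, rather than on an undifferentiated pool at the compute node level.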
General device management
This was a more forward-looking discussion and the notes are in the
etherpad. This is not really slated for Queens work, except to make
sure that things we do in Queens don't limit what we can do for
generically managing devices later; it is tied heavily to the nested
resource providers work.
Traits - supporting required traits in a flavor is ongoing and the spec
is here.
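For context, the syntax being proposed in the spec puts required traits in flavor extra specs; treat the exact key format below as illustrative of the in-progress spec rather than final:

```python
# Sketch of the proposed flavor extra-spec syntax for required traits:
# keys of the form "trait:<TRAIT_NAME>" with the value "required".
# The trait names here are standard os-traits names used for example.
flavor_extra_specs = {
    "trait:HW_CPU_X86_AVX2": "required",
    "trait:STORAGE_DISK_SSD": "required",
}

def required_traits(extra_specs):
    """Extract trait names whose value is 'required'."""
    return sorted(
        key.split(":", 1)[1]
        for key, value in extra_specs.items()
        if key.startswith("trait:") and value == "required"
    )
```

The scheduler would then only consider resource providers that expose all of the required traits.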
Shared storage providers - we have decided to defer working on this
in Queens given other priorities. Modeling move allocations with
migration records should help here, though.
Modeling distance for (anti-)affinity use cases - this is being deferred
from Queens. There are workarounds when running with multiple cells.
Limits and ordering in Placement - Chris Dent has proposed a spec
so that we can limit the size of the response when getting resource
providers from Placement during scheduling, and also optionally
configure how Placement orders the returned set, so you can pack
or spread possible build candidates.
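On the wire this could look something like the following; the parameter name and endpoint usage are taken from the general shape of the Placement API, but the final syntax is whatever the spec settles on:

```python
# Hypothetical request illustrating the proposed limit on scheduling
# candidates returned by Placement. Parameter names are illustrative.
from urllib.parse import urlencode

params = {
    "resources": "VCPU:2,MEMORY_MB:2048",
    "limit": 20,  # cap the number of returned candidates
}
url = "/allocation_candidates?" + urlencode(params)
```

Packing versus spreading would then be a deployment-side knob, e.g. a Placement configuration option controlling the ordering (or randomization) of the returned set.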
OSC plugin - I'm trying to push this work forward. We have the plugin
installed with devstack now and a functional CI job for the repo, but we
still need to move forward the patches that add the CLI functionality.
There was lots of other random stuff in the etherpads, but for the most
part it is not prioritized, spec'd out, or clearly owned, so it is not
really getting attention for Queens.