[openstack-dev] [nova] [placement] resource providers update 18-03

Chris Dent cdent+os at anticdent.org
Fri Jan 19 09:46:30 UTC 2018


Here's resource provider and placement update 18-03. I'm travelling so
this version may be a bit abridged.

# Most Important

This remains mostly the same: getting alternate hosts all the way in
and finishing up nested resource provider support (as ProviderTree
on the nova side and support for nested in /allocation_candidates on
the placement side). Both of these will likely need some time to be
rigorously run through their paces before the end of the cycle, so the
sooner stuff merges the sooner we can start getting the whole suite
exercised by humans.

Earlier in the week I did some exercising by humans and was confused
by the state of traits handling on /allocation_candidates (it could be
the current state is the expected state but the code didn't make that
clear) so I filed a bug on it to make sure that confusion didn't get
forgotten:

     https://bugs.launchpad.net/nova/+bug/1743860

I highlight this not because I think that problem is especially
"most important" but because it is a type of problem that I think we'll
see a fair bit of over the next few weeks as we close out Queens and
head for Rocky.

(Looks like Alex is working on the correct fix at

     https://review.openstack.org/#/c/535642/

Based on that it seems most of the confusion here is mine, but the
fact that it was hard to tell what is up, or what the plan is, is
something we probably need to get better at.)

The Rocky PTG prep etherpad is in flight at

     https://etherpad.openstack.org/p/nova-ptg-rocky

please add things you think need to be talked about at the PTG.

There's an email thread in progress that is probably pretty important
to understand, if you're working on placement related things:

     http://lists.openstack.org/pipermail/openstack-dev/2018-January/126283.html

The behavior of the Aggregate*Filters has gone awry in the face of
placement satisfying allocation_ratio concerns before those filters
ever see proposed hosts. There are some ideas on how to improve the
situation in the thread, but it appears there are still some open
questions.
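
To make the ordering problem concrete, here is a minimal sketch. It
assumes placement accounts for usable capacity as
(total - reserved) * allocation_ratio using the ratio on the inventory.
The host names, numbers and helper functions are made up for
illustration; this is not nova or placement code, and the thread may
well describe subtleties this toy misses.

```python
# Hypothetical illustration only; not nova or placement code.

def capacity(total, reserved, allocation_ratio):
    """Usable capacity of one inventory under the assumed formula."""
    return int((total - reserved) * allocation_ratio)


def placement_candidates(inventories, used, requested):
    """Hosts that fit the request, judged by the inventory ratio only."""
    return sorted(
        host for host, inv in inventories.items()
        if used[host] + requested <= capacity(**inv)
    )


def aggregate_filter(hosts, inventories, used, requested, aggregate_ratio):
    """Re-check the offered hosts against a ratio set on the aggregate."""
    return [
        host for host in hosts
        if used[host] + requested
        <= capacity(inventories[host]["total"],
                    inventories[host]["reserved"],
                    aggregate_ratio[host])
    ]


# VCPU inventories as reported to placement (ratio 1.0 on both hosts).
inventories = {
    "host1": {"total": 8, "reserved": 0, "allocation_ratio": 1.0},
    "host2": {"total": 8, "reserved": 0, "allocation_ratio": 1.0},
}
used = {"host1": 8, "host2": 4}
# An operator set a more generous ratio on host1's aggregate.
aggregate_ratio = {"host1": 4.0, "host2": 1.0}

offered = placement_candidates(inventories, used, requested=2)
passed = aggregate_filter(offered, inventories, used, 2, aggregate_ratio)
print(offered, passed)
```

In this toy, host1 would have room under its aggregate ratio, but it is
dropped by the first step and the filter never gets to consider it.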

# What's Changed

An issue with foreign key constraints and deleting a resource provider
whose root is itself has been resolved and the change merged:

     https://review.openstack.org/#/c/529519/

Anybody (or anything) experimenting with deleting resource providers
on a database that enforces some integrity would have encountered
this problem.

A proposal to create a Resource Management SIG has merged. There was
some email discussion about it:

     http://lists.openstack.org/pipermail/openstack-dev/2018-January/126039.html

# Help Wanted

There are a fair few unstarted bugs related to placement that could do
with some attention. Here's a handy URL: https://goo.gl/TgiPXb

# Main Themes

## Nested Providers

The nested provider work is proceeding along two main courses: getting
the ProviderTree on the nova side gathering and syncing all the
necessary information, and enabling nested provider searching when
requesting /allocation_candidates. Both of these are within the same
topic:

     https://review.openstack.org/#/q/topic:bp/nested-resource-providers

One of the challenges this week was working out a reasonable way to
have a read-only and thread-safe duplicate of a ProviderTree so that
tree A and tree B can have what amounts to a diff done on them. This
is being figured out on

     https://review.openstack.org/#/c/533244/
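
As a rough picture of the "snapshot and diff" idea, here is a minimal
sketch. The Tree class, its methods and the diff() helper are all
hypothetical and much simpler than the real ProviderTree; the review
above is where the actual approach is being worked out.

```python
import threading
from types import MappingProxyType


class Tree:
    """Toy stand-in for a provider tree: name -> (parent, generation)."""

    def __init__(self):
        self._lock = threading.Lock()
        self._providers = {}

    def add(self, name, parent=None, generation=0):
        # Values are immutable tuples so that copies cannot be altered.
        with self._lock:
            self._providers[name] = (parent, generation)

    def remove(self, name):
        with self._lock:
            del self._providers[name]

    def snapshot(self):
        """Return a read-only copy taken while holding the lock."""
        with self._lock:
            return MappingProxyType(dict(self._providers))


def diff(old, new):
    """Compare two snapshots; returns (added, removed, changed) names."""
    added = sorted(set(new) - set(old))
    removed = sorted(set(old) - set(new))
    changed = sorted(n for n in set(old) & set(new) if old[n] != new[n])
    return added, removed, changed


tree = Tree()
tree.add("compute1")
tree.add("numa0", parent="compute1")
before = tree.snapshot()

tree.add("numa1", parent="compute1")
tree.add("compute1", generation=1)  # same provider, newer generation
tree.remove("numa0")
after = tree.snapshot()

print(diff(before, after))
```

The snapshot is detached from the live tree, so later changes to the
tree do not show up in it, and attempts to assign into it raise
TypeError.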

## Alternate Hosts

The last piece of the puzzle, changing the RPC interface, is pending:

     https://review.openstack.org/#/q/topic:bp/return-alternate-hosts

Related to this, exploration has started on limiting the number of
responses that the scheduler will get when requesting hosts (some
of which will become alternates):

     https://review.openstack.org/#/c/531517/
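
For a sense of what such a request could look like from the caller's
side, here is a minimal sketch that builds the query string. The
`limit` parameter name is an assumption (the change is still a WIP and
may end up different), and allocation_candidates_url() is a made-up
helper, not part of the report client.

```python
from urllib.parse import urlencode


def allocation_candidates_url(resources, limit=None):
    """Build a GET /allocation_candidates path with an optional limit.

    resources maps resource class names to requested amounts.
    """
    query = {
        "resources": ",".join(
            f"{rc}:{amount}" for rc, amount in sorted(resources.items())
        )
    }
    if limit is not None:
        # Assumed parameter name; see the review for the real proposal.
        query["limit"] = limit
    # Keep ':' and ',' unescaped so the query stays readable.
    return "/allocation_candidates?" + urlencode(query, safe=":,")


print(allocation_candidates_url({"VCPU": 2, "DISK_GB": 10}, limit=5))
```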

# Other

* Support traits in allocation candidates
    https://review.openstack.org/#/c/535642/

* Extract instance allocation removal code
    https://review.openstack.org/#/c/513041/

* Sending global request ids from nova to placement
    https://review.openstack.org/#/q/topic:bug/1734625

* VGPU support
    https://review.openstack.org/#/q/topic:bp/add-support-for-vgpu

* Use traits with ironic
    https://review.openstack.org/#/q/topic:bp/ironic-driver-traits

* Move api schemas to own dir
    https://review.openstack.org/#/c/528629/
    Just one of these left

* request limit /allocation_candidates WIP
    https://review.openstack.org/#/c/531517/

* Update resources once in update available resources
    https://review.openstack.org/#/c/520024/
    (This ought, when it works, to help address some performance
    concerns with nova making too many requests to placement)

* spec: treat devices as generic resources
    https://review.openstack.org/#/c/497978/
    This is a WIP and will need to move to Rocky

* log options at DEBUG when starting wsgi app
    https://review.openstack.org/#/c/519462/

* Support aggregate affinity filters/weighers
    https://review.openstack.org/#/q/topic:bp/aggregate-affinity
    A Rocky-targeted improvement to affinity handling

* Move placement body samples in docs to own dir
    https://review.openstack.org/#/c/529998/

* Improved functional test coverage for placement
    https://review.openstack.org/#/q/topic:bp/placement-test-enhancement

* Functional tests for traits api
    https://review.openstack.org/#/c/524094/

* Functional test improvements for resource class
    https://review.openstack.org/#/c/524506/

* annotate loadapp() (for placement wsgi app) as public
    https://review.openstack.org/#/c/526691/

* Remove microversion fallback code from report client
    https://review.openstack.org/#/c/528794/

* Fix documentation nits in set_and_clear_allocations
    https://review.openstack.org/#/c/531001/

* WIP: SchedulerReportClient.set_aggregates_for_provider
    https://review.openstack.org/#/c/532995/
    This is likely for Rocky as it depends on changing the api for
    aggregates handling on the placement side to accept and provide
    a generation

* Naming update cn to rp (for clarity)
    https://review.openstack.org/#/c/529786/

* Add functional test for two-cell scheduler behaviors
    https://review.openstack.org/#/c/452006/
    (This is old and maybe out of date, but something we might like to
    resurrect)

* Make API history doc consistent
    https://review.openstack.org/#/c/477478/

* WIP: General policy sample file for placement
    https://review.openstack.org/#/c/524425/

* Fix missing marker functions
    https://review.openstack.org/#/c/514579/
    (some placement exceptions are not translatable)

* Support relay RP for allocation candidates
    https://review.openstack.org/#/c/533437/
    Bug fix for sharing with multiple providers

# End

As usual, I'm sure I missed something. Please reply with any
corrections.

Your prize is an invitation to do some exercising by humans.

-- 
Chris Dent                      (⊙_⊙')         https://anticdent.org/
freenode: cdent                                         tw: @anticdent

