[openstack-dev] [nova] placement/resource providers update 13
cdent+os at anticdent.org
Fri Mar 3 15:52:23 UTC 2017
This week's resource providers/placement update operates only
slightly as a summary of placement-related activity at last week's
PTG. We had a big etherpad of topics
and an entire afternoon (plus some extra time elsewhen) to cover
them, but really only addressed three of them (shared resource
handling, custom resource classes, traits) in any significant
fashion, touching on nested resource providers a bit in the room and
claims in the placement service on the etherpad. A summation of
that discussion follows.
# What Matters Most
A major outcome from the discussion was that the can_host/shared
concept will not be used for dealing with shared resources (such as
shared disk). Instead, a resource provider that is a compute node
will be identified by the fact that it has a particular trait (the
actual value is to be determined). When the compute node creates or
updates its own resource provider it will add that trait. When the
nova-scheduler asks for resources to filter, it will include that
trait.
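The register-then-filter flow described above can be sketched as plain filtering logic. This is an illustrative model only; the trait value and the data structures are assumptions, not the actual placement API:

```python
# Illustrative model of trait-based compute-node filtering.
# The trait value and the dict structures are assumptions,
# not the real placement API.

COMPUTE_NODE_TRAIT = "COMPUTE_NODE"  # hypothetical trait value

def register_compute_node(providers, uuid):
    """A compute node creating/updating its own resource provider
    adds the compute-node trait to itself."""
    provider = providers.setdefault(uuid, {"uuid": uuid, "traits": set()})
    provider["traits"].add(COMPUTE_NODE_TRAIT)
    return provider

def filter_by_trait(providers, required_trait):
    """The scheduler includes the required trait in its request and
    gets back only the providers that have it."""
    return [p for p in providers.values() if required_trait in p["traits"]]

providers = {}
register_compute_node(providers, "cn1")
# A shared-disk provider carries no compute-node trait, so it is excluded.
providers["shared-disk"] = {"uuid": "shared-disk", "traits": set()}

print([p["uuid"] for p in filter_by_trait(providers, COMPUTE_NODE_TRAIT)])
# ['cn1']
```

The point of the change is that the scheduler never needs a special can_host flag; membership in the compute-node set falls out of ordinary trait filtering.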
This means that the traits spec (and its POC) is now priority one in
the placement universe:
# What's Changed
## Ironic Inventory
There was some debate about where in the layering of code within
nova-compute the creation of custom resource classes should be
handled. Having these is necessary for the effective management of
ironic nodes. The discussion resulted in a new version of the ironic
inventory handling.
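For illustration, a whole ironic node is naturally modelled as inventory of exactly one unit of a custom resource class, rather than as sums of VCPU/MEMORY_MB/DISK_GB. The resource class name below is hypothetical; the inventory fields follow placement's usual shape:

```python
# Sketch: representing a bare-metal (ironic) node as one unit of a
# custom resource class. CUSTOM_BAREMETAL_GOLD is a hypothetical name.

def ironic_node_inventory(resource_class):
    # A whole node is consumed as a single indivisible unit, so
    # total, min_unit, and max_unit are all 1 and nothing is shared.
    return {
        resource_class: {
            "total": 1,
            "reserved": 0,
            "min_unit": 1,
            "max_unit": 1,
            "step_size": 1,
            "allocation_ratio": 1.0,
        }
    }

inv = ironic_node_inventory("CUSTOM_BAREMETAL_GOLD")
print(inv["CUSTOM_BAREMETAL_GOLD"]["total"])  # 1
```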
## Nested Resource Providers
There was some discussion at a flipchart about the concept of
resource providers with multiple parents. We eventually decided
"let's not do that". There was also some vague discussion about
whether it was possible to express a hardware configuration that is
currently planned to be expressed as nested resource providers as
a custom resource class instead, along the lines of how bare metal
configurations are described. This was left unresolved, in part
because presumably hardware configuration is dynamic in some or
all cases.
# Main Themes
There's been a decision to normalize trait names so they look a bit
more like custom resource classes. That work is at
This is being done concurrently with the spec and code for traits.
(That topic mismatch needs to be fixed.)
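As a sketch of what that normalization might look like (the exact rules are still being settled in the spec, so treat the details here as assumptions):

```python
import re

# Sketch of trait-name normalization toward the custom-resource-class
# style: upper case, underscores, and a CUSTOM_ prefix for
# non-standard traits. The exact rules are assumptions here.

def normalize_trait(name, custom=False):
    # Replace anything that isn't alphanumeric/underscore, then upcase.
    norm = re.sub(r"[^A-Za-z0-9_]", "_", name).upper()
    if custom and not norm.startswith("CUSTOM_"):
        norm = "CUSTOM_" + norm
    return norm

print(normalize_trait("storage:disk ssd"))        # STORAGE_DISK_SSD
print(normalize_trait("gold node", custom=True))  # CUSTOM_GOLD_NODE
```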
## Shared Resource Providers
As mentioned above, the plan on this work has changed, thus there is
currently no code in flight for it, but there is a blueprint:
## Docs
Work has started on an API ref for the placement API. There's not a
lot there yet, as I haven't had much opportunity to move it along.
There is, however, enough there for additional content to be
started, if people have the opportunity to do so. Check with me to
divvy up the work if you'd like to contribute.
## Claims in the Scheduler
We intended to talk about this at the PTG but we didn't get to it.
There was some discussion on the etherpad (linked above) but the
consensus was that planning for how to do this while the service
was a) still evolving, b) only just starting to do filtering was
premature: Anything we try to plan now will likely be wrong or at
least not aligned with eventual discoveries. We decided, instead,
that the right thing to do was to make what we've got immediately
planned work correctly and to get some real return on the promise of
the placement API (which in the immediate sense means getting shared
disk managed effectively).
## Performance
Another topic we didn't get to. We're aware that there are some
redundancies in the resource tracker that we'd like to clean up
but it's also the case that we've done no performance testing on the
placement service itself. For example, consider the case where a
CERN-sized cloud is turned on (at Ocata) for the first time. Once
all the nodes have registered themselves as resource providers the
first request for some candidate destinations in the filter
scheduler will get back all those resource providers. That's
probably a waste on several dimensions and will get a bit loady.
We ought to model both these extreme cases and the common cases to
make sure there aren't unexpected performance drains.
## Microversion Handling on the Nova side
Matt identified that we'll need to be more conscious of microversions
in nova-status, the scheduler, and the resource tracker for Pike and
beyond.
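A minimal sketch of the kind of microversion guard that code would need; "1.4" is a placeholder, not a real requirement:

```python
# Placeholder sketch of a microversion guard; the required version
# "1.4" is just an example value, not a real constraint.

def parse_version(version):
    major, minor = version.split(".")
    return (int(major), int(minor))

def supports(server_version, required):
    # Compare as integer tuples, not strings: "1.10" > "1.4".
    return parse_version(server_version) >= parse_version(required)

print(supports("1.10", "1.4"))  # True (would be False as a string compare)
print(supports("1.3", "1.4"))   # False
```

The tuple comparison matters because microversions are dotted pairs, not decimals: a naive string or float comparison would order 1.10 before 1.4.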
# Other Code/Specs
Fixing it so we don't have to add json_error_formatter everywhere.
There's a collection of related fixes attached to that bug report.
Pushkar, you might want to make all of those have the same topic,
or put them in a stack of related changes.
Fixes that ensure that we only accept valid inventories when they
are set.
Removing the Allocation.create() method which was only ever used in
tests and not in the actual, uh, creation of allocations.
Avoid deprecation warnings from oslo_context.
We need to be able to delete all the inventory hosted by one
resource provider in one request. Right now you need one delete for
each class of resource.
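A toy contrast of the current per-class delete with the desired wipe-everything call, using plain dicts rather than the real API:

```python
# Toy contrast between per-class inventory deletes (one request per
# resource class today) and a single delete-everything call.
# Plain dicts stand in for the real API.

def delete_inventory_class(provider, resource_class):
    # Current approach: one delete per resource class.
    provider["inventories"].pop(resource_class, None)

def delete_all_inventory(provider):
    # Desired approach: wipe all of a provider's inventory at once.
    provider["inventories"].clear()

provider = {"inventories": {"VCPU": {"total": 8}, "MEMORY_MB": {"total": 2048}}}
delete_inventory_class(provider, "VCPU")
print(sorted(provider["inventories"]))  # ['MEMORY_MB']
delete_all_inventory(provider)
print(provider["inventories"])  # {}
```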
A spec for improving the level of detail and structure in placement
error responses so that it is easier to distinguish between
different types of, for example, 409 responses.
Spec for versioned-object based notification of events in the
placement service.
CORS support in the placement API. We'll need this for browser-side
clients.
A little demo script showing how a cronjob to update inventory
on a shared resource provider might work. I created it a long time
ago because having a sort of demo seemed useful, but it has been
sitting around since then and may not be aligned with what we need.
If so I'd like to abandon it.
Update placement dev to indicate the new decorator for the
json_error_formatter improvements mentioned above.
Removing SQL from an exception message.
Race condition for allocations during evacuation. Known bug, not
sure of solution.
Cache headers not produced by placement API. This was assigned to
several different people over time, but I'm not sure if there is
any active code.
There's still some lingering stuff on here, some of which is
mentioned elsewhere in this message, but not all.
I suspect I'm missing some items, please let me know.
# End Matter
I think we can think of at least the start of this cycle as a period
of consolidation for the placement service: Making sure that
everything we've started is finished, working accurately, and
returning benefits before making the next great leaps forward. These
leaps include things like:
* claims in the service
* neutron and cinder doing things with placement
* using a different database with placement 
* extracting placement to its own repo
Patch to use a separate database is being kept up to date:
Chris Dent ¯\_(ツ)_/¯ https://anticdent.org/
freenode: cdent tw: @anticdent