[placement] update 19-08

Chris Dent cdent+os at anticdent.org
Fri Mar 1 13:19:17 UTC 2019


HTML: https://anticdent.org/placement-update-19-08.html

Welcome back to the placement update. If I've read the signs
correctly, I should now be back to this as a regular thing.
Apologies for the gap, I had to attend to some other
responsibilities.

# Most Important

A lot has changed in the past few months, so it's hard to single out
one most important thing; it will depend on who is reading. Review
"What's Changed" below for a summary of the important stuff.

# What's Changed

* Placement is now its own official project. Until elections are
   held (it looks like nominations start this coming Tuesday), Mel is
   the PTL.

* [Setting up storyboard](https://review.openstack.org/#/c/639445/)
   for placement-related projects is in progress. For the time being
   we are continuing to use launchpad for most tracking. See a
   [related email
   thread](http://lists.openstack.org/pipermail/openstack-discuss/2019-February/003102.html).

* Deleting placement code from nova has been put on hold until Train
   to make it easier for certain types of upgrades to happen. New
   installs should prefer the extracted code, as the nova-side is
   frozen, but the placement side is not.

* A large stack of code to remove oslo.versionedobjects from
   placement has merged. This has resulted in a significant change
   in performance on the `perfload` test that runs in the gate. While
   not a complete representation of the entire system, it's enough to
   say "yeah, that was worth it": A request for allocation candidates
   that used to take around 2.5 seconds now takes 1.2 seconds. That
   refactoring continues (see below), seeking additional
   simplifications.

* Microversion 1.31 adds `in_tree` and `in_treeN` query parameters
   to GET /allocation_candidates. This is useful in a variety of
   nested resource provider scenarios, including the big bandwidth QoS
   changes that are in progress in nova and neutron.

* Placement is now publishing [install docs](https://docs.openstack.org/placement/latest/install/)
   but it is important to note that, as far as I'm aware, those docs
   have not yet been validated by the packagers. That validation
   still needs to happen.

* os-resource-classes 0.3.0 has been
   [released](https://pypi.org/p/os-resource-classes) with a
   `normalize_name` function.

* There are some pending specs from nova which are primarily
   placement feature specs. We'll continue with those as is (see
   below), but come the next cycle the plan is to manage specs in the
   placement repo, not have a separate repo, and not have separate
   spec cores.
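
To give a feel for the `normalize_name` function mentioned in the
os-resource-classes item above, here is a rough local stand-in. It is
a sketch, not the library's code: the function name
`normalize_name_sketch` is made up, and the rules it applies
(non-alphanumerics become underscores, the result is upper-cased and
given a `CUSTOM_` prefix) are my assumption about what such a helper
does. Check the released library for the real behavior.

```python
import re

# Assumption: custom resource classes live in a "CUSTOM_" namespace.
CUSTOM_NAMESPACE = "CUSTOM_"


def normalize_name_sketch(name):
    """Hypothetical stand-in for a resource class name normalizer.

    Replaces each run of non-alphanumeric characters with a single
    underscore, upper-cases the result, and adds the custom prefix.
    """
    if name is None:
        return None
    cleaned = re.sub(r"[^0-9A-Za-z]+", "_", name)
    return CUSTOM_NAMESPACE + cleaned.upper()


if __name__ == "__main__":
    print(normalize_name_sketch("my-gpu.v2"))
```

The point of a helper like this is that callers (a virt driver, say)
can turn whatever free-form name they have into something placement
will accept as a custom resource class, without each caller inventing
its own rules.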

# Specs/Blueprints/Features

## Near to Done

* [Filter Allocation Candidates by Provider Tree](http://specs.openstack.org/openstack/nova-specs/specs/stein/approved/alloc-candidates-in-tree.html)
   has been mostly completed by Tetsuro, but there's a [pending
   update to the spec](https://review.openstack.org/639033).
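
As an illustration of the `in_tree` and `in_treeN` parameters, the
following is a minimal sketch that only builds the query string for
`GET /allocation_candidates`; it does not talk to a real service. The
helper name `build_query`, its arguments, and the UUID-ish values are
all made up for the example. The request would also need to be sent
at microversion 1.31 or later (the `OpenStack-API-Version: placement
1.31` header).

```python
from urllib.parse import urlencode

# Header a client would send to opt in to the microversion with in_tree.
MICROVERSION_HEADER = {"OpenStack-API-Version": "placement 1.31"}


def build_query(resources, in_tree=None, numbered_groups=None):
    """Build a query string for GET /allocation_candidates.

    resources: dict of resource class -> amount (unnumbered group).
    in_tree: optional provider UUID limiting the unnumbered group
        to one provider tree.
    numbered_groups: optional dict mapping a suffix such as "1" to a
        (resources dict, in_tree UUID or None) tuple.
    """
    def fmt(res):
        return ",".join(f"{rc}:{amt}" for rc, amt in sorted(res.items()))

    params = [("resources", fmt(resources))]
    if in_tree:
        params.append(("in_tree", in_tree))
    for suffix, (res, tree) in sorted((numbered_groups or {}).items()):
        params.append((f"resources{suffix}", fmt(res)))
        if tree:
            params.append((f"in_tree{suffix}", tree))
    # Keep ":" and "," readable instead of percent-encoding them.
    return urlencode(params, safe=":,")


if __name__ == "__main__":
    query = build_query(
        {"VCPU": 1},
        numbered_groups={"1": ({"SRIOV_NET_VF": 1}, "example-rp-uuid")},
    )
    print("/allocation_candidates?" + query)
```

The sketch sticks to a single numbered group on purpose. Requests
with several numbered groups have additional requirements that are
not shown here.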

## Not yet Done

* [Support filtering by forbidden aggregate membership](http://specs.openstack.org/openstack/nova-specs/specs/stein/approved/negative-aggregate-membership.html)
* [Support any traits in allocation_candidates
   query](http://specs.openstack.org/openstack/nova-specs/specs/stein/approved/placement-any-traits-in-allocation_candidates-query.html) 
* [Support mixing required traits with any
   traits](http://specs.openstack.org/openstack/nova-specs/specs/stein/approved/placement-mixing-required-traits-with-any-traits.html)

## Not yet Approved

* [Update alloc-candidates-in-tree](https://review.openstack.org/#/c/639033/)
   updates the in-tree spec above to reflect what was learned while
   doing the actual implementation, notably how numbered `in_tree`
   parameters impact results.

* [Resource provider - request group mapping in allocation candidate](https://review.openstack.org/#/c/597601/)
   has had a recent resurgence in attention.

# Bugs

* Placement-related [bugs not yet in progress](https://goo.gl/TgiPXb): 15.
* [In progress placement bugs](https://goo.gl/vzGGDQ): 17.

# osc-placement

osc-placement is currently behind by 14 microversions.

Code for 1.18 is [under review](https://review.openstack.org/#/c/639738/).

# Main Themes

This section now overlaps a bit with the Specs/Features bit above.
This will settle out with a bit more clarity as we move along.

## Nested

* Reshaper handling in nova keeps exposing additional things that
   need to be remembered on the nova-side, so there are a few patches
   remaining related to [vgpu
   reshaping](https://review.openstack.org/#/q/topic:bp/reshape-provider-tree+status:open)
   but it is mostly ready.

* The bandwidth-resource-provider topic has merged a vast amount of
   code but there is still [plenty
   left](https://review.openstack.org/#/q/topic:bp/bandwidth-resource-provider).

Related to all this nested stuff: The complex hardware models that
drove the development of the nested resource provider system are
challenging to test. The cloud hardware provided to OpenStack
infrastructure does not expose the hardware that would allow real
integration tests. If anyone reading this is in a position to
provide third party CI with fancy hardware for NUMA, NFV, FPGA, and
GPU related integration testing with nova, there's a significant
need for that.

## Refactoring

(I think refactoring should be a constant theme. To reflect that,
I'm going to have a section here. Editorial privilege or something.)

There's a collection of patches in progress, currently under the
topic
[scrub-Lists](https://review.openstack.org/#/q/topic:scrub-Lists)
that is a follow up to the patches that removed oslo versioned
objects. That work pointed out some opportunities to DRY-up the
List classes (e.g., UsageList) to remove some duplication and
simplify. Then, after looking at that, it became clear that entirely
removing the List classes, in favor of using python native lists,
would further simplify the code.
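
To make the shape of that simplification concrete, here is a small
before-and-after sketch. It is not the placement code: the `Usage`
fields, the `get_all` and `get_all_usages` names, and the row format
are hypothetical, chosen only to show a List wrapper class next to a
function that returns a native list.

```python
from dataclasses import dataclass


@dataclass
class Usage:
    # Hypothetical fields, for illustration only.
    resource_class: str
    usage: int


# Before: a wrapper class that mostly re-implements list behavior.
class UsageList:
    def __init__(self, objects=None):
        self.objects = list(objects or [])

    def __len__(self):
        return len(self.objects)

    def __iter__(self):
        return iter(self.objects)

    def __getitem__(self, index):
        return self.objects[index]

    @classmethod
    def get_all(cls, rows):
        return cls(Usage(**row) for row in rows)


# After: a plain function returning a native Python list.
def get_all_usages(rows):
    return [Usage(**row) for row in rows]


if __name__ == "__main__":
    rows = [{"resource_class": "VCPU", "usage": 2}]
    print(list(UsageList.get_all(rows)) == get_all_usages(rows))
```

Both versions hand back the same objects, but the second has no
boilerplate to keep in sync across every List class, and callers get
real list behavior for free.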

Apart from the previously mentioned performance and simplicity
benefits of these changes, they've also managed to expose and fix a
few bugs, simply because we were looking at things and moving them
around. If you pick up rocks, you can see the bugs and squash them.
If you don't, they breed.

# Other Placement

* <https://review.openstack.org/#/q/topic:improve-debug-log>
   A series of improvements leading to a better debug log when
   retrieving allocation candidates.

* <https://review.openstack.org/#/c/639628/>
   Docs: extract testing info to own sub-page

* <https://review.openstack.org/#/q/topic:cd/gabbi-tempest-job>
   Gabbi-based integration tests of placement. These recently found a
   bug that none of the functional, grenade, or tempest tests caught.

* <https://review.openstack.org/#/c/619050/>
   Optionally migrate database at service startup (so you don't have
   to run `placement-manage db sync` if you don't want to).

* <https://review.openstack.org/#/c/630216/>
   Add a vision-reflection (of the Technical Vision doc).


# Other Service Users

## Nova

See also the several links above for more nova changes. Also, I'm a
bit behind on my tracking in this area, so there is likely plenty of
other stuff too. This will improve over time.

* <https://review.openstack.org/538498>
   Convert driver supported capabilities to compute node provider traits

* <https://review.openstack.org/621494>
   Add descriptions of numbered resource classes and traits

* <https://review.openstack.org/636412>
   Make move_allocations handle empty source allocations
   (Part of a series on cross-cell resize)

* <https://review.openstack.org/#/q/topic:bp/count-quota-usage-from-placement>
   Using placement (from nova) for counting (some of) quota.

## Not Nova

* <https://review.openstack.org/#/q/topic:tripleo-nova-placement-removal>

* <https://review.openstack.org/#/q/topic:tripleo-placement-extraction>

* <https://review.openstack.org/#/q/topic:minimum-bandwidth-allocation-placement-api>
   Neutron side of minimum bandwidth.

* <https://review.openstack.org/#/q/topic:puppet-placement-extraction>

* <https://review.openstack.org/#/q/bp/no-affinity-instance-reservation>
   Blazar reservation handling, including some manipulation of
   inventory in placement.

* <https://review.openstack.org/633204>
   Blazar: Retry on inventory update conflict

# End

Though this is long, it doesn't really bring us fully up to date. If
something is missing that you think is important please let me know.
Once I'm back in the flow it should become increasingly complete.

-- 
Chris Dent                       ٩◔̯◔۶           https://anticdent.org/
freenode: cdent                                         tw: @anticdent

