[nova][ptl][election] Start the battle

Eric Fried openstack at fried.cc
Wed Mar 13 21:25:34 UTC 2019

Thanks for opening this up, Jens; and thanks Chris and Matt for crisping
up the ask.

> what have you done in the last six
> months to demonstrate that you're not only available but helping, or
> better yet, leading to push the bigger things we're working on as a
> team, which could be stuff from our cycle themes [1] or the less
> tangible stuff like the placement extraction work, and how do you plan
> to continue that for the next six months?

In my view, the main [1] vehicle of progress in Nova is Placement. I
have been working tirelessly for the past six (or 18) months to help
shape Placement into being able to support major Nova use cases; and I
have been working likewise in Nova to take advantage of that functionality.

For quite a while, Placement was adding features far faster than Nova
could use them. Around Queens, we started putting a bunch of framework
in Nova to *prepare* for making use of things like nested providers
[2][3], but it has only been in Stein that Nova features have been
implemented which actually use them. This includes VGPU, which prompted
the reshaper effort [4]; and bandwidth resource providers [5], a huge
multi-project multi-cycle effort that is finally landing.

This is just the tip of the iceberg. We have a lot of "Placement
exploitation" work remaining, including (not a complete list, and in no
particular order):

- Making sharing providers work. Famously, if your host's root disk
resource lives on shared storage, the reporting of the amount of disk in
your cloud is wrong by a factor of <number of hosts sharing that
storage>. This was one of the main reasons Placement was created. We
made a good effort to fix a piece of this in Rocky [6], but didn't
quite get there. This is something I'd like to see finally get some
real traction in Train.
- Modeling NUMA. While nested providers seem to be ideal for
representing resources affined to NUMA cells, a) we haven't done it yet
(hopefully [7] will be a solid step in that direction); and b) there are
still some design gaps, such as inter-provider affinity, that we have
discussed repeatedly but never closed on. I would like to be able to
move those forward and break the "analysis paralysis".
- Accelerators, or more generally, "PCI Passthrough should DIAF". We're
reaching tentative tentacles into representing accelerators in Placement
(cf. VGPUs above), but a more generic and far-reaching solution is
called for. I want to push for the adoption of Cyborg - another thing
we've discussed several times, and whose thrash cycle needs breaking -
as a big step in that direction.
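
To put a number on the shared-storage point in the first item above, here
is a minimal sketch of the arithmetic. It is not Nova or Placement code;
the function names and the figures (10 hosts, a 1000 GB pool) are made up
purely for illustration.

```python
# Illustration only: hypothetical helpers and made-up numbers,
# not anything taken from Nova or Placement.

def reported_disk_gb(num_hosts, shared_pool_gb):
    """Cloud-wide disk total when every host reports the shared pool as its own."""
    return num_hosts * shared_pool_gb


def actual_disk_gb(shared_pool_gb):
    """Cloud-wide disk total when the shared pool is counted exactly once."""
    return shared_pool_gb


hosts = 10        # assumed number of hosts backed by the same storage
pool_gb = 1000    # assumed size of the shared pool

# The over-report factor equals the number of hosts sharing the storage.
overcount = reported_disk_gb(hosts, pool_gb) // actual_disk_gb(pool_gb)
print(overcount)
```

Counting the pool once, as a single provider that the hosts share, is what
removes the multiplier.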

Hopefully the above goes some way toward exemplifying my record of and
commitment to:
- Past: Making Placement and Nova ready for...
- Present: Significant progress on real features; and
- Future: Breaking the cycle of design churn to make forward progress.
This is really important. Some problems are hard, and don't have perfect
solutions. We need to be able to reach a consensus (which often does
*not* mean unanimous agreement) on which imperfect solution is least
painful and DO IT.


[1] obviously not the only; lots of the priority/themed work such as
cells touches Placement marginally or not at all
[2] https://review.openstack.org/#/q/topic:bp/nested-resource-providers
[3] https://review.openstack.org/#/q/topic:bp/granular-resource-requests
[4] https://review.openstack.org/#/q/topic:bp/reshape-provider-tree
[5] https://review.openstack.org/#/q/topic:bp/bandwidth-resource-provider
[6] https://review.openstack.org/#/c/560459/
[7] https://review.openstack.org/#/c/552924/
