[Openstack-operators] [nova] Queens PTG recap - cells
Matt Riedemann
mriedemos at gmail.com
Sat Sep 16 16:37:43 UTC 2017
The full etherpad for cells discussions at the PTG is here [1].
We mostly talked about the limitations with multiple cells identified in
Pike [2] and priorities.
Top priorities for cells in Queens
----------------------------------
* Alternate hosts: with multiple cells in a tiered (super) conductor
mode, we don't have reschedules happening when a server build fails on a
compute. Ed Leafe has already started working on the code to build an
object to pass from the scheduler to the super conductor. We'll then
send that from the super conductor down to the compute service in the
cell and then reschedules can happen within a cell using that provided
list of alternative hosts (and pre-determined allocation requests for
Placement provided by the scheduler). We agreed that we should get this
done early in Queens so that we have ample time to flush out and fix bugs.
* Instance listing across multiple cells: this is going to involve
sorting the instance lists we get back from multiple cells, which today
are filtered/sorted in each cell and then returned out of the API in a
"barber pole" pattern. We are not going to use Searchlight for this, but
instead do it with more efficient cross-cell DB queries. Dan Smith is
going to work on this.
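
To illustrate the instance listing item above: if each cell returns its results already sorted by the requested key, the API layer can merge those lists into one globally sorted list instead of concatenating them cell by cell. The following is only a sketch of that idea using the Python standard library. The function name, the dict-based instance records and the "created_at" sort key are all assumptions for illustration, not nova's actual implementation.

```python
import heapq
import itertools
from operator import itemgetter


def merge_cell_results(per_cell_results, sort_key="created_at", limit=None):
    """Merge per-cell instance lists into one globally sorted list.

    Hypothetical helper: each inner list is assumed to be already sorted
    ascending by ``sort_key``, as each cell sorts its own results.
    """
    # heapq.merge lazily merges pre-sorted inputs without re-sorting them.
    merged = heapq.merge(*per_cell_results, key=itemgetter(sort_key))
    if limit is None:
        return list(merged)
    # Only pull as many records as the API page size needs.
    return list(itertools.islice(merged, limit))


# Made-up records standing in for instances from two cells.
cell1 = [{"uuid": "a", "created_at": 1}, {"uuid": "c", "created_at": 4}]
cell2 = [{"uuid": "b", "created_at": 2}, {"uuid": "d", "created_at": 3}]

# Concatenating per cell gives the "barber pole" order: a, c, b, d.
barber_pole = [inst["uuid"] for inst in cell1 + cell2]
# Merging gives a single consistently sorted order.
merged_uuids = [inst["uuid"] for inst in merge_cell_results([cell1, cell2])]
first_page = [inst["uuid"] for inst in merge_cell_results([cell1, cell2], limit=2)]
```

Because the merge is lazy, a paginated request only needs to consume enough records from each cell to fill one page.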
Dealing with up-calls
---------------------
In a multi-cell or tiered (super) conductor mode, the cell conductor and
compute services cannot reach the top-level database or message queue.
This breaks a few existing things today.
* Instance affinity reporting from the computes to the scheduler won't
work without the MQ up-call. There is also a late check in the build
process on the compute that verifies server group affinity/anti-affinity
policies are maintained, and that check is an up-call to the API
database. Both of these will be solved long-term when we model
distance in Placement, but we are deferring that from Queens. The late
affinity check in the compute is not an issue if you're running a single
cell (not using a tiered super conductor mode deployment). If you're
running multiple cells, you can configure the cell conductors to have
access to the API database as a workaround. We wouldn't test with this
workaround in CI, but it's an option for people that need it.
* There is a host aggregate up-call when performing live migration with
the xen driver while letting the driver determine whether block
migration should be used. We decided to just put a note in the code that
this doesn't work and leave it as a limitation for that driver and
scenario, which xen driver maintainers or users can fix if they want,
but we aren't going to make it a priority.
* There is a host aggregate up-call when doing boot from volume where the
compute service creates the volume: it checks whether the instance AZ
and volume AZ match when [cinder]/cross_az_attach is False (not the
default). Checking the AZ for the instance involves getting the host
aggregates that the instance is in, and those are in the API database.
We agreed that for now, people running multiple cells and using this
cross_az_attach=False setting can configure the cell conductor to reach
the API database, like the late affinity check described above. Sylvain
Bauza is also looking at reasons why we even do this check if the user
did not request a specific AZ, so there could be other general changes
in the design for this cross_az_attach check later. That is being discussed
here [3].
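
For the workaround mentioned in the affinity and cross_az_attach items above, a minimal sketch of what the cell conductor's configuration might look like follows. It assumes the standard [api_database]/connection option in the nova.conf read by the cell-level nova-conductor. The host name, credentials and database name are placeholders, not values from this thread.

```ini
# nova.conf for the nova-conductor service running inside a cell.
# Normally a cell conductor has no [api_database] connection configured;
# setting one lets the late affinity check and the AZ check reach the
# API database.
[api_database]
# Placeholder URL: substitute your own API database host and credentials.
connection = mysql+pymysql://nova:NOVA_DBPASS@api-db.example.org/nova_api
```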
Other discussion
----------------
* We have a utility to concurrently run database queries against
multiple cells. We are going to look to see if we can retrofit some
linear paths of the code with this utility to improve performance.
* Making the consoleauth service run per-cell is going to be low
priority until some large cells v2 deployments start showing up and
saying that a global consoleauth service is not scaling and it needs to
be fixed.
* We talked about using the "GET /usages" Placement API for counting
quotas rather than iterating that information from the cells, but there
are quite a few open questions about design and edge cases like move
operations and Ironic with custom resource classes. So while this is
something that should make counting quotas perform better, it's
complicated and not a priority for Queens.
* Finally, we also talked about the future of cells v1 and when we can
officially deprecate and remove it. We've already been putting warnings
in the code, docs and config options for a long time about not using
cells v1 and it being replaced with cells v2. *We agreed that if we can
get efficient multi-cell instance listing fixed in Queens, we'll remove
both cells v1 and nova-network in Rocky.* We've been asking that large
cells v1 deployments start checking out cells v2 and what issues they
run into with the transition, at least since the Boston Pike summit, and
so far we haven't gotten any feedback, so we're hoping this timeline
will spur some movement on that front. Dan Smith also called dibs on the
code removal.
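
As an illustration of the first item in this list, the general pattern of running one query against several cells concurrently can be sketched as below. This is a standard-library sketch of the scatter/gather idea only, not the utility in nova. The function name, the dict of per-cell connections and the choice to record failures in the result are assumptions.

```python
from concurrent.futures import ThreadPoolExecutor


def scatter_gather(cells, query, timeout=60):
    """Run ``query(conn)`` against every cell concurrently.

    Hypothetical helper: ``cells`` maps a cell name to whatever handle
    ``query`` needs (for example a database connection). Returns a dict
    of cell name to either the query result or the exception raised.
    """
    results = {}
    with ThreadPoolExecutor(max_workers=max(len(cells), 1)) as pool:
        # Scatter: submit one task per cell so slow cells overlap.
        futures = {name: pool.submit(query, conn) for name, conn in cells.items()}
        # Gather: collect results, keeping failures per cell so that one
        # unreachable cell does not fail the whole request.
        for name, future in futures.items():
            try:
                results[name] = future.result(timeout=timeout)
            except Exception as exc:
                results[name] = exc
    return results


# Usage with made-up data: lists stand in for per-cell databases.
fake_cells = {"cell1": [1, 2, 3], "cell2": [4, 5]}
counts = scatter_gather(fake_cells, len)
```

A linear code path that loops over cells one at a time could, in principle, be replaced by a single call of this shape, with the caller deciding how to treat cells that returned an exception.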
[1] https://etherpad.openstack.org/p/nova-ptg-queens-cells
[2] https://docs.openstack.org/nova/latest/user/cellsv2_layout.html#caveats-of-a-multi-cell-deployment
[3] http://lists.openstack.org/pipermail/openstack-operators/2017-September/014200.html
--
Thanks,
Matt