[openstack-dev] [nova] Austin summit cells v2 session recap

Matt Riedemann mriedem at linux.vnet.ibm.com
Tue May 3 01:32:05 UTC 2016


Andrew Laski led a double session for cells v2 on Wednesday afternoon. 
The full session etherpad is here [1].

Andrew started with an overview of what's done and what's in progress. 
Note that some of the background on cells, what's been completed for 
cells v2 and what's being worked on is also in a summit video from a 
user conference talk that Andrew gave [2].

We agreed to add the MQ switching [3] to get_cell_client and see what, 
if anything, breaks.

DB migrations
-------------

We had a quick rundown on the database tables slated for migration to 
the API database. Notable items for the DB table migrations:

* Aggregates and quotas will be migrated; there are specs up for both of 
these from Mark Doffman.
* The nova-network related tables won't be migrated since we've 
deprecated nova-network.
* The agent_builds table won't be migrated. We plan on deprecating this 
API since it's only used by the XenAPI virt driver and it sounds like 
Rackspace doesn't even use/enable it.
* We have to figure out what to do about the certificates table. The 
only thing using it is the os-certificates REST API and nova-cert 
service, and nothing in tree is using either of those now. The problem 
is that the ec2api repo on GitHub is using the nova-cert RPC API directly 
for S3 image download. So we need to figure out if we can move that into 
the ec2api repo and drop it from Nova, or find some other solution.
* Keypairs will be migrated to the API DB. There was a TODO about 
needing to store the keypair type in the instance. I honestly can't 
remember exactly what that was for; I'm hoping Andrew remembers.
* We agreed to move instance_groups and instance_group_policy to the API 
DB, but there is a TODO to sort out if instance_group_members should be 
in the API DB.

nova-network
------------

For nova-network we agreed that we'll fail hard if someone tries to add 
a second cell to a cells v2 deployment and they aren't using Neutron.
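
As a rough illustration of the "fail hard" behavior, here is a minimal 
sketch. Everything in it is hypothetical: the exception name, the 
function name, and the assumption that the count covers regular cells 
only (not cell0). It is not code from Nova.

```python
class MultipleCellsNotSupported(Exception):
    """Hypothetical error raised when a second cell is added without Neutron."""


def check_can_add_cell(existing_cell_count, use_neutron):
    """Refuse to add another cell when nova-network is in use.

    existing_cell_count is assumed to count regular cells, not cell0.
    """
    if existing_cell_count >= 1 and not use_neutron:
        raise MultipleCellsNotSupported(
            "Adding a second cell requires Neutron; nova-network is "
            "only supported with a single cell.")
```

The first cell is always allowed, so existing nova-network deployments 
could still move to a cell of one.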

Testing
-------

Chuck Carmack is working on some test plans for cells v2. There would be 
a multi-node/cell job where one node is running the API and cell0 and 
another is running a regular nova cell. There would also be migration 
testing as part of grenade.

Documentation
-------------

We discussed what needs to be documented and where it should live.

Since all deployments will at least be a cell of one, setting that up 
will be in the install guide in docs.o.o. A multi-cell deployment would 
be documented in the admin guide.

Anything related to the call path flow for different requests would live 
in the nova developer documentation (devref).

Pagination
----------

This took a significant portion of the second cells v2 session and is 
one of the more complicated problems to sort out. Listing all instances 
across all cells is hard, especially when we support sorting. We also 
have a latent bug in the API: we never restricted the list of valid sort 
keys for listing instances, so you can literally sort on anything in the 
instances table in the DB.

There were some ideas about how to handle this:

1. Don't support sorting in the API if you have multiple cells. Leave it 
up to the caller to sort the results on their own. Obviously this isn't 
a great solution for existing users that rely on this in the API.

2. Each cell sorts the results individually, and the API merge sorts the 
results from the cells. There is still overhead here.

3. Don't split the database, or use a distributed database like Redis. 
Since this wasn't brought up in person in the session, or on Friday, it 
wasn't discussed. There is another thread about this though [4].

4. Use the OpenStack Searchlight project for doing massive queries like 
this. This would be optional for a cell of one but recommended/required 
for anyone running multiple cells. The downside to this is it's a new 
dependency, and requires Elasticsearch (but many deployments are 
probably already running an ELK stack for monitoring their logs). It's 
also unclear at an early stage how easy this would be to integrate into 
Nova. Plus deployers would need to set up Searchlight to listen to 
notifications emitted from Nova so the indexes are updated in ES. It is, 
however, arguably a better tool for the job than Nova trying to deal 
with filtering and sorting in Python. There is general agreement 
within the core team that this is the path forward, but it's going to 
require investigation and testing before we get a better idea of how 
feasible this is.
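
To make option 2 above a bit more concrete, here is a minimal sketch of 
merging per-cell results in the API layer. It assumes each cell has 
already returned its instances sorted on the same key and in the same 
direction; the function name and the dict-shaped records are made up for 
illustration and are not Nova code.

```python
import heapq
import itertools
from operator import itemgetter


def merge_sorted_cells(per_cell_results, sort_key, reverse=False, limit=None):
    """Merge per-cell result lists that are each already sorted.

    per_cell_results: one list of dicts per cell, each sorted on sort_key
    in the direction given by reverse.
    """
    merged = heapq.merge(*per_cell_results, key=itemgetter(sort_key),
                         reverse=reverse)
    if limit is None:
        return list(merged)
    # Only pull as many records as the page needs.
    return list(itertools.islice(merged, limit))
```

The merge itself is lazy, but every cell still has to be queried and 
has to return up to a full page of results, which is part of the 
overhead mentioned in option 2.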

Related to paging, we also have an existing problem with the marker that 
will need to be sorted out before we can support multiple cells with v2. 
Flavors are now in the API DB, and we return a marker for paging, but it 
doesn't have the cell context, so we have to work that in. The good news 
is we control the marker and we never documented anywhere that it's a 
specific resource uuid (although for instances it is the last instance 
uuid processed). So this is fixable, but is a known issue right now.
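
One way the cell context could be worked into the marker, purely as a 
sketch: since the marker is opaque to callers, it could carry both a 
cell identifier and the last instance uuid. The encoding below (JSON 
wrapped in URL-safe base64) and the function names are assumptions for 
illustration; nothing like this was agreed on in the session.

```python
import base64
import json


def encode_marker(cell_id, instance_uuid):
    """Pack the cell identifier and last instance uuid into one opaque string."""
    payload = json.dumps({"cell": cell_id, "uuid": instance_uuid})
    return base64.urlsafe_b64encode(payload.encode("utf-8")).decode("ascii")


def decode_marker(marker):
    """Recover (cell_id, instance_uuid) from a marker built by encode_marker."""
    raw = base64.urlsafe_b64decode(marker.encode("ascii")).decode("utf-8")
    payload = json.loads(raw)
    return payload["cell"], payload["uuid"]
```

Existing clients that pass back a bare instance uuid as the marker would 
still need to be handled somehow, which this sketch does not address.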

Quotas
------

Jay Pipes has an idea about a generation ID for quotas, but it wasn't 
fleshed out in the session, so TBD.

Upgrade process
---------------

We didn't get into this too much. People are generally in agreement on 
what's already planned for the upgrade process from a non-cells v1 
deployment to cells v2. Andrew covers some of the proposed commands for 
upgrades in his presentation [2]. We hope to build into oslo.messaging 
the ability to construct a transport_url from config options so that the 
transport_url doesn't have to be provided to the migration commands; we 
can just figure it out automatically.
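
For a sense of what constructing a transport_url from config options 
might look like, here is a small sketch. It takes a plain dict rather 
than real oslo.config objects, and the option names are modeled on the 
legacy per-option RabbitMQ settings; treat both the names and the 
function as assumptions, not as the oslo.messaging API.

```python
from urllib.parse import quote


def build_transport_url(opts):
    """Build a rabbit:// transport URL from individual connection options.

    opts is a plain dict standing in for loaded config values.
    """
    # Percent-encode credentials and vhost so characters such as '@' or '/'
    # cannot be mistaken for URL delimiters.
    user = quote(opts["rabbit_userid"], safe="")
    password = quote(opts["rabbit_password"], safe="")
    vhost = quote(opts["rabbit_virtual_host"], safe="")
    return "rabbit://%s:%s@%s:%s/%s" % (
        user, password, opts["rabbit_host"], opts["rabbit_port"], vhost)
```

A migration command could then fall back to something like this when no 
transport_url is given on the command line.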

There is a rough plan for upgrading from cells v1 to v2 in a docs patch 
from Andrew [5].

REST API for managing cells resources
-------------------------------------

This came up at the very end of the session: operators need a way to 
manage cells resources like they can for hosts. This would ideally be a 
REST API but could start as a nova-manage command for the initial version.

[1] https://etherpad.openstack.org/p/newton-nova-cells
[2] https://www.openstack.org/videos/video/nova-cells-v2-whats-going-on
[3] https://review.openstack.org/#/c/298551/
[4] http://lists.openstack.org/pipermail/openstack-dev/2016-April/093151.html
[5] https://review.openstack.org/#/c/267153/

-- 

Thanks,

Matt Riedemann



