[openstack-dev] [nova] Belated nova newton midcycle recap

Matt Riedemann mriedem at linux.vnet.ibm.com
Mon Aug 1 21:19:23 UTC 2016


It's a little late but I wanted to get a high level recap of the nova 
newton midcycle written up for those that didn't make it.

First off, thanks again to Intel for hosting and especially to Cindy 
Sirianni at Intel for making sure we were taken care of. We had about 40 
people each day in a single room so it was a little cramped but being 
the champions we are we survived.

The full etherpad is here:

https://etherpad.openstack.org/p/nova-newton-midcycle

I won't go into all of the details about every topic because (a) there 
was a lot of discussion and a lot of topics and (b) I honestly didn't 
catch everything, so I'm going to go over the highlights/decisions/todos 
(in no particular order).

* cells v2 progress/status check

The aggregates and server group data migration changes are underway and 
being reviewed. Migrating quotas to the API DB needs work though and 
someone besides Mark Doffman (doffm) will probably need to pick that up.

Only scheduler failures live in cell0, so we talked about how those fit 
into the server list response. We decided that servers without a host 
will be sorted at the front of the list response, and servers with a 
host will be sorted after that. This will need to be documented 
behavior in the API and could be improved later with Searchlight. We 
would like someone to be a point person for interlocking with the 
Searchlight team and we thought Balazs Gibizer (gibi) would be a good 
person for this.
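
To make the ordering rule above concrete, here is a minimal sketch of 
"servers without a host first, servers with a host after". The Server 
class and order_for_listing() are hypothetical names used only for 
illustration; this is not nova's actual listing code.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Server:
    # Hypothetical stand-in for a server record, not nova's Instance object.
    name: str
    host: Optional[str]


def order_for_listing(servers: List[Server]) -> List[Server]:
    # False sorts before True, so servers with no host come first.
    # sorted() is stable, so relative order inside each group is preserved.
    return sorted(servers, key=lambda s: s.host is not None)


servers = [
    Server("a", "compute1"),
    Server("b", None),   # e.g. a scheduling failure that lives in cell0
    Server("c", "compute2"),
    Server("d", None),
]
print([s.name for s in order_for_listing(servers)])
```

Because the sort is stable, whatever ordering was already applied within 
each group survives the grouping step.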

Andrew Laski has a change up for migrating from non-cells to cells v2. 
We want to force people to upgrade to cells v2 in Newton so that we can 
land a breaking change in Ocata to block people that aren't on cells v2 
yet. Doing this is contingent on grenade testing. Dan Smith has the TODO 
to look at the grenade changes. We don't plan on grenade testing cells 
v1 to cells v2. We'll need to get docs changes for upgrades for the 
process of migrating to cells v2. Michael Still (mikal) said we needed 
to open bugs against the docs team for this.

The goal for Newton with cells v2 is that an instance record will not be 
created until we pick a cell and we'll use the BuildRequest until that 
point, and listing/deleting instances during that window will still work 
as normal. For listing instances, we will prepend BuildRequests to the 
front of the list (unsorted). We'll also limit the sort_keys in the API, 
at least to exclude fields on joined tables - that can be done as a bug 
fix.
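
A rough sketch of the listing behavior described above, assuming plain 
dicts in place of the real BuildRequest and instance objects. The 
ALLOWED_SORT_KEYS set and both function names are made up for the 
example; the recap does not say which sort keys will be allowed.

```python
from typing import Dict, List

# Assumed whitelist, for illustration only.
ALLOWED_SORT_KEYS = {"created_at", "display_name", "uuid"}


def check_sort_key(sort_key: str) -> None:
    # Reject keys outside the whitelist, e.g. fields on joined tables.
    if sort_key not in ALLOWED_SORT_KEYS:
        raise ValueError("unsupported sort key: %s" % sort_key)


def list_servers(build_requests: List[Dict], instances: List[Dict],
                 sort_key: str = "created_at") -> List[Dict]:
    check_sort_key(sort_key)
    # Only the instance records are sorted.
    ordered = sorted(instances, key=lambda inst: inst[sort_key])
    # Build requests go on the front in whatever order they arrived.
    return list(build_requests) + ordered


pending = [{"uuid": "br-2", "created_at": 9}, {"uuid": "br-1", "created_at": 8}]
booted = [{"uuid": "i-2", "created_at": 5}, {"uuid": "i-1", "created_at": 3}]
print([s["uuid"] for s in list_servers(pending, booted)])
```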

For RPC/DB context switching, the infrastructure is in place but we 
probably won't use this in Newton. There is a problem with version caps 
and sending a new object to an old cell. There are a few proposed 
solutions and Dan Smith was looking at testing a solution for this, but 
we'll most likely end up documenting it for upgrades.

* API policy in code

Claudiu Belu has a patch up for a nova-manage command to check what APIs 
a given user can perform. This is a first step to eventually getting to 
a discoverable policy CLI and it also provides a debug tool for 
operators when API users get policy errors.

We also said that any command for determining the effective policy of a 
deployment or checking duplicates should live in oslo.policy, not nova, 
since other projects are looking for the same thing, like Ironic. Nova 
wouldn't have a nova-manage command for this but would have an 
entrypoint. We also need to prioritize anything that needs to get into 
oslo.policy so we're not caught by the final non-client library release 
the week of 8/22.

* API docs in tree

Things are slow but that's mostly OK; we'll continue working on this 
past feature freeze since it's docs. We'll probably schedule an 
api-ref docs review sprint early in September after feature freeze hits.

* Proxy API deprecations

We talked quite a bit about how to land the proxy API deprecation and 
network API changes in a single microversion, which actually happened 
with 2.36 last week.

Most of the discussion was around how to handle the network API 
deprecation since if you're using nova-network it's not a proxy. We 
didn't really want to special case the network APIs though, and we wanted the 
additional signaling mechanism that the network APIs, and nova-network, 
are deprecated, so we ultimately decided to include nova-network and all 
network APIs in the 2.36 microversion for deprecation. The sticky thing 
is that today you can request <2.36 and the API still works. After 
nova-network is deleted from code, that will no longer work. Yes this is 
a backward incompatible change, but we wanted the further signaling of 
the removal rather than just yank it outright when the time comes.

To ease some of the client experience, Dan Smith is working a change in 
python-novaclient to deprecate the network CLIs and if requesting 
microversion>=2.36 we'll fall back to 2.35 (or the latest available that 
still makes this work). So the network CLIs will be deprecated and emit 
a warning but continue to work even though API users will not be able to 
request >2.35 of those APIs.
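
The fallback idea can be sketched roughly like this. It is not the 
actual python-novaclient change; the function names and the 
NETWORK_API_MAX constant are assumptions for illustration, and the 
special "latest" version string is not handled.

```python
import warnings
from typing import Tuple

# Last microversion at which the network APIs still respond, per the text above.
NETWORK_API_MAX = (2, 35)


def parse_microversion(version: str) -> Tuple[int, int]:
    # "2.36" -> (2, 36); tuples compare numerically, so (2, 9) < (2, 35).
    major, minor = version.split(".")
    return int(major), int(minor)


def effective_network_version(requested: str) -> str:
    # Hypothetical helper: warn that the CLI is deprecated, then cap the version.
    warnings.warn("network CLIs are deprecated", DeprecationWarning)
    capped = min(parse_microversion(requested), NETWORK_API_MAX)
    return "%d.%d" % capped


for requested in ("2.10", "2.36", "2.37"):
    print(requested, "->", effective_network_version(requested))
```

Comparing parsed tuples rather than raw strings matters here, since a 
string comparison would rank "2.9" above "2.35".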

* API extension folding/removal

There was a concern about losing policy checks and operators controlling 
features in their cloud via policy. For example, you can't disable file 
injection via policy anymore. There is a TODO here to send an email to 
the openstack-operators list as an FYI that we're removing policy for 
server attributes and if you're controlling the API capabilities via 
policy you might have a problem. The goal here is to disable the ability 
to inject things into the API response since that's a barrier to 
interoperability (and also make the code sane in the process).

This work does appear to have stalled a bit though since Sean Dague 
(sdague) has been working on getting oslo.privsep working in our upgrade 
CI jobs.

* Spec for user_id based policy enforcement

https://review.openstack.org/#/c/324068/

We agreed this was a regression and an important operator gap (discussed 
in threads in the operators ML) until we have hierarchical quota support 
in Nova, so we want to move forward with getting the limited support 
added back in Newton. TODOs here for John Garbutt, Andrew Laski and 
myself to review the spec.

* os-vif integration

https://review.openstack.org/#/c/269672/

Jay Pipes and Sean Mooney are happy with the change, but it's blocked on 
sorting out the oslo.privsep/grenade issue (which is nearly complete now).

* oslo.privsep/grenade

This is old news given:

http://lists.openstack.org/pipermail/openstack-dev/2016-July/099705.html

And https://review.openstack.org/#/c/344450/

But to summarize we have a least-terrible path forward on using privsep 
without making upgrade script exceptions in grenade and deployment tools 
for every project and every release that adds new privsep support. Sean 
Dague has been doing a lot of work on getting this done the last two 
weeks and it looks like we should be done by the end of this week:

https://review.openstack.org/#/c/348250/

This will allow nova to use the latest os-brick and os-vif libraries.

* libvirt storage pools

We talked quite a bit about CI coverage for live migration. Timofey 
Durakov was going to look into enabling NFS in our live migration job 
again. We were also going to change the multinode full job to only run 
the 2.1 microversion for live migration tests and let the live migration 
specific job run all of the microversions in the live migration tests 
(this is to avoid some duplication between the jobs).

The libvirt-imagebackend refactor series is still ongoing and massive:

https://review.openstack.org/#/q/topic:libvirt-imagebackend+status:open

I've literally had to request that it be chunked up and not rebased 
all at once because of the strain it puts on infra's CI system. But for 
this we really just need more reviews from the core team. Dan Smith has 
been trying to whip up some of that to keep it moving.

As for the libvirt storage pools spec, Paul Carlton reported that Maxim 
Nestratov had pointed out some issues with the spec regarding ploop 
devices, but that's since been sorted out and Maxim is +1 on the spec:

https://review.openstack.org/#/c/310505/

The deadline for getting the spec approved is this Thursday, 8/4. We 
agreed that the full change isn't going to make Newton at this rate, but 
we can try to get some of the early object code and tests done in Newton 
in parallel to the image backend work so that we have a head start on Ocata.

* Get Me a Network

The only remaining changes for this are the REST API microversion change 
(2.37):

https://review.openstack.org/#/c/316398/

Which is being tested by the Tempest test and devstack change:

https://review.openstack.org/#/c/327901/

But those should be in good shape for Newton.

* Vendor metadata reboot

We agreed that we still wanted mikal to keep working on this:


-- 

Thanks,

Matt Riedemann



