[openstack-dev] [nova] Belated nova newton midcycle recap
Matt Riedemann
mriedem at linux.vnet.ibm.com
Mon Aug 1 21:19:23 UTC 2016
It's a little late but I wanted to get a high level recap of the nova
newton midcycle written up for those that didn't make it.
First off, thanks again to Intel for hosting and especially to Cindy
Sirianni at Intel for making sure we were taken care of. We had about 40
people each day in a single room so it was a little cramped but being
the champions we are we survived.
The full etherpad is here:
https://etherpad.openstack.org/p/nova-newton-midcycle
I won't go into all of the details about every topic because (a) there
was a lot of discussion and a lot of topics and (b) I honestly didn't
catch everything, so I'm going to go over the highlights/decisions/todos
(in no particular order).
* cells v2 progress/status check
The aggregates and server group data migration changes are underway and
being reviewed. Migrating quotas to the API DB needs work though and
someone besides Mark Doffman (doffm) will probably need to pick that up.
For cell0, only scheduler failures live there, so we talked about how
those fit into the server list response. We decided that servers without
a host will be sorted at the front of the list response, and servers
with a host will be sorted after that. This will need to be documented
behavior in the API and could be improved later with Searchlight. We
would like someone to be a point person for interlocking with the
Searchlight team and we thought Balazs Gibizer (gibi) would be a good
person for this.
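To make the ordering rule concrete, here is a rough sketch (not nova code) of
"servers without a host first, servers with a host after". The dict layout and
the function name sort_servers_for_listing are assumptions made up for
illustration; the real API works on instance objects and DB queries.

```python
def sort_servers_for_listing(servers):
    """Return servers with no host first, then servers that have a host.

    Hypothetical helper: 'servers' is a list of dicts with a 'host' key.
    """
    # sorted() is stable, so servers keep their relative order within
    # each group. False (no host) sorts before True (has a host).
    return sorted(servers, key=lambda s: s.get("host") is not None)


servers = [
    {"name": "vm-a", "host": "compute1"},
    {"name": "vm-b", "host": None},  # e.g. a scheduler failure in cell0
    {"name": "vm-c", "host": "compute2"},
    {"name": "vm-d", "host": None},
]
ordered = sort_servers_for_listing(servers)
```

Using a stable sort means any ordering already applied within each group
survives, which is one simple way to layer this rule on top of existing sorting.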
Andrew Laski has a change up for migrating from non-cells to cells v2.
We want to force people to upgrade to cells v2 in Newton so that we can
land a breaking change in Ocata to block people that aren't on cells v2
yet. Doing this is contingent on grenade testing. Dan Smith has the TODO
to look at the grenade changes. We don't plan on grenade testing cells
v1 to cells v2. We'll need to get docs changes for upgrades for the
process of migrating to cells v2. Michael Still (mikal) said we needed
to open bugs against the docs team for this.
The goal for Newton with cells v2 is that an instance record will not be
created until we pick a cell and we'll use the BuildRequest until that
point, and listing/deleting instances during that window will still work
as normal. For listing instances, we will prepend BuildRequests to the
front of the list (unsorted). We'll also limit the sort_keys in the API,
at least to exclude fields on joined tables - that can be fixed as a
bug fix.
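As an illustration only, the listing behavior described above could look
something like the sketch below. The names ALLOWED_SORT_KEYS,
validate_sort_keys and list_servers are hypothetical, as are the dict-based
records and the particular set of allowed keys; none of this is the actual
nova implementation.

```python
# Hypothetical whitelist: only columns on the instances table itself,
# nothing that would require sorting on a joined table.
ALLOWED_SORT_KEYS = {"created_at", "display_name", "uuid"}


def validate_sort_keys(sort_keys):
    """Reject sort keys outside the (assumed) whitelist."""
    rejected = [k for k in sort_keys if k not in ALLOWED_SORT_KEYS]
    if rejected:
        raise ValueError("unsupported sort keys: %s" % ", ".join(rejected))
    return list(sort_keys)


def list_servers(build_requests, instances, sort_key="created_at"):
    """Prepend unsorted build requests to the sorted instance list."""
    validate_sort_keys([sort_key])
    sorted_instances = sorted(instances, key=lambda i: i[sort_key])
    # Build requests have not been assigned to a cell yet, so they go
    # at the front in whatever order they were fetched.
    return list(build_requests) + sorted_instances


build_requests = [{"uuid": "br-2"}, {"uuid": "br-1"}]
instances = [
    {"uuid": "inst-b", "created_at": 2},
    {"uuid": "inst-a", "created_at": 1},
]
listing = list_servers(build_requests, instances)
```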
For RPC/DB context switching, the infrastructure is in place but we
probably won't use this in Newton. There is a problem with version caps
and sending a new object to an old cell. There are a few proposed
solutions and Dan Smith was looking at testing a solution for this, but
we'll most likely end up documenting it for upgrades.
* API policy in code
Claudiu Belu has a patch up for a nova-manage command to check what APIs
a given user can perform. This is a first step to eventually getting to
a discoverable policy CLI and it also provides a debug tool for
operators when API users get policy errors.
We also said that any command for determining the effective policy of a
deployment or checking duplicates should live in oslo.policy, not nova,
since other projects are looking for the same thing, like Ironic. Nova
wouldn't have a nova-manage command for this but would have an
entrypoint. We also need to prioritize anything that needs to get into
oslo.policy so we're not caught by the final non-client library release
the week of 8/22.
* API docs in tree
Things are slow but that's mostly OK; we'll continue working on this
past feature freeze since it's docs. And we'll probably schedule an
api-ref docs review sprint early in September after feature freeze hits.
* Proxy API deprecations
We talked quite a bit about how to land the proxy API deprecation and
network API changes in a single microversion, which actually happened
with 2.36 last week.
Most of the discussion was around how to handle the network API
deprecation since if you're using nova-network it's not a proxy. We
didn't really want to special-case the network APIs though, and we wanted the
additional signaling mechanism that the network APIs, and nova-network,
are deprecated, so we ultimately decided to include nova-network and all
network APIs in the 2.36 microversion for deprecation. The sticky thing
is that today you can request <2.36 and the API still works. After
nova-network is deleted from code, that will no longer work. Yes this is
a backward incompatible change, but we wanted the further signaling of
the removal rather than just yank it outright when the time comes.
To ease some of the client experience, Dan Smith is working a change in
python-novaclient to deprecate the network CLIs and if requesting
microversion >= 2.36 we'll fall back to 2.35 (or the latest available that
still makes this work). So the network CLIs will be deprecated and emit
a warning but continue to work even though API users will not be able to
request >2.35 of those APIs.
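For anyone trying to picture the client-side fallback, here is a minimal
sketch. It is not the python-novaclient change itself; the names
parse_microversion, LAST_NETWORK_MICROVERSION and microversion_for_network_cli
are made up for this example, and it assumes versions are plain "X.Y" strings
(it does not handle "latest" or negotiate with what the server supports).

```python
import warnings

# 2.36 deprecates the network APIs, so 2.35 is the last one that works.
LAST_NETWORK_MICROVERSION = "2.35"


def parse_microversion(version):
    """Turn 'X.Y' into a tuple of ints so versions compare numerically."""
    major, minor = version.split(".")
    return int(major), int(minor)


def microversion_for_network_cli(requested):
    """Pick the microversion a deprecated network CLI should send."""
    # The CLI is deprecated regardless of which microversion is requested.
    warnings.warn("network CLIs are deprecated", DeprecationWarning)
    cap = parse_microversion(LAST_NETWORK_MICROVERSION)
    if parse_microversion(requested) > cap:
        return LAST_NETWORK_MICROVERSION
    return requested
```

Comparing parsed tuples rather than raw strings matters here, since a string
comparison would wrongly treat "2.9" as newer than "2.35".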
* API extension folding/removal
There was a concern about losing policy checks and operators controlling
features in their cloud via policy. For example, you can't disable file
injection via policy anymore. There is a TODO here to send an email to
the openstack-operators list as an FYI that we're removing policy for
server attributes and if you're controlling the API capabilities via
policy you might have a problem. The goal here is to disable the ability
to inject things into the API response since that's a barrier to
interoperability (and also make the code sane in the process).
This work does appear to have stalled a bit though since Sean Dague
(sdague) has been working on getting oslo.privsep working in our upgrade
CI jobs.
* Spec for user_id based policy enforcement
https://review.openstack.org/#/c/324068/
We agreed this was a regression and an important operator gap (discussed
in threads in the operators ML) until we have hierarchical quota support
in Nova, so we want to move forward with getting the limited support
added back in Newton. TODOs here for John Garbutt, Andrew Laski and
myself to review the spec.
* os-vif integration
https://review.openstack.org/#/c/269672/
Jay Pipes and Sean Mooney are happy with the change, but it's blocked on
sorting out the oslo.privsep/grenade issue (which is nearly complete now).
* oslo.privsep/grenade
This is old news given:
http://lists.openstack.org/pipermail/openstack-dev/2016-July/099705.html
And https://review.openstack.org/#/c/344450/
But to summarize we have a least-terrible path forward on using privsep
without making upgrade script exceptions in grenade and deployment tools
for every project and every release that adds new privsep support. Sean
Dague has been doing a lot of work on getting this done the last two
weeks and it looks like we should be done by the end of this week:
https://review.openstack.org/#/c/348250/
This will allow nova to use the latest os-brick and os-vif libraries.
* libvirt storage pools
We talked quite a bit about CI coverage for live migration. Timofey
Durakov was going to look into enabling NFS in our live migration job
again. We were also going to change the multinode full job to only run
the 2.1 microversion for live migration tests and let the live migration
specific job run all of the microversions in the live migration tests
(this is to avoid some duplication between the jobs).
The libvirt-imagebackend refactor series is still ongoing and massive:
https://review.openstack.org/#/q/topic:libvirt-imagebackend+status:open
I've literally had to request that it be chunked up and not rebased
altogether because of the strain it puts on infra's CI system. But for
this we really just need more reviews from the core team. Dan Smith has
been trying to whip up some of that to keep it moving.
As for the libvirt storage pools spec, Paul Carlton reported that Maxim
Nestratov had pointed out some issues with the spec regarding ploop
devices, but that's since been sorted out and Maxim is +1 on the spec:
https://review.openstack.org/#/c/310505/
The deadline for getting the spec approved is this Thursday, 8/4. We
agreed that the full change isn't going to make Newton at this rate, but
we can try to get some of the early object code and tests done in Newton
in parallel to the image backend work so that we have a head start on Ocata.
* Get Me a Network
The only remaining changes for this are the REST API microversion change
(2.37):
https://review.openstack.org/#/c/316398/
Which is being tested by the Tempest test and devstack change:
https://review.openstack.org/#/c/327901/
But those should be in good shape for Newton.
* Vendor metadata reboot
We agreed that we still wanted mikal to keep working on this:
--
Thanks,
Matt Riedemann