Open Stack

Tue Feb 2 21:14:54 UTC 2016

War and Peace
or
Notes from the Cinder Mitaka Midcycle Sprint
January 26-29

Etherpads from discussions:
* https://etherpad.openstack.org/p/mitaka-cinder-midcycle-day-1
* https://etherpad.openstack.org/p/mitaka-cinder-midcycle-day-2
* https://etherpad.openstack.org/p/mitaka-cinder-midcycle-day-3
* https://etherpad.openstack.org/p/mitaka-cinder-midcycle-day-4

*Topics Covered*
================
In no particular order...

Disable Old Volume Types
========================
There was a request from an end user to have a mechanism to disable
a volume type as part of a workflow for progressing from a beta to
production state.

From what was known of the request, there was some confusion as to
whether the desired use case couldn't be met with the existing
functionality. It was decided nothing would be done for this until
more input is received explaining what is needed and why it cannot
be done as it is today.

User Provided Encryption Keys for Volume Encryption
===================================================
The question was raised as to whether we want to allow user specified
keys. Google has something today where this key can be passed in
headers.

There was some concern with doing this, both from a security and an
amount of work perspective. It was ultimately agreed this was a
better fit for a cross project discussion.

Adding a Common cinder.conf Setting for Suppressing SSL Warnings
================================================================
Log files get a TON of warnings when using a driver that uses the
requests library internally for communication and you do not have
a signed valid certificate. Some drivers have gotten around this
by implementing their own settings for disabling these warnings.

The question was raised that although not all drivers use requests,
and therefore are not affected by this, should we still have a common
config setting to disable these warnings for those drivers that do use
it.

Different approaches to disabling this will be explored. As long as
it is clear what the option does, we were not opposed to this.
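
As a rough illustration of what a single shared switch could look
like, here is a minimal sketch using only Python's standard warnings
module. The function names and the boolean flag are hypothetical, and
the warning class below is a local stand-in; with the requests library
the category to filter would be
urllib3.exceptions.InsecureRequestWarning.

```python
import warnings


class InsecureRequestWarning(Warning):
    """Local stand-in for urllib3.exceptions.InsecureRequestWarning."""


def apply_ssl_warning_policy(suppress, category=InsecureRequestWarning):
    """Install one warnings filter for `category` from a shared flag.

    `suppress` plays the role of a hypothetical common cinder.conf
    boolean; it is not an existing option name.
    """
    action = "ignore" if suppress else "always"
    warnings.filterwarnings(action, category=category)


def count_emitted(suppress, attempts=3):
    """Emit `attempts` warnings and return how many were recorded."""
    with warnings.catch_warnings(record=True) as caught:
        # Start from a clean filter list; it is restored on exit.
        warnings.resetwarnings()
        apply_ssl_warning_policy(suppress)
        for _ in range(attempts):
            warnings.warn("Unverified HTTPS request",
                          InsecureRequestWarning)
    return len(caught)
```

Filtering on one warning category keeps the option's effect narrow,
which fits the requirement that it be clear what the option does.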

Nested Quotas
=============
The current nested quota enforcement is badly broken. There are many
scenarios that just do not work as expected. There is also some
confusion around how nested quotas should work. Things like setting
-1 for a child quota do not work as expected and things are not
properly enforced during volume creation.
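
To make the problem concrete, here is a minimal sketch of one possible
reading of nested enforcement. It assumes that -1 means a project has
no limit of its own and is bounded only by its ancestors; that
interpretation, and every name below, is an assumption for
illustration and not the agreed Cinder behavior.

```python
UNLIMITED = -1  # assumed meaning: no limit at this level


class Project:
    """One node in a hypothetical project hierarchy."""

    def __init__(self, name, limit, parent=None):
        self.name = name
        self.limit = limit    # gigabytes, or UNLIMITED
        self.parent = parent
        self.used = 0         # gigabytes used by this project alone
        self.children = []
        if parent is not None:
            parent.children.append(self)

    def subtree_usage(self):
        """Usage of this project plus all of its descendants."""
        return self.used + sum(c.subtree_usage() for c in self.children)


def can_create(project, size):
    """Check `size` against the project and every ancestor's limit."""
    node = project
    while node is not None:
        limited = node.limit != UNLIMITED
        if limited and node.subtree_usage() + size > node.limit:
            return False
        node = node.parent
    return True
```

Under this reading a child set to -1 can still be refused a volume,
because the walk up the tree reaches a parent that does have a limit.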

Glance has also started to look at implementing nested quota support
based on Cinder's implementation, so we don't want a broken
implementation in Cinder to be propagated to other projects.

Ryan McNair is working with folks on other projects to find a
better solution and to work through our current issues. This will
be an ongoing effort for now.

The Future of CLI for python-cinderclient
=========================================
A cross project spec has been approved to work toward removing
individual project CLIs to center on the one common osc CLI. We
discussed the feasibility of deprecating the cinder CLI in favor
of focusing all CLI work on osc.

There is also concern about delays getting new functionality
deployed. First we need to make server side API changes, then get
them added to the client library, then get them added to osc.

The cinder and osc CLIs do not have feature parity at the moment
for cinder functionality. This needs to be addressed first
before we can consider removing or deprecating anything in the cinder
client CLI. Once we have the same level of functionality with both,
we can then decide at what point to only add new CLI commands to osc
and start deprecating the cinder CLI.

Ivan and Ryan will look into how to implement osc plugins.

We will also look into using cliff and other osc patterns to see if
we can bring the existing cinder client implementation closer to the
osc implementation to make the switch over smoother.

API Microversions
=================
Scott gave an update on the microversion work.

Cinder patch: https://review.openstack.org/#/c/224910
cinderclient patch: https://review.openstack.org/#/c/248163/
spec: https://review.openstack.org/#/c/223803/
Test cases: https://github.com/scottdangelo/TestCinderAPImicroversions

Ben brought up the need to have a new unique URL endpoint for
this to get around some backward compatibility problems. This new URL
would be made v3 even though it will initially be the same as v2.
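
For readers new to the idea, here is a minimal sketch of how a server
might pick a microversion on such an endpoint. The supported range,
the "volume 3.2" header value format, and the choice to fall back to
the minimum version when no version is requested are all assumptions
for illustration, not details taken from the patches above.

```python
from typing import NamedTuple


class APIVersion(NamedTuple):
    """A major.minor pair that compares numerically, not as text."""
    major: int
    minor: int

    @classmethod
    def parse(cls, text):
        major, minor = text.strip().split(".")
        return cls(int(major), int(minor))


# Hypothetical supported range for the new endpoint.
MIN_VERSION = APIVersion(3, 0)
MAX_VERSION = APIVersion(3, 5)


def negotiate(header_value):
    """Return the version to serve for a value such as 'volume 3.2'.

    A missing value falls back to MIN_VERSION, so a client that never
    asks for a microversion keeps seeing the baseline behavior.
    """
    if header_value is None:
        return MIN_VERSION
    service, _, version_text = header_value.partition(" ")
    if service != "volume":
        raise ValueError("unexpected service type: %r" % service)
    requested = APIVersion.parse(version_text)
    if not MIN_VERSION <= requested <= MAX_VERSION:
        raise ValueError("unsupported version: %s" % version_text)
    return requested
```

Comparing versions as integer pairs matters once the minor number
passes 9: as tuples 3.10 is newer than 3.9, while a string or float
comparison would order them the wrong way.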

We would like to get this in soon so it has some runtime. There were
a lot of work items identified though that should get done before we
land. Scott is going to continue working through these issues.

Async User Reporting
====================
Alex Meade and Sheel have been working on ways to report back better
information for async operations.

https://etherpad.openstack.org/p/mitaka-cinder-midcycle-user-notifications

We will store data in the database rather than the originally
investigated Zaqar approach. There was general agreement that this
work should move forward and would be beneficial.

SDS Driver Proposals
====================
We've had a few requests in the past to add drivers for other SDS
platforms such as VipR, FalconStore, etc. We've rejected these on the
basis that they duplicate much of what Cinder is already doing, so
such a platform could leverage Cinder without providing any benefit
to the project as a whole. It was also brought up in the past that
third party CI should then be run against all supported backends under
this other SDS to validate it.

It was brought up that the IBM SVC driver could be classified as an SDS.
Jay gave an overview of the system to explain how it works. The
difference there is that although SVC can be configured to manage
other storage, it is the only API for managing one of the IBM storage
systems and is not being marketed as an SDS solution.

In the end we decided that although our concerns for blocking these
in the past are still valid, we will allow them in now if that is
what end users would like to have. They must still have third party
CI, but we will not require CI to be run against every supported
backend under the SDS. We will assume the SDS product does enough
of its own testing to ensure a level of quality and really is then
outside the domain of cinder.

Status of Third Party CI
========================
There are several CIs that have been very unreliable or completely
absent. We have not been strongly enforcing our policies for this.

We will need to start disabling CIs and removing drivers for any
third party drivers that are not in compliance. For now this will
need to be a manual task of identifying and enforcing this.

Some scripts were done in the past to help get some of this data to
help with enforcement. There are also a few dashboards that show some
useful, if not complete, data. These scripts will be expanded to try
to get a more automatic process in place for out of compliance
systems.

Multiattach
===========
Volume multiattach support has been in Cinder for a couple releases
now, but more work is needed to make it usable with Nova. There
have been some changes towards this, but it likely will not get
resolved in Mitaka. This is ongoing and is actively being worked on
by ildikov and hemna.

Consistency Groups
==================
CG APIs are disabled by default by policy. It was brought up whether
this should now change. Since not every backend supports CGs it was
decided we will not change this.

There's no force flag for CG snapshots unlike individual volume APIs.
This led to a broader discussion on the need for the force flag in
the first place. General agreement was it should probably just be
removed.

Quotas with CG snapshots - is a new quota needed? Determined existing
volume quotas are all that's needed and nothing special for CGs.

Extend Volume
=============
Code landed in os-brick to do extend volume. This is too late for
making it into Nova though. We should be able to get extend volume
in Newton.

Cinder-Volume A/A HA
====================
Gorka has been working through a series of patches to support this.
Several API race conditions have been fixed for this that will be
good to have even if the full solution doesn't land in time for this
release.
Plan is to get as much useful stuff merged in Mitaka, with the likely
final implementation landing in Newton.

Versioned Objects and Rolling Upgrades
======================================
Michal has several patches out there to implement this. It has been
tested under at least one scenario and appears to be working. We want
to land as many of these as needed to support this and get some
runtime and testing in on it.

Will look at adding a grenade test to get coverage.

Scalable Backup
===============
https://etherpad.openstack.org/p/mitaka-cinder-midcycle-scaling-backup-service
The proposal is to break out the backup service to possibly have
multiple backup services running, allowing some parallelism and 
distribution of load. Most backups are CPU bound and not I/O bound
so having the ability to move this off of one host could allow
for more scale.

There was some concern that this would not work with devices that
don't use local device paths (Ceph, Sheepdog). These have been
tested and appear to work fine.

Cinder without Nova
===================
In Tokyo we discussed the desire to extend Cinder to be usable
outside of an OpenStack cloud as a more general SDS solution.
John was able to do some testing using the minimum pieces of
Cinder, Keystone, RabbitMQ, and MySQL.

This is just a first exploratory step. Additional work will need
to be done to make this a more attractive solution.

As a tangent, the idea was raised that after some of the major
changes are completed for versioned objects and HA, we should
take a step back and review the Cinder architecture to see if
things have changed enough that we should think about 
rearchitecting some things.

Replication
===========
A large part of the third day was spent talking about replication.
There was a lot of concern about the planned v2 implementation.
Most vendors that have added support struggled with some of the
same questions. The final v2 spec also grew beyond its initial
plan of being a crawl, walk, run approach and added too many
things that complicated the API and implementation.

There was general agreement that things didn't end up quite like
we wanted them to be for this go around. Rather than releasing
this v2 and potentially needing to turn around and start working
on a v3, it was decided that we would course correct and change
what we are doing for replication in Mitaka.

There is some (OK, a lot of) concern about doing this so far
into the development cycle, especially as some vendors have already
landed patches for supporting v2 and there are several in-flight.

We agreed to accept this risk and go for a simpler case that
clearly addresses one use case, rather than keeping what we had
that unclearly addressed several use cases, maybe. For now we
would just address the case of configuring one or more targets
for a given backend. If there is a planned or unplanned outage
for that primary backend, the administrator has the ability to
fail over resources to one of the secondary locations.
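
The shape of that simpler case can be sketched as a small model: a
backend with a primary and a list of configured targets, plus a
one-way failover. The class and attribute names are hypothetical and
this is not how the spec or any driver implements it.

```python
class ReplicatedBackend:
    """A backend with one primary and one or more failover targets."""

    def __init__(self, primary, targets):
        if not targets:
            raise ValueError("at least one target must be configured")
        self.primary = primary
        self.targets = list(targets)
        self.active = primary

    def failover(self, target):
        """Administrator action: move resources to a secondary target."""
        if target not in self.targets:
            raise ValueError("unknown target: %r" % target)
        if self.active != self.primary:
            # This sketch deliberately models a one-way move only.
            raise RuntimeError("backend has already failed over")
        self.active = target
        return self.active
```

A call such as ReplicatedBackend("site-a", ["site-b"]).failover("site-b")
moves the active location once; a second failover is refused in this
sketch.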

This is not a solution for ping ponging back and forth and
keeping your instances up and running and happy. This is a
solution for when something is on fire and you need to move to
a safe location.

Folks were getting hung up on the naming, as replication means
different things to different people. To get around this, we
used code names to talk about different options. The spec for
cheesecake has much more detail about the proposed solution
and valid use case for this iteration of our support.

https://specs.openstack.org/openstack/cinder-specs/specs/mitaka/cheesecake.html


[openstack-dev] [Cinder] Cinder Mitaka Midcycle Summary
