[openstack-dev] [nova] consistency and exposing quiesce in the Nova API

Matt Riedemann mriedem at linux.vnet.ibm.com
Thu Jun 16 17:13:26 UTC 2016

On 6/16/2016 6:12 AM, Preston L. Bannister wrote:
> I am hoping support for instance quiesce in the Nova API makes it into
> OpenStack. To my understanding, this is an existing function in Nova, just
> not yet exposed in the public API. (I believe Cinder uses this via a
> private Nova API.)

I'm assuming you're thinking of the os-assisted-volume-snapshots admin 
API in Nova that is called from the Cinder RemoteFSSnapDrivers 
(glusterfs, scality, virtuozzo and quobyte). I started a separate thread 
about that yesterday, mainly around the lack of CI testing / status so 
we even have an idea if this is working consistently and we don't 
regress it.

> Much of the discussion is around disaster recovery (DR) and NFV - which
> is not wrong, but might be muddling the discussion? Forget DR and NFV,
> for the moment.
> My interest is simply in collecting high quality backups of applications
> (instances) running in OpenStack. (Yes, customers are deploying
> applications into OpenStack that need backup - and at large scale. They
> told us, *very* clearly.) Ideally, I would like to give the application
> a chance to properly quiesce, so the on-disk state is most-consistent,
> before collecting the backup.

We already attempt to quiesce an active volume-backed instance before 
doing a volume snapshot:


> The existing function in Nova should be at least a good start; it just
> needs to be exposed in the public Nova API. (At least, this is my
> understanding.)
> Of course, good backups (however collected) allow you to build DR
> solutions. My immediate interest is simply to collect high-quality backups.
> The part in the blueprint about an atomic operation on a list of
> instances ... this might be over-doing things. First, if you have a set
> of related instances, very likely there is a logical order in which they
> should be quiesced. Some could be quiesced concurrently. Others might
> need to be sequential.
> Assuming the quiesce API *starts* the operation, and there is some means
> to check for completion, then a single-instance quiesce API should be
> sufficient. An API that is synchronous (waits for completion before
> returning) would also be usable. (I am not picky - just want to collect
> better backups for customers.)

As noted above, we already attempt to quiesce when doing a volume-backed 
instance snapshot.
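
To make that flow concrete, here is a minimal sketch of a best-effort
quiesce around a set of volume snapshots. It is not Nova's actual code:
the driver object, its quiesce/unquiesce/snapshot_volume methods and the
QuiesceNotSupported exception are all hypothetical names used only for
illustration.

```python
from contextlib import contextmanager


class QuiesceNotSupported(Exception):
    """Raised by the hypothetical driver when the guest cannot be quiesced."""


@contextmanager
def quiesced(driver, instance_id):
    """Best-effort quiesce around a block of work; thaw only if we froze."""
    frozen = False
    try:
        driver.quiesce(instance_id)
        frozen = True
    except QuiesceNotSupported:
        # Fall back to a crash-consistent snapshot instead of failing.
        pass
    try:
        yield frozen
    finally:
        if frozen:
            driver.unquiesce(instance_id)


def snapshot_volume_backed(driver, instance_id, volume_ids):
    """Snapshot every attached volume while the guest is (ideally) frozen."""
    with quiesced(driver, instance_id) as frozen:
        snapshots = [driver.snapshot_volume(v) for v in volume_ids]
    return snapshots, frozen


class FakeDriver:
    """In-memory stand-in that records calls, for illustration only."""

    def __init__(self, supports_quiesce=True):
        self.supports_quiesce = supports_quiesce
        self.calls = []

    def quiesce(self, instance_id):
        if not self.supports_quiesce:
            raise QuiesceNotSupported(instance_id)
        self.calls.append(("quiesce", instance_id))

    def unquiesce(self, instance_id):
        self.calls.append(("unquiesce", instance_id))

    def snapshot_volume(self, volume_id):
        self.calls.append(("snapshot", volume_id))
        return "snap-" + volume_id
```

Calling snapshot_volume_backed(FakeDriver(), "vm1", ["a", "b"]) records a
quiesce, two snapshots and an unquiesce, in that order. With
supports_quiesce=False only the snapshot is recorded, which mirrors the
"attempt" part: the snapshot still happens, it is just not guaranteed to
be application-consistent.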

The problem comes in with the chaining and orchestration around a list 
of instances. That requires additional state management and overhead 
within Nova and while we're actively trying to redo parts of the code 
base to make things less terrible, adding more complexity on top at the 
same time doesn't help.
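
For comparison, here is a rough sketch of what that ordering could look
like if it lived in the caller, on top of a single-instance quiesce call
that starts the operation and can be polled for completion. No such
public API exists in Nova today; the client object and its start_quiesce
and is_quiesced methods are assumptions made up for this example.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def wait_until_quiesced(client, instance_id, timeout=30.0, interval=0.5):
    """Start a quiesce, then poll until the instance reports it is done."""
    client.start_quiesce(instance_id)
    deadline = time.monotonic() + timeout
    while not client.is_quiesced(instance_id):
        if time.monotonic() >= deadline:
            raise TimeoutError("quiesce timed out for %s" % instance_id)
        time.sleep(interval)
    return instance_id


def quiesce_in_stages(client, stages, **poll_kwargs):
    """Run stages sequentially; instances within a stage run concurrently."""
    done = []
    with ThreadPoolExecutor() as pool:
        for stage in stages:
            futures = [
                pool.submit(wait_until_quiesced, client, i, **poll_kwargs)
                for i in stage
            ]
            # Block until the whole stage is frozen before moving on.
            done.extend(f.result() for f in futures)
    return done


class FakeClient:
    """Hypothetical client that reports 'quiesced' on the second poll."""

    def __init__(self):
        self.started = []
        self._polls = {}

    def start_quiesce(self, instance_id):
        self.started.append(instance_id)
        self._polls[instance_id] = 0

    def is_quiesced(self, instance_id):
        self._polls[instance_id] += 1
        return self._polls[instance_id] >= 2
```

With stages [["db"], ["app1", "app2"]], the database instance is frozen
first and the two application instances are then frozen together. Note
what the sketch leaves out: if a later stage fails or times out, the
caller has to track and unquiesce everything already frozen. That is
exactly the kind of state management described above.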

I'm also not sure what something like multiattach volumes will throw 
into the mix with this, but that's another DR/HA requirement.

So I get that lots of people want lots of things that aren't in Nova 
right now. We have that coming from several different projects (cinder 
for multiattach volumes, neutron for vlan-aware-vms and routed 
networks), and several different groups (NFV, ops).

We also have a lot of people that just want the basic IaaS layer to work 
for the compute service in an OpenStack cloud, like being able to scale 
that out better and track resource usage for accurate scheduling.

And we have a lot of developers that want to be able to actually 
understand what it is the code is doing, and a much smaller number of 
core maintainers / reviewers that don't want to have to keep piling 
technical debt into the project while we're trying to fix some of what's 
already built up over the years - and actually have this stuff backed 
with integration testing.

So, I get it. We all have requirements and we all have resource 
limitations, which is why we as a team prioritize our work items for the 
release. This one didn't make it for Newton.

> On Sun, May 29, 2016 at 7:24 PM, joehuang <joehuang at huawei.com> wrote:
>     Hello,
>     This spec [1] proposes exposing a quiesce/unquiesce API. It was
>     approved in Mitaka, but the code was not merged in time.
>     The main goal of this spec is to enable application-level
>     consistent snapshots, so that a backup of the snapshot at the
>     remote site can be recovered correctly in case of disaster
>     recovery. Currently there is only a single-VM-level consistent
>     snapshot (through creating an image from the VM), but that is not
>     enough.
>     First, disaster recovery is mainly an infrastructure-level action
>     taken after catastrophic failures (flood, earthquake, propagating
>     software fault). The cloud service provider recovers the
>     infrastructure and the applications without help from each
>     application owner: you cannot just recover OpenStack, then notify
>     all the application owners and ask them to restore their
>     applications on their own. The cloud service provider should be
>     responsible for infrastructure and application recovery in case of
>     disaster.
>     Second, this requirement is not about making OpenStack bend to NFV.
>     Although it was first raised by OPNFV, application-level consistent
>     snapshots are a general requirement.
>     For example, take OpenStack itself as the application running
>     in the cloud. We can deploy a different DB for each service, i.e.
>     Nova has its own mysql server nova-db-VM, and Neutron has its own
>     mysql server neutron-db-VM. In fact, I have seen production
>     deployments that split the DBs for Nova/Cinder/Neutron across
>     different DB servers for scalability. Nova and Neutron interact
>     when booting a new VM, and during the boot period some data will
>     be in the memory cache of nova-db-VM/neutron-db-VM. If we just
>     create snapshots of the volumes of nova-db-VM/neutron-db-VM in
>     Cinder, the data that has not been flushed to disk will not be in
>     the volume snapshots. We cannot know when the data in the memory
>     cache will be flushed, so there is a random chance that the data in
>     the snapshots is not consistent with what actually happened in the
>     nova-db-VM/neutron-db-VM virtual machines. In this case,
>     Nova/Neutron may boot in the disaster recovery site successfully,
>     but some port information may be lost because it was not flushed
>     to disk in neutron-db-VM when the snapshot was taken, and in severe
>     cases the VM may not even be able to recover and run. Although
>     there is a project called Dragon [2], Dragon cannot guarantee the
>     consistency of the application snapshot through the OpenStack API
>     either.
>     Third, for applications that can decide which data and
>     checkpoints should be replicated to the disaster recovery site,
>     there is the third option discussed and described in our analysis:
>     https://git.opnfv.org/cgit/multisite/tree/docs/requirements/multisite-vnf-gr-requirement.rst.
>     Unfortunately, in Cinder, even after volume replication V2.1 was
>     developed, tenant-granularity volume replication is still being
>     discussed, and replication is still not at the single-volume level.
>     And as mentioned in the first point, both application-level and
>     infrastructure-level recovery are needed, because you cannot just
>     expect each application owner to do recovery after disaster
>     recovery of a site's OpenStack: applications can usually deal with
>     the data they generate, but protecting configuration changes is
>     out of the application's scope. There are several options for
>     disaster recovery, but that doesn't mean one option fits all.
>     There are several -1 votes on this re-proposed spec, which had been
>     approved in Mitaka, so I am sending this explanation to the mailing
>     list for discussion. If someone can suggest another way to guarantee
>     application-level snapshots for disaster recovery, that is also
>     welcome.
>     [1] Re-Propose Expose quiesce/unquiesce API:
>     https://review.openstack.org/#/c/295595/
>     [2] Dragon:
>     https://github.com/os-cloud-storage/openstack-workload-disaster-recovery
>     Best Regards
>     Chaoyi Huang ( Joe Huang )
>     __________________________________________________________________________
>     OpenStack Development Mailing List (not for usage questions)
>     Unsubscribe:
>     OpenStack-dev-request at lists.openstack.org?subject:unsubscribe
>     <http://OpenStack-dev-request@lists.openstack.org?subject:unsubscribe>
>     http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev



Matt Riedemann
