[openstack-dev] Disaster Recovery for OpenStack - call for stakeholder

Zhangleiqiang (Trump) zhangleiqiang at huawei.com
Thu Mar 13 07:20:52 UTC 2014

Regarding (1) [Single VM], the following use cases could be added as a supplement:

1. Protection Group: Define the set of instances to be protected.
2. Protection Policy: Define the policy for protection group, such as sync period, sync priority, advanced features, etc.
3. Recovery Plan: Define the steps performed during recovery, such as the power-off and boot order of instances.
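
As a rough illustration of how these three concepts might fit together, here is a minimal Python sketch. All class and field names (ProtectionPolicy, ProtectionGroup, RecoveryPlan, sync_period_s, boot_priority) are hypothetical and do not correspond to any existing OpenStack API.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class ProtectionPolicy:
    sync_period_s: int = 300   # hypothetical: seconds between sync cycles
    sync_priority: int = 0     # hypothetical: higher values sync first


@dataclass
class ProtectionGroup:
    name: str
    instance_ids: List[str]
    policy: ProtectionPolicy = field(default_factory=ProtectionPolicy)


@dataclass
class RecoveryPlan:
    group: ProtectionGroup
    # instance id -> boot priority; lower numbers boot first (hypothetical)
    boot_priority: Dict[str, int] = field(default_factory=dict)

    def boot_order(self) -> List[str]:
        # Instances without an explicit priority boot last, in group order
        # (sorted() is stable, so ties keep their original order).
        return sorted(self.group.instance_ids,
                      key=lambda i: self.boot_priority.get(i, float("inf")))

    def power_off_order(self) -> List[str]:
        # Power off in the reverse of the boot order.
        return list(reversed(self.boot_order()))
```

For example, a group of "web", "db", and "app" with priorities db=0 and app=1 would boot the database first and the unprioritized web instance last.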

zhangleiqiang (Ray)

Best Regards

> -----Original Message-----
> From: Bruce Montague [mailto:Bruce_Montague at symantec.com]
> Sent: Thursday, March 13, 2014 2:38 AM
> To: openstack-dev at lists.openstack.org
> Subject: Re: [openstack-dev] Disaster Recovery for OpenStack - call for
> stakeholder
> Hi, regarding the call to create a list of disaster recovery (DR) use cases
> ( http://lists.openstack.org/pipermail/openstack-dev/2014-March/028859.html
>  ), the following list sketches some speculative OpenStack DR use cases. These
> use cases do not reflect any specific product behavior and span a wide
> spectrum. This list is not a proposal, it is intended primarily to solicit additional
> discussion. The first basic use case, (1), is described in a bit more detail than
> the others; many of the others are elaborations on this basic theme.
> * (1) [Single VM]
> A single Windows VM with 4 volumes and VSS (Microsoft's Volume Shadow Copy
> Service) installed runs a key application and integral database. VSS can quiesce
> the app, database, filesystem, and I/O on demand and can be invoked externally
> to the guest.
>    a. The VM's volumes, including the boot volume, are replicated to a remote
> DR site (another OpenStack deployment).
>    b. Some form of replicated VM or VM metadata exists at the remote site.
> This VM/description includes the replicated volumes. Some systems might use
> cold migration or some form of wide-area live VM migration to establish this
> remote site VM/description.
>    c. When specified by an SLA or policy, VSS is invoked, putting the VM's
> volumes in an application-consistent state. This state is flushed all the way
> through to the remote volumes. As each remote volume reaches its
> application-consistent state, this is recognized in some fashion, perhaps by an
> in-band signal, and a snapshot of the volume is made at the remote site.
> Volume replication is re-enabled immediately following the snapshot. A backup
> is then made of the snapshot on the remote site. At the completion of this cycle,
> application-consistent volume snapshots and backups exist on the remote site.
>    d. When a disaster or firedrill happens, the replication network
> connection is cut. The remote site VM pre-created or defined so as to use the
> replicated volumes is then booted, using the latest application-consistent state
> of the replicated volumes. The entire VM environment (management accounts,
> networking, external firewalling, console access, etc.), similar to that of the
> primary, either needs to pre-exist in some fashion on the secondary or be
> created dynamically by the DR system. The booting VM needs either to attach
> to a virtual network environment similar to that at the primary site or to
> have boot code that can alter its network personality. Networking
> configuration may occur in conjunction with an update to DNS and other
> networking infrastructure. It is necessary for all required networking
> configuration to be pre-specified or done automatically. No manual admin
> activity should be required. Environment requirements may be stored in a DR
> configuration or database associated with the replication.
>    e. In a firedrill or test, the virtual network environment at the remote site
> may be a "test bubble" isolated from the real network, with some provision for
> protected access (such as NAT). Automatic testing is necessary to verify that
> replication succeeded. These tests need to be configurable by the end-user and
> admin and integrated with DR orchestration.
>    f. After the VM has booted and been operational, the network connection
> between the two sites is re-established. A replication connection between the
> replicated volumes is re-established, and the replicated volumes are re-synced,
> with the roles of primary and secondary reversed. (Ongoing replication in this
> configuration may occur, driven from the new primary.)
>    g. A planned failback of the VM to the old primary proceeds similarly to the
> failover from the old primary to the old replica, but with roles reversed and the
> process minimizing offline time and data loss.
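
Steps d through g amount to a small state machine over the two sites. A minimal sketch of just the role bookkeeping follows, assuming hypothetical names (ReplicationPair, failover, resync, failback); it does not model any real replication driver.

```python
from dataclasses import dataclass


@dataclass
class ReplicationPair:
    primary: str
    secondary: str
    replicating: bool = True

    def _swap_roles(self) -> None:
        self.primary, self.secondary = self.secondary, self.primary

    def failover(self) -> None:
        # Step d: the replication connection is cut and the replica
        # site takes over as the new primary.
        self.replicating = False
        self._swap_roles()

    def resync(self) -> None:
        # Step f: the connection is re-established and replication
        # resumes, now driven from the new primary.
        self.replicating = True

    def failback(self) -> None:
        # Step g: a planned move back to the old primary. Requiring an
        # active replication link first is an assumption made here to
        # reflect "minimizing offline time and data loss".
        if not self.replicating:
            raise RuntimeError("resync before failing back")
        self._swap_roles()
```

After failover() the pair reports the old secondary as primary with replication stopped; failback() is refused until resync() has been called.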
> * (2) [Core tenant/project infrastructure VMs]
> Twenty VMs power the core infrastructure of a group using a private cloud
> (OpenStack in their own datacenter). Not all VMs run Windows with VSS; some
> run Linux with some equivalent mechanism, such as qemu-ga, driving fsfreeze
> and signal scripts. These VMs are replicated to a remote OpenStack
> deployment, in a fashion similar to (1). Orchestration occurring at the remote
> site on failover is more complex (correct VM boot order is orchestrated, DHCP
> service is configured as expected, all IPs are made available and verified). An
> equivalent virtual network topology consisting of multiple networks or subnets
> might be pre-created or dynamically created at failover time.
>    a. Storage for all volumes of all VMs might be on a single storage backend
> (logically a single large volume containing many smaller sub-volumes, examples
> being a VMware datastore or Hyper-V CSV). This entire large volume might be
> replicated between similar storage backends at the primary and secondary site.
> A single replicated large volume thus replicates all the tenant VMs' volumes.
> The DR system must trigger quiesce of all volumes to application-consistent
> state.
>    b. This environment needs to deal with failures of the primary datacenter
> (as when a trenching tool cuts its connection to the internet), routine firedrill
> tests that perform failover and failback, and planned migration.
>    c. VSS or fsfreeze may be expected to fail for some VMs, and policies and
> SLAs need to contend with this and alert admins for manual follow-up.
>    d. Network bandwidth used for replication needs to be throttled so as not
> to overly disrupt the private cloud's gateway capacity.
>    e. DR replication needs to deal with intermittent network replication failure
> and recover gracefully. In case of a known network issue, such as maintenance,
> it needs to be possible for the admin to explicitly suspend network replication.
> Replication I/O is then logged locally at the primary site in some fashion. The
> remote site needs to stay replication ready, but failover does not occur. When
> the network issue is over, replication resumes, perhaps recovering via a log, a
> map of updated blocks, or an equivalent technique. In this example the RPO
> window is deliberately ignored and allowed to grow until replication is resumed
> by the admin.
>    f. This tenant requires encryption of network replication traffic.
>    g. Cost accounting and chargeback are required.
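
Item e mentions recovering via a map of updated blocks. A minimal sketch of that idea follows, assuming a hypothetical in-memory DirtyBlockTracker with fixed-size blocks. A real system would persist the map and track writes at the storage layer.

```python
class DirtyBlockTracker:
    """Tracks which blocks changed while replication is suspended."""

    def __init__(self, block_size=4096):
        self.block_size = block_size
        self.suspended = False
        self.dirty = set()

    def write(self, offset, length):
        """Record a write; return the block numbers to replicate now.

        While replication is suspended, nothing is sent and the touched
        blocks are remembered in the dirty map instead.
        """
        if length <= 0:
            raise ValueError("length must be positive")
        first = offset // self.block_size
        last = (offset + length - 1) // self.block_size
        blocks = list(range(first, last + 1))
        if self.suspended:
            self.dirty.update(blocks)
            return []
        return blocks

    def suspend(self):
        # Admin explicitly suspends replication (e.g. for maintenance).
        self.suspended = True

    def resume(self):
        """Return the blocks changed during suspension and clear the map."""
        self.suspended = False
        pending = sorted(self.dirty)
        self.dirty.clear()
        return pending
```

The map grows only with the number of distinct blocks touched, not with the number of writes, which is why it suits a suspension whose length (and therefore RPO window) is not bounded in advance.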
> * (3) [Multi-tier app infrastructure]
> A tenant has a service consisting of 8 multi-tier apps that each consist of 3 to 5
> VMs, with each VM having 2 to 4 disks. Replication snapshots need to be made
> of the volumes in an application-consistent way across all the volumes of all the
> VMs in all the multi-tier apps. Again, these volumes may exist on a single large
> volume or datastore, perhaps simplifying creation of the cross-VM application
> consistency snapshot. Not all of the VMs in a multi-tier app may need to be
> quiesced; some may be stateless and simply need to be recovered to a running
> state.
>    a. This tenant requires that 3 of the multi-tier apps fail over to one remote
> OpenStack site and the other 5 multi-tier apps fail over to a different remote
> site.
>    b. This tenant performs a weekly non-disruptive test-bubble failover test. Real
> failover is not triggered. Instead, all the multi-tier app VMs that would boot
> upon failure are booted (from their latest snapshots on the secondary), but the
> VMs' virtual network environment on the secondary is isolated from external
> networking. Test bubbles at the two OpenStack remote sites may need to be
> connected via some VPN/tunnel or equivalent without manual admin activity.
> * (4) [Tenant failover]
> An OpenStack tenant has 40 VMs, relatively lightly loaded, used for
> development. The VMs do not contain VSS, qemu-ga, or standard tools (they
> may be running any Linux distro, some may be running Plan9, the tenant may
> be doing Linux kernel development (that is, the VMs can be anything)). A
> remote OpenStack deployment needs to exist so that in event of loss of the
> primary OpenStack site, the tenant can continue development. In addition to
> volume replication as in (1), subject to policies and SLAs, cold migration may be
> performed on a VM's volumes upon shutdown (or dismount) and tenant
> end-users can explicitly request replication of a volume that is in an
> application-consistent state (when they have quiesced it by VSS, dismount, or
> equivalent).
>    a. Being down for a short period may be acceptable to this tenant. If all the
> hosts on the primary site are rebooted, for instance, due to power failure, it is
> the operator's choice to fail over or not. If the operator chooses not to fail over,
> upon reboot of the VMs at the primary site, any established replication should
> automatically be continued.
> * (5) [Scale-out workload]
> A tenant has a Cassandra cluster (or Hadoop or a similar type of system)
> consisting of 75 VMs. Use is bursty. The system is used by a pharmaceutical
> company for design work. A lost week of work can be redone, but weekly
> replication is mandatory. The application itself may provide some form of built-in
> geo-replication. Some controller-type VMs may need to be replicated as in (1).
> Other VMs may partner with replica VMs for explicit application data
> replication. For weekly replication of Cassandra data, Cassandra user-level
> snapshots are made into replicated volumes attached to each Cassandra VM.
> Replication is incremental with respect to the last replication event; that is,
> only data changed since the last replication event is sent.
>    a. The tenant requires use of a particular aggregated network link for
> replication.
>    b. The tenant requires custom integration with the DR replication workflow
> to quiesce Cassandra via user-level commands and scripts developed by the
> end-user.
>    c. Initial synchronization of replicated primary and secondary volumes need
> not be over a network link. Secondary volumes can be created initially from
> physical disks or backups physically moved to the secondary site.
> * (6) [Degraded-mode Mission-critical single VM]
> This single VM use case is similar to (1), but when a network partition occurs
> between the primary and secondary OpenStack sites, with both sites remaining
> up, the primary VM remains operational while the secondary replica VM also
> comes online. Both VMs operate in a mode that resembles replication with a
> momentary network fault, logging their would-be replication traffic for
> continuation when the network comes back. When network connectivity is
> re-established, one site again becomes the primary and differences in the VMs'
> volumes can optionally (as controlled by policy) be reconciled. (In a simple case,
> each site might have its own dedicated volume partition or attached volume
> with its latest state.)
> * (7) [Self-contained application volume]
> A Cinder volume contains a complete database application, including the
> database and all binaries and configuration files. Replication of the entire VM
> to which this volume is attached is not needed. The VM and its configuration
> can be recreated on demand at the remote site and attached to the replicated
> application volume. The DR system still needs to orchestrate the process and
> create or manage the required network environment. A simple DR strategy can
> be used in which the volume is quiesced on the primary, a volume snapshot
> taken, the volume unquiesced (enabling the VM to continue running), and a
> backup is then made of the snapshot. Backups can be transported by whatever
> means to the DR site, where the volume can be restored to its state at time of
> snapshot.
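
The simple strategy in (7) is order-sensitive: the volume should be unquiesced even if the snapshot fails, and the backup runs only after I/O has resumed. A minimal sketch follows, assuming the four operations are supplied as hypothetical callables (quiesce, snapshot, unquiesce, backup); none of them are real Cinder calls.

```python
def protect_volume(quiesce, snapshot, unquiesce, backup):
    """Run one quiesce -> snapshot -> unquiesce -> backup cycle.

    All four arguments are caller-supplied callables (hypothetical):
      quiesce()      puts the volume in an application-consistent state
      snapshot()     returns a handle to a snapshot of the volume
      unquiesce()    lets the VM continue running
      backup(snap)   backs up the snapshot and returns a backup handle
    """
    quiesce()
    try:
        snap = snapshot()
    finally:
        # Always resume I/O, even if taking the snapshot raised.
        unquiesce()
    # The backup is taken from the snapshot, so the VM keeps running
    # while the (potentially slow) backup is made.
    return backup(snap)
```

Keeping the backup outside the quiesced window is the design point here: the time the application is frozen is limited to the snapshot itself.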
> * (8) [Stateless]
> No volumes and VMs need to be replicated, as VMs and their configuration can
> be recreated on demand, using configuration tools, and application data is
> accessed over the wide-area network (NFS or object store). The DR process still
> has to orchestrate creating the VMs, running configuration tools to populate
> them, creating the network environment, and booting VMs in required order.
> * (9) [Site Evacuation]
> The holy grail: automatic planned migration of the workload and data from one
> cloud-scale datacenter to another (or a set of others), at both tenant scale and
> entire-datacenter scale. In practice, this is likely to include admins in the loop.
> The entire cloud datacenter is expected to go offline for an extended period
> (the hurricane scenario).
> -bruce
