[openstack-dev] Proposal for Raksha, a Data Protection As a Service project

Murali Balcha Murali.Balcha at triliodata.com
Thu Aug 29 23:42:36 UTC 2013

>> From: Ronen Kat <RONENKAT at il.ibm.com>
>> Sent: Thursday, August 29, 2013 2:55 PM
>> To: openstack-dev at lists.openstack.org; openstack-dev at lists.launchpad.net
>> Subject: Re: [openstack-dev] Proposal for Raksha, a Data Protection As a Service project

>> Hi Murali,

>> I think the idea to provide enhanced data protection in OpenStack is a
>> great idea, and I have been thinking about backup in OpenStack for a while
>> now.
>> I am just not sure a new project is the only way to do it.

>> (as disclosure, I contributed code to enable IBM TSM as a Cinder backup
>> driver)

Hi Kat,
Consider the following use cases that Raksha will address. I will go from the simplest to the most complex use case and then address your specific questions with inline comments.
1.	VM1 is created on the local file system and has a Cinder volume attached.
2.	VM2 is booted from a Cinder volume and has a couple of Cinder volumes attached.
3.	VM1 and VM2 are both booted from Cinder volumes and have a couple of volumes attached. They also share a private network for internal communication.
In all these cases Raksha will take a consistent snapshot of the VMs, walk through each VM's resources, and back up those resources to a Swift endpoint.
In case 1, that means backing up the VM image and the Cinder volume image to Swift.
Case 2 is an extension of case 1.
In case 3, Raksha not only backs up VM1 and VM2 and their associated resources, it also backs up the network configuration.

Now let's consider the restore case. The restore operation walks through the backed-up resources and calls into the respective OpenStack services to restore those objects. In case 1, it first calls the Nova API to restore the VM, then calls into Cinder to restore the volume and attach it to the newly restored VM instance. In case 3, it also calls into the Neutron API to restore the networking. Hence my argument is that no single OpenStack project has a global view of a VM and all its resources, which is what it takes to implement effective backup and restore services.
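
The restore walk described above can be sketched in a few lines of Python. This is only an illustration, not Raksha code: BackupRecord, plan_restore, and the step strings are hypothetical names, no real Nova, Cinder, or Neutron client calls are made, and restoring the network before the VMs is an assumption that the text above does not state.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class BackupRecord:
    """One backed-up resource (hypothetical structure, for illustration)."""
    kind: str                          # "network", "vm" or "volume"
    name: str
    attached_to: Optional[str] = None  # owning VM name, for volumes only


# Assumed ordering: network first, then VMs, then volumes.
RESTORE_ORDER = {"network": 0, "vm": 1, "volume": 2}
# Which OpenStack service owns each kind of resource.
SERVICE = {"network": "neutron", "vm": "nova", "volume": "cinder"}


def plan_restore(records):
    """Return the (service, action) steps a restore would issue, in order."""
    steps = []
    # sorted() is stable, so records of the same kind keep their input order.
    for rec in sorted(records, key=lambda r: RESTORE_ORDER[r.kind]):
        steps.append((SERVICE[rec.kind], f"restore {rec.kind} {rec.name}"))
        if rec.kind == "volume" and rec.attached_to:
            # Attaching a volume to an instance goes through Nova.
            steps.append(("nova", f"attach {rec.name} to {rec.attached_to}"))
    return steps


if __name__ == "__main__":
    backup = [
        BackupRecord("volume", "vol1", attached_to="VM1"),
        BackupRecord("vm", "VM1"),
        BackupRecord("network", "private-net"),
    ]
    for service, action in plan_restore(backup):
        print(service, "->", action)
```

The point of the sketch is that the plan touches three services from one place, which is the global view that no single existing project has.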

>> I wonder what is the added-value of a project approach versus enhancements
>> to the current Nova and Cinder implementations of backup. Let me elaborate.

>> Nova has a "nova backup" feature that performs a backup of a VM to Glance,
>> the backup is managed by tenants in the same way that you propose.
>> While today it provides only point-in-time full backup, it seems reasonable
>> that it can be extended to support incremental and consistent backup as well -
>> as the actual work is done either by the Storage or Hypervisor in any case.

Though Nova has an API to upload a snapshot of the VM to Glance, it does not snapshot any volumes associated with the VM. When a snapshot is uploaded to Glance, Nova creates an image by collapsing the qemu image with the delta file and uploads the resulting larger file to Glance. If we were to perform periodic backups of VMs, this would be a very inefficient way to do it. Having to manage two endpoints, one for Nova and one for Cinder, is also inefficient. These are the gaps I called out on the Raksha wiki page.

>> Cinder has a cinder backup command that performs a volume backup to Swift,
>> Ceph or TSM. The Ceph implementation also supports incremental backup (Ceph
>> to Ceph).
>> I envision that Cinder could be expanded to support incremental backup (for
>> persistent storage) by adding drivers/plug-ins that will leverage
>> incremental backup features of either the storage or Hypervisors.
>> Independently, in Havana the ability to do consistent volume snapshots was
>> added to GlusterFS. I assume that this consistency support could be
>> generalized to support other volume drivers, and be utilized as part of a
>> backup code.

I think we are talking about specific implementations here. Yes, I am aware of the Ceph blueprint to support incremental backup, but the Cinder backup APIs are volume specific. That means if a VM has multiple volumes mapped, as in case 2 above, the tenant needs to call the backup API three times. Also, if you look at the Swift layout that Cinder uses for backups, it is very difficult to tie the Swift objects back to a particular VM. Imagine a tenant who wants to restore a VM and all its resources from a backup copy taken a week ago. The restore operation is not straightforward.
It is my understanding that consistency should be maintained at the VM level, not at the individual volume level, because it is very difficult to know how the application data inside the VM is laid out.

>> Looking at the key features in Raksha, it seems that the main features
>> (2,3,4,7) could be addressed by improving the current mechanisms in Nova
>> and Cinder. I didn't include 1 as a feature as it is more a statement of
>> intent (or goal) than a feature.
>> Features 5 (dedup) and 6 (scheduler) are indeed new in your proposal.

>> Looking at the source de-duplication feature, and taking Swift as an
>> example, it seems reasonable that if Swift will implement de-duplication,
>> then doing backup to Swift will give us de-duplication for free.
>> In fact it would make sense to do the de-duplication at the Swift level
>> instead of just the backup layer to gain more duplication opportunities.

I agree; however, Swift is not the only object store that would need to support dedupe. Ceph is another popular object store, GlusterFS supports a Swift endpoint, and there are other commercially available object stores too. So your argument becomes very product specific. Moreover, source-level dedupe is different from dedupe at rest: source-level dedupe shrinks the backup window and also reduces the amount of data that needs to be pumped to a backup endpoint like Swift.
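
To illustrate why dedupe at the source reduces the data sent over the wire, here is a minimal Python sketch. It is not how Raksha implements dedupe: the function name dedupe_at_source, the fixed-size chunking, and the tiny 4-byte chunk size are all assumptions chosen to keep the example short.

```python
import hashlib

CHUNK_SIZE = 4  # deliberately tiny; a real system would use far larger chunks


def dedupe_at_source(data, known_hashes, chunk_size=CHUNK_SIZE):
    """Split data into fixed-size chunks and decide what must be uploaded.

    known_hashes is the set of chunk digests the backup endpoint already
    holds. Returns (manifest, to_send): manifest lists the digest of every
    chunk in order, to_send maps digest -> bytes for chunks not yet stored.
    """
    manifest = []
    to_send = {}
    for offset in range(0, len(data), chunk_size):
        chunk = data[offset:offset + chunk_size]
        digest = hashlib.sha256(chunk).hexdigest()
        manifest.append(digest)
        # Upload a chunk only once, and only if the endpoint lacks it.
        if digest not in known_hashes and digest not in to_send:
            to_send[digest] = chunk
    return manifest, to_send


if __name__ == "__main__":
    stored = set()
    # First backup: two of the three chunks are identical.
    manifest, to_send = dedupe_at_source(b"AAAABBBBAAAA", stored)
    stored.update(to_send)
    print(len(manifest), "chunks,", len(to_send), "uploaded")
    # Second backup: only the changed chunk crosses the network.
    manifest, to_send = dedupe_at_source(b"AAAABBBBCCCC", stored)
    print(len(manifest), "chunks,", len(to_send), "uploaded")
```

Dedupe at rest inside the object store would save disk space in the same situation, but every chunk would still have to be transferred first.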

>> Following the above, and assuming it all comes true (at times I am known to
>> be an optimist), then we are left with backup job scheduling, and I
>> wonder if that is enough for a new project.

I hope I have convinced you that Raksha has more to offer than a simple cron job. Please take a look at the backup APIs, the database schema, and the use cases it addresses on its wiki page.

The bottom line is that, irrespective of how OpenStack is deployed, this is what the Raksha workflow looks like:
* Create-backupjob VM1, VM2
       --> Returns backup job id, id1
* Run-backupjob id1
       --> Returns run id rid1
* Run-backupjob id1
       --> Returns run id rid2
* Restore rid1
       --> Restores PiT of VM1 and VM2 and their associated volumes
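
The workflow above can be mimicked with a small in-memory Python sketch. The class and method names (BackupService, create_backupjob, run_backupjob, restore) are hypothetical and only mirror the steps listed; they are not the Raksha API, and nothing is actually snapshotted or stored.

```python
import itertools


class BackupService:
    """In-memory stand-in for the job / run / restore workflow."""

    def __init__(self):
        self._jobs = {}   # job id -> list of VM names
        self._runs = {}   # run id -> VMs captured by that run
        self._job_ids = itertools.count(1)
        self._run_ids = itertools.count(1)

    def create_backupjob(self, vms):
        """Register a job covering the given VMs and return its id."""
        job_id = f"id{next(self._job_ids)}"
        self._jobs[job_id] = list(vms)
        return job_id

    def run_backupjob(self, job_id):
        """Record one point-in-time run of the job and return the run id."""
        run_id = f"rid{next(self._run_ids)}"
        self._runs[run_id] = list(self._jobs[job_id])
        return run_id

    def restore(self, run_id):
        """Return the VMs that restoring this run would bring back."""
        return list(self._runs[run_id])


if __name__ == "__main__":
    svc = BackupService()
    job = svc.create_backupjob(["VM1", "VM2"])
    first = svc.run_backupjob(job)
    second = svc.run_backupjob(job)
    print(job, first, second, svc.restore(first))
```

Each run gets its own id, so the tenant can restore any earlier point in time with a single call that names the run rather than individual images or volumes.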

>> My question is, would it make sense to add to the current mechanisms in
>> Nova and Cinder rather than add the complexity of a new project?

I think the answer is yes  :)

Murali Balcha
>> __________________________________________
>> Ronen I. Kat
>> Storage Research
>> IBM Research - Haifa
>> Phone: +972.3.7689493
>> Email: ronenkat at il.ibm.com

From:   Murali Balcha <Murali.Balcha at triliodata.com>
To:     "openstack-dev at lists.openstack.org"
            <openstack-dev at lists.openstack.org>,
            "openstack at list.openstack.org" <openstack at list.openstack.org>,
Date:   29/08/2013 01:18 AM
Subject:        [openstack-dev] Proposal for Raksha, a Data Protection As a
            Service project

Hello Stackers,
We would like to introduce a new project Raksha, a Data Protection As a
Service (DPaaS) for OpenStack Cloud.
Raksha’s primary goal is to provide comprehensive data protection for
OpenStack by leveraging Nova, Swift, Glance and Cinder. Raksha has the
following key features:
      1.       Provide an enterprise grade data protection for OpenStack
      based clouds
      2.       Tenant administered backups and restores
      3.       Application consistent backups
      4.       Point In Time(PiT) full and incremental backups and restores
      5.       Dedupe at source for efficient backups
      6.       A job scheduler for periodic backups
      7.       Noninvasive backup solution that does not require service
      interruption during backup window

You will find the rationale behind the need for Raksha in OpenStack in its
Wiki. The wiki also has the preliminary design and the API description.
Some of the Raksha functionality may overlap with Nova and Cinder projects
and as a community let's work together to coordinate the features among
these projects. We would like to seek out early feedback so we can address
as many issues as we can in the first code drop. We are hoping to enlist
the OpenStack community's help in making Raksha a part of OpenStack.
Raksha’s project resources:
Wiki: https://wiki.openstack.org/wiki/Raksha
Launchpad: https://launchpad.net/raksha
Github: https://github.com/DPaaS-Raksha/Raksha (We will upload prototype
code in a few days)
If you want to talk to us, send an email to
openstack-dev at lists.launchpad.net with "[raksha]" in the subject or use
the #openstack-raksha IRC channel.

Best Regards,
Murali Balcha
_______________________________________________
OpenStack-dev mailing list
OpenStack-dev at lists.openstack.org