[openstack-dev] Announcing Ekko -- Scalable block-based backup for OpenStack

Sam Yaple samuel at yaple.net
Wed Jan 27 23:15:02 UTC 2016


On Wed, Jan 27, 2016 at 8:19 PM, Fausto Marzi <fausto.marzi at gmail.com>
wrote:

> Hi Sam,
>
> After our conversation, I have a few questions and considerations about
> Ekko, mainly on how it works and similar. I am also posting them here to
> make our discussion available to the community:
>
> - I understand you are placing a backup-agent on the compute node and
> executing actions that interact directly with the hypervisor. I’m
> thinking that while Ekko executes these actions, the Nova service has no
> visibility whatsoever of this. I do not think it is a good idea to execute
> actions directly on the hypervisor without interacting with the Nova API.
>
This is not an ideal situation, no. Nova should be aware of what we are
doing and when we are doing it. We are aware of this and plan on proposing
ways to be better integrated with Nova (and Cinder for that matter).

> - In your assumptions, you said that Nova snapshot creation generates VM
> downtime. I don’t think the assumption is correct, at least in Kilo,
> Liberty and Mitaka. The only downtime you may have related to the
> snapshot is when you merge the snapshot back into the original root image,
> and this is not our case here.
>
For Kilo and up, Nova does leverage the live snapshot, which allows for a
snapshot without placing the instance into a Paused state. That is correct.
Some of the same underlying functions are used for the IncrementalBackup
feature of QEMU as well. So you are right, it's not fair to say that
snapshots always cause downtime, as that hasn't been the case since Kilo.

> - How would the restore work? If you do a restore of the VM and the
> record of that VM instance is not available in the Nova DB (i.e.
> restoring a VM on a newly installed OpenStack cloud, or in another
> region, or after a VM has been destroyed), what would happen? How do you
> manage the consistency of the data between the Nova DB and the VM status?
>
Restore has two pieces. The one we are definitely implementing is restoring
a backup image to a Glance image. At that point anyone could start an
instance off of it. Additionally, it _could_ be restored directly back to
the instance in question by powering off the instance, restoring the data
directly back, and then starting the instance again.

> - If you execute a backup of the VM image file without executing a backup
> of the related VM metadata information (in as short a time frame as
> possible), there is a chance the backup will be inconsistent.
>
I don't see how VM metadata information has anything to do with a proper
backup of the data in this case.

> - How would the restore happen if Keystone or Swift is not available at
> that moment?
>
How does anything happen if Keystone isn't available? The user can't
authenticate, so nothing happens.

> - Does the backup that Ekko executes generate a bootable image? If not,
> the image is not usable and the restore process will take longer, since it
> must execute the steps to make the image bootable.
>
The backup Ekko would take would be a bit-for-bit copy of what is on the
underlying disk. If that is bootable, then it is bootable.

> - I do not see any advantage in Ekko over using the Nova API to
> snapshot -> generate an image -> upload to Glance -> upload to Swift.
>
Snapshots are not backups. This is a very important point. Additionally, the
process you describe is extremely expensive in terms of time, bandwidth,
and IO. For the sake of example, if you have 1TB of data on an instance and
you snapshot it, you must upload 1TB to Glance/Swift. The following day you
do another snapshot and you must upload another 1TB, even though the
majority of the data is likely exactly the same. With Ekko (or any true
backup) you should only be uploading what has changed since the last backup.
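
To put rough numbers on that comparison, here is a minimal sketch in Python.
The 5% daily change rate and the 7-day window are assumptions made up for
illustration, and both function names are hypothetical. None of this is taken
from Ekko itself.

```python
def full_snapshot_upload_gb(disk_gb, days):
    """Total upload when every daily snapshot ships the whole disk."""
    return disk_gb * days


def incremental_upload_gb(disk_gb, days, daily_change_ratio):
    """Total upload for one full backup followed by daily incrementals."""
    if days < 1:
        return 0.0
    # Day 1 uploads everything; each later day uploads only changed data.
    return disk_gb + disk_gb * daily_change_ratio * (days - 1)


if __name__ == "__main__":
    disk_gb = 1024   # the 1TB instance from the example above
    days = 7         # assumed backup window
    change = 0.05    # assumed: 5% of the data changes per day
    print("full snapshots:", full_snapshot_upload_gb(disk_gb, days), "GB")
    print("incrementals:  ", incremental_upload_gb(disk_gb, days, change), "GB")
```

Under these assumptions the gap widens with every extra day, because the
full-snapshot total grows by the whole disk size daily while the incremental
total grows only by the changed portion.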

> - The Ekko approach is limited to Nova, KVM and QEMU, and requires a
> qemu-agent running on the VM. I think the scope is probably a bit limited.
> This is more a feature than a tool in itself, and I think the problem is
> already being solved more efficiently.
>
It is not limited to Nova, Libvirt, or QEMU. It also does not _require_ the
qemu-agent. It can be used independently of OpenStack (though that is not
the end goal) and even with VMware or Hyper-V, since they both support
changed block tracking (CBT), which is the main component we leverage.
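
As a rough illustration of the changed-block-tracking idea, here is a toy
model in Python. It is not the QEMU, VMware, or Hyper-V interface and it is
not Ekko code. The names TrackedDisk, BLOCK_SIZE, and apply_backup are
hypothetical, and a plain set stands in for the hypervisor's dirty bitmap.

```python
BLOCK_SIZE = 4  # bytes; tiny on purpose so the example is easy to follow


class TrackedDisk:
    """A toy disk that remembers which blocks changed since the last backup."""

    def __init__(self, num_blocks):
        self.data = bytearray(num_blocks * BLOCK_SIZE)
        self.dirty = set()  # stand-in for a hypervisor's dirty bitmap

    def write(self, offset, payload):
        if not payload:
            return
        if offset < 0 or offset + len(payload) > len(self.data):
            raise ValueError("write is outside the disk")
        self.data[offset:offset + len(payload)] = payload
        # Mark every block the write touched as dirty.
        first = offset // BLOCK_SIZE
        last = (offset + len(payload) - 1) // BLOCK_SIZE
        self.dirty.update(range(first, last + 1))

    def incremental_backup(self):
        """Return {block_index: bytes} for dirty blocks, then reset tracking."""
        changed = {
            i: bytes(self.data[i * BLOCK_SIZE:(i + 1) * BLOCK_SIZE])
            for i in sorted(self.dirty)
        }
        self.dirty.clear()
        return changed


def apply_backup(image, changed):
    """Apply changed blocks on top of a bytearray copy of an earlier backup."""
    for i, block in changed.items():
        image[i * BLOCK_SIZE:(i + 1) * BLOCK_SIZE] = block
    return image


if __name__ == "__main__":
    disk = TrackedDisk(num_blocks=8)
    base = bytearray(disk.data)          # the initial full backup
    disk.write(5, b"hello")              # spans two blocks
    changed = disk.incremental_backup()  # only the touched blocks
    print(sorted(changed), apply_backup(base, changed) == disk.data)
```

The point of the sketch is that the backup never scans or hashes the whole
disk: the set of dirty blocks is already known at backup time, so only those
blocks are read and shipped.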

> - By executing all the actions related to backup (i.e. compression,
> incremental computation, upload, I/O and segmented upload to Swift), Ekko
> is adding a significant load to the Compute Nodes. All the work is done on
> the hypervisor and not taken into account by Ceilometer (or similar), so
> it is, for example, not billable. I do not think this is a good idea, as
> distributing the load over multiple components helps OpenStack to scale,
> and by leveraging the existing API you integrate better with existing
> tools.
>
The backup-agent that is proposed to exist on the compute node does not
necessarily perform the upload from the compute node, since the data may
exist on a backend like Ceph or NFS. But in the case of local storage
there is no way to get around this. Further, current nova-snapshots would
do all the same things (minus the compression) and have the same overhead.
I am not sure what you are speaking about when talking about Ceilometer.
Ekko fully plans to have a plugin for this information as well, since
Ekko is in control of all of this information. The hypervisor is doing
very, very little of "the work" and doing nothing that is intensive at all.

> - There’s no documentation whatsoever provided with Ekko. I had to read
> the source code, have conversations directly with you and invest
> significant time in it. I think providing some documentation would be
> helpful, as the doc link in the openstack/ekko repo returns 404 Not Found.
>
This is true. The repo is only a few weeks old. Give it time :). Ekko has an
informal mid-cycle planned, since all the Core contributors as of now will
be at the Kolla midcycle in Feb. We plan on documenting and presenting a
roadmap at that time.

> Please let me know what your thoughts are on this.
>
> Thanks,
> Fausto
>

Hopefully these have answered your questions.

Sam Yaple