[openstack-dev] Announcing Ekko -- Scalable block-based backup for OpenStack

Preston L. Bannister preston at bannister.us
Tue Feb 2 12:04:15 UTC 2016

To be clear, I work for EMC, and we are building a backup product for
OpenStack (which at this point is very far along). The primary lack is a
good means to efficiently extract changed-block information from OpenStack.
About a year ago I worked through the entire Nova/Cinder/libvirt/QEMU
stack, to see what was possible. The changes to QEMU (which have been
in-flight since 2011) looked most promising, but when they would land was
unclear. They are starting to land. This is big news. :)
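
To make "changed-block information" concrete, here is a toy sketch of the
idea. It is an illustration only, not QEMU's implementation: the class
name, the block size, and the in-memory "disk" are all hypothetical. It
records which blocks were written, so a later backup copies only those
blocks.

```python
# Toy model of changed-block tracking. Illustration only; this is not how
# QEMU implements dirty bitmaps, and every name here is hypothetical.
BLOCK_SIZE = 4  # bytes per block, kept tiny so the example is easy to follow


class TrackedDisk:
    """In-memory 'disk' that remembers which blocks were written."""

    def __init__(self, size):
        self.data = bytearray(size)
        self.dirty = set()  # indices of blocks written since the last backup

    def write(self, offset, payload):
        if offset < 0 or offset + len(payload) > len(self.data):
            raise ValueError("write outside the disk")
        if not payload:
            return
        self.data[offset:offset + len(payload)] = payload
        first = offset // BLOCK_SIZE
        last = (offset + len(payload) - 1) // BLOCK_SIZE
        self.dirty.update(range(first, last + 1))

    def incremental_backup(self):
        """Return {block_index: bytes} for changed blocks, then reset."""
        changed = {
            i: bytes(self.data[i * BLOCK_SIZE:(i + 1) * BLOCK_SIZE])
            for i in sorted(self.dirty)
        }
        self.dirty.clear()
        return changed
```

Without such tracking, a backup tool has to read and compare the whole
volume to find out what changed, which is the inefficiency at issue.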

That is not the end of the problem. Unless the QEMU folk are perfect, there
are likely bugs to be found when the code is put into production. (The more
exercise the code gets, the sooner any problems can be identified and
addressed.) OpenStack uses libvirt to talk to QEMU, and libvirt is a fairly
thick abstraction. Adjustments to libvirt will likely be needed. Bypassing
Nova and chatting with libvirt directly is a bit suspect (but may be
needed). There might be adjustments needed in Nova.

To offer suggestions...

Ekko is an *opinionated* approach to backup. This is not the only way to
solve the problem. I happen to very much like the approach, but as a
*specific* approach, it probably does not belong in Cinder or Nova. (I believe it was
Jay who offered a similar argument about backup more generally.)

(Keep in mind QEMU is not the only hypervisor supported by Nova, even if it
accounts for the majority of use. Would you want to attempt a design that
works for all
hypervisors? I would not!  ...at least at this point. Also, last I checked
the Cinder folk were a bit hung up on replication, as finding common
abstractions across storage was not easy. This problem looks similar.)

While wary of bypassing Nova/Cinder, my suggestion would be to be rude in the
beginning, with every intent of becoming civil in the end.

Start by talking to libvirt directly. (There was a bypass mechanism in
libvirt that looked like it might be sufficient.) Break QEMU early, and get
it fixed. :)
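
A minimal sketch of what "talking to libvirt directly" could look like,
under assumptions not stated above: that the bypass in question is
libvirt's QEMU monitor passthrough (qemuMonitorCommand in the libvirt_qemu
Python module), and that the QEMU side is the dirty-bitmap work, driven by
the QMP commands block-dirty-bitmap-add and drive-backup with
sync=incremental as documented for QEMU 2.4. The helper names and the
qcow2 target format are hypothetical choices.

```python
import json


def bitmap_add_command(node, bitmap):
    """Build the QMP command that starts tracking writes to `node`."""
    return json.dumps({
        "execute": "block-dirty-bitmap-add",
        "arguments": {"node": node, "name": bitmap},
    })


def incremental_backup_command(device, bitmap, target):
    """Build the QMP command that copies only blocks marked in `bitmap`.

    Assumes `target` is an existing qcow2 image (mode "existing").
    """
    return json.dumps({
        "execute": "drive-backup",
        "arguments": {
            "device": device,
            "bitmap": bitmap,
            "target": target,
            "format": "qcow2",
            "sync": "incremental",
            "mode": "existing",
        },
    })


def send_via_libvirt(domain, command):
    """Pass a raw QMP command to QEMU through libvirt, bypassing Nova.

    `domain` is a libvirt virDomain object. The import is deferred so the
    builders above can be used without libvirt-python installed.
    """
    import libvirt_qemu
    return libvirt_qemu.qemuMonitorCommand(domain, command, 0)
```

Keeping the command construction separate from the transport means the
same payloads could later go through a proper libvirt or Nova API, if one
appears, without changing the backup logic.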

When QEMU usage is working, talk to the libvirt folk about *proven* needs,
and what is needed to become civil.

When libvirt is updated (or not), talk to Nova folk about *proven* needs,
and what is needed to become civil. (Perhaps simply awareness, or a small
set of primitives.)

It might take quite a while for the latest QEMU and libvirt to ripple
through into OpenStack distributions. Getting any fixes into QEMU early (or
addressing discovered gaps in needed function) seems like a good thing.

All the above is a sufficiently ambitious project, just by itself. To my
mind, that justifies Ekko as a unique, focused project.

On Mon, Feb 1, 2016 at 4:28 PM, Sam Yaple <samuel at yaple.net> wrote:

> On Mon, Feb 1, 2016 at 10:32 PM, Fausto Marzi <fausto.marzi at gmail.com>
> wrote:
>> Hi Preston,
>> Thank you. You saw Fabrizio in Vancouver, I'm Fausto, but it's all right,
>> :P
>> The challenge is interesting. If we want to build a dedicated backup API
>> service (which is always what we wanted to do), probably we need to:
>>    - Move the backup features out of Nova and Cinder, as it wouldn't
>>    make much sense to me to have a Backup service and also have backups
>>    managed independently by Nova and Cinder.
>> That said, I'm not a big fan of the following:
>>    - Interacting with the hypervisors and the volumes directly without
>>    passing through the Nova and Cinder API.
> Passing through the api will be a huge issue for extracting data due to
> the sheer volume of data needed (TB through the api is going to kill
> everything!)
>>    - Adding any additional workload on the compute nodes or block
>>    storage nodes.
>>    - Computing incremental, compression, encryption is expensive. Having
>>    many simultaneous processes doing that may lead to bad behaviours on core
>>    services.
> These are valid concerns, but the alternative is still shipping the raw
> data elsewhere to do this work, and that has its own issue in terms of
> bandwidth.
>> My (flexible) thoughts are:
>>    - The feature is needed and is brilliant.
>>    - We should probably implement the newest features provided by the
>>    hypervisor in Nova and export them from the Nova API.
>>    - Create a plugin that is integrated with Freezer to leverage those
>>    new features.
>>    - The same applies to Cinder.
>>    - The VM and Volume backup feature is already provided by Nova,
>>    Cinder and Freezer. It certainly needs a lot of improvement, but do we need
>>    to create a new project for a feature that needs to be improved, rather
>>    than work with the existing Teams?
> I disagree with this statement strongly as I have stated before. Nova has
> snapshots. Cinder has snapshots (though they do say cinder-backup). Freezer
> wraps Nova and Cinder. Snapshots are not backups. They are certainly not
> _incremental_ backups. They can have neither compression, nor encryption.
> With this in mind, Freezer does not have this "feature" at all. It's not
> that it needs improvement, it simply does not exist in Freezer. So a
> separate project dedicated to that one goal is not unreasonable. The real
> question is whether it is practical to merge Freezer and Ekko, and this is
> the question Ekko and the Freezer team are attempting to answer.
>>    - No one wants to block others. Sam's proposed solution is indeed
>>    remarkable, but this is OpenStack, we work in Teams. Why can't we do that
>>    and be less fragmented?
>> Thanks,
>> Fausto
> Sam Yaple