[openstack-dev] Announcing Ekko -- Scalable block-based backup for OpenStack

Sam Yaple samuel at yaple.net
Wed Feb 3 17:06:53 UTC 2016


On Wed, Feb 3, 2016 at 4:53 PM, Preston L. Bannister <preston at bannister.us>
wrote:

> On Wed, Feb 3, 2016 at 6:32 AM, Sam Yaple <samuel at yaple.net> wrote:
>
>> [snip]
>>
>> Full backups are costly in terms of IO, storage, bandwidth and time. A
>> full backup being required in a backup plan is a big problem for backups
>> when we talk about volumes that are terabytes large.
>>
>
> As an incidental note...
>
> You have to collect full backups, periodically. To do otherwise assumes *absolutely
> no failures* anywhere in the entire software/hardware stack -- ever --
> and no failures in storage over time. (Which collectively is a tad
> optimistic, at scale.) Whether due to a rare software bug, a marginal piece
> of hardware, or a stray cosmic ray - an occasional bad block will slip
> through.
>

A new full can be triggered at any time should there be concern of a
problem. (see my next point)

>
> More exactly, you need some means of doing occasional full end-to-end
> verification of stored backups. Periodic full backups are one
> safeguard. How you go about performing full verification, and how often is
> a subject for design and optimization. This is where things get a *bit*
> more complex. :)
>

Yes, an end-to-end verification of the backup would be easy to implement,
but costly to run. That's more on the user to decide, though. With a proper
scheduler this is less an issue for Ekko and more a backup-policy issue.
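
For illustration, a scheduler could treat "take a new full" and "verify the
chain" as policy decisions layered on top of the backup engine. A rough
sketch (all names and intervals here are hypothetical, not Ekko's actual
interface):

from datetime import datetime, timedelta


class BackupPolicy:
    """Decides when a verification pass (or a fresh full) is due."""

    def __init__(self, verify_every=timedelta(days=30),
                 full_every=timedelta(days=90)):
        self.verify_every = verify_every
        self.full_every = full_every

    def next_action(self, last_full, last_verify, now=None):
        now = now or datetime.utcnow()
        if now - last_full >= self.full_every:
            return "full"         # start a new full backup chain
        if now - last_verify >= self.verify_every:
            return "verify"       # run an end-to-end verification
        return "incremental"      # the cheap default

How aggressively verify_every is set is exactly the cost/confidence
trade-off being discussed; the engine itself doesn't have to care.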

>
> Or you just accept a higher error rate. (How high depends on the
> implementation.)
>

And it's not a full loss, it's just not a 100% valid backup. Luckily you've
only lost a single segment (a few thousand sectors); chances are the
critical data you want isn't there, and the rest can still be recovered.
Object storage with replication also makes it very, very hard to lose data
when properly maintained (consider how little data S3 has lost over time).
We already have checksum/hash verification in place, so the underlying data
must be valid or we don't restore. But your points are well received.
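
Conceptually, that restore-time check amounts to something like the
following sketch (the manifest-entry shape is hypothetical; Ekko's actual
format may differ):

import hashlib


def restore_segment(entry, data):
    """Refuse to restore a segment whose recomputed hash doesn't match
    the hash recorded at backup time.

    entry: a manifest record like {"id": ..., "sha256": ...}
    data:  the segment bytes fetched from object storage
    """
    if hashlib.sha256(data).hexdigest() != entry["sha256"]:
        raise IOError("segment %s failed checksum verification"
                      % entry["id"])
    return data

So a corrupted segment is caught and reported rather than silently restored.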

>
> And "Yes", multi-terabyte volumes *are* a challenge.
>

And increasingly common...