[openstack-dev] [Swift] Erasure coding and geo replication

Mark Kirkwood mark.kirkwood at catalyst.net.nz
Wed Apr 20 05:47:32 UTC 2016


Hi,

Has the release of 2.7 significantly changed the assessment here?

Thanks

Mark

On 15/02/16 23:29, Kota TSUYUZAKI wrote:
> Hello Mark,
>
> AFAIK, there are a few reasons why erasure code + geo replication is still a work in progress.
>
>>> and expect to survive a region outage...
>>>
>>> With that in mind I did some experiments (Liberty Swift), and it looks to me like if you have:
>>>
>>> - num_data_frags < num_nodes in (smallest) region
>>>
>>> and:
>>>
>>> - num_parity_frags = num_data_frags
>>>
>>>
>>> then having a region fail does not result in service outage.
>
> Good point, but note that PyECLib v1.0.7 (pinned to Kilo/Liberty stable) still has a problem: it cannot decode the original data when all of the fragments fed to it are parity fragments [1]. (i.e. if you set
> num_parity_frags = num_data_frags and only parity fragments reach the proxy for a GET request, it will fail at decoding.) The problem has already been resolved on the PyECLib/liberasurecode master
> branch, and current Swift master depends on PyECLib>=1.0.7, so if you intend to use the newest Swift it might not be
> a problem.
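The failure mode Kota describes can be sketched with PyECLib directly: encode with k = m = 4, then hand the decoder only the four parity fragments. On a fixed liberasurecode/PyECLib this round-trips; on the version pinned to Kilo/Liberty the decode raised an error. This is an illustration, not Swift's internal code path, and the backend name `liberasurecode_rs_vand` is an assumption (any Reed-Solomon ec_type would do):

```python
# Hedged sketch of the parity-only decode described in [1].
# k = m = 4; feed the decoder only parity fragments.
try:
    from pyeclib.ec_iface import ECDriver

    driver = ECDriver(k=4, m=4, ec_type='liberasurecode_rs_vand')
    fragments = driver.encode(b'some object data' * 100)
    parity_only = fragments[4:]            # discard all 4 data fragments
    decoded = driver.decode(parity_only)   # the step that used to fail
    assert decoded == b'some object data' * 100
    outcome = 'decoded from parity-only fragments'
except ImportError:
    outcome = 'pyeclib not installed; illustration only'
except Exception as exc:  # older PyECLib raises here, as in the bug report
    outcome = 'decode failed: %s' % exc

print(outcome)
```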
>
> From the Swift perspective, I think we need more tests/discussion around geo replication: write/read affinity [2], which is the geo replication mechanism in Swift itself, and performance.
>
> For write/read affinity: we deliberately left affinity control out, to simplify the implementation, until EC landed in Swift master [3]. So I think it's now time to work out how
> affinity control can be used with EC, but that is not done yet.
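For reference, the affinity control in question is configured in the proxy server for replicated clusters; whether these options behave sensibly with an EC policy is exactly the open question above. A sketch of the documented options [2], assuming a two-region cluster where region 1 is local (values are illustrative):

```ini
# proxy-server.conf -- read/write affinity for a geo cluster.
# Interaction with EC policies is the untested part Kota mentions.
[app:proxy-server]
use = egg:swift#proxy
sorting_method = affinity
# Prefer region 1 for reads (lower value = higher priority).
read_affinity = r1=100
# Write copies to region 1 first; replicate to other regions asynchronously.
write_affinity = r1
write_affinity_node_count = 2 * replicas
```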
>
> From the performance perspective, in my experiments more parity fragments cause quite significant performance degradation [4]. To prevent that degradation, I am working on a spec that makes duplicated copies of the
> data/parity fragments and spreads them out across geo regions.
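A back-of-envelope reading of that duplication scheme (my interpretation, not the spec's exact layout): if each region stores a full copy of all k + m fragments, reads stay region-local and any single region can fail, at the cost of multiplying the storage overhead by the region count:

```python
# Storage cost of duplicating a full k + m fragment set per region,
# in bytes stored per byte of object data. Illustrative only.
def storage_overhead(k, m, regions):
    """Each region holds its own complete set of k + m fragments."""
    return regions * (k + m) / k

# k = m = 4 across two regions: 4.0x, versus 2.0x for a single
# EC set spread over the same two regions.
print(storage_overhead(4, 4, 2))
```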
>
> To summarize, we haven't finished the work yet, but discussion and contributions around EC + geo replication are welcome anytime, IMO.
>
> Thanks,
> Kota
>
> 1: https://bitbucket.org/tsg-/liberasurecode/commits/a01b1818c874a65d1d1fb8f11ea441e9d3e18771
> 2: http://docs.openstack.org/developer/swift/admin_guide.html#geographically-distributed-clusters
> 3: http://docs.openstack.org/developer/swift/overview_erasure_code.html#region-support
> 4: https://specs.openstack.org/openstack/swift-specs/specs/in_progress/global_ec_cluster.html
>
>
>
> (2016/02/15 18:00), Mark Kirkwood wrote:
>> After looking at:
>>
>> https://www.youtube.com/watch?v=9YHvYkcse-k
>>
>> I have a question (that follows on from Bruno's) about using erasure coding with geo replication.
>>
>> Now the example given to show why you could/should not use erasure coding with geo replication is somewhat flawed, as it is immediately clear that you cannot set:
>>
>> - num_data_frags > num_devices (or nodes) in a region
>>
>> and expect to survive a region outage...
>>
>> With that in mind I did some experiments (Liberty Swift), and it looks to me like if you have:
>>
>> - num_data_frags < num_nodes in (smallest) region
>>
>> and:
>>
>> - num_parity_frags = num_data_frags
>>
>>
>> then having a region fail does not result in service outage.
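Mark's conditions can be checked with simple fragment counting: an object is decodable iff at least num_data_frags of its num_data + num_parity fragments remain reachable after the failure. A minimal sketch (function name and even fragment split are my assumptions, not Swift's placement logic):

```python
# Does an EC object survive the loss of any one region?
# frags_per_region: how many of the object's fragments each region holds.
def survives_region_loss(num_data, num_parity, frags_per_region):
    """Decodable iff >= num_data fragments remain after any region fails."""
    total = num_data + num_parity
    assert sum(frags_per_region) == total
    return all(total - lost >= num_data for lost in frags_per_region)

# Mark's scheme: num_parity_frags = num_data_frags, split over two regions.
print(survives_region_loss(4, 4, [4, 4]))   # survives: 4 fragments remain, 4 needed
# The talk's counter-example: num_data_frags > nodes in a region.
print(survives_region_loss(6, 2, [4, 4]))   # fails: only 4 remain, 6 needed
```

Note this only covers read availability; writes during the outage still depend on the write affinity behaviour discussed above.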
>>
>> So my real question is - it looks like it *is* possible to use erasure coding in geo replicated situations - however I may well be missing something significant, so I'd love some clarification here [1]!
>>
>> Cheers
>>
>> Mark
>>
>> [1] Reduction in disk usage and net traffic looks attractive
>>
>>
>>
>
>



