[openstack-dev] [Swift] Design note of geo-distributed Swift cluster

YUZAWA Takahiko yuzawataka at intellilink.co.jp
Fri Mar 1 12:34:30 UTC 2013


Oleg-san

Thank you for your reply. I think your idea about inter-region 
replication is very interesting.

However, we think that separating the namespace in a geo-distributed
cluster (your option #3) is not a good first step.

So we now have another idea for a geo-distributed cluster, intended to
meet the requirement that no proxy-server connects directly to the
storage servers of foreign regions.

Basic concept:
  We would like to introduce a 'proxy ring' that contains at least the
locations of the proxy-servers belonging to each region. Swift servers
can look up the proxy-servers of each region in the 'proxy ring'.
  When the local proxy-server needs to manipulate objects in a foreign
region, it connects directly to a proxy-server of that region (not to
the foreign storage-servers).
The swift-proxy-server may appear to work like a 'web forward proxy' for
the proxy-servers of the other regions.
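
To make the 'proxy ring' idea concrete, here is a minimal sketch of one
possible shape for it. This is an illustration only, not the PoC patch
and not an existing Swift API: the ProxyRing class, its get_proxy()
method, the round-robin selection, and the example host names are all
assumptions.

```python
from itertools import cycle


class ProxyRing:
    """Hypothetical 'proxy ring': maps a region id to its proxy endpoints."""

    def __init__(self, region_to_proxies):
        # region_to_proxies: {region_id: [(host, port), ...]}
        if any(not proxies for proxies in region_to_proxies.values()):
            raise ValueError("every region needs at least one proxy-server")
        self._proxies = {r: list(p) for r, p in region_to_proxies.items()}
        # One round-robin cursor per region (an assumed selection policy).
        self._cursors = {r: cycle(p) for r, p in self._proxies.items()}

    def regions(self):
        """Return the known region ids in sorted order."""
        return sorted(self._proxies)

    def get_proxy(self, region):
        """Return the next proxy endpoint of `region`, round-robin.

        Raises KeyError if the region is not in the ring.
        """
        return next(self._cursors[region])


# Example ring with made-up endpoints for two regions.
ring = ProxyRing({
    1: [("proxy1.r1.example", 8080)],
    2: [("proxy1.r2.example", 8080), ("proxy2.r2.example", 8080)],
})
```

A local proxy-server in region 1 would call ring.get_proxy(2) to choose
a region 2 proxy-server to relay to, instead of looking up region 2
storage nodes in the object ring.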

If the 'proxy ring' were introduced, we think we could implement
geo-distributed clusters based on 'region-tier' and 'proxy affinity'
without a separated namespace.

The following is the basic behavior of the proxy-server.

GET:
  The proxy-server gets objects from storage servers in the local
region whenever possible. If none of the primary nodes is local, the
proxy-server relays the request to a proxy-server of another region
(one that is included in the primary nodes).

PUT:
  The proxy-server always stores the object on storage-servers in the
local region, and the object-replicator then replicates it to the other
regions. (same as before)

DELETE:
  The proxy-server inspects the primary nodes. For primary nodes in the
local region, it sends the request to the local storage servers as
usual. For primary nodes in another region, it relays the request to a
proxy-server of that region.
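
The three behaviors above can be sketched as a single routing decision.
This is a rough illustration under assumptions, not the code from the
PoC patch: plan_request() is a hypothetical helper, nodes are modeled
as dicts with a 'region' key, and for GET the relay regions are treated
as candidates to try in order, which is my own simplification.

```python
def plan_request(method, primary_nodes, local_region):
    """Decide which primaries to contact directly and which regions to relay to.

    Returns {"local": [...nodes...], "relay": [...region ids...]}.
    """
    local = [n for n in primary_nodes if n["region"] == local_region]
    foreign_regions = sorted(
        {n["region"] for n in primary_nodes if n["region"] != local_region}
    )

    if method == "GET":
        if local:
            # Prefer local storage servers whenever a local primary exists.
            return {"local": local, "relay": []}
        # No local primary: relay to a proxy-server of a region holding one.
        return {"local": [], "relay": foreign_regions}

    if method == "PUT":
        # Always store locally; the object-replicator moves data to other
        # regions later. An empty "local" list means no primary is local;
        # how local nodes are then chosen is outside this sketch.
        return {"local": local, "relay": []}

    if method == "DELETE":
        # Local primaries are contacted directly; every foreign region
        # holding a primary gets a relayed request via its proxy-server.
        return {"local": local, "relay": foreign_regions}

    raise ValueError("unsupported method: %s" % method)


# Example: one primary in region 1, two primaries in region 2.
nodes = [
    {"region": 1, "ip": "10.0.1.1"},
    {"region": 2, "ip": "10.0.2.1"},
    {"region": 2, "ip": "10.0.2.2"},
]
delete_plan = plan_request("DELETE", nodes, local_region=1)
```

In this example delete_plan names the single region 1 node for a direct
request and region 2 for a relayed request; the region 2 proxy-server
would be chosen from the 'proxy ring'.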


As a PoC, I made a brief patch against swift-1.7.6 so that we can check
that the idea works (the proxy ring is not implemented yet, and neither
is the handling of accounts and containers).

https://github.com/yuzawataka/swift/commit/1381b0713d8676ac1f6a2e48c55264037935e96a

Any suggestions would be appreciated.

Thank you.

-- 
Best regards,
YUZAWA Takahiko
NTTDATA INTELLILINK

(2013/02/27 22:45), Oleg Gelbukh wrote:
> Hello, Adrian, Yuzawa-san
>
> We are still pursuing the global ring implementation with minimal
> changes to replication algorithm. However, it has a number of
> drawbacks. Some of them were obvious from the very beginning (for
> example, a need to tweak rebalance to minimize data transfers between
> regions, or operational overhead required to distribute ring files in
> multi-region environment), others have been made visible by this very
> discussion.
>
> We identified 3 basic ways to implement the inter-region replication:
>
> 1. introduce replicator affinity, which in general resembles proxy
> affinity in a sense that the original replicator handles replication
> to devices from local and foreign regions differently. For example,
> limit the number of REPLICATE calls to foreign regions to one in ten
> replicator runs and only connect single foreign region server in a
> single run. This is an approach we are going to take in the first
> iteration.
>
> 2. implement separate replicator process for cross-region replication.
> The original replicator handles replication to devices in local region
> and ignores devices in foreign regions, while region-replicator acts
> symmetrically ignoring local devices. This approach is basically an
> extension of the first, but allows to isolate changes from the core
> code.
>
> 3. create replicator-server to sit on the edge of region's replication
> network (or storage network if replication network is not used) and
> control replication to foreign regions. This server won't store any
> data, only database of hashes in a sort of 'ring-of-rings', used to
> determine if replication to foreign region required.
> In this case, global namespace will move to that 'ring-of-rings', and
> for inside-region replication, standard ring is used.
> Replication-server represented as special device with very large
> 'weight' parameter, to get information about replicas in local cluster
> from standard replicators. This server will also have to 'proxy'
> replication traffic when it detects that partition is modified in
> local cluster.
> Unlike #1 and #2, this option supports only replication between
> regions, no proxy-servers can talk to storage servers in a foreign
> region. However, it can allow more sophisticated algorithms for
> inter-regions replication.
>
> --
> Best regards,
> Oleg Gelbukh
> Mirantis, Inc.
>




