[openstack-dev] [Swift] Design note of geo-distributed Swift cluster
YUZAWA Takahiko
yuzawataka at intellilink.co.jp
Fri Mar 1 12:34:30 UTC 2013
Oleg-san
Thank you for your reply. I think your idea about inter-region
replication is very interesting.
However, as a first step we think it is not good to separate the
namespace in a geo-distributed cluster (your option #3).
So we now have another idea for a geo-distributed cluster that meets
the requirement that no proxy-server connects directly to storage
servers in foreign regions.
Basic concept:
We would like to introduce a 'proxy ring' that contains at least the
proxy-servers corresponding to each region. Through the 'proxy ring', a
Swift server can find the location of the proxy-servers of every region.
When a local proxy-server needs to manipulate objects in a foreign
region, it connects directly to a proxy-server of that region (not to
the foreign storage servers).
In effect, swift-proxy-server works like a 'web forward proxy' for the
proxy-servers of the other regions.
If the 'proxy ring' is introduced, we think we can implement
geo-distributed clusters based on 'region-tier' and 'proxy affinity'
without a separated namespace.
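
As a rough illustration only (this is not taken from the patch), a
'proxy ring' could be as small as a mapping from region to proxy-server
endpoints. The class name ProxyRing, its methods, the integer region
ids, and the round-robin selection are all assumptions made for this
sketch:

```python
from itertools import cycle
from typing import Dict, List, Tuple

Endpoint = Tuple[str, int]  # (ip, port) of a proxy-server


class ProxyRing:
    """Hypothetical 'proxy ring': region id -> proxy-server endpoints."""

    def __init__(self, proxies_by_region: Dict[int, List[Endpoint]]):
        for region, endpoints in proxies_by_region.items():
            if not endpoints:
                raise ValueError("region %r has no proxy-servers" % region)
        # Copy the lists so later changes by the caller do not affect us.
        self._proxies = {r: list(e) for r, e in proxies_by_region.items()}
        # One endless round-robin iterator per region (an assumption;
        # any selection policy could be used here).
        self._cursors = {r: cycle(e) for r, e in self._proxies.items()}

    def regions(self) -> List[int]:
        """Return all region ids known to the ring, sorted."""
        return sorted(self._proxies)

    def get_proxy(self, region: int) -> Endpoint:
        """Return the next proxy-server endpoint for the given region."""
        if region not in self._cursors:
            raise KeyError("no proxy-servers known for region %r" % region)
        return next(self._cursors[region])
```

A local proxy-server would call get_proxy(region) whenever it has to
reach a foreign region, instead of looking up that region's storage
servers.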
The following is the basic behavior of the proxy-server.
GET:
The proxy-server gets objects from storage servers in the local region
as far as possible. If none of the primary nodes is local, the
proxy-server relays the request to the proxy-server of another region
(one that holds primary nodes).
PUT:
The proxy-server always stores the object on storage servers in the
local region, and the object-replicator replicates it to the other
regions. (Same as before.)
DELETE:
The proxy-server inspects the primary nodes. If local nodes are found,
it sends the request to the local storage servers as usual. If nodes in
another region are found, it relays the request to the proxy-server of
that region.
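
The GET/PUT/DELETE rules above can be sketched as a small decision
function. This is only an illustration, not the code in the patch:
plan_request, the LOCAL/RELAY labels, and the assumption that every
primary node is a dict with a 'region' key are all hypothetical. For
GET, picking the lowest-numbered foreign region is an arbitrary choice
made for the sketch.

```python
from typing import Dict, List, Tuple

LOCAL = "local"  # talk to storage servers in the local region
RELAY = "relay"  # forward the request to a foreign region's proxy-server

Action = Tuple[str, int]  # (LOCAL or RELAY, region id)


def plan_request(method: str, primary_nodes: List[Dict],
                 local_region: int) -> List[Action]:
    """Return every action the local proxy-server should perform."""
    has_local = any(n["region"] == local_region for n in primary_nodes)
    foreign = sorted({n["region"] for n in primary_nodes
                      if n["region"] != local_region})

    if method == "GET":
        # Prefer local storage; relay only when no primary node is local.
        if has_local:
            return [(LOCAL, local_region)]
        return [(RELAY, foreign[0])] if foreign else []

    if method == "PUT":
        # Always write locally; the object-replicator moves data later.
        return [(LOCAL, local_region)]

    if method == "DELETE":
        # Delete locally if there are local primaries, and relay to the
        # proxy-server of each foreign region that holds a primary.
        plan = [(LOCAL, local_region)] if has_local else []
        plan.extend((RELAY, region) for region in foreign)
        return plan

    raise ValueError("unsupported method: %r" % method)
```

For example, with primary nodes in regions 1, 2 and 2 and a proxy-server
in region 1, a DELETE yields one local action plus one relay to region 2.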
For a PoC, I made a brief patch against swift-1.7.6 so that we can
check that the idea works (the proxy ring is not implemented yet, and
handling of accounts and containers is not implemented either).
https://github.com/yuzawataka/swift/commit/1381b0713d8676ac1f6a2e48c55264037935e96a
Any suggestions would be appreciated.
Thank you.
--
Best regards,
YUZAWA Takahiko
NTTDATA INTELLILINK
(2013/02/27 22:45), Oleg Gelbukh wrote:
> Hello, Adrian, Yuzawa-san
>
> We are still pursuing the global ring implementation with minimal
> changes to replication algorithm. However, it has a number of
> drawbacks. Some of them were obvious from the very beginning (for
> example, a need to tweak rebalance to minimize data transfers between
> regions, or operational overhead required to distribute ring files in
> multi-region environment), others have been made visible by this very
> discussion.
>
> We identified 3 basic ways to implement the inter-region replication:
>
> 1. introduce replicator affinity, which in general resembles proxy
> affinity in the sense that the original replicator handles replication
> to devices from local and foreign regions differently. For example,
> limit the number of REPLICATE calls to foreign regions to one in ten
> replicator runs and only connect to a single foreign-region server in
> a single run. This is the approach we are going to take in the first
> iteration.
>
> 2. implement separate replicator process for cross-region replication.
> The original replicator handles replication to devices in local region
> and ignores devices in foreign regions, while region-replicator acts
> symmetrically ignoring local devices. This approach is basically an
> extension of the first, but allows to isolate changes from the core
> code.
>
> 3. create replicator-server to sit on the edge of region's replication
> network (or storage network if replication network is not used) and
> control replication to foreign regions. This server won't store any
> data, only database of hashes in a sort of 'ring-of-rings', used to
> determine if replication to foreign region required.
> In this case, global namespace will move to that 'ring-of-rings', and
> for inside-region replication, standard ring is used.
> Replication-server represented as special device with very large
> 'weight' parameter, to get information about replicas in local cluster
> from standard replicators. This server will also have to 'proxy'
> replication traffic when it detects that partition is modified in
> local cluster.
> Unlike #1 and #2, this option supports only replication between
> regions; no proxy-server can talk to a storage server in a foreign
> region. However, it can allow more sophisticated algorithms for
> inter-regions replication.
>
> --
> Best regards,
> Oleg Gelbukh
> Mirantis, Inc.
>