<div dir="ltr">Yuzawa-san,<div><br></div><div>Your concept aligns very well with the concept of replication based on container sync, modified to be service-wide, which we've seen with one of our customers. I'm really excited to see that development of the multi-region feature is going in several directions, as that is beneficial for Swift and the whole ecosystem.</div>
<div><br></div><div style>It appears that the 'proxy ring' is a separate database in addition to the global multi-region ring. You could consider including information about region proxy servers in the main ring database instead, for example, as a device dict entry.</div>
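<div style><br></div><div style>A minimal sketch of that suggestion, in Python. The 'proxies' key, both helper functions, and the example URLs are hypothetical and only illustrate the idea; they are not part of Swift's ring format.</div>
<pre>
```python
# Sketch only: the 'proxies' key is a hypothetical extension of the
# device dict and is not part of Swift's ring format.

def make_device(dev_id, region, zone, ip, port, device, weight, proxies=None):
    """Build a ring-style device dict, optionally carrying proxy endpoints."""
    dev = {
        'id': dev_id,
        'region': region,
        'zone': zone,
        'ip': ip,
        'port': port,
        'device': device,
        'weight': weight,
        'meta': '',
    }
    if proxies:
        dev['proxies'] = list(proxies)  # hypothetical extra entry
    return dev


def proxies_by_region(devs):
    """Collect the proxy endpoints advertised for each region, no duplicates."""
    result = {}
    for dev in devs:
        if dev is None:  # skip empty slots in the device list
            continue
        seen = result.setdefault(dev['region'], [])
        for endpoint in dev.get('proxies', []):
            if endpoint not in seen:
                seen.append(endpoint)
    return result


devs = [
    make_device(0, 1, 1, '10.0.0.1', 6000, 'sda', 100.0,
                ['http://proxy-r1.example.com:8080']),
    None,
    make_device(1, 2, 1, '10.1.0.1', 6000, 'sda', 100.0,
                ['http://proxy-r2.example.com:8080']),
]
region_proxies = proxies_by_region(devs)
```
</pre>
<div style>With something like this, a proxy-server could look up a foreign region's proxy endpoints from the same ring file it already loads, so there would be no second ring to build and distribute.</div>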
<div style><br></div><div style>P.S. Any chance of meeting you at the Summit in Portland?</div><div style><br></div><div style>--</div><div style>Best regards,</div><div style>Oleg Gelbukh</div><div style>Mirantis Inc.</div></div>
<div class="gmail_extra"><br><br><div class="gmail_quote">On Fri, Mar 1, 2013 at 4:34 PM, YUZAWA Takahiko <span dir="ltr">&lt;<a href="mailto:yuzawataka@intellilink.co.jp" target="_blank">yuzawataka@intellilink.co.jp</a>&gt;</span> wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">Oleg-san<br>
<br>
Thank you for your reply. I think your idea about inter-region replication is very interesting.<br>
<br>
But we think it is not good to separate the namespace in a geo-distributed cluster (regarding #3) as a first step.<br>
<br>
So we now have another idea for a geo-distributed cluster, one that meets the requirement that no proxy-server connects directly to storage servers in foreign regions.<br>
<br>
Basic concept:<br>
We would like to introduce a 'proxy ring' that contains, at a minimum, the proxy-servers that correspond to each region. A Swift server can learn the location of each region's proxy-servers from the 'proxy ring'.<br>
The local proxy-server then connects directly to the foreign region's proxy-server (not to the foreign storage-servers) when it needs to manipulate objects in a foreign region.<br>
It may seem that swift-proxy-server works like a 'web forward proxy' for the other regions' proxy-servers.<br>
<br>
If a 'proxy ring' were introduced, we think we could implement geo-distributed clusters based on 'region-tier' and 'proxy affinity' without a separate namespace.<br>
<br>
The following is the basic behavior of the proxy-server.<br>
<br>
GET:<br>
The proxy-server gets objects from storage servers in the local region whenever possible. If none of the primary nodes are local, the proxy-server relays the request to the proxy-server of another region (one that holds a primary node).<br>
<br>
PUT:<br>
The proxy-server always stores the object on storage-servers in the local region, and the object-replicator then replicates it to the other region (same as before).<br>
<br>
DELETE:<br>
The proxy-server inspects the primary nodes. For nodes in the local region, it sends the request to the local storage servers as usual. For nodes in another region, it relays the request to that region's proxy-server.<br>
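<br>
The GET, PUT, and DELETE rules above can be sketched in Python as follows. This is an illustration only, not the code from the patch: the function name, the 'local'/'relay' labels, and the node dicts are assumptions, and choosing local handoff nodes for a PUT when no primary node is local is not modeled.<br>
<pre>
```python
# Illustration only: names and data shapes are assumptions, not Swift code.
LOCAL, RELAY = 'local', 'relay'


def plan_request(method, primary_nodes, local_region):
    """Return an ordered list of (action, target) pairs for one request.

    LOCAL targets are storage nodes in the local region.
    RELAY targets are foreign region ids whose proxy-server gets the request.
    """
    local = [n for n in primary_nodes if n['region'] == local_region]
    foreign_regions = []
    for node in primary_nodes:
        region = node['region']
        if region != local_region and region not in foreign_regions:
            foreign_regions.append(region)

    local_plan = [(LOCAL, n) for n in local]
    relay_plan = [(RELAY, r) for r in foreign_regions]

    if method == 'GET':
        # Prefer local replicas; relay only when no primary node is local.
        return local_plan or relay_plan
    if method == 'PUT':
        # Store locally; the object-replicator copies across regions later.
        return local_plan
    if method == 'DELETE':
        # Local nodes are contacted directly, foreign ones via their proxy.
        return local_plan + relay_plan
    raise ValueError('unsupported method: %s' % method)


nodes = [{'id': 0, 'region': 1}, {'id': 1, 'region': 2}, {'id': 2, 'region': 2}]
delete_plan = plan_request('DELETE', nodes, 1)
```
</pre>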
<br>
<br>
As a PoC, I made a small patch against swift-1.7.6 so that we can check that the idea works (the proxy ring is not implemented yet, and neither is the handling of accounts and containers).<br>
<br>
<a href="https://github.com/yuzawataka/swift/commit/1381b0713d8676ac1f6a2e48c55264037935e96a" target="_blank">https://github.com/yuzawataka/swift/commit/1381b0713d8676ac1f6a2e48c55264037935e96a</a><br>
<br>
Any suggestions would be appreciated.<br>
<br>
Thank you.<div class="im HOEnZb"><br>
<br>
-- <br>
Best regards,<br>
YUZAWA Takahiko<br>
NTTDATA INTELLILINK<br>
<br></div><div class="HOEnZb"><div class="h5">
(2013/02/27 22:45), Oleg Gelbukh wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
Hello, Adrian, Yuzawa-san<br>
<br>
We are still pursuing the global ring implementation with minimal<br>
changes to the replication algorithm. However, it has a number of<br>
drawbacks. Some of them were obvious from the very beginning (for<br>
example, the need to tweak rebalance to minimize data transfers between<br>
regions, or the operational overhead of distributing ring files in a<br>
multi-region environment); others have been made visible by this very<br>
discussion.<br>
<br>
We have identified three basic ways to implement inter-region replication:<br>
<br>
1. Introduce replicator affinity, which in general resembles proxy<br>
affinity in the sense that the original replicator handles replication<br>
to devices in local and foreign regions differently. For example,<br>
limit REPLICATE calls to foreign regions to one in ten<br>
replicator runs, and connect to only a single foreign-region server in a<br>
single run. This is the approach we are going to take in the first<br>
iteration.<br>
<br>
2. Implement a separate replicator process for cross-region replication.<br>
The original replicator handles replication to devices in the local region<br>
and ignores devices in foreign regions, while the region-replicator acts<br>
symmetrically, ignoring local devices. This approach is basically an<br>
extension of the first, but allows the changes to be isolated from the core<br>
code.<br>
<br>
3. Create a replicator-server that sits on the edge of the region's replication<br>
network (or storage network if a replication network is not used) and<br>
controls replication to foreign regions. This server won't store any<br>
data, only a database of hashes in a sort of 'ring-of-rings', used to<br>
determine whether replication to a foreign region is required.<br>
In this case, the global namespace moves to that 'ring-of-rings', and<br>
the standard ring is used for replication inside a region.<br>
The replicator-server is represented as a special device with a very large<br>
'weight' parameter, so that it gets information about replicas in the local cluster<br>
from the standard replicators. This server will also have to 'proxy'<br>
replication traffic when it detects that a partition has been modified in<br>
the local cluster.<br>
Unlike #1 and #2, this option supports only replication between<br>
regions; no proxy-server can talk to a storage server in a foreign<br>
region. However, it can allow more sophisticated algorithms for<br>
inter-region replication.<br>
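<br>
The throttling example given for option #1 above could be sketched in Python as follows.<br>
This is a rough illustration under stated assumptions, not the planned implementation:<br>
the function name, the FOREIGN_INTERVAL constant, and the node dicts are hypothetical.<br>
<pre>
```python
import random

# Hypothetical constant: contact foreign regions on one in ten runs.
FOREIGN_INTERVAL = 10


def targets_for_run(run_number, nodes, local_region, rng=random):
    """Pick replication targets for a single replicator run.

    Local nodes are always included. On every FOREIGN_INTERVAL-th run,
    exactly one foreign-region node is added as well.
    """
    local = [n for n in nodes if n['region'] == local_region]
    foreign = [n for n in nodes if n['region'] != local_region]
    targets = list(local)
    if foreign and run_number % FOREIGN_INTERVAL == 0:
        targets.append(rng.choice(foreign))  # a single foreign server per run
    return targets


nodes = [{'id': 0, 'region': 1}, {'id': 1, 'region': 2}, {'id': 2, 'region': 3}]
tenth_run = targets_for_run(10, nodes, 1, random.Random(0))
```
</pre>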
<br>
--<br>
Best regards,<br>
Oleg Gelbukh<br>
Mirantis, Inc.<br>
<br>
</blockquote>
<br>
<br>
<br></div></div><div class="HOEnZb"><div class="h5">
_______________________________________________<br>
OpenStack-dev mailing list<br>
<a href="mailto:OpenStack-dev@lists.openstack.org" target="_blank">OpenStack-dev@lists.openstack.org</a><br>
<a href="http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev" target="_blank">http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev</a><br>
</div></div></blockquote></div><br></div>