[openstack-dev] [Swift] Design note of geo-distributed Swift cluster

YUZAWA Takahiko yuzawataka at intellilink.co.jp
Tue Jan 29 07:53:36 UTC 2013


Hi, Caitlin-san

 > * Is each region supposed to be self-sufficient, in that get requests
 >      can be fulfilled by a copy within that region even if the links to
 >      other regions are temporarily down?

The object-replicator would keep retrying replication toward the other
region. The newest version of an object that was PUT in another region
cannot be reached until the connection recovers.


 > * What is the tolerance for "eventual consistency" when dealing with
 >      continental distances and TBs of new content potentially being
 >      created each day?

We think it depends on the performance of the object-replicator, so we
regard the object-replicator as the most important component in a
geo-distributed Swift cluster.
We are considering improvements to the object-replicator. For example, we
would like to modify it so that the transfer program is pluggable and
dynamically switchable. The aim is to allow "rsync" to be replaced with
another program (because rsync may be slow over WAN connections such as a
long fat pipe).
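
As a rough illustration of what "pluggable and dynamically switchable"
could look like, here is a minimal sketch. It is not existing Swift code:
the registry, the function names, and the "wan-copy" program are all
hypothetical placeholders, and choosing the backend by region is only one
possible policy. The rsync options shown are ordinary rsync flags.

```python
# Hypothetical sketch: none of these names are part of Swift.
TRANSFER_BACKENDS = {}


def register_backend(name):
    """Register a function that builds the command line for one transfer program."""
    def decorator(func):
        TRANSFER_BACKENDS[name] = func
        return func
    return decorator


@register_backend("rsync")
def rsync_command(src, dest):
    # Plain rsync invocation, suitable for transfers inside one region.
    return ["rsync", "--recursive", "--whole-file", "--ignore-existing", src, dest]


@register_backend("wan-copy")
def wan_copy_command(src, dest):
    # "wan-copy" is a placeholder for some WAN-optimized tool, not a real program.
    return ["wan-copy", src, dest]


def backend_for_target(local_region, remote_region):
    """Assumed policy: use rsync inside a region, another program across regions."""
    return "rsync" if local_region == remote_region else "wan-copy"


def build_transfer_command(backend_name, src, dest):
    """Look up the backend by name and return the argument list to execute."""
    try:
        builder = TRANSFER_BACKENDS[backend_name]
    except KeyError:
        raise ValueError("unknown transfer backend: %r" % (backend_name,))
    return builder(src, dest)
```

With something like this, the replicator would only call
build_transfer_command(backend_for_target(...), src, dest), and the backend
name could come from a configuration file so that it can be switched
without changing the replicator itself.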


 > * What happens if the same object is updated concurrently in two
 >      different regions?

Each object on disk has a timestamp in its file name, like
'1342555642.83577.data'.  The object-replicator synchronizes the 'partition'
directories containing these objects across nodes, and each object is
replicated with its file name preserved.  Files for the same object that
were updated concurrently might exist together in a directory on a storage
node, but their file names would differ (because it is unlikely that
Python's time.time() returns exactly the same value for both).
The newest file (the one with the largest timestamp in the directory)
always has priority when the object is requested by GET.
So the most recently PUT object has priority across the whole
Swift cluster, as long as the object-replicators are working and keeping
the replicas consistent.
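
The "newest timestamp wins" rule described above can be sketched as
follows. This is only a simplified illustration, not Swift's actual code:
the function name is hypothetical, and it looks only at '.data' files whose
names are a timestamp, ignoring any other file types that may be present
in the directory.

```python
from decimal import Decimal, InvalidOperation

DATA_SUFFIX = ".data"


def newest_data_file(filenames):
    """Return the '.data' file name with the largest timestamp, or None."""
    candidates = []
    for name in filenames:
        if not name.endswith(DATA_SUFFIX):
            continue  # this sketch only considers '.data' files
        try:
            timestamp = Decimal(name[:-len(DATA_SUFFIX)])
        except InvalidOperation:
            continue  # skip names that are not '<timestamp>.data'
        candidates.append((timestamp, name))
    if not candidates:
        return None
    # The tuple with the largest timestamp is the most recent PUT.
    return max(candidates)[1]


# Two regions accepted a PUT for the same object at slightly different times.
# After replication, both files sit in the same object directory.
region_a = ["1342555642.83577.data"]
region_b = ["1342555642.91204.data"]
merged = sorted(set(region_a) | set(region_b))
winner = newest_data_file(merged)
```

Here winner is '1342555642.91204.data', so every node that has received
both files would serve the same version on GET.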

--
Best regards,
YUZAWA Takahiko
NTTDATA INTELLILINK


(2013/01/26 3:12), Caitlin Bestler wrote:
> These blueprints and documents are focused almost entirely on how the Swift Proxy creates objects.
>
> I think the more critical issue for Swift objects is how Objects are replicated in a multi-region environment
> when a copy becomes unavailable.
>
> The cold hard fact here is that inter-region replication is considerably more expensive than intra-region
> replication. If you're doing a multi-region cloud obviously you have to do both, but I am skeptical that
> a single algorithm can support both with nothing more than a "distance" metric.
>
> Some serious questions to apply to any design proposal:
>
> * Is each region supposed to be self-sufficient, in that get requests can be fulfilled by a copy within that
>      region even if the links to other regions are temporarily down?
> * What is the tolerance for "eventual consistency" when dealing with continental distances and TBs of new
>      content potentially being created each day?
> * What happens if the same object is updated concurrently in two different regions?
>
>
> _______________________________________________
> OpenStack-dev mailing list
> OpenStack-dev at lists.openstack.org
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>



