[openstack-dev] [Swift] Design note of geo-distributed Swift cluster
YUZAWA Takahiko
yuzawataka at intellilink.co.jp
Tue Jan 29 07:53:36 UTC 2013
Hi, Caitlin-san
> * Is each region supposed to be self-sufficient, in that get requests
> can be fulfilled by a copy within that region even if the links to
> other regions are temporarily down?
The object-replicator would keep retrying toward the other region. The
newest version of an object that was PUT in the other region cannot be
reached until the connection recovers.
> * What is the tolerance for "eventual consistency" when dealing with
> continental distances and TBs of new content potentially being created
> each day?
We think it depends on the performance of the object-replicator, so we
regard the object-replicator as the most important component in a
geo-distributed Swift cluster.
We are considering improvements to the object-replicator. For example,
we would like to modify it so that the transfer program is pluggable and
dynamically switchable. The aim is to allow "rsync" to be replaced with
another program, because rsync may be slow over WAN connections such as
a Long Fat Pipe.
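As a rough illustration of the pluggable idea (a sketch only, not
existing Swift code), the transfer program could be looked up by name
each time a partition is pushed. Every name below (TRANSFER_BACKENDS,
register_backend, build_transfer_command, and the "wan-copy" program)
is hypothetical; only the rsync flags are real rsync options.

```python
# Hypothetical registry of transfer backends; none of these names exist in Swift.
TRANSFER_BACKENDS = {}


def register_backend(name):
    """Register a function that builds the command line for one transfer program."""
    def decorator(func):
        TRANSFER_BACKENDS[name] = func
        return func
    return decorator


@register_backend("rsync")
def rsync_command(src_dir, dest):
    # Real rsync options: archive mode, whole-file copies, I/O timeout in seconds.
    return ["rsync", "--archive", "--whole-file", "--timeout=30", src_dir, dest]


@register_backend("wan-tool")
def wan_tool_command(src_dir, dest):
    # Placeholder for a WAN-optimised program; "wan-copy" is an invented name.
    return ["wan-copy", src_dir, dest]


def build_transfer_command(src_dir, dest, same_region,
                           local="rsync", remote="wan-tool"):
    """Choose a backend per transfer, so it can be switched at run time."""
    name = local if same_region else remote
    return TRANSFER_BACKENDS[name](src_dir, dest)
```

Because the backend is chosen on every call, changing the "local" or
"remote" setting would take effect without restarting the replicator.
The returned list could be passed to subprocess when actually run.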
> * What happens if the same object is updated concurrently in two
> different regions?
Each object on disk has a timestamp in its file name, like
'1342555642.83577.data'. The object-replicator synchronizes the
'partition' directories that hold these objects across nodes, and each
object is replicated with its file name kept. Files for the same object
that were updated concurrently might exist together in a directory on a
storage node, but their file names would differ (because it is unlikely
that Python's time.time() returns exactly the same time for both).
The newest object (the one with the largest timestamp in its file name
in the directory) always has priority when requested by GET.
So the newest object that was PUT has priority across the whole Swift
cluster, as long as the object-replicators work and keep the copies
consistent.
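The "largest timestamp wins" rule can be sketched in a few lines of
Python. This is only an illustration under assumptions, not the actual
object-server code: the helper names timestamp_name and
newest_data_file are made up here, and the format string is chosen
simply to reproduce the example file name above.

```python
import os


def timestamp_name(ts):
    """Build a file name like '1342555642.83577.data' from a time.time() value."""
    # Five decimal places, zero-padded to 16 characters (assumed format).
    return "%016.5f.data" % ts


def newest_data_file(object_dir):
    """Return the .data file with the largest timestamp, or None if there is none."""
    suffix = ".data"
    data_files = [f for f in os.listdir(object_dir) if f.endswith(suffix)]
    if not data_files:
        return None
    # Compare the numeric timestamp that precedes the '.data' suffix.
    return max(data_files, key=lambda f: float(f[:-len(suffix)]))
```

If two regions each wrote a file for the same object and replication
later placed both in one directory, newest_data_file() would return the
one with the later timestamp, which is the copy a GET should serve.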
--
Best regards,
YUZAWA Takahiko
NTTDATA INTELLILINK
(2013/01/26 3:12), Caitlin Bestler wrote:
> These blueprints and documents are focused almost entirely on how the Swift Proxy creates objects.
>
> I think the more critical issue for Swift objects is how Objects are replicated in a multi-region environment
> when a copy becomes unavailable.
>
> The cold hard fact here is that inter-region replication is considerably more expensive than intra-region
> replication. If you're doing a multi-region cloud obviously you have to do both, but I am skeptical that
> a single algorithm can support both with nothing more than a "distance" metric.
>
> Some serious questions to apply to any design proposal:
>
> * Is each region supposed to be self-sufficient, in that get requests can be fulfilled by a copy within that
> region even if the links to other regions are temporarily down?
> * What is the tolerance for "eventual consistency" when dealing with continental distances and TBs of new
> content potentially being created each day?
> * What happens if the same object is updated concurrently in two different regions?
>
>
> _______________________________________________
> OpenStack-dev mailing list
> OpenStack-dev at lists.openstack.org
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>