[openstack-dev] [Swift] Design note of geo-distributed Swift cluster

Caitlin Bestler caitlin.bestler at nexenta.com
Tue Feb 5 21:47:57 UTC 2013


On 2/4/2013 3:11 PM, John Dickinson wrote:
> When solving the global cluster problem, two things need to be kept in mind, I think. First, I don't need to try (and indeed I can't) solve for every single deployment scenario. Second, I really want to keep "pluggable" parts of Swift to a minimum.
>
> That being said, proxy affinity (i.e. determining nearness) probably doesn't have to be any more difficult than "my region" vs "far". That would be the first step. If at some future stage we decide to figure out smart ways to determine nearness, great. But that doesn't mean it needs to be a requirement for the first iteration. Perhaps there's even some cool stuff that systems like Quantum could offer eventually. Deployers know what their clusters look like. Let's take advantage of that to get simpler code paths.
>
> Similarly, I don't want to see proposals to replace the rsync transport mechanism with a plugin system that lets you choose anything you want (replication over token ring! replication over BitTorrent!). If we need to improve the transport over what rsync offers, let's actually make something better for Swift rather than some plugin system that fragments the Swift deployments. We don't need yet-another-config-option.
>
> --John
>
>
>
While we don't want to solve every possible topology, I think we really 
need to pay attention to what multi-site really requires.

I haven't done any studies of the entire market, but in my experience
inter-site replication used by storage services almost always runs over
dedicated lines or VPN tunnels, and when VPN tunnels are used they are
traffic shaped.

This is not just a matter of connecting a bunch of IP addresses on the
internet and then forming a vague impression as to which ones are "far"
away. It is more like the type of discovery routers do, where each
tunnel is a "link".

A proper remote replication solution will be aware of these links, and 
take that into account in its replication strategy. One example
topology that I believe is very likely is a distributed corporate 
intranet. The branch offices are very unlikely to connect with each
other, but rather mostly connect with the central office (and maybe one 
alternate location).

If the communications capacity favors communicating with certain sites, 
then we should favor replicating to those sites. Communications
capacity between corporate sites is typically provisioned (whether with 
dedicated lines or just VPN) and not something you will be able
to just increase on demand instantly. Inter-site bandwidth is still 
expensive.
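
To make the link-based view concrete, here is a minimal sketch, not
anything that exists in Swift. It assumes the deployer describes each
provisioned tunnel as a link with a capacity, and it prefers replication
targets that the local site is directly linked to, highest capacity
first. The site names, the Mbit/s figures and the function
replication_targets are all hypothetical.

```python
# Hypothetical sketch: none of these names are part of Swift.
from typing import Dict, List, Tuple

# Each provisioned tunnel is a "link": (site_a, site_b) -> capacity in Mbit/s.
Links = Dict[Tuple[str, str], int]


def replication_targets(site: str, links: Links, count: int) -> List[str]:
    """Return up to `count` directly linked sites, highest capacity first."""
    neighbours: Dict[str, int] = {}
    for (a, b), capacity in links.items():
        if a == site:
            other = b
        elif b == site:
            other = a
        else:
            continue  # this link does not touch `site`
        # Keep the best capacity if two sites share more than one link.
        neighbours[other] = max(capacity, neighbours.get(other, 0))
    # Sort by capacity descending, then by name so ties are deterministic.
    ranked = sorted(neighbours, key=lambda s: (-neighbours[s], s))
    return ranked[:count]


# Example topology: branches link to the central office, and one branch
# also has a smaller link to an alternate location. Branches do not link
# to each other.
links: Links = {
    ("branch-1", "central"): 100,
    ("branch-2", "central"): 100,
    ("branch-1", "alternate"): 20,
    ("central", "alternate"): 1000,
}

print(replication_targets("branch-1", links, 2))
print(replication_targets("branch-2", links, 2))
```

With this topology a branch office never picks another branch as a
target, because there is no link between them. That is the behaviour a
plain "my region" vs "far" split cannot express.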

That said, there are still two important things to reach a consensus on:

* Are we talking about enabling the Swift Proxy to access content that
  is at multiple sites, with each object linked to a specific site?
  Or are we creating a global namespace with eventual consistency, and
  smart assignment of objects to the sites where they are actually
  referenced? The first goal is certainly easier.
* What forms of site-to-site replication are we going to support? Is
  this something each system administrator specifies (such as by adding
  policies along the lines of "all new objects created at a branch
  office will be replicated to the two central sites on a daily basis.
  Only objects actually referenced at a branch office will be cached
  there.") or something more akin to how Swift operates locally, where
  the user does not specify where specific things are stored?
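
As an illustration of the administrator-specified option, the example
policy quoted above could be written down as data roughly like this.
This is only a sketch under my own assumptions: Swift has no such policy
object, and the class ReplicationPolicy, its fields, and the site names
"central-1" and "central-2" are hypothetical.

```python
# Hypothetical sketch: Swift has no ReplicationPolicy object.
from dataclasses import dataclass
from datetime import timedelta
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class ReplicationPolicy:
    origin_kind: str           # kind of site where the object was created
    targets: Tuple[str, ...]   # sites that must receive a replica
    interval: timedelta        # how often replication to the targets runs
    cache_on_reference: bool   # cache at a site only once referenced there


# "All new objects created at a branch office will be replicated to the
# two central sites on a daily basis. Only objects actually referenced
# at a branch office will be cached there."
POLICIES: List[ReplicationPolicy] = [
    ReplicationPolicy(
        origin_kind="branch",
        targets=("central-1", "central-2"),
        interval=timedelta(days=1),
        cache_on_reference=True,
    ),
]


def policy_for(origin_kind: str,
               policies: List[ReplicationPolicy]) -> Optional[ReplicationPolicy]:
    """Return the first policy matching the originating site kind, if any."""
    for policy in policies:
        if policy.origin_kind == origin_kind:
            return policy
    return None


branch_policy = policy_for("branch", POLICIES)
print(branch_policy)
```

The alternative in the second half of the question would have no such
table at all: placement would be derived by the system, the way the ring
decides placement inside a single cluster today.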





