[openstack-dev] [swift] On Object placement

Christian Schwede christian.schwede at enovance.com
Wed Feb 18 09:53:34 UTC 2015


Hello Jonathan,

On 17.02.15 22:17, Halterman, Jonathan wrote:
> Various services desire the ability to control the location of data
> placed in Swift in order to minimize network saturation when moving data
> to compute, or in the case of services like Hadoop, to ensure that
> compute can be moved to wherever the data resides. Read/write latency
> can also be minimized by allowing authorized services to place one or
> more replicas onto the same rack (with other replicas being placed on
> separate racks). Fault tolerance can also be enhanced by ensuring that
> some replica(s) are placed onto separate racks. Breaking this down we
> come up with the following potential requirements:
> 
> 1. Swift should allow authorized services to place a given number of
> object replicas onto a particular rack, and onto separate racks.

This is already possible if you use zones and regions in your ring
files. For example, if you have two racks, you could assign one zone to
each of them, and Swift will place at least one replica on each rack.

Because Swift takes device weights into account, you could also ensure
that a specific rack gets two copies, and another rack only one.
However, this is only true as long as all primary nodes are accessible.
If a primary node is unavailable, Swift writes the data to a handoff
node first, and moves it to the primary node later on.
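
To illustrate the weight-based split, here is a rough back-of-the-envelope
sketch in Python. It is not Swift's ring-builder algorithm: it only assumes
that replicas are spread in proportion to the total device weight per zone,
and it ignores dispersion rules, overload and handoffs. The function name and
the rack labels are made up for the example.

```python
def expected_replicas_per_zone(zone_weights, replicas=3):
    """Approximate how many replicas each zone would hold.

    Assumption: replicas are distributed purely in proportion to the
    summed device weight of each zone. The real ring builder also
    enforces dispersion, so treat this as an estimate only.
    """
    total = sum(zone_weights.values())
    if total <= 0:
        raise ValueError("total weight must be positive")
    return {zone: replicas * weight / total
            for zone, weight in zone_weights.items()}


# Hypothetical setup: rack1 carries twice the device weight of rack2.
print(expected_replicas_per_zone({"rack1": 200, "rack2": 100}))
```

With three replicas and a 2:1 weight ratio this yields two copies for rack1
and one for rack2, which matches the case described above.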

Note that placing objects on nodes other than the primary nodes (for
example, through an authorized service as you described) keeps the data
there only until the replicator moves it to the primary nodes described
by the ring.
As far as I can see there is no way to ensure that an authorized service
can decide where to place data, and that this data stays on the selected
nodes. That would require a fundamental change within Swift.

> 2. Swift should allow authorized services and administrators to learn
> which racks an object resides on, along with endpoints.

You already mentioned the endpoints middleware; note that it is
currently unprotected, so once enabled it allows unauthenticated access.
You could easily add another small middleware to the pipeline that
checks authentication and grants or denies access to /endpoints
accordingly.
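
Such a guard could look roughly like the plain WSGI filter below. This is
only a sketch: the class name, the "is_authorized" callback and the use of
REMOTE_USER as the sign of an authenticated request are assumptions for
illustration. A real deployment would have to sit behind the auth middleware
in the pipeline and check whatever that middleware actually sets.

```python
class EndpointsGuard:
    """Hypothetical WSGI filter that protects the /endpoints path."""

    def __init__(self, app, is_authorized, prefix="/endpoints"):
        self.app = app                      # next app in the pipeline
        self.is_authorized = is_authorized  # callable(environ) -> bool
        self.prefix = prefix

    def __call__(self, environ, start_response):
        path = environ.get("PATH_INFO", "")
        guarded = path == self.prefix or path.startswith(self.prefix + "/")
        if guarded and not self.is_authorized(environ):
            body = b"Forbidden\n"
            start_response("403 Forbidden", [
                ("Content-Type", "text/plain"),
                ("Content-Length", str(len(body))),
            ])
            return [body]
        # Everything else passes through untouched.
        return self.app(environ, start_response)


def remote_user_present(environ):
    # Assumption: an earlier auth middleware sets REMOTE_USER.
    return environ.get("REMOTE_USER") is not None
```

Requests to any other path are passed on unchanged, so the filter only
affects the endpoint listing.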
You can also look up the nodes (and disks) for an object if you have
access to the ring files. The Swift source code includes a tool called
"swift-get-nodes" for this; alternatively, you could reuse the existing
ring code in your own projects.
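
As a sketch of the second option, the helper below takes a ring object and
returns the partition and the primary nodes for an object. It relies on the
ring's get_nodes() method and on the usual node keys (region, zone, ip, port,
device); the helper name and the account, container and object names in the
comment are made up for the example.

```python
def locate_object(ring, account, container, obj):
    """Return (partition, primary node summaries) for an object.

    `ring` is expected to behave like swift.common.ring.Ring, i.e. to
    offer get_nodes(account, container, obj) -> (partition, nodes).
    """
    partition, nodes = ring.get_nodes(account, container, obj)
    wanted = ("region", "zone", "ip", "port", "device")
    summary = [{key: node.get(key) for key in wanted} for node in nodes]
    return partition, summary


# Possible usage on a host that has Swift and the ring files installed
# (path and names are assumptions):
#
#   from swift.common.ring import Ring
#   ring = Ring("/etc/swift", ring_name="object")
#   print(locate_object(ring, "AUTH_test", "photos", "cat.jpg"))
```

This only reports the primary nodes; as noted above, data may temporarily
live on a handoff node instead.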

Christian


