[openstack-dev] [swift] On Object placement
Halterman, Jonathan
jonathan.halterman at hp.com
Wed Feb 18 17:13:55 UTC 2015
Hi Christian - thanks for the response,
On 2/18/15, 1:53 AM, "Christian Schwede" <christian.schwede at enovance.com>
wrote:
>Hello Jonathan,
>
>On 17.02.15 22:17, Halterman, Jonathan wrote:
>> Various services desire the ability to control the location of data
>> placed in Swift in order to minimize network saturation when moving data
>> to compute, or in the case of services like Hadoop, to ensure that
>> compute can be moved to wherever the data resides. Read/write latency
>> can also be minimized by allowing authorized services to place one or
>> more replicas onto the same rack (with other replicas being placed on
>> separate racks). Fault tolerance can also be enhanced by ensuring that
>> some replica(s) are placed onto separate racks. Breaking this down we
>> come up with the following potential requirements:
>>
>> 1. Swift should allow authorized services to place a given number of
>> object replicas onto a particular rack, and onto separate racks.
>
>This is already possible if you use zones and regions in your ring
>files. For example, if you have 2 racks, you could assign one zone to
>each of them and Swift places at least one replica on each rack.
>
>Because Swift takes care of the device weight you could also ensure that
>a specific rack gets two copies, and another rack only one.
Presumably a deployment would (and should) match the data center layout,
where racks could correspond to AZs.
>However, this is only true as long as all primary nodes are accessible.
>If Swift stores data on a handoff node this data might be written to a
>different node first, and moved to the primary node later on.
>
>Note that placing objects on nodes other than the primary nodes (for
>example, using an authorized service like the one you described) will
>only store the data on these nodes until the replicator moves the data
>to the primary nodes described by the ring.
>As far as I can see there is no way to ensure that an authorized service
>can decide where to place data, and that this data stays on the selected
>nodes. That would require a fundamental change within Swift.
So - how can we influence where data is stored? In terms of placement
based on a hash ring, I'm thinking of perhaps restricting the placement of
an object to a subset of the ring based on a zone. We can still hash an
object somewhere on the ring; for the purposes of controlling locality, we
just want it to land inside (or outside) a particular zone. Any ideas?
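
To make the idea concrete, here is a rough sketch of zone-restricted
hashing. It is a toy consistent-hash ring, not Swift's ring (which, as far
as I understand it, maps partitions to devices ahead of time in the ring
builder). The device names, zone names, and the place() function and its
parameters are all made up for illustration:

```python
import hashlib
from bisect import bisect_left

# Hypothetical devices as (device name, zone); one zone per rack is assumed.
DEVICES = [
    ("d1", "rack-a"), ("d2", "rack-a"),
    ("d3", "rack-b"), ("d4", "rack-b"),
    ("d5", "rack-c"), ("d6", "rack-c"),
]


def _hash(key):
    """Map a string to a large integer position on the toy ring."""
    return int(hashlib.sha256(key.encode("utf-8")).hexdigest(), 16)


def build_ring(devices, vnodes=64):
    """Return a sorted list of (token, device, zone) points."""
    ring = []
    for name, zone in devices:
        for i in range(vnodes):
            ring.append((_hash("%s-%d" % (name, i)), name, zone))
    ring.sort()
    return ring


def place(ring, obj_path, replicas=3, local_zone=None, local_replicas=0):
    """Pick devices by walking the ring clockwise from the object's hash.

    First take `local_replicas` distinct devices inside `local_zone`,
    then fill the remaining replicas from zones not used so far.
    """
    tokens = [token for token, _, _ in ring]
    start = bisect_left(tokens, _hash(obj_path)) % len(ring)
    walk = [ring[(start + i) % len(ring)] for i in range(len(ring))]

    chosen = []  # list of (device, zone)
    for _, dev, zone in walk:
        if len(chosen) == local_replicas:
            break
        if zone == local_zone and (dev, zone) not in chosen:
            chosen.append((dev, zone))

    used_zones = {local_zone} if local_replicas else set()
    for _, dev, zone in walk:
        if len(chosen) == replicas:
            break
        if zone not in used_zones:
            chosen.append((dev, zone))
            used_zones.add(zone)
    return chosen


ring = build_ring(DEVICES)
# Two replicas on the caller's rack, one on some other rack.
placement = place(ring, "/AUTH_demo/container/object",
                  replicas=3, local_zone="rack-a", local_replicas=2)
```

The object is still hashed onto the ring; the zone only filters which
points on the walk are acceptable. Whether something like this can be
reconciled with the replicator moving data back to the primary nodes is
exactly the open question.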
>
>> 2. Swift should allow authorized services and administrators to learn
>> which racks an object resides on, along with endpoints.
>
>You already mentioned the endpoint middleware, though it is currently
>not protected and unauthenticated access is allowed if enabled.
This is good to know. We still need to learn which rack an object resides
on, though. This information is important in determining whether a Swift
object resides on the same rack as a VM.
>You
>could easily add another small middleware in the pipeline to check
>authentication and grant or deny access to /endpoints based on the
>authentication.
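
If I understand the suggestion, such a middleware might look roughly like
the sketch below. It is plain WSGI and makes several assumptions: the class
name EndpointsGuard is made up, the static token set stands in for a real
check against the auth system, and it would need to sit ahead of the
endpoints middleware in the proxy pipeline:

```python
class EndpointsGuard(object):
    """Hypothetical WSGI filter that protects the /endpoints path.

    Requests under `path_prefix` are passed on only when they carry an
    X-Auth-Token found in `allowed_tokens`; everything else is untouched.
    """

    def __init__(self, app, allowed_tokens, path_prefix="/endpoints"):
        self.app = app
        self.allowed_tokens = set(allowed_tokens)
        self.path_prefix = path_prefix

    def __call__(self, environ, start_response):
        path = environ.get("PATH_INFO", "")
        guarded = (path == self.path_prefix or
                   path.startswith(self.path_prefix + "/"))
        if guarded:
            # Placeholder check; a real deployment would validate the
            # token against its auth system instead of a static set.
            token = environ.get("HTTP_X_AUTH_TOKEN")
            if token not in self.allowed_tokens:
                body = b"Forbidden\n"
                start_response("403 Forbidden", [
                    ("Content-Type", "text/plain"),
                    ("Content-Length", str(len(body))),
                ])
                return [body]
        return self.app(environ, start_response)
```

Wrapping the existing app would then be EndpointsGuard(app, {"some-token"}),
with the token set being whatever the operator decides to allow.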
>You can also get the node (and disk) if you have access to the ring
>files. There is a tool included in the Swift source code called
>"swift-get-nodes"; however you could simply reuse existing code to
>include it in your projects.
I'm guessing this would not work for in-cloud services?
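
For a service that can read the ring files, I imagine the lookup would be
something like the sketch below. It assumes zones were assigned one per
rack as you describe, and it only reports primary nodes, not handoffs. The
helper names and the FakeRing stand-in are made up so the example runs
without Swift installed; with Swift available, the ring object would come
from swift.common.ring.Ring instead:

```python
def object_locations(ring, account, container, obj):
    """Return (partition, locations) for an object's primary nodes.

    `ring` is any object with a get_nodes(account, container, obj) method
    returning (partition, [node dicts]), as swift.common.ring.Ring does.
    """
    partition, nodes = ring.get_nodes(account, container, obj)
    locations = [
        {
            "region": node.get("region"),
            "zone": node.get("zone"),
            "ip": node.get("ip"),
            "device": node.get("device"),
        }
        for node in nodes
    ]
    return partition, locations


def shares_zone(locations, vm_zone):
    """True if any primary replica is in the same zone (rack) as the VM."""
    return any(loc["zone"] == vm_zone for loc in locations)


class FakeRing(object):
    """Stand-in for a real ring, with made-up nodes, for illustration."""

    def get_nodes(self, account, container=None, obj=None):
        return 42, [
            {"region": 1, "zone": 1, "ip": "10.0.1.11", "device": "sdb"},
            {"region": 1, "zone": 2, "ip": "10.0.2.11", "device": "sdc"},
            {"region": 1, "zone": 3, "ip": "10.0.3.11", "device": "sdb"},
        ]


# With Swift installed and the ring files readable, this could instead be:
#   from swift.common.ring import Ring
#   ring = Ring("/etc/swift", ring_name="object")
ring = FakeRing()
partition, locations = object_locations(ring, "AUTH_demo", "container", "obj")
same_rack = shares_zone(locations, vm_zone=2)
```

The catch is the first step: the service needs the ring files, which is
what I doubt an in-cloud service would have.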
- jonathan
>
>Christian
>