[Openstack] Ring rebuild, multiple copies of ringbuilder file, was Re: swift ringbuilder and disk size/capacity relationship

Peter Brouwer peter.brouwer at oracle.com
Tue Apr 19 14:37:11 UTC 2016


Hello All

Follow-up question.
Assume a Swift cluster with a number of proxy nodes; each node needs
to hold a copy of the ring structure, right?
What happens when a disk is added to the ring? After the change is
made on the first proxy node, the ring files need to be copied to the
other proxy nodes, right?
Is there a risk that, during the period while the new ring files are
being copied, an object is stored using the new structure via one
proxy node and then retrieved via another node that still holds the
old structure, returning "object not found"? Or the odd chance that
an object has already been moved by the rebalance process while it is
being accessed by a proxy that still has the old ring structure?
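
For reference, the workflow I have in mind is roughly the following
(the IP, port, device name, weight and host names are only
illustrative):

proxy1 $ swift-ring-builder object.builder add r1z1-192.168.122.66:6000/vdb 100
proxy1 $ swift-ring-builder object.builder rebalance
proxy1 $ scp /etc/swift/object.ring.gz proxy2:/etc/swift/   # repeat for every proxy and storage node

i.e. the .builder file can stay on the admin host, but the regenerated
object.ring.gz has to reach every proxy and storage node, and my
question is about the window in which the nodes disagree.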


Regards
Peter
On 16/03/2016 00:23, Mark Kirkwood wrote:
> On 16/03/16 00:51, Peter Brouwer wrote:
>
>> Ah, good info. Follow-up question, assume the worst case (just to
>> emphasize the situation): one copy (replication = 1), a disk
>> approaching its maximum capacity.
>> How can you monitor this situation, i.e. to avoid the disk-full
>> scenario, and if the disk is full, what type of error is returned?
>>
>
> Let's do an example: 4 storage nodes (obj1...obj4), each with one disk 
> (vdb) added to the ring. Replication set to 1.
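
(For reference, a setup like the one described here would be created
with something like the commands below; the part power and weights are
just examples, and the IPs/port are the object servers that appear in
the recon output further down.)

$ swift-ring-builder object.builder create 10 1 1   # part power 10, 1 replica, 1h min_part_hours
$ swift-ring-builder object.builder add r1z1-192.168.122.62:6000/vdb 100
$ swift-ring-builder object.builder add r1z1-192.168.122.63:6000/vdb 100
$ swift-ring-builder object.builder add r1z1-192.168.122.64:6000/vdb 100
$ swift-ring-builder object.builder add r1z1-192.168.122.65:6000/vdb 100
$ swift-ring-builder object.builder rebalance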
>
> First, write a 1G object (to see where it is going to go); it lands on 
> host obj1, disk vdb, partition 1003:
>
> obj1 $ ls -l /srv/node/vdb/objects/1003/d31/fae796287c852f0833316a3dadfb3d31/
> total 1048580
> -rw------- 1 swift swift 1073741824 Mar 16 10:15 1458079557.01198.data
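
(For completeness, the write and the partition lookup would look
roughly like this, assuming the python-swiftclient CLI is configured
and the ring is in the default location; the account, container and
object names are the ones from the failed PUT shown further down.)

$ dd if=/dev/zero of=obj0 bs=1M count=1024    # a 1 GiB test object
$ swift upload con0 obj0
$ swift-get-nodes /etc/swift/object.ring.gz \
      AUTH_9a428d5a6f134f829b2a5e4420f512e7 con0 obj0   # shows the partition (1003 here) and its nodes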
>
>
> Then remove it
>
> obj1 $ ls -l /srv/node/vdb/objects/1003/d31/fae796287c852f0833316a3dadfb3d31/
> total 4
> -rw------- 1 swift swift 0 Mar 16 10:47 1458078463.80396.ts
>
>
> ...and use up space on obj1/vdb (dd a 29G file into /srv/node/vdb 
> somewhere)
>
> obj1 $ df -m|grep vdb
> /dev/vdb           30705 29729       977  97% /srv/node/vdb
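
(The fill step can be reproduced with a plain dd; the file name here is
just an example.)

obj1 $ dd if=/dev/zero of=/srv/node/vdb/filler.img bs=1M count=29696   # ~29 GiB of filler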
>
>
> Add the object again (this time it ends up on obj4 instead, i.e. a handoff node):
>
> obj4 $ ls -l /srv/node/vdb/objects/1003/d31/fae796287c852f0833316a3dadfb3d31/
> total 1048580
> -rw------- 1 swift swift 1073741824 Mar 16 11:06 1458079557.01198.data
>
>
> So swift is coping with the obj1/vdb disk being too full. Remove again 
> and exhaust space on all disks (dd again):
>
> @obj[1-4] $ df -h|grep vdb
> /dev/vdb         30G   30G  977M  97% /srv/node/vdb
>
>
> Now attempt to write the 1G object again:
>
> swiftclient.exceptions.ClientException:
> Object PUT failed:
> http://192.168.122.61:8080/v1/AUTH_9a428d5a6f134f829b2a5e4420f512e7/con0/obj0 
> 503 Service Unavailable
>
>
> So we get an HTTP 503 to show that the PUT has failed.
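
(The same failure can be seen straight against the proxy, e.g. with
curl; token handling is omitted here and the URL is the one from the
error above.)

proxy1 $ curl -s -o /dev/null -w '%{http_code}\n' \
      -H "X-Auth-Token: $TOKEN" --upload-file obj0 \
      http://192.168.122.61:8080/v1/AUTH_9a428d5a6f134f829b2a5e4420f512e7/con0/obj0
503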
>
>
> Now regarding monitoring: out of the box, swift-recon covers this:
>
> proxy1 $ swift-recon -dv
> =============================================================================== 
>
> --> Starting reconnaissance on 4 hosts
> =============================================================================== 
>
> [2016-03-16 13:16:54] Checking disk usage now
> -> http://192.168.122.63:6000/recon/diskusage: [{u'device': u'vdc', 
> u'avail': 32162807808, u'mounted': True, u'used': 33718272, u'size': 
> 32196526080}, {u'device': u'vdb', u'avail': 1024225280, u'mounted': 
> True, u'used': 31172300800, u'size': 32196526080}]
> -> http://192.168.122.64:6000/recon/diskusage: [{u'device': u'vdc', 
> u'avail': 32162807808, u'mounted': True, u'used': 33718272, u'size': 
> 32196526080}, {u'device': u'vdb', u'avail': 1024274432, u'mounted': 
> True, u'used': 31172251648, u'size': 32196526080}]
> -> http://192.168.122.62:6000/recon/diskusage: [{u'device': u'vdc', 
> u'avail': 32162807808, u'mounted': True, u'used': 33718272, u'size': 
> 32196526080}, {u'device': u'vdb', u'avail': 1024237568, u'mounted': 
> True, u'used': 31172288512, u'size': 32196526080}]
> -> http://192.168.122.65:6000/recon/diskusage: [{u'device': u'vdc', 
> u'avail': 32162807808, u'mounted': True, u'used': 33718272, u'size': 
> 32196526080}, {u'device': u'vdb', u'avail': 1024221184, u'mounted': 
> True, u'used': 31172304896, u'size': 32196526080}]
> Distribution Graph:
>   0%    4 *********************************************************************
>  96%    4 *********************************************************************
> Disk usage: space used: 124824018944 of 257572208640
> Disk usage: space free: 132748189696 of 257572208640
> Disk usage: lowest: 0.1%, highest: 96.82%, avg: 48.4617574245%
> =============================================================================== 
>
>
>
> So integrating swift-recon into regular monitoring/alerting 
> (collectd/nagios or whatever) is one approach (mind you, most folk 
> already monitor disk usage data... and there is nothing overly special 
> about ensuring you don't run out of space)!
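
(A minimal sketch of such a check: poll a recon endpoint directly, or
the swift-recon summary, and alert above a threshold; the 90% value is
just an example.)

proxy1 $ curl -s http://192.168.122.63:6000/recon/diskusage
proxy1 $ swift-recon -d | grep -o 'highest: [0-9.]*' | \
      awk '{ if ($2 > 90) print "CRITICAL: fullest disk at " $2 "%" }'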
>
>
>> BTW, thanks for the patience for sticking with me in this.
>
> No worries - a good question (once I finally understood it).
>
> regards
>
> Mark

-- 
Regards,

Peter Brouwer, Principal Software Engineer,
Oracle Application Integration Engineering.
Phone:  +44 1506 672767, Mobile +44 7720 598 226
E-Mail: Peter.Brouwer at Oracle.com




