[Openstack] Swift slow write performance

Rustam Aliyev rustam at code.az
Thu Dec 22 11:05:59 UTC 2011


Hi All,

After a few more days of checks I still couldn't resolve it. I think I have 
tried pretty much everything.

I tend to think that the problem is CentOS/RHEL 5.x and the following error:

     Dec 22 05:03:24 ec01 object-server STDOUT: WARNING:root:Unable to locate fallocate in libc.  Leaving as a no-op.

I think the lack of fallocate() in the EL5 kernel hinders sequential 
read/write operations. When multiple streams are writing data (3x object 
data + local and remote container updates), disk I/O degrades 
significantly. I tried to fix it by replacing fallocate() with 
posix_fallocate() ( http://paste.openstack.org/show/3932/ ), but things 
got even worse. This is only my theory and unfortunately I don't have a 
way to prove it.
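
To make the theory above easier to check, here is a minimal Python sketch 
(not Swift's code) that reports whether the C library exports fallocate() 
and then tries to preallocate through os.posix_fallocate(). The helper 
names libc_exports() and preallocate() are hypothetical. One possible 
explanation for the slowdown, per the posix_fallocate(3) man page, is that 
glibc emulates the call by writing into every block when the kernel has no 
native support; whether that applies to this cluster is an assumption.

```python
import ctypes
import ctypes.util
import os
import tempfile


def libc_exports(symbol):
    """Return True if the process's C library exports `symbol`."""
    # find_library may return None; on POSIX, CDLL(None) then searches the
    # symbols already loaded into the process, which include libc.
    libc = ctypes.CDLL(ctypes.util.find_library("c"))
    return hasattr(libc, symbol)


def preallocate(fd, size):
    """Try to reserve `size` bytes for `fd`; return the method that was used."""
    if size <= 0:
        return "skipped"
    if hasattr(os, "posix_fallocate"):
        try:
            os.posix_fallocate(fd, 0, size)
            return "posix_fallocate"
        except OSError:
            pass  # filesystem or kernel refused the request; fall through
    return "no-op"


if __name__ == "__main__":
    print("fallocate in libc:", libc_exports("fallocate"))
    with tempfile.TemporaryFile() as f:
        print("preallocation method:", preallocate(f.fileno(), 1 << 20))
```

Running this on an EL5 node and on an EL6 node would at least show whether 
the two systems differ in what libc offers.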

Based on that, I think the authors should either make it clear that Swift 
is not supported on EL5 and that the minimum version is EL6 (where 
fallocate() exists), or fix this in Swift.


Regards,
Rustam.


On 20/12/2011 13:59, Ywang225 wrote:
> Both metadata.update() and container_update() should apply to all 
> replicas; metadata.update() writes xattrs, and 
> container_update() writes to SQLite db files. It is surprising to see 
> both updates take so long.
>
> --ywang
>
> Sent from my iPhone
>
> On 2011-12-20, at 19:13, Rustam Aliyev <rustam at code.az> wrote:
>
>> Hi Mike,
>>
>> Thanks, I didn't know that a PUT operation also includes updating 
>> replica containers. That makes sense. I will check that.
>>
>> In the meantime I've added debug checkpoints to the PUT operation to 
>> measure the different steps. The modified code is here: 
>> http://paste.openstack.org/show/3899/ (original: 
>> https://github.com/openstack/swift/blob/master/swift/obj/server.py#L530). 
>> Basically, I added a few self.logger.debug() calls with timestamps.
>>
>> I'm not a Python developer and don't know Swift internals, so it's quite 
>> possible that I've got something wrong. In any case, here are a few 
>> sample results: http://paste.openstack.org/show/3900/
>>
>> Basically, there are 2 steps which take too long:
>>
>>  1. write metadata (line #63 of the first paste, metadata.update()),
>>     this takes 0.5-1.0 sec!
>>  2. update container (line #85, self.container_update()), this is
>>     the second slowest, also in the range of ~0.5-1.0 sec.
>>
>> I assume that self.container_update() includes replica updates. But 
>> why does metadata.update() take so long? Does it also imply replica 
>> updates?
>>
>> Overall, I have to say that troubleshooting Swift is nearly impossible. 
>> There's almost no difference between log levels INFO and DEBUG. It would 
>> be nice to have more information at DEBUG, and even a TRACE level, for 
>> this kind of problem.
>>
>> --
>> Rustam.
>>
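
The debug checkpoints described in the message above can be sketched as a 
small context manager. This is an illustration only, not the code from the 
paste: the logger name "put-timing", the checkpoint() helper, and the step 
labels are hypothetical, and time.sleep() stands in for the real work.

```python
import logging
import time
from contextlib import contextmanager

logger = logging.getLogger("put-timing")  # hypothetical logger name


@contextmanager
def checkpoint(label, results=None):
    """Log how long the wrapped block took; optionally record it in `results`."""
    start = time.monotonic()
    try:
        yield
    finally:
        elapsed = time.monotonic() - start
        logger.debug("%s took %.3f s", label, elapsed)
        if results is not None:
            results[label] = elapsed


if __name__ == "__main__":
    logging.basicConfig(level=logging.DEBUG)
    timings = {}
    with checkpoint("write metadata", timings):
        time.sleep(0.01)  # stand-in for the metadata step
    with checkpoint("update container", timings):
        time.sleep(0.01)  # stand-in for the container update step
    print(max(timings, key=timings.get), "was the slowest step")
```

Collecting the timings in a dict makes it easy to compare steps across many 
requests instead of reading individual log lines.
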
>> On 20/12/2011 04:41, Michael Barton wrote:
>>> On Mon, Dec 19, 2011 at 6:21 AM, Rustam Aliyev <rustam at code.az> wrote:
>>>> The only things that look suspicious to me are these errors:
>>>>
>>>> Dec 18 04:01:28 ec01 object-server ERROR container update failed with
>>>> 10.0.1.3:6001/d01 (saving for async update later): Timeout (3s) (txn:
>>>> txdf95ad5a10844ee0b74d70d8a7638082)
>>>> Dec 18 04:01:28 ec01 object-server ERROR container update failed with
>>>> 10.0.1.2:6001/d01 (saving for async update later): Timeout (3s) (txn:
>>>> txee2545ba4610430fa3a6a166ca50c574)
>>>> Dec 18 04:01:28 ec01 object-server ERROR container update failed with
>>>> 10.0.1.8:6001/d01 (saving for async update later): Timeout (3s) (txn:
>>>> tx2546b29b15c643ec90a122a753dfddd3)
>>> Yeah, that is likely to be the culprit.  Each write is taking at least
>>> 3 seconds because it's timing out trying to update the container
>>> servers.
>>>
>>> So you need to debug connectivity from this object server to those IP
>>> addresses on port 6001 -- that the IP addresses and port are correct,
>>> everything's on the same network, there aren't any firewall rules
>>> blocking those connections, that the container servers are running and
>>> accepting connections, etc.  I'll read through your paste in a bit and
>>> see if I notice anything.
>>>
>>> -- Mike
>>>
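
A first pass at the connectivity check suggested above could look like the 
sketch below. It only tests whether a TCP connection to each container 
server can be opened within the timeout; it does not verify that the 
service answers requests correctly. The can_connect() helper is 
hypothetical, and the addresses and port are the ones from the error log.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port opens within `timeout`."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # covers refused connections, timeouts, unreachable hosts
        return False


if __name__ == "__main__":
    # Container servers named in the "container update failed" log lines.
    for host in ("10.0.1.3", "10.0.1.2", "10.0.1.8"):
        state = "reachable" if can_connect(host, 6001) else "NOT reachable"
        print("%s:6001 is %s" % (host, state))
```

If all three hosts are reachable from the object server, the 3 s timeouts 
more likely point at slow container servers than at the network.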