[Openstack] [Swift] [Storage node] Lots of timeouts in load test after several hours around 1, 000, 0000 operations

John Dickinson me at not.mn
Sun Jul 1 19:50:32 UTC 2012


I hope you are able to get an answer. I'm traveling this week, so I won't have a chance to look into it. I hope some of the other core devs will have a chance to help you find an answer.

--John


On Jul 1, 2012, at 2:03 PM, Kuo Hugo <tonytkdk at gmail.com> wrote:

> Hi all,
> 
> I have run several load tests against Swift in recent days.
> 
> I'm facing an issue and hope you can share your thoughts on it.
> 
> My environment:
> Swift-proxy with TempAuth on one server: 4 cores / 32 GB RAM
> 
> Swift-object + Swift-account + Swift-container on 3 storage nodes, each with: 8 cores / 32 GB RAM, 7 x 2 TB SATA HDD
> =====================================================================================
> bench.conf :
> 
> [bench]
> auth = http://172.168.1.1:8082/auth/v1.0
> user = admin:admin
> key = admin
> concurrency = 200
> object_size = 4048
> num_objects = 100000
> num_gets = 100000
> delete = yes
> =====================================================================
> 
> After 70 rounds:
> 
> PUT operations start to fail in large numbers, but GETs still work properly.
> Error log:
> Jul  1 04:35:03 proxy-server ERROR with Object server 192.168.100.103:36000/DISK6 re: Trying to get final status of PUT to /v1/AUTH_admin/af5862e653054f7b803d8cf1728412d2_6/24fc2f997bcc4986a86ac5ff992c4370: Timeout (10s) (txn: txd60a2a729bae46be9b667d10063a319f) (client_ip: 172.168.1.2)
> Jul  1 04:34:32 proxy-server ERROR with Object server 192.168.100.103:36000/DISK2 re: Expect: 100-continue on /AUTH_admin/af5862e653054f7b803d8cf1728412d2_19/35993faa53b849a89f96efd732652e31: Timeout (10s)
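
Both quoted errors point at the same object server (192.168.100.103) but at different devices, so it may help to check whether the timeouts cluster on one node or one disk. The following is a minimal sketch that tallies them from proxy log lines. It assumes the log lines follow the format quoted above; the function name `tally_timeouts` and the regular expression are illustrative and are not part of Swift.

```python
import re
from collections import Counter

# Matches proxy-server errors of the form quoted above:
#   ERROR with Object server <ip>:<port>/<device> re: <what failed>: Timeout (<n>s)
# The pattern is an assumption based on those two sample lines only.
TIMEOUT_RE = re.compile(
    r"ERROR with Object server (?P<ip>[\d.]+):(?P<port>\d+)/(?P<device>\S+)"
    r" re: .*?Timeout \((?P<secs>\d+(?:\.\d+)?)s\)"
)


def tally_timeouts(lines):
    """Count timeout errors per (storage node IP, device) pair."""
    counts = Counter()
    for line in lines:
        match = TIMEOUT_RE.search(line)
        if match:
            counts[(match.group("ip"), match.group("device"))] += 1
    return counts
```

Passing an open log file to `tally_timeouts` and printing `counts.most_common()` would list the busiest node and device pairs first. A skew toward one node or one disk would suggest a hardware or filesystem problem on that machine rather than a cluster-wide limit.
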
> 
> 
> The kernel then starts to report failures like the ones below.
> Kernel log:
> Jul  1 16:37:50 angryman-storage-03 kernel: [350840.020736] w83795 0-002f: Failed to read from register 0x03c, err -6
> Jul  1 16:37:50 angryman-storage-03 kernel: [350840.052654] w83795 0-002f: Failed to read from register 0x015, err -6
> Jul  1 16:37:50 angryman-storage-03 kernel: [350840.080613] w83795 0-002f: Failed to read from register 0x03c, err -6
> Jul  1 16:37:50 angryman-storage-03 kernel: [350840.112583] w83795 0-002f: Failed to read from register 0x016, err -6
> Jul  1 16:37:50 angryman-storage-03 kernel: [350840.144517] w83795 0-002f: Failed to read from register 0x03c, err -6
> Jul  1 16:37:50 angryman-storage-03 kernel: [350840.176468] w83795 0-002f: Failed to read from register 0x017, err -6
> Jul  1 16:37:50 angryman-storage-03 kernel: [350840.208455] w83795 0-002f: Failed to read from register 0x03c, err -6
> Jul  1 16:37:51 angryman-storage-03 kernel: [350840.240410] w83795 0-002f: Failed to read from register 0x01b, err -6
> Jul  1 16:37:51 angryman-storage-03 kernel: [350840.272
> Jul  1 17:05:28 angryman-storage-03 kernel: imklog 6.2.0, log source = /proc/kmsg started.
> 
> PUTs become slower and slower, dropping from 1,200/s to 200/s.
> 
> I'm not sure whether this is a bug or a limitation of XFS. If it is a limitation of XFS, how can I improve it?
> 
> An additional question: XFS seems to consume a lot of memory. Does anyone know the reason for this behavior?
> 
> 
> Thanks in advance.
>   
> 
> -- 
> +Hugo Kuo+
> tonytkdk at gmail.com
> +886 935004793
> 