[Openstack] [Swift] Cache pressure tuning

ZHOU Yuan dunk007 at gmail.com
Tue Jun 18 11:18:22 UTC 2013


Jonathan, we happen to use storage nodes similar to yours, so I can share
some performance-testing data here:

1. 100 containers with 10,000 objects (baseline)
The performance is quite good and can hit the HW bottleneck.

2. 10K containers with 100M objects
The performance is not so good; it dropped 80% compared with the baseline.

3. 1 container with 1000M objects
The performance is not so good; it dropped 95% compared with the baseline.

The suspected reasons we found are:
1) XFS's overhead with a huge number of objects. Deleting some files wouldn't
help, since inode allocation is already quite sparse on the disks, so later
inode lookups should cost more disk seek time, I guess. But this can be
greatly improved by setting vfs_cache_pressure to a lower value, and it
should be safe even if we set it to 1, since Swift barely uses the page cache
at all. If we could cache all the inodes, the performance would become good
again. We've done some tests with precached inodes (simply run
'ls -R /srv/nodes') and verified that the performance is quite good. A
minimal sketch of both steps is below.
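
A minimal sketch of both steps (run as root; vfs_cache_pressure=1 is the
value that worked for us, not a universal recommendation, and /srv/nodes is
our devices root, so adjust to your layout):

    # favor keeping inode/dentry caches over page cache
    sysctl -w vm.vfs_cache_pressure=1

    # make it persistent across reboots
    echo 'vm.vfs_cache_pressure = 1' >> /etc/sysctl.conf

    # pre-warm the inode cache by walking every object file once
    ls -R /srv/nodes > /dev/null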

2) SQLite DB performance bottlenecks when there are millions of records in a
single DB. There is a blueprint to auto-split large databases, but it is not
implemented yet.
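
If you want to see how big a single container DB has grown, something like
this works (the path below is illustrative; real container DBs live under
hash-named directories on the container devices):

    # count object rows in one container DB
    sqlite3 /srv/nodes/sdb1/containers/<part>/<suffix>/<hash>/<hash>.db \
        'SELECT COUNT(*) FROM object;'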


Hope this can help.

-- 
Sincerely,  Yuan

On Tue, Jun 18, 2013 at 1:56 PM, Jonathan Lu <jojokururu at gmail.com> wrote:

>  Hi, Huang
>     Thanks a lot. I will try this test.
>
>     One more question:
>     In the following 3 situations, will the baseline performance be quite
> different?
>         1. only 1 container with 10M objects
>         2. 100,000 objects per container at 100 containers
>         3. 1,000 objects per container at 10,000 containers
>
> Cheers
> Jonathan Lu
>
>
> On 2013/6/18 12:54, Huang Zhiteng wrote:
>
>
>
> On Tue, Jun 18, 2013 at 12:35 PM, Jonathan Lu <jojokururu at gmail.com> wrote:
>
>>  Hi, Huang
>>     Thanks for your explanation. Does it mean that a storage cluster with
>> a given processing capacity will get slower and slower as more and more
>> objects accumulate? Is there any test of the rate of decline, or is there
>> any lower limit?
>>
>>     For example, my environment is:
>>
>>
>> 	Swift version : grizzly
>> 	Tried on Ubuntu 12.04
>> 	3 Storage-nodes : each for 16GB-ram / CPU 4*2 / 3TB*12
>>
>>     The expected throughput is more than 100/s with uploaded objects of
>> 50KB. At the beginning it works quite well, and then it drops. If this
>> degradation is unstoppable, I'm afraid the performance will finally not be
>> able to meet our needs no matter how I tune other configs.
>>
>  It won't be hard to do a baseline performance (without inode cache)
> assessment of your system: populate it with a certain amount of objects
> of the desired size (say 50KB; 10 million objects <1,000 objects per
> container at 10,000 containers>), and *then drop VFS caches explicitly
> before testing*.  Measure performance with your desired IO pattern, and in
> the meantime drop the VFS cache every once in a while (say every 60s).
> That's roughly the performance you can get once your storage system reaches
> a 'steady' state (i.e. the object count has outgrown memory size).  This
> will give you an idea of pretty much the worst case.
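>
> A minimal sketch of the cache-dropping side of that test (run as root
> alongside your load generator; the 60s interval is just the example above):
>
>     # keep the node cache-cold so it stays in 'steady state' during the run
>     while true; do
>         sync                                # flush dirty pages first
>         echo 3 > /proc/sys/vm/drop_caches   # 3 = page cache + dentries/inodes
>         sleep 60
>     done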
>
>
>>  Jonathan Lu
>>
>>
>> On 2013/6/18 11:05, Huang Zhiteng wrote:
>>
>>
>> On Tue, Jun 18, 2013 at 10:42 AM, Jonathan Lu <jojokururu at gmail.com> wrote:
>>
>>> On 2013/6/17 18:59, Robert van Leeuwen wrote:
>>>
>>>>> I'm facing an issue with performance degradation, and I noticed that
>>>>> changing the value in /proc/sys/vm/vfs_cache_pressure might help.
>>>>> Can anyone explain to me whether and why it is useful?
>>>>>
>>>> Hi,
>>>>
>>>> When this is set to a lower value, the kernel will try to keep the
>>>> inode/dentry cache in memory longer.
>>>> Since the Swift replicator scans the filesystem continuously, it will
>>>> eat up a lot of IOPS if those entries are not in memory.
>>>>
>>>> To see if a lot of cache misses are happening on XFS, you can look at
>>>> xs_dir_lookup and xs_ig_missed
>>>> (see http://xfs.org/index.php/Runtime_Stats).
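>>>>
>>>> For a quick look (xs_dir_lookup is the first value on the 'dir' line;
>>>> xs_ig_missed is on the 'ig' line, see the page above for the field order):
>>>>
>>>>     # sample the counters twice and compare the deltas
>>>>     grep -E '^(dir|ig) ' /proc/fs/xfs/stat
>>>>     sleep 10
>>>>     grep -E '^(dir|ig) ' /proc/fs/xfs/stat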
>>>>
>>>> We greatly benefited from setting this to a low value, but we have quite
>>>> a lot of files on a node (30 million).
>>>> Note that setting this to zero will result in the OOM killer taking down
>>>> the machine sooner or later
>>>> (especially if files are moved around due to a cluster change ;)
>>>>
>>>> Cheers,
>>>> Robert van Leeuwen
>>>>
>>>
>>>  Hi,
>>>     We set this to a low value (20) and the performance is better than
>>> before. It seems quite useful.
>>>
>>>     According to your description, this issue is related to the number of
>>> objects in the storage. We deleted all the objects in the storage, but that
>>> didn't help at all. The only way to recover is to reformat and remount the
>>> storage node. We have tried installing Swift in different environments, but
>>> this degradation problem seems to be inevitable.
>>>
>> It's the inode cache for each file (object) that helps (it reduces extra
>> disk IOs).  As long as your memory is big enough to hold the inode
>> information of those frequently accessed objects, you are good.  And
>> there's no need (no point) to limit the # of objects per storage node, IMO.
>> You may manually load the inode information of objects into the VFS cache
>> if you like (by simply 'ls'-ing the files) to _restore_ performance.  But
>> memory size and object access pattern are still the key to this kind of
>> performance tuning; if memory is too small, the inode cache will be
>> invalidated sooner or later.
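>>
>> As a rough check of whether the inode/dentry cache actually stays
>> resident, you can watch the corresponding slab caches (a sketch; slab
>> cache names can vary slightly between kernels, and reading /proc/slabinfo
>> needs root):
>>
>>     # how many cached XFS inodes / dentries are currently in memory
>>     grep -E '^(xfs_inode|dentry)' /proc/slabinfo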
>>
>>
>>
>>> Cheers,
>>> Jonathan Lu
>>>
>>>
>>>
>>
>>
>>
>> --
>> Regards
>> Huang Zhiteng
>>
>>
>>
>
>
> --
> Regards
> Huang Zhiteng
>
>
>
>
>

