[Openstack] Caching strategies in Nova ...

Mark Washenberger mark.washenberger at rackspace.com
Fri Mar 23 14:56:13 UTC 2012


Alas, I let my patch get too stale to rebase properly. However, the
approach I took is fairly "dumb" and can be understood just from
reading the patch (attached below). And in any case, I think the
approach you're taking, profiling based on Tach, is going to be better
in the long run and more shareable with the community.

+ 1 gazillion to getting good metrics!

"Sandy Walsh" <sandy.walsh at rackspace.com> said:

> (resent to list as I realized I just did a Reply)
> 
> Cool! This is great stuff. Look forward to seeing the branch.
> 
> I started working on a similar tool that takes the data collected by
> Tach, fetches it from Graphite, and looks at the performance issues
> (no changes to nova trunk required, since Tach is awesome).
> 
> It's only a shell of an idea so far, but the basics work:
> https://github.com/ohthree/novaprof
> 
> But if there is something already existing, I'm happy to kill it off.
> 
> I don't doubt for a second the db is the culprit for many of our woes.
> 
> The thing I like about internal caching using established tools is that
> it works for db issues too without having to resort to custom tables.
> SQL query optimization, I'm sure, will go equally far.
> 
> Thanks again for the great feedback ... keep it comin'!
> 
> -S
> 
> 
> On 03/22/2012 11:53 PM, Mark Washenberger wrote:
>> Working on this independently, I created a branch with some simple
>> performance logging around the nova-api, and individually around
>> glance, nova.db, and nova.rpc calls. (Sorry, I only have a local
>> copy and it's on a different computer right now, and it probably
>> needs a rebase. I will rebase and publish it on GitHub tomorrow.)
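>>
>> (Roughly, the logging wrapper looks like the following -- a
>> simplified sketch of the idea, not the actual patch; the names are
>> illustrative.)
>>
>> import functools
>> import logging
>> import time
>>
>> LOG = logging.getLogger('nova.perflog')
>>
>> def timed(key):
>>     """Log wall time for each call under a dotted key."""
>>     def decorator(func):
>>         @functools.wraps(func)
>>         def wrapper(*args, **kwargs):
>>             start = time.time()
>>             try:
>>                 return func(*args, **kwargs)
>>             finally:
>>                 LOG.info('PERF %s %.3f', key, time.time() - start)
>>         return wrapper
>>     return decorator
>>
>> # wrapped at import time, e.g.:
>> # instance_update = timed('nova.db.api.instance_update')(instance_update)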
>>
>> With this logging, I could get some simple profiling that I found
>> very useful. Here is a GH project with the analysis code as well
>> as some nova-api logs I was using as input.
>>
>> https://github.com/markwash/nova-perflog
>>
>> With these tools, you can get a wall-time profile for individual
>> requests. For example, looking at one server create request (and
>> you can run this directly from the checkout as the logs are saved
>> there):
>>
>> markw@poledra:perflogs$ cat nova-api.vanilla.1.5.10.log | \
>>     python profile-request.py req-3cc0fe84-e736-4441-a8d6-ef605558f37f
>> key                                        count    avg
>> nova.api.openstack.wsgi.POST                   1  0.657
>> nova.db.api.instance_update                    1  0.191
>> nova.image.show                                1  0.179
>> nova.db.api.instance_add_security_group        1  0.082
>> nova.rpc.cast                                  1  0.059
>> nova.db.api.instance_get_all_by_filters        1  0.034
>> nova.db.api.security_group_get_by_name         2  0.029
>> nova.db.api.instance_create                    1  0.011
>> nova.db.api.quota_get_all_by_project           3  0.003
>> nova.db.api.instance_data_get_for_project      1  0.003
>>
>> key                      count  total
>> nova.api.openstack.wsgi      1  0.657
>> nova.db.api                 10  0.388
>> nova.image                   1  0.179
>> nova.rpc                     1  0.059
>>
>> All times are in seconds. The nova.rpc time is likely inflated
>> because this was the first call after a server restart, so it
>> probably includes the connection handshake. These numbers are also
>> about 1.5 months stale.
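>>
>> (The aggregation profile-request.py does is simple; here's a
>> stripped-down sketch, assuming PERF-style log lines like the wrapper
>> above emits, and ignoring the per-request filtering the real script
>> does:)
>>
>> import collections
>> import re
>> import sys
>>
>> PERF = re.compile(r'PERF (\S+) ([\d.]+)')
>> times = collections.defaultdict(list)
>>
>> for line in sys.stdin:
>>     m = PERF.search(line)
>>     if m:
>>         times[m.group(1)].append(float(m.group(2)))
>>
>> # print count and average per key, slowest average first
>> for key in sorted(times, key=lambda k: -sum(times[k]) / len(times[k])):
>>     vals = times[key]
>>     print('%-45s %5d  %.3f' % (key, len(vals), sum(vals) / len(vals)))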
>>
>> The conclusion I reached from this profiling is that we just plain
>> overuse the db (and we might do the same in glance). For example,
>> whenever we do an update, we actually re-retrieve the item from the
>> database, update its dictionary, and save it, which is double the
>> cost it needs to be. We also handle updates that span tables
>> inefficiently, where they could be handled in a single database
>> round trip.
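>>
>> (To illustrate the update pattern -- 'session' and 'Instance' here
>> stand in for our sqlalchemy session and model, not nova's actual db
>> API:)
>>
>> # today: read-modify-write, two round trips
>> instance = session.query(Instance).get(instance_id)   # SELECT
>> instance.vm_state = 'active'
>> session.add(instance)
>> session.flush()                                       # UPDATE
>>
>> # what it could be: one UPDATE, no read-back
>> session.query(Instance).filter_by(id=instance_id).\
>>     update({'vm_state': 'active'})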
>>
>> In particular, in the case of server listings, extensions are just
>> rough on performance. Most extensions hit the database again at
>> least once. No single one is so bad, but together they clearly make
>> this an area where we should improve, since server listings are the
>> most frequent api queries.
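>>
>> (The pattern looks roughly like this -- the helper names are
>> hypothetical, not nova's actual API:)
>>
>> # typical extension: one extra query per instance in the listing
>> for inst in instances:
>>     inst['metadata'] = metadata_get(context, inst['uuid'])  # N queries
>>
>> # batched alternative: one query for the whole listing
>> by_uuid = metadata_get_by_uuids(
>>     context, [inst['uuid'] for inst in instances])          # 1 query
>> for inst in instances:
>>     inst['metadata'] = by_uuid.get(inst['uuid'], {})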
>>
>> I just see a ton of specific performance problems that are easier
>> to address one by one, rather than diving into a general (albeit
>> obvious) solution such as caching.
>>
>>
>> "Sandy Walsh" <sandy.walsh at rackspace.com> said:
>>
>>> We're doing tests to find out where the bottlenecks are, caching is the
>>> most obvious solution, but there may be others. Tools like memcache do a
>>> really good job of sharing memory across servers so we don't have to
>>> reinvent the wheel or hit the db at all.
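>>>
>>> (A minimal read-through sketch with the python-memcached client --
>>> the client choice and helper name are just for illustration, though
>>> db.instance_get is the existing nova db call:)
>>>
>>> import memcache
>>>
>>> from nova import db
>>>
>>> MC = memcache.Client(['127.0.0.1:11211'])
>>>
>>> def instance_get_cached(context, instance_id):
>>>     key = 'instance:%s' % instance_id
>>>     instance = MC.get(key)
>>>     if instance is None:
>>>         # cache miss: hit the db once (assumes the row pickles cleanly)
>>>         instance = db.instance_get(context, instance_id)
>>>         MC.set(key, instance, time=30)  # 30s TTL bounds staleness
>>>     return instance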
>>>
>>> In addition to looking into caching technologies/approaches, we're
>>> gluing together some tools for finding those bottlenecks. Our first
>>> step will be finding them; then we'll squash them, however we can.
>>>
>>> -S
>>>
>>> On 03/22/2012 06:25 PM, Mark Washenberger wrote:
>>>> What problems are caching strategies supposed to solve?
>>>>
>>>> On the nova compute side, it seems like streamlining db access and
>>>> adding api-view tables would solve any performance problems caching
>>>> would address, while keeping the stale-data management problem
>>>> small.
>>>>
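>>>> (By api-view tables I mean something like the following -- a very
>>>> rough, hypothetical schema, nothing that exists today: one
>>>> denormalized row per instance holding exactly what the listing
>>>> view returns, maintained on write so a listing is one SELECT.)
>>>>
>>>> from sqlalchemy import Column, String, Text
>>>> from sqlalchemy.ext.declarative import declarative_base
>>>>
>>>> Base = declarative_base()
>>>>
>>>> class InstanceApiView(Base):
>>>>     __tablename__ = 'instance_api_view'
>>>>     uuid = Column(String(36), primary_key=True)
>>>>     name = Column(String(255))
>>>>     status = Column(String(255))
>>>>     # extension data pre-joined and serialized at write time
>>>>     extensions = Column(Text)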
>>>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: perflog.patch
Type: text/x-patch
Size: 7600 bytes
Desc: not available
URL: <http://lists.openstack.org/pipermail/openstack/attachments/20120323/b2cd1b44/attachment.bin>

