[Openstack-operators] Neutron timeout issues

Kris G. Lindgren klindgren at godaddy.com
Tue Feb 24 19:44:22 UTC 2015


So an update on this.

We dropped the neutron API/RPC workers from 60 each on our 3 neutron API
servers to 10 workers each.  Since that change the neutron timeouts have
dropped to 0.  Under Icehouse we were able to run with 60 workers each
without any issues.
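
For reference, here is a rough sketch of the arithmetic behind that change.
It assumes "60 each" means 60 API workers plus 60 RPC workers per server
(the api_workers and rpc_workers settings in neutron.conf) and does not
count the parent processes.  The function and variable names are
illustrative only.

```python
# Tally of neutron-server worker processes across the API tier.
# Figures are taken from the message: 3 API servers, 60 API + 60 RPC
# workers each before the change, 10 + 10 after (assumed reading of
# "60 each" / "10 workers each").

def total_workers(servers: int, api_workers: int, rpc_workers: int) -> int:
    """Return the number of worker processes across all servers."""
    return servers * (api_workers + rpc_workers)


before = total_workers(servers=3, api_workers=60, rpc_workers=60)
after = total_workers(servers=3, api_workers=10, rpc_workers=10)

print(f"before: {before} workers, after: {after} workers")
```

Under that reading the tier went from 360 worker processes to 60.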
____________________________________________
 
Kris Lindgren
Senior Linux Systems Engineer
GoDaddy, LLC.




On 2/20/15, 9:29 AM, "Kris G. Lindgren" <klindgren at godaddy.com> wrote:

>We have memcache enabled on the metadata servers.  Part of our load is
>because we have a cron job that pulls the metadata and does some stuff on
>the server every ~10 minutes.  We staggered the start times so that the
>requests are spread out over a period of time and in general concurrent
>requests are relatively low, under 10 per second max.
>
>We have looked at file descriptor usage for the neutron process as well
>as the database.  We are not hitting a DB deadlock, and in the slow query
>log the slowest query is ~1 second.
>
>We are using the following versions of the oslo components: messaging
>1.4.1, db 1.3.0, config 1.4.0, serialization 1.1.0, utils 1.1.0
>
>In general we do not have any process using a high amount of CPU (aside
>from rabbit).  We see load on multiple neutron processes but it's usually
>under 7% per process and only 7-8 processes show up in the process list at
>a time.
>
>We increased the neutron timeout value to 120 seconds and we still get
>read timeouts.  I am about to stand up a neutron API server locally on the
>metadata server so that it specifically talks to that neutron instance and
>see if that makes things better.
>____________________________________________
> 
>Kris Lindgren
>Senior Linux Systems Engineer
>GoDaddy, LLC.
>
>
>
>On 2/20/15, 12:27 AM, "Robert van Leeuwen"
><Robert.vanLeeuwen at spilgames.com> wrote:
>
>>> After our Icehouse -> Juno upgrade we are noticing sporadic but
>>> frequent errors from nova-metadata when trying to serve metadata
>>> requests.  The error is the following:
>>
>>> Is anyone else noticing this or frequent read timeouts when talking to
>>> neutron?  Have you found a solution?
>>> What have you tried?
>>
>>If it is metadata agent load related:
>>Are you caching the metadata info? (configure memcached_servers in
>>nova.conf)
>>We noticed a huge benefit in metadata agent performance.
>>Within our cloud the biggest load is created by facter (puppet), which
>>queries the metadata agent.
>>This quickly adds up to quite a lot of requests to the metadata agent.
>>
>>Adding caching significantly decreases the load on all systems.
>>(Without caching we have a VERY high nova-conductor load)
>>
>>Cheers,
>>Robert van Leeuwen
>



