Open Stack

Fri Feb 20 15:39:47 UTC 2015

We finished upgrading to Juno about the time you guys did.  Just checked 
logs across all environments since the time of the Juno upgrade and I'm 
*not* seeing the same errors.

For comparison here's what we have (mostly out-of-the-box):

    api_workers and rpc_workers = 32
    metadata_workers = 32
    url_timeout = 30
    oslo version = 1.4.1

Any related errors in the Neutron logs?
Couple seemingly dumb questions related to system limits, but:

 1. Could this be a file descriptors limit for the neutron processes?
 2. Recently we ran into the file descriptors limit in MySQL which
    showed up with "sporadic but frequent errors" in Neutron.  Under
    load is your MySQL fd limit being hit?
 3. Similar limit question for RabbitMQ.
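
A quick way to check the file descriptor questions above is to compare each process's open descriptor count against its soft limit. The sketch below is only an illustration and assumes a Linux host with /proc mounted. The helper names (parse_max_open_files, fd_usage, near_limit) and the 80% threshold are made up for this example and are not part of Neutron, MySQL, or RabbitMQ. Reading /proc/<pid>/fd for another user's process normally requires root.

```python
import os
import re
import sys


def parse_max_open_files(limits_text):
    """Return the soft 'Max open files' limit from /proc/<pid>/limits text.

    Returns None if the line is missing or the limit is 'unlimited'.
    """
    for line in limits_text.splitlines():
        match = re.match(r"Max open files\s+(\S+)\s+(\S+)", line)
        if match:
            soft = match.group(1)
            return None if soft == "unlimited" else int(soft)
    return None


def fd_usage(pid):
    """Return (open_fds, soft_limit) for a process, read from Linux /proc."""
    open_fds = len(os.listdir("/proc/%d/fd" % pid))
    with open("/proc/%d/limits" % pid) as handle:
        soft_limit = parse_max_open_files(handle.read())
    return open_fds, soft_limit


def near_limit(open_fds, soft_limit, threshold=0.8):
    """True when open descriptors reach the given fraction of the soft limit.

    The 0.8 default is an arbitrary illustrative threshold.
    """
    if soft_limit is None:
        return False
    return open_fds >= threshold * soft_limit


if __name__ == "__main__":
    # PIDs are passed on the command line, e.g. those of neutron-server,
    # mysqld, or the RabbitMQ beam process.
    for arg in sys.argv[1:]:
        if not arg.isdigit():
            continue
        used, limit = fd_usage(int(arg))
        flag = "NEAR LIMIT" if near_limit(used, limit) else "ok"
        print("pid %s: %d open fds, soft limit %s (%s)" % (arg, used, limit, flag))
```

Run it under load with the PIDs of interest as arguments. A process that sits near its limit during the error bursts would be a reasonable suspect.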

Let me know if you want any more comparison info.

Sean Lynn
Time Warner Cable, Inc.

----------------------

Kris G. Lindgren:

"

    After our Icehouse -> Juno upgrade we are noticing sporadic but frequent errors from nova-metadata when trying to serve metadata requests.  The error is the following:

    [req-594325c6-44ed-465c-a8e4-bd5a8e5dbdcb None] Failed to get metadata for ip: x.x.x.x
    2015-02-19 12:16:45.903 25007 TRACE nova.api.metadata.handler Traceback (most recent call last):
    2015-02-19 12:16:45.903 25007 TRACE nova.api.metadata.handler   File /usr/lib/python2.6/site-packages/nova/api/metadata/handler.py, line 150, in _handle_remote_ip_request
    2015-02-19 12:16:45.903 25007 TRACE nova.api.metadata.handler     meta_data = self.get_metadata_by_remote_address(remote_address)
    2015-02-19 12:16:45.903 25007 TRACE nova.api.metadata.handler   File /usr/lib/python2.6/site-packages/nova/api/metadata/handler.py, line 82, in get_metadata_by_remote_address
    2015-02-19 12:16:45.903 25007 TRACE nova.api.metadata.handler     data = base.get_metadata_by_address(self.conductor_api, address)

    ...

    We have increased the number of Neutron workers (40 API and 40 RPC) and raised the Neutron url_timeout interval in nova from 30 to 60 seconds. We are only seeing this issue in production; our pre-prod environments are fine.

    Is anyone else noticing this or frequent read timeouts when talking to neutron?  Have you found a solution?  What have you tried?

    I am thinking of updating a bunch of the oslo packages (db, messaging, etc.) to the latest versions to see if things get better.

"

[Openstack-operators] Neutron timeout issues
