[openstack-dev] [puppet][qa][ubuntu][neutron] Xenial Neutron Timeouts

Jens Harbott j.harbott at x-ion.de
Tue Nov 14 10:09:37 UTC 2017

2017-11-14 8:24 GMT+00:00 Tobias Urdin <tobias.urdin at crystone.com>:
> Trying to trace this, tempest calls the POST /servers/<instance id>/action
> API endpoint for the nova compute api.
> https://github.com/openstack/tempest/blob/master/tempest/lib/services/compute/floating_ips_client.py#L82
> Nova then takes the request and tries to do this floating IP association
> using the neutron server api.
> http://logs.openstack.org/47/514347/1/check/puppet-openstack-integration-4-scenario001-tempest-ubuntu-xenial/ed5a657/logs/nova/nova-api.txt.gz
> 2017-10-29 23:12:35.521 17800 ERROR nova.api.openstack.compute.floating_ips
> [req-7f810cc7-a498-4bf4-b27e-8fc80d652785 42526a28b1a14c629b83908b2d75c647
> 2493426e6a3c4253a60c0b7eb35cfe19 - default default] Unable to associate
> floating IP to fixed IP for instance
> d265626a-77c1-4d2f-8260-46abe548293e. Error: Request to
> timed out: ConnectTimeout: Request to
> timed out
> Checking that timestamp in the neutron-server logs:
> http://paste.openstack.org/show/626240/
> We can see that right before this timestamp, at 23:12:30.377, and then
> after it, at 23:12:35.611, everything seems to be fine.
> So there are some connectivity issues to the neutron API from where the Nova
> API is running, causing a timeout.
> Now some more questions would be:
> * Why is the return code 400? Are we being fooled, or is it actually a
> connection timeout?
> * Is the Neutron API stuck, causing the failed connection? All traffic goes
> over loopback, so the chance of a problem there is very low.
> * Any firewall catching this? Not likely since the agent processes requests
> right before and after.
> I can't find anything interesting in the overall other system logs that
> could explain that.
> Back to the logs!
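The failure mode described above (nova's client gives up while neutron is still working on the request) is easy to reproduce in isolation. The following is only a minimal sketch, not nova's actual client code: the handler class, port, paths, and the 0.1 s / 0.5 s durations are stand-ins for nova's 30 s client timeout and neutron's slow response.

```python
import http.client
import http.server
import socket
import threading
import time

class SlowHandler(http.server.BaseHTTPRequestHandler):
    """Stand-in for neutron: answers correctly, but slower than the client waits."""
    def do_PUT(self):
        time.sleep(0.5)          # server-side work outlasts the client timeout
        self.send_response(200)
        self.end_headers()
    def log_message(self, *args):
        pass                     # keep the example quiet

class QuietServer(http.server.HTTPServer):
    def handle_error(self, request, client_address):
        pass                     # the client hung up after timing out; ignore it

def call_with_timeout(port, timeout):
    """Return True if the request timed out client-side (nova's point of view)."""
    conn = http.client.HTTPConnection("127.0.0.1", port, timeout=timeout)
    try:
        conn.request("PUT", "/v2.0/floatingips/dummy")
        conn.getresponse()
        return False
    except socket.timeout:
        return True
    finally:
        conn.close()

server = QuietServer(("127.0.0.1", 0), SlowHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()
timed_out = call_with_timeout(server.server_address[1], timeout=0.1)
server.shutdown()
print(timed_out)  # True: the client gave up even though the server would return 200
```

This matches the pattern in the logs: the server side eventually logs a 200, but the caller already recorded a timeout.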

I'm pretty certain that this is a deadlock between nova and neutron,
though I cannot yet put my finger on the exact spot. But looking at
the neutron log that you extracted, you can see that neutron does
try to return a successful answer to the FIP request just after nova
has given up waiting for it (the timeout seems to be 30 s here):

2017-10-29 23:12:35.932 18958 INFO neutron.wsgi
42526a28b1a14c629b83908b2d75c647 2493426e6a3c4253a60c0b7eb35cfe19 -
default default] "PUT
/v2.0/floatingips/2e3fa334-d6ac-443c-b5ba-eeb521d6324c HTTP/1.1"
status: 200  len: 746 time: 30.4427412

Also, looking at
it seems that nova-api is started with two processes and one thread;
not sure if that means two processes with one thread each or only one
thread total. Either way, nova-api might be getting stuck there.
