[neutron] OVS inactivity probes
James Denton
james.denton at rackspace.com
Tue Mar 24 12:37:03 UTC 2020
Thank you, Slawek. Appreciate the quick assist!
James
On 3/24/20, 4:58 AM, "Slawek Kaplonski" <skaplons at redhat.com> wrote:
Hi,
I looked into this a bit more, and it seems to me that the add_manager method
in [1] is not used at all.
In the past it was called by the "enable_connection_uri" function in the
neutron.agent.ovsdb.native.helpers module, but commit [2] switched it to a
helper function from ovsdbapp.
So I think this is simply a bug in Neutron that we need to fix. I opened a bug
for it [3].
[1] https://opendev.org/openstack/neutron/src/branch/master/neutron/agent/common/ovs_lib.py#L122
[2] https://review.opendev.org/#/c/453014/
[3] https://bugs.launchpad.net/neutron/+bug/1868686
On Mon, Mar 23, 2020 at 05:11:42PM +0000, James Denton wrote:
> Hello all,
>
> Like others, we have seen an increase in the number of messages like those below, followed by disconnects:
>
> 2020-03-19T07:19:47.414Z|01614|rconn|ERR|br-int<->tcp:127.0.0.1:6633: no response to inactivity probe after 10 seconds, disconnecting
> 2020-03-19T07:19:47.414Z|01615|rconn|ERR|br-ex<->tcp:127.0.0.1:6633: no response to inactivity probe after 10 seconds, disconnecting
> 2020-03-19T07:19:47.414Z|01616|rconn|ERR|br-tun<->tcp:127.0.0.1:6633: no response to inactivity probe after 10 seconds, disconnecting
>
> We have since increased the value of of_inactivity_probe (https://bugs.launchpad.net/neutron/+bug/1817022) and confirmed the Controller connection reflects this new value:
>
> ---
> # ovs-vsctl list Controller
> ...
> _uuid : b4814677-c6f3-4afc-9c9e-999d5a5ac78f
> connection_mode : out-of-band
> controller_burst_limit: []
> controller_rate_limit: []
> enable_async_messages: []
> external_ids : {}
> inactivity_probe : 60000
> is_connected : true
> local_gateway : []
> local_ip : []
> local_netmask : []
> max_backoff : []
> other_config : {}
> role : other
> status : {last_error="Connection refused", sec_since_connect="1420", sec_since_disconnect="1423", state=ACTIVE}
> target : "tcp:127.0.0.1:6633"
> ---
>
> However, we also see disconnects on the manager side, which the config option does not address:
>
> 2020-03-23T11:01:02.871Z|00443|reconnect|ERR|tcp:127.0.0.1:50098: no response to inactivity probe after 5 seconds, disconnecting
>
> This bug (https://bugs.launchpad.net/neutron/+bug/1627106) and related commit (https://opendev.org/openstack/neutron/commit/1698bee770b84a2663ba940a6ded5d4b9733101a) appear to leverage the ovs_vsctl_timeout value (since renamed to ovsdb_timeout), but the inactivity_probe for the Manager connection does not appear to be implemented. Honestly, I'm not sure if that code path is used.
>
> ---
> # ovs-vsctl list Manager
> _uuid : d61519ba-93fc-4fe5-b05c-b630778a44b0
> connection_mode : []
> external_ids : {}
> inactivity_probe : []
> is_connected : true
> max_backoff : []
> other_config : {}
> status : {bound_port="6640", n_connections="2", sec_since_connect="0", sec_since_disconnect="0"}
> target : "ptcp:6640:127.0.0.1"
> ---
>
> Running this by hand sets the inactivity_probe timeout on the Manager connection, but we'd prefer to use a built-in method, if possible:
>
> # ovs-vsctl set manager d61519ba-93fc-4fe5-b05c-b630778a44b0 inactivity_probe=30000
>
> Any suggestions?
>
> Thanks,
> James
>
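Until the bug in [3] is fixed, the manual command quoted above could be wrapped in a small script so it can be reapplied after a restart. This is only a minimal sketch, not Neutron or ovsdbapp code: the function names build_set_probe_cmd and set_manager_probe are hypothetical, and the 10-second ovs-vsctl timeout is an assumed default. The Manager UUID and the 30000 ms value are the ones from the quoted message.

```python
import subprocess
from typing import Callable, List


def build_set_probe_cmd(manager: str, probe_ms: int,
                        timeout_s: int = 10) -> List[str]:
    """Build the ovs-vsctl argv that sets inactivity_probe on a Manager row.

    `manager` is the record identifier (the row UUID in the quoted example).
    `probe_ms` is in milliseconds, matching the Manager table column.
    `timeout_s` (assumed default) bounds how long ovs-vsctl itself waits.
    """
    if probe_ms < 0:
        raise ValueError("probe_ms must be non-negative")
    return [
        "ovs-vsctl",
        f"--timeout={timeout_s}",
        "set", "Manager", manager,
        f"inactivity_probe={probe_ms}",
    ]


def set_manager_probe(manager: str, probe_ms: int,
                      run: Callable = subprocess.run) -> List[str]:
    """Run the command; `run` is injectable so the call can be tested."""
    cmd = build_set_probe_cmd(manager, probe_ms)
    run(cmd, check=True)  # raises CalledProcessError if ovs-vsctl fails
    return cmd


if __name__ == "__main__":
    # Same UUID and value as the manual command in the quoted message.
    # Only prints the command; call set_manager_probe() to execute it.
    uuid = "d61519ba-93fc-4fe5-b05c-b630778a44b0"
    print(" ".join(build_set_probe_cmd(uuid, 30000)))
```

Keeping the argv construction separate from the subprocess call makes it possible to check the command without touching a live Open vSwitch instance.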
--
Slawek Kaplonski
Senior software engineer
Red Hat