[neutron][ovn] networking-ovn-metadata-agent and neutron agent liveness
All,

I'm currently experimenting with networking-ovn (RDO Train packages on CentOS 7), and I've managed to cobble together a functional deployment with two exceptions: the metadata agent and agent liveness.

Regarding the metadata issue: the local ovsdb-server on the compute node listens on a unix socket at /var/run/openvswitch/db.sock, owned by openvswitch:hugetlbfs with mode 0750. Since networking-ovn-metadata-agent runs as neutron, it can't interact with the local OVS database; it gets stuck in a restart loop and complains about the inaccessible database socket. If I edit the systemd unit file and let the agent run as root, it functions as expected. This obviously isn't a real solution, but it suggests a possible packaging bug. I'm not sure what the correct mix of permissions is, or whether the local database should also be listening on tcp:localhost:6640 so that the metadata agent can connect that way. The docs are sparse in this area, but I would expect something like the metadata agent to 'just work' out of the box without having to change systemd unit files or mess with unix socket permissions. Thoughts?

Secondly, ```openstack network agent list``` shows all agents (ovn-controller) as dead, all the time. However, if I display a single agent with ```openstack network agent show $foo```, it shows as alive. I looked around and saw some discussions about getting networking-ovn to deal with this better, but as of now the agents are consistently reported as dead unless they are explicitly polled, at least on CentOS 7. I haven't noticed any real impact, but the testing I'm doing is small scale.

Other than those two issues, networking-ovn is great, and based on the discussions around possibly deprecating linuxbridge as an in-tree driver, it would make a great 'default' networking configuration option upstream, provided the docs get cleaned up.

Thanks in advance,

r

Chris Apsey
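For the socket permission question above, a small diagnostic can show who owns the socket and whether the calling user could open it. This is only a sketch using the Python standard library: the function names `describe_path` and `current_user_can_connect` are made up for illustration, and the check relies on the Linux rule that connecting to a unix socket requires write permission on the socket file (plus search permission on the directories leading to it).

```python
import grp
import os
import pwd
import stat


def describe_path(path):
    """Return owner, group, permission bits, and socket flag for a path."""
    st = os.stat(path)
    try:
        owner = pwd.getpwuid(st.st_uid).pw_name
    except KeyError:
        # No passwd entry for this uid; fall back to the numeric id.
        owner = str(st.st_uid)
    try:
        group = grp.getgrgid(st.st_gid).gr_name
    except KeyError:
        group = str(st.st_gid)
    return {
        "owner": owner,
        "group": group,
        "mode": oct(stat.S_IMODE(st.st_mode)),
        "is_socket": stat.S_ISSOCK(st.st_mode),
    }


def current_user_can_connect(path):
    """Return True if the calling user has write access to the socket file.

    os.access() also returns False when the path does not exist or when a
    directory on the way to it is not searchable by the calling user.
    """
    return os.access(path, os.W_OK)
```

Running this as the neutron user against /var/run/openvswitch/db.sock (for example from a short script that prints both results) should indicate whether the restart loop is a plain file permission problem or something else.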
Hi Chris –

I recall having the same issue when first implementing OVN into OpenStack-Ansible, and currently have the OVN metadata agent running as root[1]. I’m curious to see how others solved the issue as well. Thanks for bringing this up.

[1] https://opendev.org/openstack/openstack-ansible-os_neutron/src/branch/master...

James Denton
Network Engineer
Rackspace Private Cloud
james.denton@rackspace.com

From: Chris Apsey <bitskrieg@bitskrieg.net>
Reply-To: Chris Apsey <bitskrieg@bitskrieg.net>
Date: Wednesday, November 20, 2019 at 12:00 AM
To: "openstack-discuss@lists.openstack.org" <openstack-discuss@lists.openstack.org>
Subject: [neutron][ovn] networking-ovn-metadata-agent and neutron agent liveness
James,

After playing with this a little more, I think I have a way to handle this that is somewhat better than running as root directly:

1. Allow the ovsdb-server on the compute nodes to listen on 127.0.0.1:6640 (ovs-appctl -t ovsdb-server ovsdb-server/add-remote ptcp:6640:127.0.0.1)
2. Set the helper_command, ovsdb_connection, and the root_helper options in /etc/neutron/plugins/networking-ovn/networking-ovn-metadata-agent.ini as appropriate (example [1])

The process should start successfully as neutron.

I'm still having issues with agent liveness reporting, but it appears to be entirely superficial. The agent works as expected.

r

Chris Apsey

[1] https://github.com/GeorgiaCyber/kinetic/blob/networking-ovn/formulas/compute...

‐‐‐‐‐‐‐ Original Message ‐‐‐‐‐‐‐
On Wednesday, November 20, 2019 7:15 AM, James Denton <james.denton@rackspace.com> wrote:
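As a follow-up to step 1 in Chris's workaround, it may help to confirm that the local ovsdb-server really accepts TCP connections on 127.0.0.1:6640 before pointing the metadata agent at it. The snippet below is a minimal sketch using only the Python standard library; the function name `tcp_listener_reachable` is hypothetical, and it only checks that something accepts a connection on the port, not that it speaks the OVSDB protocol.

```python
import socket


def tcp_listener_reachable(host="127.0.0.1", port=6640, timeout=2.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        # create_connection raises OSError (refused, timed out, unreachable)
        # when nothing accepts the connection.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

Calling `tcp_listener_reachable()` as the neutron user on a compute node should return True once the ptcp remote from step 1 has been added. If it returns False, the agent would presumably fail to connect over TCP as well, regardless of what is set in the .ini file.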
participants (2)
- Chris Apsey
- James Denton