[Tripleo][Octavia][OVN] Overcloud deployment Issues with Octavia
Hi All,

For some Octavia testing with Heat, I tried deploying a 3+1 node overcloud with the Octavia environment and THT master. As expected, the deployment finished successfully. However, Octavia API calls (e.g. listing load balancers) returned a (503) Service Unavailable error. Although the octavia_api container was running, its logs showed it stuck in a continuous loop.

I have also captured these issues (described below) in https://bugs.launchpad.net/tripleo/+bug/1825146. The intent of this mail is to bring them to the notice of the respective teams and to understand the best way to fix them.

1. Incorrect config setting for provider_drivers

As of https://github.com/openstack/puppet-tripleo/commit/97c46ca76803fb238279d2454..., we added OVN provider driver support for Octavia. However, the config option seems to be set incorrectly, which results in octavia-api failing to start.

2. ovn_nb_connection not in octavia config

It seems the OVN provider driver (part of networking-ovn) connects to the OVN NB database as part of its initialization. However, we don't seem to configure the connection string in octavia.conf, so it tries to connect to 127.0.0.1 and fails.

I've already proposed a few patches[1] to fix the above two issues.

3. Missing Octavia Driver Agent

Octavia added a driver-agent controller process in Stein (for provider drivers to communicate with Octavia for status and stats updates). This process is expected to be collocated with octavia-api and creates a pair of unix domain sockets that the drivers use for communication.

We don't seem to have this implemented in THT yet. Is this something planned, and is someone working on it? AFAICT, we can't enable/use the OVN provider driver without adding this service/container?

Adding support for this in THT also seems a little complex and can probably be done in a few different ways. Is there a standard design pattern we use for this kind of requirement in TripleO? If we already have an agreed plan on how to do it, please ignore the below.

I can only think of the few options below. Probably there are better ones too :)

a. New OctaviaDriverAgent service

As the process is expected to be collocated with octavia-api to share the domain sockets (we can share /var/run/octavia from the host with both containers), I'm not sure this would be a good design though.

b. As a sidecar of octavia_api

Not a new service, but a sidecar container for octavia-api (started when the API starts; do we already have something like this for other services?).

Ideas/comments welcome :)

[1] https://review.opendev.org/#/q/status:open+branch:master+topic:bug/1825146

--
Regards, Rabi Mishra
Hi Rabi,
Thanks a lot for trying Octavia + OVN provider driver and sharing all these goodies! Both the Octavia and networking-ovn teams have been actively working to integrate the Octavia OVN driver in TripleO. Your input will greatly help.
On Mon, Apr 22, 2019 at 1:12 PM Rabi Mishra <ramishra@redhat.com> wrote:
Hi All,
For some Octavia testing with Heat, I tried deploying a 3+1 node overcloud with the Octavia environment and THT master. As expected, the deployment finished successfully. However, Octavia API calls (e.g. listing load balancers) returned a (503) Service Unavailable error. Although the octavia_api container was running, its logs showed it stuck in a continuous loop.
I have captured some of these issues (as described below) in https://bugs.launchpad.net/tripleo/+bug/1825146 too. The intent of this mail is to bring it to the notice of the respective teams and understand the best possible way to fix these.
1. Incorrect config setting for provider_drivers.
As of https://github.com/openstack/puppet-tripleo/commit/97c46ca76803fb238279d2454..., we added OVN provider driver support for Octavia. However, the config option seems to be set incorrectly, which results in octavia-api failing to start.
In case the container is still restarting, you might be hitting another issue that got fixed by https://review.opendev.org/#/q/I53610f1c9e3d10bb6b532dd7d43139854c0861b7
2. ovn_nb_connection not in octavia config
It seems the OVN provider driver (part of networking-ovn) connects to the OVN NB database as part of its initialization. However, we don't seem to configure the connection string in octavia.conf, so it tries to connect to 127.0.0.1 and fails.
I've already proposed a few patches[1] to fix the above two issues.
Thanks!
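For anyone wanting to check a deployed node for issue 2 before the patches land, here is a minimal sketch that inspects the text of an octavia.conf. It assumes the provider list lives in `[api_settings] enabled_provider_drivers` as comma-separated `name:description` pairs and that the OVN driver reads `[ovn] ovn_nb_connection`; both locations are assumptions to verify against the installed Octavia and networking-ovn releases. The helper name `check_octavia_conf` is made up for this illustration.

```python
import configparser

# Assumed option locations; verify against the installed releases.
PROVIDER_OPT = ("api_settings", "enabled_provider_drivers")
OVN_NB_OPT = ("ovn", "ovn_nb_connection")


def check_octavia_conf(text):
    """Return a list of problems found in the given octavia.conf text."""
    parser = configparser.ConfigParser(interpolation=None, strict=False)
    parser.read_string(text)
    problems = []

    # Provider entries are assumed to look like "name:description, name:description".
    raw = parser.get(*PROVIDER_OPT, fallback="")
    names = [item.split(":", 1)[0].strip() for item in raw.split(",") if item.strip()]

    if "ovn" in names:
        nb = parser.get(*OVN_NB_OPT, fallback="").strip()
        if not nb:
            problems.append("ovn driver enabled but [ovn] ovn_nb_connection is not set")
        elif "127.0.0.1" in nb:
            problems.append("ovn_nb_connection points at loopback: " + nb)
    return problems
```

Reading `/etc/octavia/octavia.conf` inside the octavia_api container and passing its contents to `check_octavia_conf` should return an empty list once a reachable NB address is configured.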
3. Missing Octavia Driver Agent
Octavia added a driver-agent controller process in Stein (for provider drivers to communicate with Octavia for status and stats updates). This process is expected to be collocated with octavia-api and creates a pair of unix domain sockets that the drivers use for communication.
We don't seem to have this implemented in THT yet. Is this something planned, and is someone working on it? AFAICT, we can't enable/use the OVN provider driver without adding this service/container?
Yes, it is. Brent plans to work on this at some point in the coming weeks. Our goal is to have a reasonable integration state in Stein. You'd be more than welcome to help out should you want to. Let us know so we can coordinate.
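To illustrate why the agent and octavia-api need to see the same socket directory, here is a minimal sketch of two parties talking over a unix domain socket in a shared directory. It is not Octavia's actual wire protocol: the JSON message shape, the `status.sock` file name, and the function names are all invented for the example, and a temporary directory stands in for /var/run/octavia.

```python
import json
import os
import socket
import tempfile
import threading


def serve_once(sock_path, received, ready):
    """Agent side: accept one connection and record the status update."""
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as server:
        server.bind(sock_path)          # creates the socket file in the shared dir
        server.listen(1)
        ready.set()                     # signal that the socket is accepting
        conn, _ = server.accept()
        with conn, conn.makefile("rb") as reader:
            received.append(json.loads(reader.readline()))
            conn.sendall(b"ok\n")


def send_status(sock_path, update):
    """Driver side: connect to the agent's socket and push one update."""
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as client:
        client.connect(sock_path)
        client.sendall(json.dumps(update).encode() + b"\n")
        with client.makefile("rb") as reader:
            return reader.readline().strip().decode()


def demo():
    received = []
    ready = threading.Event()
    with tempfile.TemporaryDirectory() as shared_dir:   # stand-in for /var/run/octavia
        sock_path = os.path.join(shared_dir, "status.sock")
        agent = threading.Thread(target=serve_once, args=(sock_path, received, ready))
        agent.start()
        ready.wait(timeout=5)
        update = {"loadbalancers": [{"id": "lb-1", "operating_status": "ONLINE"}]}
        reply = send_status(sock_path, update)
        agent.join(timeout=5)
    return reply, received
```

Because the socket is just a file path, the two sides only need a common directory, which is what bind-mounting the same host directory into both containers would provide.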
Adding support for this in THT also seems a little complex and can probably be done in a few different ways. Is there a standard design pattern we use for this kind of requirement in TripleO? If we already have an agreed plan on how to do it, please ignore the below.
I can only think of the few options below. Probably there are better ones too :)
a. New OctaviaDriverAgent service
As the process is expected to be collocated with octavia-api to share the domain sockets (we can share /var/run/octavia from the host with both containers), I'm not sure this would be a good design though.
b. As a sidecar of octavia_api
Not a new service, but a sidecar container for octavia-api (started when the API starts; do we already have something like this for other services?).
Yes, Brent and I both favor running the driver agent as a sidecar container.
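Whichever option is chosen, both containers must bind-mount the same host directory for the sockets. The sketch below models two container definitions as plain dicts and checks that requirement. The dict layout, the `octavia_driver_agent` name, and the image names are hypothetical and do not reflect an agreed THT design; only /var/run/octavia comes from the discussion above.

```python
SOCKET_DIR = "/var/run/octavia"


def bind_mounts(container):
    """Map container path -> host path for each 'host:container[:opts]' volume."""
    mounts = {}
    for volume in container.get("volumes", []):
        host, target = volume.split(":")[:2]
        mounts[target] = host
    return mounts


def shares_socket_dir(api, agent, path=SOCKET_DIR):
    """True when both containers mount the same host directory at `path`."""
    api_host = bind_mounts(api).get(path)
    agent_host = bind_mounts(agent).get(path)
    return api_host is not None and api_host == agent_host


# Hypothetical container definitions, for illustration only.
octavia_api = {
    "image": "octavia-api:example",
    "volumes": ["/var/run/octavia:/var/run/octavia:shared"],
}
octavia_driver_agent = {
    "image": "octavia-api:example",
    "volumes": ["/var/run/octavia:/var/run/octavia:shared"],
}
```

With these definitions, `shares_socket_dir(octavia_api, octavia_driver_agent)` is true; dropping the volume from either side makes it false, which is the situation where the drivers could not reach the agent.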
Ideas/comments welcome:)
[1] https://review.opendev.org/#/q/status:open+branch:master+topic:bug/1825146
-- Regards, Rabi Mishra
participants (2): Carlos Goncalves, Rabi Mishra