Hi All,

For some Octavia testing with Heat, I tried deploying a 3+1 node overcloud with the Octavia environment and THT master. As expected, the deployment finished successfully. However, Octavia API calls (e.g. listing load balancers) returned a (503) Service Unavailable error. Although the octavia_api container was running, its logs showed it stuck in a continuous loop.

I have also captured some of these issues (described below) in https://bugs.launchpad.net/tripleo/+bug/1825146. The intent of this mail is to bring them to the attention of the respective teams and to understand the best way to fix them.

1. Incorrect config setting for provider_drivers.

As of https://github.com/openstack/puppet-tripleo/commit/97c46ca76803fb238279d24544e8aba9c5685632, we added OVN provider driver support for Octavia. However, the config option seems to be set incorrectly, which results in octavia-api failing to start.

2. ovn_nb_connection not in octavia config

It seems the OVN provider driver (part of networking-ovn) connects to the OVN NB database as part of its initialization. However, we don't seem to configure the connection string in octavia.conf, so it tries to connect to 127.0.0.1 and fails.
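To check a deployed node for this quickly, something like the minimal Python sketch below could be used. It assumes the option is named ovn_nb_connection and lives in an [ovn] section (as it does in networking-ovn's own config; I have not verified this for octavia.conf). The config fragments and the address are made up for illustration.

```python
import configparser


def find_ovn_nb_connection(conf_text, section="ovn", option="ovn_nb_connection"):
    """Return the configured NB connection string, or None if it is absent.

    Assumption: the section and option names are guesses based on
    networking-ovn; adjust them to whatever the driver actually reads.
    """
    # interpolation=None so '%' in values is not treated specially;
    # strict=False so duplicate options do not raise.
    parser = configparser.ConfigParser(interpolation=None, strict=False)
    parser.read_string(conf_text)
    return parser.get(section, option, fallback=None)


# Hypothetical octavia.conf fragments, for illustration only.
missing = "[DEFAULT]\ndebug = False\n"
present = (
    "[DEFAULT]\ndebug = False\n\n"
    "[ovn]\novn_nb_connection = tcp:192.0.2.10:6641\n"
)

print(find_ovn_nb_connection(missing))
print(find_ovn_nb_connection(present))
```

In practice the text would be read from the octavia.conf inside the octavia_api container rather than from inline strings.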

I've already proposed a few patches[1] to fix the above two issues.

3. Missing Octavia Driver Agent

Octavia added a driver-agent controller process in Stein (for provider drivers to communicate with Octavia for status and stats updates). This process is expected to be collocated with octavia-api and creates a pair of unix domain sockets that the drivers use for communication.
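To make the collocation requirement concrete, here is a minimal Python sketch of two parties talking over a unix domain socket in a shared directory. This is not Octavia's actual protocol or code: the socket name, the payload, and the function names are all hypothetical. It only shows that both sides need to see the same filesystem path, which is why the agent and the API have to share a directory.

```python
import os
import socket
import tempfile
import threading

# Hypothetical status update; the real driver-agent message format differs.
PAYLOAD = b'{"id": "lb-1", "operating_status": "ONLINE"}'


def serve_once(sock_path, received, ready):
    """Stand-in for the agent: accept one connection and record its data."""
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as server:
        server.bind(sock_path)
        server.listen(1)
        ready.set()  # signal that the socket file now exists
        conn, _ = server.accept()
        with conn:
            chunks = []
            while True:
                data = conn.recv(1024)
                if not data:  # peer closed the connection
                    break
                chunks.append(data)
            received.append(b"".join(chunks))


def send_update(sock_path, payload):
    """Stand-in for a provider driver: connect to the socket and send."""
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as client:
        client.connect(sock_path)
        client.sendall(payload)


def demo():
    # The temporary directory plays the role of a shared /var/run/octavia.
    with tempfile.TemporaryDirectory() as shared_dir:
        sock_path = os.path.join(shared_dir, "status.sock")
        received, ready = [], threading.Event()
        agent = threading.Thread(
            target=serve_once, args=(sock_path, received, ready)
        )
        agent.start()
        ready.wait(timeout=5)
        send_update(sock_path, PAYLOAD)
        agent.join(timeout=5)
        return received


if __name__ == "__main__":
    print(demo())
```

If the two processes run in separate containers, the directory holding the socket file has to be mounted into both of them for the connect() call to succeed.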

We don't seem to have this implemented in THT yet. Is this something planned, and is someone working on it? AFAICT, we can't enable/use the OVN provider driver without adding this service/container.

Adding support for this in THT also seems a little complex and can probably be done in a few different ways. Is there a standard design pattern we use for this kind of requirement in TripleO? If we already have an agreed plan on how to do it, please ignore the below.

I can only think of the few options below.  Probably there are better ones too:)

a. New OctaviaDriverAgent Service.

The process is expected to be collocated with octavia-api to share the domain sockets (we could share /var/run/octavia from the host with both containers), so I'm not sure a separate service would be a good design.

b. As a sidecar of octavia_api
  
Not a new service, but a sidecar container for octavia-api (started when the API starts). Do we already have something like this for other services?

Ideas/comments welcome:)

[1] https://review.opendev.org/#/q/status:open+branch:master+topic:bug/1825146

--
Regards,
Rabi Mishra