[kolla][ptg] Kolla PTG summary

Mark Goddard mark at stackhpc.com
Tue May 4 11:37:37 UTC 2021


Thank you to everyone who attended the Kolla Xena PTG. Thank you also
to those who helped out with chairing and driving discussions, since I
had a cold and my brain was far from 100%.

Here is a short summary of the discussions. See the Etherpad [1] for full notes.

# Wallaby Retrospective

On the positive side, we merged a number of useful features this
cycle, and eventually mitigated the Docker Hub pull limit issues in CI
by switching to quay.io.

On the negative side, we feel that we have lost some review bandwidth
recently. We really need more people in the community helping out with
reviews, and ideally moving to become members of the core team. The
barrier for entry is probably lower than you think, and the workload
does not have to be too heavy - every little helps. PLEASE get in
touch if you are interested in helping out.

# Review Wallaby PTG actions

Many of these are incomplete and still relevant. As usual, it's easier
to think of things to do than find time to do them. I've added them at
the end of the actions section of the etherpad, and we can revisit
them throughout the Xena cycle.

# General topics

## Deprecations

In the continuing drive to reduce the maintenance overhead of the
project, we discussed which components or features might need to be
deprecated next. With TripleO out of the picture, RHEL support is not
tested, so it will be deprecated in Wallaby for removal in Xena. We
discussed standardising on source image type and dropping support for
binary images, but agreed (once again) that it would be too disruptive
for users, and the repository model may be easier for users to mirror.

## Release process

We agreed to document some more of our release process. We also agreed
to try a new approach in Xena, where we begin the cycle deploying the
previous release of OpenStack (Wallaby) on our master branches, to
avoid firefighting caused by breakage in OpenStack projects. This
should provide us with more stability while we add our own features.
The downside is that we may build up some technical debt to converge
with the new release. Some details are still to be decided.

## Elasticsearch -> OpenSearch

The recent Elasticsearch licence change may be a blocker in some
organisations, and Amazon's OpenSearch appears to provide an
alternative. The final OSS release of Elasticsearch (7.10) reaches EOL
on 2022-05-10, so we have time to consider our options. Christian offered
to investigate a possible migration.

## Reflection

The team reflected a little on how things are going. We agreed that
the 4 most active core team members form quite a tight unit that
effectively makes decisions and keeps the project moving forward, with
the help of other core team members and other community members. We
also agreed that we are quite vulnerable to the loss of any of those
4, especially when we consider how many patches are authored and
approved by any 3 of those 4, and their areas of expertise.

## Future leadership

I have decided that this will be my last cycle as Kolla PTL. I still
enjoy the role, but I think it is healthy to rotate the leadership
from time to time. We have at least one person lined up for
nomination, so watch this space!

# Kolla (images) topics

## CentOS Stream 8

We have added support for CentOS Stream 8 (and dropped CentOS Linux 8)
in the Wallaby release. Since CentOS Linux 8 will be EOL at the end of
this year (THANK YOU VERY MUCH FOR THAT BY THE WAY), we need to
consider which stable releases may need to have CentOS Stream 8
support backported. RDO will not support Train on Stream. Ussuri will
move to Extended Maintenance before the end of 2021, however we know
that many users are still a few releases behind upstream. In the end,
we agreed to start with backporting support to Victoria, then consider
Ussuri once that is done. We agreed to keep CentOS Linux 8 as the
default base image due to backwards compatibility promises. Stream
images will have a victoria-centos8s tag to differentiate them,
similarly to how we handled CentOS 8.

## Official plugins

We agreed to remove non-official plugins from our images. Often these
projects branch late and break our release processes. We agreed to
provide example config snippets and documentation to help users add
these plugins back into their images.

# Kolla Ansible

## Glance metadef APIs (OSSN-0088)

The Glance team discussed this issue at the PTG. We have an action to
follow up on their decision.

## Opinionated, hardened configuration

Should we provide an opinionated, hardened configuration? For example,
TLS by default, conformance to the OpenStack security checklist, etc.
There were some nods of approval, but no one offered to implement it.

## Running kolla-ansible bootstrap-servers without stress

Currently, running kolla-ansible bootstrap-servers on an existing
system can restart docker on all nodes concurrently. This can take out
clustered services such as MariaDB and RabbitMQ. There are various
options to improve this, including a serial restart, or an intelligent
parallel restart (including waiting for services to be up). We decided
on an action to investigate enabling Docker live restore by default,
and what its shortcomings may be.
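
One of the options mentioned above, a serial restart that waits for
each node's services to come back before moving on, could look roughly
like the following Python sketch. The `restart` and `is_healthy`
callables, the timeout values and the host names are hypothetical
placeholders, and this is not how kolla-ansible currently behaves.

```python
import time
from typing import Callable, Iterable, List


def rolling_restart(
    hosts: Iterable[str],
    restart: Callable[[str], None],
    is_healthy: Callable[[str], bool],
    timeout: float = 300.0,
    poll_interval: float = 5.0,
) -> List[str]:
    """Restart hosts one at a time, waiting for each to report healthy.

    Returns the hosts that were restarted, in order. Raises TimeoutError
    if a host does not become healthy in time; the remaining hosts are
    left untouched so that clustered services can keep quorum.
    """
    done: List[str] = []
    for host in hosts:
        restart(host)  # hypothetical: e.g. restart Docker on this host
        deadline = time.monotonic() + timeout
        while not is_healthy(host):  # hypothetical health probe
            if time.monotonic() >= deadline:
                raise TimeoutError(f"{host} did not become healthy")
            time.sleep(poll_interval)
        done.append(host)
    return done
```

Docker live restore (the `live-restore` option in `daemon.json`) takes
a different route: containers keep running while the daemon itself is
restarted, so no ordering between hosts is needed.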

## Removal/cleanup of services

We again discussed how to do a more fine grained cleanup of services.
Ideally this would be per-service, and could include things like
dropping DBs and Keystone endpoints. yoctozepto may have time to look
into this.

## Cinder active/active bug

We are still affected by [2]. We have a fairly good plan to resolve
it, but some details around migration of existing clusters need
ironing out. yoctozepto and mnasiadka to continue looking at it.

## More fine-grained skipping of tasks, e.g. allowing service registration to be skipped

We agreed to abandon the tag-based approach in favour of pushing an
existing effort [3] to make a split genconfig, deploy-containers
command pair work. This achieves one of the main gains of skipping the
bootstrap and registration steps, and is useful for other reasons.

## Let's Encrypt

We agreed to split the existing patches [4] into the contentious and
non-contentious parts. Namely, separate HAProxy automatic certificate
reload from certbot support. The HAProxy reload is currently
implemented via supervisord and cron in the container, which does not
fit well with our single-process container model. We will look into an
HAProxy upgrade for automatic certificate rotation detection as a
possible solution.

## ProxySQL

kevko outlined the proposed ProxySQL patch chain [5], as well as some
more details on the nature of the problem being solved. In Wallaby we
added initial support for multiple MariaDB clusters. ProxySQL builds
on this, providing sharding at the database schema level between
different OpenStack projects (nova, neutron, etc.).
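
To illustrate what sharding at the schema level means, here is a
minimal Python sketch of the routing idea only. The shard names and
the schema-to-shard assignments are made up for illustration, and the
sketch says nothing about how the proposed patches or ProxySQL itself
are configured.

```python
from typing import Dict

# Hypothetical mapping of database schema (roughly one per OpenStack
# project) to the MariaDB cluster ("shard") that hosts it.
SCHEMA_TO_SHARD: Dict[str, str] = {
    "nova": "shard_0",
    "nova_api": "shard_0",
    "neutron": "shard_1",
    "keystone": "shard_1",
}

# Schemas without an explicit entry fall back to this cluster.
DEFAULT_SHARD = "shard_0"


def route(schema: str) -> str:
    """Return the cluster that should serve queries for this schema."""
    return SCHEMA_TO_SHARD.get(schema, DEFAULT_SHARD)
```

Every query for a given schema always lands on the same cluster, so
each project sees a single consistent database while the load is
spread across clusters.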

# Kayobe

## Running host configure during upgrades

Pierre proposed a change [6] to our documented upgrade procedure to
include a 'host configure' step. Sometimes this is necessary to
upgrade or reconfigure Docker, or other parts of the host setup. We
agreed that this should be done, but that we should also improve the
'kayobe <seed|seed-hypervisor|overcloud> host upgrade' commands to
avoid a full host configure run.

## Support provisioning infrastructure VMs

We walked through the proposed feature [7], including the various
milestones proposed. An MVP of simply provisioning infra VMs on the
seed hypervisor was agreed.

## Ubuntu

We discussed what has been achieved [8] in Wallaby so far, what still
needs to be done before the release, and what we have left to look at
in Xena and beyond. We are on track to have most features available in
Wallaby. We discussed whether to backport this work to Victoria. It
will depend on how cleanly the code applies.

## Multiple environments

Pierre implemented the majority of this long-awaited feature [9] in
Wallaby. We still require CI testing, and the ability to share common
Kolla configuration between environments. These will be looked at in

## Multiple host images

Pierre started work [10] on this feature. The aim is to describe
multiple root disk images with different properties, then map these
images to different overcloud hosts.
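
As a rough picture of the data model this implies, the Python sketch
below describes a few images and maps hosts onto them. The image
names, properties and host names are all hypothetical, and this is not
Kayobe's actual configuration format.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass(frozen=True)
class HostImage:
    """A root disk image description (hypothetical fields)."""
    name: str
    properties: Dict[str, str] = field(default_factory=dict)


# Hypothetical image definitions with different properties.
IMAGES: Dict[str, HostImage] = {
    "default": HostImage("default", {"os_distribution": "centos"}),
    "gpu": HostImage("gpu", {"os_distribution": "ubuntu"}),
}

# Hypothetical mapping of overcloud hosts to image names.
HOST_TO_IMAGE: Dict[str, str] = {
    "controller0": "default",
    "gpu0": "gpu",
}


def image_for(host: str) -> HostImage:
    """Look up the image for a host, falling back to the default image."""
    return IMAGES[HOST_TO_IMAGE.get(host, "default")]
```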

# Priorities

We usually vote for community priorities, however this cycle it was
felt that they only have a minimal effect on activity. Let's see how
it goes without them.


[1] https://etherpad.opendev.org/p/kolla-xena-ptg
[2] https://bugs.launchpad.net/kolla-ansible/+bug/1904062
[3] https://review.opendev.org/c/openstack/kolla-ansible/+/773246
[4] https://review.opendev.org/q/topic:%22bp%252Fletsencrypt-https%22+(status:open%20OR%20status:merged)
[5] https://review.opendev.org/q/hashtag:%22proxysql%22+(status:open%20OR%20status:merged
[6] https://review.opendev.org/c/openstack/kayobe/+/783053
[7] https://storyboard.openstack.org/#!/story/2008741
[8] https://storyboard.openstack.org/#!/story/2004960
[9] https://storyboard.openstack.org/#!/story/2002009
[10] https://storyboard.openstack.org/#!/story/2002098
