[TripleO] Xena PTG session summaries

Marios Andreou marios at redhat.com
Mon Apr 26 16:37:32 UTC 2021


Hello folks,

I sent out some stats and links on our PTG meetup with
http://lists.openstack.org/pipermail/openstack-discuss/2021-April/021999.html
already, but, as a couple of different people asked me about it, I
took the time to write a summary for each session today. Of course you
can find all etherpad links and recordings via
https://etherpad.opendev.org/p/tripleo-ptg-xena (which seems to be
down right now, but I have backups; if it isn't resolved by tomorrow I
can try sharing that content somewhere else).

Below is a (very) concise summary of the main points in each session
and I hope that is useful to someone (especially since it took far
longer than I expected ;)). Please reply here (or ping privately if
you prefer) for any glaring omissions or obvious issues that should be
revised (I originally intended to put this in an etherpad for easier
collaboration but as I wrote above, etherpad.opendev.org seems down
right now at least for me).

regards, marios

MON:

* https://etherpad.opendev.org/p/tripleo-ptg-retrospective

Retrospective of the Wallaby cycle - there are some community and team
level 'headlines' on the main items worked on during this cycle on the
etherpad.
Some ideas identified for improvement include targeting another older
branch for end-of-life (likely Queens), improving upstream documentation
(especially removal of stale content), and creating a tag in Launchpad
for teams so we can more easily identify which squad is currently
assigned.

* Topic: Plan/Swift removal update. Presentation link:
https://drive.google.com/file/d/1igOW4XuAbU55Tat73DwLqO4UGZu8MiNi/view?usp=sharing

An update on the work completed in the Wallaby cycle to remove the
Swift service and the deployment plan (which is no longer used as part
of our deployments) from the undercloud. From Wallaby onward, by
default there is no undercloud Swift. There may be a revision of the
spec https://opendev.org/openstack/tripleo-specs/commit/e83d8aba3a950da83a33c23bcef6ffc38f00002f
as the original plan didn't explicitly consider removal of the
deployment plan.

* https://etherpad.opendev.org/p/tripleo-ephemeral-heat

Update on the ephemeral heat work (i.e. no permanent heat process on
the undercloud). There has been very strong progress made in this
cycle and there are still some outstanding patches
https://review.opendev.org/q/topic:%22ephemeral-heat%22+(status:open)
to be merged. The goal is to make this the default in Xena deployments and
backport it to Wallaby as optional. Besides the main feature, some
related planned work includes consolidation of the
python-tripleoclient "overcloud deploy" and "tripleo deploy" (e.g.
standalone) commands. Note that this work depends on the
tripleo-network-v2 work (next session below).


* https://etherpad.opendev.org/p/tripleo-network-v2

Update on the network ports v2 work (moving network port creation out
of the heat stack)
https://opendev.org/openstack/tripleo-specs/src/branch/master/specs/wallaby/triplo-network-data-v2-node-ports.rst
- again good progress on this during Wallaby but there is still some
ongoing work there
https://review.opendev.org/q/topic:%22network-data-v2%22+(status:open)
. The goal for Xena is to make this the default (i.e. no
node/networking config in deploy-steps-playbook.yaml). One main area
of work for Xena in this topic is integration of the baremetal network
config into the overcloud deployment (i.e. allow a single command).

* https://etherpad.opendev.org/p/tripleo-ceph-xena

Update from the ceph team about the main work items completed in
Wallaby including the tripleo-ceph-client and tripleo-ceph in place of
ceph-ansible for RBD (
https://specs.openstack.org/openstack/tripleo-specs/specs/wallaby/tripleo-ceph-client.html
and https://specs.openstack.org/openstack/tripleo-specs/specs/wallaby/tripleo-ceph.html
). The main work planned for Xena is to continue trying to achieve
feature parity with ceph-ansible, including resolving cephadm
blockers and Ganesha:
https://specs.openstack.org/openstack/tripleo-specs/specs/wallaby/tripleo-ceph-ganesha.html
. One major consideration is how to move ceph creation/config outside
of the heat stack - some parts such as pools, keyrings and haproxy
config will have to remain as part of the tripleo deployment. Note
that this work depends on the network ports v2 (previous session
above).

TUE:

* https://etherpad.opendev.org/p/tripleo-xena-whole-disk-images

A proposal to move to whole disk images instead of the current
overcloud-full.qcow2 + overcloud-full.initrd + overcloud-full.vmlinuz.
There were many compelling arguments made for the proposal, including:
with the overcloud-full.qcow2 partition image, as of CentOS 8.4 grub2
no longer supports UEFI boot; there will be much less for
ironic-python-agent to do during deployment with a single disk image;
there will be just one file to distribute (vs 3); and we will no longer
need to define and build a separate 'hardened' image (so the related
CI jobs can also be removed). One of the main technical issues that
needs to be addressed first is how to grow the partition for /var,
which is where we store containers and config for deployment.

* https://etherpad.opendev.org/p/tripleo-xena-drop-healthchecks

Proposal to drop the container health checks since they are using
deployment resources but aren't providing value. There was no pushback
against
this proposal and the details are being discussed in the newly posted
spec @ https://review.opendev.org/c/openstack/tripleo-specs/+/787535.

* https://etherpad.opendev.org/p/ci-tripleo-repos

Proposal to consolidate the various ways and places that tripleo-ci is
using to configure the repos in the CI jobs. There is a spec proposed
@ https://review.opendev.org/c/openstack/tripleo-specs/+/772442 - some
of the work here is split into sub-items which are ongoing
(e.g. tripleo-get-hash at
https://review.opendev.org/c/openstack/tripleo-ci/+/784392). The main
outstanding blocking item here is to agree on the common data format
for the various personas (upstream, downstream and product) that we need
to support, e.g. https://github.com/mwhahaha/rhos-bootstrap/blob/main/versions/centos.yaml
vs https://review.opendev.org/c/openstack/tripleo-repos/+/785593/1/tripleo_repos/conf/master.yaml

* openstack tempest skiplist
https://docs.google.com/presentation/d/1aCiV35IYNhPV7SRmi4_A9vkIjZ89pwfC4VvL6frFJNE/edit?usp=sharing

Update on the tempest skiplist effort during Wallaby to consolidate
the skipped Tempest tests in a central location with the ability to
specify particular jobs and/or branches for which specific skips will
apply.
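
As a rough illustration only of the job/branch matching idea (the entry
layout, key names and test names below are hypothetical and are not the
actual skiplist format), the lookup could be sketched as:

```python
# Hypothetical skiplist entries; the real skiplist format may differ.
# An entry without "jobs" or "branches" (or with an empty list) is
# assumed here to apply to every job or every branch respectively.
EXAMPLE_SKIPLIST = [
    {"test": "tempest.example.TestA", "jobs": ["job-standalone"], "branches": ["master"]},
    {"test": "tempest.example.TestB", "jobs": [], "branches": ["stable/wallaby"]},
    {"test": "tempest.example.TestC"},
]


def skipped_tests(skiplist, job, branch):
    """Return the names of the tests to skip for the given job and branch."""
    result = []
    for entry in skiplist:
        jobs = entry.get("jobs") or []
        branches = entry.get("branches") or []
        # A non-empty list restricts the skip to the listed jobs/branches.
        if jobs and job not in jobs:
            continue
        if branches and branch not in branches:
            continue
        result.append(entry["test"])
    return result
```

Keeping every skip in one list like this means a single lookup per job
run can answer "what do I skip here?" without per-job skip files.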

* https://etherpad.opendev.org/p/tripleo-next

One of the main items discussed here was the 'first principles'
proposal at https://review.opendev.org/c/openstack/tripleo-specs/+/786980
- these are meant to guide us when discussing changes to our
deployment tooling and architecture. The proposal will merge in the Xena
specs once we've reached consensus on the review. Another topic
discussed in this session was an update on exploratory work to replace
"heat & ansible" in our deployment tooling with 'something else' -
some ongoing work here is at https://github.com/cloudnull/director &
https://github.com/mwhahaha/task-core. More info and pointers are on the
etherpad (we also discussed Kube/OCP with an operator to deploy
tripleo).

WED:

* https://etherpad.opendev.org/p/tripleo-xena-inventory-script

This was a proposal to remove the "tripleo-ansible-inventory" script @
https://github.com/openstack/tripleo-common/blob/ccd990b58b6583dda3a0e0f34135ae343c833f70/tripleo_common/inventory.py#L744
and instead generate the inventory from the deployment data (e.g.
metalsmith or user data from deployed-server deployments). The consensus
reached was that rather than removing it we should use it in a better
way, for example make sure static inventories are generated and exported
to known locations (especially for the ephemeral heat case) and re-used.
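
To make the "static inventory exported to a known location" idea a bit
more concrete, here is a minimal sketch. It is not the
tripleo-ansible-inventory implementation: the node record keys (name,
role, ip), the function names and the output path are all assumptions
made up for illustration. The dict layout follows the all/children/hosts
shape of Ansible's YAML inventory format, and JSON is used only to keep
the sketch dependency-free.

```python
import json
from pathlib import Path


def build_inventory(nodes):
    """Build an Ansible-style inventory dict from a list of node records.

    Each record is a dict with the hypothetical keys "name", "role", "ip".
    Nodes are grouped by role under all -> children.
    """
    inventory = {"all": {"children": {}}}
    for node in nodes:
        group = inventory["all"]["children"].setdefault(node["role"], {"hosts": {}})
        group["hosts"][node["name"]] = {"ansible_host": node["ip"]}
    return inventory


def export_inventory(inventory, path):
    """Write the inventory to a fixed location so later runs can re-use it."""
    path = Path(path)
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(inventory, indent=2, sort_keys=True))
    return path
```

A later step could then read the exported file back from the same path
instead of querying a running heat process, which is the point of having
a static copy in the ephemeral heat case.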

* https://etherpad.opendev.org/p/vf-ui-output

This was an update from the validations squad about the main items
worked on during Wallaby (integrated the validation framework into the
component CI pipelines, enabled the standalone job in upstream
check/gate and increased adoption especially by the upgrades squad).
This was followed by discussions of planned Xena work, including changes
in the UI/CLI (e.g. jq queries can be handled better, plus various other
UI improvements; more on the etherpad). Some of the other topics raised
here were making the validations themselves component aware (run all
validations related to a given component) and the
requirement for a molecule test on all validation additions
(especially the example of mocking out OpenStack services like
keystone); the compromise could be to instead use a standalone job for
such cases.
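
For the "component aware" idea, one possible shape (purely a sketch: the
"components" key, the validation ids and the function names are
hypothetical, not part of the validation framework) is to tag each
validation with the components it covers and build an index from that:

```python
from collections import defaultdict


def index_by_component(validations):
    """Map each component name to the ids of the validations tagged with it.

    Validations without a "components" key are simply left out of the index.
    """
    index = defaultdict(list)
    for validation in validations:
        for component in validation.get("components", []):
            index[component].append(validation["id"])
    return dict(index)


def validations_for(index, component):
    """Return the validation ids to run for one component (empty if unknown)."""
    return index.get(component, [])
```

With an index like this, "run all validations related to a given
component" becomes a single dictionary lookup.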

* https://etherpad.opendev.org/p/Validation-Framework-Next-Generation

In this session the validations squad introduced ideas for the future
direction of the validation framework. Some of the main proposals are
to move the validations repos (validations-common and
validations-libs) out of tripleo governance but still within openstack,
establishing a new validations project (discussion but no clear
consensus on this point), to re-merge the two repos into one
consolidated validations repo, and to fix up the CLI (see previous
session) - more items and other considerations on the etherpad.

* https://etherpad.opendev.org/p/tripleo-frr-integration

Update on Wallaby progress from the cross-squad team looking at
FRR/BGP integration in the tripleo deployment
(https://opendev.org/openstack/tripleo-specs/src/branch/master/specs/wallaby/triplo-bgp-frrouter.rst).
Some of the main items discussed for Xena work included how we might
approximate some part of this feature in upstream CI (high resource
requirements - downstream CI has 9 nodes) and backport considerations
(no backport to upstream/train).

* https://etherpad.opendev.org/p/update-upgrade-consolidation

In this session the upgrades squad outlined their proposal for
consolidation of the minor update and major upgrade workflows -
without any blockers or objections coming out of the discussion. One
of the main considerations was around how we can decouple the
operating system updates/upgrades from the tripleo container upgrade -
one action item is to de-containerize those containers that are tied
to the kernel version (ABI) such as libvirt and openvswitch.

THU:

* https://etherpad.opendev.org/p/policy-popup-xena-ptg

In this session the security squad gave an update on progress during
Wallaby on Role Based Access Control (RBAC) - many services have
completed implementation (Keystone, Nova, Ironic - more on the
etherpad). Then there was a discussion around potential integration
points during the tripleo deployment, for example
https://review.opendev.org/c/openstack/tripleo-heat-templates/+/781571/7/environments/enable-secure-rbac.yaml
. One of the considerations was around how we can test this in CI
(possibly the standalone job is a good fit) as well as the use of
multiple clouds.yaml for project specific operations during the
deployment (with the root clouds.yaml having the system-admin
profile).

* https://etherpad.opendev.org/p/centos-stream-9-upstream

In this session the CI squad led a discussion around CentOS Stream 9
(possibly coming Apr/May) and what we should consider/prepare for with
respect to upstream CI. Some of the main changes and discussion items
included NetworkManager and firewalld replacing iptables, ansible
version (2.11/2.12?/?). Mainly this effort is blocked on the actual
9-stream release and getting the relevant nodepool node. Another main
discussion point here was whether we would support both stream-8 and
stream-9 on particular branches - consensus here is that Wallaby has
both 8/9 and Xena can have only 9 - but this is all dependent on when
9 becomes available with respect to when Xena is released.

* https://etherpad.opendev.org/p/os-migrate

This session was an update from the upgrades squad around the
os-migrate tool (https://github.com/os-migrate/os-migrate) - which
aims to 'copy' your openstack deployment and in particular the
end-user workloads (i.e. user data, vms etc, but not the controlplane)
onto new hardware, as an alternative to the in-place upgrade. More
information and slides @
https://docs.google.com/presentation/d/1UYGOI89MBLHLpS89mPp0VK1yvTYtb2BamUL_DmfGLGA/edit?usp=sharing



