[openstack-ansible] PTG results
PS: Oh, and by the way, more nice news from the PTG week: we agreed with the keystone team to add rolling upgrade testing with OSA to their CI. This was done until 2017 or so, but we have become much more mature since then, and it's high time we renewed that integration :)

05.06.2020, 20:53, "Dmitriy Rabotyagov" <noonedeadpunk@ya.ru>:
Hi there,
Sorry for the previous email - it was sent accidentally while still at the draft stage... I did some formatting afterwards :)
Here are some of the decisions we came up with during the PTG week:
* CentOS 8 topic.
** Install systemd-extras to get systemd-networkd
** Launch without LXC build at first because of the complexity of the solution. To implement LXC we would need the following:
*** Replace what `machinectl` does with Ansible tasks, as it does not work properly without btrfs
*** See how much it costs to implement lxd - that would not only resolve the CentOS 8 issue, but is also much appreciated in the community. However, installing it with snapd means it will auto-update. It's possible to delay the updates by up to 60 days, but that still needs fixing/cleaning up
** Make nspawn unmaintained, because it mostly relies on machinectl.
*** Remove nspawn from docs.
*** Remove the code in the cycle afterwards.
** Backport CentOS 8 to Ussuri, drop CentOS 7 for Victoria afterwards.
*** On Ussuri CentOS 7 is going to live without distro support
*** Write a reno that explains that the upgrade path for distro-installed OSA might be tricky/broken for CentOS because of absent CentOS 7 packages for U.
* Logs topic
** Finish transition to journald
*** Check heat for logging
*** Check for ceph logs - see [1], but this may be a result of the way cephadm containerises all the ceph daemons
** Rewrite the log collection script in Python with the systemd Python bindings and get deprecation messages from services' journals into a separate file
** Check where we don't use the uwsgi role and see if we can use it there now (like designate)
** Check through the logs which roles are covered by tempest and which are not. We have a bunch of roles that run tempest but test only keystone, for example (and not the role itself).
** Add libvirtd_exporter [2] to ansible-role-requirements and offer its deployment on users' own prometheus. Offer prometheus deployment as step 2. Document usage
* Promote ELK stack to a first-class thing
** Create a separate repo and remove it from openstack-ansible-ops
** Provide out-of-the-box deployment with OSA. The model would be similar to ceph-ansible, where deployment can be integrated or standalone
* Work on speeding up OSA runtime:
** Fight with skipped tasks (i.e. by moving them to separate files that would be included) - most relevant for the systemd_service, systemd_networkd and python_venv_build roles
** Try to split up variables by group_vars
** Try to use includes instead of imports again
* Speed up CI
** Try to speed up the zuul required-projects clone process - work with the infra team
** Set *_db_setup_host across all roles to utility and adjust [7]
* Build mariadb deps for focal (like the 10.4.12 release). We can use repo.vexxhost.net for hosting them until the mariadb 10.4.14 release.
* If we have issues with distro jobs/support, we don't hesitate to remove them or set them to non-voting
* Remove SUSE support early in Victoria. We already have [4] for this - needs rebasing
* Add neutron ovn to integrated tests (with a view to making it the default for new deployments).
* Drop resource creation tasks out of os_tempest - OSA and TripleO manage resource creation themselves and pass required vars to os_tempest for config generation
* Add support for zookeeper deployment for services coordination (like telemetry, designate, etc)
* add tooling to bootstrap-ansible to apply provided gerrit patches for roles - start with this [3]
* Try to add aarch64 jobs with a separate pipeline once we have some python wheels built up
* Migrate group names to remove underscores. Ansible warns: "The TRANSFORM_INVALID_GROUP_CHARS settings is set to allow bad characters in group names by default, this will change, but still be user configurable on deprecation. This feature will be removed in version 2.10"
* Add tooling to bootstrap-ansible to apply provided gerrit patches for roles - start with this [5], use it like this [6]
* Publish common roles (galera, haproxy, memcached, uwsgi, python_venv_build, etc...) to galaxy and rename them to the ansible-role-* pattern. As stage 2, consider publishing os roles.
* add some check for repo server, to verify it's ok (lua linters check) instead of failing afterwards because of missing dev libraries for hosts
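
For the gerrit patch tooling item above, here is a minimal sketch of the one piece that is fixed by Gerrit itself: the ref under which a change's patchset is published (refs/changes/<last two digits of the change number>/<change>/<patchset>). This is not what [5] or [6] contain. The function names, the repo URL and the patchset number are assumptions made only for illustration, and cherry-picking FETCH_HEAD is just one way to apply a fetched patch.

```python
from typing import List


def gerrit_change_ref(change: int, patchset: int) -> str:
    """Build the Gerrit ref for one patchset of a change.

    Gerrit shards refs by the last two digits of the change number,
    zero-padded, e.g. change 725541 lives under refs/changes/41/.
    """
    if change <= 0 or patchset <= 0:
        raise ValueError("change and patchset must be positive integers")
    shard = f"{change % 100:02d}"
    return f"refs/changes/{shard}/{change}/{patchset}"


def fetch_commands(repo_url: str, change: int, patchset: int) -> List[str]:
    """Return the git commands that would fetch and apply the patchset.

    Hypothetical helper: it only builds the command strings, so it is
    safe to run anywhere; real tooling would execute them in the role's
    checkout.
    """
    ref = gerrit_change_ref(change, patchset)
    return [
        f"git fetch {repo_url} {ref}",
        "git cherry-pick FETCH_HEAD",
    ]


if __name__ == "__main__":
    # The repo URL and patchset number 1 are illustrative assumptions.
    for cmd in fetch_commands(
        "https://review.opendev.org/openstack/openstack-ansible", 725541, 1
    ):
        print(cmd)
```

Keeping the ref computation separate from command execution makes it easy to feed the tooling a plain list of change/patchset pairs per role.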
[1] https://github.com/ceph/ceph/blob/be117b555fc1bba1048b87a624d542fd629d1ad1/d...
[2] https://github.com/jrosser/rd-ansible-libvirtd-exporter
[3] http://paste.openstack.org/show/794258/
[4] https://review.opendev.org/#/c/725541/
[5] http://paste.openstack.org/show/794258/
[6] http://paste.openstack.org/show/794259/
[7] https://review.opendev.org/#/c/671454
--
Kind Regards,
Dmitriy Rabotyagov