[tripleo] Victoria TripleO PTG summary
Greetings,

Thanks to everyone who attended the OpenStack PTG last week! A special thanks to those who presented their topics and discussed work items with the folks in attendance. As you know, the event was hosted virtually over video conference and seemed quite busy, packed with great topics and conversations. As the current PTL for TripleO, I will do my best here to summarize those conversations and the items others should be made aware of. To review the topics and discussion, please follow the links here [1]. The event was recorded; however, the OpenStack Foundation has not made any of the videos publicly available yet.

Monday June 1st:

Retrospective:
The TripleO project started with a retrospective of the Ussuri cycle. I attempted to use OpenStack’s Storyboard for the process, but had to revert to an etherpad for usability. Keep trying, Storyboard is getting there. The good news is that the good things outweighed the bad things [2], and the ideas for improvement were focused on making things faster \0/

TripleO Operator Ansible Status: by Alex Schultz
Alex gave a nice overview of his hard work throughout Ussuri to make TripleO Operator Ansible a reality. TripleO Operator Ansible is now the official way to execute TripleO commands via Ansible. The upstream CI and consultants in the field are all consolidating around the tool. The history of reviews that made this happen can be found at [4]. While Alex completed a lot of the work, he also attracted a number of contributors who completed a lot of work as well. One note that Alex wanted to emphasize was that while TripleO Operators are meant to be executed by customers and consultants, TripleO-Ansible is NOT meant to be exposed or called directly. Slides are available here [3]. Thank you Alex!

The Future of python-tripleoclient: by Rabi Mishra
Rabi led a very interesting conversation about the steps the project would have to take to further simplify the stack of projects used in a TripleO deployment.
Currently a client call passes through a number of layers (tripleo-operator-ansible, python-tripleoclient, tripleo-ansible playbooks/modules, and the tripleo-common library), which is complex and not an ideal user experience in terms of logs and resolving bugs. Out of the gate, Rabi discussed breaking down python-tripleoclient into something more basic and moving more functions to Ansible modules. The proposal was to get rid of the CLI, or replace it with a very simple one, and move all the logic in tripleoclient to Ansible playbooks/modules. The top-level playbooks would map directly to current CLI actions and would live in the tripleo-ansible repo. tripleo-operator-ansible can also change to use those playbooks directly and transparently under the hood. Details from the session can be found here [5]. Thanks Rabi!

Ansible Strategies & us: by Alex Schultz
Alex was up again to let us know what he’s been up to in making Ansible more performant. Ansible offers several different kinds of “strategies” with regard to how tasks are executed across multiple hosts. The strategies are pluggable, and Alex has built a custom strategy, currently called “TripleO Free”, that can be used across some but not all of TripleO’s tasks [7]. The performance enhancement is spectacular, reducing a 30 node deployment from almost 2 hours to under 50 minutes. Well done!!! I’ll note the strategy name will be changed to garner more community support. The performance gains are also not as pronounced with fewer nodes: a standard CI-like deployment (4-5 nodes) can expect to see 20 minutes cut off the deployment time. Slides of Alex’s presentation can be found here [6]. Very well done! Thanks Alex!

Mistral has been removed, so what is left to do? by Kevin Carter
Kevin hit us next with what to expect now that Mistral has been removed and what steps we need to take next to make the community successful. I’ll note the Mistral container is still on the undercloud but inactive, and workflow processing has been converted directly to Ansible.
There is still some cleanup in tripleo-common and rpm dependencies to prune. A link to the conversation is available [8], and Kevin’s presentation is available [9]. Thank you Kevin!

TripleO Operator Pipelines by Emilien Macchi
Emilien walked us through what it would take to further consolidate on TripleO Operator Ansible based CI pipelines in TripleO, OSP and at customer sites. We reviewed how to break down the full workflow of a deployment and day two operations in CI. The goal here is to replace as much CI as possible with TripleO Operator Ansible, so that there is a standardized Ansible interface to TripleO in any CI or customer environment. TripleO Operators are shipping in Ussuri and should be backwards compatible with earlier releases. There will be a major push upstream to further integrate TripleO Operator Ansible into every CI job. Notes and comments are published here [10]. Thanks a million!

[1] https://etherpad.opendev.org/p/tripleo-ptg-victoria
[2] https://etherpad.opendev.org/p/tripleo-ptg-retrospective
[3] https://docs.google.com/presentation/d/1Oxs-sflnJd5KIMoY6e_mlfU2zK4QWqWSG7Ah...
[4] https://etherpad.opendev.org/p/tripleo-operator-ansible
[5] https://etherpad.opendev.org/p/tripleo-future-of-tripleoclient
[6] https://docs.google.com/presentation/d/19mr3HyyYUUGcwbRHsy2-k-F4WNCYCoFvSPzd...
[7] https://review.opendev.org/#/q/topic:strategy-improvements
[8] https://etherpad.opendev.org/p/tripleo-remove-mistral
[9] https://docs.google.com/presentation/d/1iir7FA6YwBxRoU_SZJQ2H9Enbak4GAfBw73w...
[10] https://etherpad.opendev.org/p/tripleo-operator-pipelines

Tuesday June 2nd

CI updates: by Wes Hayutin
In the CI update I mostly covered the new upstream Component Pipeline. The component pipeline has three major goals: first, to enable us to release at any time with working components of OpenStack; second, to break down a large problem into smaller problems; and third, to reduce the time to debug and fix. The presentation covers monolithic vs. component builds, the workflow, and the testing and monitoring of the pipeline. The presentation is available here [11]. I also noted that the upstream CI executed 268,805 deployments of TripleO in the Ussuri cycle. Third party CI executed 132,853 deployments. Not too shabby. Details are here [12]. Thanks easter bunny!

IPv6 and DCN (routed-networks) in upstream CI by Harald Jensas
Harald kicked off the next topic about utilizing more advanced networking with OVB in our third party TripleO CI. The proposal is to update the CI with multiple network segments. Harald has been the primary for OVB (OpenStack Virtual Baremetal) and looks to be wrapping up this feature [13]. Documentation for the feature can be found here [14], and notes of the discussion are posted [15]. Thank you Harald!

Enable network isolation by default by Harald Jensas
Harald continued the networking discussion by highlighting common mistakes made in the field with network isolation settings and TripleO. When customers or consultants accidentally forget to include network-isolation settings, Heat can be destructive to the production environment and delete networks during an update or upgrade. The discussion noted that patches which already solve the issue have merged, but catching the mistake earlier in the process was still a concern and led to discussion around additional validations. The goal also shifted toward making network isolation more approachable for our customers. A spec will be written to improve the customer experience here. Notes can be found [16]. Thanks Harald!

Future deprecation of tripleo-validations by Cedric Jeanneret
Cedric led us through the current status and future of TripleO Validations. The deprecated version of the validations does not have clear ownership, and testing each validation has proved to be difficult. Enter the validation team’s new framework, where the validation service is clearly delineated from the validations themselves.
We discussed ownership, packaging, and CI. The entire workflow of packaging and testing has clear ownership and fits in very neatly with the component pipeline. Please read through the details of the discussion, as this will impact several projects with a clean, exciting way to validate each service in OpenStack [17]. Thank you Cedric!!

Container Image Build v2 by Emilien Macchi and Kevin Carter
Kevin and Emilien have been putting in a lot of extra hours revamping the container build system for TripleO and have produced a much improved system in record time. TripleO will benefit from smaller containers, faster builds and the flexibility to handle upstream and downstream builds easily. If you haven’t seen the presentation, please do have a look [18]. Notes on the topic can be found here [19]. Get involved if you can keep up. Thanks Emilien, Kevin!

Ceph Integration w/ cephadm by Francesco Pantano, Giulio Fidente, John Fulton
The storage trio walked us through the details of cephadm and the ramifications of replacing ceph-ansible. We discussed a wide range of topics here, including which features should be built into TripleO vs. handled directly by cephadm, such as scale up/down, updates and upgrades. The team walked us through how they dissected the deployment and injected cephadm as a proof of concept that everything works well, and proved we’re in good hands on the storage front. There is a lot of detail in the notes, so please have a read through here [20]. Thank you Francesco, Giulio (aka bob dylan), John!

Removal of Heat and Swift from Undercloud by Rabi Mishra
Rabi continued from his earlier topic regarding the noble effort to further simplify the OpenStack deployment by removing Heat and Swift. Rabi articulately described how Heat is currently used in the latest release and what would have to be done to remove both Heat and Swift, walking us through Heat resources, extra config, IPAM etc. and how each could potentially be replaced.
I personally really enjoyed hearing this particular topic and how we can move forward in making OpenStack less complex. This is an important topic and you should review Rabi’s clear strategy here [21]. Thanks Rabi!

Database migrations - can we make them more friendly, or can we do them a better way? by Jesse Pretorius
Jesse led us next and spoke to the hard and complex problem of database migrations across OpenStack projects. This session was more of a brainstorming exercise in discovering creative solutions to complex problems. Unfortunately, the group felt the problem came down to the governance of OpenStack itself and more uniform enforcement of migration details. It was concluded that getting all the projects to agree on a standard for migrations would be quite the uphill climb. Notes on the subject can be found here [22]. Thanks Jesse!

[11] https://drive.google.com/file/d/1rAohZ01BDFGBOjI3kS9jy-n-P1weQVIX/view
[12] https://etherpad.opendev.org/p/tripleo-ptg-victoria-ci-updates
[13] https://review.opendev.org/#/q/topic:ipv6-support+(status:open+OR+status:mer...
[14] https://openstack-virtual-baremetal.readthedocs.io/en/latest/deploy/quintupl...
[15] https://etherpad.opendev.org/p/tripleo-ipv6-and-routed-networks-in-upstream-...
[16] https://etherpad.opendev.org/p/tripleo-enable-net-iso-by-default
[17] https://etherpad.opendev.org/p/tripleo-validations-future
[18] https://docs.google.com/presentation/d/1l2RzL-hJ-fT9jzi2A7s8ladEumOvy8tDy7DP...
[19] https://etherpad.opendev.org/p/tripleo-container-image-tooling-victoria
[20] https://etherpad.opendev.org/p/tripleo-ceph
[21] https://etherpad.opendev.org/p/tripleo-heat-swift-removal-undercloud
[22] https://etherpad.opendev.org/p/tripleo-ptg-victoria-better-db-sync

Wednesday June 3rd

Speeding up deployments, updates and upgrades by Jesse Pretorius
Jesse had quite a few suggestions and proposals on how we may speed up updates and upgrades.
Some of the highlights were building on top of Alex’s Ansible strategy improvements, avoiding skipped tasks, avoiding unnecessary reboots, etc. There was a lot discussed, so please refer to the etherpad for details [23]. Thank you Jesse!

Running validations from within a container by Cedric Jeanneret
Cedric continued to walk us through the very near future of validations. Cedric wanted to discuss the delivery of validations and the implications of using a container to host the validations. Older non-containerized versions of TripleO were discussed, as were delivery via an Ansible collection and via a container. There are multiple use cases for validations, which led the group not to consolidate on a single delivery mechanism. More discussion was needed at the end of this topic. Notes are here [24]. Thanks Cedric!

Auto --limit scale-up of Compute nodes by Luke Short
Luke walked us through his auto scale up spec [25] in the following presentation [26]. Essentially this customizes the scale up process to better match the Ansible configuration for forked processes, which makes the scale up quicker. Luke was able to perform a 10 node scale up in 20 minutes. Nice presentation Luke!

TripleO usability enhancements: by Wes Hayutin
Initially there were not a lot of suggestions on this topic prior to the PTG; however, once we got started, the usability improvements started to roll in. Definitely check the etherpad [27], but I’ll list some here.
- Fix “FAILED - RETRYING in stack status” - fixed already in https://review.opendev.org/#/c/725665/
- Network-Isolation user experience - linked back to Harald’s topic
- Chem raised a number of update / upgrade improvements on line 25 [27]
- Eliminate the need for customers to remove deprecated services from roles_data in upgrades
- Improved logging
- Block commands that require a previous action
- Prompt user prior to dangerous actions
- Keep simplifying, e.g. Rabi’s proposals.
Check the etherpad for more details [27].

Improvements in TLS Everywhere / CI by Ronelle Landy, Ade Lee
At the last PTG in Shanghai, Ade and I proposed a job that would set up IPA and a standalone deployment of TripleO, configure them to work together upstream, and check and gate changes to TLS. I’m happy to report that Ade and Ronelle got the job done!! The presentation is available here [29]. This should go a very long way to help prevent TLS related bugs across upstream and internal testing. Thanks Ade, Thanks Ronelle!

[23] https://etherpad.opendev.org/p/tripleo-ptg-victoria-speedups
[24] https://etherpad.opendev.org/p/tripleo-validation-container
[25] https://review.opendev.org/#/c/727768/1/specs/victoria/auto_scale_up_compute...
[26] https://drive.google.com/file/d/1c2D67QZ5UGBYRONogZJKMief5wJxQAIt/view
[27] https://etherpad.opendev.org/p/tripleo-usability-enhancements
[28] https://etherpad.opendev.org/p/tripleo-tls-everywhere-ci
[29] https://docs.google.com/presentation/d/1gruDzHIjZtPtUUYSRGIrJynTOt9bZP3W7TLf...

Thursday June 4th

Config-download 2.0 by Luke Short
Luke walked us through a proposal to build on config-download and what steps could be taken to further simplify the deployment using Ansible: idempotent tasks, static playbooks, etc. The source of truth with regard to customer environments was a tough nut to crack here, and it was difficult to see a very clear and backwards compatible method to approach this. Notes are available here [30].

Ansible logging within tripleoclient by Cedric Jeanneret
Cedric proposed improvements to TripleO logging here [31] and walked the group through the spec. Logging is certainly an area where we all want to see improvement and where we all have opinions, so this was a lively conversation. Cedric pointed out there are two main types of steps we need to go after: any Ansible task and any TripleO CLI command called. Both need to be logged together in a human-readable way.
Very good points were made, and the conversation will continue in the TripleO spec [31]. Thanks Cedric!

Transitioning the underlying CentOS/RHEL - how can we improve this process? by Jesse Pretorius
Jesse spoke to the challenges facing upgrades with regard to the mix of host RHEL versions in a deployment. Several finer points were made about the nuances of the upgrade with regard to HA, pacemaker, libvirt and ironic versions. It was a good troubleshooting session and a thoughtful dialogue with the group. Read the details here [33]. Thanks Jesse!

VxFlexOS integration within TripleO by Jean Pierre Roquesalane/Rajini Karthik
The TripleO team answered questions and walked the Dell team through the best practices for integrating a 3rd party service with TripleO. Details are here [34].

TripleO CI - audit coverage for neutron, ovn, ovs, octavia by Brent Eagles, Slawek Kaplonski, Wes Hayutin
This session was about reemphasizing the importance of the network scenarios and workflows that are critical to TripleO’s success in the field. Over the past year or so Brent and others have done a great job of adding upstream coverage, but it’s now time to make sure this is everyone’s job as well. We discussed the challenges with upstream jobs, the neutron project and TripleO. We also spec’d out some ideas on where to build on top of Brent’s and Slawek’s successes to reach greater upstream coverage of critical network features and workflows. Notes are available [35]. Really appreciate Brent and Slawek making time to attend, thank you!!

tripleo-validations package future by Cedric Jeanneret
Last but not least, Cedric led a discussion on packaging for validations and how we can align responsibility for validations with rpms, git repos and CI. We spoke to how the packaging is related to the component pipeline. The aim is to design this carefully so that the validations team and other projects can work together and independently with clear lines. Good stuff here, details [36]. Thanks Cedric!

openstack tripleo deploy (standalone) for multinode by James Slagle
Please read through the blueprint and specs proposed by James with regard to utilizing the standalone deployment for multinode overcloud deployments [37]. Thanks James!

[30] https://etherpad.opendev.org/p/tripleo-ptg-victoria-config-download-two
[31] https://review.opendev.org/#/c/733652/1/specs/victoria/ansible-logging-tripl...
[32] https://etherpad.opendev.org/p/tripleo-ptg-victoria-ansible-logging
[33] https://etherpad.opendev.org/p/tripleo-ptg-victoria-distro-transition
[34] https://etherpad.opendev.org/p/tripleo-ptg-victororia-VxFlexOS
[35] https://etherpad.opendev.org/p/tripleo-network-coverage-audit
[36] https://etherpad.opendev.org/p/tripleo-validations-future
[37] https://etherpad.opendev.org/p/tripleo-ptg-tripleo-deploy

Did you make it? A sincere special thank you (I know I never sound sincere) to both Emilien and Alex for helping me with my PTL responsibilities throughout the cycle!!

See you next PTG :)
Wesley Hayutin