[openstack-dev] [kolla] PTG Summary
paul.bourke at oracle.com
Thu Mar 8 14:30:53 UTC 2018
Here's my summary of the various topics we discussed during the PTG.
There were one or two I had to step out for but hopefully this serves as
an overall recap. Please refer to the main etherpad for more details
and links to the session specific pads.
build.py script refactor
* I think there was little debate that we need this. However, discussion
moved fairly quickly towards whether there are changes we can make to our
images that would not require maintaining such a large build script in the
first place.
* loci images are making good progress and are already in use by
* By moving the start scripts from the kolla images into
kolla-ansible we can decouple ourselves from these images and open the
possibility of consuming images from other sources such as loci.
* Do a poc of externalising start scripts (started under
plugin split from main images
* Plugins continue to be a contentious issue in Kolla
* The current approach of installing all available plugins 'out of the
box' is not working for certain users.
* Sam Betts had a good example of why this is not working for them, but I
don't feel I can summarise it properly. I will reach out to him to clarify.
* We didn't reach a conclusion on this; there seem to be pros and cons
to each approach. Needs further discussion and possibly some pocs.
ansible "--check" and "--diff" mode
* Operators would like to see some dry run like features in kolla-ansible.
* Would like to see the return of something like genconfig, where
configs can be generated ahead of time and diffed/reviewed before deploy.
* Also some general discussion in this session on management and scaling
difficulties with kolla.
* Inventory management needs to be more flexible.
* Operations are too slow once you hit about 200 nodes; operators are
finding they have to use manual trickery to divide up their inventories.
* A lot of operations take place when very little has changed config wise.
* No specific actions came out of this at this time. I think we'd need
more time on this topic to determine specific work items that can make
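
As an illustration of the "generate ahead of time, then diff and review" idea
from this session, here is a minimal Python sketch. It assumes two directory
trees of already generated config files (the directory names and the
diff_config_trees helper are hypothetical, not part of kolla-ansible) and
reports a unified diff for every file that differs between them.

```python
import difflib
from pathlib import Path


def _read_lines(path):
    # A file missing from one tree is treated as empty, so it shows up
    # in the diff as wholly added or wholly removed.
    return path.read_text().splitlines(keepends=True) if path.is_file() else []


def diff_config_trees(current_dir, candidate_dir):
    """Return {relative_path: unified_diff_text} for files that differ.

    current_dir and candidate_dir are hypothetical locations holding the
    deployed configs and the newly generated ones.
    """
    current_dir, candidate_dir = Path(current_dir), Path(candidate_dir)

    # Collect the relative paths of all regular files in either tree.
    rel_paths = set()
    for root in (current_dir, candidate_dir):
        rel_paths.update(
            p.relative_to(root) for p in root.rglob("*") if p.is_file()
        )

    diffs = {}
    for rel in sorted(rel_paths):
        old = _read_lines(current_dir / rel)
        new = _read_lines(candidate_dir / rel)
        if old != new:
            diffs[rel.as_posix()] = "".join(
                difflib.unified_diff(
                    old,
                    new,
                    fromfile=f"current/{rel.as_posix()}",
                    tofile=f"candidate/{rel.as_posix()}",
                )
            )
    return diffs
```

An empty result would mean nothing changed config wise, which is also the
case where an operator might want to skip work on those nodes entirely.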
Database backup & recovery
* Interesting topic, all in agreement kolla should provide some
functionality in this area.
* Discussion around which areas of responsibility fall on kolla vs. the
operator. E.g. 'kolla should allow for regular database backups, how
those are restored is beyond project scope'
* yankcrime has done some ground work on this as well as a poc.
* Good documentation is important here.
* Review yankcrime's poc and provide feedback
* Form a spec detailing what mechanism we want to use to trigger
* All seem in agreement that the issues and work seen in migrating to
ceph-ansible currently outweigh the benefits.
* Decided to stick with improving kolla ceph for now, with bluestore
support being a priority.
* Write a blueprint to add support for bluestore
* Update docs to better inform operators on why they may or may not want
to use kolla ceph vs the alternatives.
Prometheus support for monitoring
* There have been some previous attempts to add a monitoring stack in
Kolla, though none have come to fruition.
* Oracle are looking at Prometheus and what it will take to integrate
it into Kolla to fill this gap.
* Write spec to detail how this will work.
* Do the work.
self health check support
* This had some crossover with the monitoring discussion.
* Kolla has some checks in the form of our 'sanity checks', but these
are underutilised and not implemented for every service. Tempest or
rally would be a better fit here.
* Remove the sanity check code from kolla-ansible - it's not fit for
purpose and our assumption is that no one is using it.
* Make contact with the self healing SIG, and see if we can help here.
They may have recommendations for us.
* Make a spec for this.
destroy service & node
* Several aspects to this:
* We would like to be able to remove an individual service as part of
* It is not clear what best practice is to remove a control node in Kolla
* Likewise for compute
* This could be automated but documentation would go a long way here also.
* Clearly document how to remove a control/compute node from a kolla
integrate with docker-compose
* This is something Jeffrey is working on so we didn't have much to
contribute in the way of discussion.
* Review and provide feedback on https://review.openstack.org/538581
Implement rolling upgrade for all core projects
* Started by defining the 'terms of engagement', i.e. what do we mean by
rolling upgrade in kolla, what we currently have vs. what projects
* There are two efforts under way here: 1) supporting online upgrade for
all core projects that support it, 2) supporting FFU (offline) upgrade in
* lujinluo is working on a way to do online FFU in Kolla.
* Testing - we need gates to test upgrade.
* Finish implementation of rolling upgrade for all projects that support
it in Rocky
* Improve documentation around this and upgrades in general for Kolla
* Spec in Rocky for FFU and associated efforts
* Begin looking at what would be required for upgrade gates in Kolla
* mgoddard gave us an overview of the project, what it is and potential
cross over / collaboration areas with kolla.
* In short, Kayobe adds the pieces to kolla-ansible required to build an
end-to-end OpenStack deployment tool, along the lines of TripleO
* There's lots of good info on this on
* None at this time.
HAProxy config customisation (customise non-OpenStack service conf)
* Discussion continues on the best way to handle non ini style config
customisation in kolla.
* As with the plugins, we have lots of ideas, but each comes with pros
and cons, so it's not yet clear which is the right approach.
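
To illustrate why ini style configs are the easy case, here is a minimal
Python sketch of a section-by-section merge using the standard library's
configparser. The merge_ini helper is hypothetical and is not kolla-ansible's
actual implementation; it only shows that ini files reduce to a
section/key/value mapping where an operator override can simply win.

```python
import configparser
import io


def merge_ini(base_text, override_text):
    """Merge two ini documents; keys from override_text win per section."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.optionxform = str  # keep option names case-sensitive
    parser.read_string(base_text)
    # A later read updates existing sections and overrides matching keys.
    parser.read_string(override_text)
    out = io.StringIO()
    parser.write(out)
    return out.getvalue()


base = "[DEFAULT]\ndebug = False\n\n[database]\nmax_retries = 10\n"
override = "[DEFAULT]\ndebug = True\n\n[database]\nmax_pool_size = 5\n"
print(merge_ini(base, override))
```

A file such as haproxy.cfg does not fit this model: directives can repeat
within a section and their order matters, so there is no unique key to
override. That is one reason a different mechanism is needed for such
services.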