Fellow Octavians,
We covered a lot of ground during this PTG, met a number of new folks, and got a lot of valuable feedback. I'll do my best to summarize here what was discussed.
- Metrics
- It would be nice to expose metrics for pools/members, though we would like to get a better understanding of the requirements / use-cases.
- We should publish metrics through a pluggable driver mechanism.
- The default would be "database" and would handle the existing API-exposed metrics.
- Additional drivers would be loaded in parallel, and might include Monasca/Ceilometer/Prometheus drivers.
- We will switch our metrics internally to use a delta system instead of absolute values from HAProxy. This will allow us to publish in a more sane way in the future. This would not change the way metrics are exposed in the existing API.
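To illustrate the delta idea, here is a minimal sketch (hypothetical names, not an actual Octavia implementation) of converting absolute HAProxy counters into deltas, including handling the counter reset that happens when HAProxy restarts:

```python
def compute_delta(previous, current):
    """Return per-counter deltas between two absolute samples.

    If a counter went backwards (HAProxy restarted and reset its
    stats), treat the current absolute value as the whole delta for
    that interval rather than emitting a negative number.
    """
    deltas = {}
    for key, value in current.items():
        prior = previous.get(key, 0)
        deltas[key] = value - prior if value >= prior else value
    return deltas

# Example: bytes_out reset between samples (process restart).
prev = {"bytes_in": 1000, "bytes_out": 5000}
curr = {"bytes_in": 1500, "bytes_out": 200}
print(compute_delta(prev, curr))  # {'bytes_in': 500, 'bytes_out': 200}
```

The existing API keeps exposing absolute totals; only the internal collection and any future publishing drivers would consume deltas.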
- Notifications
- We will need to create a spec and gather community feedback.
- Initial observation indicates the need for two general paths, which will most likely have their own driver systems:
- provisioning_status changes (including all create/update/delete events)
- operating_status changes (member up/down, etc.)
- We would provide the entire object in the notification, similar to what other services do.
- Most likely the default driver(s) would use oslo.messaging's notifier.
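As a rough sketch of the two paths described above (the driver interface and event names here are assumptions, not the spec that still needs to be written):

```python
# Two independent driver lists, one per notification path. A real
# default driver would likely wrap oslo.messaging's Notifier; here a
# driver is just any callable taking a payload dict.
PROVISIONING_DRIVERS = []
OPERATING_DRIVERS = []

def notify_provisioning_change(lb_object, event_type):
    # The entire object goes into the payload, as other services do.
    payload = {"event_type": event_type, "object": lb_object}
    for driver in PROVISIONING_DRIVERS:
        driver(payload)

def notify_operating_change(member_object, new_status):
    payload = {"operating_status": new_status, "object": member_object}
    for driver in OPERATING_DRIVERS:
        driver(payload)

# Example driver that just collects events in memory.
events = []
PROVISIONING_DRIVERS.append(events.append)
notify_provisioning_change({"id": "lb-1", "name": "web"},
                           "loadbalancer.create.end")
print(events[0]["event_type"])  # loadbalancer.create.end
```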
- Availability Zone Support (Multi-Zone Fault Tolerance)
- Make at least a story for tracking this, if it doesn't already exist.
- Allow a single LB to have amphorae in multiple zones.
- Existing patch: https://review.opendev.org/#/c/558962/
- Availability Zone Support (Compute AZ Awareness)
- Make at least a story for tracking this, if it doesn't already exist.
- Allow placing LBs in specific zones:
- When zones are geographically separated, LBs should exist in the same zone as the members they support.
- When zones are logically separated (PCI compliance zones, etc.), users may need to place LBs explicitly.
- A new parameter `availability_zone` will be added to the LB create API. It will allow the user to select which Octavia AZ to use.
- A new API section will be added for creating/configuring/listing Octavia AZs. This will allow a linkage between Compute AZs and Amphora Management Network, along with other possible options in the future. Admins can create/update, and users can list zones.
- Update clients to support this, including further polluting the `openstack availability zone list` command to include `--loadbalancers` zones.
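To make the linkage concrete, here is a hypothetical sketch of the Octavia AZ record described above; the field names and lookup are illustrative only, not the final API:

```python
from dataclasses import dataclass, field

@dataclass
class OctaviaAvailabilityZone:
    name: str                   # the Octavia AZ users select at LB create
    compute_zone: str           # Nova AZ the amphorae are scheduled into
    management_network_id: str  # lb-mgmt-net reachable from that zone
    enabled: bool = True
    extra: dict = field(default_factory=dict)  # room for future options

# Admin-created zones (users can only list them).
zones = {
    "az-east": OctaviaAvailabilityZone(
        name="az-east",
        compute_zone="nova-east",
        management_network_id="net-east-mgmt",  # hypothetical ID
    ),
}

def resolve_zone(requested_az):
    """Map an LB create request's AZ to compute AZ + mgmt network."""
    zone = zones[requested_az]
    return zone.compute_zone, zone.management_network_id

print(resolve_zone("az-east"))  # ('nova-east', 'net-east-mgmt')
```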
- Python 2 EOL
- Remove all jobs that test Python 2 (or update them if they're not duplicates).
- Remove six compatibility code, which should simplify string handling significantly.
- More Flavor Capabilities
- Image Tag (to allow different amp images per flavor)
- Availability Zone (to allow compute AZ pinning)
- Management Network (to go with compute AZ)
- Metadata (allow passing arbitrary metadata through to compute)
- TLS Protocol/Cipher API Support
- Allow users to select specific protocols/ciphers as a whitelist.
- Stories:
- Ciphers: https://storyboard.openstack.org/#!/story/2006627
- Protocols: https://storyboard.openstack.org/#!/story/2006733
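A minimal sketch of the whitelist idea, assuming an operator-defined allowed set and an OpenSSL-style colon-separated cipher string (names here are illustrative, not the API under discussion in the stories):

```python
# Operator-configured whitelist of permitted ciphers (illustrative set).
ALLOWED_CIPHERS = {
    "ECDHE-ECDSA-AES128-GCM-SHA256",
    "ECDHE-RSA-AES128-GCM-SHA256",
    "ECDHE-ECDSA-AES256-GCM-SHA384",
}

def validate_cipher_string(cipher_string):
    """Reject any requested cipher not on the whitelist."""
    requested = [c for c in cipher_string.split(":") if c]
    rejected = [c for c in requested if c not in ALLOWED_CIPHERS]
    if rejected:
        raise ValueError("Ciphers not permitted: %s" % ", ".join(rejected))
    return requested

print(validate_cipher_string(
    "ECDHE-RSA-AES128-GCM-SHA256:ECDHE-ECDSA-AES128-GCM-SHA256"))
```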
- Performance Tuning
- HAProxy: There are a number of knobs and dials that can be adjusted to make HAProxy behave more efficiently. The areas most worth investigating are TLS options and multiprocessing/threading; the latter will probably need to wait for our switch to HAProxy 2.0.
- Image Metadata: There are flags that could be added to our amphora image's metadata that might improve performance. To be further researched.
- Testing
- Team to evaluate existing non-voting jobs for promotion to voting.
- Agreement was made with the Barbican team to promote both sides' co-gating jobs to voting.
- Team to evaluate merging or pruning some jobs to reduce the overall set that run on each change.
- Grenade needs a few changes:
- Switch to python3.
- Upgrade to Zuul v3.
- Test additional operations on existing LBs (old amp image), not just traffic.
- Test more than just the most recent amphora image against the current control-plane code. Use periodic jobs for this.
- Fix the Zuul grafana dashboard for Octavia test history.
- Jobboard
- This continues to be a priority for Ussuri.
- Put together a priority list of patches specifically for jobboard.
- HAProxy 2.0
- A number of features are gated behind the new version, including multi-process and HTTP/2 support.
- Need to reach out to distributions to push for backports (to cloudarchive for Ubuntu, and the equivalent repository for CentOS).
- Possibly add an element to allow building new versions from source.
- Perform version-based validation of options on the API-side of the amphora driver.
- Inspect and cache Glance metadata for the LB's amphora image to get version data.
- Provide the metadata string for the operator from our disk-image-create script.
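The version-based validation could look roughly like this sketch; the option names and metadata key are assumptions for illustration, not settled design:

```python
# Feature -> minimum HAProxy version that supports it.
FEATURE_MIN_VERSION = {
    "http2": (2, 0),
    "multi_process": (2, 0),
}

def parse_version(version_string):
    """Turn '2.0.13' (e.g. from cached image metadata) into a tuple."""
    return tuple(int(part) for part in version_string.split(".")[:2])

def validate_options(requested_features, image_metadata):
    """Reject options the amphora image's HAProxy cannot support."""
    haproxy_version = parse_version(
        image_metadata.get("haproxy_version", "1.8"))
    unsupported = [f for f in requested_features
                   if FEATURE_MIN_VERSION.get(f, (0, 0)) > haproxy_version]
    if unsupported:
        raise ValueError(
            "Requires newer HAProxy: %s" % ", ".join(unsupported))

validate_options(["http2"], {"haproxy_version": "2.0.13"})  # accepted
```

The key point is that the check happens on the API side of the amphora driver, before any amphora is booted, using version data cached from Glance.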
The full etherpad from the PTG, including the notes I've summarized here, is available at https://etherpad.openstack.org/p/octavia-shanghai-U-ptg if further review is desired.
Thanks to everyone who participated, and best of luck on this (hopefully) productive new cycle!
--Adam Harwell