Hello, Zorillas and interested stackers.
Thank you for the productive PTG we had last week. Here is a summary of the discussions. If you would like to see the expanded version, please refer to [0] or check out the recordings on the Manila YouTube channel [11].
Different approaches for the images we use in our jobs
- Recent changes in the Manila CI (related to the service image) made the jobs slower and more resource intensive, so they fail more often due to a lack of resources in the dsvm
- We have been looking for approaches to tackle this issue
- In the short term, if a job keeps failing due to such issues, we will split it into two jobs
- One to run scenario tests (these tests spawn VMs)
- We agreed that this can get worse in the future with dsvm images that could be even more resource demanding, and that we will look for containerized approaches to try and solve this issue
Share backup
- (kpdev) The proposed approach for share backup has changed
- An existing specification [1] was restored
- The idea is to have new APIs for share backup and a new generic Backup driver in a similar way to what Cinder does.
- This would allow backends to have their specific implementations for share backups
- Reviews for this specification are ongoing
Secure RBAC changes
- Goutham shared with the community the good progress we made in Manila, and also mentioned what we did in Zed in response to the operators' feedback.
- Now, we are entering a phase where we want to test a lot!
- We have CI jobs and some test cases covered, but we still want more coverage
- During Antelope, we will promote a hackathon to accelerate functional test coverage
Outreachy Internships
- On this topic, ashrodri, fmount, gouthamr and carloss shared the two proposed Outreachy projects that are Manila related
- ashrodri and fmount are proposing a project to have an intern creating a multi-node test environment with Devstack and Ceph [2], as these tests are becoming more resource demanding
- carloss and gouthamr are proposing a project for an intern to work on Manila UI [3]. The idea is to close the feature gap between Manila UI and the Manila API.
OpenStackSDK Manila Status
- This has been an effort for some time, and during the Zed cycle some code was merged for integrating share groups and share group snapshots.
- This is a good candidate for an internship, and the idea is to get some interns working on it (potentially as a college capstone project)
OSC Updates
- Zed was the release we targeted to reach feature parity with the native client, and we made it!
- We had an idea to add a deprecation warning to the native manila client, but that will need to wait
- There is still a missing bit for adding the deprecation warning: having version autonegotiation working, which is something we are targeting for Antelope
- OSC is now the primary focus of the project when doing implementations for CLIs.
Bandit Testing / VMT
- The Manila team was introduced to the VMT (Vulnerability Management Team) and to Bandit, and contributors shared their ideas around having Manila under the VMT and running Bandit tests.
- The attendees agreed to the conditions for Manila to be among the repositories overseen by the VMT.
- Goutham has volunteered to be the security liaison for Manila
- Some errors were pointed out in a preliminary test run [4].
- We will add a bandit non-voting job and file bugs against third party drivers that have issues pointed out by the new job
- New bugs will be filed and will be distributed across community members
All Things CephFS
- Ingress aware NFS Ganesha
- Ceph Orchestrator is now capable of creating an ingress service to front-end a cluster of active-active NFS Ganesha instances via a single load-balanced IP address
- Previously, we would only use a single instance of NFS Ganesha. This introduces a single point of failure (SPOF) and is not suitable for production environments unless HA is handled externally (like TripleO deployments do). With this change, Manila CephFS NFS users would be able to deploy multiple NFS Ganesha servers and leverage inbuilt HA capabilities
- Currently, though, Manila client restrictions (access rules) will not work, since NFS-Ganesha sees the ingress service's IP address instead of the client's IP address.
- ffilz proposed his design to support the PROXY protocol with NFS-Ganesha, so that Ceph Ingress can pass the client's IP address to NFS-Ganesha [5]
- An alternative would be to have cephadm assign stable IP addresses to ganesha instances
- No driver changes are anticipated in either approach
- AI: Investigating deploying ceph-ingress with devstack
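To illustrate why the PROXY protocol helps with access rules, here is a minimal sketch in Python. It assumes the human-readable version 1 header of the PROXY protocol purely for readability; the summary does not say which protocol version the design in [5] uses, and this is not NFS-Ganesha or Manila code. The function names `parse_proxy_v1` and `client_allowed` are hypothetical.

```python
import ipaddress


def parse_proxy_v1(header: bytes):
    """Return (client_ip, client_port) from a PROXY protocol v1 line.

    Returns None for "PROXY UNKNOWN", where no client address is given.
    """
    if not header.endswith(b"\r\n"):
        raise ValueError("PROXY v1 header must end with CRLF")
    parts = header[:-2].decode("ascii").split(" ")
    if len(parts) < 2 or parts[0] != "PROXY":
        raise ValueError("not a PROXY v1 header")
    if parts[1] == "UNKNOWN":
        return None
    if parts[1] not in ("TCP4", "TCP6") or len(parts) != 6:
        raise ValueError("malformed PROXY v1 header")
    # Field order: protocol, source IP, destination IP, source port, dest port
    client_ip = ipaddress.ip_address(parts[2])
    return str(client_ip), int(parts[4])


def client_allowed(client_ip: str, allowed_cidr: str) -> bool:
    """Check an IP-based access rule against the real client address."""
    return ipaddress.ip_address(client_ip) in ipaddress.ip_network(allowed_cidr)


# Illustrative addresses: the ingress service would prepend this line
# to the connection it opens towards the NFS server.
header = b"PROXY TCP4 10.0.0.5 192.168.1.10 51234 2049\r\n"
client_ip, _port = parse_proxy_v1(header)
print(client_allowed(client_ip, "10.0.0.0/24"))
```

Without the header, the server would only see the ingress address as the peer, and an access rule for the client's network could not be matched.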
- Migration tool for Manila with CephFS NFS
- After the new helper was introduced to the driver, we are able to interact with a cephadm-deployed CephNFS service (which comes with a lot of benefits)
- Now, we need to figure out how to upgrade deployments with CephFS NFS using the current helper (which interacts with NFS Ganesha through DBUS)
- There are two main issues:
- Representing a change in exports when migrating across NFS Servers (current use case) or when decommissioning NFS Servers
- Representing exports from multiple servers - a special case enabling migration where an old server will eventually be decommissioned
- These issues were talked through, and possible solutions were proposed that will be worked on in the following cycles. More details in [6]
- DHSS=True for CephFS driver
- Currently, the CephFS driver only supports DHSS=False, whether you use it for Native CephFS, NFS via DBUS, or NFS via ceph-mgr. Several Manila users requested this feature during the OpenInfra Summit in Berlin.
- We discussed some alternatives for making it happen:
- 1) Operator determines Ceph cluster limits, and creates isolated file systems and declares multiple manila backends with each filesystem
- 2) Subvolume groups - pinning a subvolume group to an MDS can isolate/dedicate MDS load
- The lack of resources in the dsvms is affecting the CephFS Native and NFS jobs, causing them to be unstable. The jobs often run into timeouts
- The situation could get worse as Jammy Jellyfish packages are not available yet
- We will ask the ceph-users if jammy release bits can be made available
Oversubscription enhancements
- Storage pools supporting thin provisioning are open to oversubscription. This caused some problems mentioned in [7].
- We have an open specification which we intend to merge in the Antelope cycle, as well as the changes to address this issue
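For readers less familiar with oversubscription, here is a simplified arithmetic sketch in Python. It is not Manila's scheduler logic and does not represent the specification under review. The function `fits_thin_pool` and the numbers are hypothetical; only the idea of a maximum oversubscription ratio comes from how thin provisioning is usually configured.

```python
def fits_thin_pool(total_gb, provisioned_gb, requested_gb,
                   max_over_subscription_ratio):
    """Return True if a new share keeps the pool within its allowed ratio.

    Hypothetical helper: a thin pool may promise more capacity than it
    physically has, up to total_gb * max_over_subscription_ratio.
    """
    if max_over_subscription_ratio < 1.0:
        raise ValueError("ratio must be at least 1.0")
    allowed_gb = total_gb * max_over_subscription_ratio
    return provisioned_gb + requested_gb <= allowed_gb


# A 100 GB pool with a ratio of 2.0 may have up to 200 GB provisioned.
print(fits_thin_pool(100, 150, 40, 2.0))  # 190 GB <= 200 GB
print(fits_thin_pool(100, 150, 60, 2.0))  # 210 GB > 200 GB
```

Note that passing this check does not guarantee physical space: if all shares in the first case were filled, 190 GB of data would be competing for 100 GB of real capacity.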
FIPS
- We shared our testing status with regard to FIPS. We have jobs merged on stable branches up to wallaby for both the manila and python-manilaclient repositories.
- The next steps would be a more in-depth code audit to identify non-compliant libraries, and making our jobs voting
- We agreed to make our jobs voting when the Ubuntu images supporting FIPS are out
- For drivers using non-FIPS-compliant libraries, we will notify the maintainers
Metadata API update
- Metadata APIs for share snapshots were added during the Zed cycle
- The goal for this cycle is to have functional testing and the CLI merged. Both patches are under review and are expected to merge well before the end of the release cycle.
Manila CSI
- The CSI plugin has been pretty stable in the last six months [8]
- There was a good talk presented at KubeCon by Robert Vasek [9]
- The next steps include getting a fix for an issue involving long snapshot names in the CephFS backend and supporting volume expansion in the OpenShift Manila CSI driver operator
Manila Configuration of VLAN Network information
- An issue was found where the Contrail Neutron plugin did not return the VLAN ID during port allocation [10]
- To tackle this issue, we agreed to add metadata to the share network APIs, so administrators would be able to add the VLAN as metadata and drivers would be able to consume it.
Better use of bug statuses
- Our bugs were stuck in "New" instead of "Confirmed" or "Triaged", and this could be misleading.
- We agreed to mark the bugs as "Confirmed" or "Triaged" depending on the outcome of our triaging, and not leave bugs as "New"
- We will ask the bug assignees to update their open bugs if any of them are still in the "New" status, so we can have better visibility
Thanks,
carloss