Hello everyone!

Thank you for the great participation at the PTG last week. We've had great discussions and a good turnout. The recordings for the sessions are available on YouTube [0]. If you would like to check on the notes, please take a look at the PTG etherpad [1].

*2024.2 Dalmatian Retrospective*
==========================

- New core reviewers in the manila group were impactful in reviews, we should continue actively working on maintaining/growing the core reviewer team.
- We had the mid-cycle and managed to combine it with our well-known collaborative review sessions, around feature proposal freeze. This had a good impact on raising awareness on the changes being proposed, as well as prioritizing the reviews.
- Great contributions ranging from new third party drivers to successful internships on manila-ui, bandit and the ongoing OpenAPI internships.

*Action items:*

- Carlos (carloss) will work with the manila team to help people gain context on the bug czar role and work with the team to rotate it.
- Vida Haririan (vhari) will jot down the details of the Bug Czar role
- Follow the discussions on teams joining the VMT and get Manila included too.
- Spread the word on the removal of the manila client and switch to OpenStackClient

*Share backup enhancements*
=========================

- Out of place restore isn't supported currently. We have agreed that this is a good use case and that a design specification should be proposed to document this.
- DataManager / BackupDriver - forcing the backup process to go through the DataManager service is supported through a config option, but Manila is currently not honoring it. We agreed that this is an issue in the code, and we will review the proposed change [2] to make the data manager honor this config.
- DataManager to allow for a backup driver to provide reports on API call progress: Currently, the data manager fetches the progress of a backup using a generic get progress call, but it is failing with the generic backup driver. We suggested that this should be fixed in the base driver.
- Context for Backup API calls: currently, only objects representing a Share and Backup are passed to the backup driver. The request context should also be forwarded in these calls. The backup driver interface can be changed for this, but we should be mindful of out of tree drivers that could break.

*Action items:*

- Zach Goggins (zachgoggins) will look into:
  - Proposing a spec for the share backup out of place restore.
  - Updating the backup driver interface and adding context to the methods that need it.
  - Updating the backup driver interface and adding the abstract methods/capabilities that will help with the `get_restore_progress` and `get_backup_progress` methods.
- The manila team will provide feedback on [2]

*All things CephFS*
===============

*Updates from previous cycles*
------------------------------------------

*State of the Standalone NFS Ganesha protocol helper:*

- We added a deprecation warning at the end of the previous SLURP release, and we are planning to complete the removal during the 2025.1/Epoxy release. There were no objections to this so far at the PTG. When this is removed, CephFS-via-NFS will only work with cephadm deployed ceph-nfs clusters.

*Testing and stabilization:*

- devstack-plugin-ceph has been refactored to deploy a standalone NFS-Ganesha service with a ceph orch deployed cluster. We also dropped support for package-based and container-based installation of ceph. cephadm is used to deploy/orchestrate ceph.
- Bumped Ceph version to Reef in Antelope, Bobcat, Caracal, Dalmatian, as well as started testing with Squid.
- There are some failures on stable branches jobs which are being triaged and fixed.

*Manage/unmanage:*

- Implementation completed in Dalmatian and the documentation has been updated. We are currently working to enable the tests on CI permanently, as well as doing some small refactors to the CI jobs.

*Ensure shares:*

- Merged in Dalmatian but testing is still challenging, as running the tests means that the service would temporarily have a different status and shares within the backend would have their status changed, which is harmful for test concurrency.

*Preferred export locations and export location metadata:*

- The core feature merged, but we are still working to get the newly implemented tests passing and merged.

*Plans for 2025.1/Epoxy*
--------------------------------

- NFSv3 + testing: we are looking into enabling NFSv3 support as soon as the patch is merged in Ceph. We agreed that we should enable the tests within manila-tempest-plugin and make any necessary changes to the test structure, so we can ensure that we are testing some scenarios with both NFSv3 and NFSv4.
- We will start to investigate support for SMB/CIFS shares and look at the necessary changes for setting up devstack and testing.

*Action items:*

- Carlos (carloss) will write an email to the openstack-discuss mailing list announcing the removal of the deprecated ganesha helper.
- Carlos will pursue the manage/unmanage testing patches to have tests enabled in the CephFS jobs during Epoxy.
- Carlos will look into approaches to test the ensure shares APIs.
- Ashley (ashrod98) will continue working on the export location metadata tempest changes and drive them to completion.
- The manila team will look into updating manila-tempest-plugin tests and enabling NFSv3 tests in the Ceph NFS jobs.
- Goutham (gouthamr) will be submitting a prototype of the SMB/CIFS integration.

*Tech Debt*
========

*Eventlet removal*
----------------------

Our main concerns:

- Performance should not be degraded with the default configuration when we switch.
- Synchronous calls should not take a big hit and become asynchronous.
- Impact to the SSH Pool (used by many drivers) should be minimal.

*Action items for 2025.1 Epoxy:*

- Tackle the low-hanging-fruit changes.
- Participate in the pop-up team discussions.
- Remove the affected console scripts in Manila.
- Work on performance tests to understand the impact on the SSH pool that is used by some drivers.
- Look into enhancing our rally/browbeat test coverage.

*CI and testing images*
-------------------------------

We started working on the migration of the CI to Ubuntu 24.04 in all of the manila repositories (manila-image-elements, python-manilaclient, manila-ui, manila, manila-specs). Currently, the Ceph job is broken [3].

*Action items:*

- We should clean up our CI job variants, as they have a lot of workarounds and we can start moving away from them.

*Stable branches*
----------------------

We currently have 5 "unmaintained" branches, so we should be looking at sunsetting them.

*Action items:*

- Carlos (carloss) will start the conversation for the transition of some of these branches in the openstack-discuss mailing list.

*Allowing force delete of a share network subnet*
=======================================

We can currently add subnets (which translates to adding new network interfaces) to a share server, but we can't remove them. This proposal adds the ability to remove a subnet and thereby detach a network interface from a share server. We agreed that:

- This is a good use case and something that can be enhanced.
- The enhancement should add a force-delete API.
- We should not allow the last subnet to be deleted, otherwise the shares won't have an export path.
- A bug should be filed for a tangential issue: the NetApp driver is using "neutron_net_id" (and possibly "neutron_subnet_id") to name resources on the backend: ipspaces, broadcast domains, and possibly concurrency control / locks.
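
To make the "last subnet" rule above concrete, here is a minimal sketch of the kind of guard a force-delete could apply. Everything in it is hypothetical: the function name, the exception, and the plain list of subnet IDs are illustrative assumptions, not manila code or the API the spec will define.

```python
class LastSubnetError(Exception):
    """Hypothetical error: the delete would remove the server's last subnet."""


def remaining_subnets_after_delete(server_subnet_ids, subnet_id):
    """Return the subnet IDs left on a share server after a force-delete.

    Hypothetical helper for illustration only; it is not part of manila.
    """
    if subnet_id not in server_subnet_ids:
        raise KeyError(subnet_id)
    remaining = [s for s in server_subnet_ids if s != subnet_id]
    if not remaining:
        # With no subnet left, the shares would have no export path.
        raise LastSubnetError(subnet_id)
    return remaining
```

With two subnets attached, removing one returns the other; with only one attached, the call raises instead of detaching the final network interface.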

*Action items:*

- sylvanld will look into proposing a spec to document this behavior.

*NetApp Driver: Enforcing lifs limit per HA pair*
=====================================

- The NetApp ONTAP storage has a limit of network interfaces per node in an HA pair. If the sum of allocated network interfaces in the two nodes of the HA pair exceeds the limit of a single node, the failover operation is compromised and will fail.
- NetApp maintainers would like to fix this issue, and we agreed that:
  - The fix should be as dynamic as possible, not relying on user/admin input or configuration.
  - The ONTAP driver must look up all of the interfaces already created and deny the request if it would compromise the failover.
  - The NetApp ONTAP driver should keep an updated capability with the maximum number of supported network interfaces, and possibly the number of network interfaces allocated at the moment.

*NetApp Driver: Implement Certificate based authentication*
================================================

- The NetApp ONTAP driver currently handles only user/password authentication, but in an environment where the password must change quarterly, this means updating the local.conf at least every three months. This enhancement proposes adding certificate based authentication as an option.
- We agreed that this is going to be important for operators and will allow them to add their certificates with a longer expiration date, avoiding the disruptions caused by needing to update the user/password.

*Manage Share affinity relationships by annotation/label*
=============================================

Currently, the manila scheduler uses affinity/anti-affinity hints that are based on share IDs. The idea now is to base the affinity hints on an affinity policy, as is possible with Nova.
We considered the proposed approaches, and agreed that:

- If we are adding new policies, they should end up becoming a new resource/entity within the manila database
- If there is a way to reuse the share groups mechanism, we should prioritize it

*Action items:*

- Chuan (chuanm) will propose a design spec to document this new behavior.

*Share encryption*
==============

This feature is currently waiting for more reviews and testing on gerrit. In the Dalmatian release mid-cycle we talked about the importance of testing this feature against a first party driver, to ensure that the APIs and integration with Barbican and Castellan work. We agreed that:

- We should do some research on how to do this testing with the generic driver (which uses Cinder and Nova)
- The testing will focus on the APIs and behavior of this feature, not the encryption of the shares.

*Action items:*

- gouthamr will help with some research on how to test this with the generic driver
- The manila team will discuss this again in the upcoming manila weekly meetings.

[0] https://www.youtube.com/watch?v=8UxrjEr6yik&list=PLnpzT0InFrqDHGfSDPhiGtSeXd36mrI3T
[1] https://etherpad.opendev.org/p/epoxy-ptg-manila
[2] https://review.opendev.org/c/openstack/manila/+/907983
[3] https://www.spinics.net/lists/ceph-users/msg83201.html