[manila] Bobcat vPTG summary

Carlos Silva ces.eduardo98 at gmail.com
Tue Apr 4 23:35:38 UTC 2023

Hello, Zorillas and interested stackers!

Thank you for attending last week's PTG. We had productive discussions and
came up with good plans for this cycle!

Here is a summary of the topics we spoke about during the PTG. The etherpad
with all the notes is available in [0]. If you would like to see the
recordings for those discussions, they are available on the Manila Youtube
channel [1].

   - *Share Transfer Future development*

   - The share transfer feature was introduced in Antelope, but it has a couple
   of possible enhancements, such as transferring shares that have snapshots,
   replicated shares, and share networks along with their share servers and
   shares.

   - Snapshots shouldn't be very difficult to cover but we would need to
   keep in mind quota allocations while doing so.

   - Transferring shares with replicas is the bigger challenge, because we
   would need to modify the access rules.

   - The next step would be to work on transferring share networks.
   - AIs: Haixin plans to work on this feature during Bobcat.

   - *S-RBAC changes in Bobcat*

   - For the Bobcat cycle, we will focus on:

   - Ensure we are landing even more tests and enabling the oslo policy

   - Modify our testing to use the reader role for read-only APIs, as
   opposed to the member role.

   - Audit all cross service interactions and isolate actions performed by
   the "service" user.

   - Push for the completion of unit tests for Secure RBAC

   - AIs: Manila team will continue promoting events to get tests reviewed
   and implemented

   - *Cross project discussion with Nova for VirtIO FS*

   - The Manila and Nova teams discussed approaches to prevent shares from
   being deleted while they are attached to instances.

   - We agreed that an admin/service level API for locking/unlocking share
   deletion would be the best solution to avoid shares being deleted while
   mounted. This API would ideally be a 'locked by <consumer>' action
   instead of a simple boolean.

   - As part of this cross-project effort, the Manila team has also been
   focusing on getting more APIs available in OpenStack SDK. We have been
   making great progress with it and we plan to have all APIs Nova is using
   merged in the OpenStack SDK by the Bobcat cycle.

   - AIs:

   - Manila team to write a spec describing the behavior of the
   locking/unlocking API for shares.

   - Manila team to focus on having all APIs Nova is using in OpenStack SDK
   by the Bobcat cycle.
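
To illustrate why a 'locked by <consumer>' action was preferred over a simple
boolean, here is a minimal Python sketch. It is not the Manila API: the class
and method names (ShareLocks, lock, unlock, locked_by, can_delete) and the
consumer names are hypothetical, and the real behavior will be defined in the
spec. The point is only that per-consumer locks let one service release its
lock without unlocking the share for everyone else.

```python
from collections import defaultdict


class ShareLocks:
    """Hypothetical in-memory registry of deletion locks, keyed by consumer."""

    def __init__(self):
        # share_id -> set of consumers currently holding a lock
        self._locks = defaultdict(set)

    def lock(self, share_id, consumer):
        self._locks[share_id].add(consumer)

    def unlock(self, share_id, consumer):
        # Only removes this consumer's lock; other consumers are untouched.
        self._locks[share_id].discard(consumer)

    def locked_by(self, share_id):
        return sorted(self._locks.get(share_id, ()))

    def can_delete(self, share_id):
        # Deletion is allowed only when no consumer holds a lock.
        return not self._locks.get(share_id)


locks = ShareLocks()
locks.lock("share-1", "nova")
locks.lock("share-1", "backup-service")
locks.unlock("share-1", "nova")
print(locks.can_delete("share-1"), locks.locked_by("share-1"))
```

With a single boolean, the unlock issued on behalf of "nova" would have made
the share deletable even though a second consumer still depends on it.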

   - *Scalable NFS Ganesha for CephFS - state of the kitchen*

   - We shared a status update on our focus during the previous release and
   on some important bugs that have been fixed.

   - Updates on moving from a standalone Ganesha to a cephadm-deployed one:

   - There are three changes being worked on at the moment, after past

   - Ensure that the shares' access rules will be applied when moving from
   the old to the new server

   - Ensure that the share exists on both the old and new Ganesha servers
   and reapply all access rules on both

   - Expose export locations from both the old and new servers while both
   cephfs_nfs_cluster_id and ganesha_server_ips are present in the backend.

   - With these three changes we would cover all the scenarios considering
   release upgrades.

   - The changes are being tested and need feedback.

   - A job for automated testing is in progress. We are aiming for a
   multi-node CI job capable of deploying Ganesha using cephadm and running
   tests against it.

   - AI: carloss and gouthamr will pursue closure for the three changes
   related to the upgrades and ashrodri will continue to work on the
   multi-node testing job.

   - *SQLAlchemy 2.0 and other DB challenges/changes*

   - DB testing is failing sporadically in the CI, and we discussed
   possible changes to avoid rechecks caused by these failures.

   - We intend to work on missing indexes for queries, so we can gain
   performance in the database.

   - Implementing __repr__ for Manila models

   - Other people are already doing this and it could be a huge help for
   debugging, as one of the operators suggested in the session

   - We will investigate and apply this in Manila

   - AIs: gouthamr will open bugs for tracking the work on the joinedloads
   and a blueprint for the missing indexes.

   - SQLAlchemy 2.0 changes have already started and we have a list of
   changes yet to merge. stephenfin is doing great work on the changes.

   - AIs: manila team to review the changes proposed by stephenfin
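
As a rough idea of what the __repr__ item above could look like, here is a
framework-agnostic Python sketch. The names ReprMixin and _repr_fields are
hypothetical, and the Share class is a plain stand-in rather than the actual
Manila model; in practice the mixin would be added to the models' base class.
Restricting the output to a short list of plain attributes keeps log lines
compact and avoids touching relationships.

```python
class ReprMixin:
    """Hypothetical mixin building __repr__ from a short list of fields."""

    # Subclasses list the attributes worth showing in logs.
    _repr_fields = ("id",)

    def __repr__(self):
        parts = ", ".join(
            f"{name}={getattr(self, name, None)!r}"
            for name in self._repr_fields
        )
        return f"<{type(self).__name__}({parts})>"


class Share(ReprMixin):
    """Stand-in for a model class; a real model would also define columns."""

    _repr_fields = ("id", "status", "size")

    def __init__(self, id, status, size):
        self.id = id
        self.status = status
        self.size = size


print(repr(Share("abc", "available", 1)))
# <Share(id='abc', status='available', size=1)>
```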

   - *Create share from replicated snapshot / keep source share on replica*

   - Users of this feature are looking for a way to test the inactive
   replicas before actually promoting them; if everything is correct, they
   would then continue with the promotion.

   - One suggestion to get this working would be a replication pause API,
   which would allow the inactive instance to be mounted

   - We do think this feature could be relevant and some suggestions from
   the Manila team are:

   - Letting users know if they can pause/unpause replication even before
   they create their share replicas

   - Letting users learn about errors through the API (something similar
   to the check mechanism on security-service-update)

   - AIs: carthaca and team will work on a spec and will discuss it further
   with the Manila team.

   - *Tech debt:*

   - We brought up the current issues with our CI, mostly caused by tests
   being very resource intensive.

   - We discussed approaches for scenario testing and thought it would be
   worth trying to create containers to mount shares during scenario tests
   instead of spawning VMs.

   - Also, the audience mentioned that it is difficult to keep track of
   all of these issues and it would be better if we had them documented
   somewhere, and that we could bring these up more often.

   - The migration to Ubuntu 22 is not yet complete for some jobs due to
   failures in package installation and the usage of quagga.

   - AIs:

   - carloss will work on a wiki with tech debt items and will bring this
   up more often in the manila weekly meetings, so if people are available in
   that time frame to work on them, they will have more chances to raise their

   - carloss and gouthamr will work to fix the issues with jobs yet to
   migrate to Ubuntu 22

   - *Scheduler data placement based on vendor specific tool*

   - NetApp has a tool called Active IQ that uses machine learning to
   identify the best pool of disks for a request based on custom
   parameters (latency, IOPS and so on)

   - They are considering having this tool interact with Manila, so that
   Manila could also benefit from it when NetApp is the only storage
   vendor in the cloud.

   - The suggestion is to have a new weigher that will make calls to the
   storage based on the request and use the Active IQ response to identify
   where the share should be placed.

   - Testing this feature could be a challenge, considering that it's not
   straightforward to foresee what a weigher would return at a given moment

   - AIs: felipe_rodriguess alongside the NetApp team will work on a spec
   so this can be further discussed.
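
As a rough sketch of the weigher idea and of one possible way around the
testing concern, the Python below ranks pools using scores from an injected
callable. Nothing here is the Manila scheduler interface or the Active IQ
API: RecommendationWeigher, the advisor callable and its return format, and
the pool names are all hypothetical. Injecting the advisor means a unit test
can pass a canned response and get a predictable ordering.

```python
class RecommendationWeigher:
    """Hypothetical weigher that ranks pools using an external advisor."""

    def __init__(self, advisor):
        # advisor: callable(request, pool_names) -> {pool_name: score}
        self._advisor = advisor

    def weigh(self, request, pool_names):
        scores = self._advisor(request, pool_names)
        # Pools the advisor did not score default to 0.0.
        return sorted(
            pool_names, key=lambda p: scores.get(p, 0.0), reverse=True
        )


def fake_advisor(request, pool_names):
    # Canned response standing in for the vendor tool in a unit test.
    return {"pool_a": 0.2, "pool_b": 0.9}


weigher = RecommendationWeigher(fake_advisor)
print(weigher.weigh({"size": 10}, ["pool_a", "pool_b", "pool_c"]))
# ['pool_b', 'pool_a', 'pool_c']
```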

[0] https://etherpad.opendev.org/p/manila-bobcat-ptg

Thank you!