[manila][ptg] Xena PTG Summary

Goutham Pacha Ravi gouthampravi at gmail.com
Thu May 6 07:30:50 UTC 2021


Hello Zorillas, and other interested Stackers!

Sorry this is getting to you later than usual :)

We concluded a productive Project Team Gathering on 23rd Apr '21. I'd like
to thank everyone who participated, as well as the friendly folks at the
OpenInfra Foundation who worked hard to organize this event. The following
is my summary of the proceedings. I've linked associated etherpads and
resources to dig in further. Please feel free to follow up here or on
freenode's #openstack-manila if you have any questions.

== Day 1 - Apr 19, 2021, Mon ==

=== Manila Retrospective ===
Topic Etherpad: https://etherpad.opendev.org/p/manila-wallaby-retrospective
 - interns and their mentors did a great job through Wallaby! Many
accomplishments, thanks to the sponsoring organizations for their investment
 - the two Bugsquash events at M1 and M3 were very effective.
 - we added new core members (python-manilaclient, manila-tempest-plugin)
 - sub-teams (manila-ui, manilaclient and manila-tempest-plugin) increased
focus on the individual projects and sped up reviews
 - good cross project collaboration with OpenStack QA, Horizon, NFS-Ganesha
and Ceph communities
 - we discussed strategies to avoid reviewer burnout and performed health
checks on the team growth initiatives we took up in the Wallaby cycle

Actions:
 - need to get proactive about reviews so we can avoid burn-out at the end
of the cycle
 - continue bug squash days in Xena
 - look for interested contributors to join the core maintainer team
 - bring up reviews in weekly meetings and assign review owners earlier
than M-2
 - contributors need to be reminded to help with reviews


== Day 3 - Apr 21, 2021, Wed ==

=== manila-tempest-plugin plans and update ===
- lkuchlan and vhari highlighted difficulties around "capabilities" testing
that were exposed by feature changes in the CephFS driver during the Wallaby
cycle
- optional feature testing requires rework, the configuration options and
test strategy have gotten confusing over time
- discussion continued on dropping tests for older API microversions where
some optional features (snapshots, snapshot cloning) weren't really
optional; the team agreed that maintaining these was overkill. The problem
is specifically in share type extra-specs: if these are set up with API
version >= 2.24, we can circumvent it:
https://review.opendev.org/c/openstack/manila-tempest-plugin/+/785307. The
only remaining issue is that tests could theoretically be run against a
cloud that doesn't support API version 2.24 (i.e., the Newton release or
older). The team agreed that we could document this deficiency in the
configuration file, since the Newton release has been EOL for a while.

Actions:
 - review and merge the fix to share type extra-specs
 - rework the existing "run_XYZZY_tests" and "capability_XYZZY_support"
flags into a "share-feature-enabled" section where these capabilities are
always set when a feature is enabled - this is in line with the way optional
feature testing is done for other OpenStack services
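
As a rough illustration of that convention, the reworked configuration could
look something like the fragment below. The section and option names here
are assumptions based on manila's existing capability names, not the final
design:

```ini
# Hypothetical tempest.conf fragment - names are illustrative, modeled on
# the "*-feature-enabled" convention other OpenStack services use. Each
# option would be set whenever the corresponding feature is enabled.
[share-feature-enabled]
snapshot_support = True
create_share_from_snapshot_support = True
revert_to_snapshot_support = False
mount_snapshot_support = False
```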

=== Service Recovery ===
Topic Etherpad:
https://etherpad.opendev.org/p/xena-ptg-manila-share-recovery-polling

- gouthamr discussed the use cases and architectures needing to run
manila-share in active-active HA
- service recovery after failure needs to be coordinated to prevent
metadata corruption
- clustering services will provide coordination for cleanup
- vendor driver authors will need to audit critical sections and replace
oslo.concurrency locks with distributed locks.

Actions:
- there will be a specification submitted for review
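
To illustrate the kind of change driver authors would be making, here is a
minimal sketch of the pattern. In a real active-active deployment the lock
would come from a coordination backend (e.g. tooz) so it is exclusive across
all clustered hosts; here a plain threading.Lock stands in, and the helper
names are hypothetical:

```python
import threading

# Stand-in for a distributed lock factory. In manila, this role would be
# played by a coordination backend so the critical section is exclusive
# across *all* clustered manila-share hosts, not just threads in one process.
_local_lock = threading.Lock()

def get_cluster_lock(name):
    # Assumption: a real implementation would return a distributed lock
    # keyed by `name` (e.g. a share or backend identifier).
    return _local_lock

def cleanup_share_resource(share_id, results):
    # Critical section: only one host at a time may recover or clean up
    # metadata for a given share, preventing metadata corruption.
    with get_cluster_lock('share-%s' % share_id):
        results.append(share_id)

results = []
threads = [threading.Thread(target=cleanup_share_resource,
                            args=('s1', results)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
# All four cleanup attempts completed, serialized by the lock.
```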

=== Polling operations in manila-share ===
Topic Etherpad:
https://etherpad.opendev.org/p/xena-ptg-manila-share-recovery-polling
- haixin and gouthamr summarized the periodic tasks conducted by the share
manager service
   - to update share service capabilities and real time capacity
information,
   - to monitor the health of share replicas,
   - to "watch" the progress of long running clone, snapshot and migration
operations.
- these periodic tasks will be inefficiently duplicated if running across
multiple nodes associated with the same backend storage system/driver
- many of these periodic tasks are meant to be read-only; however, the
health-check operations modify database state. These currently assume a
single writer, and unless staggered, updates may be duplicated as well
- proposed mitigation was to allow clustered hosts to ignore polling
operations if one is in progress elsewhere, and for the service instance
executing the tasks to distribute the workload to other hosts via the
message queue
- another issue with periodic tasks was the lack of ability to stop an
unnecessary task from running - we could take advantage of oslo's
threadgroup that exposes a start/stop API
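
The start/stop behavior discussed above can be sketched with stdlib
threading. This is only an illustrative stand-in for the equivalent
interface that oslo's thread groups expose; the class name is hypothetical:

```python
import threading
import time

class StoppablePeriodicTask:
    """Minimal stand-in for a stoppable periodic task, analogous to the
    start/stop API provided by oslo thread groups."""

    def __init__(self, interval, callback):
        self._interval = interval
        self._callback = callback
        self._stopped = threading.Event()
        self._thread = None

    def start(self):
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def _run(self):
        # Event.wait() doubles as the inter-run sleep and the stop signal:
        # it returns False on timeout (run again) and True once stop() is
        # called (exit the loop).
        while not self._stopped.wait(self._interval):
            self._callback()

    def stop(self):
        # An unnecessary task is simply told to stop and never rescheduled.
        self._stopped.set()
        self._thread.join()

ticks = []
task = StoppablePeriodicTask(0.01, lambda: ticks.append(1))
task.start()
time.sleep(0.1)
task.stop()
```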

Actions:
- replace loopingcall periodic tasks with threadgroups to selectively run
periodic tasks
- propose a specification to coordinate and distribute polling workloads

=== Interop Working Group ===
- arkady_kanevsky presented the mission of the InteropWG, and laid out
plans for the upcoming Interop guideline
- manila tests were added as part of the 2020.11 guideline and have been
available for the community to provide results
- there was an optional capability (shrinking shares) that was flagged, and
the team discussed if it was to remain flagged, or be moved to an advisory
state - including reduction of the cutoff score
- the team wants to stabilize the tests exposed through the next guideline,
and work on removing the need for an admin user to bootstrap manila's
tempest tests where unnecessary

Actions:
- improve manila tempest plugin to not request admin credentials for the
tests exposed, and work with a configured "default" share type

=== Share server migration improvements ===
- share server migration was added as an experimental feature in the
Victoria release
- carloss presented use cases for nondisruptive share server migration
- nondisruptive migration would mean that network changes are unnecessary,
so the share manager needn't seek out any new network ports or relinquish
existing ones
- although share server migration is experimental, we'd still make
nondisruptive migration available only from a new API microversion
- manual triggering of the second phase is still necessary to allow control
of the cutover/switchover. All resources will continue to be tagged busy
until the migration has been completed

Actions:
- no specification is necessary because the changes to the API and core
manila are expected to be minimal

=== OpenStackSDK and Manila ===
- ashrod98, NicoleBU and markypharaoh, seniors at Boston University, worked
through the past cycle to expose manila resources in the OpenStackSDK
- an Outreachy internship proposal has been submitted to continue their
work to expose more API functionality via the SDK
- we'd prioritize common elements (quotas, limits and availability zones)
so we can work in parallel on the OpenStackClient and Horizon
implementations for these
- need review attention on the patches

Actions:
- some domain reviewers for manila patches to be added to the OpenStackSDK
core team to help with reviews


== Day 4 - Apr 22, 2021, Thu ==

=== Support soft delete/recycle bin functionality ===
- haixin told us that inspur cloud has a "soft-delete" functionality that
they would like to contribute to upstream manila
- the functionality allows users to recover deleted shares within a
configurable time window via the Manila API
- the proposal is to only allow soft-deletion and recovery of shares, but
in the future this can be extended to other resources such as snapshots,
replicas, share groups
- soft deletions have the same validations as deletions currently do
- quota implications and alternatives (unmanage, a system admin API) were
discussed
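
A minimal sketch of the recovery-window semantics described above; the field
and function names are hypothetical, not the proposed manila schema:

```python
import datetime

# Assumption: the recovery window would be deployer-configurable.
RECOVERY_WINDOW = datetime.timedelta(days=7)

class Share:
    def __init__(self, name):
        self.name = name
        self.soft_deleted_at = None  # None => share is active

def soft_delete(share, now):
    # The same validations as a regular delete would apply here (omitted).
    share.soft_deleted_at = now

def restore(share, now):
    # A soft-deleted share can only be recovered within the window.
    if share.soft_deleted_at is None:
        raise ValueError('share is not deleted')
    if now - share.soft_deleted_at > RECOVERY_WINDOW:
        raise ValueError('recovery window has expired')
    share.soft_deleted_at = None

t0 = datetime.datetime(2021, 5, 1)
s = Share('demo')
soft_delete(s, t0)
restore(s, t0 + datetime.timedelta(days=3))  # within the window: succeeds
```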

Actions:
- a specification will be proposed for this work

=== New Driver Requirements and CI knowledge sharing ===
- there was an informative presentation at the Cinder PTG [3] about using
Software Factory to deploy a Third Party CI system with Gerrit, Nodepool,
Zuulv3 and other components
- some helpful but unmaintained test hook code exists in the repos to aid
with CI systems using devstack-gate
- since devstack-gate has been deprecated for a long time, third party CIs
are strongly recommended to move away from it

Actions:
- complete removal of devstack-gate related manila-tempest-plugin
installation from manila's devstack plugin
- create a Third Party CI and help wiki/doc that vendor driver maintainers
can curate and contribute to

=== New Driver for Dell EMC PowerStore ===
- vb123 discussed Dell EMC PowerStore storage, its capabilities and their
work in cinder, and made a proposal for a driver in manila
- this driver's being targeted for the Xena release:
https://blueprints.launchpad.net/manila/+spec/powerstore-manila-driver
- community shared updates and documentation from the Victoria and Wallaby
cycles that may be helpful to the driver authors
- driver submission deadlines have been published:
https://releases.openstack.org/xena/schedule.html

=== Enabling access allow/deny for container driver with LDAP ===
- esantos and carloss enabled the container driver in the Wallaby release
to support addition of and updates to LDAP security services
- however, this work was only intended to provide a reference implementation
for security services
- the container driver does not validate access rules applied with the LDAP
server today
- the proposal is to enable validation in the driver when a security
service is configured

Actions:
- publish the container driver helper image to the manila-image-elements
repository and explore container registries that the community can use for
the long term for CI and dev/test
- file a bug and fix the missing LDAP validation in the container share
driver

=== Keystone based user/cephx authentication ===
- "cephx" (CephFS) access types are not validated beyond their structure
and syntax
- with cephx access, an access key for the ceph client user can be
retrieved via the access-list API call
- access keys are privileged information; while we've always held the stance
that native CephFS is only suitable in trusted-tenant environments, there's
a desire to hide access keys from unrelated users in the project
- gouthamr proposed that we allow deployers to choose if Keystone user
identity validation must be performed
- when validation is enabled, manila ensures that users may only allow
share access to other users in their project
- users may only retrieve access keys belonging to themselves, and no other
project users
- an alternative to keystone validation would be to allow controlling
access to the access_key via a separate RBAC policy
- with the alternative, we wouldn't be able to validate whether the user
being granted access is in the same project as the user requesting it
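
The proposed same-project check could be sketched as follows; the function
and field names are hypothetical, and a plain dict stands in for a keystone
user lookup:

```python
def validate_cephx_access(requesting_user, access_to_user, users_by_name):
    """Hypothetical check: when keystone validation is enabled, the user
    being granted cephx access must exist and must belong to the same
    project as the user requesting the access rule."""
    target = users_by_name.get(access_to_user)
    if target is None:
        raise ValueError('unknown keystone user: %s' % access_to_user)
    if target['project_id'] != requesting_user['project_id']:
        raise ValueError('access_to user is not in the requesting project')

# Toy stand-in for keystone user records.
users = {
    'alice': {'project_id': 'p1'},
    'bob': {'project_id': 'p1'},
    'eve': {'project_id': 'p2'},
}
validate_cephx_access(users['alice'], 'bob', users)  # same project: allowed
```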

Actions:
- a specification will be proposed, however, the work's not expected to be
prioritized for Xena

=== Addressing Technical Debt ===
- we discussed tech debt that was piling up from the last couple of cycles
and assigned owners to drive them to completion in the Xena cycle

Actions:
- python-manilaclient still relies on keystoneclient to perform auth,
there's ongoing work from vkmc+gouthamr to replace this with keystoneauth
[4]
- code in manila relies on the retrying python library, which is no longer
maintained; kiwi36 will propose replacing it with tenacity with
tbarron's help [5]
- tbarron will be resurrecting a patch that removes old deprecated
configuration options [6]
- carloss will be picking up rootwrap-to-privsep migration

=== Unified Limits ===
Topic Etherpad: https://etherpad.opendev.org/p/unified-limits-ptg
- kiwi36, our Outreachy intern, proposed the design for using keystone's
unified limits in manila with the help of the oslo.limit library
- he walked through the prototype using resource quotas and highlighted the
differences with the current quota system
- we discussed why nested quotas were preferable to user quotas
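
The enforcement flow with unified limits is roughly: look up the registered
limit (kept centrally in keystone), count current usage via a service-
supplied callback, and reject requests that would exceed the limit. A toy
stdlib sketch of that pattern; the names here are illustrative, not the
oslo.limit API:

```python
class LimitExceeded(Exception):
    pass

class Enforcer:
    """Toy enforcer mirroring the unified-limits flow: limits are
    registered centrally (a dict stands in for keystone here) and current
    usage is computed by a callback the service provides."""

    def __init__(self, registered_limits, usage_callback):
        self._limits = registered_limits
        self._usage = usage_callback

    def enforce(self, project_id, resource, requested):
        limit = self._limits[resource]
        used = self._usage(project_id, resource)
        if used + requested > limit:
            raise LimitExceeded('%s: %d used + %d requested > limit %d'
                                % (resource, used, requested, limit))

# Hypothetical usage store and callback.
usage = {('p1', 'shares'): 8}
enforcer = Enforcer({'shares': 10},
                    lambda project, res: usage.get((project, res), 0))
enforcer.enforce('p1', 'shares', 2)  # 8 used + 2 requested <= 10: allowed
```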

Actions:
- a release independent specification will be proposed for the community to
review and continue this work
- explore how share type quotas will be handled

=== Secure RBAC Follow up ===
- vhari presented the RBAC changes that were made in the Wallaby release,
including support for the system scope and reader role
- a tempest test strategy was discussed, along with improvements made to
tempest itself to ease test credential setup with the new default roles
- there's no plan to enforce the new defaults in Xena, we'll take the
release to stabilize the feature and turn on deprecation warnings
indicating intent to switch to the new defaults in the Y release

Actions:
- vhari and lkuchlan will start working on the tempest tests
- wrap up known issues with the new defaults and backport fixes to the
Wallaby release
- manila's admin model and user personas will be documented


== Day 5 - Apr 23, 2021, Fri ==

=== VirtIOFS plans and update ===
Topic Etherpad:
https://etherpad.opendev.org/p/nova-manila-virtio-fs-support-xptg-april-2021
- tbarron and lyarwood presented an update on the research done in the
wallaby release with regard to live-attaching virtiofs volumes to running
compute instances
- qemu supports live attaches, the feature is coming to libvirt soon. live
migration of instances with virtiofs volumes isn't supported yet - this is
ongoing work in the qemu/kvm/libvirt communities
- we discussed the connection initiation and information exchange between
manila and nova, and what parts will be coordinated via the os-share library
- there are as yet no anticipated changes to the manila API

Actions:
- continue working with the libvirt community to have the live-attach APIs
added
- a specification will be proposed to nova (and manila if changes are
necessary to the manila API)

=== CephFS driver update ===
Topic Etherpad: https://etherpad.opendev.org/p/cephfs-driver-update-xena-ptg
- vkmc presented the changes and new features in the cephfs driver in the
wallaby cycle
- the driver was overhauled to interact with ceph via the ceph-mgr daemon
instead of the deprecated ceph_volume_client library
- there's an upgrade impact going from victoria to wallaby. Deployers must
consult the release notes [7] and the ceph driver documentation [8] prior
to upgrading. Data path operations are unaffected during an upgrade:
   - ceph clusters must be running the latest version of the release
they're on to allow the wallaby manila driver to communicate with the ceph
cluster
   - the ceph user configured with the driver needs mgr "caps", and mds/osd
privileges can be dropped/reduced [8]
- we discussed dropping the use of dbus in the nfs-ganesha module, and
using the "watch-url" approach that ganesha's ceph-fsal module provides
- we also discussed ceph-adm and ceph-orch based deployment which replaces
ceph-ansible and the impact that would have on ganesha configuration
- ceph quincy will support active/active nfs-ganesha clusters (even with
ceph-adm based deployment), we discussed manila side changes to take
advantage of this feature

=== manilaclient and OpenStackClient plans and update ===
Topic Etherpad: https://etherpad.opendev.org/p/manila-osc-xena-ptg
- maaritamm provided an update on our multi-cycle effort to gain parity
between manila's shell client and the osc plugin
- we made great progress in the wallaby cycle and covered all "user"-related
functionality that was requested; we discussed what's left and sought
community input on prioritizing the missing commands
- we discussed deprecating the manila shell client and supporting only the
osc plugin, while addressing the "standalone" use case where "manila" can
be used in place of "openstack share" if users desire
- we could use help to add new commands, and to review code in the osc
plugin

Actions:
 - achieve complete feature parity between the two shell client
implementations, then deprecate the manila shell client


=== manila-ui plans and update ===
Topic Etherpad: https://etherpad.opendev.org/p/manila-ui-update-xena-ptg
- disap, our Outreachy intern, and vkmc highlighted the changes made in the
wallaby cycle
- we lauded the cross-project collaboration that was renewed in this cycle
to triage issues, and proactively work on incoming changes in horizon and
elsewhere
- manila-ui's microversion catchup is progressing slowly, but surely
- we have another outreachy project proposal that overlaps with the xena
cycle

Actions:
- continue to catch up to API feature functionality in the UI


Thanks for staying with me so far! :) There were a number of items that we
couldn't cover, which you can see on the PTG planning etherpad [1]; if you
own those topics, please add them to the weekly IRC meeting agenda [2] and
we can go over them. The meeting minutes etherpad continues to be available
[9].

As usual, the whole PTG was recorded and posted on the OpenStack Manila
YouTube channel [10].

We had a great turnout and heard from a diverse set of contributors,
operators, interns and users from several affiliations and time zones. On
behalf of the OpenStack Manila team, I deeply appreciate your time, and
help in keeping the momentum on the project!

Best,
gouthamr

[1] https://etherpad.opendev.org/p/xena-ptg-manila-planning
[2] https://wiki.openstack.org/wiki/Manila/Meetings
[3] https://www.youtube.com/watch?v=hVLpPBldn7g
[4] https://review.opendev.org/647538
[5] https://review.opendev.org/380552
[6] https://review.opendev.org/745206/
[7] https://docs.openstack.org/releasenotes/manila/wallaby.html
[8] https://docs.openstack.org/manila/wallaby/admin/cephfs_driver.html
[9] https://etherpad.opendev.org/p/xena-ptg-manila
[10]
https://www.youtube.com/playlist?list=PLnpzT0InFrqDmsKKsF0MQKtv9fP4Oik17

