[openstack-dev] [manila][ptg] Rocky PTG summary

Tom Barron tpb at dyncloud.net
Wed Mar 14 11:16:48 UTC 2018


We had a good showing [1] at the Rocky PTG in Dublin.  Most of us
rarely see each other face-to-face, and some contributors (even
long-time ones) came to the PTG for the first time or joined manila
from other projects!  We had a good time together [2], took on some
tough subjects, and planned out our approach to Rocky.

The following summarizes our main discussions.  For the raw
discussion topic/log etherpad see [3]; for video of the team in
action see [4].  This summary has also been rendered in this
etherpad:

https://etherpad.openstack.org/p/manila-rocky-ptg-summary

Please follow up in the etherpad with corrections or additions,
especially where we've missed a perspective or interpretation.

== Queens Retrospective ==

Summary [5] shows focus on maintaining quality and integrity of the
project while at the same time seeking ways to encourage developer
participation, new driver engagement, and adoption of manila in real
deployments.

== Rocky Schedule ==

- We'll keep the same project-specific deadlines as Queens:
  * Spec freeze at Rocky-1 milestone
  * New Driver Submission Freeze at Rocky-2 milestone
  * Feature Proposal Freeze at release-7 week
    (two weeks before Rocky-3 milestone)

== Cross Project Goals ==

- Manila met the Queens goals (policy in code [6] and split of tempest
  into its own repo [7]).
- For the Rocky mox removal goal [8] we have no direct usage of mox
  anymore, but we need to track the transitive dependency of the
  manila-ui plugin on mox via horizon [9].
- We have already met the minimum Rocky mutable configuration goal [10]
  in that we have general support for toggling debug logging without a
  restart.  We agreed that additional mutable configuration options
  should be proposed on a case-by-case basis, with use cases and
  supporting arguments to the effect that they are indeed safe to be
  treated as mutable (see the sketch below).
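
For concreteness, here is a minimal sketch (not manila code; the option
name is invented) of how a mutable option is declared with oslo.config.
Options marked mutable=True can be re-read at runtime when the service
mutates its config files, typically on SIGHUP via oslo.service:

    # Hypothetical example only -- the option name is made up.
    from oslo_config import cfg

    CONF = cfg.CONF

    example_opts = [
        cfg.BoolOpt('example_verbose_cleanup',
                    default=False,
                    # mutable=True marks the option as safe to change
                    # without a restart; the new value is picked up when
                    # the service calls CONF.mutate_config_files().
                    mutable=True,
                    help='Toggle extra cleanup logging at runtime.'),
    ]
    CONF.register_opts(example_opts)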

== Documentation Gaps ==

- amito's experience introducing the new Infinidat driver in Queens
  shows significant gaps in our doc for new drivers
- jungleboyj proposed that cinder will clean up its onboarding doc
  including its wiki for how to contribute a driver [11]
- amito will work with the manila community to port over this information
  and identify any remaining gaps
- patrickeast will be adding a Pure back end in Rocky and can help
  identify gaps
- we agreed to work with cinder to drive consistency in 'Contributor
  Guide' format and subject matter.

== Python 3 ==

- Distros are dropping support for python 2 completely between now
  and 2020, so OpenStack projects need to start getting ready now [12]
- Our main exposure is in manila-ui, where we still run unit tests with
  python 2 only
- Also need to add a good set of python 3 tempest tests for manila proper
- CentOS jobs will need to be replaced with stable Fedora jobs
- vkmc will drive this; the overall goal may take more than one release

== NFSExportsHelper ==

- bswartz has a better implementation
- in discussion he developed a preliminary plan for migrating users
  from the old to the new implementation
- impacts the generic and lvm drivers, which are arguably reference only
- bswartz will communicate any impact to the openstack-dev,
  openstack-operators, and openstack-users mailing lists

== Quota Resource Usage Tracking ==

- we inherited our reservation/commit/rollback system from Cinder, which
  in turn took theirs from Nova
- it is buggy, making reservations in one service and doing commit/rollback
  in scattered places in another service.  Customer bugs with quotas
  are painful and confidence that they are actually fixed is low.
- melwitt and dansmith explained how Nova has now abandoned this
  system in favor of actual resource counting in the API service
- we intend to explore the possibility of implementing a system similar
  to the new Nova approach (see the sketch below); cinder is exploring
  this as well
- can be implemented as bug fixes if it's clean and easy to understand
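
As a rough illustration of the counting approach (all names below are
made up, not Nova's or manila's actual code), the API service simply
counts what already exists before creating the new resource, instead of
writing reservation records that must be committed or rolled back
elsewhere:

    # Hypothetical sketch of counting-based quota enforcement in the
    # API service; the data-access helpers are stubbed for illustration.

    class QuotaExceeded(Exception):
        pass


    def get_project_quotas(context, project_id):
        # Stub: a real implementation would read quota limits from the DB.
        return {'shares': 50, 'gigabytes': 1000}


    def count_shares_and_gigabytes(context, project_id):
        # Stub: a real implementation would COUNT()/SUM() the project's
        # existing share rows.
        return 3, 120


    def check_share_quota(context, project_id, requested_gb):
        """Reject a create request up front if it would exceed quota."""
        limits = get_project_quotas(context, project_id)
        used_shares, used_gb = count_shares_and_gigabytes(context,
                                                          project_id)
        if used_shares + 1 > limits['shares']:
            raise QuotaExceeded('shares')
        if used_gb + requested_gb > limits['gigabytes']:
            raise QuotaExceeded('gigabytes')
        # No reservation row is written; the share record created next
        # (still in the API service) is what future counts will see.
        # Nova additionally re-counts after creating the record to
        # narrow the race window.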

== Replacing rootwrap with privsep ==

- What's in it for manila?
- Nova says it improves performance; Cinder says it harms performance :)
- It serializes operations, so the performance impact depends on how long
  the elevated-privilege operations run (see the sketch below).
- We need to study our codebase more to understand the impact; not a Rocky
  goal for us to implement this.
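
To make the trade-off concrete, here is a hedged sketch of the
oslo.privsep pattern (the context and function names are invented, not
anything in manila today).  The decorated function runs in a separate
privileged daemon and calls to it are serialized, which is why
long-running privileged operations can hurt throughput:

    from oslo_concurrency import processutils
    from oslo_privsep import capabilities as caps
    from oslo_privsep import priv_context

    # A privilege context declares which Linux capabilities the helper
    # daemon keeps; everything else is dropped.
    example_context = priv_context.PrivContext(
        __name__,
        cfg_section='example_privsep',
        pypath=__name__ + '.example_context',
        capabilities=[caps.CAP_SYS_ADMIN],
    )


    @example_context.entrypoint
    def mount_nfs(export_path, mount_point):
        # Runs inside the privileged daemon process.
        return processutils.execute('mount', '-t', 'nfs',
                                    export_path, mount_point)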

== Huawei proposal to support more access rule attributes ==

* access levels like all_squash / no_all_squash
  Most but not all vendors can support these.
  We agreed that although opaque metadata on access rules _could_ be
  used to allow manila forks to implement such support opaquely to
  manila proper, this is a generally useful characteristic, not
  something only useful for Huawei private cloud.  So it should be
  implemented using new public extra specs and back end capability
  checking in the scheduler (see the sketch after this list), in order
  to avoid error cases with back ends that cannot support the
  capabilities in question.
* ordering semantics for access rules to disambiguate rule sets
  where incompatible access modes (like r/w and r/o) are applied
  to the same range of addresses
- We recognized that there may be cases where a cloud or distribution
  may need to extend upstream manila with features that are not
  supported upstream, and observed that in general wsgi extensions
  would not be sufficient to meet these needs.  Metadata fields that
  "mean something" to the forked distribution but which are opaque
  to manila proper could be used to address these needs.
- Huawei will submit specs for these access rule attributes, as well
  as for metadata for access rules (though the latter will not be used
  for *these* features) and we will prioritize their review.
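
As a very rough, hypothetical sketch of the extra-spec-plus-capability
approach (the capability name 'squash_rules_support' is invented here;
the real names would come out of the Huawei specs): the driver
advertises the capability in its share stats, and a share type carries
a matching extra spec such as capabilities:squash_rules_support='<is> True'
so that the scheduler's capability filtering keeps such shares off back
ends that cannot honor it.

    # Illustrative driver fragment only; not an existing manila driver.
    from manila.share import driver


    class ExampleShareDriver(driver.ShareDriver):

        def _update_share_stats(self):
            data = {
                'share_backend_name': 'example_backend',
                'driver_version': '1.0',
                'storage_protocol': 'NFS_CIFS',
                # Hypothetical capability matched by the scheduler
                # against a share type extra spec like
                #   capabilities:squash_rules_support='<is> True'
                'squash_rules_support': True,
            }
            super(ExampleShareDriver, self)._update_share_stats(data)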

== Huawei proposal to support optional share size ==

- Huawei public cloud users would like to be able to create a share
  without specifying a share size, so proposal is to have a default
  share size for such cases.
- The manila community does not want this general capability because we
  think many users would assume the share can grow without limit if no
  size is specified, and that is not what this proposal would do.

== Feasibility of other potential features ==

- 'auto' value for overprovisioning ratio.

   * Cinder has this in flight (erlon).
   * patrickeast might pick this up since he's doing a new driver for
     manila that does its own autoprovisioning and that's in the path
     of the cinder work.

- manage/unmanage for DHSS=True

Requires two steps: (1) manage share servers, (2) manage shares,
and care must be taken w.r.t. unmanaged resources in managed
share servers.  No objection in principle to supporting this
but the design is non-trivial.  Probably not a Rocky target.

- create-from-snapshot to different pools than the original

No objection in principle but the spec may be tricky: need to
handle driver/scheduler interaction so that pools incompatible
with any given create-from-snapshot request are not chosen.
Probably not a Rocky target, watch similar cinder work.

== HA for software defined Share Servers ==

- current software defined Share Servers (as with the generic
  and Windows back ends) introduce a single point of failure in the
  data path that arguably makes them unacceptable for production use
- in Queens tbarron proposed a spec [13] wherein small self-managing
  pacemaker-corosync clusters of service VMs would address the SPOF
  issue
- His target was a scale-out, DHSS=True version of the Ceph NFS driver
  * currently we deploy with TripleO as processes/containers running
    under the controller node pacemaker/corosync cluster
  * avoids the SPOF issue at the cost of:
    + no scale out
    + DHSS-only deployment
- New approach for the ceph-nfs back end
  * de-couple the manila driver and the open source back end software
  * ceph mimic will enable running ceph-nfs (ganesha) active-active
  * implement the back end to run ceph daemons (including ceph-nfs)
    under kubernetes, with kubernetes HA
  * manila driver will interact with ceph mgr over a REST interface to
    create back end share servers per tenant as well as to do share CRUD
    on the back end
  * manila will pass neutron details for share networks to the ceph
    manager; the ceph manager will annotate pod creation requests for
    nfs-gw with details that kuryr will use to connect ceph-nfs
    gateways directly to tenant private networks
- This will take a couple releases to develop but has potential for a fully
  open source, production quality, software defined DHSS=True back end
- Discussion suggested that when ready this back end could serve as the
  reference DHSS=True driver, but we'd need to figure out a way to run
  it (a scaled down version?) in the gate
- bswartz started a good discussion about kubernetes HA, write 
  caching, and container restart that has been picked up here [14]
- tbarron's queens spec could still be developed for alternate 
  software defined back ends that implement share servers via service 
  VMs such as generic

== Scenario Test Improvements ==

- Several contributors are proposing enhancements.
- Port Valeriy's spec over to Rocky and keep it up to date as
  part of the reviews for the enhancements.
- Spec deadline does not apply to this test-only work.

== Openstack Client Integration ==

- We are behind other projects in that we have no OSC support.
- May be able to get an Outreachy intern to work on this.
- Technical issues:
  * need some basic compatibility with manila microversions
  * need to be able to run manila standalone as well
- Can pursue this opportunistically, outside of the release waterfall
  spec approval cadence.

== Shade Support ==

- tbarron, patrickeast, and others find shade and ansible roles built
  on shade super useful, but manila has no shade support
- We agreed to support efforts to provide shade / ansible support
  for the manila share service opportunistically, outside the normal
  release cadence

== Races ==

- We are seeing random CI failures in the dummy driver
- These may be due to races in the manager since no back end drivers are
  exercised.
- Let's file bugs on these and investigate.

== Manila UI and django requirements bump [15] ==

- The last time we bumped the minimum django requirement it didn't go
  smoothly in manila-ui
- Mostly just a heads-up

== Testing multisegment binding in the gate [16] ==

- Challenging
- The community thinks gate testing is probably not a great use of
  community resources, but third parties whose customers use this with
  their drivers should be motivated to test it.

== Support for non-nova consumers of file shares ==

- This is a hot area; it increases the value of vendors and back ends
  investing in manila drivers if they can be used outside OpenStack.
- Options:
  * OpenStack / K8s side-by-side: manila as part of full OpenStack
    provides shares to k8s, perhaps using kuryr to extend neutron
    networks into k8s
  * standalone manila / cinder as a software defined storage appliance
    running with mysql and rabbitmq but without keystone (NOAUTH) and
    the rest of OpenStack (see the sketch after this list)
  * manilalib (like cinderlib) inside a persistence service (for model
    updates, driver private data) that can be used by a stateless CSI
    driver
  * other ...
  * use of any of the above with OpenSDS [17].  Manila / Cinder core
    xing-yang is now working full time on OpenSDS.
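
For the standalone option, a hedged illustration of what a keystone-less
deployment might put in manila.conf (manila, like cinder, supports a
noauth strategy; the exact set of options would need to be verified for
a real deployment):

    [DEFAULT]
    # Skip keystone entirely; API requests are not authenticated.
    auth_strategy = noauth
    # mysql and rabbitmq are still required, as noted above.
    transport_url = rabbit://guest:guest@rabbit-host:5672/

    [database]
    connection = mysql+pymysql://manila:secret@db-host/manila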

== Mount Automation ==

- We pursued nova support in Mitaka, were blocked, and never pursued it
  further.
- We may get this "for free" when providing mounts to container workloads
  (as in kubernetes, where mounts are done by the hosts and containers
  get automated bind mounts to the shares).
- For traditional nova workloads heat / ansible may be our best
  available option.

== IPv6 Fulfillment ==

- In Queens the EMC, NetApp, and lvm back ends added IPv6 support.
- In Rocky the Huawei, Pure, Ceph Native, and Ceph NFS back ends expect
  to add IPv6 support.

== MOD_WSGI ==

- We discussed when to use it in CI jobs and agreed that we need to
  cover both MOD_WSGI and non-MOD_WSGI cases, and we are currently
  doing that.

== Migration to StoryBoard [18] ==

- launchpad is up to ~1.75 million bugs now and storyboard starts
  numbering at 2 million; importing our launchpad bugs into storyboard
  will be easier if we act sooner rather than later
- We will look at the storyboard sandbox [19] and see if it can meet
  our needs.
- diablorojo will do a test migration and when she reports back we
  can consider next steps

== Zuul v3 migration ==

- we need to add changes related to jobs config in manila-tempest-plugin
- Can make progress incrementally - no need to e.g. break 3rd party
  jobs by forcing everyone to change at once.
- Start with infra-based jobs

== Priorities for the Rocky Release Cycle ==

- Get manila-ui ready for python 3 and start converting tempest jobs
- Get parity with cinder on documentation and forge agreement with
  them on remaining gaps and a plan of action.
- Explore nova style quota usage system; can implement opportunistically
  after spec deadlines if we have agreement on a solution.
- Review NFSExportsHelper and migration proposal, merge if possible.
- Review Huawei extra access rule attribute specs, merge if possible.
- Improve testing, especially scenario tests
- OSC client, shade / ansible (pursue opportunistically)
- new IPv6 driver support
- investigate StoryBoard migration, move ahead if feasible
- pursue path to production quality open source DHSS=True back end

== Action Items ==

- tbarron will develop etherpad for priority reviews and summary
  dashboard incorporating the past review etherpads / gerrit 
  dashboards that bswartz supplied
- ganso will create an etherpad for collaboration on development of
  reviewer/contributor checklists
- tbarron will check how cinder fixed the log filtering issue
- tbarron will update releases.openstack.org with manila-specific schedule
- amito, patrickeast will help us identify gaps in new driver doc
- vkmc will track removal of mox in Horizon
- bswartz will communicate impact of new NFSExportsHelper to email lists
- anyone: explore new-style quota usage system; implement it opportunistically
- vkmc will see if we can get an outreachy intern to work on OSC for manila
- diablorojo will do a test migration of manila projects to storyboard and
  report back

== Footnotes ==

[1] 10+ people in the room at almost all times, sometimes almost double.
[2] For pictures see: http://lists.openstack.org/pipermail/openstack-dev/2018-March/128099.html
[3] https://etherpad.openstack.org/p/manila-rocky-ptg
[4] https://youtu.be/HEX9znj4-wM
[5] http://lists.openstack.org/pipermail/openstack-dev/2018-March/128232.html
[6] https://governance.openstack.org/tc/goals/queens/policy-in-code.html
[7] https://governance.openstack.org/tc/goals/queens/split-tempest-plugins.html
[8] https://governance.openstack.org/tc/goals/rocky/mox_removal.html
[9] https://etherpad.openstack.org/p/horizon-unittest-mock-migration
[10] https://governance.openstack.org/tc/goals/rocky/enable-mutable-configuration.html
[11] https://wiki.openstack.org/wiki/Cinder/how-to-contribute-a-driver
[12] https://wiki.openstack.org/wiki/Python3#Python_3_Status_of_OpenStack_projects
[13] https://review.openstack.org/#/c/504987/
[14] http://lists.openstack.org/pipermail/openstack-dev/2018-March/128064.html
[15] http://lists.openstack.org/pipermail/openstack-dev/2018-February/127421.html
[16] https://bugs.launchpad.net/manila/+bug/1747695
[17] https://docs.google.com/presentation/d/1zix__I4bUyZQpGe31Wlmv0pyvBOXacULmVbQlHaNQvo/edit?usp=sharing
[18] https://docs.openstack.org/infra/storyboard/migration.html
[19] https://storyboard-dev.openstack.org/
