[TripleO] criteria for deprecating services

Mark Goddard mark at stackhpc.com
Tue Mar 5 08:01:29 UTC 2019


On Mon, 4 Mar 2019 at 17:52, Ben Nemec <openstack at nemebean.com> wrote:

>
>
> On 3/4/19 9:16 AM, Alex Schultz wrote:
> > On Mon, Mar 4, 2019 at 6:11 AM Dan Prince <dprince at redhat.com> wrote:
> >>
> >> On Fri, 2019-03-01 at 15:43 -0700, Alex Schultz wrote:
> >>> On Fri, Mar 1, 2019 at 3:24 PM Dan Prince <dprince at redhat.com> wrote:
> >>>> Recently we've been cleaning house in some of the TripleO
> >>>> supported
> >>>> services.
> >>>>
> >>>> We removed MongoDB as RDO was also dropping it. I guess we needed
> >>>> to
> >>>> follow suit as our CI is also based on the packages there.
> >>>>
> >>>> For other services (Designate for example) if the RDO packages
> >>>> exist
> >>>> and we already have support do we really need to deprecate them?
> >>>> Having
> >>>> the ability to deploy some of the lesser used but still active
> >>>> OpenStack projects with our deployment framework is nice for
> >>>> developers
> >>>> and users alike. Especially when you want to try out a new
> >>>> service.
> >>>>
> >>>
> >>> It's the long term maintenance of them to ensure they continue to
> >>> work
> >>> (packaging/promotions/requirement syncing). If no one is watching
> >>> them
> >>> and making sure they still work, I'm not sure it's worth saying they
> >>> are "supported". Much like the baremetal support that we had, when we
> >>> drop any testing we might as well mark them deprecated since there is
> >>> no way to know if they still "work" the next day.  Adding and
> >>> maintaining services is non-trivial so unless it's actively used, I
> >>> don't think it's necessarily a bad thing to trim our "supported" list
> >>> to a set of known good services.
> >>>
> >>> Just in the last two or three weeks I've had to go address packaging
> >>> problems with Vitrage[0] and Tacker[1] because requirements changed
> >>> in
> >>> the project and the packages weren't kept up to date so the puppet
> >>> module CI was broken.  No one noticed this was broken until we went
> >>> to
> >>> go update some unrelated things and found out that they were broken.
> >>> The same thing happens in TripleO too where a breakage in a less than
> >>> supported service takes away time for more important work.  The cost
> >>> to keep these things working is > 0.
> >>
> >> Agree the cost isn't zero. But it also isn't high. And there is value
> >> to a project having a deep bench of services from which to choose and
> >> try out. The existence of at least some "niche" services in TripleO
> >> provides some value to our users and perhaps even an argument to use
> >> TripleO as it would be considered a feature to be able to try out these
> >> services. Perhaps even partially implemented ones in some cases still
> >> have value (no HA support for example).
> >>
> >
> > So I gave it some thought and rather than just deprecating for
> > removal, could we instead mark them as experimental and treat them as
> > such?  Yes you're right that folks might want to try these services,
> > however there is no clear definition of a service that should always
> > work vs a service that might work.  From an end user perspective if
> > they see that something like Congress is defined and they try and
> > consume it only to find out it doesn't work or isn't configured
> > correctly then that is a poor experience.   I also don't think someone
> > who is new to TripleO who wants to try out a service will likely be
> > able to figure out why it's not working and just think "TripleO
> > doesn't work".  Can we move services which we have no guarantee to be
> > working (no testing/no owners) to a /experimental/ folder to indicate
> > the service may or may not work?
>
> As someone who wrote the templates for a now-deprecated service I like
> the idea of them living on in some format. On the other hand, in the
> course of writing the Designate templates they were broken multiple
> times by TripleO changes to the service interfaces. If a service isn't
> being tested regularly I suspect there's little chance of it continuing
> to work long-term without _someone_ looking after it.
>
> Heck, Designate _is_ in the gate right now and it still broke recently
> in real deployments with separate control and compute nodes. Without
> someone paying attention to it I don't know how that would ever have
> been found or fixed.
>
> I think my recommendation would be to keep James's maintainer
> requirement for even experimental services, but maybe instead of gating
> on them just have a periodic job that runs with them enabled once a
> night and emails the maintainer of record if it fails. That way they
> can't block other work and aren't consuming much in the way of ci
> resources, but they can be maintained with minimal effort. It might
> encourage more people to sign up as maintainers if they know breakages
> in the service aren't going to force them to drop everything to unblock
> the gate.
>

In the kolla project we run some of the service-specific jobs only when
relevant files have changed, using Zuul's files/irrelevant-files
configuration syntax. This can be combined with a periodic job to catch
code rot.
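
As a rough illustration of the idea (a simplified model of the matching
semantics, not Zuul's actual implementation), the decision can be sketched in
Python. The function name, the use of re.search, and the example paths are
assumptions made for this sketch; the real regex anchoring rules and job
attributes are described in the Zuul documentation.

```python
import re
from typing import Iterable, Optional, Sequence


def job_should_run(
    changed_files: Iterable[str],
    files: Optional[Sequence[str]] = None,
    irrelevant_files: Optional[Sequence[str]] = None,
) -> bool:
    """Hypothetical model of a files / irrelevant-files job filter.

    files: the job runs only if at least one changed file matches
           one of these regexes.
    irrelevant_files: the job is skipped if every changed file matches
           one of these regexes.
    """
    changed = list(changed_files)

    if files is not None:
        # Require at least one changed file to match a "files" pattern.
        if not any(re.search(p, f) for f in changed for p in files):
            return False

    if irrelevant_files is not None and changed:
        # Skip when all changed files are covered by "irrelevant" patterns.
        if all(any(re.search(p, f) for p in irrelevant_files) for f in changed):
            return False

    return True


if __name__ == "__main__":
    # Hypothetical path layout, chosen only for illustration.
    designate_patterns = [r"^ansible/roles/designate/"]
    print(job_should_run(["ansible/roles/designate/tasks/main.yml"],
                         files=designate_patterns))
    print(job_should_run(["doc/source/index.rst"],
                         files=designate_patterns))
```

A periodic job for the same service would be defined separately and is not
modelled here; it covers the case where the service breaks because of a change
that touched none of the matched files.
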
Mark


> Or maybe that will just result in all the periodic jobs failing
> indefinitely, but if that happens then you know the maintainer isn't
> maintaining anymore and you can deprecate the service.
>
> I'm also not sure how much burden that would put on the ci squad to set
> up such jobs. That's another discussion we'd need to have.
>
> >
> >
> >> I just spent the time to "flatten" many of these services thinking they
> >> would stay for a while. Many of us are willing to chip in to keep some
> >> of these I think.
> >>
> >>>
> >>> [0] https://review.rdoproject.org/r/#/c/19006/
> >>> [1] https://review.rdoproject.org/r/#/c/18830/
> >>>
> >>>> Rather than debate these things ad-hoc on some of the various
> >>>> reviews I
> >>>> figured it worth asking here. Do we have criteria for when it is
> >>>> appropriate to deprecate a service that is implemented and fully
> >>>> working? Is it costing us that much in terms of CI and resources to
> >>>> keep a few of these services around?
> >>>>
> >>>
> >>> Do you have a definition of "fully implemented"?  Some of the
> >>> services
> >>> that have been added were added but never actually tested. Designate
> >>> only recently was covered with testing.  Things like Congress have
> >>> never been tested (like via tempest) and we've only done an install
> >>> but no actual service verification.  I would say Designate might be
> >>> closer to fully implemented but Tacker/Congress would not be
> >>> considered implemented.
> >>>
> >>> Given that we've previously been asked to reduce our CI footprint, I
> >>> think it's hard to say is it really costing that much because the
> >>> answer would be yes if it has even the slightest impact.  The fewer
> >>> services we support, the fewer scenarios we have to have, the less
> >>> complex our deployments are and the fewer resources they consume.
> >>
> >> For the services we agree to keep we could always run them in a lower
> >> bandwidth CI framework. Something like periodic jobs. Understood these
> >> would occasionally get broken but the upstream feedback loop would at
> >> least exist and the services could stay. And we'd still be able to
> >> reduce our CI resources as well.
> >>
> >>>
> >>> Thanks,
> >>> -Alex
> >>>
> >>>> Dan
> >>>>
> >>>>
> >>
> >
>
>