[openstack-dev] [magnum] Discussion of supporting single/multiple OS distro

Steve Gordon sgordon at redhat.com
Thu Mar 17 13:22:10 UTC 2016


----- Original Message -----
> From: "Kai Qiang Wu" <wkqwu at cn.ibm.com>
> To: "OpenStack Development Mailing List (not for usage questions)" <openstack-dev at lists.openstack.org>
> Sent: Tuesday, March 15, 2016 3:20:46 PM
> Subject: Re: [openstack-dev] [magnum] Discussion of supporting single/multiple OS distro
> 
> Hi  Stdake,
> 
> There is a patch about Atomic 23 support in Magnum. Atomic 23 uses
> kubernetes 1.0.6 and docker 1.9.1.
> From Steve Gordon, I learnt that they do have a two-weekly release. To me it
> seems the Atomic 23 releases do not differ much from each other (minor changes).
> The major rebases/updates may still have to wait for e.g. Fedora Atomic 24.

Well, the emphasis here is on *may*. As was pointed out in that same thread [1], rebases certainly can occur, although those builds need to get karma in the Fedora build system to be pushed into updates and subsequently included in the next rebuild (e.g. see [2] for a newer K8S build). The main point is that if a rebase introduces some element of backwards incompatibility, it would have to wait for the next major release (F24); outside of that there is some flexibility.

> So maybe we do not need to test every two-weekly Atomic 23 release.
> We can pick one, or update the old one, when we find it integrates a new
> kubernetes, docker, etcd, etc. For other small changes (not including security
> fixes), it seems we need not update so frequently, which can save some effort.

A question I have posed before, and that I think will need to be answered if Magnum is indeed to move towards the model for handling drivers proposed in this thread, is this: what expectations does Magnum have for each image/COE combination in terms of versions of key components for a given Magnum release, and what are its expectations for the same when looking forwards to, say, Newton?

Based on our discussion, it seemed like there were some issues that mean kubernetes-1.1.0 would be preferable, for example (although the fact that it wasn't there was itself a bug, it would seem; regardless, it's a valid example), but is that expectation documented somewhere? It seems like, based on the feature roadmap, it should be possible to at least put forward minimum required versions for key components (e.g. docker, k8s, flannel, etcd for the K8S COE). This would make it easier to guide the relevant upstreams to ensure their images support the Magnum team's needs, and at least minimize the need to do custom builds, if not eliminate it.
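
As a rough illustration of what documented minimums could look like, here is a minimal Python sketch. The kubernetes 1.1.0 minimum and the Atomic 23 kubernetes/docker versions are the figures mentioned in this thread; the etcd and flannel numbers, the MIN_VERSIONS table, and the unmet_requirements helper are hypothetical placeholders for illustration, not anything Magnum defines today.

```python
# Hypothetical minimum versions for the K8S COE. Only the kubernetes value
# comes from this thread; the others are placeholders, not project policy.
MIN_VERSIONS = {
    "kubernetes": "1.1.0",
    "docker": "1.9.1",
    "etcd": "2.0.0",     # placeholder
    "flannel": "0.5.0",  # placeholder
}


def parse_version(text):
    """Turn a dotted version such as '1.0.6' into a comparable tuple (1, 0, 6)."""
    return tuple(int(part) for part in text.split("."))


def unmet_requirements(image_versions, minimums=MIN_VERSIONS):
    """Return {component: (found, required)} for every component that is
    missing from the image or older than the required minimum."""
    problems = {}
    for component, required in minimums.items():
        found = image_versions.get(component)
        if found is None or parse_version(found) < parse_version(required):
            problems[component] = (found, required)
    return problems


# kubernetes and docker versions as reported for Atomic 23 in this thread;
# the etcd and flannel versions are made up for the example.
example_image = {
    "kubernetes": "1.0.6",
    "docker": "1.9.1",
    "etcd": "2.2.0",
    "flannel": "0.5.4",
}

print(unmet_requirements(example_image))
# {'kubernetes': ('1.0.6', '1.1.0')}
```

A table like this, published per Magnum release and per image/COE combination, is the kind of thing an upstream image team could check a candidate build against before anyone has to resort to a custom build.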

-Steve

[1] https://lists.fedoraproject.org/archives/list/cloud@lists.fedoraproject.org/thread/ZJARDKSB3KGMKLACCZSQALZHV54PAJUB/
[2] https://bodhi.fedoraproject.org/updates/FEDORA-2016-a89f5ce5f4

> From:	"Steven Dake (stdake)" <stdake at cisco.com>
> To:	"OpenStack Development Mailing List (not for usage questions)"
>             <openstack-dev at lists.openstack.org>
> Date:	16/03/2016 03:23 am
> Subject:	Re: [openstack-dev] [magnum] Discussion of supporting
>             single/multiple OS distro
> 
> 
> 
> WFM as long as we stick to the spirit of the proposal and don't end up in a
> situation where there is only one distribution.  Others in the thread had
> indicated there would be only one distribution in tree, which I'd find
> disturbing for reasons already described on this thread.
> 
> While we are about it, we should move to the latest version of atomic and
> chase atomic every two weeks on their release.  Thoughts?
> 
> Regards
> -steve
> 
> 
> From: Hongbin Lu <hongbin.lu at huawei.com>
> Reply-To: "OpenStack Development Mailing List (not for usage questions)" <
> openstack-dev at lists.openstack.org>
> Date: Monday, March 14, 2016 at 8:10 PM
> To: "OpenStack Development Mailing List (not for usage questions)" <
> openstack-dev at lists.openstack.org>
> Subject: Re: [openstack-dev] [magnum] Discussion of supporting
> single/multiple OS distro
> 
> 
> 
>             From: Adrian Otto [mailto:adrian.otto at rackspace.com]
>             Sent: March-14-16 4:49 PM
>             To: OpenStack Development Mailing List (not for usage
>             questions)
>             Subject: Re: [openstack-dev] [magnum] Discussion of supporting
>             single/multiple OS distro
> 
>             Steve,
> 
>             I think you may have misunderstood our intent here. We are not
>             seeking to lock in to a single OS vendor. Each COE driver can
>             have a different OS. We can have multiple drivers per COE. The
>             point is that drivers should be simple, and therefore should
>             support one Bay node OS each. That would mean taking what we
>             have today in our Kubernetes Bay type implementation and
>             breaking it down into two drivers: one for CoreOS and another
>             for Fedora/Atomic. New drivers would start out in a contrib
>             directory where complete functional testing would not be
>             required. In order to graduate one out of contrib and into the
>             realm of support of the Magnum dev team, it would need to have
>             a full set of tests, and someone actively maintaining it.
>       OK. It sounds like the proposal allows more than one OS to be
>       in-tree, as long as the second OS goes through an incubation process.
>       If that is what you mean, it sounds reasonable to me.
> 
>             Multi-personality drivers would be relatively complex. That
>             approach would slow down COE specific feature development, and
>             complicate maintenance that is needed as new versions of the
>             dependency chain are bundled in (docker, k8s, etcd, etc.). We
>             have all agreed that having integration points that allow for
>             alternate OS selection is still our direction. This follows the
>             pattern that we set previously when deciding what networking
>             options to support. We will have one that’s included as a
>             default, and a way to plug in alternates.
> 
>             Here is what I expect to see when COE drivers are implemented:
> 
>             Docker Swarm:
>             Default driver: Fedora/Atomic
>             Alternate driver: TBD
> 
>             Kubernetes:
>             Default driver: Fedora/Atomic
>             Alternate driver: CoreOS
> 
>             Apache Mesos/Marathon:
>             Default driver: Ubuntu
>             Alternate driver: TBD
> 
>             We can allow an arbitrary number of alternates. Those TBD items
>             can be initially added to the contrib directory, and with the
>             right level of community support can be advanced to defaults if
>             shown to work better, be more straightforward to maintain, be
>             more secure, or whatever criteria are important to us when
>             presented with the choice. Such criteria will be subject to
>             community consensus. This should allow for free experimentation
>             with alternates to allow for innovation. See how this is not
>             locking in a single OS vendor?
> 
>             Adrian
> 
>                   On Mar 14, 2016, at 12:41 PM, Steven Dake (stdake) <
>                   stdake at cisco.com> wrote:
> 
>                   Hongbin,
> 
>                   When we are at a disagreement in the Kolla core team, we
>                   have the Kolla core reviewers vote on the matter. This is
>                   typical standard OpenStack best practice.
> 
>                   I think the vote would be something like
>                   "Implement one OS/COE/network/storage prototype, or
>                   implement many."
> 
>                   I don't have a horse in this race, but I think it would
>                   be seriously damaging to Magnum to lock in to a single
>                   vendor.
> 
>                   Regards
>                   -steve
> 
> 
>                   From: Hongbin Lu <hongbin.lu at huawei.com>
>                   Reply-To: "OpenStack Development Mailing List (not for
>                   usage questions)" <openstack-dev at lists.openstack.org>
>                   Date: Monday, March 7, 2016 at 10:06 AM
>                   To: "OpenStack Development Mailing List (not for usage
>                   questions)" <openstack-dev at lists.openstack.org>
>                   Subject: Re: [openstack-dev] [magnum] Discussion of
>                   supporting single/multiple OS distro
> 
> 
> 
>                          From: Corey O'Brien [mailto:coreypobrien at gmail.com
>                          ]
>                          Sent: March-07-16 8:11 AM
>                          To: OpenStack Development Mailing List (not for
>                          usage questions)
>                          Subject: Re: [openstack-dev] [magnum] Discussion
>                          of supporting single/multiple OS distro
> 
>                          Hongbin, I think the offer to support different OS
>                          options is a perfect example both of what we want
>                          and what we don't want. We definitely want to
>                          allow for someone like yourself to maintain
>                          templates for whatever OS they want and to have
>                          that option be easily integrated in to a Magnum
>                          deployment. However, when developing features or
>                          bug fixes, we can't wait for you to have time to
>                          add it for whatever OS you are promising to
>                          maintain.
>                    It might be true that supporting additional OS could
>                    slow down the development speed, but the key question is
>                    how much the impact will be. Does it outweigh the
>                    benefits? IMO, the impact doesn’t seem to be
>                    significant, given the fact that most features and bug
>                    fixes are OS agnostic. Also, keep in mind that every
>                    feature we introduced (variety of COEs, variety of Nova
>                    virt-driver, variety of network driver, variety of
>                    volume driver, variety of …) incurs a maintenance
>                    overhead. If you want an optimal development speed, we
>                    will be limited to support a single COE/virt
>                    driver/network driver/volume driver. I guess that is not
>                    the direction we like to be?
> 
>                          Instead, we would all be forced to develop the
>                          feature for that OS as well. If every member of
>                          the team had a special OS like that we'd all have
>                          to maintain all of them.
>                    To be clear, I don’t have a special OS, I guess neither
>                    do others who disagreed in this thread.
> 
>                          Alternatively, what was agreed on by most at the
>                          midcycle was that if someone like yourself wanted
>                          to support a specific OS option, we would have an
>                          easy place for those contributions to go without
>                          impacting the rest of the team. The team as a
>                          whole would agree to develop all features for at
>                          least the reference OS.
>                    Could we re-confirm that this is a team agreement? There
>                    is no harm to re-confirm it in the design summit/ML/team
>                    meeting. Frankly, it doesn’t seem to be.
> 
>                          Then individuals or companies who are passionate
>                          about an alternative OS can develop the features
>                          for that OS.
> 
>                          Corey
> 
>                          On Sat, Mar 5, 2016 at 12:30 AM Hongbin Lu <
>                          hongbin.lu at huawei.com> wrote:
> 
> 
>                                From: Adrian Otto [mailto:
>                                adrian.otto at rackspace.com]
>                                Sent: March-04-16 6:31 PM
> 
>                                To: OpenStack Development Mailing List (not
>                                for usage questions)
>                                Subject: Re: [openstack-dev] [magnum]
>                                Discussion of supporting single/multiple OS
>                                distro
> 
>                                Steve,
> 
>                                      On Mar 4, 2016, at 2:41 PM, Steven
>                                      Dake (stdake) <stdake at cisco.com>
>                                      wrote:
> 
>                                      From: Adrian Otto <
>                                      adrian.otto at rackspace.com>
>                                      Reply-To: "OpenStack Development
>                                      Mailing List (not for usage
>                                      questions)" <
>                                      openstack-dev at lists.openstack.org>
>                                      Date: Friday, March 4, 2016 at 12:48
>                                      PM
>                                      To: "OpenStack Development Mailing
>                                      List (not for usage questions)" <
>                                      openstack-dev at lists.openstack.org>
>                                      Subject: Re: [openstack-dev] [magnum]
>                                      Discussion of supporting
>                                      single/multiple OS distro
> 
>                                       Hongbin,
> 
>                                       To be clear, this pursuit is not
>                                       about what OS options cloud operators
>                                       can select. We will be offering a
>                                       method of choice. It has to do with
>                                       what we plan to build comprehensive
>                                       testing for,
>                                 This is easy. Once we build comprehensive
>                                 tests for the first OS, just re-run it for
>                                 other OS(s).
> 
>                                       and the implications that has on our
>                                       pace of feature development. My
>                                       guidance here is that we resist the
>                                       temptation to create a system with
>                                       more permutations than we can
>                                       possibly support. The relation
>                                       between bay node OS, Heat Template,
>                                       Heat Template parameters, COE, and
>                                       COE dependencies (cloud-init, docker,
>                                       flannel, etcd, etc.) are
>                                       multiplicative in nature. From the
>                                       mid cycle, it was clear to me that:
> 
>                                       1) We want to test at least one OS
>                                       per COE from end-to-end with
>                                       comprehensive functional tests.
>                                       2) We want to offer clear and precise
>                                       integration points to allow cloud
>                                       operators to substitute their own OS
>                                       in place of whatever one is the
>                                       default for the given COE.
> 
>                                      A COE shouldn’t have a default
>                                      necessarily that locks out other
>                                      defaults.  Magnum devs are the experts
>                                      in how these systems operate, and as
>                                      such need to take on the
>                                      responsibility of the implementation
>                                      for multi-os support.
> 
>                                       3) We want to control the total
>                                       number of configuration permutations
>                                       to simplify our efforts as a project.
>                                       We agreed that gate testing all
>                                       possible permutations is intractable.
> 
>                                      I disagree with this point, but I
>                                      don't have the bandwidth available to
>                                      prove it ;)
> 
>                                That’s exactly my point. It takes a chunk of
>                                human bandwidth to carry that
>                                responsibility. If we had a system engineer
>                                assigned from each of the various upstream
>                                OS distros working with Magnum, this would
>                                not be a big deal. Expecting our current
>                                contributors to support a variety of OS
>                                variants is not realistic.
>                          You have my promise to support an additional OS
>                          for 1 or 2 popular COEs.
> 
>                                Change velocity among all the components we
>                                rely on has been very high. We see some of
>                                our best contributors frequently sidetracked
>                                in the details of the distros releasing
>                                versions of code that won’t work with ours.
>                                We want to upgrade a component to add a new
>                                feature, but struggle to because the new
>                                release of the distro that offers that
>                                component is otherwise incompatible.
>                                Multiply this by more distros, and we expect
>                                a real problem.
>                          At Magnum upstream, the overhead doesn’t seem to
>                          come from the OS. Perhaps, that is specific to
>                          your downstream?
> 
>                                There is no harm if you have 30 gates
>                                running the various combinations.
>                                Infrastructure can handle the load.  Whether
>                                devs have the cycles to make a fully
>                                bulletproof gate is the question I think you
>                                answered with the word intractable.
> 
>                                Actually, our existing gate tests are really
>                                stressing out our CI infra. At least one of
>                                the new infrastructure providers that
>                                replaced HP has equipment that runs
>                                considerably slower. For example, our swarm
>                                functional gate now frequently fails because
>                                it can’t finish before the allowed time
>                                limit of 2 hours where it could finish
>                                substantially faster before. If we expanded
>                                the workload considerably, we might quickly
>                                work to the detriment of other projects by
>                                perpetually clogging the CI pipelines. We
>                                want to be a good citizen of the openstack
>                                CI community. Testing configuration of third
>                                party software should be done with third
>                                party CI setups. That’s one of the reasons
>                                those exist. Ideally, each would be
>                                maintained by those who have a strategic
>                                (commercial?) interest in support for that
>                                particular OS.
> 
>                                      I can tell you in Kolla we spend a lot
>                                      of cycles just getting basic gating
>                                      going of building containers and then
>                                      deploying them.  We have even made
>                                      inroads into testing the deployment.
>                                      We do CentOS, Ubuntu, and soon Oracle
>                                      Linux, for both source and binary and
>                                      build and deploy.  Lots of gates and
>                                      if they aren't green we know the patch
>                                      is wrong.
> 
>                                Remember that COE’s are tested on nova
>                                instances within heat stacks. Starting lots
>                                of nova instances within devstack in the
>                                gates is problematic. We are looking into
>                                using a libvirt-lxc instance type from nova
>                                instead of a libvirt-kvm instance to help
>                                alleviate this. Until then, limiting the
>                                scope of our gate tests is appropriate. We
>                                will continue our efforts to make them
>                                reasonably efficient.
> 
>                                Thanks,
> 
>                                Adrian
> 
> 
>                                      Regards
>                                      -steve
> 
> 
>                                       Note that it will take a thoughtful
>                                       approach (subject to discussion) to
>                                       balance these interests. Please take
>                                       a moment to review the interest
>                                       above. Do you or others disagree with
>                                       these? If so, why?
> 
>                                       Adrian
> 
>                                             On Mar 4, 2016, at 9:09 AM,
>                                             Hongbin Lu <
>                                             hongbin.lu at huawei.com> wrote:
> 
>                                             I don’t think there is any
>                                             consensus on supporting single
>                                             distro. There are multiple
>                                             disagreements on this thread,
>                                             including several senior team
>                                             members and a project
>                                             co-founder. This topic should
>                                             be re-discussed (possibly at
>                                             the design summit).
> 
>                                             Best regards,
>                                             Hongbin
> 
>                                             From: Corey O'Brien [
>                                             mailto:coreypobrien at gmail.com]
>                                             Sent: March-04-16 11:37 AM
>                                             To: OpenStack Development
>                                             Mailing List (not for usage
>                                             questions)
>                                             Subject: Re: [openstack-dev]
>                                             [magnum] Discussion of
>                                             supporting single/multiple OS
>                                             distro
> 
>                                             I don't think anyone is saying
>                                             that code should somehow block
>                                             support for multiple distros.
>                                             The discussion at midcycle was
>                                             about what we should gate
>                                             on and ensure feature parity
>                                             for as a team. Ideally, we'd
>                                             like to get support for every
>                                             distro, I think, but no one
>                                             wants to have that many gates.
>                                             Instead, the consensus at the
>                                             midcycle was to have 1
>                                             reference distro for each COE,
>                                             gate on those and develop
>                                             features there, and then have
>                                             any other distros be maintained
>                                             by those in the community that
>                                             are passionate about them.
> 
>                                             The issue also isn't about how
>                                             difficult or not it is. The
>                                             problem we want to avoid is
>                                             spending precious time
>                                             guaranteeing that new features
>                                             and bug fixes make it through
>                                             multiple distros.
> 
>                                             Corey
> 
>                                             On Fri, Mar 4, 2016 at 11:18 AM
>                                             Steven Dake (stdake) <
>                                             stdake at cisco.com> wrote:
>                                              My position on this is simple.
> 
>                                              Operators are used to using
>                                              specific distros because that
>                                              is what they used in the
>                                              90s, and the 00s, and the 10s.
>                                              Yes, 25 years of using a
>                                              distro, and you learn it
>                                              inside and out.  This means
>                                              you don't want to relearn a
>                                              new distro, especially if you're
>                                              an RPM user going to DEB or a
>                                              DEB user going to RPM.  These
>                                              are non-starter options for
>                                              operators, and as a result,
>                                              mean that distro choice is a
>                                              must.  Since CoreOS is a new
>                                              OS in the marketplace, it may
>                                              make sense to consider placing
>                                              it in "third" position in
>                                              terms of support.
> 
>                                              Besides that problem, various
>                                              distribution companies will
>                                              only support distros running
>                                              in Vms if it matches the host
>                                              kernel, which makes total
>                                              sense to me.  This means on an
>                                              Ubuntu host if I want support
>                                              I need to run Ubuntu vms, on a
>                                              RHEL host I want to run RHEL
>                                              vms, because, hey, I want my
>                                              issues supported.
> 
>                                              For these reasons and these
>                                              reasons alone, there is no
>                                              good rationale to remove
>                                              multi-distro support  from
>                                              Magnum.  All I've heard in
>                                              this thread so far is "it's too
>                                              hard".  It's not too hard,
>                                              especially with Heat
>                                              conditionals making their way
>                                              into Mitaka.
> 
>                                              Regards
>                                              -steve
> 
>                                              From: Hongbin Lu <
>                                              hongbin.lu at huawei.com>
>                                              Reply-To: "
>                                              openstack-dev at lists.openstack.org
>                                              " <
>                                              openstack-dev at lists.openstack.org
>                                              >
>                                              Date: Monday, February 29,
>                                              2016 at 9:40 AM
>                                              To: "
>                                              openstack-dev at lists.openstack.org
>                                              " <
>                                              openstack-dev at lists.openstack.org
>                                              >
>                                              Subject: [openstack-dev]
>                                              [magnum] Discussion of
>                                              supporting single/multiple OS
>                                              distro
> 
>                                              Hi team,
> 
>                                              This is a continued discussion
>                                              from a review [1]. Corey
>                                              O'Brien suggested to have
>                                              Magnum support a single OS
>                                              distro (Atomic). I disagreed.
>                                              I think we should bring the
>                                              discussion here to get a
>                                              broader set of inputs.
> 
>                                              Corey O'Brien
>                                              From the midcycle, we decided
>                                              we weren't going to continue
>                                              to support 2 different
>                                              versions of the k8s template.
>                                              Instead, we were going to
>                                              maintain the Fedora Atomic
>                                              version of k8s and remove the
>                                              coreos templates from the
>                                              tree. I don't think we should
>                                              continue to develop features
>                                              for coreos k8s if that is
>                                              true.
>                                              In addition, I don't think we
>                                              should break the coreos
>                                              template by adding the trust
>                                              token as a heat parameter.
> 
>                                              Hongbin Lu
>                                              I was on the midcycle and I
>                                              don't remember any decision to
>                                              remove CoreOS support. Why do
>                                              you want to remove CoreOS
>                                              templates from the tree?
>                                              Please note that this is a
>                                              very big decision and please
>                                              discuss it with the team
>                                              thoughtfully and make sure
>                                              everyone agrees.
> 
>                                              Corey O'Brien
>                                              Removing the coreos templates
>                                              was a part of the COE drivers
>                                              decision. Since each COE
>                                              driver will only support 1
>                                              distro+version+coe we
>                                              discussed which ones to
>                                              support in tree. The decision
>                                              was that instead of trying to
>                                              support every distro and every
>                                              version for every coe, the
>                                              magnum tree would only have
>                                              support for 1 version of 1
>                                              distro for each of the 3 COEs
>                                              (swarm/docker/mesos). Since we
>                                              already are going to support
>                                              Atomic for swarm, removing
> 

-- 
Steve Gordon,
Principal Product Manager,
Red Hat OpenStack Platform
