[openstack-dev] [magnum] Discussion of supporting single/multiple OS distro

Steven Dake (stdake) stdake at cisco.com
Tue Mar 15 19:21:07 UTC 2016


WFM as long as we stick to the spirit of the proposal and don't end up in a situation where there is only one distribution. Others in the thread had indicated there would be only one distribution in tree, which I'd find disturbing for the reasons already described in this thread.

While we are at it, we should move to the latest version of Atomic and chase each new Atomic release on its two-week cadence. Thoughts?

Regards
-steve


From: Hongbin Lu <hongbin.lu at huawei.com>
Reply-To: "OpenStack Development Mailing List (not for usage questions)" <openstack-dev at lists.openstack.org>
Date: Monday, March 14, 2016 at 8:10 PM
To: "OpenStack Development Mailing List (not for usage questions)" <openstack-dev at lists.openstack.org>
Subject: Re: [openstack-dev] [magnum] Discussion of supporting single/multiple OS distro



From: Adrian Otto [mailto:adrian.otto at rackspace.com]
Sent: March-14-16 4:49 PM
To: OpenStack Development Mailing List (not for usage questions)
Subject: Re: [openstack-dev] [magnum] Discussion of supporting single/multiple OS distro

Steve,

I think you may have misunderstood our intent here. We are not seeking to lock in to a single OS vendor. Each COE driver can have a different OS. We can have multiple drivers per COE. The point is that drivers should be simple, and therefore should each support one Bay node OS. That would mean taking what we have today in our Kubernetes Bay type implementation and breaking it down into two drivers: one for CoreOS and another for Fedora/Atomic. New drivers would start out in a contrib directory where complete functional testing would not be required. In order to graduate one out of contrib and into the realm of support by the Magnum dev team, it would need to have a full set of tests and someone actively maintaining it.
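To illustrate, a purely hypothetical layout of such a split (the directory names here are invented for illustration, not the actual tree):

    magnum/drivers/
        k8s_fedora_atomic/    # default Kubernetes driver, fully gated
        k8s_coreos/           # alternate Kubernetes driver, fully gated
    contrib/drivers/
        k8s_opensuse/         # out-of-tree candidate; graduates into
                              # magnum/drivers/ once it has a full set of
                              # functional tests and an active maintainer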
OK. It sounds like the proposal allows more than one OS to be in-tree, as long as the second OS goes through an incubation process. If that is what you mean, it sounds reasonable to me.

Multi-personality drivers would be relatively complex. That approach would slow down COE-specific feature development, and complicate the maintenance that is needed as new versions of the dependency chain are bundled in (docker, k8s, etcd, etc.). We have all agreed that having integration points that allow for alternate OS selection is still our direction. This follows the pattern we set previously when deciding which networking options to support: we will have one that's included as a default, and a way to plug in alternates.

Here is what I expect to see when COE drivers are implemented:

Docker Swarm:
Default driver: Fedora/Atomic
Alternate driver: TBD

Kubernetes:
Default driver: Fedora/Atomic
Alternate driver: CoreOS

Apache Mesos/Marathon:
Default driver: Ubuntu
Alternate driver: TBD

We can allow an arbitrary number of alternates. Those TBD items can initially be added to the contrib directory, and with the right level of community support can be advanced to defaults if shown to work better, be more straightforward to maintain, be more secure, or satisfy whatever criteria are important to us when presented with the choice. Such criteria will be subject to community consensus. This should allow free experimentation with alternates, to allow for innovation. See how this is not locking in a single OS vendor?

Adrian

On Mar 14, 2016, at 12:41 PM, Steven Dake (stdake) <stdake at cisco.com> wrote:

Hongbin,

When we are at a disagreement in the Kolla core team, we have the Kolla core reviewers vote on the matter. This is standard OpenStack best practice.

I think the vote would be something like
"Implement one OS/COE/network/storage prototype, or implement many."

I don't have a horse in this race, but I think it would be seriously damaging to Magnum to lock in to a single vendor.

Regards
-steve


From: Hongbin Lu <hongbin.lu at huawei.com>
Reply-To: "OpenStack Development Mailing List (not for usage questions)" <openstack-dev at lists.openstack.org>
Date: Monday, March 7, 2016 at 10:06 AM
To: "OpenStack Development Mailing List (not for usage questions)" <openstack-dev at lists.openstack.org>
Subject: Re: [openstack-dev] [magnum] Discussion of supporting single/multiple OS distro



From: Corey O'Brien [mailto:coreypobrien at gmail.com]
Sent: March-07-16 8:11 AM
To: OpenStack Development Mailing List (not for usage questions)
Subject: Re: [openstack-dev] [magnum] Discussion of supporting single/multiple OS distro

Hongbin, I think the offer to support different OS options is a perfect example both of what we want and what we don't want. We definitely want to allow someone like yourself to maintain templates for whatever OS they want and to have that option be easily integrated into a Magnum deployment. However, when developing features or bug fixes, we can't wait for you to have time to add them for whatever OS you are promising to maintain.
It might be true that supporting an additional OS could slow down development, but the key question is how big the impact would be. Does it outweigh the benefits? IMO, the impact doesn't seem significant, given that most features and bug fixes are OS-agnostic. Also, keep in mind that every axis of choice we have introduced (variety of COEs, of Nova virt drivers, of network drivers, of volume drivers, of ...) incurs a maintenance overhead. If we want optimal development speed, we will be limited to supporting a single COE/virt driver/network driver/volume driver. I guess that is not the direction we would like to go?

Instead, we would all be forced to develop the feature for that OS as well. If every member of the team had a special OS like that we'd all have to maintain all of them.
To be clear, I don't have a special OS, and I guess neither do the others who disagreed in this thread.

Alternatively, what was agreed on by most at the midcycle was that if someone like yourself wanted to support a specific OS option, we would have an easy place for those contributions to go without impacting the rest of the team. The team as a whole would agree to develop all features for at least the reference OS.
Could we re-confirm that this is a team agreement? There is no harm in re-confirming it at the design summit, on the ML, or in a team meeting. Frankly, it doesn't seem to be one.

Then individuals or companies who are passionate about an alternative OS can develop the features for that OS.

Corey

On Sat, Mar 5, 2016 at 12:30 AM Hongbin Lu <hongbin.lu at huawei.com> wrote:


From: Adrian Otto [mailto:adrian.otto at rackspace.com]
Sent: March-04-16 6:31 PM

To: OpenStack Development Mailing List (not for usage questions)
Subject: Re: [openstack-dev] [magnum] Discussion of supporting single/multiple OS distro

Steve,

On Mar 4, 2016, at 2:41 PM, Steven Dake (stdake) <stdake at cisco.com> wrote:

From: Adrian Otto <adrian.otto at rackspace.com>
Reply-To: "OpenStack Development Mailing List (not for usage questions)" <openstack-dev at lists.openstack.org>
Date: Friday, March 4, 2016 at 12:48 PM
To: "OpenStack Development Mailing List (not for usage questions)" <openstack-dev at lists.openstack.org>
Subject: Re: [openstack-dev] [magnum] Discussion of supporting single/multiple OS distro

Hongbin,

To be clear, this pursuit is not about what OS options cloud operators can select. We will be offering a method of choice. It has to do with what we plan to build comprehensive testing for,
This is easy. Once we build comprehensive tests for the first OS, we just re-run them for the other OS(s).

and the implications that has on our pace of feature development. My guidance here is that we resist the temptation to create a system with more permutations than we can possibly support. The relations between the bay node OS, Heat template, Heat template parameters, COE, and COE dependencies (cloud-init, docker, flannel, etcd, etc.) are multiplicative in nature. From the midcycle, it was clear to me that:

1) We want to test at least one OS per COE from end-to-end with comprehensive functional tests.
2) We want to offer clear and precise integration points to allow cloud operators to substitute their own OS in place of whatever one is the default for the given COE.

A COE shouldn't necessarily have a default that locks out the alternatives. Magnum devs are the experts in how these systems operate, and as such need to take on responsibility for implementing multi-OS support.

3) We want to control the total number of configuration permutations to simplify our efforts as a project. We agreed that gate testing all possible permutations is intractable.
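To make the multiplicative point concrete, a rough back-of-the-envelope sketch (the axis values below are invented for illustration, not a proposed support matrix):

    # Each independently varying axis multiplies the number of
    # configurations a gate would have to cover.
    from itertools import product

    oses = ['fedora-atomic', 'coreos', 'ubuntu']
    coes = ['kubernetes', 'swarm', 'mesos']
    network_drivers = ['flannel', 'other']
    volume_drivers = ['cinder', 'other']

    configs = list(product(oses, coes, network_drivers, volume_drivers))
    print(len(configs))  # 3 * 3 * 2 * 2 = 36 permutations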

I disagree with this point, but I don't have the bandwidth available to prove it ;)

That’s exactly my point. It takes a chunk of human bandwidth to carry that responsibility. If we had a system engineer assigned from each of the various upstream OS distros working with Magnum, this would not be a big deal. Expecting our current contributors to support a variety of OS variants is not realistic.
You have my promise to support an additional OS for 1 or 2 popular COEs.

Change velocity among all the components we rely on has been very high. We see some of our best contributors frequently sidetracked in the details of distros releasing versions of code that won't work with ours. We want to upgrade a component to add a new feature, but struggle to do so because the new distro release that offers that component is otherwise incompatible. Multiply this by more distros, and we expect a real problem.
At Magnum upstream, the overhead doesn’t seem to come from the OS. Perhaps, that is specific to your downstream?

There is no harm if you have 30 gates running the various combinations.  Infrastructure can handle the load.  Whether devs have the cycles to make a fully bulletproof gate is the question I think you answered with the word intractable.

Actually, our existing gate tests are really stressing our CI infra. At least one of the new infrastructure providers that replaced HP has equipment that runs considerably slower. For example, our swarm functional gate now frequently fails because it can't finish within the allowed time limit of 2 hours, where it could finish substantially faster before. If we expanded the workload considerably, we might quickly work to the detriment of other projects by perpetually clogging the CI pipelines. We want to be good citizens of the OpenStack CI community. Testing configurations of third-party software should be done with third-party CI setups; that's one of the reasons those exist. Ideally, each would be maintained by those who have a strategic (commercial?) interest in support for that particular OS.

I can tell you that in Kolla we spend a lot of cycles just getting basic gating going for building containers and then deploying them. We have even made inroads into testing the deployment. We do CentOS, Ubuntu, and soon Oracle Linux, for both source and binary, and for both build and deploy. Lots of gates, and if they aren't green we know the patch is wrong.

Remember that COE’s are tested on nova instances within heat stacks. Starting lots of nova instances within devstack in the gates is problematic. We are looking into using a libvirt-lxc instance type from nova instead of a libvirt-kvm instance to help alleviate this. Until then, limiting the scope of our gate tests is appropriate. We will continue our efforts to make them reasonably efficient.
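(For reference, the nova knob involved here is the libvirt virt_type option; a minimal sketch, with the exact wiring depending on the deployment tooling:)

    # nova.conf on the devstack node -- sketch only
    [libvirt]
    virt_type = lxc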

Thanks,

Adrian


Regards
-steve


Note that it will take a thoughtful approach (subject to discussion) to balance these interests. Please take a moment to review the interests above. Do you or others disagree with them? If so, why?

Adrian

On Mar 4, 2016, at 9:09 AM, Hongbin Lu <hongbin.lu at huawei.com> wrote:

I don't think there is any consensus on supporting a single distro. There have been multiple disagreements in this thread, including from several senior team members and a project co-founder. This topic should be re-discussed (possibly at the design summit).

Best regards,
Hongbin

From: Corey O'Brien [mailto:coreypobrien at gmail.com]
Sent: March-04-16 11:37 AM
To: OpenStack Development Mailing List (not for usage questions)
Subject: Re: [openstack-dev] [magnum] Discussion of supporting single/multiple OS distro

I don't think anyone is saying that code should somehow block support for multiple distros. The discussion at the midcycle was about what we should gate on and ensure feature parity for as a team. Ideally, I think we'd like to have support for every distro, but no one wants to have that many gates. Instead, the consensus at the midcycle was to have 1 reference distro for each COE, gate on those and develop features there, and then have any other distros be maintained by those in the community who are passionate about them.

The issue also isn't about how difficult it is or isn't. The problem we want to avoid is spending precious time guaranteeing that new features and bug fixes make it through multiple distros.

Corey

On Fri, Mar 4, 2016 at 11:18 AM Steven Dake (stdake) <stdake at cisco.com> wrote:
My position on this is simple.

Operators are used to using specific distros because that is what they used in the 90s, the 00s, and the 10s. Yes, 25 years of using a distro, and you learn it inside and out. This means you don't want to relearn a new distro, especially if you're an RPM user going to DEB or a DEB user going to RPM. These are non-starter options for operators, and as a result mean that distro choice is a must. Since CoreOS is a new OS in the marketplace, it may make sense to consider placing it in "third" position in terms of support.

Besides that problem, various distribution companies will only support a distro running in VMs if it matches the host kernel, which makes total sense to me. This means on an Ubuntu host, if I want support, I need to run Ubuntu VMs; on a RHEL host I want to run RHEL VMs, because, hey, I want my issues supported.

For these reasons and these reasons alone, there is no good rationale for removing multi-distro support from Magnum. All I've heard in this thread so far is "it's too hard". It's not too hard, especially with Heat conditionals making their way into Mitaka.
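As a sketch of the sort of per-distro switch conditionals enable (fragment using the conditions syntax as the feature eventually landed in HOT; parameter and image names invented for illustration):

    heat_template_version: 2016-10-14

    parameters:
      server_distro:
        type: string
        default: fedora-atomic

    conditions:
      is_ubuntu: {equals: [{get_param: server_distro}, ubuntu]}

    resources:
      bay_node:
        type: OS::Nova::Server
        properties:
          image: {if: [is_ubuntu, ubuntu-image, fedora-atomic-image]}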

Regards
-steve

From: Hongbin Lu <hongbin.lu at huawei.com>
Reply-To: "openstack-dev at lists.openstack.org" <openstack-dev at lists.openstack.org>
Date: Monday, February 29, 2016 at 9:40 AM
To: "openstack-dev at lists.openstack.org" <openstack-dev at lists.openstack.org>
Subject: [openstack-dev] [magnum] Discussion of supporting single/multiple OS distro

Hi team,

This is a continued discussion from a review [1]. Corey O'Brien suggested having Magnum support a single OS distro (Atomic). I disagreed. I think we should bring the discussion here to get a broader set of inputs.

Corey O'Brien
From the midcycle, we decided we weren't going to continue to support 2 different versions of the k8s template. Instead, we were going to maintain the Fedora Atomic version of k8s and remove the coreos templates from the tree. I don't think we should continue to develop features for coreos k8s if that is true.
In addition, I don't think we should break the coreos template by adding the trust token as a heat parameter.

Hongbin Lu
I was at the midcycle and I don't remember any decision to remove CoreOS support. Why do you want to remove the CoreOS templates from the tree? Please note that this is a very big decision; please discuss it with the team thoughtfully and make sure everyone agrees.

Corey O'Brien
Removing the coreos templates was part of the COE drivers decision. Since each COE driver will only support 1 distro+version+COE, we discussed which ones to support in tree. The decision was that instead of trying to support every distro and every version for every COE, the magnum tree would only have support for 1 version of 1 distro for each of the 3 COEs (swarm/kubernetes/mesos). Since we are already going to support Atomic for swarm, removing coreos and keeping Atomic for kubernetes was the favored choice.

Hongbin Lu
Strongly disagree. It is a huge risk to support a single distro. The selected distro could die in the future; who knows? Why make Magnum take this huge risk? Again, the decision to support a single distro is a very big one. Please bring it up to the team and have it discussed thoughtfully before making any decision. Also, Magnum doesn't have to support every distro and every version for every COE, but it should support *more than one* popular distro for some COEs (especially the popular COEs).

Corey O'Brien
The discussion at the midcycle started from the idea of adding support for RHEL and CentOS. We all discussed it and decided that we wouldn't try to support everything in tree. Magnum would provide in-tree support for 1 distro per COE, and the COE driver interface would allow others to add support for their preferred distro out of tree.

Hongbin Lu
I agreed with the part that "we wouldn't try to support everything in tree". That doesn't imply a decision to support a single distro. Again, supporting a single distro is a huge risk. Why make Magnum take this huge risk?

[1] https://review.openstack.org/#/c/277284/

Best regards,
Hongbin
__________________________________________________________________________
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: OpenStack-dev-request at lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
