[openstack-dev] [magnum] Discussion of supporting single/multiple OS distro
Adrian Otto
adrian.otto at rackspace.com
Fri Mar 4 23:31:17 UTC 2016
Steve,
On Mar 4, 2016, at 2:41 PM, Steven Dake (stdake) <stdake at cisco.com> wrote:
From: Adrian Otto <adrian.otto at rackspace.com>
Reply-To: "OpenStack Development Mailing List (not for usage questions)" <openstack-dev at lists.openstack.org>
Date: Friday, March 4, 2016 at 12:48 PM
To: "OpenStack Development Mailing List (not for usage questions)" <openstack-dev at lists.openstack.org>
Subject: Re: [openstack-dev] [magnum] Discussion of supporting single/multiple OS distro
Hongbin,
To be clear, this pursuit is not about what OS options cloud operators can select. We will be offering a method of choice. It has to do with what we plan to build comprehensive testing for, and the implications that has on our pace of feature development. My guidance here is that we resist the temptation to create a system with more permutations than we can possibly support. The relationships between bay node OS, Heat Template, Heat Template parameters, COE, and COE dependencies (cloud-init, docker, flannel, etcd, etc.) are multiplicative in nature. From the midcycle, it was clear to me that:
1) We want to test at least one OS per COE from end-to-end with comprehensive functional tests.
2) We want to offer clear and precise integration points to allow cloud operators to substitute their own OS in place of whatever one is the default for the given COE.
A COE shouldn’t necessarily have a default that locks out other defaults. Magnum devs are the experts in how these systems operate, and as such need to take on the responsibility of the implementation for multi-OS support.
3) We want to control the total number of configuration permutations to simplify our efforts as a project. We agreed that gate testing all possible permutations is intractable.
I disagree with this point, but I don't have the bandwidth available to prove it ;)
That’s exactly my point. It takes a chunk of human bandwidth to carry that responsibility. If we had a system engineer assigned from each of the various upstream OS distros working with Magnum, this would not be a big deal. Expecting our current contributors to support a variety of OS variants is not realistic. Change velocity among all the components we rely on has been very high. We see some of our best contributors frequently sidetracked in the details of the distros releasing versions of code that won’t work with ours. We want to upgrade a component to add a new feature, but struggle to because the new release of the distro that offers that component is otherwise incompatible. Multiply this by more distros, and we expect a real problem.
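To make the multiplicative growth concrete, here is a minimal sketch that enumerates a gate matrix as a Cartesian product. The option lists and version strings are hypothetical, chosen only for illustration; they are not Magnum's actual support matrix.

```python
from itertools import product

# Hypothetical option lists for illustration only;
# this is NOT Magnum's actual support matrix.
OPTIONS = {
    "os": ["fedora-atomic", "coreos", "ubuntu"],
    "coe": ["kubernetes", "swarm", "mesos"],
    "docker": ["1.9", "1.10"],
    "etcd": ["2.2", "2.3"],
}


def gate_matrix(options):
    """Return every combination of the given options as a list of dicts."""
    keys = list(options)
    return [
        dict(zip(keys, combo))
        for combo in product(*(options[k] for k in keys))
    ]


matrix = gate_matrix(OPTIONS)
print(len(matrix))  # 3 * 3 * 2 * 2 = 36

# Adding a single extra distro adds a whole slice of combinations.
bigger = gate_matrix({**OPTIONS, "os": OPTIONS["os"] + ["centos"]})
print(len(bigger))  # 4 * 3 * 2 * 2 = 48
```

Under these assumed lists, one additional distro adds 12 configurations, each of which would need its own functional test run if everything were gated.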
There is no harm if you have 30 gates running the various combinations. Infrastructure can handle the load. Whether devs have the cycles to make a fully bulletproof gate is the question I think you answered with the word intractable.
Actually, our existing gate tests are really stressing out our CI infra. At least one of the new infrastructure providers that replaced HP has equipment that runs considerably slower. For example, our swarm functional gate now frequently fails because it can’t finish before the allowed time limit of 2 hours, where it could finish substantially faster before. If we expanded the workload considerably, we might quickly work to the detriment of other projects by perpetually clogging the CI pipelines. We want to be a good citizen of the OpenStack CI community. Testing configuration of third party software should be done with third party CI setups. That’s one of the reasons those exist. Ideally, each would be maintained by those who have a strategic (commercial?) interest in support for that particular OS.
I can tell you in Kolla we spend a lot of cycles just getting basic gating going of building containers and then deploying them. We have even made inroads into testing the deployment. We do CentOS, Ubuntu, and soon Oracle Linux, for both source and binary and build and deploy. Lots of gates and if they aren't green we know the patch is wrong.
Remember that COEs are tested on nova instances within heat stacks. Starting lots of nova instances within devstack in the gates is problematic. We are looking into using a libvirt-lxc instance type from nova instead of a libvirt-kvm instance to help alleviate this. Until then, limiting the scope of our gate tests is appropriate. We will continue our efforts to make them reasonably efficient.
Thanks,
Adrian
Regards
-steve
Note that it will take a thoughtful approach (subject to discussion) to balance these interests. Please take a moment to review the interests above. Do you or others disagree with these? If so, why?
Adrian
On Mar 4, 2016, at 9:09 AM, Hongbin Lu <hongbin.lu at huawei.com> wrote:
I don’t think there is any consensus on supporting a single distro. There are multiple disagreements on this thread, including several senior team members and a project co-founder. This topic should be re-discussed (possibly at the design summit).
Best regards,
Hongbin
From: Corey O'Brien [mailto:coreypobrien at gmail.com]
Sent: March-04-16 11:37 AM
To: OpenStack Development Mailing List (not for usage questions)
Subject: Re: [openstack-dev] [magnum] Discussion of supporting single/multiple OS distro
I don't think anyone is saying that code should somehow block support for multiple distros. The discussion at midcycle was about what we should gate on and ensure feature parity for as a team. Ideally, we'd like to get support for every distro, I think, but no one wants to have that many gates. Instead, the consensus at the midcycle was to have 1 reference distro for each COE, gate on those and develop features there, and then have any other distros be maintained by those in the community that are passionate about them.
The issue also isn't about how difficult or not it is. The problem we want to avoid is spending precious time guaranteeing that new features and bug fixes make it through multiple distros.
Corey
On Fri, Mar 4, 2016 at 11:18 AM Steven Dake (stdake) <stdake at cisco.com> wrote:
My position on this is simple.
Operators are used to using specific distros because that is what they used in the 90s, and the 00s, and the 10s. Yes, 25 years of using a distro, and you learn it inside and out. This means you don't want to relearn a new distro, especially if you're an RPM user going to DEB or a DEB user going to RPM. These are non-starter options for operators, and as a result, mean that distro choice is a must. Since CoreOS is a new OS in the marketplace, it may make sense to consider placing it in "third" position in terms of support.
Besides that problem, various distribution companies will only support distros running in VMs if they match the host kernel, which makes total sense to me. This means on an Ubuntu host if I want support I need to run Ubuntu VMs, on a RHEL host I want to run RHEL VMs, because, hey, I want my issues supported.
For these reasons and these reasons alone, there is no good rationale to remove multi-distro support from Magnum. All I've heard in this thread so far is "it's too hard". It's not too hard, especially with Heat conditionals making their way into Mitaka.
Regards
-steve
From: Hongbin Lu <hongbin.lu at huawei.com>
Reply-To: "openstack-dev at lists.openstack.org" <openstack-dev at lists.openstack.org>
Date: Monday, February 29, 2016 at 9:40 AM
To: "openstack-dev at lists.openstack.org" <openstack-dev at lists.openstack.org>
Subject: [openstack-dev] [magnum] Discussion of supporting single/multiple OS distro
Hi team,
This is a continued discussion from a review [1]. Corey O'Brien suggested that Magnum support a single OS distro (Atomic). I disagreed. I think we should bring the discussion here to get a broader set of inputs.
Corey O'Brien
From the midcycle, we decided we weren't going to continue to support 2 different versions of the k8s template. Instead, we were going to maintain the Fedora Atomic version of k8s and remove the coreos templates from the tree. I don't think we should continue to develop features for coreos k8s if that is true.
In addition, I don't think we should break the coreos template by adding the trust token as a heat parameter.
Hongbin Lu
I was at the midcycle and I don't remember any decision to remove CoreOS support. Why do you want to remove the CoreOS templates from the tree? Please note that this is a very big decision, so please discuss it with the team thoughtfully and make sure everyone agrees.
Corey O'Brien
Removing the coreos templates was a part of the COE drivers decision. Since each COE driver will only support 1 distro+version+coe, we discussed which ones to support in tree. The decision was that instead of trying to support every distro and every version for every coe, the magnum tree would only have support for 1 version of 1 distro for each of the 3 COEs (swarm/kubernetes/mesos). Since we already are going to support Atomic for swarm, removing coreos and keeping Atomic for kubernetes was the favored choice.
Hongbin Lu
Strongly disagree. It is a huge risk to support a single distro. The selected distro could die in the future. Who knows. Why make Magnum take this huge risk? Again, the decision to support a single distro is a very big decision. Please bring it up to the team and have it discussed thoughtfully before making any decision. Also, Magnum doesn't have to support every distro and every version for every coe, but should support *more than one* popular distro for some COEs (especially for the popular COEs).
Corey O'Brien
The discussion at the midcycle started from the idea of adding support for RHEL and CentOS. We all discussed and decided that we wouldn't try to support everything in tree. Magnum would provide support in-tree for 1 per COE and the COE driver interface would allow others to add support for their preferred distro out of tree.
Hongbin Lu
I agreed with the part that "we wouldn't try to support everything in tree". That doesn't imply the decision to support a single distro. Again, supporting a single distro is a huge risk. Why make Magnum take this huge risk?
[1] https://review.openstack.org/#/c/277284/
Best regards,
Hongbin
__________________________________________________________________________
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: OpenStack-dev-request at lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev