[openstack-dev] [TripleO][Kolla][Heat][Higgins][Magnum][Kuryr] Gap analysis: Heat as a k8s orchestrator

Hongbin Lu hongbin.lu at huawei.com
Sun May 29 22:40:17 UTC 2016



> -----Original Message-----
> From: Steven Dake (stdake) [mailto:stdake at cisco.com]
> Sent: May-29-16 3:29 PM
> To: OpenStack Development Mailing List (not for usage questions)
> Subject: Re: [openstack-dev]
> [TripleO][Kolla][Heat][Higgins][Magnum][Kuryr] Gap analysis: Heat as a
> k8s orchestrator
> 
> Quick question below.
> 
> On 5/28/16, 1:16 PM, "Hongbin Lu" <hongbin.lu at huawei.com> wrote:
> 
> >
> >
> >> -----Original Message-----
> >> From: Zane Bitter [mailto:zbitter at redhat.com]
> >> Sent: May-27-16 6:31 PM
> >> To: OpenStack Development Mailing List
> >> Subject: [openstack-dev]
> >> [TripleO][Kolla][Heat][Higgins][Magnum][Kuryr]
> >> Gap analysis: Heat as a k8s orchestrator
> >>
> >> I spent a bit of time exploring the idea of using Heat as an external
> >> orchestration layer on top of Kubernetes - specifically in the case
> >> of TripleO controller nodes but I think it could be more generally
> >> useful too - but eventually came to the conclusion it doesn't work
> >> yet, and probably won't for a while. Nevertheless, I think it's
> >> helpful to document a bit to help other people avoid going down the
> >> same path, and also to help us focus on working toward the point
> >> where it _is_ possible, since I think there are other contexts where
> >> it would be useful too.
> >>
> >> We tend to refer to Kubernetes as a "Container Orchestration Engine"
> >> but it does not actually do any orchestration, unless you count just
> >> starting everything at roughly the same time as 'orchestration'.
> >> Which I wouldn't. You generally handle any orchestration requirements
> >> between services within the containers themselves, possibly using
> >> external services like etcd to co-ordinate. (The Kubernetes project
> >> refers to this as "choreography", and explicitly disclaims any
> >> attempt at orchestration.)
> >>
> >> What Kubernetes *does* do is more like an actively-managed version of
> >> Heat's SoftwareDeploymentGroup (emphasis on the _Group_). Brief recap:
> >> SoftwareDeploymentGroup is a type of ResourceGroup; you give it a map
> >> of resource names to server UUIDs and it creates a SoftwareDeployment
> >> for each server. You have to generate the list of servers somehow to
> >> give it (the easiest way is to obtain it from the output of another
> >> ResourceGroup containing the servers). If e.g. a server goes down you
> >> have to detect that externally, and trigger a Heat update that
> >> removes it from the templates, redeploys a replacement server, and
> >> regenerates the server list before a replacement SoftwareDeployment
> >> is created. In contrast, Kubernetes is running on a cluster of
> >> servers, can use rules to determine where to run containers, and can
> >> very quickly redeploy without external intervention in response to a
> >> server or container falling over. (It also does rolling updates,
> >> which Heat can also do, albeit in a somewhat hacky way when it comes
> >> to SoftwareDeployments - which we're planning to fix.)
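
For anyone following along who hasn't used these resources, here is a
minimal HOT sketch of the pattern Zane describes - a ResourceGroup of
servers feeding a SoftwareDeploymentGroup. The image, flavor and resource
names are made up for illustration:

heat_template_version: 2016-04-08

resources:
  controller_config:
    type: OS::Heat::SoftwareConfig
    properties:
      group: script
      config: |
        #!/bin/bash
        echo "configure the controller service here"

  controller_servers:
    type: OS::Heat::ResourceGroup
    properties:
      count: 3
      resource_def:
        type: OS::Nova::Server
        properties:
          image: controller-image   # hypothetical image
          flavor: m1.large
          # needed for software deployments (the image also needs the agent)
          user_data_format: SOFTWARE_CONFIG

  controller_deployments:
    type: OS::Heat::SoftwareDeploymentGroup
    properties:
      config: {get_resource: controller_config}
      # map of resource names to server UUIDs, taken from the group above
      servers: {get_attr: [controller_servers, refs_map]}

If a server in that group dies, nothing in the template reacts by itself;
as Zane says, something external has to notice and trigger a stack update.
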
> >>
> >> So this seems like an opportunity: if the dependencies between
> >> services could be encoded in Heat templates rather than baked into
> >> the containers then we could use Heat as the orchestration layer
> >> following the dependency-based style I outlined in [1]. (TripleO is
> >> already moving in this direction with the way that composable-roles
> >> uses
> >> SoftwareDeploymentGroups.) One caveat is that fully using this style
> >> likely rules out for all practical purposes the current
> >> Pacemaker-based HA solution. We'd need to move to a lighter-weight HA
> >> solution, but I know that TripleO is considering that anyway.
> >>
> >> What's more though, assuming this could be made to work for a
> >> Kubernetes cluster, a couple of remappings in the Heat environment
> >> file should get you an otherwise-equivalent single-node non-HA
> >> deployment basically for free. That's particularly exciting to me
> >> because there are definitely deployments of TripleO that need HA
> >> clustering and deployments that don't and which wouldn't want to pay
> >> the complexity cost of running Kubernetes when they don't make any
> >> real use of it.
> >>
> >> So you'd have a Heat resource type for the controller cluster that
> >> maps to either an OS::Nova::Server or (the equivalent of) an
> >> OS::Magnum::Bay, and a bunch of software deployments that map to
> >> either an OS::Heat::SoftwareDeployment that calls (I assume)
> >> docker-compose directly or a Kubernetes Pod resource to be named
> >> later.
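
To make the remapping idea concrete, the two deployments might differ only
in their resource_registry; the type and file names below are hypothetical:

# kubernetes-ha.env.yaml - HA controllers on a Kubernetes cluster
resource_registry:
  MyCloud::ControllerCluster: cluster-magnum-bay.yaml
  MyCloud::ControllerSoftware: software-k8s-pod.yaml

# single-node.env.yaml - the same templates, non-HA, no Kubernetes
resource_registry:
  MyCloud::ControllerCluster: cluster-nova-server.yaml
  MyCloud::ControllerSoftware: software-deployment.yaml

The rest of the templates would reference only the abstract MyCloud::*
types, so switching between the two is just a matter of which environment
file gets passed to heat stack-create.
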
> >>
> >> The first obstacle is that we'd need that Kubernetes Pod resource in
> >> Heat. Currently there is no such resource type, and the OpenStack
> >> API that would be expected to provide it (Magnum's /containers
> >> endpoint) is being deprecated, so that's not a long-term solution.[2]
> >> Some folks from the Magnum community may or may not be working on a
> >> separate project (which may or may not be called Higgins) to do that.
> >> It'd be some time away though.
> >>
> >> An alternative, though not a good one, would be to create a
> >> Kubernetes resource type in Heat that has the credentials passed in
> >> somehow. I'm very against that though. Heat is just not good at
> >> handling credentials other than Keystone ones. We haven't ever
> >> created a resource type like this before, except for the Docker one
> >> in /contrib that serves as a prime example of what *not* to do. And
> >> if it doesn't make sense to wrap an OpenStack API around this then
> >> IMO it isn't going to make any more sense to wrap a Heat resource
> >> around it.
> >
> >There are ways to alleviate the credential handling issue. First,
> >Kubernetes supports Keystone authentication [1]. Magnum has a BP [2]
> >to turn on this feature. In addition, there is a Kubernetes Python
> >binding [3] under development. By combining all these efforts, it is
> >possible to create a Kubernetes resource in Heat without handling
> >credentials other than the Keystone ones.
> >
> >[1] http://kubernetes.io/docs/admin/authentication/
> >[2] https://blueprints.launchpad.net/magnum/+spec/keystone-for-k8s-bay
> >[3] https://github.com/openstack/python-k8sclient
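
Nothing like this exists yet, so take it purely as a sketch with invented
names, but with Keystone auth enabled on the Kubernetes side and the Python
binding available to Heat, a pod resource could eventually be as simple as:

parameters:
  bay_id:
    type: string
    description: UUID of the Magnum bay (or other k8s cluster) to target

resources:
  web_pod:
    type: OS::Kubernetes::Pod        # hypothetical resource type
    properties:
      cluster: {get_param: bay_id}   # hypothetical property
      definition:
        metadata:
          name: web
        spec:
          containers:
            - name: web
              image: nginx
              ports:
                - containerPort: 80

with Heat authenticating to the cluster using the user's Keystone token
rather than stored Kubernetes credentials.
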
> >
> >>
> >> A third option might be a SoftwareDeployment, possibly on one of the
> >> controller nodes themselves, that calls the k8s client. (We could
> >> create a software deployment hook to make this easy.) That would
> >> suffer from all of the same issues that TripleO currently has about
> >> having to choose a server on which to deploy though.
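
To illustrate that third option (again just a sketch; it assumes kubectl
and a kubeconfig are already present on the chosen node, and the names are
invented):

parameters:
  chosen_controller:
    type: string
    description: UUID of the server that will run the k8s client

resources:
  apply_manifests_config:
    type: OS::Heat::SoftwareConfig
    properties:
      group: script
      config: |
        #!/bin/bash
        # push the pod/service manifests at the cluster from one node
        kubectl create -f /opt/manifests/

  apply_manifests:
    type: OS::Heat::SoftwareDeployment
    properties:
      config: {get_resource: apply_manifests_config}
      # the arbitrarily-chosen server Zane mentions
      server: {get_param: chosen_controller}
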
> >
> >From my point of view, the Kubernetes Heat resource approach is
> >possibly more user-friendly than the SoftwareDeployment approach,
> >because the SoftwareDeployment and SoftwareDeploymentGroup resources
> >are very advanced and complex. It might take a while for users to
> >figure out how to use them. The requirement to build a custom image
> >is another barrier to entry. In Magnum, we explored the possibility
> >of leveraging SD/SDG in Atomic-based COEs, but paused that work until
> >the os-*-* tools are fully containerized [4] so that those resources
> >can work on any OS.
> >
> >[4] https://bugs.launchpad.net/magnum/+bug/1424969
> >
> >>
> >> The secondary obstacle is networking. TripleO has some pretty
> >> complicated networking requirements (specifically network isolation
> >> for the various services) that for now can't be supported when
> >> deploying a cluster with Magnum. The Kuryr project is working on
> >> improved networking for Magnum, but I don't know whether this is a
> >> use-case that would be covered.
> >
> >Sorry, I don't get this. Mind elaborating on the details of your
> >networking requirements?
> >
> >>
> >> There's also the issue that IIUC Magnum operates its Neutron L3
> >> agents in such a way that connectivity to the user nodes is
> >> guaranteed only if Magnum itself is running in an HA cloud. This is a
> >> problematic assumption in general, but it's particularly problematic
> >> in the case of the TripleO *undercloud*, which is not HA and which we
> >> very much do not want to be in the networking path for the overcloud
> >> controller nodes.
> >> Again, I don't know if this will be resolved by Kuryr or when.
> >>
> >> Magnum does offer the option to pass a custom template, and I assume
> >> that would allow us to set up the networking the way we want it.
> >> However, TripleO uses all kinds of tricks with the environment and
> >> parameters, so there'd quite likely need to be some enhancements to
> >> both Heat (in order to access the current environment from within a
> >> template) and Magnum (to pass an environment along with the template)
> >> to support that.
> >
> >Magnum prefers to leverage the Heat conditionals feature instead of
> >environments, because we expect conditionals will make our Heat
> >templates simpler and easier to maintain. If we can pass a parameter
> >to a Heat template and use conditionals to interpret it, I am not
> >sure we also need to support passing environments (it seems
> >conditionals can do whatever environments can do).
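
For example, once conditionals land (the syntax below follows the
in-progress Newton work, so treat it as approximate), a single template
could switch an HA-only resource on or off with a plain parameter instead
of a separate environment:

heat_template_version: 2016-10-14   # the version where conditions are expected to land

parameters:
  is_ha:
    type: boolean
    default: false

conditions:
  ha_enabled: {equals: [{get_param: is_ha}, true]}

resources:
  # created only when is_ha is true; a placeholder stands in for whatever
  # HA-only resource (load balancer, VIP, etc.) the real template would define
  ha_only_placeholder:
    type: OS::Heat::RandomString
    condition: ha_enabled
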
> >
> >>
> >> At that point it's a legitimate question to ask what exactly Magnum
> >> is buying us if TripleO has to maintain its own Kubernetes deployment
> >> templates anyway. I can think of only two things: an easier
> >> transition later if we do believe that the networking stuff will be
> >> resolved, and the /containers API. And the /containers API is being
> >> deprecated.
> >>
> >> In that sense, the Magnum/Higgins split could be a good thing for the
> >> Heat+Kubernetes use case in the long term - if we had a
> >> Keystone-authenticated API that can allow Heat to make use of any k8s
> >> cluster, not just those deployed via Magnum, then Magnum could be cut
> >> out of the loop in those cases where networking issues preclude its
> >> use.
> >
> >Wearing my Magnum PTL hat, I am sorry to hear that Magnum couldn't
> >resolve your problem immediately. Wearing my Higgins core hat, I am
> >thrilled that Higgins is under your consideration for the long term.
> 
> Who is the PTL and core team of Higgins?  I didn't see an announcement
> on the mailing list, although granted I probably missed it given the
> volume we have :)

Here is the core team: https://review.openstack.org/#/admin/groups/1382,members . There is no official Higgins PTL yet, since we haven't held a PTL election. For now, I am coordinating contributions and running the weekly team meeting, so you can consider me the acting PTL until an official one is elected. We will find the right time to hold the election.

> 
> Regards
> -steve
> >
> >>
> >> In the short term, though, there seems to be a number of obstacles.
> >> Perhaps some of the folks involved in the relevant projects could
> >> comment on when/if those are likely to be resolved.
> >>
> >> cheers,
> >> Zane.
> >>
> >> [1] http://lists.openstack.org/pipermail/openstack-dev/2016-March/090055.html
> >> [2] https://etherpad.openstack.org/p/newton-magnum-unified-abstraction


