[openstack-dev] [TripleO][Kolla][Heat][Higgins][Magnum][Kuryr] Gap analysis: Heat as a k8s orchestrator

Steven Dake (stdake) stdake at cisco.com
Sun May 29 19:29:21 UTC 2016


Quick question below.

On 5/28/16, 1:16 PM, "Hongbin Lu" <hongbin.lu at huawei.com> wrote:

>
>
>> -----Original Message-----
>> From: Zane Bitter [mailto:zbitter at redhat.com]
>> Sent: May-27-16 6:31 PM
>> To: OpenStack Development Mailing List
>> Subject: [openstack-dev] [TripleO][Kolla][Heat][Higgins][Magnum][Kuryr]
>> Gap analysis: Heat as a k8s orchestrator
>> 
>> I spent a bit of time exploring the idea of using Heat as an external
>> orchestration layer on top of Kubernetes - specifically in the case of
>> TripleO controller nodes but I think it could be more generally useful
>> too - but eventually came to the conclusion that it doesn't work yet,
>> and probably won't for a while. Nevertheless, I think it's helpful to
>> document this a bit to help other people avoid going down the same path, and
>> also to help us focus on working toward the point where it _is_
>> possible, since I think there are other contexts where it would be
>> useful too.
>> 
>> We tend to refer to Kubernetes as a "Container Orchestration Engine"
>> but it does not actually do any orchestration, unless you count just
>> starting everything at roughly the same time as 'orchestration'. Which
>> I wouldn't. You generally handle any orchestration requirements between
>> services within the containers themselves, possibly using external
>> services like etcd to co-ordinate. (The Kubernetes project refers to
>> this as "choreography", and explicitly disclaim any attempt at
>> orchestration.)
>> 
>> What Kubernetes *does* do is more like an actively-managed version of
>> Heat's SoftwareDeploymentGroup (emphasis on the _Group_). Brief recap:
>> SoftwareDeploymentGroup is a type of ResourceGroup; you give it a map
>> of resource names to server UUIDs and it creates a SoftwareDeployment
>> for each server. You have to generate the list of servers somehow to
>> give it (the easiest way is to obtain it from the output of another
>> ResourceGroup containing the servers). If e.g. a server goes down you
>> have to detect that externally, and trigger a Heat update that removes
>> it from the templates, redeploys a replacement server, and regenerates
>> the server list before a replacement SoftwareDeployment is created. In
>> contrast, Kubernetes is running on a cluster of servers, can use rules
>> to determine where to run containers, and can very quickly redeploy
>> without external intervention in response to a server or container
>> falling over. (It also does rolling updates, which Heat can also do
>> albeit in a somewhat hacky way when it comes to SoftwareDeployments -
>> which we're planning to fix.)
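>> 
>> For reference, a minimal sketch of that pattern (the template version,
>> resource names and config contents here are illustrative, not from any
>> real deployment):
>> 
>>   heat_template_version: 2015-10-15
>> 
>>   resources:
>>     controller_group:
>>       type: OS::Heat::ResourceGroup
>>       properties:
>>         count: 3
>>         resource_def:
>>           type: OS::Nova::Server
>>           properties:
>>             image: controller-image          # illustrative
>>             flavor: m1.large                 # illustrative
>>             # required for SoftwareDeployments to reach the server
>>             user_data_format: SOFTWARE_CONFIG
>> 
>>     service_config:
>>       type: OS::Heat::SoftwareConfig
>>       properties:
>>         group: script
>>         config: |
>>           #!/bin/sh
>>           echo "configure the service here"
>> 
>>     service_deployments:
>>       type: OS::Heat::SoftwareDeploymentGroup
>>       properties:
>>         config: {get_resource: service_config}
>>         # the map of resource names to server UUIDs, generated here
>>         # from the output of the ResourceGroup holding the servers
>>         servers: {get_attr: [controller_group, refs_map]}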
>> 
>> So this seems like an opportunity: if the dependencies between services
>> could be encoded in Heat templates rather than baked into the
>> containers then we could use Heat as the orchestration layer following
>> the dependency-based style I outlined in [1]. (TripleO is already
>> moving in this direction with the way that composable-roles uses
>> SoftwareDeploymentGroups.) One caveat is that fully using this style
>> likely rules out the current Pacemaker-based HA solution for all
>> practical purposes. We'd need to move to a lighter-weight HA solution, but I
>> know that TripleO is considering that anyway.
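>> 
>> A sketch of what that dependency-based style might look like (the
>> service names are purely illustrative, reusing the controller_group
>> and config pattern from above):
>> 
>>   resources:
>>     database_deployments:
>>       type: OS::Heat::SoftwareDeploymentGroup
>>       properties:
>>         config: {get_resource: database_config}
>>         servers: {get_attr: [controller_group, refs_map]}
>> 
>>     api_deployments:
>>       type: OS::Heat::SoftwareDeploymentGroup
>>       # Heat sequences the deployments, so the containers no longer
>>       # need to poll and wait for the database themselves
>>       depends_on: database_deployments
>>       properties:
>>         config: {get_resource: api_config}
>>         servers: {get_attr: [controller_group, refs_map]}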
>> 
>> What's more though, assuming this could be made to work for a
>> Kubernetes cluster, a couple of remappings in the Heat environment file
>> should get you an otherwise-equivalent single-node non-HA deployment
>> basically for free. That's particularly exciting to me because there
>> are definitely deployments of TripleO that need HA clustering and
>> deployments that don't, and the latter wouldn't want to pay the
>> complexity cost of running Kubernetes when they make no real use of it.
>> 
>> So you'd have a Heat resource type for the controller cluster that maps
>> to either an OS::Nova::Server or (the equivalent of) an OS::Magnum::Bay,
>> and a bunch of software deployments that map to either a
>> OS::Heat::SoftwareDeployment that calls (I assume) docker-compose
>> directly or a Kubernetes Pod resource to be named later.
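>> 
>> i.e. something like this pair of environment files (the resource type
>> names here are hypothetical):
>> 
>>   # clustered-env.yaml
>>   resource_registry:
>>     MyCloud::ControllerCluster: templates/k8s-bay.yaml
>>     MyCloud::ServiceDeployment: templates/k8s-pod.yaml
>> 
>>   # single-node-env.yaml
>>   resource_registry:
>>     MyCloud::ControllerCluster: templates/single-server.yaml
>>     MyCloud::ServiceDeployment: templates/docker-compose.yaml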
>> 
>> The first obstacle is that we'd need that Kubernetes Pod resource in
>> Heat. Currently there is no such resource type, and the OpenStack API
>> that would be expected to provide it (Magnum's /container
>> endpoint) is being deprecated, so that's not a long-term solution.[2]
>> Some folks from the Magnum community may or may not be working on a
>> separate project (which may or may not be called Higgins) to do that.
>> It'd be some time away though.
>> 
>> An alternative, though not a good one, would be to create a Kubernetes
>> resource type in Heat that has the credentials passed in somehow. I'm
>> very much against that, though. Heat is just not good at handling credentials
>> other than Keystone ones. We haven't ever created a resource type like
>> this before, except for the Docker one in /contrib that serves as a
>> prime example of what *not* to do. And if it doesn't make sense to wrap
>> an OpenStack API around this then IMO it isn't going to make any more
>> sense to wrap a Heat resource around it.
>
>There are ways to alleviate the credential handling issue. First,
>Kubernetes supports Keystone authentication [1], and Magnum has a BP [2]
>to turn on this feature. In addition, there is a Kubernetes Python
>binding [3] under development. By combining all these efforts, it should
>be possible to create a Kubernetes resource in Heat without handling any
>credentials other than the Keystone ones.
>
>[1] http://kubernetes.io/docs/admin/authentication/
>[2] https://blueprints.launchpad.net/magnum/+spec/keystone-for-k8s-bay
>[3] https://github.com/openstack/python-k8sclient
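>
>For illustration, such a resource might be used in a template something
>like this (the OS::Kubernetes::Pod type is hypothetical; nothing like it
>exists in Heat today):
>
>  resources:
>    web_pod:
>      type: OS::Kubernetes::Pod            # hypothetical
>      properties:
>        bay: {get_resource: k8s_bay}
>        definition:
>          containers:
>            - name: web
>              image: nginx
>
>Heat would authenticate to the k8s API with the user's Keystone token
>via the Python binding, so no extra credentials would need to be stored.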
>
>> 
>> A third option might be a SoftwareDeployment, possibly on one of the
>> controller nodes themselves, that calls the k8s client. (We could
>> create a software deployment hook to make this easy.) That would suffer
>> from all of the same issues that TripleO currently has about having to
>> choose a server on which to deploy though.
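>> 
>> Roughly, using the existing 'script' hook until a dedicated one exists
>> (the manifest path and parameter name are illustrative):
>> 
>>   resources:
>>     pod_config:
>>       type: OS::Heat::SoftwareConfig
>>       properties:
>>         group: script
>>         config: |
>>           #!/bin/sh
>>           kubectl create -f /etc/kubernetes/manifests/web-pod.yaml
>> 
>>     pod_deployment:
>>       type: OS::Heat::SoftwareDeployment
>>       properties:
>>         config: {get_resource: pod_config}
>>         # we still have to pick one server to run the client on
>>         server: {get_param: chosen_controller_uuid}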
>
>From my point of view, the Kubernetes Heat resources approach is possibly
>more user-friendly than the SoftwareDeployment approach, because the
>SoftwareDeployment and SoftwareDeploymentGroup resources are quite
>advanced and complex and it might take a while for users to figure out
>how to use them. The requirement to build a custom image is another
>barrier to entry. In Magnum, we explored the possibility of leveraging
>SD/SDG in Atomic-based COEs, but stopped going down that path until the
>os-*-* tools have been fully containerized [4] so that those resources
>can work on any OS.
>
>[4] https://bugs.launchpad.net/magnum/+bug/1424969
>
>> 
>> The secondary obstacle is networking. TripleO has some pretty
>> complicated networking requirements (specifically network isolation for
>> the various services) that for now can't be supported when deploying a
>> cluster with Magnum. The Kuryr project is working on improved
>> networking for Magnum, but I don't know whether this is a use-case that
>> would be covered.
>
>Sorry, I don't get this. Would you mind elaborating on your network
>requirements?
>
>> 
>> There's also the issue that IIUC Magnum operates its Neutron L3 agents
>> in such a way that connectivity to the user nodes is guaranteed only if
>> Magnum itself is running in an HA cloud. This is a problematic
>> assumption in general, but it's particularly problematic in the case of
>> the TripleO *undercloud*, which is not HA and which we very much do not
>> want to be in the networking path for the overcloud controller nodes.
>> Again, I don't know if this will be resolved by Kuryr or when.
>> 
>> Magnum does offer the option to pass a custom template, and I assume
>> that would allow us to set up the networking the way we want it.
>> However, TripleO uses all kinds of tricks with the environment and
>> parameters, so there'd quite likely need to be some enhancements to
>> both Heat (in order to access the current environment from within a
>> template) and Magnum (to pass an environment along with the template)
>> to support that.
>
>Magnum prefers to leverage the Heat conditionals feature rather than
>environments, because we expect conditionals will make our Heat templates
>simpler and easier to maintain. If we can pass a parameter to the Heat
>template and use conditionals to interpret it, I am not sure we need to
>support passing environments as well (it seems conditionals can do
>whatever environments can do).
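>
>Roughly, based on the syntax proposed for Newton (the parameter and
>resource names are just for illustration):
>
>  parameters:
>    is_ha:
>      type: boolean
>      default: false
>
>  conditions:
>    deploy_ha: {get_param: is_ha}
>    deploy_single: {not: deploy_ha}
>
>  resources:
>    ha_cluster:
>      type: OS::Magnum::Bay              # illustrative
>      condition: deploy_ha
>    single_node:
>      type: OS::Nova::Server             # illustrative
>      condition: deploy_single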
>
>> 
>> At that point it's a legitimate question to ask what exactly Magnum is
>> buying us if TripleO has to maintain its own Kubernetes deployment
>> templates anyway. I can think of only two things: an easier transition
>> later if we do believe that the networking stuff will be resolved, and
>> the /containers API. And the /containers API is being deprecated.
>> 
>> In that sense, the Magnum/Higgins split could be a good thing for the
>> Heat+Kubernetes use case in the long term - if we had a
>> Keystone-authenticated API that can allow Heat to make use of any k8s
>> cluster, not just those deployed via Magnum, then Magnum could be cut
>> out of the loop in those cases where networking issues preclude its use.
>
>Wearing my Magnum PTL hat, I am sorry to hear that Magnum couldn't
>resolve your problem immediately. Wearing my Higgins core hat, I am
>thrilled that Higgins is under consideration for the long term.

Who are the PTL and core team of Higgins?  I didn't see an announcement on
the mailing list, although granted I probably missed it given the volume
we have :)

Regards
-steve
>
>> 
>> In the short term, though, there seems to be a number of obstacles.
>> Perhaps some of the folks involved in the relevant projects could
>> comment on when/if those are likely to be resolved.
>> 
>> cheers,
>> Zane.
>> 
>> [1] http://lists.openstack.org/pipermail/openstack-dev/2016-March/090055.html
>> [2] https://etherpad.openstack.org/p/newton-magnum-unified-abstraction
>> 