[openstack-dev] [magnum] Discuss the idea of manually managing the bay nodes

Fox, Kevin M Kevin.Fox at pnnl.gov
Thu Jun 2 23:54:50 UTC 2016


As an operator whose clouds are partitioned into different host aggregates with different flavors targeting them, I totally believe we will have users who want a single k8s cluster to span multiple different flavor types. I'm sure once I deploy Magnum, I will want it too. You could have special hardware on some nodes and not on others, but you can still have cattle if you have enough of them and the labels are set appropriately. Labels allow you to keep partitioning things when you need to and to ignore the partitioning when you don't, making administration significantly easier.
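
For concreteness, here is a hedged sketch of the kind of partitioning I mean; the aggregate, host, metadata key, and flavor names below are placeholders, and it assumes the AggregateInstanceExtraSpecsFilter scheduler filter is enabled:

# Group the GPU hosts into an aggregate and tag it.
nova aggregate-create gpu-hosts
nova aggregate-set-metadata gpu-hosts hardware=gpu
nova aggregate-add-host gpu-hosts compute-gpu-01
# Tie a flavor to that aggregate so instances of it only land on GPU hosts.
nova flavor-key gpu.large set aggregate_instance_extra_specs:hardware=gpu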

Say I have a tenant with 5 GPU nodes and 10 regular nodes allocated to a k8s cluster. I may want 30 instances of container X that don't care where they land, plus 5 instances that need CUDA and should prefer the GPU nodes. The former can be deployed with a k8s Deployment, the latter with a DaemonSet. Both work well and are very non-pet-ish. The whole tenant can be viewed through a single pane of glass, making it easy to manage. A hedged sketch of how that might look follows.
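
As a rough sketch only (the label key, node name, image names, and app name below are placeholders I am inventing for illustration, not anything Magnum produces), the label-plus-DaemonSet pattern on a stock k8s cluster could look like:

# Label the GPU nodes (repeat for each of the 5 GPU nodes).
kubectl label node gpu-node-1 hardware=gpu

# 30 replicas that can land anywhere (with the client of that era,
# kubectl run created a Deployment).
kubectl run app-x --image=example/app-x:latest --replicas=30

# One CUDA pod per GPU node, via a DaemonSet restricted by nodeSelector.
cat <<EOF | kubectl create -f -
apiVersion: extensions/v1beta1
kind: DaemonSet
metadata:
  name: cuda-worker
spec:
  template:
    metadata:
      labels:
        app: cuda-worker
    spec:
      nodeSelector:
        hardware: gpu
      containers:
      - name: cuda-worker
        image: example/cuda-worker:latest
EOF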

Thanks,
Kevin
________________________________________
From: Adrian Otto [adrian.otto at rackspace.com]
Sent: Thursday, June 02, 2016 4:24 PM
To: OpenStack Development Mailing List (not for usage questions)
Subject: Re: [openstack-dev] [magnum] Discuss the idea of manually managing the bay nodes

I am really struggling to accept the idea of heterogeneous clusters. My experience causes me to question whether a heterogeneous cluster makes sense for Magnum. I will try to explain why I have this hesitation:

1) If you have a heterogeneous cluster, it suggests that you are using external intelligence to manage the cluster, rather than relying on it to be self-managing. This is an anti-pattern that I refer to as “pets” rather than “cattle”. The anti-pattern results in brittle deployments that rely on external intelligence to manage (upgrade, diagnose, and repair) the cluster. The automation of the management is much harder when a cluster is heterogeneous.

2) If you have a heterogeneous cluster, it can fall out of balance. This means that if one of your “important” or “large” members fails, there may not be adequate remaining members in the cluster to continue operating properly in the degraded state. The logic for tracking and dealing with this needs to be handled. It’s much simpler in the homogeneous case.

3) Heterogeneous clusters are complex compared to homogeneous clusters. They are harder to work with, and that usually means that unplanned outages are more frequent and last longer than they would with a homogeneous cluster.

Summary:

Heterogeneous:
  - Complex
  - Prone to imbalance upon node failure
  - Less reliable

Homogeneous:
  - Simple
  - Don’t get imbalanced when a min_members concept is supported by the cluster controller
  - More reliable

My bias is to assert that applications that want a heterogeneous mix of system capacities at a node level should be deployed on multiple homogeneous bays, not a single heterogeneous one. That way you end up with a composition of simple systems rather than a larger complex one.
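
As a rough sketch only (the baymodel and bay names, flavors, and node counts below are placeholders, and required baymodel arguments such as image, keypair, and networking are elided), that composition could be expressed with the existing client as:

# One homogeneous bay per node "shape", instead of one mixed bay.
magnum baymodel-create --name std-model --coe kubernetes --flavor-id m1.medium ...
magnum baymodel-create --name gpu-model --coe kubernetes --flavor-id gpu.large ...
magnum bay-create --name std-bay --baymodel std-model --node-count 10
magnum bay-create --name gpu-bay --baymodel gpu-model --node-count 5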

Adrian


> On Jun 1, 2016, at 3:02 PM, Hongbin Lu <hongbin.lu at huawei.com> wrote:
>
> Personally, I think this is a good idea, since it can address a set of similar use cases like the ones below:
> * I want to deploy a k8s cluster across 2 availability zones (in the future, 2 regions/clouds).
> * I want to spin up N nodes in AZ1, M nodes in AZ2.
> * I want to scale the number of nodes in specific AZ/region/cloud. For example, add/remove K nodes from AZ1 (with AZ2 untouched).
>
> The use cases above should be very common and apply almost everywhere. To address them, Magnum needs to support provisioning a heterogeneous set of nodes at deploy time and managing them at runtime. It looks like the proposed idea (manually managing individual nodes or groups of nodes) can address this requirement very well, and I cannot think of an alternative solution.
>
> Therefore, I vote to support the proposed idea.
>
> Best regards,
> Hongbin
>
>> -----Original Message-----
>> From: Hongbin Lu
>> Sent: June-01-16 11:44 AM
>> To: OpenStack Development Mailing List (not for usage questions)
>> Subject: RE: [openstack-dev] [magnum] Discuss the idea of manually
>> managing the bay nodes
>>
>> Hi team,
>>
>> A blueprint was created for tracking this idea:
>> https://blueprints.launchpad.net/magnum/+spec/manually-manage-bay-nodes
>> I won't approve the BP until there is a team decision on
>> accepting/rejecting the idea.
>>
>> From the discussion at the design summit, it looks like everyone is OK
>> with the idea in general (with some disagreement on the API style).
>> However, from the last team meeting, it looks like some people disagree
>> with the idea fundamentally, so I re-raised it on the ML to re-discuss.
>>
>> If you agree or disagree with the idea of manually managing the Heat
>> stacks (that contain individual bay nodes), please write down your
>> arguments here. Then we can start debating it.
>>
>> Best regards,
>> Hongbin
>>
>>> -----Original Message-----
>>> From: Cammann, Tom [mailto:tom.cammann at hpe.com]
>>> Sent: May-16-16 5:28 AM
>>> To: OpenStack Development Mailing List (not for usage questions)
>>> Subject: Re: [openstack-dev] [magnum] Discuss the idea of manually
>>> managing the bay nodes
>>>
>>> The discussion at the summit was very positive around this
>>> requirement, but as this change will have a large impact on Magnum,
>>> it will need a spec.
>>>
>>> On the API side of things, I was thinking of a slightly more generic
>>> approach that incorporates other lifecycle operations into the same
>>> API, e.g.:
>>> magnum bay-manage <bay> <life-cycle-op>
>>>
>>> magnum bay-manage <bay> reset --hard
>>> magnum bay-manage <bay> rebuild
>>> magnum bay-manage <bay> node-delete <name/uuid>
>>> magnum bay-manage <bay> node-add --flavor <flavor>
>>> magnum bay-manage <bay> node-reset <name>
>>> magnum bay-manage <bay> node-list
>>>
>>> Tom
>>>
>>> From: Yuanying OTSUKA <yuanying at oeilvert.org>
>>> Reply-To: "OpenStack Development Mailing List (not for usage
>>> questions)" <openstack-dev at lists.openstack.org>
>>> Date: Monday, 16 May 2016 at 01:07
>>> To: "OpenStack Development Mailing List (not for usage questions)"
>>> <openstack-dev at lists.openstack.org>
>>> Subject: Re: [openstack-dev] [magnum] Discuss the idea of manually
>>> managing the bay nodes
>>>
>>> Hi,
>>>
>>> I think users will also want to specify which node to delete,
>>> so we should manage each “node” individually.
>>>
>>> For example:
>>> $ magnum node-create --bay …
>>> $ magnum node-list --bay
>>> $ magnum node-delete $NODE_UUID
>>>
>>> Anyway, if Magnum wants to manage the lifecycle of container
>>> infrastructure, this feature is necessary.
>>>
>>> Thanks
>>> -yuanying
>>>
>>>
>>> On Mon, 16 May 2016 at 07:50, Hongbin Lu <hongbin.lu at huawei.com> wrote:
>>> Hi all,
>>>
>>> This is a continued discussion from the design summit. For recap,
>>> Magnum manages bay nodes by using ResourceGroup from Heat. This
>>> approach works, but it makes it infeasible to manage heterogeneity
>>> across bay nodes, which is a frequently demanded feature. As an
>>> example, there is a request to provision bay nodes across
>>> availability zones [1]. There is another request to provision bay
>>> nodes with different sets of flavors [2]. For the requested features
>>> above, ResourceGroup won't work very well.
>>>
>>> The proposal is to remove the usage of ResourceGroup and manually
>>> create a Heat stack for each bay node. For example, for creating a
>>> cluster with 2 masters and 3 minions, Magnum is going to manage 6
>>> Heat stacks (instead of 1 big Heat stack as right now; a rough sketch
>>> of what that could look like from the Heat CLI follows the list):
>>> * A kube cluster stack that manages the global resources
>>> * Two kube master stacks that manage the two master nodes
>>> * Three kube minion stacks that manage the three minion nodes
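>>>
>>> As an illustration only (the stack names, template files, and
>>> parameters below are invented for this sketch, not part of the
>>> proposal), the per-node split could look roughly like this from the
>>> Heat CLI:
>>>
>>> heat stack-create -f kubecluster-global.yaml kube-cluster-global
>>> heat stack-create -f kubemaster.yaml kube-master-0
>>> heat stack-create -f kubemaster.yaml kube-master-1
>>> heat stack-create -f kubeminion.yaml -P flavor=m1.small kube-minion-0
>>> heat stack-create -f kubeminion.yaml -P flavor=m1.small kube-minion-1
>>> heat stack-create -f kubeminion.yaml -P flavor=m1.medium kube-minion-2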
>>>
>>> The proposal might require an additional API endpoint to manage nodes
>>> or a group of nodes. For example:
>>> $ magnum nodegroup-create --bay XXX --flavor m1.small --count 2 \
>>>     --availability-zone us-east-1 …
>>> $ magnum nodegroup-create --bay XXX --flavor m1.medium --count 3 \
>>>     --availability-zone us-east-2 …
>>>
>>> Thoughts?
>>>
>>> [1] https://blueprints.launchpad.net/magnum/+spec/magnum-availability-zones
>>> [2] https://blueprints.launchpad.net/magnum/+spec/support-multiple-flavor
>>>
>>> Best regards,
>>> Hongbin
>>>
