[openstack-dev] [Magnum] Next auto-scaling feature design?
ton at us.ibm.com
Thu Aug 18 21:19:01 UTC 2016
We have had numerous discussions on this topic, including a presentation and
a design session in Tokyo, but we have not really arrived at a consensus yet.
Part of the problem is that auto-scaling at the container level is still
being developed, so it is still a moving target.
However, a few points did emerge from the discussion (not necessarily
- It's preferable to have a single point of decision on auto-scaling for
  both the container and infrastructure levels.
- One approach is to make this decision at the container orchestration
  level, so the infrastructure level would just provide the service to
  handle requests to scale the infrastructure. This would require
  coordinating support with upstream projects such as Kubernetes. This
  approach also means that we don't want a major component in Magnum to
- It's good to have a policy-driven mechanism for auto-scaling to handle
  complex scenarios. For this, Senlin is a candidate; upstream is another
  potential choice.
We may want to revisit this topic as a design session in the next summit.
From: Hongbin Lu <hongbin.lu at huawei.com>
To: "OpenStack Development Mailing List (not for usage questions)"
<openstack-dev at lists.openstack.org>
Date: 08/18/2016 12:26 PM
Subject: Re: [openstack-dev] [Magnum] Next auto-scaling feature design?
> -----Original Message-----
> From: hieulq at vn.fujitsu.com [mailto:hieulq at vn.fujitsu.com]
> Sent: August-18-16 3:57 AM
> To: openstack-dev at lists.openstack.org
> Subject: [openstack-dev] [Magnum] Next auto-scaling feature design?
> Hi Magnum folks,
> I am interested in our auto-scaling features and am currently
> testing some container monitoring solutions such as heapster,
> telegraf and prometheus. I have seen the PoC session in cooperation with
> Senlin in Austin and have some questions regarding this design:
> - We have decided to move all container management from Magnum to Zun,
> so is there only one level of scaling (node) instead of both node and
> - The PoC design shows that Magnum (Magnum Scaler) needs to depend on
> Heat/Ceilometer for gathering metrics and doing the scaling work based on
> auto-scaling policies, but is Heat/Ceilometer the best choice for
> Magnum auto-scaling?
> Currently, I see that Magnum only sends CPU and memory metrics to
> Ceilometer, and Heat can grab these to decide the right scaling method.
> IMO, this approach has some problems, please take a look and give
> - The AutoScaling Policy and AutoScaling Resource of Heat cannot handle
> complex scaling policies. For example:
> If CPU > 80% then scale out
> If Mem < 40% then scale in
> -> If CPU = 90% and Mem = 30%, the two policies conflict.
> There are some WIP patch sets for Heat conditional logic in . But IMO,
> the conditional logic of Heat also cannot resolve conflicts between
> scaling policies. For example:
> If CPU > 80% and Mem > 70% then scale out
> If CPU < 30% or Mem < 50% then scale in
> -> What if CPU = 90% and Mem = 30%?
> Thus, I think that we need to implement a Magnum scaler that validates
> policies for conflicts.
> - Ceilometer may have trouble if we deploy thousands of COEs.
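The conflict described in the quoted examples can be checked mechanically. The following is only a minimal sketch, not part of Heat, Magnum or Senlin: the `Rule` class and the `triggered_actions` and `has_conflict` helpers are hypothetical names made up for illustration, and the thresholds are the ones from the examples above.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Set

Metrics = Dict[str, float]


@dataclass
class Rule:
    """Hypothetical scaling rule: an action plus a predicate over metrics."""
    name: str
    action: str  # "scale_out" or "scale_in"
    condition: Callable[[Metrics], bool]


def triggered_actions(rules: List[Rule], metrics: Metrics) -> Set[str]:
    """Return the set of actions whose rules fire for this metric sample."""
    return {rule.action for rule in rules if rule.condition(metrics)}


def has_conflict(rules: List[Rule], metrics: Metrics) -> bool:
    """A sample is conflicting when it triggers more than one distinct action."""
    return len(triggered_actions(rules, metrics)) > 1


# First example from the thread: two independent single-metric rules.
simple_rules = [
    Rule("cpu_high", "scale_out", lambda m: m["cpu"] > 80),
    Rule("mem_low", "scale_in", lambda m: m["mem"] < 40),
]

# Second example from the thread: compound conditions.
compound_rules = [
    Rule("busy", "scale_out", lambda m: m["cpu"] > 80 and m["mem"] > 70),
    Rule("idle", "scale_in", lambda m: m["cpu"] < 30 or m["mem"] < 50),
]

sample = {"cpu": 90.0, "mem": 30.0}
print(has_conflict(simple_rules, sample))
print(triggered_actions(compound_rules, sample))
```

With this sample the first rule set triggers both actions at once. In the second rule set only the scale-in rule fires, so the problem there is of a different kind: nodes would be removed while CPU is at 90%. A real validator would likely need to reason over ranges of metric values rather than single samples.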
> I think we need a new design for the auto-scaling feature, not for Magnum
> only but also for Zun (because container-level scaling may be forked
> to Zun too). Here are some ideas:
> 1. Add a new field enable_monitor to the cluster template (ex-baymodel) and
> show the monitoring URL when cluster (bay) creation completes. For
> example, we can use Prometheus as the monitoring container for each cluster.
> (Heapster is the best choice for k8s, but not good enough for swarm or
[Hongbin Lu] Personally, I think this is a good idea.
> 2. Create a Magnum scaler manager (maybe a new service):
> - Monitor clusters that have monitoring enabled and send metrics to Ceilometer if
> - Manage user-defined scaling policies: not only CPU and memory but also
> other metrics such as network bandwidth and CCU.
> - Validate user-defined scaling policies and trigger Heat for scaling
> actions. (It can also trigger nova-scheduler for more scaling options.)
> - This needs a highly scalable architecture. As a first step we can
> implement a simple validator method, but in the future there are other
> approaches, such as using fuzzy logic or AI to make an appropriate decision.
[Hongbin Lu] I think this is a valid requirement but I wonder why you want
it in Magnum. However, if you have a valid reason to do that, you can
create a custom bay driver. You can add logic to the custom driver to
retrieve metrics from the monitoring URL and send them to Ceilometer.
Users can pass a scaling policy via "labels" when they create the bay. The
custom driver is responsible for validating the policy and triggering the
action based on it. Does that satisfy your requirement?
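As a rough illustration of passing a policy through labels, here is a minimal sketch of the validation step a custom driver could perform. It assumes labels arrive as a flat mapping of string keys to string values. The label keys `scale_out_cpu_gt` and `scale_in_cpu_lt` and the `parse_policy` function are hypothetical and are not defined by Magnum.

```python
from typing import Dict

# Hypothetical label keys, invented for this sketch; Magnum does not define them.
THRESHOLD_KEYS = ("scale_out_cpu_gt", "scale_in_cpu_lt")


def parse_policy(labels: Dict[str, str]) -> Dict[str, float]:
    """Extract and validate CPU thresholds (in percent) from string labels."""
    policy: Dict[str, float] = {}
    for key in THRESHOLD_KEYS:
        if key not in labels:
            raise ValueError(f"missing label: {key}")
        try:
            value = float(labels[key])
        except ValueError:
            raise ValueError(f"label {key} is not a number: {labels[key]!r}")
        if not 0.0 <= value <= 100.0:
            raise ValueError(f"label {key} must be between 0 and 100")
        policy[key] = value

    # Reject overlapping thresholds, which would let one metric value
    # trigger both scale-in and scale-out.
    if policy["scale_in_cpu_lt"] >= policy["scale_out_cpu_gt"]:
        raise ValueError("scale-in threshold must be below scale-out threshold")
    return policy


print(parse_policy({"scale_out_cpu_gt": "80", "scale_in_cpu_lt": "30"}))
```

Rejecting a bad policy at bay creation time, rather than when a scaling action fires, keeps the failure close to the user who supplied the labels.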
> Some use cases for operators:
> - I want to create a k8s cluster, and if CCU or network bandwidth is
> high, please scale out X nodes in other regions.
> - I want to create a swarm cluster, and if CPU or memory is too high,
> please scale out X nodes to make sure total CPU and memory is about 50%.
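For the second use case, the value of X can be estimated with simple arithmetic. The sketch below assumes homogeneous nodes and a load that spreads evenly across the cluster after scaling, neither of which is stated in the thread. The function name `nodes_to_add` is hypothetical.

```python
import math


def nodes_to_add(current_nodes: int, cpu_pct: float, mem_pct: float,
                 target_pct: float = 50.0) -> int:
    """Estimate how many nodes to add so the busier resource drops to target.

    Assumes identical nodes and evenly redistributed load, so the total load
    (nodes * utilisation) stays constant while the node count grows.
    """
    if current_nodes <= 0 or target_pct <= 0:
        raise ValueError("current_nodes and target_pct must be positive")
    worst = max(cpu_pct, mem_pct)  # size for the more constrained resource
    required = math.ceil(current_nodes * worst / target_pct)
    return max(0, required - current_nodes)


# 4 nodes at 90% CPU and 60% memory, aiming for about 50% utilisation.
print(nodes_to_add(4, 90, 60))
```

In this example the CPU load of 4 x 90% needs 8 nodes to reach 50%, so 4 nodes are added. Memory alone would need only 5, which is why the calculation sizes for the busier resource.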
> What do you think about the above ideas/problems?
> . https://blueprints.launchpad.net/heat/+spec/support-conditions-
> Hieu LE.
> OpenStack Development Mailing List (not for usage questions)
> Unsubscribe: OpenStack-dev-
> request at lists.openstack.org?subject:unsubscribe