[openstack-dev] [Magnum] Next auto-scaling feature design?

hieulq at vn.fujitsu.com hieulq at vn.fujitsu.com
Fri Aug 19 07:05:49 UTC 2016


Thanks for all the information.

Yeah, I hope that we can have a session about auto-scaling at this summit.

From: Ton Ngo [mailto:ton at us.ibm.com]
Sent: Friday, August 19, 2016 4:19 AM
To: OpenStack Development Mailing List (not for usage questions) <openstack-dev at lists.openstack.org>
Subject: Re: [openstack-dev] [Magnum] Next auto-scaling feature design?


We have had numerous discussions on this topic, including a presentation and a design session
in Tokyo, but we have not really arrived at a consensus yet. Part of the problem is that auto-scaling
at the container level is still being developed, so it is still a moving target.
However, a few points did emerge from the discussion (not necessarily consensus):

  *   It's preferable to have a single point of decision on auto-scaling for both the container and infrastructure levels.
One approach is to make this decision at the container orchestration level, so the infrastructure level would just
provide a service to handle requests to scale the infrastructure (a small sketch of such an interface follows below).
This would require coordinating support with upstream projects like Kubernetes. This approach also means that
we don't want a major component in Magnum to drive auto-scaling.
  *   It's good to have a policy-driven mechanism for auto-scaling to handle complex scenarios. For this, Senlin
is a candidate; an upstream project is another potential choice.
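
A minimal sketch of what such an infrastructure-level scaling service could look like (the interface below is
purely illustrative, not an agreed Magnum API):

    # Hypothetical node-scaling interface: the COE-level autoscaler decides
    # *when* to scale and only asks the infrastructure level to resize.
    class NodeGroupScaler(object):
        """Infrastructure-side service; it knows nothing about container metrics."""

        def resize(self, cluster_id, node_count):
            # e.g. update the Heat stack behind the cluster to the new node count
            raise NotImplementedError

    # A Kubernetes-level autoscaler (or any COE autoscaler) would then call
    # something like: scaler.resize("my-k8s-bay", current_count + 1)
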
We may want to revisit this topic as a design session at the next summit.
Ton Ngo,


From: Hongbin Lu <hongbin.lu at huawei.com>
To: "OpenStack Development Mailing List (not for usage questions)" <openstack-dev at lists.openstack.org<mailto:openstack-dev at lists.openstack.org>>
Date: 08/18/2016 12:26 PM
Subject: Re: [openstack-dev] [Magnum] Next auto-scaling feature design?
________________________________





> -----Original Message-----
> From: hieulq at vn.fujitsu.com [mailto:hieulq at vn.fujitsu.com]
> Sent: August-18-16 3:57 AM
> To: openstack-dev at lists.openstack.org
> Subject: [openstack-dev] [Magnum] Next auto-scaling feature design?
>
> Hi Magnum folks,
>
> I have some interest in our auto-scaling feature and am currently
> testing some container monitoring solutions such as Heapster,
> Telegraf and Prometheus. I have seen the PoC session done in cooperation
> with Senlin in Austin and have some questions regarding this design:
> - We have decided to move all container management from Magnum to Zun,
> so will there be only one level of scaling (node) instead of both node
> and container?
> - The PoC design shows that Magnum (Magnum Scaler) needs to depend on
> Heat/Ceilometer for gathering metrics and doing the scaling work based on
> auto-scaling policies, but is Heat/Ceilometer the best choice for
> Magnum auto-scaling?
>
> Currently, I see that Magnum only sends CPU and memory metrics to
> Ceilometer, and Heat can use these to decide the right scaling action.
> IMO, this approach has some problems; please take a look and give
> feedback:
> - The AutoScaling Policy and AutoScaling Resource of Heat cannot handle
> complex scaling policies. For example:
> If CPU > 80% then scale out
> If Mem < 40% then scale in
> -> What if CPU = 90% and Mem = 30%? Both rules fire and the policies conflict.
> There are some WIP patch sets for Heat conditional logic in [1]. But IMO,
> the conditional logic of Heat also cannot resolve conflicts between
> scaling policies. For example:
> If CPU > 80% and Mem > 70% then scale out
> If CPU < 30% or Mem < 50% then scale in
> -> What if CPU = 90% and Mem = 30%? The scale-in rule still fires even
> though CPU is overloaded.
> Thus, I think that we need to implement a Magnum scaler for validating
> policy conflicts (a sketch follows below).
> - Ceilometer may have trouble if we deploy thousands of COEs.
>
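> A minimal sketch of such a validator (the policy format and names are
> illustrative only, not an existing Magnum API):
>
>     # Toy scaling-policy validator: detects rules that fire at the same
>     # time but request opposite actions for the same metric sample.
>     from itertools import combinations
>     import operator
>
>     OPS = {">": operator.gt, "<": operator.lt}
>
>     def fires(rule, metrics):
>         # rule example: {"metric": "cpu", "op": ">", "threshold": 80,
>         #                "action": "scale_out"}
>         return OPS[rule["op"]](metrics[rule["metric"]], rule["threshold"])
>
>     def conflicting_rules(rules, metrics):
>         """Return pairs of rules that both fire but with opposite actions."""
>         fired = [r for r in rules if fires(r, metrics)]
>         return [(a, b) for a, b in combinations(fired, 2)
>                 if a["action"] != b["action"]]
>
>     rules = [
>         {"metric": "cpu", "op": ">", "threshold": 80, "action": "scale_out"},
>         {"metric": "mem", "op": "<", "threshold": 40, "action": "scale_in"},
>     ]
>     print(conflicting_rules(rules, {"cpu": 90, "mem": 30}))
>     # Both rules fire with opposite actions, so the scaler must resolve the
>     # conflict (e.g. by priority, or by refusing to act) before calling Heat.
>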
> I think we need a new design for the auto-scaling feature, not only for
> Magnum but also for Zun (because container-level scaling may be forked
> to Zun too). Here are some ideas:
> 1. Add a new field enable_monitor to the cluster template (ex-baymodel) and
> show the monitoring URL when cluster (bay) creation completes. For example,
> we can run Prometheus as a monitoring container for each cluster.
> (Heapster is the best choice for k8s, but not good enough for Swarm or
> Mesos.)
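>
> If the monitoring URL points at a per-cluster Prometheus, a consumer could
> query it over the standard Prometheus HTTP API. A small sketch (the PromQL
> expression assumes node_exporter metrics and is only an example):
>
>     import requests
>
>     def cluster_cpu_usage(prometheus_url):
>         # Prometheus HTTP API: GET /api/v1/query with a PromQL expression.
>         query = '100 - avg(rate(node_cpu{mode="idle"}[5m])) * 100'
>         resp = requests.get(prometheus_url + "/api/v1/query",
>                             params={"query": query}, timeout=10)
>         resp.raise_for_status()
>         result = resp.json()["data"]["result"]
>         return float(result[0]["value"][1]) if result else None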

[Hongbin Lu] Personally, I think this is a good idea.

> 2. Create a Magnum scaler manager (maybe a new service) that would:
> - Monitor clusters that have monitoring enabled and send metrics to
> Ceilometer if needed.
> - Manage user-defined scaling policies: not only CPU and memory but also
> other metrics like network bandwidth or CCU.
> - Validate user-defined scaling policies and trigger Heat for scaling
> actions (it can also trigger nova-scheduler for more scaling options).
> - Need a highly scalable architecture: as a first step we can implement a
> simple validator method, but in the future there are other approaches such
> as using fuzzy logic or AI to make an appropriate decision.
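>
> A rough sketch of the control loop such a scaler manager could run (all
> class and method names below are hypothetical, not an existing component):
>
>     import time
>
>     class ScalerManager(object):
>         """Hypothetical service watching clusters with monitoring enabled."""
>
>         def __init__(self, clusters, poll_interval=60):
>             self.clusters = clusters
>             self.poll_interval = poll_interval
>
>         def collect_metrics(self, cluster):
>             # e.g. query the cluster's Prometheus endpoint (see sketch above)
>             raise NotImplementedError
>
>         def decide(self, policies, metrics):
>             # validate the policies (conflict check) and pick one action or none
>             raise NotImplementedError
>
>         def apply(self, cluster, action):
>             # placeholder: in practice, update the Heat stack behind the
>             # cluster (or consult nova-scheduler for placement) to resize it
>             raise NotImplementedError
>
>         def run_forever(self):
>             while True:
>                 for cluster in self.clusters:
>                     metrics = self.collect_metrics(cluster)
>                     action = self.decide(cluster["policies"], metrics)
>                     if action:
>                         self.apply(cluster, action)
>                 time.sleep(self.poll_interval)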

[Hongbin Lu] I think this is a valid requirement, but I wonder why you want it in Magnum. However, if you have a valid reason to do that, you can create a custom bay driver. You can add logic to the custom driver to retrieve metrics from the monitoring URL and send them to Ceilometer. Users can pass a scaling policy via "labels" when they create the bay. The custom driver is responsible for validating the policy and triggering the action based on it. Does that satisfy your requirement?
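
For example, a user might pass something like the following labels at creation time (these key names are made
up purely for illustration; a custom bay driver would define and document its own):

    # Hypothetical labels the custom bay driver would parse and validate.
    scaling_labels = {
        "scale_out_rule": "cpu>80",
        "scale_in_rule": "mem<40",
        "min_node_count": "1",
        "max_node_count": "10",
    }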

>
> Some use cases for operators:
> - I want to create a k8s cluster, and if CCU or network bandwidth is
> high, please scale out X nodes in other regions.
> - I want to create a Swarm cluster, and if CPU or memory usage is too high,
> please scale out X nodes so that total CPU and memory usage stays at about 50%.
>
> What do you think about these above ideas/problems?
>
> [1]. https://blueprints.launchpad.net/heat/+spec/support-conditions-function
>
> Thanks,
> Hieu LE.
>
>

__________________________________________________________________________
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: OpenStack-dev-request at lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


