[openstack-dev] [heat] Kubernetes AutoScaling with Heat AutoScalingGroup and Ceilometer

Qiming Teng tengqim at linux.vnet.ibm.com
Tue Apr 28 14:29:34 UTC 2015


On Mon, Apr 27, 2015 at 12:28:01PM -0400, Rabi Mishra wrote:
> Hi All,
> 
> Deploying a Kubernetes (k8s) cluster on any OpenStack-based cloud for container-based workloads is a standard deployment pattern. However, auto-scaling this cluster based on load would require some integration between k8s and OpenStack components. While looking at the option of leveraging Heat ASG to achieve autoscaling, I came across a few requirements that the list can discuss and arrive at the best possible solution for.
> 
> A typical k8s deployment scenario on OpenStack would be as below.
> 
> - Master (single VM)
> - Minions/Nodes (AutoScalingGroup)
> 
> AutoScaling of the cluster would involve both scaling of minions/nodes and scaling of Pods (ReplicationControllers).
> 
> 1. Scaling Nodes/Minions:
> 
> We already have utilization stats collected at the hypervisor level, as the ceilometer compute agent polls the local libvirt daemon to acquire performance data for the local instances/nodes.

I really doubt that those metrics are useful for triggering a scaling
operation. My suspicion is based on two assumptions: 1) autoscaling
requests should come from the user application or service, not from the
control plane, since the application knows best whether scaling is
needed; 2) hypervisor-level metrics may be misleading in some cases.
For example, they cannot give an accurate CPU utilization number in the
presence of CPU overcommitment, which is a common practice: a guest
starved by contention on an overcommitted host (i.e. with high steal
time) can still look lightly loaded from the hypervisor's point of view.

> Also, the Kubelet (running on each node) collects cAdvisor stats. However, cAdvisor stats are not fed back to the scheduler at present, and the scheduler uses a simple round-robin method for scheduling.

It looks like a multi-layer resource management problem which needs a
holistic design. I'm not quite sure whether scheduling at the container
layer alone can help improve resource utilization.

> Req 1: We would need a way to push stats from the kubelet/cAdvisor to ceilometer, either directly or via the master (using heapster). Alarms based on these stats can then be used to scale the ASG up/down.

To send a sample to ceilometer for triggering autoscaling, we will need
some user credentials to authenticate with keystone (even with trusts).
We need to pass the project-id in and out so that ceilometer will know
the correct scope for evaluation. We also need a standard way to tag
samples with the stack ID and maybe also the ASG ID. I'd love to see
this done transparently, i.e. no matching_metadata or query confusions.
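
For illustration, a minimal sketch of how an agent might push one such
tagged sample with python-ceilometerclient; the meter name and the
metadata key layout below are assumptions, not an agreed convention:

    # Hypothetical sketch: post one gauge sample to Ceilometer, tagged so
    # that an alarm scoped on the stack can find it (key layout assumed).
    from ceilometerclient import client

    cclient = client.get_client(
        2,                                      # Ceilometer API v2
        os_username='demo',
        os_password='secret',
        os_tenant_name='demo',
        os_auth_url='http://keystone:5000/v2.0')

    cclient.samples.create(
        counter_name='k8s.node.cpu_util',       # assumed meter name
        counter_type='gauge',
        counter_unit='%',
        counter_volume=73.5,
        resource_id='k8s-minion-1',
        resource_metadata={
            'user_metadata.stack': 'REPLACE_WITH_STACK_ID',
            'user_metadata.asg_id': 'REPLACE_WITH_ASG_ID',
        })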

> There is an existing blueprint[1] for an inspector implementation for the docker hypervisor (nova-docker). However, we would probably require an agent running on the nodes or the master, sending the cAdvisor or heapster stats to ceilometer. I've seen some discussions on the possibility of leveraging keystone trusts with the ceilometer client.

An agent is needed, definitely.
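
As a rough illustration of what that agent could do on each node, here
is a sketch that derives a CPU utilization percentage from cAdvisor's
REST API (the port and API path are assumptions based on kubelet
defaults); the result could feed a sample like the one sketched above:

    # Hypothetical agent loop: poll cAdvisor on the local node and compute
    # machine-wide CPU utilization over a short window.
    import time
    import requests

    CADVISOR = 'http://localhost:4194/api/v1.3'   # assumed cAdvisor endpoint

    def node_cpu_util(interval=10):
        """Percent CPU utilization across all cores over `interval` seconds."""
        def total_cpu_ns():
            # the root container ('/') carries cumulative CPU time in ns
            info = requests.get(CADVISOR + '/containers/').json()
            return info['stats'][-1]['cpu']['usage']['total']
        cores = requests.get(CADVISOR + '/machine').json()['num_cores']
        start = total_cpu_ns()
        time.sleep(interval)
        delta = total_cpu_ns() - start
        return 100.0 * delta / (interval * 1e9 * cores)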

> Req 2: The AutoScaling group is expected to notify the master that a node has been added/removed. Before removing a node, the master/scheduler has to mark the node as unschedulable.

A little bit confused here ... are we scaling the containers or the
nodes or both?
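
Assuming it is the nodes being scaled, the "mark unschedulable" part of
Req 2 is at least already expressible against the k8s API. A rough
sketch (the master endpoint and API version are assumptions):

    # Hypothetical sketch: cordon a node before the ASG removes it; the same
    # effect as `kubectl patch node <name> -p '{"spec":{"unschedulable":true}}'`.
    import json
    import requests

    K8S = 'http://k8s-master:8080/api/v1'   # assumed insecure master endpoint

    def cordon(node_name):
        requests.patch(
            '%s/nodes/%s' % (K8S, node_name),
            headers={'Content-Type': 'application/merge-patch+json'},
            data=json.dumps({'spec': {'unschedulable': True}}))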

> Req 3: Notify containers/pods that the node is about to be removed, so that they can stop accepting traffic and persist data. It would also require a cooldown period before the node removal.

There have been some discussions on sending messages, but so far I
don't think there is a conclusion on a generic solution.
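
Short of a generic messaging mechanism, one stop-gap for Req 3 might be
the pod deletion grace period: deleting the pods on a doomed node with
a grace period gives each container a SIGTERM and a window to stop
accepting traffic and persist data. A sketch, with the same assumed
endpoint as above and the field selector name also an assumption:

    # Hypothetical drain: delete every pod bound to a node, with a grace
    # period so containers can shut down cleanly.
    import json
    import requests

    K8S = 'http://k8s-master:8080/api/v1'   # assumed

    def drain(node_name, grace_seconds=30):
        pods = requests.get(
            K8S + '/pods',
            params={'fieldSelector': 'spec.nodeName=%s' % node_name}).json()
        for pod in pods.get('items', []):
            requests.delete(
                '%s/namespaces/%s/pods/%s' % (
                    K8S,
                    pod['metadata']['namespace'],
                    pod['metadata']['name']),
                headers={'Content-Type': 'application/json'},
                data=json.dumps({'kind': 'DeleteOptions',
                                 'apiVersion': 'v1',
                                 'gracePeriodSeconds': grace_seconds}))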

Just my $0.02.

BTW, we have been looking into similar problems in the Senlin project.

Regards,
  Qiming

> Both requirements 2 and 3 would probably involve generating scaling event notifications/signals for the master and containers to consume, and perhaps some ASG lifecycle hooks.
> 
> 
> Req 4: In case of too many 'pending' pods waiting to be scheduled, the scheduler would signal the ASG to scale up. This is similar to Req 1.
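
This part looks mechanically simple once an OS::Heat::ScalingPolicy is
in place, since a plain POST to its pre-signed alarm_url attribute
triggers the policy. A hypothetical watcher on the master (the
threshold and endpoints are assumptions):

    # Hypothetical watcher: if too many pods are Pending, hit the pre-signed
    # scale-up webhook exposed by an OS::Heat::ScalingPolicy (alarm_url).
    import requests

    K8S = 'http://k8s-master:8080/api/v1'       # assumed
    SCALE_UP_URL = 'http://REPLACE_WITH_ALARM_URL'
    PENDING_LIMIT = 5                           # assumed threshold

    pending = requests.get(
        K8S + '/pods',
        params={'fieldSelector': 'status.phase=Pending'}).json()
    if len(pending.get('items', [])) > PENDING_LIMIT:
        requests.post(SCALE_UP_URL)             # empty POST triggers the policy
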
> 
> 
> 2. Scaling Pods
> 
> Currently, manual scaling of pods is possible by resizing ReplicationControllers. The k8s community is working on an abstraction, AutoScaler[2], on top of the ReplicationController (RC) that provides intention/rule-based autoscaling. There would be a requirement to collect cAdvisor/Heapster stats to signal the AutoScaler too, though that is probably beyond the scope of OpenStack.
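
For reference, the manual resize mentioned here is just a replica-count
update on the RC, something along these lines (endpoint assumed as
before):

    # Hypothetical sketch: scale a ReplicationController by patching its
    # replica count; roughly what `kubectl resize rc <name> --replicas=N` does.
    import json
    import requests

    K8S = 'http://k8s-master:8080/api/v1'   # assumed

    def resize_rc(namespace, name, replicas):
        requests.patch(
            '%s/namespaces/%s/replicationcontrollers/%s'
            % (K8S, namespace, name),
            headers={'Content-Type': 'application/merge-patch+json'},
            data=json.dumps({'spec': {'replicas': replicas}}))
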
> 
> Any thoughts and ideas on how to realize this use-case would be appreciated.
> 
> 
> [1] https://review.openstack.org/gitweb?p=openstack%2Fceilometer-specs.git;a=commitdiff;h=6ea7026b754563e18014a32e16ad954c86bd8d6b
> [2] https://github.com/GoogleCloudPlatform/kubernetes/blob/master/docs/proposals/autoscaling.md
> 
> Regards,
> Rabi Mishra
