[openstack-dev] Fwd: [Senlin]Support more complicated scaling scenario

jun.xu.sz at 139.com jun.xu.sz at 139.com
Mon Nov 23 07:11:01 UTC 2015


Thanks Hu Yanyan, I still have some questions about your answers, and I respond inline.


2015-11-20 16:06 GMT+08:00 xujun at cmss.chinamobile.com <xujun at cmss.chinamobile.com>:
Thanks Yanyan!

Xu Jun is a contributor from CMCC. He asked a very interesting question about cluster scaling support in Senlin. To make the discussion more thorough, I'm just posting the question and my answer here.

The question from Jun is as follows:

For an action, Senlin will check all attached policies; for example, if a cluster has two scaling-in policies attached, both of them will be checked when a scaling action is performed on the cluster. This is not the same as OS::Heat::ScalingPolicy in Heat, is it?
How should I use Senlin for the following cases?
1.  15% < cpu_util < 30%: scale in 1 instance
2.  cpu_util < 15%: scale in 2 instances

This is a very interesting question, and you're right about the difference between the Senlin ScalingPolicy and OS::Heat::ScalingPolicy. In Heat, OS::Heat::ScalingPolicy is actually not just a policy: it is a combination of a webhook and a rule about how the ASG responds to the webhook being triggered (a resource signal). So you can define two different OS::Heat::ScalingPolicy instances to handle the two cases you described, respectively (see the sketch below).
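
As an illustration only, a minimal Heat template fragment along these lines could declare two scale-in rules against one group (the resource names, image and flavor below are placeholders, not taken from this thread):

heat_template_version: 2015-10-15

resources:
  asg:
    type: OS::Heat::AutoScalingGroup
    properties:
      min_size: 1
      max_size: 10
      resource:
        type: OS::Nova::Server
        properties:
          image: cirros          # placeholder image
          flavor: m1.tiny        # placeholder flavor

  scale_in_by_1:                 # case 1: remove one instance
    type: OS::Heat::ScalingPolicy
    properties:
      auto_scaling_group_id: {get_resource: asg}
      adjustment_type: change_in_capacity
      scaling_adjustment: -1

  scale_in_by_2:                 # case 2: remove two instances
    type: OS::Heat::ScalingPolicy
    properties:
      auto_scaling_group_id: {get_resource: asg}
      adjustment_type: change_in_capacity
      scaling_adjustment: -2

Each OS::Heat::ScalingPolicy exposes its own signal URL (the alarm_url attribute), so each alarm triggers exactly one rule.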

But in Senlin, a ScalingPolicy is a REAL policy: it only describes how a Senlin cluster reacts to an action triggered by a Senlin webhook, which is defined separately. The problem is that when a cluster action, e.g. CLUSTER_SCALE_IN, is triggered, all policies attached to the cluster will be checked in sequence based on the policies' priority definitions. So if you create two Senlin ScalingPolicy objects and attach them to the same cluster, only one of them will actually take effect.

# 1.  But in the policy_check function, all the policies will be checked in priority-based order for a CLUSTER_SCALE_IN action if the cluster is attached with multiple SCALING policies.
       Is this a bug? Or what is the significance of priority?
            Sorry, I didn't describe it clearly. I mean that although both scaling policies will be checked before the CLUSTER_SCALE_IN action is executed, the count result from one ScalingPolicy will actually be overridden by the result from the other ScalingPolicy, which has higher priority.
          
After debugging it, I found that the former result is not overridden by the other policy:
http://git.openstack.org/cgit/openstack/senlin/tree/senlin/engine/actions/base.py#n441
    2. If a cluster is attached with a scaling policy whose event = CLUSTER_SCALE_IN, the policy will also be checked when a CLUSTER_SCALE_OUT action is triggered; is this reasonable?
         When a ScalingPolicy is defined, you can use the 'event' property to specify the action type you want the policy to take effect on, like:
         http://git.openstack.org/cgit/openstack/senlin/tree/examples/policies/scaling_policy.yaml#n5

         Although a ScalingPolicy will be checked for both CLUSTER_SCALE_IN and CLUSTER_SCALE_OUT actions, the check routine returns immediately if the action type is not the one it is expecting:
         http://git.openstack.org/cgit/openstack/senlin/tree/senlin/policies/scaling_policy.py#n133
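
         For readers without the link handy, such a spec looks roughly like the following (a sketch, not a verbatim copy of the linked file; treat the exact property layout as an assumption):

         # hypothetical sketch of a Senlin scaling policy spec
         type: senlin.policy.scaling
         version: 1.0
         properties:
           event: CLUSTER_SCALE_IN      # the only action type this policy reacts to
           adjustment:
             type: CHANGE_IN_CAPACITY   # scale by a fixed number of nodes
             number: 1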

Yes, in pre_op it's not checked, but all ScalingPolicies will still be checked for whether they are within their cooldown:
http://git.openstack.org/cgit/openstack/senlin/tree/senlin/engine/actions/base.py#n431 


Currently, you can use the following approach to support your use case in Senlin (see the sketch after this list):
1. Define two Senlin webhooks which target the CLUSTER_SCALE_IN action of the cluster, and specify the 'param' as {'count': 1} for webhook1 and {'count': 2} for webhook2;
2. Define two ceilometer/aodh alarms, with the first one matching case 1 and the second one matching case 2. Then set webhook1's URL as alarm1's alarm-action and webhook2's URL as alarm2's alarm-action.
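
As an illustration only, the alarm side of step 2 could be declared through Heat resources roughly as follows. The webhook URLs are placeholders for the URLs Senlin returns when the webhooks are created, and note that a single threshold alarm can only capture the 'cpu_util < 30%' side of case 1's 15%-30% band:

resources:
  alarm1:
    type: OS::Ceilometer::Alarm
    properties:
      meter_name: cpu_util
      statistic: avg
      period: 60
      evaluation_periods: 1
      comparison_operator: lt
      threshold: 30              # case 1: cpu_util < 30%
      alarm_actions:
        - <webhook1-url>         # placeholder: webhook1, param {'count': 1}

  alarm2:
    type: OS::Ceilometer::Alarm
    properties:
      meter_name: cpu_util
      statistic: avg
      period: 60
      evaluation_periods: 1
      comparison_operator: lt
      threshold: 15              # case 2: cpu_util < 15%
      alarm_actions:
        - <webhook2-url>         # placeholder: webhook2, param {'count': 2}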

#
Your suggestion has a problem when I want a different cooldown for each ceilometer/aodh alarm. For the following cases, what should I do?
1.  15% < cpu_util < 30%: scale in 1 instance with a 300s cooldown
2.  cpu_util < 15%: scale in 2 instances with a 600s cooldown
     You can define the cooldown by specifying it when creating the policy or when attaching it to a cluster. The cooldown check logic will prevent a policy from taking effect if its cooldown is still in progress:
     http://git.openstack.org/cgit/openstack/senlin/tree/senlin/engine/actions/base.py#n431
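
     As a rough sketch only (where exactly the cooldown lives in the spec is an assumption here, not verified against the Senlin schema), the two policies would then differ like this:

     # hypothetical: two otherwise identical policies, different cooldowns
     type: senlin.policy.scaling
     version: 1.0
     properties:
       event: CLUSTER_SCALE_IN
       adjustment:
         type: CHANGE_IN_CAPACITY
         number: 1               # 2 in the second policy
         cooldown: 300           # assumed placement; 600 in the second policy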

Yes, we can define a cooldown for each policy, but I want each CLUSTER_SCALE_IN action to be checked by only one scaling policy, like OS::Heat::ScalingPolicy.
In Heat, we can define two scale-in actions (by defining two OS::Heat::ScalingPolicy policies); each scale-in action is checked by one OS::Heat::ScalingPolicy, so each scale-in action's cooldown is only checked in that one policy.
But in Senlin, each scale-in action will be checked by all attached scaling policies, so all the scaling policies' cooldowns will be checked.



For a Senlin webhook, could we assign a specific policy to be checked?
     The user is not allowed to specify the policy when defining a webhook. The webhook target is decided by the target object (cluster or node) and the target action type.

Then each time alarm1 is triggered, the cluster will be scaled in by a count of 1, which means one node will be removed from the cluster. When alarm2 is triggered, the cluster will be scaled in by a count of 2, meaning two nodes will be removed.

The question you asked is really interesting, and we did consider supporting this kind of requirement using a 'complex' ScalingPolicy which defines the trigger (alarm), the webhook and some scaling rules together. But after some discussion, we felt that maybe we should let some higher-level service or end user define this kind of 'POLICY', since it's more like a workflow definition than a description of the rule for cluster scaling. So currently, we only provide atomic operations (e.g. webhook, 'simple' ScalingPolicy) in Senlin, while leaving the work of combining these operations to support a use case to the end user or higher-level service.

Thanks a lot for raising this interesting question, and I do agree that we should discuss it further to decide whether we need to adjust our policy design to support this kind of scenario more smoothly.

--
Best regards,

Yanyan




