<div dir="ltr">Hi,<div><br></div><div>I use Aodh + Gnocchi for autoscaling. I also use Mistral + Zaqar for auto-healing. See the example below; I hope it helps.</div><div><br></div><div><br></div><div>Main template:</div><div><br></div><div>(...)</div><div><div>  mongocluster:</div><div>    type: OS::Heat::AutoScalingGroup</div><div>    properties:</div><div>      cooldown: 60</div><div>      desired_capacity: 2</div><div>      max_size: 3</div><div>      min_size: 1</div><div>      resource:</div><div>        type: ./mongocluster.yaml</div><div>        properties:</div><div>          network: { get_attr: [ voicis_network, be_om_net ] }</div><div>          flavor: { get_param: flavor }<br></div><div>          image: { get_param: image }</div><div>          key_name: { get_param: key_name }</div><div>          base_mgmt_security_group: { get_attr: [ security_groups, base_mgmt ] }</div><div>          mongodb_security_group: { get_attr: [ security_groups, mongodb ] }</div><div>          root_stack_id: {get_param: "OS::stack_id"}<br></div><div>          metadata: {"metering.server_group": {get_param: "OS::stack_id"}}</div></div><div><br></div><div><br></div><div><div>  mongodb_scaleup_policy:</div><div>    type: OS::Heat::ScalingPolicy</div><div>    properties:</div><div>      adjustment_type: change_in_capacity</div><div>      auto_scaling_group_id: {get_resource: mongocluster}</div><div>      cooldown: 60</div><div>      scaling_adjustment: 1</div><div><br></div><div>  mongodb_scaledown_policy:</div><div>    type: OS::Heat::ScalingPolicy</div><div>    properties:</div><div>      adjustment_type: change_in_capacity</div><div>      auto_scaling_group_id: {get_resource: mongocluster}</div><div>      cooldown: 60</div><div>      scaling_adjustment: -1</div><div><br></div></div><div><div>  cpu_alarm_high:</div><div>    type: OS::Aodh::GnocchiAggregationByResourcesAlarm</div><div>    properties:</div><div>      description: Scale up if the average CPU > 80% for 5 minutes</div><div> 
     metric: cpu_util</div><div>      aggregation_method: mean</div><div>      granularity: 300</div><div>      evaluation_periods: 1</div><div>      threshold: 80</div><div>      resource_type: instance</div><div>      comparison_operator: gt</div><div>      alarm_actions:</div><div>        - str_replace:</div><div>            template: trust+url</div><div>            params:</div><div>              url: {get_attr: [mongodb_scaleup_policy, signal_url]}</div><div>      query:</div><div>        str_replace:</div><div>          template: '{"=": {"server_group": "stack_id"}}'</div><div>          params:</div><div>            stack_id: {get_param: "OS::stack_id"}</div><div><br></div><div>  cpu_alarm_low:</div><div>    type: OS::Aodh::GnocchiAggregationByResourcesAlarm</div><div>    properties:</div><div>      metric: cpu_util</div><div>      aggregation_method: mean</div><div>      granularity: 300</div><div>      evaluation_periods: 1</div><div>      threshold: 5</div><div>      resource_type: instance</div><div>      comparison_operator: lt</div><div>      alarm_actions:</div><div>        - str_replace:</div><div>            template: trust+url</div><div>            params:</div><div>              url: {get_attr: [mongodb_scaledown_policy, signal_url]}</div><div>      query:</div><div>        str_replace:</div><div>          template: '{"=": {"server_group": "stack_id"}}'</div><div>          params:</div><div>            stack_id: {get_param: "OS::stack_id"}</div></div><div><br></div><div><div>outputs:</div><div>  mongo_stack_id:<br></div><div>    description: UUID of the cluster nested stack</div><div>    value: {get_resource: mongocluster}</div><div>  scale_up_url:</div><div>    description: ></div><div>      This URL is the webhook to scale up the autoscaling group.  
You</div><div>      can invoke the scale-up operation by doing an HTTP POST to this</div><div>      URL; no body or extra headers are needed.</div><div>    value: {get_attr: [mongodb_scaleup_policy, alarm_url]}</div><div>  scale_dn_url:</div><div>    description: ></div><div>      This URL is the webhook to scale down the autoscaling group.</div><div>      You can invoke the scale-down operation by doing an HTTP POST to</div><div>      this URL; no body or extra headers are needed.</div><div>    value: {get_attr: [mongodb_scaledown_policy, alarm_url]}</div><div>  ceilometer_query:</div><div>    value:</div><div>      str_replace:</div><div>        template: ></div><div>          ceilometer statistics -m cpu_util</div><div>          -q metadata.user_metadata.server_group=stackval -p 60 -a avg</div><div>        params:</div><div>          stackval: { get_param: "OS::stack_id" }</div><div>    description: ></div><div>      This is a Ceilometer query for statistics on the cpu_util meter</div><div>      Samples about OS::Nova::Server instances in this stack.  The -q</div><div>      parameter selects Samples according to the subject's metadata.</div><div>      When a VM's metadata includes an item of the form metering.X=Y,</div><div>      the corresponding Ceilometer resource has a metadata item of the</div><div>      form user_metadata.X=Y and samples about resources so tagged can</div><div>      be queried with a Ceilometer query term of the form</div><div>      metadata.user_metadata.X=Y.  
In this case the nested stacks give</div><div>      their VMs metadata that is passed as a nested stack parameter,</div><div>      and this stack passes metadata of the form metering.server_group=Y,</div><div>      where Y is this stack's ID.</div></div><div><br></div><div><br></div><div><br></div><div><br></div><div>mongocluster.yaml</div><div><br></div><div><div>heat_template_version: ocata</div><div><br></div><div>description: ></div><div>  MongoDB cluster node</div></div><div><br></div><div><br></div><div><div>metadata:</div><div>    type: json</div><div><br></div><div>  root_stack_id:</div><div>    type: string</div><div>    default: ""</div><div><br></div><div>conditions:</div><div>    is_standalone: {equals: [{get_param: root_stack_id}, ""]}</div><div><br></div></div><div><br></div><div>resources:<br></div><div><br></div><div><div>mongodbserver:</div><div>    type: OS::Nova::Server</div><div>    properties:</div><div>      name: { str_replace: { params: { random_string: { get_resource: random_str }, __zone__: { get_param: zone } }, template: mongodb-random_string.__zone__ } }</div><div>      image: { get_param: image }</div><div>      flavor: { get_param: flavor }</div><div>      metadata: {get_param: metadata}</div><div>      key_name: { get_param: key_name }</div><div>      networks:</div><div>        - port: { get_resource: om_port }</div><div>      user_data_format: SOFTWARE_CONFIG<br></div><div>      user_data: { get_resource: server_clu_init }</div><div><br></div><div>  alarm_queue:</div><div>    type: OS::Zaqar::Queue</div><div><br></div><div>  error_event_alarm:</div><div>    type: OS::Aodh::EventAlarm</div><div>    properties:</div><div>      event_type: compute.instance.update</div><div>      query:</div><div>        - field: traits.instance_id</div><div>          value: {get_resource: mongodbserver}</div><div>          op: eq</div><div>        - field: traits.state</div><div>          value: error</div><div>          op: eq</div><div>      
alarm_queues:</div><div>       - {get_resource: alarm_queue}</div><div><br></div><div>  deleted_event_alarm:</div><div>    type: OS::Aodh::EventAlarm</div><div>    properties:</div><div>      event_type: compute.instance.delete.start</div><div>      query:</div><div>        - field: traits.instance_id</div><div>          value: {get_resource: mongodbserver}</div><div>          op: eq</div><div>      alarm_queues:</div><div>       - {get_resource: alarm_queue}</div><div><br></div><div>  alarm_cache_wait:</div><div>    type: OS::Heat::TestResource</div><div>    properties:</div><div>      action_wait_secs:</div><div>        create: 60</div><div>        update: 60</div><div>      value:</div><div>        list_join:</div><div>          - ''</div><div>          - - {get_attr: [error_event_alarm, show]}</div><div>            - {get_attr: [deleted_event_alarm, show]}</div><div><br></div><div>  alarm_subscription:</div><div>    type: OS::Zaqar::MistralTrigger</div><div>    properties:</div><div>      queue_name: {get_resource: alarm_queue}</div><div>      workflow_id: {get_resource: autoheal}</div><div>      input:</div><div>        stack_id: {get_param: "OS::stack_id"}</div><div>        root_stack_id:</div><div>          if:</div><div>            - is_standalone</div><div>            - {get_param: "OS::stack_id"}</div><div>            - {get_param: "root_stack_id"}</div><div><br></div><div>  autoheal:</div><div>    type: OS::Mistral::Workflow</div><div>    properties:</div><div>      description: ></div><div>        Mark a server as unhealthy and commence a stack update to replace it.</div><div>      input:</div><div>        stack_id:</div><div>        root_stack_id:</div><div>      type: direct</div><div>      tasks:</div><div>        - name: resources_mark_unhealthy</div><div>          action:</div><div>            list_join:</div><div>              - ' '</div><div>              - - heat.resources_mark_unhealthy</div><div>                - stack_id=<% $.stack_id 
%></div><div>                - resource_name=<% env().notification.body.reason_data.event.traits.where($[0] = 'instance_id').select($[2]).first() %></div><div>                - mark_unhealthy=true</div><div>                - resource_status_reason='Marked by alarm'</div><div>          on_success:</div><div>            - stacks_update</div><div>        - name: stacks_update</div><div>          action: heat.stacks_update stack_id=<% $.root_stack_id %> existing=true</div><div><br></div><div>outputs:</div><div>  private_mgmt_ip:</div><div>    description: IP address in private management network</div><div>    value: { get_attr: [ be_om_port, fixed_ips, 0, ip_address ] }</div><div>  internal_ip:</div><div>    description: IP address in private rt network</div><div>    value: { get_attr: [ be_rt_port, fixed_ips, 0, ip_address ] }</div><div>  OS::stack_id:</div><div>    description: The server UUID</div><div>    value: {get_resource: mongodbserver}</div><div>    condition: {not: is_standalone}</div></div><div><br></div><div><br></div><div>Best Regards</div><div><br></div></div><div class="gmail_extra"><br><div class="gmail_quote">On Wed, Aug 16, 2017 at 1:44 PM, Krzysztof Świątek <span dir="ltr"><<a href="mailto:krzysztof.swiatek@corp.ovh.com" target="_blank">krzysztof.swiatek@corp.ovh.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">Hi,<br>

<br>

I have a question about alarms in OpenStack.<br>

<br>

I want autoscaling with Heat, and I'm looking for a metrics/alarm project<br>

that I can use with Heat.<br>

I found that I can use Monasca or Ceilometer (with Aodh).<br>

My question is:<br>

Are any of you using Heat (autoscaling) in production?<br>

If so, what are you using (Monasca, Ceilometer, other) for metrics and<br>

alarms, and why?<br>

<span class="HOEnZb"><font color="#888888"><br>

--<br>

Regards,<br>

Krzysztof Świątek<br>

<br>


</font></span></blockquote></div><br></div>