[Openstack-operators] [openstack-operators][Ceilometer vs Monasca] Alarms: Ceilometer vs Monasca

Pedro Sousa pgsousa at gmail.com
Wed Aug 16 16:04:41 UTC 2017


Hi,

I use Aodh + Gnocchi for autoscaling, and Mistral + Zaqar for
auto-healing. The Heat templates below show how; I hope they help.


Main template:

(...)
  mongocluster:
    type: OS::Heat::AutoScalingGroup
    properties:
      cooldown: 60
      desired_capacity: 2
      max_size: 3
      min_size: 1
      resource:
        type: ./mongocluster.yaml
        properties:
          network: { get_attr: [ voicis_network, be_om_net ] }
          flavor: { get_param: flavor }
          image: { get_param: image }
          key_name: { get_param: key_name }
          base_mgmt_security_group: { get_attr: [ security_groups, base_mgmt ] }
          mongodb_security_group: { get_attr: [ security_groups, mongodb ] }
          root_stack_id: {get_param: "OS::stack_id"}
          metadata: {"metering.server_group": {get_param: "OS::stack_id"}}


  mongodb_scaleup_policy:
    type: OS::Heat::ScalingPolicy
    properties:
      adjustment_type: change_in_capacity
      auto_scaling_group_id: {get_resource: mongocluster}
      cooldown: 60
      scaling_adjustment: 1

  mongodb_scaledown_policy:
    type: OS::Heat::ScalingPolicy
    properties:
      adjustment_type: change_in_capacity
      auto_scaling_group_id: {get_resource: mongocluster}
      cooldown: 60
      scaling_adjustment: -1

  cpu_alarm_high:
    type: OS::Aodh::GnocchiAggregationByResourcesAlarm
    properties:
      description: Scale up if the average CPU > 80% over one 5-minute period
      metric: cpu_util
      aggregation_method: mean
      granularity: 300
      evaluation_periods: 1
      threshold: 80
      resource_type: instance
      comparison_operator: gt
      alarm_actions:
        - str_replace:
            template: trust+url
            params:
              url: {get_attr: [mongodb_scaleup_policy, signal_url]}
      query:
        str_replace:
          template: '{"=": {"server_group": "stack_id"}}'
          params:
            stack_id: {get_param: "OS::stack_id"}

  cpu_alarm_low:
    type: OS::Aodh::GnocchiAggregationByResourcesAlarm
    properties:
      metric: cpu_util
      aggregation_method: mean
      granularity: 300
      evaluation_periods: 1
      threshold: 5
      resource_type: instance
      comparison_operator: lt
      alarm_actions:
        - str_replace:
            template: trust+url
            params:
              url: {get_attr: [mongodb_scaledown_policy, signal_url]}
      query:
        str_replace:
          template: '{"=": {"server_group": "stack_id"}}'
          params:
            stack_id: {get_param: "OS::stack_id"}

outputs:
  mongo_stack_id:
    description: UUID of the cluster nested stack
    value: {get_resource: mongocluster}
  scale_up_url:
    description: >
      This URL is the webhook to scale up the autoscaling group.  You
      can invoke the scale-up operation by doing an HTTP POST to this
      URL; no body nor extra headers are needed.
    value: {get_attr: [mongodb_scaleup_policy, alarm_url]}
  scale_dn_url:
    description: >
      This URL is the webhook to scale down the autoscaling group.
      You can invoke the scale-down operation by doing an HTTP POST to
      this URL; no body nor extra headers are needed.
    value: {get_attr: [mongodb_scaledown_policy, alarm_url]}
  ceilometer_query:
    value:
      str_replace:
        template: >
          ceilometer statistics -m cpu_util
          -q metadata.user_metadata.server_group=stackval -p 60 -a avg
        params:
          stackval: { get_param: "OS::stack_id" }
    description: >
      This is a Ceilometer query for statistics on the cpu_util meter
      Samples about OS::Nova::Server instances in this stack.  The -q
      parameter selects Samples according to the subject's metadata.
      When a VM's metadata includes an item of the form metering.X=Y,
      the corresponding Ceilometer resource has a metadata item of the
      form user_metadata.X=Y and samples about resources so tagged can
      be queried with a Ceilometer query term of the form
      metadata.user_metadata.X=Y.  In this case the nested stacks give
      their VMs metadata that is passed as a nested stack parameter,
      and this stack passes a metadata of the form metering.server_group=Y,
      where Y is this stack's ID.
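
The scale_up_url and scale_dn_url outputs of the main template describe
webhooks that are triggered by an empty HTTP POST. As a minimal sketch of
what that call could look like from Python (standard library only): the
function names and the example URL are illustrative assumptions, not part
of the template, and the real URL is whatever Heat reports for the output.

```python
import urllib.request


def build_scale_request(webhook_url: str) -> urllib.request.Request:
    """Build the empty-bodied POST described by the scale_up_url output."""
    # data=b"" gives an empty body; the method is set explicitly for clarity.
    return urllib.request.Request(webhook_url, data=b"", method="POST")


def trigger_scaling(webhook_url: str, timeout: float = 10.0) -> int:
    """Send the POST and return the HTTP status code of the response."""
    request = build_scale_request(webhook_url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status


if __name__ == "__main__":
    # Hypothetical placeholder URL; substitute the real output value.
    demo = build_scale_request("http://heat.example.com/placeholder-signal-url")
    print(demo.get_method(), demo.full_url)
```

Calling trigger_scaling() with the scale_up_url value should have the same
effect as the alarm firing, which makes it a convenient way to check the
scaling policy without waiting for CPU load.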




mongocluster.yaml:

heat_template_version: ocata

description: >
  MongoDB cluster node


parameters:
  (...)
  metadata:
    type: json

  root_stack_id:
    type: string
    default: ""

conditions:
    is_standalone: {equals: [{get_param: root_stack_id}, ""]}


resources:

  mongodbserver:
    type: OS::Nova::Server
    properties:
      name:
        str_replace:
          template: mongodb-random_string.__zone__
          params:
            random_string: { get_resource: random_str }
            __zone__: { get_param: zone }
      image: { get_param: image }
      flavor: { get_param: flavor }
      metadata: {get_param: metadata}
      key_name: { get_param: key_name }
      networks:
        - port: { get_resource: om_port }
      user_data_format: SOFTWARE_CONFIG
      user_data: { get_resource: server_clu_init }

  alarm_queue:
    type: OS::Zaqar::Queue

  error_event_alarm:
    type: OS::Aodh::EventAlarm
    properties:
      event_type: compute.instance.update
      query:
        - field: traits.instance_id
          value: {get_resource: mongodbserver}
          op: eq
        - field: traits.state
          value: error
          op: eq
      alarm_queues:
       - {get_resource: alarm_queue}

  deleted_event_alarm:
    type: OS::Aodh::EventAlarm
    properties:
      event_type: compute.instance.delete.start
      query:
        - field: traits.instance_id
          value: {get_resource: mongodbserver}
          op: eq
      alarm_queues:
       - {get_resource: alarm_queue}

  alarm_cache_wait:
    type: OS::Heat::TestResource
    properties:
      action_wait_secs:
        create: 60
        update: 60
      value:
        list_join:
          - ''
          - - {get_attr: [error_event_alarm, show]}
            - {get_attr: [deleted_event_alarm, show]}

  alarm_subscription:
    type: OS::Zaqar::MistralTrigger
    properties:
      queue_name: {get_resource: alarm_queue}
      workflow_id: {get_resource: autoheal}
      input:
        stack_id: {get_param: "OS::stack_id"}
        root_stack_id:
          if:
            - is_standalone
            - {get_param: "OS::stack_id"}
            - {get_param: "root_stack_id"}

  autoheal:
    type: OS::Mistral::Workflow
    properties:
      description: >
        Mark a server as unhealthy and commence a stack update to replace it.
      input:
        stack_id:
        root_stack_id:
      type: direct
      tasks:
        - name: resources_mark_unhealthy
          action:
            list_join:
              - ' '
              - - heat.resources_mark_unhealthy
                - stack_id=<% $.stack_id %>
                - resource_name=<% env().notification.body.reason_data.event.traits.where($[0] = 'instance_id').select($[2]).first() %>
                - mark_unhealthy=true
                - resource_status_reason='Marked by alarm'
          on_success:
            - stacks_update
        - name: stacks_update
          action: heat.stacks_update stack_id=<% $.root_stack_id %> existing=true

outputs:
  private_mgmt_ip:
    description: IP address in private management network
    value: { get_attr: [ be_om_port, fixed_ips, 0, ip_address ] }
  internal_ip:
    description: IP address in private rt network
    value: { get_attr: [ be_rt_port, fixed_ips, 0, ip_address ] }
  OS::stack_id:
    description: The server UUID
    value: {get_resource: mongodbserver}
    condition: {not: is_standalone}
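
The YAQL expression in the resources_mark_unhealthy task pulls the
instance ID out of the alarm notification that arrives through the Zaqar
queue. The following Python sketch mirrors that lookup, assuming the event
traits are a list of [name, type, value] triples, which is what the $[0]
and $[2] indices in the expression imply. The function name and the sample
payload are made up for illustration.

```python
from typing import Any, Dict, Optional


def instance_id_from_notification(notification: Dict[str, Any]) -> Optional[str]:
    """Rough equivalent of
    traits.where($[0] = 'instance_id').select($[2]).first()
    """
    traits = notification["body"]["reason_data"]["event"]["traits"]
    for name, _dtype, value in traits:
        if name == "instance_id":
            return value
    # Unlike this sketch, the YAQL first() call fails when nothing matches.
    return None


# Hypothetical notification, trimmed to the fields the workflow reads.
sample = {
    "body": {
        "reason_data": {
            "event": {
                "event_type": "compute.instance.update",
                "traits": [
                    ["state", 1, "error"],
                    ["instance_id", 1, "3f1c0d9e-0000-4000-8000-000000000000"],
                ],
            }
        }
    }
}

if __name__ == "__main__":
    print(instance_id_from_notification(sample))
```

The extracted value is what the workflow passes as resource_name to
heat.resources_mark_unhealthy before it triggers the stack update.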


Best Regards


On Wed, Aug 16, 2017 at 1:44 PM, Krzysztof Świątek <
krzysztof.swiatek at corp.ovh.com> wrote:

> Hi,
>
> I have a question about alarms in OpenStack.
>
> I want autoscaling with Heat, and I'm looking for a metrics/alarm project
> that I can use with Heat.
> I found that I can use Monasca or Ceilometer (with Aodh).
> My question is:
> Are any of you using Heat (autoscaling) in production?
> If yes, what are you using (Monasca, Ceilometer, other) for metrics and
> alarms, and why?
>
> --
> Regards,
> Krzysztof Świątek