<font size=2 face="sans-serif">In AWS, an autoscaling group includes health
maintenance functionality --- both an ability to detect basic forms of
failures and an ability to react properly to failures detected by itself
or by a load balancer. What is the thinking about how to get this
functionality in OpenStack? Since OpenStack's OS::Heat::AutoScalingGroup
has a more general member type, what is the thinking about what failure
detection means (and how it would be accomplished and communicated)?</font>
<br>
<br><font size=2 face="sans-serif">I have not found design discussion of
this; have I missed something?</font>
<br>
<br><font size=2 face="sans-serif">I suppose the natural answer for OpenStack
would be centered around webhooks. An OpenStack scaling group (OS
SG = OS::Heat::AutoScalingGroup or AWS::AutoScaling::AutoScalingGroup or
OS::Heat::ResourceGroup or OS::Heat::InstanceGroup) could generate a webhook
per member, with the meaning of the webhook being that the member has been
detected as dead and should be deleted and removed from the group --- and
a replacement member created if needed to respect the group's minimum size.
When the member is a Compute instance and Ceilometer exists, the
OS SG could define a Ceilometer alarm for each member (by including these
alarms in the template generated for the nested stack that is the SG),
programmed to hit the member's deletion webhook when death is detected
(I imagine there are a few ways to write a Ceilometer condition that detects
instance death). When the member is a nested stack and Ceilometer
exists, it could be the member stack's responsibility to include a Ceilometer
alarm that detects the member stack's death and hits the member stack's
deletion webhook. There is a small matter of how the author of the
template used to create the member stack writes some template snippet that
creates a Ceilometer alarm that is specific to a member stack that does
not exist yet. I suppose we could stipulate that if the member template
includes a parameter with name "member_name" and type "string"
then the OS SG takes care of supplying the correct value of that parameter;
as illustrated in the asg_of_stacks.yaml of </font><a href=https://review.openstack.org/#/c/97366/><font size=2 face="sans-serif">https://review.openstack.org/#/c/97366/</font></a><font size=2 face="sans-serif">
, a member template can use a template parameter to tag Ceilometer data
for querying. The URL of the member stack's deletion webhook could
be passed to the member template via the same sort of convention. When
Ceilometer does not exist, it is less obvious to me what could usefully
be done. Are there any useful SG member types besides Compute instances
and nested stacks? Note that a nested stack could also pass its member
deletion webhook to a load balancer (that is willing to accept such a thing,
of course), so we get a lot of unity of mechanism between infrastructure-level
detection and application-level detection.</font>
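<br>
<br><font size=2 face="sans-serif">To make the idea above concrete, here is
a minimal Python sketch of the per-member deletion webhook behavior. It is
a toy in-memory model under my own assumptions, not Heat or Ceilometer code:
the class ScalingGroup, the create_member callback, the example.invalid URL
and the parameter name "member_delete_url" are all hypothetical. Only
"member_name" with type "string" comes from the convention suggested above.
Hitting a member's webhook deletes that member and then refills the group
up to its minimum size; member_params shows the "supply the parameter only
if the member template declares it" convention.</font>
<pre>
```python
import uuid
from typing import Callable, Dict


class ScalingGroup:
    """Toy model of a scaling group with one deletion webhook per member."""

    def __init__(self, min_size: int,
                 create_member: Callable[[str, str], object]):
        self.min_size = min_size
        # Hypothetical factory: (member_name, deletion_url) -> member object.
        self.create_member = create_member
        self.members: Dict[str, object] = {}  # member_name -> member
        self.webhooks: Dict[str, str] = {}    # webhook token -> member_name
        self._fill_to_min()

    def _add_member(self) -> str:
        name = "member-" + uuid.uuid4().hex[:8]
        token = uuid.uuid4().hex
        self.webhooks[token] = name
        # Placeholder URL; a real service would sign and publish this.
        url = "https://heat.example.invalid/hooks/" + token
        self.members[name] = self.create_member(name, url)
        return name

    def _fill_to_min(self) -> None:
        # Create replacements only when the group is below its minimum size.
        deficit = max(0, self.min_size - len(self.members))
        for _ in range(deficit):
            self._add_member()

    def hit_webhook(self, token: str) -> bool:
        """Report a member as dead: delete it, then respect min_size."""
        name = self.webhooks.pop(token, None)
        if name is None:
            return False  # unknown or already-used webhook; ignore it
        del self.members[name]
        self._fill_to_min()
        return True


def member_params(declared: Dict[str, str], name: str,
                  url: str) -> Dict[str, str]:
    """Supply conventional parameters only if the template declares them.

    `declared` maps parameter name -> declared type for the member template.
    "member_delete_url" is an assumed name used for illustration only.
    """
    offered = {"member_name": name, "member_delete_url": url}
    return {k: v for k, v in offered.items() if declared.get(k) == "string"}
```
</pre>
<font size=2 face="sans-serif">In this sketch an alarm or a load balancer
would simply call hit_webhook with the token embedded in the member's URL;
the group does not need to know which of the two detected the failure.</font>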
<br>
<br><font size=2 face="sans-serif">I am not entirely happy with the idea
of a webhook per member. If I understand correctly, generating webhooks
is a somewhat expensive and problematic process. What would be the
alternative?</font>
<br>
<br><font size=2 face="sans-serif">Thanks,<br>
Mike</font>