[openstack-dev] [Heat] [TripleO] Rolling updates spec re-written. RFC
Zane Bitter
zbitter at redhat.com
Wed Feb 5 18:24:33 UTC 2014
On 05/02/14 11:39, Clint Byrum wrote:
> Excerpts from Zane Bitter's message of 2014-02-04 16:14:09 -0800:
>> On 03/02/14 17:09, Clint Byrum wrote:
>>> UpdatePolicy in cfn is a single string, and causes very generic rolling
>>
>> Huh?
>>
>> http://docs.aws.amazon.com/AWSCloudFormation/latest/UserGuide/aws-attribute-updatepolicy.html
>>
>> Not only is it not just a single string (in fact, it looks a lot like
>> the properties you have defined), it's even got another layer of
>> indirection so you can define different types of update policy (rolling
>> vs. canary, anybody?). It's an extremely flexible syntax.
>>
>
> Oops, I relied a little too much on my memory and not enough on docs for
> that one. O-k, I will re-evaluate given actual knowledge of how it
> actually works. :-P
cheers :D
>> BTW, given that we already implemented this in autoscaling, it might be
>> helpful to talk more specifically about what we need to do in addition
>> in order to support the use cases you have in mind.
>>
>
> As Robert mentioned in his mail, autoscaling groups won't allow us to
> inject individual credentials. With the ResourceGroup, we can make a
> nested stack with a random string generator so that is solved. Now the
\o/ for the random string generator solving the problem!
:-( for ResourceGroup being the only way to do it.
This is exactly why I hate ResourceGroup and think it was a mistake.
Powerful software comes from being able to combine simple concepts in
complex ways. Right now you have to choose between an autoscaling group,
which has rolling updates, and a ResourceGroup which allows you to scale
stacks. That sucks. What you need is to have both at the same time, and
the way to do that is to allow autoscaling groups to scale stacks, as
has long been planned.
At this point it would be a mistake to add a _complicated_ feature
solely for the purpose of working around the fact that we can't yet
combine two other, existing, features. It would be better to fix
autoscaling groups to allow you to inject individual credentials and
then add a simpler feature that does not need to create ad-hoc groups.
> other piece we need is to be able to directly choose machines to take
> out of commission, which I think we may have a simple solution to but I
> don't want to derail on that.
>
> The one used in AutoScalingGroups is also limited to just one group,
> thus it can be done all inside the resource.
>
>>> update behavior. I want this resource to be able to control multiple
>>> groups as if they are one in some cases (Such as a case where a user
>>> has migrated part of an app to a new type of server, but not all.. so
>>> they will want to treat the entire aggregate as one rolling update).
>>>
>>> I'm o-k with overloading it to allow resource references, but I'd like
>>> to hear more people take issue with depends_on before I select that
>>> course.
>>
>> Resource references in general, and depends_on in particular, feel like
>> very much the wrong abstraction to me. This is a policy, not a resource.
>>
>>> To answer your question, using it with a server instance allows
>>> rolling updates across non-grouped resources. In the example the
>>> rolling_update_dbs does this.
>>
>> That's not a great example, because one DB server depends on the other,
>> forcing them into updating serially anyway.
>>
>
> You're right, a better example is a set of (n) resource groups which
> serve the same service and thus we want to make sure we maintain the
> minimum service levels as a whole.
That's interesting, and I'd like to hear more about that use case and
why it couldn't be solved using autoscaling groups assuming the obstacle
to using them at all were eliminated. If there's a real use case here
beyond "work around lack of stack-scaling functionality" then I'm
definitely open to being persuaded. I'd just like to make sure that it
exists and justifies the extra complexity.
> If it were an order of magnitude harder to do it this way, I'd say
> sure let's just expand on the single-resource rolling update. But
> I think it won't be that much harder to achieve this and then the use
> case is solved.
I guess what I'm thinking is that your proposal is really two features:
1) Notifications/callbacks on update that allow the user to hook in to
the workflow.
2) Rolling updates over ad-hoc groups (not autoscaling groups).
I think we all agree that (1) is needed; by my count ~6 really good use
cases have been mentioned in this thread.
What I'm suggesting is that we probably don't need to do (2) at all if
we fix autoscaling groups to be something you could use.
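To illustrate the split, here is a minimal Python sketch of the two pieces: a hook called before each batch (1) and a batched update over a flat list of group members (2). It is not Heat's implementation; the names rolling_batches, rolling_update, update_one and on_batch_start are hypothetical, and the batch-size rule (never take more members out than leaves min_in_service untouched) is an assumption modelled loosely on the MinInstancesInService/MaxBatchSize idea in the cfn UpdatePolicy.

```python
from typing import Callable, Iterator, List, Optional, Sequence


def rolling_batches(members: Sequence[str],
                    max_batch_size: int,
                    min_in_service: int) -> Iterator[List[str]]:
    """Yield successive batches of members to update.

    Assumption: members in the current batch are out of service while
    they update, so a batch may never be larger than
    len(members) - min_in_service.
    """
    if not members:
        return
    if max_batch_size < 1:
        raise ValueError("max_batch_size must be at least 1")
    batch_size = min(max_batch_size, len(members) - min_in_service)
    if batch_size < 1:
        raise ValueError("cannot update without dropping below min_in_service")
    for start in range(0, len(members), batch_size):
        yield list(members[start:start + batch_size])


def rolling_update(members: Sequence[str],
                   update_one: Callable[[str], None],
                   max_batch_size: int = 1,
                   min_in_service: int = 0,
                   on_batch_start: Optional[Callable[[List[str]], None]] = None) -> None:
    """Update members batch by batch, calling the hook before each batch."""
    for batch in rolling_batches(members, max_batch_size, min_in_service):
        if on_batch_start is not None:
            # Feature (1): let the user hook in to the workflow,
            # e.g. to drain the servers that are about to be updated.
            on_batch_start(batch)
        for member in batch:
            # Feature (2): the update itself, one group member at a time.
            update_one(member)


if __name__ == "__main__":
    servers = ["server-%d" % i for i in range(5)]
    rolling_update(servers,
                   update_one=lambda s: print("updating", s),
                   max_batch_size=3,
                   min_in_service=3,
                   on_batch_start=lambda b: print("about to update", b))
```

Note that the hook is independent of how the batches are formed, which is why (1) can land without committing to (2).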
Having reviewed the code for rolling updates in scaling groups, I can
report that it is painfully complicated and that you'd be doing yourself
a big favour by not attempting to reimplement it with ad-hoc groups ;).
(To be fair, I don't think this would be quite as bad, though clearly it
wouldn't be as good as not having to do it at all.) More concerning than
that, though, is the way this looks set to make the template format even
more arcane than it already is. We might eventually be able to deprecate
resource types like ResourceGroup but we will be stuck with stuff like
this approximately forever, so we'd better make sure it contains only what
we need for the long term and isn't substantially shaped by tactical
workarounds for temporary problems.
>> I have to say that even in general, this whole idea about applying
>> update policies to non-grouped resources doesn't make a whole lot of
>> sense to me. For non-grouped resources you control the resource
>> definitions individually - if you don't want them to update at a
>> particular time, you have the option of just not updating them.
(Clarification: at the time I wrote this I wasn't aware that TripleO was
unable to use autoscaling groups in their current form, and the example
on the wiki contained only two servers, not 10+.)
> If I have to calculate all the deltas and feed Heat 10 templates, each
> with one small delta, I'm writing the same code as I'm proposing for
> this rolling update feature, but I'm writing it outside of Heat. That
> seems counter-productive for all of the other Heat users who would find
> this useful.
That's true. But as I mentioned in my reply to Robert, you already
started reimplementing autoscaling functionality when you had to
generate your own templates with multiple nearly-identical servers. If
the choice is between pushing more functionality (i.e. stack-scaling)
into autoscaling so that it actually works for you, or pushing
autoscaling functionality (i.e. rolling-update) out to ad-hoc groups,
then I'd submit that the former is better for Heat, for TripleO, and for
all of the other Heat users as well, because then nobody has to
implement _any_ part of autoscaling outside of Heat.
To be clear, that's a big 'if' and there may be another use case that I
am missing, but I think it's worthwhile to have the discussion.
cheers,
Zane.