[openstack-dev] [Heat] Convergence proof-of-concept showdown

Gurjar, Unmesh unmesh.gurjar at hp.com
Thu Dec 18 15:55:53 UTC 2014


> -----Original Message-----
> From: Zane Bitter [mailto:zbitter at redhat.com]
> Sent: Thursday, December 18, 2014 7:42 AM
> To: openstack-dev at lists.openstack.org
> Subject: Re: [openstack-dev] [Heat] Convergence proof-of-concept
> showdown
> 
> On 17/12/14 13:05, Gurjar, Unmesh wrote:
> >> I'm storing a tuple of its name and database ID. The data structure
> >> is resource.GraphKey. I was originally using the name for something,
> >> but I suspect I could probably drop it now and just store the
> >> database ID, but I haven't tried it yet. (Having the name in there
> >> definitely makes debugging more pleasant though ;)
> >>
> >
> > I agree, having name might come in handy while debugging!
> >
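(For reference, a minimal sketch of such a key, assuming a
namedtuple-style structure; the actual resource.GraphKey in the
prototype may differ.)

    import collections

    # The database ID alone is enough for identity; the name is carried
    # purely to make logs and debugging output readable.
    GraphKey = collections.namedtuple('GraphKey', ['name', 'database_id'])

    key = GraphKey('my_server', 42)  # hypothetical values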
> >> When I build the traversal graph each node is a tuple of the GraphKey
> >> and a boolean to indicate whether it corresponds to an update or a
> >> cleanup operation (both can appear for a single resource in the same
> >> graph).
> >
> > Just to confirm my understanding, cleanup operation takes care of both:
> > 1. resources which are deleted as a part of update and 2. previous
> > versioned resource which was updated by replacing with a new resource
> > (UpdateReplace scenario)
> 
> Yes, correct. Also:
> 
> 3. resource versions which failed to delete for whatever reason on a previous
> traversal
> 
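To make those three categories concrete, a toy illustration (hypothetical
names and version numbers, not Heat code):

    # Resources present before and after the update, keyed by
    # (name, version); the cleanup set is everything left behind.
    old = {('db', 1), ('server', 1), ('volume', 1), ('ghost', 1)}
    new = {('db', 1), ('server', 2)}      # server replaced (UpdateReplace)
    cleanup = old - new
    # ('server', 1): old version replaced via UpdateReplace    (case 2)
    # ('volume', 1): removed from the template                 (case 1)
    # ('ghost', 1):  failed to delete on a previous traversal  (case 3)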
> > Also, the cleanup operation is performed after the update completes
> > successfully.
> 
> NO! They are not separate things!
> 
> https://github.com/openstack/heat/blob/stable/juno/heat/engine/update.py#L177-L198
> 
> >>> If I am correct, you are updating all resources on update regardless
> >>> of whether they changed, which will be inefficient if the stack
> >>> contains a million resources.
> >>
> >> I'm calling update() on all resources regardless of change, but
> >> update() will only call handle_update() if something has changed
> >> (unless the plugin has overridden Resource._needs_update()).
> >>
> >> There's no way to know whether a resource needs to be updated before
> >> you're ready to update it, so I don't think of this as 'inefficient',
> >> just 'correct'.
> >>
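Roughly, the control flow being described (heavily simplified; the real
heat.engine.resource.Resource does considerably more):

    class Resource(object):
        def _needs_update(self, after, before):
            # Default check; a plugin may override this to force
            # handle_update() even when nothing appears to have changed.
            return after != before

        def handle_update(self, after, before):
            raise NotImplementedError  # implemented per resource type

        def update(self, after, before):
            # update() runs for every resource in the traversal, but it
            # only falls through to handle_update() on an actual change.
            if self._needs_update(after, before):
                self.handle_update(after, before)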
> >>> We have similar questions regarding other areas in your
> >>> implementation, which we believe will be resolved once we understand
> >>> its outline. It is difficult to get a hold of your approach just by
> >>> looking at the code. Docstrings / an Etherpad would help.
> >>>
> >>>
> >>> About streams: yes, in a million-resource stack the data will be
> >>> huge, but less than the template.
> >>
> >> No way, it's O(n^3) (cubed!) in the worst case to store streams for
> >> each resource.
> >>
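(My back-of-envelope reading of that bound; this is my own assumption
about where the n^3 comes from, not something taken from the code:)

    # n resources; in a dense graph each resource's stream could hold up
    # to n entries, each entry carrying a dependency list of up to n keys.
    n = 10**6
    per_stream = n * n          # entries * keys per entry
    total = n * per_stream      # one stream per resource: O(n^3)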
> >>> Also, this stream is stored only in IN_PROGRESS resources.
> >>
> >> Now I'm really confused. Where does it come from if the resource
> >> doesn't get it until it's already in progress? And how will that
> >> information help it?
> >>
> >
> > When an operation on the stack is initiated, the stream will be identified.
> 
> OK, this may be one of the things I was getting confused about - I thought a
> 'stream' belonged to one particular resource and just contained all of the
> paths to reaching that resource. But here it seems like you're saying that a
> 'stream' is a representation of the entire graph?
> So it's essentially just a gratuitously bloated NIH serialisation of the
> Dependencies graph?
> 
> > To begin the operation, the action is initiated on the leaf (or root)
> > resource(s) and the stream is stored (only) in this/these IN_PROGRESS
> > resource(s).
> 
> How does that work? Does it get deleted again when the resource moves to
> COMPLETE?
> 

Yes, IMO, upon resource completion the stream can be deleted. I do not
foresee any situation in which storing the stream beyond that is required,
since when another operation is initiated on the stack, its template will be
parsed and a new stream identified and used.

> > The stream should then keep getting passed to the next/previous level
> > of resource(s) as and when the dependencies for that level of
> > resource(s) are met.
> 
> That sounds... identical to the way it's implemented in my prototype (passing
> a serialisation of the graph down through the notification triggers), except for
> the part about storing it in the Resource table.
> Why would we persist to the database data that we only need for the
> duration that we already have it in memory anyway?
> 

Earlier we thought of passing it along while initiating the next level of
resource(s). However, for a million-resource stack it will be quite large,
and passing it around will be inefficient. So we intended to store it in
the database.

Also, it can be used to resume a stack operation when the processing engine
goes down and another engine has to take over.

> If we're going to persist it we should do so once, in the Stack table, at the
> time that we're preparing to start the traversal.
> 
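For illustration, persisting a single serialised traversal graph per
stack might look roughly like this (hypothetical format and column name,
not the prototype's actual schema):

    import json

    # Each node is (resource_db_id, forward); the value lists the nodes
    # it waits on. Stored once on the Stack row when the traversal is
    # prepared, rather than copied into every Resource row.
    graph = {
        "42,true": [],            # create/update the replacement
        "17,false": ["42,true"],  # clean up the old version afterwards
    }
    stack_row = {'id': 7, 'current_traversal': json.dumps(graph)}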
> >>> The reason to have the entire dependency list is to reduce DB
> >>> queries during a stack update.
> >>
> >> But we never need to know that. We only need to know what just
> >> happened and what to do next.
> >>
> >
> > As mentioned earlier, each level of resources in the graph passes on
> > the dependency list/stream to the next/previous level of resources.
> > This information should then be used to determine what is to be done
> > next and drive the operation to completion.
> 
> Why would we store *and* forward?
> 

Sorry for causing the confusion. By 'pass on', I meant setting/storing the
stream in the database for the next level of resources.

> >>> When each resource has a singular dependency, as in your
> >>> implementation, we will end up loading dependencies one at a time
> >>> and altering almost all resources' dependencies regardless of
> >>> whether they changed.
> >>>
> >>> Regarding the 2-template approach for delete, it is not actually 2
> >>> different templates. It's just that we have a delete stream to be
> >>> taken up post-update.
> >>
> >> That would be a regression from Heat's current behaviour, where we
> >> start cleaning up resources as soon as they have nothing depending on
> >> them.
> >> There's not even a reason to make it worse than what we already have,
> >> because it's actually a lot _easier_ to treat update and clean up as
> >> the same kind of operation and throw both into the same big graph.
> >> The dual implementations and all of the edge cases go away and you
> >> can just trust in the graph traversal to do the Right Thing in the most
> parallel way possible.
> >>
> >>> (Any post operation will be handled as an update.) This approach
> >>> applies when Rollback==True; we can always fall back to the regular
> >>> (non-delete) stream if Rollback==False.
> >>
> >> I don't understand what you're saying here.
> >
> > Just to elaborate: in case of an update with rollback, there will be 2
> > streams of operations:
> 
> There really should not be.
> 
> > 1. the first is the create and update resource stream;
> > 2. the second is the stream for deleting resources (which will be
> >    taken up only if the first stream completes successfully).
> 
> We don't want to break it up into discrete steps. We want to treat it as one
> single graph traversal - provided we set up the dependencies correctly then
> the most optimal behaviour just falls out of our graph traversal algorithm for
> free.
> 
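As a minimal illustration of what "falls out for free" means (my sketch,
not Heat's code):

    # A node runs as soon as everything it waits on is done, so update
    # and cleanup nodes in the same graph interleave in the most parallel
    # order the dependencies allow; there are no separate phases.
    def traverse(requires, run):
        # requires: {node: set(nodes it waits on)}, listing every node
        pending = {node: set(deps) for node, deps in requires.items()}
        while pending:
            ready = [n for n, deps in pending.items() if not deps]
            if not ready:
                raise ValueError('dependency cycle')
            for node in ready:
                run(node)  # in Heat this would be dispatched in parallel
                del pending[node]
            for deps in pending.values():
                deps.difference_update(ready)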

Yes, I agree.

> In the existing Heat code I linked above, we use actual
> heat.engine.resource.Resource objects as nodes in the dependency graph
> and rely on figuring out whether they came from the old or new stack to
> differentiate them. However, it's not possible (or desirable) to serialise a
> graph containing those objects and send it to another process, so in my
> convergence prototype I use (key, direction) tuples as the nodes so that the
> same key may appear twice in the graph with different 'directions'
> (forward=True for update, =False for cleanup - note that the direction is with
> respect to the template... as far as the graph is concerned it's a single
> traversal going in one direction).
> 
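Concretely, the node convention might look like this (my illustration,
not the prototype's exact structures):

    # The same resource can appear twice in one graph: once with
    # forward=True (update against the new template) and once with
    # forward=False (cleanup of the old version); a single traversal
    # visits both.
    update_node = (('my_server', 42), True)    # (key, direction)
    cleanup_node = (('my_server', 17), False)

    requires = {
        update_node: set(),
        cleanup_node: {update_node},  # old version goes away only after
                                      # its replacement is in place
    }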
> Artificially dividing things into separate update and cleanup phases is both
> more complicated code to maintain and a major step backwards for our
> users.
> 
> I want to be very clear about this: treating the updates and cleanups as
> separate, serial tasks is a -2 show stopper for any convergence design.
> We *MUST* NOT do that to our users.
> 
> > However, in case of an update without rollback, there will be a single
> > stream of operations (no delete/cleanup stream required).
> 
> By 'update without rollback' I assume you mean when the user issues an
> update with disable_rollback=True?
> 
> Actually it doesn't matter what you mean, because there is no way of
> interpreting this that could make it correct. We *always* need to check all of
> the pre-existing resources for clean up. The only exception is on create, and
> then only because the set of pre-existing resources is empty.
> 
> If your plan for handling UpdateReplace when rollback is disabled is just to
> delete the old resource at the same time as creating the new one, then your
> plan won't work because the dependencies are backwards.
> And leaving the replaced resources around without even trying to clean
> them up would be outright unethical, given how much money it would cost
> our users. So -2 on 'no cleanup when rollback disabled' as well.
> 
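For clarity, the ordering constraint being described, using the same
illustrative node tuples as above:

    # Even with rollback disabled, the old resource cannot be deleted at
    # the same moment its replacement is created: dependents must first
    # be re-pointed at the new resource, so the cleanup edge runs in the
    # opposite direction to the create edge.
    create_new = (('server', 42), True)
    update_dependent = (('app', 43), True)
    delete_old = (('server', 17), False)

    requires = {
        create_new: set(),
        update_dependent: {create_new},
        delete_old: {update_dependent},  # delete only when nothing still
                                         # references the old version
    }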
> cheers,
> Zane.
> 


