[openstack-dev] [Heat] Stack breakpoint

Ton Ngo ton at us.ibm.com
Mon Mar 17 19:10:33 UTC 2014


I would like to revisit with more details an idea that was mentioned in the
last design summit and hopefully get some feedback.

The scenario is troubleshooting a failed template.
Currently we can stop at the point of failure by disabling rollback:  this
works well for stack-create; stack-update requires some more work but
that's a different thread.  In many cases however, the point of failure may
be too late or too hard to debug because the condition causing the failure
may not be obvious or may have been changed.  If we can pause the stack at
a point before the failure, then we can check whether the state of the
environment and the stack is what we expect.
The analogy with program debugging is breakpoint/step, so it may be useful
to introduce this same concept in a stack.

The usage would be something like:
- Run stack-create (or stack-update once it can handle failure) with one or
  more resource names specified as breakpoints
- As the engine traverses down the dependency graph, it would stop at the
  breakpoint resource and all resources that depend on it.  Resources that
  do not depend on the breakpoint resource will proceed to completion.
- After debugging, continue the stack by:
    - Stepping: remove the current breakpoint, set a breakpoint on the next
      resource(s) in the dependency graph, resume stack-create (or
      stack-update)
    - Running to completion: remove the current breakpoint, resume
      stack-create (or stack-update)
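
To make the intended behavior concrete, here is a rough Python sketch of the
traversal.  This is not Heat engine code: the traverse() function, the dict
format for dependencies and the resource names are all made up for
illustration, and it assumes the engine pauses just before creating the
breakpoint resource.

```python
from collections import deque


def traverse(deps, breakpoints, done=()):
    """Walk a dependency graph, pausing at breakpoint resources.

    deps maps each resource name to the set of resources it requires
    (every required resource must also be a key of deps).
    done lists resources completed by an earlier call, so a paused
    stack can be resumed.  Returns (completed, paused) as lists.
    """
    done = set(done)
    dependents = {name: [] for name in deps}
    pending = {}
    for name in sorted(deps):
        if name in done:
            continue
        unmet = [r for r in deps[name] if r not in done]
        pending[name] = len(unmet)
        for r in unmet:
            dependents[r].append(name)

    ready = deque(n for n in sorted(pending) if pending[n] == 0)
    completed, paused = [], []
    while ready:
        name = ready.popleft()
        if name in breakpoints:
            # Pause here; resources that depend on it never become ready.
            paused.append(name)
            continue
        completed.append(name)
        for d in dependents[name]:
            pending[d] -= 1
            if pending[d] == 0:
                ready.append(d)
    return completed, paused


# Hypothetical stack: server needs network, wait needs server, db needs volume.
deps = {
    "network": set(),
    "volume": set(),
    "server": {"network"},
    "wait": {"server"},
    "db": {"volume"},
}

# Initial run with a breakpoint on "server".
first, paused = traverse(deps, {"server"})

# Stepping: move the breakpoint to the next resource and resume.
step, paused_next = traverse(deps, {"wait"}, done=first)

# Running to completion: no breakpoints left.
rest, paused_last = traverse(deps, set(), done=first + step)
```

In this sketch the db and volume resources finish during the first run because
they do not depend on server, while wait stays blocked behind the breakpoint
until the stack is resumed.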

Some other possible uses for this breakpoint:
- While developing a new template or resource type, bring up a stack to a
point before the new code is to be executed
- Introduce a human process: pause the partial stack so the user can get the
stack info and perform some tasks before continuing

Some issues to consider (with initial feedback from shardy):
- Granularity of stepping:  resource level or internal steps within a
resource
- How to specify breakpoints:  CLI argument or coded in template or both
- How to handle resources with a timer, e.g. a wait condition:  pause and
resume the timer value
- New state for a resource:  PAUSED
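
For the timer issue, one possible approach is to track the remaining time
rather than an absolute deadline, so that time spent at a breakpoint does not
count against the timeout.  The sketch below is only an illustration of that
idea: the PausableTimeout class is hypothetical and is not how Heat wait
conditions are implemented.

```python
import time


class PausableTimeout:
    """Countdown that stops consuming time while paused (hypothetical)."""

    def __init__(self, timeout, clock=time.monotonic):
        self._clock = clock
        self._remaining = float(timeout)
        self._started = clock()  # None while paused

    def remaining(self):
        if self._started is None:
            return self._remaining
        return self._remaining - (self._clock() - self._started)

    def pause(self):
        # Freeze the remaining time at the moment the stack is paused.
        if self._started is not None:
            self._remaining = self.remaining()
            self._started = None

    def resume(self):
        # Restart the countdown from the frozen remaining time.
        if self._started is None:
            self._started = self._clock()

    def expired(self):
        return self.remaining() <= 0


# Usage with a fake clock so the behavior is easy to follow.
now = [0.0]
timer = PausableTimeout(10, clock=lambda: now[0])
now[0] = 4.0
before_pause = timer.remaining()   # 4 seconds used
timer.pause()
now[0] = 100.0                     # long debugging session
while_paused = timer.remaining()   # unchanged while paused
timer.resume()
now[0] = 103.0
after_resume = timer.remaining()   # 3 more seconds used
```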

Thanks.

Ton Ngo



