[openstack-dev] [Ironic] Node groups and multi-node operations

Devananda van der Veen devananda.vdv at gmail.com
Thu Jan 23 00:44:01 UTC 2014

So, a conversation came up again today around whether or not Ironic will,
in the future, support operations on groups of nodes. Some folks have
expressed a desire for Ironic to expose operations on groups of nodes;
others want Ironic to host the hardware-grouping data so that eg. Heat and
Tuskar can make more intelligent group-aware decisions or represent the
groups in a UI. Neither of these have an implementation in Ironic today...
and we still need to implement a host of other things before we start on
this. FWIW, this discussion is meant to stimulate thinking ahead to things
we might address in Juno, and aligning development along the way.

There's also some refactoring / code cleanup which is going on and worth
mentioning because it touches the part of the code which this discussion
impacts. For our developers, here is additional context:
* our TaskManager class supports locking >1 node atomically, but both the
driver API and our REST API only support operating on one node at a time.
AFAIK, nowhere in the code do we actually pass a group of nodes.
* for historical reasons, our driver API requires both a TaskManager and a
Node object be passed to all methods. However, the TaskManager object
contains a reference to the Node(s) which it has acquired, so the node
parameter is redundant.
* we've discussed cleaning this up, but I'd like to avoid refactoring the
same interfaces again when we go to add group-awareness.
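To make the redundancy concrete, here's a minimal sketch of the current
one-node signature next to a group-aware one that drops the redundant node
parameter. Class and method names are hypothetical, not Ironic's actual
driver interface:

```python
# Hypothetical sketch of the driver signature question; these are not
# Ironic's real classes.

class Node:
    def __init__(self, uuid):
        self.uuid = uuid


class TaskManager:
    """Holds the set of nodes acquired (locked) for an operation."""
    def __init__(self, nodes):
        self.nodes = list(nodes)


class PowerDriverToday:
    # Current style: both a task AND a node are passed, but the task
    # already references the node(s) it acquired, so the second
    # parameter is redundant.
    def power_on(self, task, node):
        assert node in task.nodes
        return "powered on %s" % node.uuid


class PowerDriverGroupAware:
    # Possible future style: the task alone carries one or more nodes,
    # so the same interface covers single-node and group operations.
    def power_on(self, task):
        return ["powered on %s" % n.uuid for n in task.nodes]


nodes = [Node("n1"), Node("n2")]
task = TaskManager(nodes)
result = PowerDriverGroupAware().power_on(task)
```

The point of the sketch is that the second style would only need to be
refactored once, whether or not group operations ever land.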

I'll try to summarize the different axes of concern around which the
discussion of node groups seems to converge...

1: physical vs. logical grouping
- Some hardware is logically, but not strictly physically, grouped. Eg, 1U
servers in the same rack. There is some grouping, such as failure domain,
but operations on the nodes are discrete. This grouping should be
modeled somewhere, and sometimes a user may wish to perform an operation
on that group. Is a higher layer (tuskar, heat, etc) sufficient? I think so.
- Some hardware _is_ physically grouped. Eg, high-density cartridges which
share firmware state or a single management end-point, but are otherwise
discrete computing devices. This grouping must be modeled somewhere, and
certain operations cannot be performed on one member without affecting all
members. Things will break if each node is treated independently.
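One way to model the physically-grouped case would be to record the shared
management endpoint and derive group membership from it, so an operation on
one member can be flagged as affecting the rest. A sketch under assumed
names (Ironic has no such model today; the dict keys here are hypothetical):

```python
# Hypothetical sketch: group nodes by a shared management endpoint so
# that, eg, a firmware update on one cartridge member can be detected
# as affecting the whole group. Not an Ironic API.
from collections import defaultdict


def group_by_endpoint(nodes):
    """nodes: list of dicts with 'uuid' and 'mgmt_endpoint' keys."""
    groups = defaultdict(list)
    for node in nodes:
        groups[node["mgmt_endpoint"]].append(node["uuid"])
    return dict(groups)


def affected_members(nodes, uuid):
    """All node uuids sharing a management endpoint with `uuid`."""
    for members in group_by_endpoint(nodes).values():
        if uuid in members:
            return members
    return []


nodes = [
    {"uuid": "a", "mgmt_endpoint": "cartridge-1"},
    {"uuid": "b", "mgmt_endpoint": "cartridge-1"},
    {"uuid": "c", "mgmt_endpoint": "rack-2"},
]
```

Here affected_members(nodes, "a") returns both "a" and "b", which is
exactly the information a conductor would need before touching shared
firmware state.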

2: performance optimization
- Some operations may be optimized if there is an awareness of concurrent
identical operations. Eg, deploy the same image to lots of nodes using
multicast or bittorrent. If Heat were to inform Ironic that this deploy is
part of a group, the optimization would be deterministic. If Heat does not
inform Ironic of this grouping, but Ironic infers it (eg, from timing of
requests for similar actions) then optimization is possible but
non-deterministic, and may be much harder to reason about or debug.
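As an illustration of why inferred coalescing is non-deterministic, here's
a sketch (hypothetical, not proposed code) that batches deploy requests for
the same image arriving within a short window. The batch boundaries depend
entirely on request timing, which is what makes the behavior hard to reason
about:

```python
# Hypothetical sketch of timing-based coalescing: deploy requests for
# the same image that arrive within `window` seconds are batched. A
# request arriving a moment later lands in a different batch, so the
# optimization is non-deterministic from the caller's point of view.

def coalesce(requests, window=5.0):
    """requests: list of (timestamp, image, node_uuid), sorted by time.
    Returns batches of node uuids that could share one multicast deploy."""
    batches = []
    current = None  # (image, last_timestamp, [uuids])
    for ts, image, uuid in requests:
        if current and current[0] == image and ts - current[1] <= window:
            current = (image, ts, current[2] + [uuid])
        else:
            if current:
                batches.append(current[2])
            current = (image, ts, [uuid])
    if current:
        batches.append(current[2])
    return batches


reqs = [(0.0, "img1", "n1"), (2.0, "img1", "n2"), (20.0, "img1", "n3")]
```

With these timings, "n1" and "n2" coalesce but "n3" does not; had "n3"
arrived three seconds earlier, all three would have shared one deploy. An
explicit group hint from Heat removes that timing dependence entirely.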

3: APIs
- Higher layers of OpenStack (eg, Heat) are expected to orchestrate
discrete resource units into a larger group operation. This is where the
grouping happens today, but already results in inefficiencies when
performing identical operations at scale. Ironic may be able to get around
this by coalescing adjacent requests for the same operation, but this would
be non-deterministic.
- Moving group-awareness or group-operations into the lower layers (eg,
Ironic) looks like it will require non-trivial changes to Heat and Nova,
and, in my opinion, violates a layer-constraint that I would like to
maintain. On the other hand, we could avoid the challenges around
coalescing, and it might be necessary anyway to support physically-grouped
hardware.

If Ironic coalesces requests, it could be done in either the
ConductorManager layer or in the drivers themselves. The difference would
be whether our internal driver API accepts one node or a set of nodes for
each operation. It'll also impact our locking model. Both of these are
implementation details that wouldn't affect other projects, but would
affect our driver developers.
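The two placements can be sketched side by side. This is illustrative only,
with hypothetical names; the real difference would show up in the driver
API signatures and the locking model:

```python
# Hypothetical sketch of the two placements for coalescing. Neither
# reflects Ironic's actual ConductorManager or driver interfaces.

class Driver:
    def deploy_one(self, node):
        return "deployed %s" % node

    def deploy_many(self, nodes):
        # A group-aware driver could use multicast here; this sketch
        # only shows the interface shape.
        return ["deployed %s" % n for n in nodes]


class ConductorCoalesces:
    # Option A: the conductor layer batches requests and fans out
    # single-node calls, so drivers stay unchanged.
    def deploy(self, driver, nodes):
        return [driver.deploy_one(n) for n in nodes]


class DriverCoalesces:
    # Option B: the conductor passes the whole set and the driver
    # decides how to optimize, which also changes what must be locked
    # at once.
    def deploy(self, driver, nodes):
        return driver.deploy_many(nodes)


out_a = ConductorCoalesces().deploy(Driver(), ["n1", "n2"])
out_b = DriverCoalesces().deploy(Driver(), ["n1", "n2"])
```

Either way the external API is unchanged, which is why this stays an
implementation detail for us and for driver developers.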

Also, until Ironic models physically-grouped hardware relationships in some
internal way, we're going to have difficulty supporting that class of
hardware. Is that OK? What is the impact of not supporting such hardware?
It seems, at least today, to be pretty minimal.

Discussion is welcome.
