[openstack-dev] [Ironic] Node groups and multi-node operations

Clint Byrum clint at fewbar.com
Sat Jan 25 06:42:42 UTC 2014

Excerpts from Robert Collins's message of 2014-01-24 18:48:41 -0800:
> On 25 Jan 2014 15:11, "Clint Byrum" <clint at fewbar.com> wrote:
> >
> > Excerpts from Devananda van der Veen's message of 2014-01-22 16:44:01 -0800:
> >
> > What Tuskar wants to do is layer workloads on top of logical and physical
> > groupings. So it would pass to Nova "Boot 4 machines with (flavor)
> > and distinct(failure_domain_id)"
> Maybe. Maybe it would ask for a reservation and then ask for machines
> within that reservation.... Until it is unopened we are speculating :-)

Reservation is a better way to put it, I do agree with that.
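To make the "4 machines with distinct(failure_domain_id)" request above concrete, here is a minimal sketch of the selection a scheduler (or reserver) would perform: pick at most one candidate per failure domain. The `failure_domain_id` field follows the wording in the quoted message; everything else is illustrative, not real Nova or Tuskar code.

```python
# Hypothetical sketch: choose `count` machines, no two sharing a
# failure domain. Candidate records are plain dicts for illustration.

def pick_distinct(candidates, count):
    seen, picked = set(), []
    for c in candidates:
        fd = c["failure_domain_id"]
        if fd not in seen:          # skip anything in an already-used domain
            seen.add(fd)
            picked.append(c["name"])
        if len(picked) == count:
            return picked
    raise ValueError("not enough distinct failure domains")

cands = [{"name": "m1", "failure_domain_id": 1},
         {"name": "m2", "failure_domain_id": 1},
         {"name": "m3", "failure_domain_id": 2},
         {"name": "m4", "failure_domain_id": 3},
         {"name": "m5", "failure_domain_id": 4}]
print(pick_distinct(cands, 4))  # ['m1', 'm3', 'm4', 'm5']
```

A reservation model would wrap the same selection: reserve the picked set first, then schedule within the reservation.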

> > However, in looking at how Ironic works and interacts with Nova, it
> > doesn't seem like there is any distinction of data per-compute-node
> > inside Ironic.  So for this to work, I'd have to run a whole bunch of
> > ironic instances, one per compute node. That seems like something we
> > don't want to do.
> Huh?

I can't find anything in Ironic that lets you group nodes by anything
except chassis. It was not a serious discussion of how the problem would
be solved, just a point that without some way to tie ironic nodes to
compute-nodes I'd have to run multiple ironics.
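The contrast at issue can be sketched in a few lines: today a node's only grouping handle is its chassis, whereas a single free-form logical group per node would let a deployer tie nodes to a compute node (or failure domain) independently of physical packaging. The field names below (`chassis_uuid`, `logical_group`) are illustrative, not the actual Ironic schema.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Optional

@dataclass
class Node:
    uuid: str
    chassis_uuid: Optional[str] = None   # the only grouping Ironic has today
    logical_group: Optional[str] = None  # the proposed single logical group

def nodes_by_group(nodes):
    """Index nodes by logical group, e.g. for per-compute-node dispatch."""
    groups = defaultdict(list)
    for n in nodes:
        groups[n.logical_group].append(n)
    return dict(groups)

# Logical groups can cut across chassis boundaries:
nodes = [
    Node("n1", chassis_uuid="c1", logical_group="compute-0"),
    Node("n2", chassis_uuid="c1", logical_group="compute-1"),
    Node("n3", chassis_uuid="c2", logical_group="compute-0"),
]
print(sorted(nodes_by_group(nodes)))  # ['compute-0', 'compute-1']
```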

> > So perhaps if ironic can just model _a single_ logical grouping per node,
> > it can defer any further distinctions up to Nova where it will benefit
> > all workloads, not just Ironic.
> Agreed with this.
> > > [...] be deterministic. If Heat does not
> > > inform Ironic of this grouping, but Ironic infers it (eg, from timing of
> > > requests for similar actions) then optimization is possible but
> > > non-deterministic, and may be much harder to reason about or debug.
> > >
> >
> > I'm wary of trying to get too deep on optimization this early. There
> > are some blanket optimizations that you allude to here that I think will
> > likely work OK with even the most minimal of clues.
> +1 premature optimisation is the root of all evil...
> > > 3: APIs
> > > [...] the same operation, but this would
> > > be non-deterministic.
> >
> > Agreed, I think Ironic needs _some_ level of grouping to be efficient.
> What makes you think this? Ironic runs in the same data centre as Nova...
> If it takes 20000 API calls to boot 10000 physical machines, is that really
> a performance problem? When, other than first power-on, would you do that
> anyway?

The API calls are meh. The image distribution and power fluctuations
may not be.
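The power-fluctuation point is the kind of thing a group hint would let Ironic (or an operator) optimize: powering on thousands of nodes simultaneously causes inrush, so power-on could be staggered in batches. This is an illustrative sketch; `power_on()` is a stand-in, not a real Ironic call, and the batch size and delay are made-up tunables.

```python
import time

def power_on(node_id):
    # Stand-in for an Ironic power-state request.
    print(f"powering on {node_id}")

def staggered_power_on(node_ids, batch_size=50, delay_s=0.0):
    """Power nodes on in batches, pausing between batches.

    Returns the number of batches issued.
    """
    batches = 0
    for i in range(0, len(node_ids), batch_size):
        for node in node_ids[i:i + batch_size]:
            power_on(node)
        batches += 1
        time.sleep(delay_s)  # let the PDUs settle before the next batch
    return batches

print(staggered_power_on([f"node-{i}" for i in range(120)], batch_size=50))
# three batches: 50 + 50 + 20 nodes
```

Image distribution admits a similar batching/fan-out treatment.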

> > > - Moving group-awareness or group-operations into the lower layers (eg,
> > > Ironic) looks like it will require non-trivial changes to Heat and Nova,
> > > and, in my opinion, violates a layer-constraint that I would like to
> > > maintain. On the other hand, we could avoid the challenges around
> > > coalescing. This might be necessary to support physically-grouped
> hardware
> > > anyway, too.
> > >
> >
> > I actually think that the changes to Heat and Nova are trivial. Nova
> > needs to have groups for compute nodes and the API needs to accept those
> > groups. Heat needs to take advantage of them via the API.
> The changes to Nova would be massive and invasive as they would be
> redefining the driver api....and all the logic around it.

I'm not sure I follow you at all. I'm suggesting that the scheduler have
a new thing to filter on, and that compute nodes push their unique ID
down into the Ironic driver so that while setting up nodes in Ironic one
can assign them to a compute node. That doesn't sound massive and
invasive to me.
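A rough sketch of the kind of scheduler filter described here: each compute node advertises which Ironic group it owns, and the filter passes only hosts whose group matches the one requested. Class and attribute names are hypothetical, not real Nova code, and plain dicts stand in for Nova's host-state and request-spec objects.

```python
class IronicGroupFilter:
    """Hypothetical filter: match hosts to a requested Ironic group."""

    def host_passes(self, host_state, spec):
        wanted = spec.get("scheduler_hints", {}).get("ironic_group")
        if wanted is None:
            return True  # no group requested; any host will do
        return host_state.get("ironic_group") == wanted

f = IronicGroupFilter()
hosts = [{"name": "cn-0", "ironic_group": "rack-a"},
         {"name": "cn-1", "ironic_group": "rack-b"}]
spec = {"scheduler_hints": {"ironic_group": "rack-b"}}
print([h["name"] for h in hosts if f.host_passes(h, spec)])  # ['cn-1']
```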

> > There is a non-trivial follow-on which is a "wholistic" scheduler which
> > would further extend these groups into other physical resources like
> > networks and block devices. These all feel like logical evolutions of the
> > idea of making somewhat arbitrary and overlapping groups of compute nodes.
> The holistic scheduler can also be a holistic reserver plus reservation
> aware scheduler - this avoids a lot of pain, IMO.

I think what I said still applies with that model, but it definitely
becomes a lot more robust.
