[openstack-dev] [Ironic] Node groups and multi-node operations
clint at fewbar.com
Sat Jan 25 15:11:22 UTC 2014
Excerpts from Robert Collins's message of 2014-01-25 02:47:42 -0800:
> On 25 January 2014 19:42, Clint Byrum <clint at fewbar.com> wrote:
> > Excerpts from Robert Collins's message of 2014-01-24 18:48:41 -0800:
> >> > However, in looking at how Ironic works and interacts with Nova, it
> >> > doesn't seem like there is any distinction of data per-compute-node
> >> > inside Ironic. So for this to work, I'd have to run a whole bunch of
> >> > ironic instances, one per compute node. That seems like something we
> >> > don't want to do.
> >> Huh?
> > I can't find anything in Ironic that lets you group nodes by anything
> > except chassis. It was not a serious discussion of how the problem would
> > be solved, just a point that without some way to tie ironic nodes to
> > compute-nodes I'd have to run multiple ironics.
> I don't understand the point. There is no tie between ironic nodes and
> compute nodes. Why do you want one?
Because sans Ironic, compute-nodes still have physical characteristics
that make grouping on them attractive for things like anti-affinity. I
don't really want my HA instances "not on the same compute node", I want
them "not in the same failure domain". It becomes a way for all
OpenStack workloads to have more granularity than "availability zone".
So if we have all of that modeled in compute-nodes, then when adding
physical hardware to Ironic one just needs to have something to model
the same relationship for each physical hardware node. We don't have to
do it by linking hardware nodes to compute-nodes, but that would be
doable for a first cut without much change to Ironic.
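To make the failure-domain idea concrete, here is a minimal sketch of the kind of check a scheduler filter could apply. It is not Nova or Ironic code: the `Host` type, the `failure_domain` label and the `filter_anti_affinity` function are all hypothetical names used only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Host:
    """Hypothetical stand-in for a compute node or an Ironic hardware node."""
    name: str
    failure_domain: str  # assumed label, e.g. a rack, power feed or switch


def filter_anti_affinity(candidates, placed):
    """Return candidates whose failure domain is not used by already-placed hosts."""
    used = {host.failure_domain for host in placed}
    return [host for host in candidates if host.failure_domain not in used]


hosts = [
    Host("cn1", "rack-a"),
    Host("cn2", "rack-a"),
    Host("cn3", "rack-b"),
]
placed = [hosts[0]]  # one HA instance already lives in rack-a
remaining = filter_anti_affinity(hosts, placed)
print([host.name for host in remaining])
```

With something like this, "not on the same compute node" becomes "not in the same failure domain", and the same label could be attached to physical hardware nodes.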
> >> What makes you think this? Ironic runs in the same data centre as Nova...
> >> If it takes 20000 API calls to boot 10000 physical machines is that really
> >> a performance problem? When other than first power on would you do that
> >> anyway?
> > The API calls are meh. The image distribution and power fluctuations
> > may not be.
> But there isn't a strong connection between API call and image
> distribution - e.g. (and this is my current favorite for 'when we get
> to optimising') a glance multicast service - Ironic would just add
> nodes to the relevant group as they are requested, and remove when
> they complete, and glance can take care of stopping the service when
> there are no members in the group.
I think we agree here. Entirely. :)
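A rough sketch of the membership bookkeeping described above, assuming a hypothetical `MulticastGroup` object per image. No such glance service exists today, so every name here is invented for illustration only.

```python
class MulticastGroup:
    """Hypothetical per-image multicast group; not a real glance API."""

    def __init__(self, image_id):
        self.image_id = image_id
        self.members = set()
        self.serving = False

    def add(self, node_id):
        # Ironic would add a node when a deploy is requested for it.
        self.members.add(node_id)
        self.serving = True

    def remove(self, node_id):
        # Ironic would remove the node once its deploy completes.
        self.members.discard(node_id)
        if not self.members:
            # The image service stops the stream when the group is empty.
            self.serving = False


group = MulticastGroup("image-1")
group.add("node-1")
group.add("node-2")
group.remove("node-1")
print(group.serving)
group.remove("node-2")
print(group.serving)
```

The point is that the number of API calls and the number of image streams are decoupled: each node still gets its own call, but they all share one stream.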
> >> > I actually think that the changes to Heat and Nova are trivial. Nova
> >> > needs to have groups for compute nodes and the API needs to accept those
> >> > groups. Heat needs to take advantage of them via the API.
> >> The changes to Nova would be massive and invasive as they would be
> >> redefining the driver API... and all the logic around it.
> > I'm not sure I follow you at all. I'm suggesting that the scheduler have
> > a new thing to filter on, and that compute nodes push their unique ID
> > down into the Ironic driver so that while setting up nodes in Ironic one
> > can assign them to a compute node. That doesn't sound massive and
> > invasive.
> I think we're perhaps talking about different things - in the section
> you were answering, I thought he was talking about whether the API
> should offer operations on arbitrary sets of nodes at once, or whether
> each operation should be a separate API call vs what I now think you
> were talking about which was whether operations should be able to
> describe logical relations to other instances/nodes. Perhaps if we use
> the term 'batch' rather than 'group' to talk about the
> multiple-things-at-once aspect, and grouping to talk about the
> primarily scheduler related problems of affinity / anti affinity etc,
> we can avoid future confusion.
Yes, that's a good point. I was talking about modeling failure domains
only. Batching API requests seems like an entirely different thing.