[openstack-dev] [Ironic] Manual scheduling nodes in maintenance mode

Devananda van der Veen devananda.vdv at gmail.com
Wed Mar 19 04:34:27 UTC 2014


On Tue, Mar 18, 2014 at 12:24 PM, Robert Collins
<robertc at robertcollins.net> wrote:

> On 15 March 2014 13:07, Devananda van der Veen <devananda.vdv at gmail.com>
> wrote:
> > +1 to the idea.
> >
> > However, I think we should discuss whether the rescue interface is the
> > appropriate path. Its initial intention was to tie into Nova's rescue
> > interface, allowing a user whose instance is non-responsive to boot
> > into a recovery mode and access the data stored within their instance.
> > I think there are two different use cases here:
> >
> > Case A: a user of Nova who somehow breaks their instance, and wants to
> > boot into a "rescue" or "recovery" mode, preserving instance data. This
> > is useful if, e.g., they lost network access or broke their grub config.
> >
> > Case B: an operator of the baremetal cloud whose hardware may be
> > malfunctioning, who wishes to hide that hardware from users of Case A
> > while they diagnose and fix the underlying problem.
> >
> > As I see it, Nova's rescue API (and by extension, the same API in
> > Ironic) is intended for A, but not for B. TripleO's use case includes
> > both of these, and may be conflating them.
>
> I agree.
>
> > I believe Case A is addressed by the planned driver.rescue interface.
> > As for Case B, I think the solution is to use different tenants and
> > move the node between them. This is a more complex problem -- Ironic
> > does not model tenants, and AFAIK Nova doesn't reserve unallocated
> > compute resources on a per-tenant basis.
> >
> > That said, I think we will need a way to indicate "this bare metal node
> > belongs to that tenant", regardless of the rescue use case.
>
> I'm not sure Ironic should be involved in scheduling (and giving a
> node to a tenant is a scheduling problem).
>
>
Ironic does not need to make decisions about scheduling for nodes to be
associated with specific tenants. It merely needs to store the tenant_id and
expose it to a (potentially new) scheduler filter that matches on it in a
way that prevents users of Nova from explicitly choosing machines that
"belong" to other tenants. I think the only work needed for this is a new
scheduler filter, a few lines in the Nova driver to expose the info to it,
and for the operator to stash a tenant ID in Ironic using the existing API
to update the node.properties field. I don't envision that Nova should ever
change the node->tenant mapping.
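To make that concrete, here is a rough sketch of the kind of filter I have
in mind; the 'reserved_for_tenant' property key and the assumption that the
Nova driver copies node.properties into host_state.stats are illustrative,
not existing interfaces:

    from nova.scheduler import filters


    class TenantReservationFilter(filters.BaseHostFilter):
        """Pass unreserved nodes, or nodes reserved for the requester."""

        def host_passes(self, host_state, filter_properties):
            # Hypothetical key, set by the operator via Ironic's existing
            # node update API (a patch to /properties/reserved_for_tenant).
            reserved = host_state.stats.get('reserved_for_tenant')
            if not reserved:
                # Unreserved nodes stay schedulable by everyone.
                return True
            return reserved == filter_properties['context'].project_id

The operator side would then just be a normal node update, with no change
to Nova beyond enabling the filter.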


> If I may sketch an alternative - when a node is put into maintenance
> mode, keep publishing it to the scheduler, but add an extra spec to it
> that won't match any request automatically.
>
> Then 'deploy X to a maintenance-mode machine' is simply a nova boot with
> a scheduler hint to explicitly choose that machine, and all the
> regular machinery will take place.
>

That should also work :)
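A minimal sketch of that alternative, assuming the maintenance flag is
surfaced in host_state.stats and using a hypothetical 'force_node'
scheduler hint (neither is an existing Nova interface today):

    from nova.scheduler import filters


    class MaintenanceFilter(filters.BaseHostFilter):
        """Hide maintenance-mode nodes unless a hint names them."""

        def host_passes(self, host_state, filter_properties):
            if host_state.stats.get('maintenance') != 'true':
                # Healthy nodes are schedulable as usual.
                return True
            hints = filter_properties.get('scheduler_hints') or {}
            # Only pass a maintenance node when the request targets it
            # explicitly, e.g.: nova boot --hint force_node=<node> ...
            return hints.get('force_node') == host_state.nodename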

I don't see any reason why we can't do both.

-Deva