[openstack-dev] [Ironic] Manual scheduling nodes in maintenance mode

Devananda van der Veen devananda.vdv at gmail.com
Sat Mar 15 00:07:12 UTC 2014


+1 to the idea.

However, I think we should discuss whether the rescue interface is the
appropriate path. Its initial intention was to tie into Nova's rescue
interface, allowing a user whose instance is non-responsive to boot into a
recovery mode and access the data stored within their instance. I think
there are two different use-cases here:

Case A: a user of Nova who somehow breaks their instance, and wants to boot
into a "rescue" or "recovery" mode, preserving instance data. This is
useful if, e.g., they lost network access or broke their grub config.

Case B: an operator of the baremetal cloud whose hardware may be
malfunctioning, who wishes to hide that hardware from users of Case A while
they diagnose and fix the underlying problem.

As I see it, Nova's rescue API (and by extension, the same API in Ironic)
is intended for A, but not for B. TripleO's use case includes both of
these, and may be conflating them.

I believe Case A is addressed by the planned driver.rescue interface. As
for Case B, I think the solution is to use different tenants and move the
node between them. This is a more complex problem -- Ironic does not model
tenants, and AFAIK Nova doesn't reserve unallocated compute resources on a
per-tenant basis.

That said, I think we will need a way to indicate "this bare metal node
belongs to that tenant", regardless of the rescue use case.
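
To make the Case B idea concrete, here is a rough Python sketch of what
"move the node between tenants" could mean for scheduling. Everything in it
is an assumption made for illustration: the Node class, the owner_tenant
field, and both helper functions are hypothetical and are not part of
Ironic's or Nova's data model or API.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Node:
    # Hypothetical fields for illustration only; not Ironic's node model.
    uuid: str
    owner_tenant: Optional[str] = None  # None means "not reserved for anyone"


def move_to_tenant(node: Node, tenant: Optional[str]) -> None:
    """Reserve a node for one tenant, or release it by passing None."""
    node.owner_tenant = tenant


def schedulable_nodes(nodes: List[Node], tenant: str) -> List[Node]:
    """Return the nodes that the given tenant may be scheduled onto.

    A node is hidden from a tenant only when it is reserved for a
    different tenant.
    """
    return [n for n in nodes if n.owner_tenant in (None, tenant)]
```

Under these assumptions, an operator who moves a suspect node into an
operator-only tenant hides it from ordinary users, while still being able
to boot a diagnostic image on it from within that tenant.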

-Deva



On Fri, Mar 14, 2014 at 5:01 AM, Lucas Alvares Gomes
<lucasagomes at gmail.com> wrote:

> On Wed, Mar 12, 2014 at 8:07 PM, Chris Jones <cmsj at tenshu.net> wrote:
>
>>
>> Hey
>>
>> I wanted to throw out an idea that came to me while I was working on
>> diagnosing some hardware issues in the TripleO CD rack at the sprint last
>> week.
>>
>> Specifically, if a particular node has been dropped from automatic
>> scheduling by the operator, I think it would be super useful to be able to
>> still manually schedule the node. Examples might be that someone is
>> diagnosing a hardware issue and wants to boot an image that has all their
>> favourite diagnostic tools in it, or they might be booting an image they
>> use for updating firmware, etc. (frankly, just being able to boot a
>> generic, unmodified host OS on a node can be super useful if you're trying
>> to crash cart the machine for something hardware related).
>>
>> Any thoughts? :)
>>
>
> +1 I like the idea and think it's quite useful.
>
> Drivers in Ironic already expose a rescue interface[1] (which I don't
> think we have put much thought into yet). Perhaps the PXE driver could
> implement something similar to what you want to do here?
>
> [1]
> https://github.com/openstack/ironic/blob/master/ironic/drivers/base.py#L60