[openstack-dev] [ironic] Tooling for recovering nodes

Dmitry Tantsur dtantsur at redhat.com
Tue May 31 08:35:12 UTC 2016


On 05/31/2016 10:25 AM, Tan, Lin wrote:
> Hi,
>
> I have recently been working on a spec [1] to recover nodes that get stuck in the deploying state, and I would really appreciate your feedback.
>
> Ironic nodes can be stuck in deploying/deploywait/cleaning/cleanwait/inspecting/deleting if the node is reserved by a dead conductor (the exclusive lock was not released).
> Any further requests will be denied by ironic because it thinks the node resource is under control of another conductor.
>
> To be clearer, let's narrow the scope and focus on the deploying state first. Currently, there are several ways to clear the reserved lock:
> 1. Restart the dead conductor.
> 2. Wait up to 2 or 3 minutes, after which _check_deploying_states() will clear the lock.
> 3. Have the operator touch the DB to recover these nodes manually.
>
> Option 2 looks very promising, but it has some weaknesses:
> 2.1 It won't work if the dead conductor was renamed or deleted.
> 2.2 It won't work if the node's specific driver was not enabled on live conductors.
> 2.3 It won't work if the node is in maintenance (only a corner case).

We can and should fix all three cases.
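
As a rough illustration of what fixing all three cases could look like,
here is a minimal sketch. It is not Ironic code: the Node class, the
release_orphaned_locks() helper and the in-memory data are hypothetical
stand-ins for the real database and periodic task. The assumption is
that a lock is orphaned whenever its holder is not among the live
conductors, whatever the node's driver or maintenance flag.

```python
from dataclasses import dataclass
from typing import List, Optional, Set


@dataclass
class Node:
    """Hypothetical, simplified stand-in for an Ironic node record."""
    uuid: str
    provision_state: str
    reservation: Optional[str] = None  # hostname of the conductor holding the lock
    driver: str = "fake"
    maintenance: bool = False
    last_error: Optional[str] = None


def release_orphaned_locks(nodes: List[Node], live_conductors: Set[str]) -> List[str]:
    """Fail 'deploying' nodes whose lock holder is not a live conductor.

    Returns the UUIDs of the nodes that were recovered.
    """
    recovered = []
    for node in nodes:
        if node.provision_state != "deploying" or node.reservation is None:
            continue
        if node.reservation in live_conductors:
            continue  # the lock holder is alive, leave the node alone
        # The holder is offline, renamed or deleted (case 2.1). The driver
        # (case 2.2) and the maintenance flag (case 2.3) are deliberately
        # not consulted, so neither can block the recovery.
        node.last_error = f"conductor {node.reservation} went away during deploy"
        node.provision_state = "deploy failed"
        node.reservation = None
        recovered.append(node.uuid)
    return recovered
```

The point of the sketch is that the decision depends only on whether the
lock holder is still alive, so a renamed conductor, a disabled driver or
a node in maintenance does not leave the lock behind.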

>
> We should definitely improve option 2, but there could be more issues that I don't know about in more complicated environments.
> So my question is: do we still need a new command to recover these nodes more easily without accessing the DB, like this PoC [2]:
>   ironic-noderecover --node_uuids=UUID1,UUID2  --config-file=/etc/ironic/ironic.conf

I'm -1 on anything that silently removes the lock until I see a clear
use case that cannot be handled by improving Ironic itself. Such a
utility can and will be abused.

I'm fine with anything that does not forcibly remove the lock by default.
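
To make "not by default" concrete, here is a small sketch of how such a
tool could be shaped. Everything in it is hypothetical: the program
name, the --node-uuids and --force-unlock flags and the plan_actions()
helper are made up for illustration and are not taken from the PoC. By
default it only reports on the nodes; the lock is touched only when the
operator asks for it explicitly.

```python
import argparse
from typing import List


def build_parser() -> argparse.ArgumentParser:
    """Build the parser for a hypothetical recovery tool."""
    parser = argparse.ArgumentParser(prog="node-recover-sketch")
    parser.add_argument(
        "--node-uuids",
        required=True,
        # Split the comma-separated list and drop empty entries.
        type=lambda value: [u for u in value.split(",") if u],
        help="comma-separated list of node UUIDs",
    )
    parser.add_argument(
        "--force-unlock",
        action="store_true",
        help="actually release the lock (default: only report)",
    )
    return parser


def plan_actions(argv: List[str]) -> List[str]:
    """Return the action the tool would take for each node."""
    args = build_parser().parse_args(argv)
    verb = "unlock" if args.force_unlock else "report"
    return [f"{verb} {uuid}" for uuid in args.node_uuids]
```

With this shape, plan_actions(["--node-uuids=UUID1,UUID2"]) only plans
reports, and an unlock is planned only when --force-unlock is passed.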

>
> Best Regards,
>
> Tan
>
>
> [1] https://review.openstack.org/#/c/319812
> [2] https://review.openstack.org/#/c/311273/