[E] [ironic] How to move nodes from a 'clean failed' state into 'Available'
Igal Katzir
ikatzir at infinidat.com
Wed Mar 31 08:28:29 UTC 2021
Hello Forum,
Just for the record, the problem was resolved by restarting all the ironic
containers. I believe that restarting the UC node entirely would also have
fixed it.
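For anyone who lands here with the same symptom, a rough sketch of that restart follows. It assumes the undercloud runs its services under podman and that every ironic container has "ironic" in its name; neither detail is stated in this thread. The function name `restart_ironic_containers` and the `CONTAINER_CLI` variable are made up for illustration.

```shell
# Sketch only: restart every running container whose name contains "ironic".
# Assumption: containers are managed by podman; export CONTAINER_CLI=docker
# on an undercloud that uses docker instead.
restart_ironic_containers() {
    cli="${CONTAINER_CLI:-podman}"
    # Collect the matching container names, one per line.
    names="$(sudo "$cli" ps --filter name=ironic --format '{{.Names}}')" || return 1
    if [ -z "$names" ]; then
        echo "no ironic containers found" >&2
        return 1
    fi
    # Restart them one at a time and stop at the first failure.
    for name in $names; do
        sudo "$cli" restart "$name" || return 1
    done
}
```

Calling `restart_ironic_containers` with no arguments restarts whatever the filter matches. If the containers are wrapped in systemd units on your undercloud, restarting those units may be the better route.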
So after the ironic containers started fresh, PXE boot worked, and after
running 'openstack overcloud node introspect --all-manageable --provide'
the node list shows:
+--------------------------------------+------------+---------------+-------------+--------------------+-------------+
| UUID                                 | Name       | Instance UUID | Power State | Provisioning State | Maintenance |
+--------------------------------------+------------+---------------+-------------+--------------------+-------------+
| 588bc3f6-dc14-4a07-8e38-202540d046f8 | interop025 | None          | power off   | available          | False       |
| dceab84b-1d99-49b5-8f79-c589c0884269 | interop026 | None          | power off   | available          | False       |
+--------------------------------------+------------+---------------+-------------+--------------------+-------------+
I am now ready to deploy the overcloud.
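For reference, the state transitions Jay describes below (clean failed -> manageable -> available) can be sketched as a small shell function. This is a sketch, not the exact sequence run here. The function name `recover_clean_failed_node` is made up, and the `maintenance unset` step is an addition that nobody in this thread mentions; it is included only because the nodes were listed with Maintenance = True.

```shell
# Sketch: move one node from "clean failed" back towards "available".
# Argument: the node UUID or name.
recover_clean_failed_node() {
    node="$1"
    # Print why the last operation failed before retrying anything.
    openstack baremetal node show "$node" -f value -c last_error
    # Assumption (not from the thread): clear maintenance mode first.
    openstack baremetal node maintenance unset "$node" || return 1
    # clean failed -> manageable
    openstack baremetal node manage "$node" || return 1
    # manageable -> cleaning -> available, if cleaning succeeds this time
    openstack baremetal node provide "$node"
}
```

Usage would be along the lines of `recover_clean_failed_node 97b9a603-f64f-47c1-9fb4-6c68a5b38ff6`. If `last_error` points at something like the PXE/DHCP failure seen here, fix that first, otherwise the node will most likely fall straight back into 'clean failed'.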
thanks,
Igal
On Thu, Mar 25, 2021 at 12:48 AM Igal Katzir <ikatzir at infinidat.com> wrote:
> Thanks Jay,
> It gets into 'clean failed' state because it fails to boot into PXE mode.
> I don't understand why the DHCP server does not respond to the client's
> request; it's as if it remembers that the same client already received an
> IP in the past.
> Is there a way to clear the dnsmasq database of reservations?
> Igal
>
> On Wed, Mar 24, 2021 at 5:26 PM Jay Faulkner <
> jay.faulkner at verizonmedia.com> wrote:
>
>> A node in CLEAN FAILED must be moved to MANAGEABLE state before it can be
>> told to "provide" (which eventually puts it back in AVAILABLE).
>>
>> Try this:
>> `openstack baremetal node manage UUID`, then run the command with
>> "provide" as you did before.
>>
>> The available states and their transitions are documented here:
>> https://docs.openstack.org/ironic/latest/contributor/states.html
>>
>> I'll note that if cleaning failed, it's possible the node is
>> misconfigured in such a way that will cause all deployments and cleanings
>> to fail (e.g., if you're using Ironic with Nova, and you attempt to
>> provision a machine and it errors during deploy, Nova will by default
>> attempt to clean that node, which may be why you see it end up in clean
>> failed). So I strongly suggest you look at the last_error field on the node
>> and attempt to determine why the failure happened before retrying.
>>
>> Good luck!
>>
>> -Jay Faulkner
>>
>> On Wed, Mar 24, 2021 at 8:20 AM Igal Katzir <ikatzir at infinidat.com>
>> wrote:
>>
>>> Hello Team,
>>>
>>> I had a situation where my *undercloud node* had a problem with its
>>> disk and disconnected from the overcloud.
>>> I couldn’t restore the undercloud controller and ended up re-installing
>>> it (running 'openstack undercloud install').
>>> The installation ended successfully, but now I’m in a situation where
>>> cleanup of the deployed overcloud nodes fails:
>>>
>>> (undercloud) [stack@interop010 ~]$ openstack baremetal node list
>>> +--------------------------------------+------------+---------------+-------------+--------------------+-------------+
>>> | UUID                                 | Name       | Instance UUID | Power State | Provisioning State | Maintenance |
>>> +--------------------------------------+------------+---------------+-------------+--------------------+-------------+
>>> | 97b9a603-f64f-47c1-9fb4-6c68a5b38ff6 | interop025 | None          | power on    | clean failed       | True        |
>>> | 4b02703a-f765-4ebb-85ed-75e88b4cbea5 | interop026 | None          | power on    | clean failed       | True        |
>>> +--------------------------------------+------------+---------------+-------------+--------------------+-------------+
>>>
>>> I’ve tried to move a node to the available state but cannot:
>>> (undercloud) [stack@interop010 ~]$ openstack baremetal node provide 97b9a603-f64f-47c1-9fb4-6c68a5b38ff6
>>> The requested action "provide" can not be performed on node
>>> "97b9a603-f64f-47c1-9fb4-6c68a5b38ff6" while it is in state "clean failed".
>>> (HTTP 400)
>>>
>>> My question is:
>>> *How do I make the nodes available again?*
>>> as the deployment of overcloud fails with:
>>> ERROR due to "Message: No valid host was found. , Code: 500”
>>>
>>> Thanks,
>>> Igal
>>>
>>
>
> --
> Regards,
>
> *Igal Katzir*
> Cell +972-54-5597086
> Interoperability Team
> *INFINIDAT*