[ironic] Cannot move nodes from state 'clean failed' into provisioning state 'Available'

Julia Kreger juliaashleykreger at gmail.com
Wed Mar 24 17:31:01 UTC 2021


Knowing the versions and overall configuration might help, *but* often
these issues are just a typo in a MAC address or the wrong port. Can you
verify that the MAC address you're seeing DHCP requests from matches
what is recorded for the node in the `openstack baremetal port list`
output?
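
A minimal sketch of that comparison, assuming the undercloud credentials
(stackrc) are already sourced. The helper name `mac_registered` is
hypothetical and only wraps grep; the MAC in the usage comment is the one
from the tcpdump output quoted below.

```shell
# Hypothetical helper: exits 0 if the MAC given as $1 appears
# (case-insensitively, as a whole line) in the list of addresses
# read from stdin, one address per line.
mac_registered() {
    grep -qiFx -- "$1"
}

# Usage on the undercloud (not run here; needs the openstack CLI):
#   openstack baremetal port list -f value -c Address \
#       | mac_registered a0:36:9f:95:dd:e2 \
#       && echo "MAC is registered" || echo "MAC is NOT registered"
```

If the MAC is not in the list, the node is PXE booting from an interface
that Ironic has no port record for.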

On Wed, Mar 24, 2021 at 8:18 AM Igal Katzir <ikatzir at infinidat.com> wrote:
>
> Hello all,
>
> While troubleshooting this, another observation is that when I run 'provide' on the node:
> 'openstack baremetal node provide 97b9a603-f64f-47c1-9fb4-6c68a5b38ff6'
> It starts the cleaning process, then the node boots into PXE but the undercloud ignores it.
> When I tap the port I see that requests reach its interface:
>
> (undercloud) [stack@interop010 ~]$ sudo tcpdump -i br-ctlplane
> 10:43:10.600421 IP 0.0.0.0.bootpc > 255.255.255.255.bootps: BOOTP/DHCP, Request from a0:36:9f:95:dd:e2 (oui Unknown), length 548
>
> But at the same time dnsmasq ignores it:
> (undercloud) [stack@interop010 ~]$ sudo tail -f /var/log/containers/ironic-inspector/dnsmasq.log
> Mar 24 10:39:43 dnsmasq-dhcp[7]: DHCPDISCOVER(br-ctlplane) 6c:ae:8b:69:ee:80 ignored
> Mar 24 10:40:36 dnsmasq-dhcp[7]: DHCPDISCOVER(br-ctlplane) a0:36:9f:95:dd:e2 ignored
> Mar 24 10:40:39 dnsmasq-dhcp[7]: DHCPDISCOVER(br-ctlplane) a0:36:9f:95:dd:e2 ignored
> Mar 24 10:40:48 dnsmasq-dhcp[7]: DHCPDISCOVER(br-ctlplane) 6c:ae:8b:69:ee:80 ignored
> Mar 24 10:41:52 dnsmasq-dhcp[7]: DHCPDISCOVER(br-ctlplane) 6c:ae:8b:69:ee:80 ignored
> Mar 24 10:42:57 dnsmasq-dhcp[7]: DHCPDISCOVER(br-ctlplane) 6c:ae:8b:69:ee:80 ignored
> Mar 24 10:43:06 dnsmasq-dhcp[7]: DHCPDISCOVER(br-ctlplane) a0:36:9f:95:dd:e2 ignored
> Mar 24 10:43:10 dnsmasq-dhcp[7]: DHCPDISCOVER(br-ctlplane) a0:36:9f:95:dd:e2 ignored
> Mar 24 10:43:14 dnsmasq-dhcp[7]: DHCPDISCOVER(br-ctlplane) a0:36:9f:95:dd:e2 ignored
>
> Why is that?
> What is needed for the cleanup to start?
>
> Thanks,
> Igal
>
> On 24 Mar 2021, at 0:09, Igal Katzir <ikatzir at infinidat.com> wrote:
>
> Hello Team,
>
> I had a situation where my undercloud node had a problem with its disk and disconnected from the overcloud.
> I couldn’t restore the undercloud controller and ended up re-installing it (running 'openstack undercloud install').
> The installation ended successfully, but now cleanup of the deployed overcloud nodes fails:
>
> (undercloud) [stack@interop010 ~]$ openstack baremetal node list
> +--------------------------------------+------------+---------------+-------------+--------------------+-------------+
> | UUID                                 | Name       | Instance UUID | Power State | Provisioning State | Maintenance |
> +--------------------------------------+------------+---------------+-------------+--------------------+-------------+
> | 97b9a603-f64f-47c1-9fb4-6c68a5b38ff6 | interop025 | None          | power on    | clean failed       | True        |
> | 4b02703a-f765-4ebb-85ed-75e88b4cbea5 | interop026 | None          | power on    | clean failed       | True        |
> +--------------------------------------+------------+---------------+-------------+--------------------+-------------+
>
> I’ve tried to move the node to the available state but cannot:
> (undercloud) [stack@interop010 ~]$ openstack baremetal node provide 97b9a603-f64f-47c1-9fb4-6c68a5b38ff6
> The requested action "provide" can not be performed on node "97b9a603-f64f-47c1-9fb4-6c68a5b38ff6" while it is in state "clean failed". (HTTP 400)
>
> My question is:
> How do I make the nodes available again?
> The overcloud deployment currently fails with:
> ERROR due to "Message: No valid host was found. , Code: 500"
>
> Thanks,
> Igal
>
>


