[E] [ironic] How to move nodes from a 'clean failed' state into 'Available'
Igal Katzir
ikatzir at infinidat.com
Wed Mar 31 19:24:12 UTC 2021
Hi Julia,
How can I easily tell the ironic version?
This is an RHOSP 16.1 installation, so it's pretty much new.
Igal
On Wed, Mar 31, 2021 at 21:25, Julia Kreger <juliaashleykreger at gmail.com> wrote:
> Out of curiosity, is this a very new version of dnsmasq? or an older
> version? I ask because there have been some fixes and regressions
> related to dnsmasq updating its configuration and responding to
> machines appropriately. A version might be helpful, just to enable
> those of us who are curious to go double check things at a minimum.
>
> On Wed, Mar 31, 2021 at 1:28 AM Igal Katzir <ikatzir at infinidat.com> wrote:
> >
> > Hello Forum,
> > Just for the record, the problem was resolved by restarting all the ironic containers. I believe that restarting the UC node entirely would have also fixed it.
> > So after the ironic containers started fresh, PXE worked well, and after running 'openstack overcloud node introspect --all-manageable --provide' it shows:
> >
> > +--------------------------------------+------------+---------------+-------------+--------------------+-------------+
> > | UUID                                 | Name       | Instance UUID | Power State | Provisioning State | Maintenance |
> > +--------------------------------------+------------+---------------+-------------+--------------------+-------------+
> > | 588bc3f6-dc14-4a07-8e38-202540d046f8 | interop025 | None          | power off   | available          | False       |
> > | dceab84b-1d99-49b5-8f79-c589c0884269 | interop026 | None          | power off   | available          | False       |
> > +--------------------------------------+------------+---------------+-------------+--------------------+-------------+
> >
> > I am now ready for deployment of the overcloud.
> > thanks,
> > Igal
> >
> > On Thu, Mar 25, 2021 at 12:48 AM Igal Katzir <ikatzir at infinidat.com> wrote:
> >>
> >> Thanks Jay,
> >> It gets into the 'clean failed' state because it fails to boot into PXE mode.
> >> I don't understand why DHCP does not respond to the client's request; it's as if it remembers that the same client already received an IP in the past.
> >> Is there a way to clear the dnsmasq database of reservations?
> >> Igal
> >>
> >> On Wed, Mar 24, 2021 at 5:26 PM Jay Faulkner <jay.faulkner at verizonmedia.com> wrote:
> >>>
> >>> A node in CLEAN FAILED must be moved to MANAGEABLE state before it can be told to "provide" (which eventually puts it back in AVAILABLE).
> >>>
> >>> Try this:
> >>> `openstack baremetal node manage UUID`, then run the command with "provide" as you did before.
> >>>
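The two steps above can be sketched as a small shell helper. This is only an illustration under assumptions: the function name `recover_node` is hypothetical (it is not part of the `openstack` CLI), and it assumes the undercloud credentials are already sourced. It stops if the "manage" step fails, so "provide" is never attempted from the wrong state.

```shell
# Hypothetical helper (not from the thread): move one node out of
# "clean failed" using the two commands suggested above.
recover_node() {
    node="$1"
    # Step 1: clean failed -> manageable
    openstack baremetal node manage "$node" || return 1
    # Step 2: manageable -> cleaning -> available (only if cleaning succeeds)
    openstack baremetal node provide "$node"
}
```

Usage would be `recover_node 97b9a603-f64f-47c1-9fb4-6c68a5b38ff6`. If cleaning fails again, the node can be expected to land back in "clean failed".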
> >>> The available states and their transitions are documented here: https://docs.openstack.org/ironic/latest/contributor/states.html
> >>>
> >>> I'll note that if cleaning failed, it's possible the node is misconfigured in a way that will cause all deployments and cleanings to fail (e.g., if you're using Ironic with Nova and you attempt to provision a machine and it errors during deploy, Nova will by default attempt to clean that node, which may be why you see it end up in clean failed). So I strongly suggest you look at the last_error field on the node and attempt to determine why the failure happened before retrying.
> >>>
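As a rough sketch of the check suggested above, the node's state and its last_error field can be read with `openstack baremetal node show`. The helper name `show_node_failure` is hypothetical, and the sketch assumes the undercloud credentials are already sourced.

```shell
# Hypothetical helper (not from the thread): print the provisioning state
# and the last_error field of one node before retrying anything.
show_node_failure() {
    node="$1"
    # -f value -c <field> prints only the value of the requested field
    state="$(openstack baremetal node show "$node" -f value -c provision_state)" || return 1
    error="$(openstack baremetal node show "$node" -f value -c last_error)" || return 1
    printf 'state: %s\nlast_error: %s\n' "$state" "$error"
}
```

Running `show_node_failure <node UUID>` before "manage" and "provide" shows why cleaning failed, which helps avoid retrying blindly.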
> >>> Good luck!
> >>>
> >>> -Jay Faulkner
> >>>
> >>> On Wed, Mar 24, 2021 at 8:20 AM Igal Katzir <ikatzir at infinidat.com> wrote:
> >>>>
> >>>> Hello Team,
> >>>>
> >>>> I had a situation where my undercloud node had a problem with its disk and disconnected from the overcloud.
> >>>> I couldn't restore the undercloud controller and ended up re-installing it (running 'openstack undercloud install').
> >>>> The installation ended successfully, but now I'm in a situation where cleanup of the deployed overcloud nodes fails:
> >>>>
> >>>> (undercloud) [stack@interop010 ~]$ openstack baremetal node list
> >>>> +--------------------------------------+------------+---------------+-------------+--------------------+-------------+
> >>>> | UUID                                 | Name       | Instance UUID | Power State | Provisioning State | Maintenance |
> >>>> +--------------------------------------+------------+---------------+-------------+--------------------+-------------+
> >>>> | 97b9a603-f64f-47c1-9fb4-6c68a5b38ff6 | interop025 | None          | power on    | clean failed       | True        |
> >>>> | 4b02703a-f765-4ebb-85ed-75e88b4cbea5 | interop026 | None          | power on    | clean failed       | True        |
> >>>> +--------------------------------------+------------+---------------+-------------+--------------------+-------------+
> >>>>
> >>>> I've tried to move a node to the available state but cannot:
> >>>> (undercloud) [stack@interop010 ~]$ openstack baremetal node provide 97b9a603-f64f-47c1-9fb4-6c68a5b38ff6
> >>>> The requested action "provide" can not be performed on node "97b9a603-f64f-47c1-9fb4-6c68a5b38ff6" while it is in state "clean failed". (HTTP 400)
> >>>>
> >>>> My question is:
> >>>> How do I make the nodes available again? The deployment of the overcloud fails with:
> >>>> ERROR due to "Message: No valid host was found. , Code: 500"
> >>>>
> >>>> Thanks,
> >>>> Igal
> >>
> >>
> >>
> >> --
> >> Regards,
> >> Igal Katzir
> >> Cell +972-54-5597086
> >> Interoperability Team
> >> INFINIDAT