[Kolla-Ansible][Ironic] Baremetal node Cleaning fails in UEFI mode, but succeeds in Legacy Bios Mode
Julia Kreger
juliaashleykreger at gmail.com
Mon Aug 23 13:41:23 UTC 2021
Greetings Anirudh,
Could you post the ``openstack baremetal node show <uuid>`` output for
a node which is in this state, that is, one which is configured to boot
from local storage but is booting from the network? Along with that, it
would be helpful to know whether the machine is configured for UEFI.
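The fields of interest in that output can be pulled out with a short script. This is a minimal sketch, assuming the output was saved using the client's `-f json` formatter. The helper names `parse_capabilities` and `summarize_boot_config` are made up for illustration, and the exact set of fields may differ between Ironic releases.

```python
import json


def parse_capabilities(value):
    """Normalize capabilities into a dict.

    Handles the three shapes assumed here: an actual dict, a JSON
    string, or the 'key1:value1,key2:value2' form.
    """
    if isinstance(value, dict):
        return dict(value)
    if not value:
        return {}
    text = str(value).strip()
    if text.startswith("{"):
        return json.loads(text)
    caps = {}
    for item in text.split(","):
        key, sep, val = item.partition(":")
        if sep:
            caps[key.strip()] = val.strip()
    return caps


def summarize_boot_config(node):
    """Collect the boot-related settings from a parsed node record."""
    props = parse_capabilities((node.get("properties") or {}).get("capabilities"))
    inst = parse_capabilities((node.get("instance_info") or {}).get("capabilities"))
    return {
        "driver": node.get("driver"),
        "boot_interface": node.get("boot_interface"),
        # Instance-level capabilities take precedence in this sketch.
        "boot_mode": inst.get("boot_mode") or props.get("boot_mode"),
        "boot_option": inst.get("boot_option") or props.get("boot_option"),
    }
```

For example, after running `openstack baremetal node show <uuid> -f json > node.json`, calling `summarize_boot_config(json.load(open("node.json")))` returns one small dict showing whether the node is recorded as `uefi` and whether `boot_option` is `local`.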
Realistically, this is where using IPMI on modern hardware becomes a
problem, because there is no actual standard for the signaling
behavior as it relates to UEFI boot with IPMI. We encourage operators
to use Redfish instead, where that behavior is clearly delineated as
part of the standard.
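To illustrate what "delineated as part of the standard" means in practice, the Redfish ComputerSystem schema defines explicit properties for the override target, its persistence, and the boot mode. The sketch below only builds the request body that a client would send in a PATCH to a system resource. The function name `build_boot_override` is hypothetical, it covers only a subset of the targets the schema defines, and it is not a description of how Ironic's Redfish driver is implemented.

```python
import json

# A subset of the BootSourceOverrideTarget values defined by the schema.
ALLOWED_TARGETS = {"None", "Pxe", "Hdd", "Cd", "BiosSetup"}


def build_boot_override(target="Hdd", persistent=True, uefi=True):
    """Return the JSON-serializable body for a boot override request."""
    if target not in ALLOWED_TARGETS:
        raise ValueError(f"unsupported boot target: {target!r}")
    return {
        "Boot": {
            "BootSourceOverrideTarget": target,
            # "Once" applies to the next boot only; "Continuous" persists.
            "BootSourceOverrideEnabled": "Continuous" if persistent else "Once",
            # The boot mode is stated explicitly rather than left to the BMC.
            "BootSourceOverrideMode": "UEFI" if uefi else "Legacy",
        }
    }


def to_json(payload):
    """Serialize the payload the way it would appear on the wire."""
    return json.dumps(payload, sort_keys=True)
```

The point is that the target, the persistence, and the UEFI/Legacy mode each have their own named field, so there is less room for a BMC to interpret the request differently.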
One last thing. You may want to check and update BMC and system
firmware on your hardware.
On Mon, Aug 23, 2021 at 12:41 AM Anirudh Gupta <anyrude10 at gmail.com> wrote:
>
> Hi Julia,
>
> Thanks for your reply.
>
> One update: with the CentOS 8.4 partition disk image, I am able to provision the baremetal node successfully. With the CentOS 8.4 ISO and whole-disk image, the behaviour is unchanged: the node doesn't boot from the hard disk.
>
> Please find below my setup details:
>
> I am using an HP DL380 Gen9 server with BIOS P89 v2.76 (10/21/2019) and the IPMI utility.
>
> The hard disk is first in the boot order, followed by the 1GB NIC, which I have set to PXE.
>
> I can't find any logs in /var/log/ironic/deploy_logs. There is a folder /var/log/kolla/ironic/, but it contains no deploy_logs either.
>
> I have downloaded the kolla source image from Docker Hub:
>
> docker pull kolla/centos-source-ironic-conductor:wallaby
>
> Similar images have been downloaded by kolla-ansible for the other ironic components.
>
> Regards
> Anirudh Gupta
>
> On Fri, Aug 20, 2021 at 9:56 PM Julia Kreger <juliaashleykreger at gmail.com> wrote:
>>
>>
>>
>> On Fri, Aug 20, 2021 at 7:07 AM Anirudh Gupta <anyrude10 at gmail.com> wrote:
>>>
>>> Hi Mark,
>>>
>>> The issue reported in the previous conversation was caused by a problem with the cleaning image.
>>>
>>> This was successfully resolved.
>>> After setting the following parameter in the ironic.conf file:
>>> [pxe]
>>> uefi_ipxe_bootfile_name = ipxe-x86_64.efi
>>>
>>> the "node provide" command executed successfully and the node reached the "available" state.
>>>
>>> In Legacy:
>>> When I create the server using the "server create" command and pass a user image, the node installs the user image over the network and is then rebooted.
>>> After the reboot, it boots from the hard disk with the OS specified in the user image.
>>>
>>> In UEFI:
>>> When I create the server using the "server create" command and pass a user image, the procedure of installing the user image and rebooting remains the same.
>>> But after the reboot, despite the hard disk being set as the first priority, it starts booting over the network again and eventually fails.
>>>
>> This is very likely an issue with the vendor's firmware. We've seen some instances where the BMC refuses to honor the request to change the boot device, *or* where it is honored for a single boot operation only. In part, some of this may be due to improved support in handling UEFI boot signaling where the wrong thing could occur, at least with IPMI.
>>
>> In order to create a fix or workaround, we need the following information:
>>
>> Are you using IPMI or Redfish? If you're using IPMI, you should consider using Redfish.
>>
>> What is the hardware vendor?
>>
>> What is the BMC firmware version?
>>
>> Is the BMC set to always network boot by default?
>>
>> In UEFI, what does the machine report in its efibootmgr output? Your deployment agent logs already have this output in the journal; they are typically in /var/log/ironic/deploy_logs. We've seen some hardware act completely disjointed from the EFI NVRAM, or reset the EFI NVRAM when we request a one-time override.
>>
>> Most importantly, what is the version of ironic and ironic-python-agent?
>>
>>>
>>> I have also tried setting capabilities='boot_option:local' on both the baremetal node and the flavor, but the behaviour is the same.
>>>
>>> Regards
>>> Anirudh Gupta
>>>
>>>
>>>
>>>
>>> [trim]