[Kolla-Ansible][Ironic] Baremetal node Cleaning fails in UEFI mode, but succeeds in Legacy Bios Mode
Anirudh Gupta
anyrude10 at gmail.com
Wed Aug 25 05:27:30 UTC 2021
Hi Julia,
I have also upgraded my firmware to *P89 v2.90 (10/16/2020)*, but the
result is still the same.
For your reference, the output of openstack baremetal node show for the
whole disk image is as follows:
[ansible@localhost ~]$ openstack baremetal node show baremetal-node -f json
{
"allocation_uuid": null,
"automated_clean": null,
"bios_interface": "no-bios",
"boot_interface": "ipxe",
"chassis_uuid": null,
"clean_step": {},
"conductor": "controller",
"conductor_group": "",
"console_enabled": false,
"console_interface": "no-console",
"created_at": "2021-08-25T04:51:32+00:00",
"deploy_interface": "direct",
"deploy_step": {},
"description": null,
"driver": "ipmi",
"driver_info": {
"ipmi_port": 623,
"ipmi_username": "hsc",
"ipmi_password": "******",
"ipmi_address": "10.0.1.207",
"deploy_kernel": "a34b7e57-f324-40fe-8fe4-04eb7ea49c3a",
"deploy_ramdisk": "8db38567-4923-4322-b1bf-e12cce5cafc4"
},
"driver_internal_info": {
"clean_steps": null,
"agent_erase_devices_iterations": 1,
"agent_erase_devices_zeroize": true,
"agent_continue_if_secure_erase_failed": false,
"agent_continue_if_ata_erase_failed": false,
"agent_enable_nvme_secure_erase": true,
"agent_enable_ata_secure_erase": true,
"disk_erasure_concurrency": 1,
"agent_erase_skip_read_only": false,
"last_power_state_change": "2021-08-25T05:10:16.671639",
"agent_version": "7.0.2.dev10",
"agent_last_heartbeat": "2021-08-25T05:09:34.904605",
"hardware_manager_version": {
"MellanoxDeviceHardwareManager": "1",
"generic_hardware_manager": "1.1"
},
"agent_cached_clean_steps_refreshed": "2021-08-25 04:59:28.312524",
"is_whole_disk_image": true,
"deploy_steps": null,
"agent_cached_deploy_steps_refreshed": "2021-08-25 05:08:58.530633",
"root_uuid_or_disk_id": "0x3f3df0d8"
},
"extra": {},
"fault": null,
"inspect_interface": "no-inspect",
"inspection_finished_at": null,
"inspection_started_at": null,
"instance_info": {
"image_source": "da92cd5d-e1d6-458d-a2b2-86e897a982c6",
"root_gb": "470",
"swap_mb": "0",
"display_name": "server1",
"vcpus": "24",
"nova_host_id": "controller-ironic",
"memory_mb": "62700",
"local_gb": "470",
"configdrive": "******",
"image_disk_format": "raw",
"image_checksum": null,
"image_os_hash_algo": "sha512",
"image_os_hash_value":
"3b16d3a6734c23fb43fbd6deee16c907ea8e398bfd5163cd08f16ccd07a74399bb35f16a4713c3847058b445bf4150448f22eb11e75debcc548b8eaacf777e70",
"image_url": "******",
"image_container_format": "bare",
"image_tags": [],
"image_properties": {
"stores": "file",
"os_hidden": false,
"virtual_size": 3511681024,
"owner_specified.openstack.object": "images/centos-d",
"owner_specified.openstack.sha256": "",
"owner_specified.openstack.md5": ""
},
"image_type": "whole-disk-image"
},
"instance_uuid": "e29c267f-8ddb-4dce-a07c-18c4f7210010",
"last_error": null,
"lessee": null,
"maintenance": false,
"maintenance_reason": null,
"management_interface": "ipmitool",
"name": "baremetal-node",
"network_data": {},
"network_interface": "flat",
"owner": null,
"power_interface": "ipmitool",
"power_state": "power on",
"properties": {
"cpus": 30,
"memory_mb": 62700,
"local_gb": 470,
"cpu_arch": "x86_64",
"capabilities": "boot_mode:uefi,boot_option:local",
"vendor": "hewlett-packard"
},
"protected": false,
"protected_reason": null,
"provision_state": "active",
"provision_updated_at": "2021-08-25T05:10:37+00:00",
"raid_config": {},
"raid_interface": "no-raid",
"rescue_interface": "no-rescue",
"reservation": null,
"resource_class": "baremetal-resource-class",
"retired": false,
"retired_reason": null,
"storage_interface": "noop",
"target_power_state": null,
"target_provision_state": null,
"target_raid_config": {},
"traits": [],
"updated_at": "2021-08-25T05:10:37+00:00",
"uuid": "3caaffe3-a6be-4b8c-b3dd-d302c4367670",
"vendor_interface": "ipmitool"
}
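The boot-related fields in this output can be pulled out with a short script. This is only a minimal sketch: the helper names parse_capabilities and summarize_boot_settings are hypothetical and not part of Ironic or the openstack CLI, and the JSON below is a trimmed excerpt of the output above. It assumes the capabilities property keeps the comma-separated key:value form shown in the node properties.

```python
import json


def parse_capabilities(caps: str) -> dict:
    """Split a 'key:value,key:value' capabilities string into a dict."""
    result = {}
    for item in caps.split(","):
        key, sep, value = item.strip().partition(":")
        if sep:  # skip empty or malformed entries that have no ':'
            result[key.strip()] = value.strip()
    return result


def summarize_boot_settings(node: dict) -> dict:
    """Collect the fields that matter for the UEFI vs. legacy comparison."""
    caps = parse_capabilities(node.get("properties", {}).get("capabilities") or "")
    internal = node.get("driver_internal_info", {})
    return {
        "boot_mode": caps.get("boot_mode"),
        "boot_option": caps.get("boot_option"),
        "boot_interface": node.get("boot_interface"),
        "is_whole_disk_image": internal.get("is_whole_disk_image"),
    }


# Trimmed excerpt of the `openstack baremetal node show ... -f json` output above.
NODE_JSON = """
{
  "boot_interface": "ipxe",
  "driver_internal_info": {"is_whole_disk_image": true},
  "properties": {"capabilities": "boot_mode:uefi,boot_option:local"}
}
"""

summary = summarize_boot_settings(json.loads(NODE_JSON))
print(summary)
```

Running the same summary against the node deployed with the partition image would make it easier to see which of these fields, if any, differ between the two cases.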
I do not understand why this issue is not reproduced with the partition
disk image.
Regards
Anirudh Gupta
On Mon, Aug 23, 2021 at 7:11 PM Julia Kreger <juliaashleykreger at gmail.com>
wrote:
> Greetings Anirudh,
>
> Could you post your ``openstack baremetal node show <uuid>`` output
> for a node which is in this state, where it is configured to boot from
> local storage but is booting from the network? Along with that, it would be
> helpful to understand if the machine is configured for UEFI or not.
> Realistically this is where using IPMI on modern hardware becomes a
> problem, because there is no actual standard for the signaling
> behavior as it relates to UEFI boot with IPMI. We encourage operators
> to use Redfish instead as it is clearly delineated as part of the
> standard.
>
> One last thing. You may want to check and update BMC and system
> firmware on your hardware.
>
> On Mon, Aug 23, 2021 at 12:41 AM Anirudh Gupta <anyrude10 at gmail.com>
> wrote:
> >
> > Hi Julia,
> >
> > Thanks for your reply.
> >
> > There is also an update: with the CentOS 8.4 partition disk image, I am
> > able to successfully provision the baremetal node. With the CentOS 8.4 ISO
> > and whole-disk image, the behaviour is the same: it doesn't boot from the
> > hard disk.
> >
> > Please find below my setup details:
> >
> > I am using an HP DL380 Gen9 server with BIOS P89 v2.76 (10/21/2019) and
> > the IPMI utility.
> >
> > The hard disk is the first boot priority, followed by the 1GB NIC, which I
> > have set to PXE.
> >
> > I don't find any logs in /var/log/ironic/deploy_logs. There is a folder
> > /var/log/kolla/ironic/, but it contains no deploy_logs either.
> >
> > I have downloaded the kolla source image from docker hub
> >
> > docker pull kolla/centos-source-ironic-conductor:wallaby
> >
> > Similar images have been downloaded by kolla ansible for other ironic
> > components
> >
> > Regards
> > Anirudh Gupta
> >
> > On Fri, Aug 20, 2021 at 9:56 PM Julia Kreger <juliaashleykreger at gmail.com> wrote:
> >>
> >>
> >>
> >> On Fri, Aug 20, 2021 at 7:07 AM Anirudh Gupta <anyrude10 at gmail.com> wrote:
> >>>
> >>> Hi Mark,
> >>>
> >>> There was an issue with the cleaning image, which caused the problem
> >>> reported in the previous conversation.
> >>>
> >>> This was successfully resolved by setting the following parameter in
> >>> the ironic.conf file:
> >>> [pxe]
> >>> uefi_ipxe_bootfile_name = ipxe-x86_64.efi
> >>>
> >>> The "node provide" command then executed successfully and the node moved
> >>> to the "available" state.
> >>>
> >>> In Legacy:
> >>> When I create the server using the "server create" command with a user
> >>> image, the node installs the user image over the network and is then
> >>> rebooted.
> >>> After the reboot, it boots from the hard disk into the OS specified in
> >>> the user image.
> >>>
> >>> In UEFI:
> >>> When I create the server using the "server create" command with a user
> >>> image, the procedure of installing the user image and rebooting remains
> >>> the same.
> >>> But after the reboot, despite the hard disk being set as the first
> >>> priority, it again starts booting over the network and eventually fails.
> >>>
> >> This is very likely an issue with the vendor's firmware. We've seen
> >> some instances where the BMC refuses to honor the request to change
> >> *or* where it is honored for a single boot operation only. In part some
> >> of this may be due to improved support in handling UEFI boot signaling
> >> where the wrong thing could occur, at least with IPMI.
> >>
> >> In order to create a fix or workaround, we need the following
> >> information:
> >>
> >> Are you using IPMI or Redfish? If you're using IPMI, you should consider
> >> using Redfish.
> >>
> >> What is the hardware vendor?
> >>
> >> What is the BMC firmware version?
> >>
> >> Is the BMC set to always network boot by default completely?
> >>
> >> In UEFI, what does the machine report in the efibootmgr output? Your
> >> deployment agent logs actually have this output already in the journal,
> >> typically under /var/log/ironic/deploy_logs. We've seen some hardware act
> >> completely disjointed from the EFI NVRAM, or reset the EFI NVRAM when we
> >> request a one-time override.
> >>
> >> Most importantly, what is the version of ironic and ironic-python-agent?
> >>
> >>>
> >>> I have also tried passing capabilities='boot_option:local' on both the
> >>> baremetal node and the flavor, but the behaviour is the same.
> >>>
> >>> Regards
> >>> Anirudh Gupta
> >>>
> >>>
> >>>
> >>>
> >>> [trim]
>