[Kolla-Ansible][Ironic] Baremetal node Cleaning fails in UEFI mode, but succeeds in Legacy Bios Mode

Anirudh Gupta anyrude10 at gmail.com
Wed Aug 25 05:27:30 UTC 2021


Hi Julia,

I have also upgraded my firmware to *P89 v2.90 (10/16/2020)*, but the
result is still the same.
For your reference, the output of "openstack baremetal node show" for the
whole disk image is as follows:

[ansible at localhost ~]$ openstack baremetal node show baremetal-node -f json
{
  "allocation_uuid": null,
  "automated_clean": null,
  "bios_interface": "no-bios",
  "boot_interface": "ipxe",
  "chassis_uuid": null,
  "clean_step": {},
  "conductor": "controller",
  "conductor_group": "",
  "console_enabled": false,
  "console_interface": "no-console",
  "created_at": "2021-08-25T04:51:32+00:00",
  "deploy_interface": "direct",
  "deploy_step": {},
  "description": null,
  "driver": "ipmi",
  "driver_info": {
    "ipmi_port": 623,
    "ipmi_username": "hsc",
    "ipmi_password": "******",
    "ipmi_address": "10.0.1.207",
    "deploy_kernel": "a34b7e57-f324-40fe-8fe4-04eb7ea49c3a",
    "deploy_ramdisk": "8db38567-4923-4322-b1bf-e12cce5cafc4"
  },
  "driver_internal_info": {
    "clean_steps": null,
    "agent_erase_devices_iterations": 1,
    "agent_erase_devices_zeroize": true,
    "agent_continue_if_secure_erase_failed": false,
    "agent_continue_if_ata_erase_failed": false,
    "agent_enable_nvme_secure_erase": true,
    "agent_enable_ata_secure_erase": true,
    "disk_erasure_concurrency": 1,
    "agent_erase_skip_read_only": false,
    "last_power_state_change": "2021-08-25T05:10:16.671639",
    "agent_version": "7.0.2.dev10",
    "agent_last_heartbeat": "2021-08-25T05:09:34.904605",
    "hardware_manager_version": {
      "MellanoxDeviceHardwareManager": "1",
      "generic_hardware_manager": "1.1"
    },
    "agent_cached_clean_steps_refreshed": "2021-08-25 04:59:28.312524",
    "is_whole_disk_image": true,
    "deploy_steps": null,
    "agent_cached_deploy_steps_refreshed": "2021-08-25 05:08:58.530633",
    "root_uuid_or_disk_id": "0x3f3df0d8"
  },
  "extra": {},
  "fault": null,
  "inspect_interface": "no-inspect",
  "inspection_finished_at": null,
  "inspection_started_at": null,
  "instance_info": {
    "image_source": "da92cd5d-e1d6-458d-a2b2-86e897a982c6",
    "root_gb": "470",
    "swap_mb": "0",
    "display_name": "server1",
    "vcpus": "24",
    "nova_host_id": "controller-ironic",
    "memory_mb": "62700",
    "local_gb": "470",
    "configdrive": "******",
    "image_disk_format": "raw",
    "image_checksum": null,
    "image_os_hash_algo": "sha512",
    "image_os_hash_value": "3b16d3a6734c23fb43fbd6deee16c907ea8e398bfd5163cd08f16ccd07a74399bb35f16a4713c3847058b445bf4150448f22eb11e75debcc548b8eaacf777e70",
    "image_url": "******",
    "image_container_format": "bare",
    "image_tags": [],
    "image_properties": {
      "stores": "file",
      "os_hidden": false,
      "virtual_size": 3511681024,
      "owner_specified.openstack.object": "images/centos-d",
      "owner_specified.openstack.sha256": "",
      "owner_specified.openstack.md5": ""
    },
    "image_type": "whole-disk-image"
  },
  "instance_uuid": "e29c267f-8ddb-4dce-a07c-18c4f7210010",
  "last_error": null,
  "lessee": null,
  "maintenance": false,
  "maintenance_reason": null,
  "management_interface": "ipmitool",
  "name": "baremetal-node",
  "network_data": {},
  "network_interface": "flat",
  "owner": null,
  "power_interface": "ipmitool",
  "power_state": "power on",
  "properties": {
    "cpus": 30,
    "memory_mb": 62700,
    "local_gb": 470,
    "cpu_arch": "x86_64",
    "capabilities": "boot_mode:uefi,boot_option:local",
    "vendor": "hewlett-packard"
  },
  "protected": false,
  "protected_reason": null,
  "provision_state": "active",
  "provision_updated_at": "2021-08-25T05:10:37+00:00",
  "raid_config": {},
  "raid_interface": "no-raid",
  "rescue_interface": "no-rescue",
  "reservation": null,
  "resource_class": "baremetal-resource-class",
  "retired": false,
  "retired_reason": null,
  "storage_interface": "noop",
  "target_power_state": null,
  "target_provision_state": null,
  "target_raid_config": {},
  "traits": [],
  "updated_at": "2021-08-25T05:10:37+00:00",
  "uuid": "3caaffe3-a6be-4b8c-b3dd-d302c4367670",
  "vendor_interface": "ipmitool"
}
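
In case it helps when comparing the whole disk and partition image cases, the boot-related fields can be pulled out of this JSON with a short script. This is only a rough sketch: summarize_boot_settings is a hypothetical helper (not part of the openstack client), SAMPLE is a trimmed copy of the output above, and it assumes "capabilities" is the comma-separated key:value string shown there.

```python
import json


def summarize_boot_settings(node: dict) -> dict:
    """Collect the fields of a node record that influence how it boots.

    Hypothetical helper for illustration; it only reads keys that appear
    in the "openstack baremetal node show -f json" output above.
    """
    # Assumption: capabilities is a comma-separated list of key:value pairs.
    raw = (node.get("properties") or {}).get("capabilities") or ""
    caps = {}
    for item in raw.split(","):
        if ":" in item:
            key, value = item.split(":", 1)
            caps[key.strip()] = value.strip()

    internal = node.get("driver_internal_info") or {}
    instance = node.get("instance_info") or {}
    return {
        "driver": node.get("driver"),
        "boot_interface": node.get("boot_interface"),
        "management_interface": node.get("management_interface"),
        "boot_mode": caps.get("boot_mode"),
        "boot_option": caps.get("boot_option"),
        "is_whole_disk_image": internal.get("is_whole_disk_image"),
        "image_type": instance.get("image_type"),
    }


# Trimmed copy of the node record shown above.
SAMPLE = json.loads("""
{
  "boot_interface": "ipxe",
  "driver": "ipmi",
  "driver_internal_info": {"is_whole_disk_image": true},
  "instance_info": {"image_type": "whole-disk-image"},
  "management_interface": "ipmitool",
  "properties": {"capabilities": "boot_mode:uefi,boot_option:local"}
}
""")

if __name__ == "__main__":
    for field, value in summarize_boot_settings(SAMPLE).items():
        print(f"{field}: {value}")
```

To run it against a live node, the full output of "openstack baremetal node show baremetal-node -f json" could be saved to a file and loaded with json.load in place of SAMPLE.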


I do not understand why this issue does not reproduce with the partition
disk image.

Regards
Anirudh Gupta



On Mon, Aug 23, 2021 at 7:11 PM Julia Kreger <juliaashleykreger at gmail.com>
wrote:

> Greetings Anirudh,
>
> Could you post your ``openstack baremetal node show <uuid>`` output
> for a node which is in this state, where it is configured to boot from
> local storage but is booting from the network? Along with that, it would
> be helpful to understand if the machine is configured for UEFI or not.
> Realistically this is where using IPMI on modern hardware becomes a
> problem, because there is no actual standard for the signaling
> behavior as it relates to UEFI boot with IPMI. We encourage operators
> to use Redfish instead as it is clearly delineated as part of the
> standard.
>
> One last thing. You may want to check and update BMC and system
> firmware on your hardware.
>
> On Mon, Aug 23, 2021 at 12:41 AM Anirudh Gupta <anyrude10 at gmail.com>
> wrote:
> >
> > Hi Julia,
> >
> > Thanks for your reply.
> >
> > There is also an update: with the CentOS 8.4 partition disk image, I am
> > able to successfully provision the baremetal node. With the CentOS 8.4
> > ISO and whole disk image the behaviour is the same: it doesn't boot from
> > the hard disk.
> >
> > Please find below my setup details:
> >
> > I am using an HP DL380 Gen9 server with BIOS P89 v2.76 (10/21/2019) and
> > the IPMI utility.
> >
> > The hard disk is the first boot priority, followed by the 1GB NIC, which
> > I have set to PXE.
> >
> > I don't find any logs in /var/log/ironic/deploy_logs. There is a folder
> > /var/log/kolla/ironic/, but there are no deploy_logs in that folder.
> >
> > I have downloaded the kolla source image from docker hub
> >
> > docker pull kolla/centos-source-ironic-conductor:wallaby
> >
> > Similar images have been downloaded by kolla ansible for other ironic
> > components
> >
> > Regards
> > Anirudh Gupta
> >
> > On Fri, Aug 20, 2021 at 9:56 PM Julia Kreger <juliaashleykreger at gmail.com> wrote:
> >>
> >>
> >>
> >> On Fri, Aug 20, 2021 at 7:07 AM Anirudh Gupta <anyrude10 at gmail.com> wrote:
> >>>
> >>> Hi Mark,
> >>>
> >>> There was some issue with the cleaning image, which caused the issue
> >>> reported in the previous conversation.
> >>>
> >>> This was successfully resolved by setting the following parameter in
> >>> the ironic.conf file:
> >>> [pxe]
> >>> uefi_ipxe_bootfile_name = ipxe-x86_64.efi
> >>>
> >>> The "node provide" command then executed successfully and the node
> >>> came into the "available" state.
> >>>
> >>> In Legacy:
> >>> When I create the server using the "server create" command and pass a
> >>> user image, the node installs the user image over the network and is
> >>> then rebooted.
> >>> After the reboot, it boots from the hard disk with the OS specified in
> >>> the user image.
> >>>
> >>> In UEFI:
> >>> When I create the server using the "server create" command and pass a
> >>> user image, the procedure of installing the user image and rebooting
> >>> remains the same.
> >>> But after the reboot, despite the hard disk being set as the first
> >>> priority, it again starts booting over the network and eventually fails.
> >>>
> >> This is very, very likely an issue with the vendor's firmware. We've
> >> seen some instances where the BMC refuses to honor the request to change
> >> *or* where it is honored for a single boot operation only. In part, some
> >> of this may be due to improved support in handling UEFI boot signaling
> >> where the wrong thing could occur, at least with IPMI.
> >>
> >> In order to create a fix or workaround, we need the following
> >> information:
> >>
> >> Are you using IPMI or Redfish? If you're using IPMI, you should consider
> >> using Redfish.
> >>
> >> What is the hardware vendor?
> >>
> >> What is the BMC firmware version?
> >>
> >> Is the BMC set to always network boot by default completely?
> >>
> >> In UEFI, what does the machine report for the efibootmgr output? Your
> >> deployment agent logs actually have this output already in the journal,
> >> typically under /var/log/ironic/deploy_logs. We've seen some hardware act
> >> completely disjointed from the EFI NVRAM, or reset the EFI NVRAM when we
> >> request a one-time override.
> >>
> >> Most importantly, what is the version of ironic and ironic-python-agent?
> >>
> >>>
> >>> I have also tried passing capabilities='boot_option:local' on both the
> >>> baremetal node and the flavor, but the behaviour is the same.
> >>>
> >>> Regards
> >>> Anirudh Gupta
> >>>
> >>>
> >>>
> >>>
> >>> [trim]
>