<div dir="ltr"><div>Hello Mike,</div><div><br></div><div>We had this issue in the past and it was tracked upstream in a bug report [1]; the upstream fix was merged during the Xena cycle [2].<br></div><div>I would check the firmware version present on the machine and try upgrading it to see if that helps.<br></div><div>I've also noticed we just had another change in IPA with a fix for efibootmgr [3]; I'm not sure whether the image you downloaded from tarballs already contains it.<br></div><div><br></div><div>[1] <a href="https://storyboard.openstack.org/#!/story/2008962">https://storyboard.openstack.org/#!/story/2008962</a> <br>[2] <a href="https://review.opendev.org/c/openstack/ironic-python-agent/+/795862">https://review.opendev.org/c/openstack/ironic-python-agent/+/795862</a></div><div>[3] <a href="https://review.opendev.org/c/openstack/ironic-python-agent/+/881762">https://review.opendev.org/c/openstack/ironic-python-agent/+/881762</a> </div></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Thu, May 25, 2023 at 04:19, Mike Currin <<a href="mailto:mike@idia.ac.za">mike@idia.ac.za</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">Hi All,<br>
<br>
We have a Xena-based OpenStack deployment. We recently deployed 60+<br>
nodes in our research cluster with Ironic, which worked well. All of<br>
these were deployed using a standard process I'll describe below.<br>
<br>
We recently took delivery of a new Dell R6625 server with NVMe devices<br>
only, which only support UEFI boot, so we are trying to get that<br>
working.<br>
<br>
The server PXE boots and downloads the RAM disk and then the deploy image;<br>
once it starts running that, it immediately crashes (I assume when running<br>
linuxefi). We tested a UEFI deploy on an existing Dell R640 server.<br>
That server works with BIOS, but when we swapped it over to UEFI it did<br>
the same, so the crash isn't due to the much bigger/different architecture<br>
(AMD vs Intel) of the new server. We have a few older servers in a test setup<br>
(Dell R630s) which are working fine and don't show this<br>
behaviour. We haven't tried them on our production setup, as even<br>
if they worked it wouldn't help us move forward.<br>
<br>
I made a video showing this:<br>
<a href="https://www.dropbox.com/s/5jbn1qpylxaevqb/uefiboot2.mov?dl=0" rel="noreferrer" target="_blank">https://www.dropbox.com/s/5jbn1qpylxaevqb/uefiboot2.mov?dl=0</a><br>
In the iDRAC we just get "System BIOS has halted", and<br>
somewhere it said to change hardware that was recently added, which<br>
feels unlikely as these are 2 different servers, both working elsewhere, with<br>
totally different hardware.<br>
<br>
I've done an iDRAC serial console debug, but it isn't showing me much<br>
that is of any use.<br>
<br>
This is our entire process to deploy a node (some of it is once-off, of<br>
course; I've not included the network setup):<br>
<br>
openstack flavor create --ram 256000 --disk 20 --vcpus 32 --public our-baremetal<br>
openstack flavor set our-baremetal --property capabilities:boot_mode="uefi"<br>
<br>
We downloaded the latest (Xena) images from:<br>
<a href="https://tarballs.opendev.org/openstack/ironic-python-agent/dib/files/" rel="noreferrer" target="_blank">https://tarballs.opendev.org/openstack/ironic-python-agent/dib/files/</a><br>
I also tried the latest CentOS 9 ones just to try them; it made no difference.<br>
<br>
We then extract the image and make a small mod to the service start<br>
(we found it didn't bring the NIC up immediately, so we put a ping delay<br>
in the ExecStart), but that's not part of this problem.<br>
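<br>
To illustrate the idea of the ping delay, here is a minimal sketch of a wait loop, not the exact patch. The function name retry_until, the retry counts and the GATEWAY_IP variable are hypothetical and are not part of the IPA image.<br>
<pre>
```shell
#!/bin/sh
# retry_until MAX_TRIES DELAY_SECONDS COMMAND [ARGS...]
# Hypothetical helper: runs COMMAND until it succeeds or MAX_TRIES
# attempts have been made. Returns 0 on the first success, 1 otherwise.
retry_until() {
    max_tries=$1
    delay=$2
    shift 2
    attempt=1
    while [ "$attempt" -le "$max_tries" ]; do
        if "$@"; then
            return 0
        fi
        # Only sleep if another attempt is still to come.
        if [ "$attempt" -lt "$max_tries" ]; then
            sleep "$delay"
        fi
        attempt=$((attempt + 1))
    done
    return 1
}

# Example (GATEWAY_IP is an assumed variable): try for roughly a minute
# to reach the gateway before the agent is started.
# retry_until 30 2 ping -c 1 -W 1 "$GATEWAY_IP"
```
</pre>
A wrapper like this could be called before the agent command so that the service only starts once the network answers.<br>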
<br>
openstack image create --disk-format aki --container-format aki \<br>
  --public --file ironic-agent.kernel deploy-vmlinuz<br>
openstack image create --disk-format ari --container-format ari \<br>
  --public --file ironic-agent.initramfs-ping-patched deploy-initrd<br>
<br>
Then to do the deploy:<br>
export HOSTNAME=<hostname><br>
export MGMTIP=<idracip><br>
<br>
openstack baremetal node create --driver ipmi --name $HOSTNAME \<br>
  --driver-info ipmi_port=623 --driver-info ipmi_username=root \<br>
  --driver-info 'ipmi_password=<ourpassword>' \<br>
  --driver-info ipmi_address=$MGMTIP --resource-class baremetal-resource-class \<br>
  --property cpus=32 --property memory_mb=256000 --property local_gb=20 \<br>
  --property cpu_arch=x86_64 \<br>
  --driver-info deploy_ramdisk=$(openstack image show deploy-initrd -f value -c id) \<br>
  --driver-info deploy_kernel=$(openstack image show deploy-vmlinuz -f value -c id)<br>
NODE=$(openstack baremetal node show -f value -c uuid $HOSTNAME)<br>
openstack baremetal node set $NODE --property capabilities='boot_mode:uefi'<br>
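<br>
As a sanity check that the node really carries the UEFI capability, a small helper along these lines could be used. This is only a sketch: has_capability is a hypothetical function, and the commented usage assumes that the JSON output of the node show command nests the values under a "properties" key.<br>
<pre>
```shell
#!/bin/sh
# has_capability CAPS KEY VALUE
# Hypothetical helper: CAPS is a capabilities string in the
# "key:value,key2:value2" form. Succeeds only if KEY:VALUE is one
# of the comma-separated entries.
has_capability() {
    case ",$1," in
        *",$2:$3,"*) return 0 ;;
        *) return 1 ;;
    esac
}

# Example usage (assumes the openstack CLI, python3 and valid credentials;
# the JSON layout is an assumption):
# caps=$(openstack baremetal node show "$NODE" -f json -c properties |
#        python3 -c 'import json, sys; print(json.load(sys.stdin)["properties"].get("capabilities", ""))')
# has_capability "$caps" boot_mode uefi || echo "node is not set to UEFI"
```
</pre>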
<br>
openstack baremetal port create <MACADDRESS> --node $NODE \<br>
  --physical-network physnet3<br>
openstack baremetal node manage $NODE --wait && \<br>
  openstack baremetal node list && \<br>
  openstack baremetal node provide $NODE && \<br>
  openstack baremetal node list<br>
<br>
openstack server create --use-config-drive --image <ourimage> \<br>
  --flavor our-baremetal --security-group worker --network ironic-network \<br>
  --key-name <ourkeyname> servername<br>
<br>
Does anyone have any more info to help, or any suggestions as to<br>
something more I could try? I'm out of ideas. I know that UEFI itself<br>
works on both servers: we have a setup with Ubuntu MAAS and it can<br>
deploy perfectly fine using its process with the UEFI setup, so it's<br>
something in the Ironic deploy image that's causing us this problem.<br>
<br>
Regards,<br>
Mike<br>
<br>
</blockquote></div><br clear="all"><br><span class="gmail_signature_prefix">-- </span><br><div dir="ltr" class="gmail_signature"><div dir="ltr"><div><div dir="ltr"><div><div dir="ltr"><div dir="ltr"><div dir="ltr"><div dir="ltr"><div dir="ltr"><span style="background-color:rgb(255,255,255)"><font style="background-color:transparent"><div><div dir="ltr"><div><div style="color:rgb(136,136,136);font-family:arial,sans-serif;font-size:12.8px"><i style="font-size:13px"><font style="color:rgb(0,0,0)">Att[]'s</font><font color="#500050"><span style="color:rgb(0,0,0)"></span></font></i><font style="background-color:transparent"><div style="color:rgb(136,136,136);font-family:arial,sans-serif;font-size:12.8px"><i style="font-size:13px"><font color="#500050"><span style="color:rgb(0,0,0)">Iury Gregory Melo Ferreira</span> </font><br></i><i><font color="#000000">MSc in Computer Science at UFCG<br></font></i></div><div style="color:rgb(136,136,136);font-family:arial,sans-serif;font-size:12.8px"><i><font color="#000000">Ironic PTL </font></i><br><i><font color="#000000"><span style="background-color:rgb(255,255,255)"><font style="background-color:transparent"><i><font color="#000000">Senior Software Engineer at Red Hat Brazil</font></i></font></span></font></i></div></font></div><div><font style="font-family:arial,sans-serif;font-size:12.8px" color="#000000"><i>Social</i>:</font><font style="font-family:arial,sans-serif;font-size:12.8px"><font color="#888888"> </font><a href="https://www.linkedin.com/in/iurygregory" target="_blank"><font color="#0b5394">https://www.linkedin.com/in/iurygregory</font></a></font></div><div><i style="color:rgb(136,136,136);background-color:transparent;font-size:13px"><font color="#500050"><span style="color:rgb(0,0,0)">E-mail: </span> </font><a href="mailto:iurygregory@gmail.com" style="color:rgb(0,84,136)" target="_blank">iurygregory@gmail.com</a></i></div></div></div></div></font></span></div></div></div></div></div></div></div></div></div></div>