Hi Jay,

Thanks for your advice, and sorry for the late reply. To clarify, my setup has a controller node and a compute node that supports IPMI, and I am using kolla-ansible as my deployment tool. The controller node runs Ubuntu 20.04, and the compute (IPMI) node runs Ubuntu 22.04. I was able to enroll the node via Horizon as a bare metal node. However, when I attempt to boot an ISO image directly onto the bare metal node, I get a timeout error. Do I have to set up an instance before booting? Below is a more detailed walkthrough of how my installation works.

First, I uploaded three different images (kernel, ramdisk, ISO) into Glance using the following commands.
1. Creating the kernel and ramdisk images
$ curl https://tarballs.opendev.org/openstack/ironic-python-agent/dib/files/ipa-centos8-stable-yoga.kernel   -o /etc/kolla/config/ironic/ironic-agent.kernel
$ curl https://tarballs.opendev.org/openstack/ironic-python-agent/dib/files/ipa-centos8-stable-yoga.initramfs   -o /etc/kolla/config/ironic/ironic-agent.initramfs
$ openstack image create --disk-format aki --container-format aki --public   --file /etc/kolla/config/ironic/ironic-agent.kernel deploy-vmlinuz
$ openstack image create --disk-format ari --container-format ari --public   --file /etc/kolla/config/ironic/ironic-agent.initramfs deploy-initrd
2. Creating ISO image
$ disk-image-create -o output.qcow vm block-device-gpt ubuntu-minimal
$ openstack image create --disk-format=qcow2 --container-format=bare --file=output.qcow.qcow2 --property os_distro='ubuntu' output_test
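For reference, I double-checked that the three images registered correctly with commands like the following (just a sanity check; the names are the ones created above):
$ openstack image list
$ openstack image show deploy-vmlinuz -f value -c id -c status
$ openstack image show output_test -f value -c id -c status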

Next, here is how I enrolled my node. I provided the following details:
1. Node Info: node name, node driver (ipmi), properties (vendor: supermicro, cpu_arch: x86_64), instance info (image_resource: <ID of the ISO image [output_test]>, root_gb: 10)
2. Driver Details: deploy kernel (<ID of the kernel image [deploy-vmlinuz]>), deploy ramdisk (<ID of the ramdisk image [deploy-initrd]>), ipmi_address (<IP of the ethernet plugged into the supermicro IPMI port>), ipmi_bridging (no), agent_verify_ca (True), deploy_forces_oob_reboot (False), ('Default'), (False), ipmi_password (<password to login to IPMI>), ipmi_priv_level (ADMINISTRATOR), ipmi_protocol_version (2.0), ipmi_terminal_port (9091), ipmi_username (<username to login to IPMI>)
3. Driver Interfaces: boot (pxe), console (ipmitool-shellinabox), deploy (direct), inspect (inspector), management (ipmitool), network (flat), power (ipmitool), raid (agent), storage (noop), vendor (ipmitool)
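For clarity, my understanding is that the Horizon form above corresponds roughly to the following CLI enrollment (a sketch only, with placeholder values; note that the instance_info key Ironic expects for the image is image_source):
$ baremetal node create --name <node-name> --driver ipmi \
    --driver-info ipmi_address=<IPMI IP> \
    --driver-info ipmi_username=<IPMI username> \
    --driver-info ipmi_password=<IPMI password> \
    --driver-info deploy_kernel=<ID of deploy-vmlinuz> \
    --driver-info deploy_ramdisk=<ID of deploy-initrd> \
    --property cpu_arch=x86_64
$ baremetal node set <node-name> \
    --instance-info image_source=<ID of output_test> \
    --instance-info root_gb=10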

After the node was enrolled, I noticed that driver validation shows all interfaces (boot, deploy, management, network, power, console, inspect, raid, and storage) as valid. However, neither bios nor rescue is valid. I am unsure whether that will affect the inspection or deployment stages.
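In case it helps, this is how I checked the validation results (same node ID as above):
$ baremetal node validate <ID of the enrolled baremetal node>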

Next, here is how my network port is created and attached. I first created a port on my public network with the following details:
1. Info: Name, Enable Admin State (True), Device ID (<blank>), Device Owner (baremetal:none), Binding Host (<ID of the enrolled baremetal node>), MAC address (<blank>), Binding: VNIC Type (Bare Metal), Port Security (True)
2. Security Group: Port Security Groups (default)
After creating it, I obtained the MAC address of that port and ran the following command to register it with the bare metal node:
1. $ baremetal port create --node <ID of the enrolled baremetal node> --physical-network physnet1 <MAC address of the port obtained>
After creating the bare metal port, I then manually attach the Neutron port to the bare metal node as well, since it requires a virtual interface (VIF) to deploy.
1. $ baremetal node vif attach <ID of the enrolled baremetal node> <ID of the Neutron port created>
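To double-check the wiring, I also list the bare metal port and the attached VIF like this (same node ID as above):
$ baremetal port list --node <ID of the enrolled baremetal node>
$ baremetal node vif list <ID of the enrolled baremetal node>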

Finally, I moved the provision state of the node from manageable to available, and then from available to active. This starts the deployment process, and after about 30 minutes the Ironic service hits a timeout error. Let me know if you need any further information about the deployment or about any other concerns you have.
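For reference, the state changes were done with commands roughly equivalent to the following, and I watch the provision state and last error while the deployment runs (same node ID as above):
$ baremetal node provide <ID of the enrolled baremetal node>
$ baremetal node deploy <ID of the enrolled baremetal node>
$ baremetal node show <ID of the enrolled baremetal node> -f value -c provision_state -c last_error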

Best,
James


On Tue, May 28, 2024 at 4:59 AM Jay Faulkner <jay@gr-oss.io> wrote:
Hi James,

Can you give me a little more detail about your deployment? It sounds like you may be trying to deploy via an ISO with Ironic, which isn't really something we directly support.

You do have some choices:
- You can boot an ISO ephemerally using Ironic's ramdisk driver. This is unlikely to be what you want unless you're certain it fits your use case.
- You can use a tool like diskimage-builder ( https://docs.openstack.org/diskimage-builder/latest/ ) to build a bare metal image suitable for deployment.
- You can attempt to use a prebuilt cloud image from Ubuntu; but this may not work, as such images are often missing drivers required for physical bare metal.

Can you provide more detail about your installation if these aren't on the right track -- thanks!

-
Jay Faulkner

On Mon, May 27, 2024 at 12:41 PM James Leong <jamesleong123098@gmail.com> wrote:
Hi all, 

I am trying to play around with Ironic, and I am having an issue deploying an Ubuntu 20.04 ISO image onto my bare metal node. Since I am quite new to bare metal, I am not entirely sure what went wrong, as I couldn't find any error messages in my log files (ironic-conductor, ironic-api, ironic_neutron_agent, and nova-compute-ironic). My guess is that the Ironic inspection is hitting a timeout, which then causes the deployment to time out as well. However, I have no idea why the inspection is timing out. Any help would be appreciated.

Best,
James