Hi Team/Julia,
Thank you for your constant help, @Julia Kreger <juliaashleykreger@gmail.com>.
We decided to install the Wallaby release using online sources. We followed this link: https://docs.openstack.org/project-deploy-guide/tripleo-docs/wallaby/deploym...
The undercloud installation succeeded, but we found that every container except ironic_pxe_http was in a healthy state; ironic_pxe_http itself was unhealthy. We collected pcap files during node introspection at this point, and the following is our result:
[image: image.png]
As you can see, we are getting a read request from our baremetal node, but our TFTP server is not replying with the acknowledgement message. In normal cases, data transfer should begin at this point, which is not happening here.
Apart from this, the ironic_pxe_http container is in an unhealthy state, as mentioned previously. On inspecting the container, we get the following error:
```
"Log": [
  {
    "Start": "2023-08-22T13:03:17.113762224+05:30",
    "End": "2023-08-22T13:03:17.400862661+05:30",
    "ExitCode": 1,
    "Output": "/usr/sbin/httpd -DFOREGROUND\ncurl: (22) The requested URL returned error: 404 Not Found\n\n404 ca:ca:ca:9900::43:8088 0.000512 seconds"
  },
```
We think that ca:ca:ca:9900::43:8088 is not valid syntax. For IPv6 addresses, it should be [ca:ca:ca:9900::43]:8088. Kindly note that this is our assumption.
Could you please help us out with it?
Thanks and Regards,
Kushagra Gupta
On Fri, Aug 11, 2023 at 2:44 AM Julia Kreger <juliaashleykreger@gmail.com> wrote:
Greetings,
I would recommend verifying you can ping addresses, and then inspect firewall rules, since it sounds like the issue is rooted in the state of the undercloud node. I'm unaware of any specific configuration which would cause this, meaning you would realistically need to identify why the packets are not making it through to the service.
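A minimal sketch of such a reachability check, run from a machine other than the undercloud. The function name `can_connect`, the address `::1`, and port 8088 in the usage line are placeholders for illustration, not values confirmed for this deployment; a failed TCP connect only shows that something blocks the path and does not say whether a firewall rule or routing is the cause.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection resolves the host with getaddrinfo, so a bare
        # IPv6 literal (no brackets) is accepted here.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable networks.
        return False


if __name__ == "__main__":
    # Placeholder target: replace "::1" and 8088 with the undercloud
    # provisioning address and the port of the service being tested.
    print(can_connect("::1", 8088))
```

This only exercises TCP, so it can probe the HTTP side of PXE; TFTP runs over UDP and still needs a packet capture to verify.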
-Julia
On Thu, Aug 10, 2023 at 4:21 AM Lokendra Rathour <lokendrarathour@gmail.com> wrote:
Hi Julia/Team,
We are still unable to get IPv6 provisioning working. We need your help with the steps and report shared by Kushagra.
Thanks once again for your help.
-Lokendra
On Tue, Aug 8, 2023 at 6:12 PM Kushagr Gupta <kushagrguptasps.mun@gmail.com> wrote:
Hi Julia, Team,
Thank you for the response, @Julia Kreger <juliaashleykreger@gmail.com>.
On Thu, Jul 27, 2023 at 6:59 PM Julia Kreger <juliaashleykreger@gmail.com> wrote:
I guess what is weird in this entire thing is it sounds like you're shifting over to what appears to be OPROM boot code in a network interface card, which might not support v6. Then again a port mirrored packet capture would be the needful item to troubleshoot further.
We have set up a local dnsmasq DHCP server and a TFTP server on a VM and tried PXE booting the same set of hardware. The hardware boots over IPv6, so I think it supports IPv6 PXE booting.
Are you able to extract the exact command line which is being passed to the dnsmasq process for that container launch?
I guess I'm worried if somehow dnsmasq changed or if an old version is somehow in the container image you're using.
The command line that is executed is as follows:
"command": [
  "/bin/bash",
  "-c",
  "BIND_HOST=ca:ca:ca:9900::171; /usr/sbin/dnsmasq --keep-in-foreground --log-facility=/var/log/ironic/dnsmasq.log --user=root --conf-file=/dev/null --listen-address=$BIND_HOST --port=0 --enable-tftp --tftp-root=/var/lib/ironic/tftpboot"
],
We found this command in the following file:
/var/lib/tripleo-config/container-startup-config/step_4/ironic_pxe_tftp.json
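To pull the dnsmasq arguments out of that file without copying them by hand, something like the following could be used. It is a sketch only: it assumes the file is a JSON object with a top-level "command" key whose last element is the `bash -c` script, as in the excerpt above, and the helper name `dnsmasq_args` is made up for illustration.

```python
import json
import shlex

# Sample input mirroring the "command" excerpt quoted in this thread.
SAMPLE = json.dumps({
    "command": [
        "/bin/bash",
        "-c",
        "BIND_HOST=ca:ca:ca:9900::171; /usr/sbin/dnsmasq --keep-in-foreground "
        "--log-facility=/var/log/ironic/dnsmasq.log --user=root "
        "--conf-file=/dev/null --listen-address=$BIND_HOST --port=0 "
        "--enable-tftp --tftp-root=/var/lib/ironic/tftpboot",
    ]
})


def dnsmasq_args(config_text):
    """Return the dnsmasq invocation as a list of shell words."""
    command = json.loads(config_text)["command"]
    # Assumption: the last element is the script handed to `bash -c`.
    script = command[-1]
    # Drop the leading "BIND_HOST=...;" assignment, keep the dnsmasq call.
    dnsmasq_part = script.split(";", 1)[-1]
    # shlex does not expand variables, so $BIND_HOST stays literal.
    return shlex.split(dnsmasq_part)


if __name__ == "__main__":
    for word in dnsmasq_args(SAMPLE):
        print(word)
```

Against the real file, the text would be read with `open(path).read()` and passed to `dnsmasq_args`.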
Apart from this, we also tried to install the OpenStack Zed release. In this version, the ironic_pxe_tftp container is up and running, but we were still getting the same error:
[image: image.png]
We tried to curl the file that the TFTP container provides from a remote machine (not the undercloud), but we were unable to fetch it.
[image: image.png]
But when we do the same thing from the undercloud, it works fine:
[image: image.png]
We also set up an undercloud machine on IPv4 for comparison. When we tried to curl the image from a remote machine (not the undercloud) for this server, we were able to fetch it.
[image: image.png]
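The remote-versus-local comparison above can be scripted so that the IPv4 and IPv6 underclouds are tested the same way. The sketch below is illustrative: the helper names, port 8088, and the file name `boot.ipxe` are assumptions, not values confirmed for this setup. It also shows the bracket form that an IPv6 literal needs inside a URL.

```python
import ipaddress
import urllib.error
import urllib.request


def build_url(host, port, path):
    """Build an http:// URL, bracketing the host if it is an IPv6 literal."""
    try:
        is_v6 = ipaddress.ip_address(host).version == 6
    except ValueError:
        is_v6 = False  # not an IP literal, for example a hostname
    netloc = f"[{host}]:{port}" if is_v6 else f"{host}:{port}"
    return f"http://{netloc}/{path.lstrip('/')}"


def fetch_status(url, timeout=5.0):
    """Return the HTTP status code, or None if the server is unreachable."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as exc:
        return exc.code  # server answered, but with an error status
    except OSError:
        return None  # refused, timed out, or no route


if __name__ == "__main__":
    # Hypothetical port and file name; only the address is from this thread.
    print(build_url("ca:ca:ca:9900::171", 8088, "boot.ipxe"))
    print(build_url("192.0.2.10", 8088, "boot.ipxe"))
```

Calling `fetch_status(build_url(...))` from both the undercloud and a remote machine separates "server answers with an error" (a status code) from "packets never arrive" (`None`).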
On further digging, we found that in the Zed release, "ironic_pxe_tftp" is in a *healthy* state, while three containers, namely "ironic_api", "ironic_conductor", and "ironic_pxe_http", are in an *unhealthy* state but are up and running.
We re-installed the undercloud on a fresh machine and retried node introspection after performing basic tasks such as image upload and node registration. To our surprise, the introspection was successful, and the nodes moved to the available state:
[image: nodes_available_4.PNG]
At this point we were also able to curl the file from a random machine:
[image: image.png]
However, it all stopped working once we restarted the undercloud node, even though all the containers were up and running. We are investigating this issue further.
Thanks and Regards,
Kushagra Gupta
-- ~ Lokendra skype: lokendrarathour