[TripleO] Issue in bare metal provisioning from the overcloud
Hi Team,

I am trying to provision a bare metal node from my TripleO overcloud. While deploying the overcloud, I followed the "default flat" network approach described in the link below:

https://access.redhat.com/documentation/en-us/red_hat_openstack_platform/15/...

To highlight the changes, I defined ironic-config.yaml:

  parameter_defaults:
    ...
    ...
    IronicIPXEEnabled: true
    IronicInspectorSubnets:
      - ip_range: 172.23.3.100,172.23.3.150
    IronicInspectorInterface: 'br-baremetal'

I also modified ~/templates/network-environment.yaml:

  parameter_defaults:
    NeutronBridgeMappings: datacentre:br-ex,baremetal:br-baremetal
    NeutronFlatNetworks: datacentre,baremetal

With this, I followed all the steps for creating the br-baremetal bridge on the controller, given in the link below:

https://access.redhat.com/documentation/en-us/red_hat_openstack_platform/15/...

  - type: ovs_bridge
    name: br-baremetal
    use_dhcp: false
    members:
      - type: interface
        name: nic3

Post deployment, I also created a flat network on the "datacentre" physical network, a subnet with the range 172.23.3.200,172.23.3.240 (as suggested, the subnet is the same as the inspector's and the range is different), and a router.

I also created a bare metal node and ran "openstack baremetal node manage bm1", which succeeded.

Observation: on executing "openstack baremetal node provide bm1", the machine powers on; ideally it should take an IP from the ironic inspector range and PXE boot. But nothing of the sort happens, and we instead see an IP from the neutron range, 172.23.3.239 (screenshot attached).

[image: image.png]

I have checked the overcloud ironic inspector podman logs along with a tcpdump. In the tcpdump I can only see a DHCP discover request on br-baremetal, and nothing happens after that.

I have tried to explain my issue in detail, but I would be happy to share more details if still required. Can someone please help in resolving my issue?

Regards,
Anirudh Gupta
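The configuration above depends on the inspector iPXE range (172.23.3.100-150) and the neutron allocation pool (172.23.3.200-240) sharing one subnet without overlapping. A small illustrative check of that invariant (plain Python, not part of TripleO; the helper names are mine):

```python
import ipaddress

def parse_range(r):
    """Parse a 'start,end' ip_range string into (start, end) addresses."""
    start, end = (ipaddress.ip_address(p.strip()) for p in r.split(","))
    return start, end

def ranges_overlap(a, b):
    """True if two inclusive IP ranges share at least one address."""
    (a1, a2), (b1, b2) = parse_range(a), parse_range(b)
    return a1 <= b2 and b1 <= a2

inspector_pool = "172.23.3.100,172.23.3.150"  # IronicInspectorSubnets ip_range
neutron_pool = "172.23.3.200,172.23.3.240"    # provisioning subnet allocation pool

print(ranges_overlap(inspector_pool, neutron_pool))  # False: pools are disjoint
```

Both pools sit inside the same 172.23.3.0/24 subnet, as the documentation suggests, while staying disjoint so the two DHCP servers never hand out conflicting leases.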
My guess: you're running OVN. You need neutron-dhcp-agent running as well. OVN disables it by default, and OVN's integrated DHCP service does not support the options needed for network booting.

-Julia

On Thu, Feb 3, 2022 at 9:06 AM Anirudh Gupta <anyrude10@gmail.com> wrote:
Hi Julia,

Thanks for your response. For the overcloud deployment, I am executing the following command:

  openstack overcloud deploy --templates \
    -n /home/stack/templates/network_data.yaml \
    -r /home/stack/templates/roles_data.yaml \
    -e /home/stack/templates/node-info.yaml \
    -e /home/stack/templates/environment.yaml \
    -e /home/stack/templates/environments/network-isolation.yaml \
    -e /home/stack/templates/environments/network-environment.yaml \
    -e /usr/share/openstack-tripleo-heat-templates/environments/services/ironic.yaml \
    -e /usr/share/openstack-tripleo-heat-templates/environments/services/ironic-conductor.yaml \
    -e /usr/share/openstack-tripleo-heat-templates/environments/services/ironic-inspector.yaml \
    -e /usr/share/openstack-tripleo-heat-templates/environments/services/ironic-overcloud.yaml \
    -e /home/stack/templates/ironic-config.yaml \
    -e /usr/share/openstack-tripleo-heat-templates/environments/docker-ha.yaml \
    -e /usr/share/openstack-tripleo-heat-templates/environments/podman.yaml \
    -e /home/stack/containers-prepare-parameter.yaml

I can see some OVN-related entries in my roles_data.yaml and environments/network-isolation.yaml:

  [stack@undercloud ~]$ grep -inr "ovn"
  roles_data.yaml:34: OVNCMSOptions: "enable-chassis-as-gw"
  roles_data.yaml:168: - OS::TripleO::Services::OVNDBs
  roles_data.yaml:169: - OS::TripleO::Services::OVNController
  roles_data.yaml:279: - OS::TripleO::Services::OVNController
  roles_data.yaml:280: - OS::TripleO::Services::OVNMetadataAgent
  environments/network-isolation.yaml:16: OS::TripleO::Network::Ports::OVNDBsVipPort: ../network/ports/vip.yaml

It seems OVN is getting installed with ironic. What is your recommendation, and how do I disable OVN? Should I remove it from roles_data.yaml and then render, so that it doesn't get generated in environments/network-isolation.yaml? Please suggest some pointers.

Regards,
Anirudh Gupta

On Fri, Feb 4, 2022 at 1:36 AM Julia Kreger <juliaashleykreger@gmail.com> wrote:
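For readers unfamiliar with how the long list of -e files in the deploy command interacts: Heat environment files are applied in order, and a parameter_defaults value in a later file overrides the same key from an earlier one. A minimal sketch of that merge semantics (plain Python, not TripleO code; the example file contents are hypothetical):

```python
def merge_parameter_defaults(env_files):
    """Merge parameter_defaults dicts the way `openstack overcloud deploy`
    layers -e files: later files win on conflicting keys."""
    merged = {}
    for env in env_files:
        merged.update(env.get("parameter_defaults", {}))
    return merged

# Hypothetical contents of two -e files, for illustration only.
ironic_overcloud = {"parameter_defaults": {"DhcpAgentNotification": False}}
ironic_config = {"parameter_defaults": {"DhcpAgentNotification": True,
                                        "IronicIPXEEnabled": True}}

# ironic-config.yaml is passed after ironic-overcloud.yaml, so its value wins.
result = merge_parameter_defaults([ironic_overcloud, ironic_config])
print(result["DhcpAgentNotification"])  # True
```

This ordering is why custom overrides such as ironic-config.yaml are passed after the stock environments/services/ files.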
It is not a matter of disabling OVN, but of enabling the dnsmasq service and notifications.

https://github.com/openstack/tripleo-heat-templates/blob/master/environments... may provide some insight.

I suspect that if you're using stable/wallaby based branches and it is not working, a patch may need to be backported by the TripleO maintainers.

On Thu, Feb 3, 2022 at 8:02 PM Anirudh Gupta <anyrude10@gmail.com> wrote:
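In outline, what the referenced environment file does is re-enable the neutron DHCP agent alongside OVN and turn on DHCP agent notifications. A minimal sketch of such an override, assuming recent tripleo-heat-templates (the resource_registry path is an assumption; verify it against your branch before use):

```yaml
# Sketch: re-enable neutron-dhcp-agent in an OVN deployment.
# The template path below is an assumption for your branch of
# tripleo-heat-templates; confirm it before deploying.
resource_registry:
  OS::TripleO::Services::NeutronDhcpAgent: ../../deployment/neutron/neutron-dhcp-container-puppet.yaml

parameter_defaults:
  # Have neutron notify the DHCP agent, so dnsmasq can serve the
  # PXE/iPXE boot options that OVN's built-in DHCP cannot.
  DhcpAgentNotification: true
```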
Hi Julia,

Thanks for your response.

Earlier I was passing both ironic.yaml and ironic-overcloud.yaml, located at /usr/share/openstack-tripleo-heat-templates/environments/services/.

My current understanding is that since I am using OVN, not OVS, I should pass only ironic-overcloud.yaml in my deployment.

I am currently on the Train release, and my default ironic-overcloud.yaml file has no such entry:

  DhcpAgentNotification: true

I will add it there and redeploy the setup. Would that be enough to make my deployment successful?

Regards,
Anirudh Gupta

On Fri, 4 Feb, 2022, 18:40 Julia Kreger, <juliaashleykreger@gmail.com> wrote:
On Fri, Feb 4, 2022 at 5:50 AM Anirudh Gupta <anyrude10@gmail.com> wrote:

> I am currently on the Train release, and my default ironic-overcloud.yaml file has no such entry: DhcpAgentNotification: true
> I will add it there and redeploy the setup. Would that be enough to make my deployment successful?

I suspect that should work. Let us know if it does.
Hi,

Sure, I'll report back the status once it gets deployed. By the way, is the suspicion because of the Train release, or is it something else?

Regards,
Anirudh Gupta

On Fri, 4 Feb, 2022, 20:29 Julia Kreger, <juliaashleykreger@gmail.com> wrote:
Hi Julia,

Thanks a lot for your responses and support. To update on the ongoing issue: I tried deploying the overcloud with your suggestion, i.e. passing "DhcpAgentNotification: true" in ironic-overcloud.yaml.

The setup came up successfully, but with this configuration the IP allocated to the system is still one from the range configured while creating the subnet in OpenStack.

[image: image.png]

The system is still getting its IP (172.23.3.212) from neutron. The range was configured as 172.23.3.210-172.23.3.240 while creating the provisioning subnet. The system gets stuck here and no action is performed after this.

Is there any way to resolve this and successfully provision the bare metal node on the TripleO Train release? (Since RHOSP 16 was based on Train, I thought to go with that version for better stability.)

https://access.redhat.com/documentation/en-us/red_hat_openstack_platform/16....

I have some queries:

1. Is passing "DhcpAgentNotification: true" enough, or do we have to make some other changes as well?

2. Although there are some security concerns specified in the document, currently I am focusing on the default flat bare metal approach, which has a dedicated interface for bare metal provisioning. There is a composable approach as well. Keeping aside the security concerns, which approach is better and functional?
   https://access.redhat.com/documentation/en-us/red_hat_openstack_platform/16....

3. Will moving to a later OpenStack release make this deployment possible? If yes, which release should I go with, since up to Wallaby the ironic-overcloud.yaml file has no "DhcpAgentNotification: true" entry by default:
   https://github.com/openstack/tripleo-heat-templates/blob/stable/wallaby/envi...

Looking forward to your valuable feedback/response.

Regards,
Anirudh Gupta

On Fri, Feb 4, 2022 at 8:54 PM Anirudh Gupta <anyrude10@gmail.com> wrote:
Hi,
Surely I'll revert the status once it gets deployed. Bdw the suspicion is because of Train Release or it is something else?
Regards Anirudh Gupta
On Fri, 4 Feb, 2022, 20:29 Julia Kreger, <juliaashleykreger@gmail.com> wrote:
On Fri, Feb 4, 2022 at 5:50 AM Anirudh Gupta <anyrude10@gmail.com> wrote:
Hi Julia
Thanks for your response.
Earlier I was passing both ironic.yaml and ironic-overcloud.yaml located at path /usr/share/openstack-tripleo-heat-templates/environments/services/
My current understanding now says that since I am using OVN, not OVS so I should pass only ironic-overcloud.yaml in my deployment.
I am currently on Train Release and my default ironic-overcloud.yaml file has no such entry DhcpAgentNotification: true
I suspect that should work. Let us know if it does.
I will add this there and redeploy the setup.
Would that be enough to make my deployment successful?
Regards Anirudh Gupta
On Fri, 4 Feb, 2022, 18:40 Julia Kreger, <juliaashleykreger@gmail.com> wrote:
It is not a matter of disabling OVN, but a matter of enabling the dnsmasq service and notifications.
https://github.com/openstack/tripleo-heat-templates/blob/master/environments... may provide some insight.
I suspect if you're using stable/wallaby based branches and it is not working, there may need to be a patch backported by the TripleO maintainers.
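On releases where ironic-overcloud.yaml lacks the entry, the parameter being discussed can also be supplied through a small extra environment file instead of editing the packaged template. A minimal sketch (the file name and path are assumptions, not from this thread); since later `-e` files take precedence in a TripleO deploy command, it should be passed as the last `-e`:

```yaml
# Hypothetical override file, e.g. /home/stack/templates/dhcp-agent-override.yaml.
# Later -e environment files win in `openstack overcloud deploy`, so pass this last.
parameter_defaults:
  DhcpAgentNotification: true
```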
On Thu, Feb 3, 2022 at 8:02 PM Anirudh Gupta <anyrude10@gmail.com> wrote:
Hi Julia,
Thanks for your response. For the overcloud deployment, I am executing the following command:
openstack overcloud deploy --templates \
  -n /home/stack/templates/network_data.yaml \
  -r /home/stack/templates/roles_data.yaml \
  -e /home/stack/templates/node-info.yaml \
  -e /home/stack/templates/environment.yaml \
  -e /home/stack/templates/environments/network-isolation.yaml \
  -e /home/stack/templates/environments/network-environment.yaml \
  -e /usr/share/openstack-tripleo-heat-templates/environments/services/ironic.yaml \
  -e /usr/share/openstack-tripleo-heat-templates/environments/services/ironic-conductor.yaml \
  -e /usr/share/openstack-tripleo-heat-templates/environments/services/ironic-inspector.yaml \
  -e /usr/share/openstack-tripleo-heat-templates/environments/services/ironic-overcloud.yaml \
  -e /home/stack/templates/ironic-config.yaml \
  -e /usr/share/openstack-tripleo-heat-templates/environments/docker-ha.yaml \
  -e /usr/share/openstack-tripleo-heat-templates/environments/podman.yaml \
  -e /home/stack/containers-prepare-parameter.yaml
I can see some OVN-related entries in my roles_data.yaml and environments/network-isolation.yaml:

[stack@undercloud ~]$ grep -inr "ovn"
roles_data.yaml:34:        OVNCMSOptions: "enable-chassis-as-gw"
roles_data.yaml:168:    - OS::TripleO::Services::OVNDBs
roles_data.yaml:169:    - OS::TripleO::Services::OVNController
roles_data.yaml:279:    - OS::TripleO::Services::OVNController
roles_data.yaml:280:    - OS::TripleO::Services::OVNMetadataAgent
environments/network-isolation.yaml:16:  OS::TripleO::Network::Ports::OVNDBsVipPort: ../network/ports/vip.yaml
What is your recommendation, and how do I disable OVN? Should I remove it from roles_data.yaml and then re-render the templates so that it doesn't get generated in environments/network-isolation.yaml? Please suggest some pointers.
Regards Anirudh Gupta
It seems OVN is getting installed in ironic
On Fri, Feb 4, 2022 at 1:36 AM Julia Kreger < juliaashleykreger@gmail.com> wrote:
My guess: You're running OVN. You need neutron-dhcp-agent running as well. OVN disables it by default and OVN's integrated DHCP service does not support options for network booting.
-Julia
On Thu, Feb 3, 2022 at 9:06 AM Anirudh Gupta <anyrude10@gmail.com> wrote:
On Mon, Feb 7, 2022 at 4:47 AM Anirudh Gupta <anyrude10@gmail.com> wrote:
Hi Julia,
Thanks a lot for your responses and support. To Update on the ongoing issue, I tried deploying the overcloud with your valuable suggestions i.e by passing "*DhcpAgentNotification: true*" in ironic-overcloud.yaml The setup came up successfully, but with this configuration the IP allocated on the system is one which is being configured while creating the subnet in openstack.
[image: image.png]
The system is still getting the IP (172.23.3.212) from neutron. The subnet range was configured as *172.23.3.210-172.23.3.240 *while creating the provisioning subnet. The system gets stuck here and no action is performed after this.
Is there any way to resolve this and make successful provisioning the baremetal node in *TripleO Train Release* (Since RHOSP 16 was on Train, so I thought to go with that version for better stability)
https://access.redhat.com/documentation/en-us/red_hat_openstack_platform/16....
I have some queries:
1. Is passing "*DhcpAgentNotification: true" *enough or do we have to make some other changes as well?
I have no way to really know. Depending on the templates you have chosen and/or modified, the neutron-dhcp-agent can be disabled. It is easy to see whether it is in place by running `openstack network agent list` and looking for a neutron DHCP agent. If it is not present, something is disabling the agent, which is required for bare metal to function, as the integrated DHCP server in OVN does not support PXE options such as those used to facilitate network booting.
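As a sketch of that check: the agent-list output below is a hypothetical sample of an OVN-only deployment (agent names are assumptions); against a live cloud you would pipe the real command output instead of the here-doc.

```shell
#!/bin/sh
# Hypothetical `openstack network agent list` agent types for an OVN-only cloud;
# on a real deployment, replace the here-doc with:
#   openstack network agent list -f value -c "Agent Type"
agent_types=$(cat <<'EOF'
OVN Controller Gateway agent
OVN Controller agent
OVN Metadata agent
EOF
)

if printf '%s\n' "$agent_types" | grep -qi "dhcp agent"; then
    echo "neutron-dhcp-agent is present"
else
    echo "no neutron DHCP agent found: OVN's built-in DHCP cannot serve PXE boot options"
fi
```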
1. Although there are some security concerns specified in the document, but Currently I am focusing on the default flat bare metal approach which has dedicated interface for bare metal Provisioning. There is one composable method approach as well. Keeping aside the security concerns, which approach is better and functional? 1. https://access.redhat.com/documentation/en-us/red_hat_openstack_platform/16....
You're referencing RH docs, which makes me wonder if you're a RH customer, or if you're trying to use RH docs with upstream TripleO, which may not be ideal. If you are a RH customer, it wouldn't be a bad idea to reach out to RH support. Anyway, starting out you likely want to focus on the basics and not use a composable network. Once you have that working, it would make sense to evolve towards a composed network. Trying to do it now introduces more variables, which will make it harder to configure for your environment.
1. Will moving to upper openstack release version make this deployment possible? 1. If Yes, which release should I go with as till wallaby the ironic-overcloud.yml file has no option of including "*DhcpAgentNotification: true*" by default 1. https://github.com/openstack/tripleo-heat-templates/blob/stable/wallaby/envi...
Possibly, I honestly don't know the entire change history and interaction of the templates and overrides which exist with all the various options you can choose with TripleO.
Looking forward for your valuable feedback/response.
Regards Anirudh Gupta
On 2/7/22 13:47, Anirudh Gupta wrote:
Hi Julia,
Thanks a lot for your responses and support. To Update on the ongoing issue, I tried deploying the overcloud with your valuable suggestions i.e by passing "*DhcpAgentNotification: true*" in ironic-overcloud.yaml The setup came up successfully, but with this configuration the IP allocated on the system is one which is being configured while creating the subnet in openstack.
image.png
The system is still getting the IP (172.23.3.212) from neutron. The subnet range was configured as *172.23.3.210-172.23.3.240 *while creating the provisioning subnet.
The node is supposed to get an IP address from the neutron subnet on the provisioning network when: a) provisioning the node, or b) cleaning the node. When you do "baremetal node provide", cleaning is most likely initiated automatically, since cleaning is enabled by default for Ironic in the overcloud, AFAIK. The only time you will get an address from the IronicInspectorSubnets range (ip_range: 172.23.3.100,172.23.3.150 in your case) is when you start ironic node introspection.
The system gets stuck here and no action is performed after this.
It seems the system is getting an address from the expected DHCP server, but it does not boot. I would start by looking into the PXE properties in the DHCP reply.

What is the status of the node in ironic at this stage? `openstack baremetal node list`, `openstack baremetal node show <node-uuid>`

Check the `extra_dhcp_opts` on the neutron port; it should set the next-server and bootfile parameters. Does the bootfile exist in /var/lib/ironic/tftpboot? Inspect the `/var/lib/ironic/tftpboot/pxelinux.cfg/` directory; you should see a file matching the MAC address of your system. Does its content make sense?

Can you capture DHCP and TFTP traffic on the provisioning network?
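As a side note on that MAC-address check: PXELINUX looks for a config file named after the NIC's MAC address, prefixed with "01" (the ARP hardware type for Ethernet) and hyphen-separated. A small sketch (the MAC below is a made-up example):

```shell
#!/bin/sh
# Derive the pxelinux.cfg filename PXELINUX requests for a given MAC address:
# "01-" (Ethernet ARP hardware type) + the MAC lowercased with ":" replaced by "-".
mac="52:54:00:AB:CD:EF"                       # example MAC; use your node's port MAC
fname="01-$(printf '%s' "$mac" | tr 'A-Z:' 'a-z-')"
echo "$fname"                                 # prints: 01-52-54-00-ab-cd-ef
# On the controller, the file should then exist as:
#   /var/lib/ironic/tftpboot/pxelinux.cfg/$fname
```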
Is there any way to resolve this and make successful provisioning the baremetal node in *TripleO Train Release* (Since RHOSP 16 was on Train, so I thought to go with that version for better stability) https://access.redhat.com/documentation/en-us/red_hat_openstack_platform/16.... <https://access.redhat.com/documentation/en-us/red_hat_openstack_platform/16.2/html-single/release_notes/index>
I have some queries:
1. Is passing "*DhcpAgentNotification: true" *enough or do we have to make some other changes as well?
I believe in Train "DhcpAgentNotification" defaults to true. The change to default it to false was added more recently, and it was not backported. (https://review.opendev.org/c/openstack/tripleo-heat-templates/+/801761) Note: the environment file for enabling ironic in the overcloud, 'environments/services/ironic-overcloud.yaml', overrides this to 'true' in later releases.
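One way to confirm what the locally installed templates actually set is to grep for the parameter. The snippet below demonstrates the grep against a stand-in file (its content is only an assumed example of a heat environment file); on a real undercloud you would point it at the installed template tree instead:

```shell
#!/bin/sh
# Stand-in environment file (assumed shape) just to illustrate the check.
cat > /tmp/sample-env.yaml <<'EOF'
parameter_defaults:
  DhcpAgentNotification: true
EOF
grep -rn "DhcpAgentNotification" /tmp/sample-env.yaml
# On a real undercloud:
#   grep -rn "DhcpAgentNotification" /usr/share/openstack-tripleo-heat-templates/
```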
2. Although there are some security concerns specified in the document, but Currently I am focusing on the default flat bare metal approach which has dedicated interface for bare metal Provisioning. There is one composable method approach as well. Keeping aside the security concerns, which approach is better and functional? 1. https://access.redhat.com/documentation/en-us/red_hat_openstack_platform/16.... <https://access.redhat.com/documentation/en-us/red_hat_openstack_platform/16.2/html/bare_metal_provisioning/prerequisites-for-bare-metal-provisioning>
Both should work; using the composable network is more secure, since baremetal nodes do not have access to the control plane network.
3. Will moving to upper openstack release version make this deployment possible? 1. If Yes, which release should I go with as till wallaby the ironic-overcloud.yml file has no option of including "*DhcpAgentNotification: true*" by default 1. https://github.com/openstack/tripleo-heat-templates/blob/stable/wallaby/envi... <https://github.com/openstack/tripleo-heat-templates/blob/stable/wallaby/environments/services/ironic-overcloud.yaml>
Looking forward for your valuable feedback/response.
Regards Anirudh Gupta
Hi Harald,

Responding on behalf of Anirudh: thanks for the response. We now understand that we are getting an IP from the expected DHCP server. We tried the scenario, and here are our findings.

Our admin and internal endpoints are on the 30.30.30.x subnet, and public endpoints on 10.0.1.x:

(overcloud) [stack@undercloud ~]$ openstack endpoint list | grep ironic
| 04c163251e5546769446a4fa4fa20484 | regionOne | ironic           | baremetal               | True | admin    | http://30.30.30.213:6385 |
| 5c8557ae639a4898bdc6121f6e873724 | regionOne | ironic           | baremetal               | True | internal | http://30.30.30.213:6385 |
| 62e07a3b2f3f4158bb27d8603a8f5138 | regionOne | ironic-inspector | baremetal-introspection | True | public   | http://10.0.1.88:5050    |
| af29bd64513546409f44cc5d56ea1082 | regionOne | ironic-inspector | baremetal-introspection | True | internal | http://30.30.30.213:5050 |
| b76cdb5e77c54fc6b10cbfeada0e8bf5 | regionOne | ironic-inspector | baremetal-introspection | True | admin    | http://30.30.30.213:5050 |
| bd2954f41e49419f85669990eb59f51a | regionOne | ironic           | baremetal               | True | public   | http://10.0.1.88:6385    |

We are following the default flat network approach for ironic provisioning, for which we are creating a flat network on the baremetal physnet. We are still getting an IP from the neutron range (172.23.3.220 - 172.23.3.240), in this case 172.23.3.240.

Further, we found that once the IP (172.23.3.240) is allocated to the baremetal node, it looks to 30.30.30.220 (the IP of one of the three controllers) for PXE booting. Checking that controller's logs, we found that the `/var/lib/ironic/tftpboot/pxelinux.cfg/` directory exists, but there is no file matching the MAC address of the baremetal node.
Also, checking the extra_dhcp_opts we found this:

(overcloud) [stack@undercloud ~]$ openstack port show d7e573bf-1028-437a-8118-a2074c7573b2 | grep "extra_dhcp_opts"
| extra_dhcp_opts | ip_version='4', opt_name='tag:ipxe,67', opt_value='http://30.30.30.220:8088/boot.ipxe' |

[image: image.png]

A few observations:

1. Although the baremetal network (172.23.3.x) is routable to the admin network (30.30.30.x), it times out at this point.
2. In tcpdump, we are only seeing read requests.
3. `openstack baremetal node list`:

(overcloud) [stack@undercloud ~]$ openstack baremetal node list
+--------------------------------------+------+---------------+-------------+--------------------+-------------+
| UUID                                 | Name | Instance UUID | Power State | Provisioning State | Maintenance |
+--------------------------------------+------+---------------+-------------+--------------------+-------------+
| 7066fbe1-9c29-4702-9cd4-2b55daf19630 | bm1  | None          | power on    | clean wait         | False       |
+--------------------------------------+------+---------------+-------------+--------------------+-------------+

4. `openstack baremetal node show <node-uuid>`:
(overcloud) [stack@undercloud ~]$ openstack baremetal node show bm1
| allocation_uuid        | None |
| automated_clean        | None |
| bios_interface         | no-bios |
| boot_interface         | ipxe |
| chassis_uuid           | None |
| clean_step             | {} |
| conductor              | overcloud-controller-0.localdomain |
| conductor_group        | |
| console_enabled        | False |
| console_interface      | ipmitool-socat |
| created_at             | 2022-02-09T14:21:24+00:00 |
| deploy_interface       | iscsi |
| deploy_step            | {} |
| description            | None |
| driver                 | ipmi |
| driver_info            | {'ipmi_address': '10.0.1.183', 'ipmi_username': 'hsc', 'ipmi_password': '******', 'ipmi_terminal_port': 623, 'deploy_kernel': '9e1365b6-261a-42a2-abfe-40158945de57', 'deploy_ramdisk': 'fe608dd2-ce86-4faf-b4b8-cc5cb143eb56'} |
| driver_internal_info   | {'agent_erase_devices_iterations': 1, 'agent_erase_devices_zeroize': True, 'agent_continue_if_ata_erase_failed': False, 'agent_enable_ata_secure_erase': True, 'disk_erasure_concurrency': 1, 'last_power_state_change': '2022-02-09T14:23:39.525629'} |
| extra                  | {} |
| fault                  | None |
| inspect_interface      | inspector |
| inspection_finished_at | None |
| inspection_started_at  | None |
| instance_info          | {} |
| instance_uuid          | None |
| last_error             | None |
| maintenance            | False |
| maintenance_reason     | None |
| management_interface   | ipmitool |
| name                   | bm1 |
| network_interface      | flat |
| owner                  | None |
| power_interface        | ipmitool |
| power_state            | power on |
| properties             | {'cpus': 20, 'cpu_arch': 'x86_64', 'capabilities': 'boot_option:local,boot_mode:uefi', 'memory_mb': 63700, 'local_gb': 470, 'vendor': 'hewlett-packard'} |
| protected              | False |
| protected_reason       | None |
| provision_state        | clean wait |
| provision_updated_at   | 2022-02-09T14:24:05+00:00 |
| raid_config            | {} |
| raid_interface         | no-raid |
| rescue_interface       | agent |
| reservation            | None |
| resource_class         | bm1 |
| storage_interface      | noop |
| target_power_state     | None |
| target_provision_state | available |
| target_raid_config     | {} |
| traits                 | [] |
| updated_at             | 2022-02-09T14:24:05+00:00 |
| uuid                   | 7066fbe1-9c29-4702-9cd4-2b55daf19630 |
| vendor_interface       | ipmitool |
(overcloud) [stack@undercloud ~]$

Queries:
- What settings do we need so that the baremetal node PXE-boots and is provisioned successfully?

On Tue, Feb 8, 2022 at 6:27 PM Harald Jensas <hjensas@redhat.com> wrote:
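A quick way to sanity-check the iPXE handoff reported earlier in this message is to pull the boot URL out of the port's `extra_dhcp_opts` string and fetch it from a host on the provisioning network. The parsing below operates on the literal value from this thread; the `curl` step is left commented since it needs access to that network:

```shell
#!/bin/sh
# Extract the iPXE boot URL from the extra_dhcp_opts value shown above.
opts="ip_version='4', opt_name='tag:ipxe,67', opt_value='http://30.30.30.220:8088/boot.ipxe'"
url=$(printf '%s' "$opts" | sed -n "s/.*opt_value='\([^']*\)'.*/\1/p")
echo "$url"    # prints: http://30.30.30.220:8088/boot.ipxe
# From a host on the provisioning network, verify the script is actually served;
# a valid iPXE script begins with the line "#!ipxe":
#   curl -sSf "$url" | head -n 1
```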
On 2/7/22 13:47, Anirudh Gupta wrote:
Hi Julia,
Thanks a lot for your responses and support. To Update on the ongoing issue, I tried deploying the overcloud with your valuable suggestions i.e by passing "*DhcpAgentNotification: true*" in ironic-overcloud.yaml The setup came up successfully, but with this configuration the IP allocated on the system is one which is being configured while creating the subnet in openstack.
image.png
The system is still getting the IP (172.23.3.212) from neutron. The subnet range was configured as *172.23.3.210-172.23.3.240 *while creating the provisioning subnet.
The node is supposed to get an IP address from the neutron subnet on the provisioning network when: a) provisioning node b) cleaning node.
When you do "baremetal node provide" cleaning is most likely automatically initiated. (Since cleaning is enabled by default for Ironic in overcloud AFIK.)
The only time you will get an address from the IronicInspectorSubnets (ip_range: 172.23.3.100,172.23.3.150 in your case) is when you start ironic node introspection.
The system gets stuck here and no action is performed after this.
It seems the system is getting an address from the expected DHCP server, but it does not boot. I would start looking into the pxe properties in the DHCP Reply.
What is the status of the node in ironic at this stage? `openstack baremetal node list` `openstack baremetal node show <node-uuid>`
Check the `extra_dhcp_opts` on the neutron port, it should set the nextserver and bootfile parameters. Does the bootfile exist in /var/lib/ironic/tftpboot? Inspect the `/var/lib/ironic/tftpboot/pxelinux.cfg/` directory, you should see a file matching the MAC address of your system. Does the content make sense?
Can you capture DHCP and TFTP traffic on the provisioning network?
Is there any way to resolve this and provision the baremetal node successfully on the *TripleO Train release*? (Since RHOSP 16 is based on Train, I thought going with that version would give better stability.)
https://access.redhat.com/documentation/en-us/red_hat_openstack_platform/16....
I have some queries:
1. Is passing "*DhcpAgentNotification: true*" enough, or do we have to make some other changes as well?
I believe that in Train, "DhcpAgentNotification" defaults to true. The change making it default to false was added more recently, and it was not backported. (https://review.opendev.org/c/openstack/tripleo-heat-templates/+/801761)
NOTE: the environment for enabling ironic in the overcloud, 'environments/services/ironic-overcloud.yaml', overrides this to 'true' in later releases.
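If the parameter does need to be set explicitly on Train, a minimal environment-file sketch (the filename is illustrative; pass it with -e on the deploy command):

```yaml
# ~/templates/dhcp-agent-notification.yaml (hypothetical filename)
parameter_defaults:
  DhcpAgentNotification: true
```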
2. Although there are some security concerns noted in the document, currently I am focusing on the default flat bare metal approach, which has a dedicated interface for bare metal provisioning. There is also a composable network approach. Security concerns aside, which approach is better and functional? 1. https://access.redhat.com/documentation/en-us/red_hat_openstack_platform/16....
Both should work; using the composable network is more secure, since baremetal nodes do not have access to the control plane network.
3. Will moving to a later OpenStack release make this deployment possible? 1. If yes, which release should I go with, as up to Wallaby the ironic-overcloud.yaml file does not include "*DhcpAgentNotification: true*" by default. 1. https://github.com/openstack/tripleo-heat-templates/blob/stable/wallaby/envi...
Looking forward to your valuable feedback/response.
Regards Anirudh Gupta
On Fri, Feb 4, 2022 at 8:54 PM Anirudh Gupta <anyrude10@gmail.com> wrote:
Hi,
Surely I'll revert with the status once it gets deployed. By the way, is the suspicion because of the Train release, or something else?
Regards Anirudh Gupta
On Fri, 4 Feb, 2022, 20:29 Julia Kreger <juliaashleykreger@gmail.com> wrote:
On Fri, Feb 4, 2022 at 5:50 AM Anirudh Gupta <anyrude10@gmail.com> wrote:
Hi Julia
Thanks for your response.
Earlier I was passing both ironic.yaml and ironic-overcloud.yaml located at path
/usr/share/openstack-tripleo-heat-templates/environments/services/
My current understanding now is that since I am using OVN, not OVS, I should pass only ironic-overcloud.yaml in my deployment.
I am currently on Train Release and my default ironic-overcloud.yaml file has no such entry DhcpAgentNotification: true
I suspect that should work. Let us know if it does.
I would add this there and re deploy the setup.
Would that be enough to make my deployment successful?
Regards Anirudh Gupta
On Fri, 4 Feb, 2022, 18:40 Julia Kreger <juliaashleykreger@gmail.com> wrote:
It is not a matter of disabling OVN, but a matter of enabling the dnsmasq service and notifications.
https://github.com/openstack/tripleo-heat-templates/blob/master/environments...
may provide some insight.
I suspect if you're using stable/wallaby based branches and it is not working, there may need to be a patch backported by the TripleO maintainers.
On Thu, Feb 3, 2022 at 8:02 PM Anirudh Gupta <anyrude10@gmail.com> wrote:
Hi Julia,
Thanks for your response. For the overcloud deployment, I am executing the following command:
openstack overcloud deploy --templates \
  -n /home/stack/templates/network_data.yaml \
  -r /home/stack/templates/roles_data.yaml \
  -e /home/stack/templates/node-info.yaml \
  -e /home/stack/templates/environment.yaml \
  -e /home/stack/templates/environments/network-isolation.yaml \
  -e /home/stack/templates/environments/network-environment.yaml \
  -e /usr/share/openstack-tripleo-heat-templates/environments/services/ironic.yaml \
  -e /usr/share/openstack-tripleo-heat-templates/environments/services/ironic-conductor.yaml \
  -e /usr/share/openstack-tripleo-heat-templates/environments/services/ironic-inspector.yaml \
  -e /usr/share/openstack-tripleo-heat-templates/environments/services/ironic-overcloud.yaml \
  -e /home/stack/templates/ironic-config.yaml \
  -e /usr/share/openstack-tripleo-heat-templates/environments/docker-ha.yaml \
  -e /usr/share/openstack-tripleo-heat-templates/environments/podman.yaml \
  -e /home/stack/containers-prepare-parameter.yaml
I can see some OVN related stuff in my roles_data and environments/network-isolation.yaml
[stack@undercloud ~]$ grep -inr "ovn"
roles_data.yaml:34:    OVNCMSOptions: "enable-chassis-as-gw"
roles_data.yaml:168:  - OS::TripleO::Services::OVNDBs
roles_data.yaml:169:  - OS::TripleO::Services::OVNController
roles_data.yaml:279:  - OS::TripleO::Services::OVNController
roles_data.yaml:280:  - OS::TripleO::Services::OVNMetadataAgent
environments/network-isolation.yaml:16:  OS::TripleO::Network::Ports::OVNDBsVipPort: ../network/ports/vip.yaml

What is your recommendation, and how do I disable OVN? Should I remove it from roles_data.yaml and then render, so that it doesn't get generated in environments/network-isolation.yaml? Please suggest some pointers.
Regards,
Anirudh Gupta
It seems OVN is getting installed in ironic
On Fri, Feb 4, 2022 at 1:36 AM Julia Kreger <juliaashleykreger@gmail.com> wrote:
My guess: You're running OVN. You need neutron-dhcp-agent running as well. OVN disables it by default and OVN's integrated DHCP service does not support options for network booting.
-Julia
On Thu, Feb 3, 2022 at 9:06 AM Anirudh Gupta <anyrude10@gmail.com> wrote:
Hi Team
I am trying to provision a bare metal node from my TripleO overcloud. For this, while deploying the overcloud, I followed the *"default flat"* network approach specified in the link below:
https://access.redhat.com/documentation/en-us/red_hat_openstack_platform/15/...
Just to highlight the changes, I have defined *ironic-config.yaml*:

parameter_defaults:
  ...
  ...
  IronicIPXEEnabled: true
  IronicInspectorSubnets:
    - ip_range: 172.23.3.100,172.23.3.150
  IronicInspectorInterface: 'br-baremetal'
I also modified the file *~/templates/network-environment.yaml*:

parameter_defaults:
  NeutronBridgeMappings: datacentre:br-ex,baremetal:br-baremetal
  NeutronFlatNetworks: datacentre,baremetal
With this, I followed all the steps for creating the br-baremetal bridge on the controller, given in the link below:
https://access.redhat.com/documentation/en-us/red_hat_openstack_platform/15/...
- type: ovs_bridge
  name: br-baremetal
  use_dhcp: false
  members:
    - type: interface
      name: nic3
Post deployment, I also created a flat network on the "datacentre" physical network, a subnet with the range 172.23.3.200,172.23.3.240 (as suggested, the subnet is the same as the inspector's while the range is different), and the router.
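A sketch of that post-deployment step as CLI commands, written to a file for review before running (the network/subnet names, CIDR, and gateway are assumptions based on this thread). Note the physnet must match your NeutronBridgeMappings: the message above uses "datacentre", while the dedicated bridge is mapped as "baremetal", so verify which one your deployment expects.

```shell
# Assemble the flat provisioning network/subnet creation commands
# (illustrative names; addressing taken from this thread).
cat <<'EOF' > /tmp/create-baremetal-net.sh
openstack network create --provider-network-type flat \
    --provider-physical-network baremetal baremetal-net
openstack subnet create --network baremetal-net \
    --subnet-range 172.23.3.0/24 --gateway 172.23.3.1 \
    --allocation-pool start=172.23.3.200,end=172.23.3.240 \
    baremetal-subnet
EOF
echo "review, then run with overcloudrc sourced: bash /tmp/create-baremetal-net.sh"
```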
I also created a baremetal node and ran *"openstack baremetal node manage bm1"*, which succeeded.
Observation:
On executing "openstack baremetal node *provide* bm1", the machine powers on; ideally it should take an IP from the ironic inspector range and PXE boot. But nothing of the sort happens, and we see an IP from the neutron range, "*172.23.3.239*" (screenshot attached).
[image: image.png]
I have checked the overcloud ironic-inspector podman logs along with a tcpdump. In the tcpdump, I can only see a DHCP discover request on br-baremetal, and nothing happens after that.
I have tried to explain my issue in detail, but I would be happy to share more details if required. Can someone please help in resolving my issue?
Regards Anirudh Gupta
On 2/9/22 15:58, Lokendra Rathour wrote:
Hi Harald, Responding on behalf of Anirudh: thanks for the response. We now understand that we are getting an IP from the expected DHCP server.
We tried the scenario, and here are our findings. Our admin and internal endpoints are on subnet 30.30.30.x; public is on 10.0.1.x.
(overcloud) [stack@undercloud ~]$ openstack endpoint list | grep ironic
| 04c163251e5546769446a4fa4fa20484 | regionOne | ironic           | baremetal               | True | admin    | http://30.30.30.213:6385 |
| 5c8557ae639a4898bdc6121f6e873724 | regionOne | ironic           | baremetal               | True | internal | http://30.30.30.213:6385 |
| 62e07a3b2f3f4158bb27d8603a8f5138 | regionOne | ironic-inspector | baremetal-introspection | True | public   | http://10.0.1.88:5050    |
| af29bd64513546409f44cc5d56ea1082 | regionOne | ironic-inspector | baremetal-introspection | True | internal | http://30.30.30.213:5050 |
| b76cdb5e77c54fc6b10cbfeada0e8bf5 | regionOne | ironic-inspector | baremetal-introspection | True | admin    | http://30.30.30.213:5050 |
| bd2954f41e49419f85669990eb59f51a | regionOne | ironic           | baremetal               | True | public   | http://10.0.1.88:6385    |
(overcloud) [stack@undercloud ~]$
We are following the flat default network approach for ironic provisioning, for which we are creating a flat network on the baremetal physnet. We are still getting an IP from the neutron range (172.23.3.220 - 172.23.3.240): 172.23.3.240.
Further, we found that once the IP (172.23.3.240) is allocated to the baremetal node, it looks to 30.30.30.220 (the IP of one of the three controllers) for PXE booting. Checking that controller's logs, we found that
the `/var/lib/ironic/tftpboot/pxelinux.cfg/` directory exists, but there is *no file matching the MAC address* of the baremetal node.
Also, checking *extra_dhcp_opts* we found this:

(overcloud) [stack@undercloud ~]$ openstack port show d7e573bf-1028-437a-8118-a2074c7573b2 | grep "extra_dhcp_opts"
| extra_dhcp_opts | ip_version='4', opt_name='tag:ipxe,67', opt_value='http://30.30.30.220:8088/boot.ipxe' |
[image: image.png]

*A few observations:*
1. Although the baremetal network (172.23.3.x) is routable to the admin network (30.30.30.x), it times out at this point.
It should be able to download the file over a routed network.
2. In tcpdump, we are only getting read requests.
If you have access, check the switches and routers to see whether the traffic is being dropped/blocked somewhere on the path.

I'm not 100% sure what parameters you used when deploying, but did you try changing the ServiceNetMap for IronicApiNetwork and IronicNetwork, setting them to the name of the baremetal network (172.23.3.x)?

ServiceNetMap:
  IronicApiNetwork: baremetal_network
  IronicNetwork: baremetal_network

The result will be that the HTTP server listens on 172.23.3.x, and the extra_dhcp_opts should point to 172.23.3.x as well.
3. `openstack baremetal node list`

(overcloud) [stack@undercloud ~]$ openstack baremetal node list
+--------------------------------------+------+---------------+-------------+--------------------+-------------+
| UUID                                 | Name | Instance UUID | Power State | Provisioning State | Maintenance |
+--------------------------------------+------+---------------+-------------+--------------------+-------------+
| 7066fbe1-9c29-4702-9cd4-2b55daf19630 | bm1  | None          | power on    | clean wait         | False       |
+--------------------------------------+------+---------------+-------------+--------------------+-------------+

4. `openstack baremetal node show <node-uuid>`
(overcloud) [stack@undercloud ~]$ openstack baremetal node show bm1
+------------------------+-------+
| Field                  | Value |
+------------------------+-------+
| allocation_uuid        | None |
| automated_clean        | None |
| bios_interface         | no-bios |
| boot_interface         | ipxe |
| chassis_uuid           | None |
| clean_step             | {} |
| conductor              | overcloud-controller-0.localdomain |
| conductor_group        | |
| console_enabled        | False |
| console_interface      | ipmitool-socat |
| created_at             | 2022-02-09T14:21:24+00:00 |
| deploy_interface       | iscsi |
| deploy_step            | {} |
| description            | None |
| driver                 | ipmi |
| driver_info            | {'ipmi_address': '10.0.1.183', 'ipmi_username': 'hsc', 'ipmi_password': '******', 'ipmi_terminal_port': 623, 'deploy_kernel': '9e1365b6-261a-42a2-abfe-40158945de57', 'deploy_ramdisk': 'fe608dd2-ce86-4faf-b4b8-cc5cb143eb56'} |
| driver_internal_info   | {'agent_erase_devices_iterations': 1, 'agent_erase_devices_zeroize': True, 'agent_continue_if_ata_erase_failed': False, 'agent_enable_ata_secure_erase': True, 'disk_erasure_concurrency': 1, 'last_power_state_change': '2022-02-09T14:23:39.525629'} |
| extra                  | {} |
| fault                  | None |
| inspect_interface      | inspector |
| inspection_finished_at | None |
| inspection_started_at  | None |
| instance_info          | {} |
| instance_uuid          | None |
| last_error             | None |
| maintenance            | False |
| maintenance_reason     | None |
| management_interface   | ipmitool |
| name                   | bm1 |
| network_interface      | flat |
| owner                  | None |
| power_interface        | ipmitool |
| power_state            | power on |
| properties             | {'cpus': 20, 'cpu_arch': 'x86_64', 'capabilities': 'boot_option:local,boot_mode:uefi', 'memory_mb': 63700, 'local_gb': 470, 'vendor': 'hewlett-packard'} |
| protected              | False |
| protected_reason       | None |
| provision_state        | clean wait |
| provision_updated_at   | 2022-02-09T14:24:05+00:00 |
| raid_config            | {} |
| raid_interface         | no-raid |
| rescue_interface       | agent |
| reservation            | None |
| resource_class         | bm1 |
| storage_interface      | noop |
| target_power_state     | None |
| target_provision_state | available |
| target_raid_config     | {} |
| traits                 | [] |
| updated_at             | 2022-02-09T14:24:05+00:00 |
| uuid                   | 7066fbe1-9c29-4702-9cd4-2b55daf19630 |
| vendor_interface       | ipmitool |
+------------------------+--------------------------------------+
(overcloud) [stack@undercloud ~]$
*Queries:*

- What settings do we need for the baremetal node to PXE-boot and be provisioned successfully?
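Harald's ServiceNetMap suggestion above would typically be expressed as a custom environment file passed with -e at deploy time. A minimal sketch (the filename is illustrative, and 'baremetal_network' must match a network name defined in your network_data.yaml):

```yaml
# ~/templates/ironic-servicenetmap.yaml (hypothetical filename)
parameter_defaults:
  ServiceNetMap:
    IronicApiNetwork: baremetal_network
    IronicNetwork: baremetal_network
```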
Hi Harald,

Thanks once again for your support. We tried activating the parameters:

ServiceNetMap:
  IronicApiNetwork: provisioning
  IronicNetwork: provisioning

in environments/network-environment.yaml.

[image: image.png]

After changing these values, both updated and fresh deployments are failing. The command that we are using to deploy the OpenStack overcloud:

openstack overcloud deploy --templates \
  -n /home/stack/templates/network_data.yaml \
  -r /home/stack/templates/roles_data.yaml \
  -e /home/stack/templates/node-info.yaml \
  -e /home/stack/templates/environment.yaml \
  -e /home/stack/templates/environments/network-isolation.yaml \
  -e /home/stack/templates/environments/network-environment.yaml \
  -e /usr/share/openstack-tripleo-heat-templates/environments/services/ironic-conductor.yaml \
  -e /usr/share/openstack-tripleo-heat-templates/environments/services/ironic-inspector.yaml \
  -e /usr/share/openstack-tripleo-heat-templates/environments/services/ironic-overcloud.yaml \
  -e /home/stack/templates/ironic-config.yaml \
  -e /usr/share/openstack-tripleo-heat-templates/environments/docker-ha.yaml \
  -e /usr/share/openstack-tripleo-heat-templates/environments/podman.yaml \
  -e /home/stack/containers-prepare-parameter.yaml

/home/stack/templates/ironic-config.yaml:

(overcloud) [stack@undercloud ~]$ cat /home/stack/templates/ironic-config.yaml
parameter_defaults:
  IronicEnabledHardwareTypes:
    - ipmi
    - redfish
  IronicEnabledPowerInterfaces:
    - ipmitool
    - redfish
  IronicEnabledManagementInterfaces:
    - ipmitool
    - redfish
  IronicCleaningDiskErase: metadata
  IronicIPXEEnabled: true
  IronicInspectorSubnets:
    - ip_range: 172.23.3.100,172.23.3.150
  IPAImageURLs: '["http://30.30.30.1:8088/agent.kernel", "http://30.30.30.1:8088/agent.ramdisk"]'
  IronicInspectorInterface: 'br-baremetal'

Also, the baremetal (provisioning) network (172.23.3.x) is routed to the ctlplane/admin network (30.30.30.x).

*Queries:*

1. Is there any other location/way where we should add these so that they are included without error?
ServiceNetMap:
  IronicApiNetwork: provisioning
  IronicNetwork: provisioning

2. Also, are the commands mentioned above the right way to configure the Baremetal services?

Best Regards,
Lokendra

On Wed, Feb 9, 2022 at 11:55 PM Harald Jensas <hjensas@redhat.com> wrote:
On 2/9/22 15:58, Lokendra Rathour wrote:
Hi Harald, Responding on behalf of Anirudh's email: Thanks for the response and we now do understand that we are getting IP from the expected DHCP server.
We tried the scenario and here are our findings, Our admin and internal endpoints are on subnet: 30.30.30.x public : 10.0.1.x
(overcloud) [stack@undercloud ~]$ *OpenStack endpoint list | grep ironic* | 04c163251e5546769446a4fa4fa20484 | regionOne | ironic | baremetal | True | admin | http://30.30.30.213:6385 <http://30.30.30.213:6385> | | 5c8557ae639a4898bdc6121f6e873724 | regionOne | ironic | baremetal | True | internal | http://30.30.30.213:6385 <http://30.30.30.213:6385> | | 62e07a3b2f3f4158bb27d8603a8f5138 | regionOne | ironic-inspector | baremetal-introspection | True | public | http://10.0.1.88:5050 <http://10.0.1.88:5050> | | af29bd64513546409f44cc5d56ea1082 | regionOne | ironic-inspector | baremetal-introspection | True | internal | http://30.30.30.213:5050 <http://30.30.30.213:5050> | | b76cdb5e77c54fc6b10cbfeada0e8bf5 | regionOne | ironic-inspector | baremetal-introspection | True | admin | http://30.30.30.213:5050 <http://30.30.30.213:5050> | | bd2954f41e49419f85669990eb59f51a | regionOne | ironic | baremetal | True | public | http://10.0.1.88:6385 <http://10.0.1.88:6385> | (overcloud) [stack@undercloud ~]$
we are following the flat default n/w approach for ironic provisioning, for which we are creating a flat network on baremetal physnet. we are still getting IP from neutron range (172.23.3.220 - 172.23.3.240) - 172.23.3.240.
Further, we found that once IP (172.23.3.240) is allocated to baremetal node, it looks for 30.30.30.220( IP of one of the three controllers) for pxe booting. Checking the same controllers logs we found that
*`/var/lib/ironic/tftpboot/pxelinux.cfg/` directory exists,* but then there is *no file matching the mac *address of the baremetal node.
Also checking the *extra_dhcp_opts* we found this: (overcloud) [stack@undercloud ~]$ *openstack port show d7e573bf-1028-437a-8118-a2074c7573b2 | grep "extra_dhcp_opts"*
| extra_dhcp_opts | ip_version='4', opt_name='tag:ipxe,67', opt_value='http://30.30.30.220:8088/boot.ipxe <http://30.30.30.220:8088/boot.ipxe>'
image.png *Few points as observations:*
1. Although the baremetal network (172.23.3.x) is routable to the admin network (30.30.30.x), the download times out at this point.
It should be able to download the file over a routed network.
2. In the tcpdump we are only seeing TFTP read requests.
If you have access, check the switches and routers to see whether the traffic is being dropped or blocked somewhere on the path.
I'm not 100% sure what parameters you used when deploying, but did you try changing the ServiceNetMap for IronicApiNetwork and IronicNetwork, setting them to the name of the baremetal network (172.23.3.x)?
ServiceNetMap:
  IronicApiNetwork: baremetal_network
  IronicNetwork: baremetal_network
The result will be that the http server will listen on 172.23.3.x, and the extra_dhcp_opts should point to 172.23.3.x as well.
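For reference, a minimal environment file carrying that override might look like the sketch below. The file name is hypothetical, and `baremetal_network` is a placeholder for whatever the composable network is actually called in network_data.yaml; depending on the release, ServiceNetMap overrides may be merged with ServiceNetMapDefaults.

```yaml
# service-net-map-overrides.yaml (hypothetical file name)
parameter_defaults:
  ServiceNetMap:
    # Move the Ironic API and provisioning services onto the network the
    # baremetal nodes can reach directly (172.23.3.x in this thread), so
    # the iPXE http server and extra_dhcp_opts use that network.
    IronicApiNetwork: baremetal_network
    IronicNetwork: baremetal_network
```

Passed with an additional -e on the overcloud deploy command line.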
3. `openstack baremetal node list`:

(overcloud) [stack@undercloud ~]$ openstack baremetal node list
+--------------------------------------+------+---------------+-------------+--------------------+-------------+
| UUID | Name | Instance UUID | Power State | Provisioning State | Maintenance |
+--------------------------------------+------+---------------+-------------+--------------------+-------------+
| 7066fbe1-9c29-4702-9cd4-2b55daf19630 | bm1 | None | power on | clean wait | False |
+--------------------------------------+------+---------------+-------------+--------------------+-------------+
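A note on the state above: `clean wait` means ironic is waiting for the cleaning ramdisk to boot and call back, which never happens here because the node fails to PXE boot. A sketch of how the wait can be inspected or aborted with the standard CLI (node name as used in this thread):

```shell
# Watch the provisioning state and any recorded error while the node cleans:
openstack baremetal node show bm1 -f value -c provision_state -c last_error

# If cleaning is stuck in 'clean wait', it can be aborted and the node
# moved back to manageable for another attempt:
openstack baremetal node abort bm1
openstack baremetal node manage bm1
```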
4. `openstack baremetal node show <node-uuid>`:
(overcloud) [stack@undercloud ~]$ openstack baremetal node show bm1
+------------------------+------------------------------------------------------+
| Field                  | Value                                                |
+------------------------+------------------------------------------------------+
| allocation_uuid        | None                                                 |
| automated_clean        | None                                                 |
| bios_interface         | no-bios                                              |
| boot_interface         | ipxe                                                 |
| chassis_uuid           | None                                                 |
| clean_step             | {}                                                   |
| conductor              | overcloud-controller-0.localdomain                   |
| conductor_group        |                                                      |
| console_enabled        | False                                                |
| console_interface      | ipmitool-socat                                       |
| created_at             | 2022-02-09T14:21:24+00:00                            |
| deploy_interface       | iscsi                                                |
| deploy_step            | {}                                                   |
| description            | None                                                 |
| driver                 | ipmi                                                 |
| driver_info            | {'ipmi_address': '10.0.1.183', 'ipmi_username': 'hsc', 'ipmi_password': '******', 'ipmi_terminal_port': 623, 'deploy_kernel': '9e1365b6-261a-42a2-abfe-40158945de57', 'deploy_ramdisk': 'fe608dd2-ce86-4faf-b4b8-cc5cb143eb56'} |
| driver_internal_info   | {'agent_erase_devices_iterations': 1, 'agent_erase_devices_zeroize': True, 'agent_continue_if_ata_erase_failed': False, 'agent_enable_ata_secure_erase': True, 'disk_erasure_concurrency': 1, 'last_power_state_change': '2022-02-09T14:23:39.525629'} |
| extra                  | {}                                                   |
| fault                  | None                                                 |
| inspect_interface      | inspector                                            |
| inspection_finished_at | None                                                 |
| inspection_started_at  | None                                                 |
| instance_info          | {}                                                   |
| instance_uuid          | None                                                 |
| last_error             | None                                                 |
| maintenance            | False                                                |
| maintenance_reason     | None                                                 |
| management_interface   | ipmitool                                             |
| name                   | bm1                                                  |
| network_interface      | flat                                                 |
| owner                  | None                                                 |
| power_interface        | ipmitool                                             |
| power_state            | power on                                             |
| properties             | {'cpus': 20, 'cpu_arch': 'x86_64', 'capabilities': 'boot_option:local,boot_mode:uefi', 'memory_mb': 63700, 'local_gb': 470, 'vendor': 'hewlett-packard'} |
| protected              | False                                                |
| protected_reason       | None                                                 |
| provision_state        | clean wait                                           |
| provision_updated_at   | 2022-02-09T14:24:05+00:00                            |
| raid_config            | {}                                                   |
| raid_interface         | no-raid                                              |
| rescue_interface       | agent                                                |
| reservation            | None                                                 |
| resource_class         | bm1                                                  |
| storage_interface      | noop                                                 |
| target_power_state     | None                                                 |
| target_provision_state | available                                            |
| target_raid_config     | {}                                                   |
| traits                 | []                                                   |
| updated_at             | 2022-02-09T14:24:05+00:00                            |
| uuid                   | 7066fbe1-9c29-4702-9cd4-2b55daf19630                 |
| vendor_interface       | ipmitool                                             |
+------------------------+------------------------------------------------------+
(overcloud) [stack@undercloud ~]$
*Queries:*
* What settings do we need so that the baremetal node PXE-boots successfully and the node is provisioned successfully?
On Tue, Feb 8, 2022 at 6:27 PM Harald Jensas <hjensas@redhat.com <mailto:hjensas@redhat.com>> wrote:
On 2/7/22 13:47, Anirudh Gupta wrote:
> Hi Julia,
>
> Thanks a lot for your responses and support.
> To update on the ongoing issue, I tried deploying the overcloud with
> your valuable suggestions, i.e. by passing "*DhcpAgentNotification: true*"
> in ironic-overcloud.yaml.
> The setup came up successfully, but with this configuration the IP
> allocated on the system is the one configured while creating
> the subnet in openstack.
>
> The system is still getting the IP (172.23.3.212) from neutron. The
> subnet range was configured as *172.23.3.210-172.23.3.240* while
> creating the provisioning subnet.
The node is supposed to get an IP address from the neutron subnet on the provisioning network when: a) provisioning node b) cleaning node.
When you do "baremetal node provide", cleaning is most likely initiated automatically. (Cleaning is enabled by default for Ironic in the overcloud, AFAIK.)
The only time you will get an address from the IronicInspectorSubnets (ip_range: 172.23.3.100,172.23.3.150 in your case) is when you start ironic node introspection.
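In other words, the two ranges serve different phases and must not overlap. A sketch of the split, using the addresses from this thread:

```yaml
# Deploy-time input (ironic-config.yaml): addresses handed out by
# ironic-inspector's own dnsmasq, used ONLY during introspection.
parameter_defaults:
  IronicInspectorSubnets:
    - ip_range: 172.23.3.100,172.23.3.150

# The neutron subnet created post-deployment (allocation pool e.g.
# 172.23.3.200 - 172.23.3.240 in this thread) lives on the same L2 segment
# and serves DHCP during cleaning and provisioning instead.
```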
> The system gets stuck here and no action is performed after this.

It seems the system is getting an address from the expected DHCP server, but it does not boot. I would start looking into the pxe properties in the DHCP Reply.
What is the status of the node in ironic at this stage? `openstack baremetal node list` `openstack baremetal node show <node-uuid>`
Check the `extra_dhcp_opts` on the neutron port; it should set the next-server and bootfile parameters. Does the bootfile exist in /var/lib/ironic/tftpboot? Inspect the `/var/lib/ironic/tftpboot/pxelinux.cfg/` directory; you should see a file matching the MAC address of your system. Does the content make sense?
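As a side note, the file name looked up under pxelinux.cfg/ is derived from the MAC address: the ARP hardware type for Ethernet (01) followed by the MAC octets joined with dashes, all lowercase. A small illustration (the MAC below is made up):

```python
def pxelinux_cfg_name(mac: str) -> str:
    """Return the pxelinux.cfg file name requested for a given MAC.

    PXELINUX prefixes the ARP hardware type (01 = Ethernet) and joins
    the MAC octets with dashes, all lowercase.
    """
    return "01-" + mac.lower().replace(":", "-")

# Hypothetical MAC address, for illustration only:
print(pxelinux_cfg_name("52:54:00:AB:CD:EF"))  # 01-52-54-00-ab-cd-ef
```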
Can you capture DHCP and TFTP traffic on the provisioning network?
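Something along these lines should work for the capture (the interface name and the iPXE http port are assumptions based on earlier messages in this thread):

```shell
# On the controller serving TFTP/HTTP, capture DHCP (67/68), TFTP (69)
# and the iPXE http port (8088) on the baremetal bridge:
tcpdump -i br-baremetal -n \
    'udp port 67 or udp port 68 or udp port 69 or tcp port 8088' \
    -w provisioning.pcap
```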
> Is there any way to resolve this and successfully provision the
> baremetal node in the *TripleO Train release*? (Since RHOSP 16 is based
> on Train, I thought to go with that version for better stability.)
https://access.redhat.com/documentation/en-us/red_hat_openstack_platform/16....
> I have some queries:
>
> 1. Is passing "*DhcpAgentNotification: true*" enough, or do we have to
> make some other changes as well?
I believe in Train "DhcpAgentNotification" defaults to True. The change to default it to false was added more recently, and it was not backported. (https://review.opendev.org/c/openstack/tripleo-heat-templates/+/801761)
NOTE: the environment file for enabling ironic for the overcloud, 'environments/services/ironic-overcloud.yaml', overrides this to 'true' in later releases.
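On Train, where the environment file does not carry the override, the parameter can simply be set explicitly; a minimal sketch:

```yaml
# Extra environment file (or an addition to ironic-config.yaml):
parameter_defaults:
  # Ensure neutron-dhcp-agent is deployed and notified, so it can serve
  # the PXE/iPXE DHCP options that OVN's built-in DHCP does not provide.
  DhcpAgentNotification: true
```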
> 2. Although there are some security concerns specified in the document,
> currently I am focusing on the default flat bare metal approach, which
> has a dedicated interface for bare metal provisioning. There is a
> composable network approach as well. Keeping aside the security
> concerns, which approach is better and functional?
https://access.redhat.com/documentation/en-us/red_hat_openstack_platform/16....
Both should work; using the composable network is more secure, since baremetal nodes do not have access to the control plane network.
> 3. Will moving to a later openstack release make this deployment
> possible? If yes, which release should I go with, since up to Wallaby
> the ironic-overcloud.yaml file has no "*DhcpAgentNotification: true*"
> entry by default:
https://github.com/openstack/tripleo-heat-templates/blob/stable/wallaby/envi...
> Looking forward for your valuable feedback/response.
>
> Regards
> Anirudh Gupta

On Fri, Feb 4, 2022 at 8:54 PM Anirudh Gupta <anyrude10@gmail.com> wrote:

    Hi,

    Surely I'll revert the status once it gets deployed.
    Btw, is the suspicion because of the Train release, or is it something else?

    Regards
    Anirudh Gupta

On Fri, 4 Feb, 2022, 20:29 Julia Kreger <juliaashleykreger@gmail.com> wrote:

    On Fri, Feb 4, 2022 at 5:50 AM Anirudh Gupta <anyrude10@gmail.com> wrote:

        Hi Julia,

        Thanks for your response.
        Earlier I was passing both ironic.yaml and ironic-overcloud.yaml,
        located at /usr/share/openstack-tripleo-heat-templates/environments/services/

        My current understanding now says that since I am using OVN, not OVS,
        I should pass only ironic-overcloud.yaml in my deployment.

        I am currently on the Train release, and my default
        ironic-overcloud.yaml file has no such entry:
        DhcpAgentNotification: true

    I suspect that should work. Let us know if it does.

        I would add this there and redeploy the setup.
        Would that be enough to make my deployment successful?

        Regards
        Anirudh Gupta

On Fri, 4 Feb, 2022, 18:40 Julia Kreger <juliaashleykreger@gmail.com> wrote:

    It is not a matter of disabling OVN, but a matter of enabling the
    dnsmasq service and notifications.

    https://github.com/openstack/tripleo-heat-templates/blob/master/environments...
    may provide some insight.

    I suspect if you're using stable/wallaby based branches and it is not
    working, there may need to be a patch backported by the TripleO
    maintainers.

On Thu, Feb 3, 2022 at 8:02 PM Anirudh Gupta <anyrude10@gmail.com> wrote:

    Hi Julia,

    Thanks for your response.
    For the overcloud deployment, I am executing the following command:

    openstack overcloud deploy --templates \
        -n /home/stack/templates/network_data.yaml \
        -r /home/stack/templates/roles_data.yaml \
        -e /home/stack/templates/node-info.yaml \
        -e /home/stack/templates/environment.yaml \
        -e /home/stack/templates/environments/network-isolation.yaml \
        -e /home/stack/templates/environments/network-environment.yaml \
        -e /usr/share/openstack-tripleo-heat-templates/environments/services/ironic.yaml \
        -e /usr/share/openstack-tripleo-heat-templates/environments/services/ironic-conductor.yaml \
        -e /usr/share/openstack-tripleo-heat-templates/environments/services/ironic-inspector.yaml \
        -e /usr/share/openstack-tripleo-heat-templates/environments/services/ironic-overcloud.yaml \
        -e /home/stack/templates/ironic-config.yaml \
        -e /usr/share/openstack-tripleo-heat-templates/environments/docker-ha.yaml \
        -e /usr/share/openstack-tripleo-heat-templates/environments/podman.yaml \
        -e /home/stack/containers-prepare-parameter.yaml

    I can see some OVN related stuff in my roles_data and
    environments/network-isolation.yaml:

    [stack@undercloud ~]$ grep -inr "ovn"
    roles_data.yaml:34: OVNCMSOptions: "enable-chassis-as-gw"
    roles_data.yaml:168: - OS::TripleO::Services::OVNDBs
    roles_data.yaml:169: - OS::TripleO::Services::OVNController
    roles_data.yaml:279: - OS::TripleO::Services::OVNController
    roles_data.yaml:280: - OS::TripleO::Services::OVNMetadataAgent
    environments/network-isolation.yaml:16: OS::TripleO::Network::Ports::OVNDBsVipPort: ../network/ports/vip.yaml

    What is your recommendation, and how do I disable OVN? Should I remove
    it from roles_data.yaml and then render, so that it doesn't get
    generated in environments/network-isolation.yaml? It seems OVN is
    getting installed in ironic. Please suggest some pointers.

    Regards
    Anirudh Gupta

On Fri, Feb 4, 2022 at 1:36 AM Julia Kreger <juliaashleykreger@gmail.com> wrote:

    My guess: You're running OVN. You need neutron-dhcp-agent running as
    well. OVN disables it by default, and OVN's integrated DHCP service
    does not support options for network booting.

    -Julia

On Thu, Feb 3, 2022 at 9:06 AM Anirudh Gupta <anyrude10@gmail.com> wrote:

    [Original problem report, quoted in full at the top of this thread.]
On 2/10/22 14:49, Lokendra Rathour wrote:
Hi Harald,

Thanks once again for your support. We tried activating the parameters:

ServiceNetMap:
  IronicApiNetwork: provisioning
  IronicNetwork: provisioning

at environments/network-environment.yaml. After changing these values, updated and even fresh deployments are failing.
How did deployment fail?
The command that we are using to deploy the OpenStack overcloud:

openstack overcloud deploy --templates \
    -n /home/stack/templates/network_data.yaml \
    -r /home/stack/templates/roles_data.yaml \
    -e /home/stack/templates/node-info.yaml \
    -e /home/stack/templates/environment.yaml \
    -e /home/stack/templates/environments/network-isolation.yaml \
    -e /home/stack/templates/environments/network-environment.yaml \
What modifications did you do to network-isolation.yaml and network-environment.yaml?

I typically use:

    -e /usr/share/openstack-tripleo-heat-templates/environments/network-isolation.yaml
    -e /usr/share/openstack-tripleo-heat-templates/environments/network-environment.yaml
    -e /home/stack/templates/environments/network-overrides.yaml

The network-isolation.yaml and network-environment.yaml are Jinja2 rendered based on the -n input, so to keep in sync with changes in the `-n` file, reference the files in /usr/share/openstack-tripleo-heat-templates. Then add overrides in network-overrides.yaml as needed.
    -e /usr/share/openstack-tripleo-heat-templates/environments/services/ironic-conductor.yaml \
    -e /usr/share/openstack-tripleo-heat-templates/environments/services/ironic-inspector.yaml \
    -e /usr/share/openstack-tripleo-heat-templates/environments/services/ironic-overcloud.yaml \
    -e /home/stack/templates/ironic-config.yaml \
    -e /usr/share/openstack-tripleo-heat-templates/environments/docker-ha.yaml \
    -e /usr/share/openstack-tripleo-heat-templates/environments/podman.yaml \
    -e /home/stack/containers-prepare-parameter.yaml
/home/stack/templates/ironic-config.yaml:

(overcloud) [stack@undercloud ~]$ cat /home/stack/templates/ironic-config.yaml
parameter_defaults:
  IronicEnabledHardwareTypes:
    - ipmi
    - redfish
  IronicEnabledPowerInterfaces:
    - ipmitool
    - redfish
  IronicEnabledManagementInterfaces:
    - ipmitool
    - redfish
  IronicCleaningDiskErase: metadata
  IronicIPXEEnabled: true
  IronicInspectorSubnets:
    - ip_range: 172.23.3.100,172.23.3.150
  IPAImageURLs: '["http://30.30.30.1:8088/agent.kernel", "http://30.30.30.1:8088/agent.ramdisk"]'
  IronicInspectorInterface: 'br-baremetal'
Also the baremetal network(provisioning)(172.23.3.x) is routed with ctlplane/admin network (30.30.30.x)
Unless the network you created in the overcloud is named `provisioning`, these parameters may be relevant:

  IronicCleaningNetwork:
    default: 'provisioning'
    description: Name or UUID of the *overcloud* network used for cleaning
                 bare metal nodes. The default value of "provisioning" can
                 be left during the initial deployment (when no networks are
                 created yet) and should be changed to an actual UUID in a
                 post-deployment stack update.
    type: string

  IronicProvisioningNetwork:
    default: 'provisioning'
    description: Name or UUID of the *overcloud* network used for
                 provisioning of bare metal nodes, if
                 IronicDefaultNetworkInterface is set to "neutron". The
                 default value of "provisioning" can be left during the
                 initial deployment (when no networks are created yet) and
                 should be changed to an actual UUID in a post-deployment
                 stack update.
    type: string

  IronicRescuingNetwork:
    default: 'provisioning'
    description: Name or UUID of the *overcloud* network used for rescuing
                 of bare metal nodes, if IronicDefaultRescueInterface is not
                 set to "no-rescue". The default value of "provisioning" can
                 be left during the initial deployment (when no networks are
                 created yet) and should be changed to an actual UUID in a
                 post-deployment stack update.
    type: string
*Query:*
1. Is there any other location/way where we should add these so that they are included without error?
ServiceNetMap:
  IronicApiNetwork: provisioning
  IronicNetwork: provisioning
`provisioning` network is defined in -n /home/stack/templates/network_data.yaml right? And an entry in 'networks' for the controller role in /home/stack/templates/roles_data.yaml?
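If it is not defined there, a composable network entry would be needed before ServiceNetMap can point at it. A purely hypothetical sketch (names and ranges modeled on this thread, not taken from the actual files; the exact schema varies between releases):

```yaml
# network_data.yaml fragment (hypothetical values):
- name: OcProvisioning          # exposed to heat with name_lower 'provisioning'
  name_lower: provisioning
  vip: true
  ip_subnet: '172.23.3.0/24'
  allocation_pools: [{'start': '172.23.3.10', 'end': '172.23.3.90'}]

# roles_data.yaml: the Controller role's 'networks' list would then also
# carry a matching entry, e.g.:
#   - OcProvisioning
```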
2. Also, are the commands (mentioned above) to configure the Bare Metal services fine?
Yes, what you are doing makes sense.

I'm actually not sure why it didn't work with your previous configuration; it got the information about the NBP file and obviously attempted to download it from 30.30.30.220. With routing in place, that should work. Changing the ServiceNetMap to move the IronicNetwork services to 172.23.3.x would avoid the routing.

What is NeutronBridgeMappings? br-baremetal maps to the physical network of the overcloud `provisioning` neutron network?

--
Harald
Hi Harald, Thanks for the response, please find my response inline: On Thu, Feb 10, 2022 at 8:24 PM Harald Jensas <hjensas@redhat.com> wrote:
On 2/10/22 14:49, Lokendra Rathour wrote:
Hi Harald, Thanks once again for your support, we tried activating the parameters: ServiceNetMap: IronicApiNetwork: provisioning IronicNetwork: provisioning at environments/network-environments.yaml image.png After changing these values the updated or even the fresh deployments are failing.
How did deployment fail?
[Loke]: It failed immediately after the IP for the ctlplane network was assigned, ssh was established, and stack creation completed; I think at the start of the ansible execution. Error: "enabling ssh admin - COMPLETE. Host 10.0.1.94 not found in /home/stack/.ssh/known_hosts". Although this message is seen even when the deployment is successful, so I do not think this is the culprit.
The command that we are using to deploy the OpenStack overcloud: /openstack overcloud deploy --templates \ -n /home/stack/templates/network_data.yaml \ -r /home/stack/templates/roles_data.yaml \ -e /home/stack/templates/node-info.yaml \ -e /home/stack/templates/environment.yaml \ -e /home/stack/templates/environments/network-isolation.yaml \ -e /home/stack/templates/environments/network-environment.yaml \
What modifications did you do to network-isolation.yaml and
[Loke]: *network-isolation.yaml:*

# Enable the creation of Neutron networks for isolated Overcloud
# traffic and configure each role to assign ports (related
# to that role) on these networks.
resource_registry:
  # networks as defined in network_data.yaml
  OS::TripleO::Network::J3Mgmt: ../network/j3mgmt.yaml
  OS::TripleO::Network::Tenant: ../network/tenant.yaml
  OS::TripleO::Network::InternalApi: ../network/internal_api.yaml
  OS::TripleO::Network::External: ../network/external.yaml

  # Port assignments for the VIPs
  OS::TripleO::Network::Ports::J3MgmtVipPort: ../network/ports/j3mgmt.yaml
  OS::TripleO::Network::Ports::InternalApiVipPort: ../network/ports/internal_api.yaml
  OS::TripleO::Network::Ports::ExternalVipPort: ../network/ports/external.yaml
  OS::TripleO::Network::Ports::RedisVipPort: ../network/ports/vip.yaml
  OS::TripleO::Network::Ports::OVNDBsVipPort: ../network/ports/vip.yaml

  # Port assignments by role, edit role definition to assign networks to roles.
  # Port assignments for the Controller
  OS::TripleO::Controller::Ports::J3MgmtPort: ../network/ports/j3mgmt.yaml
  OS::TripleO::Controller::Ports::TenantPort: ../network/ports/tenant.yaml
  OS::TripleO::Controller::Ports::InternalApiPort: ../network/ports/internal_api.yaml
  OS::TripleO::Controller::Ports::ExternalPort: ../network/ports/external.yaml

  # Port assignments for the Compute
  OS::TripleO::Compute::Ports::J3MgmtPort: ../network/ports/j3mgmt.yaml
  OS::TripleO::Compute::Ports::TenantPort: ../network/ports/tenant.yaml
  OS::TripleO::Compute::Ports::InternalApiPort: ../network/ports/internal_api.yaml
network-environment.yaml?
resource_registry:
  # Network Interface templates to use (these files must exist). You can
  # override these by including one of the net-*.yaml environment files,
  # such as net-bond-with-vlans.yaml, or modifying the list here.
  # Port assignments for the Controller
  OS::TripleO::Controller::Net::SoftwareConfig: ../network/config/bond-with-vlans/controller.yaml
  # Port assignments for the Compute
  OS::TripleO::Compute::Net::SoftwareConfig: ../network/config/bond-with-vlans/compute.yaml

parameter_defaults:
  J3MgmtNetCidr: '80.0.1.0/24'
  J3MgmtAllocationPools: [{'start': '80.0.1.4', 'end': '80.0.1.250'}]
  J3MgmtNetworkVlanID: 400

  TenantNetCidr: '172.16.0.0/24'
  TenantAllocationPools: [{'start': '172.16.0.4', 'end': '172.16.0.250'}]
  TenantNetworkVlanID: 416
  TenantNetPhysnetMtu: 1500

  InternalApiNetCidr: '172.16.2.0/24'
  InternalApiAllocationPools: [{'start': '172.16.2.4', 'end': '172.16.2.250'}]
  InternalApiNetworkVlanID: 418

  ExternalNetCidr: '10.0.1.0/24'
  ExternalAllocationPools: [{'start': '10.0.1.85', 'end': '10.0.1.98'}]
  ExternalNetworkVlanID: 408

  DnsServers: []
  NeutronNetworkType: 'geneve,vlan'
  NeutronNetworkVLANRanges: 'datacentre:1:1000'
  BondInterfaceOvsOptions: "bond_mode=active-backup"
I typically use:

    -e /usr/share/openstack-tripleo-heat-templates/environments/network-isolation.yaml
    -e /usr/share/openstack-tripleo-heat-templates/environments/network-environment.yaml
    -e /home/stack/templates/environments/network-overrides.yaml

The network-isolation.yaml and network-environment.yaml are Jinja2 rendered based on the -n input, so to keep in sync with changes in the `-n` file, reference the files in /usr/share/openstack-tripleo-heat-templates. Then add overrides in network-overrides.yaml as needed.
[Loke]: We are using it like that only. I do not know what you pass in network-overrides.yaml, but I pass the other files as per the commands below:

[stack@undercloud templates]$ cat environment.yaml
parameter_defaults:
  ControllerCount: 3
  TimeZone: 'Asia/Kolkata'
  NtpServer: ['30.30.30.3']
  NeutronBridgeMappings: datacentre:br-ex,baremetal:br-baremetal
  NeutronFlatNetworks: datacentre,baremetal

[stack@undercloud templates]$ cat ironic-config.yaml
parameter_defaults:
  IronicEnabledHardwareTypes:
    - ipmi
    - redfish
  IronicEnabledPowerInterfaces:
    - ipmitool
    - redfish
  IronicEnabledManagementInterfaces:
    - ipmitool
    - redfish
  IronicCleaningDiskErase: metadata
  IronicIPXEEnabled: true
  IronicInspectorSubnets:
    - ip_range: 172.23.3.100,172.23.3.150
  IPAImageURLs: '["http://30.30.30.1:8088/agent.kernel", "http://30.30.30.1:8088/agent.ramdisk"]'
  IronicInspectorInterface: 'br-baremetal'

[stack@undercloud templates]$ cat node-info.yaml
parameter_defaults:
  OvercloudControllerFlavor: control
  OvercloudComputeFlavor: compute
  ControllerCount: 3
  ComputeCount: 1
[stack@undercloud templates]$
    -e /usr/share/openstack-tripleo-heat-templates/environments/services/ironic-conductor.yaml \
    -e /usr/share/openstack-tripleo-heat-templates/environments/services/ironic-inspector.yaml \
    -e /usr/share/openstack-tripleo-heat-templates/environments/services/ironic-overcloud.yaml \
    -e /home/stack/templates/ironic-config.yaml \
    -e /usr/share/openstack-tripleo-heat-templates/environments/docker-ha.yaml \
    -e /usr/share/openstack-tripleo-heat-templates/environments/podman.yaml \
    -e /home/stack/containers-prepare-parameter.yaml
/home/stack/templates/ironic-config.yaml:

(overcloud) [stack@undercloud ~]$ cat /home/stack/templates/ironic-config.yaml
parameter_defaults:
  IronicEnabledHardwareTypes:
    - ipmi
    - redfish
  IronicEnabledPowerInterfaces:
    - ipmitool
    - redfish
  IronicEnabledManagementInterfaces:
    - ipmitool
    - redfish
  IronicCleaningDiskErase: metadata
  IronicIPXEEnabled: true
  IronicInspectorSubnets:
    - ip_range: 172.23.3.100,172.23.3.150
  IPAImageURLs: '["http://30.30.30.1:8088/agent.kernel", "http://30.30.30.1:8088/agent.ramdisk"]'
  IronicInspectorInterface: 'br-baremetal'
Also the baremetal network(provisioning)(172.23.3.x) is routed with ctlplane/admin network (30.30.30.x)
Unless the network you created in the overcloud is named `provisioning`, these parameters may be relevant:

  IronicCleaningNetwork:
    default: 'provisioning'
    description: Name or UUID of the *overcloud* network used for cleaning
                 bare metal nodes. [...]
    type: string

  IronicProvisioningNetwork:
    default: 'provisioning'
    description: Name or UUID of the *overcloud* network used for
                 provisioning of bare metal nodes, if
                 IronicDefaultNetworkInterface is set to "neutron". [...]
    type: string

  IronicRescuingNetwork:
    default: 'provisioning'
    description: Name or UUID of the *overcloud* network used for rescuing
                 of bare metal nodes, if IronicDefaultRescueInterface is not
                 set to "no-rescue". [...]
    type: string
*Query:*
1. Is there any other location/way where we should add these so that they are included without error?

ServiceNetMap:
  IronicApiNetwork: provisioning
  IronicNetwork: provisioning
`provisioning` network is defined in -n /home/stack/templates/network_data.yaml right?
[Loke]: No, it does not have any entry for provisioning in this file; it has network entries for J3Mgmt, Tenant, InternalApi, and External. These networks are added as VLAN-based under the br-ext bridge. The provisioning network I am creating after the overcloud is deployed and before the baremetal node is provisioned; in the provisioning network we are giving the range of the ironic network (172.23.3.x).
And an entry in 'networks' for the controller role in /home/stack/templates/roles_data.yaml?
[Loke]: We also did not add a similar entry in roles_data.yaml. Just to add: with these two files we have rendered the remaining templates.
2. Also are these commands(mentioned above) configure Baremetal services are fine.
Yes, what you are doing makes sense.
I'm actually not sure why it didn't work with your previous configuration; it got the information about the NBP file and obviously attempted to download it from 30.30.30.220. With routing in place, that should work.
Changing the ServiceNetMap to move the IronicNetwork services to 172.23.3.x would avoid the routing.
[Loke]: We can try this, but somehow we are not able to, for some odd reason.
What is NeutronBridgeMappings? br-baremetal maps to the physical network of the overcloud `provisioning` neutron network?
[Loke]: Yes, we create br-baremetal and then create the provisioning network mapped to br-baremetal.
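For reference, a sketch of that creation sequence following the flat-network documentation pattern (addresses from this thread; the gateway address is a placeholder):

```shell
# Flat provider network on the 'baremetal' physnet
# (mapped to br-baremetal via NeutronBridgeMappings):
openstack network create \
    --provider-network-type flat \
    --provider-physical-network baremetal \
    --share provisioning

# Subnet on the same L2 segment as the inspector range, with a
# non-overlapping allocation pool for cleaning/provisioning:
openstack subnet create \
    --network provisioning \
    --subnet-range 172.23.3.0/24 \
    --gateway 172.23.3.1 \
    --allocation-pool start=172.23.3.200,end=172.23.3.240 \
    provisioning-subnet
```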
Also attaching the complete rendered template folder along with the custom yaml files that I am using; maybe referring to that will give you a clearer picture of our problem. Any clue would help. Our problem: we are not able to provision the baremetal node after the overcloud is deployed. If there is any straightforward document we can use to test baremetal provisioning, please share it.
Thanks once again for reading all these.
-- Harald
- skype: lokendrarathour
Hi Harald / Openstack Team,

Thank you again for your support. We have successfully provisioned the baremetal node as per the inputs shared by you. The only change we made was to add an entry for the ServiceNetMap.

Further, we were trying to launch an instance on the baremetal node, and we are facing the ISSUE mentioned below:

2022-02-11 18:13:45.840 7 ERROR nova.compute.manager [req-aafdea4d-815f-4504-b7d7-4fd95d1e083e - - - - -] Error updating resources for node 9560bc2d-5f94-4ba0-9711-340cb8ad7d8a.: nova.exception.NoResourceClass: Resource class not found for Ironic node 9560bc2d-5f94-4ba0-9711-340cb8ad7d8a.
2022-02-11 18:13:45.840 7 ERROR nova.compute.manager Traceback (most recent call last):
2022-02-11 18:13:45.840 7 ERROR nova.compute.manager   File "/usr/lib/python3.6/site-packages/nova/compute/manager.py", line 8894, in _update_available_resource_for_node

For your reference, please see the following details:

(overcloud) [stack@undercloud v4]$ openstack baremetal node show baremetal-node --fit-width
+------------------------+------------------------------------------------------+
| Field                  | Value                                                |
+------------------------+------------------------------------------------------+
| allocation_uuid        | None                                                 |
| automated_clean        | None                                                 |
| bios_interface         | no-bios                                              |
| boot_interface         | ipxe                                                 |
| chassis_uuid           | None                                                 |
| clean_step             | {}                                                   |
| conductor              | overcloud-controller-0.localdomain                   |
| conductor_group        |                                                      |
| console_enabled        | False                                                |
| console_interface      | ipmitool-socat                                       |
| created_at             | 2022-02-11T13:02:40+00:00                            |
| deploy_interface       | iscsi                                                |
| deploy_step            | {}                                                   |
| description            | None                                                 |
| driver                 | ipmi                                                 |
| driver_info            | {'ipmi_port': 623, 'ipmi_username': 'hsc', 'ipmi_password': '******', 'ipmi_address': '10.0.1.183', 'deploy_kernel': 'bc62f3dc-d091-4dbd-b730-cf7b6cb48625', 'deploy_ramdisk': 'd58bcc08-cb7c-4f21-8158-0a5ed4198108'} |
| driver_internal_info   | {'agent_erase_devices_iterations': 1, 'agent_erase_devices_zeroize': True, 'agent_continue_if_ata_erase_failed': False, 'agent_enable_ata_secure_erase': True, 'disk_erasure_concurrency': 1, 'last_power_state_change': '2022-02-11T13:14:29.581361', 'agent_version': '5.0.5.dev25', 'agent_last_heartbeat': '2022-02-11T13:14:24.151928', 'hardware_manager_version': {'generic_hardware_manager': '1.1'}, 'agent_cached_clean_steps': {'deploy': [{'step': 'erase_devices', 'priority': 10, 'interface': 'deploy', 'reboot_requested': False, 'abortable': True}, {'step': 'erase_devices_metadata', 'priority': 99, 'interface': 'deploy', 'reboot_requested': False, 'abortable': True}], 'raid': [{'step': 'delete_configuration', 'priority': 0, 'interface': 'raid', 'reboot_requested': False, 'abortable': True}, {'step': 'create_configuration', 'priority': 0, 'interface': 'raid', 'reboot_requested': False, 'abortable': True}]}, 'agent_cached_clean_steps_refreshed': '2022-02-11 13:14:22.580729', 'clean_steps': None} |
| extra                  | {}                                                   |
| fault                  | None                                                 |
| inspect_interface      | inspector                                            |
| inspection_finished_at | None                                                 |
| inspection_started_at  | None                                                 |
| instance_info          | {}                                                   |
| instance_uuid          | None                                                 |
| last_error             | None                                                 |
| maintenance            | False                                                |
| maintenance_reason     | None                                                 |
| management_interface   | ipmitool                                             |
| name                   | baremetal-node                                       |
| network_interface      | flat                                                 |
| owner                  | None                                                 |
| power_interface        | ipmitool                                             |
| power_state            | power off                                            |
| properties             | {'cpus': 20, 'memory_mb': 63700, 'local_gb': 470, 'cpu_arch': 'x86_64', 'capabilities': 'boot_option:local,boot_mode:uefi', 'vendor': 'hewlett-packard'} |
| protected              | False                                                |
| protected_reason       | None                                                 |
| provision_state        | available                                            |
| provision_updated_at   | 2022-02-11T13:14:51+00:00                            |
| raid_config            | {}                                                   |
| raid_interface         | no-raid                                              |
| rescue_interface       | agent                                                |
| reservation            | None                                                 |
| resource_class         | baremetal-resource-class                             |
| storage_interface      | noop                                                 |
| target_power_state     | None                                                 |
| target_provision_state | None                                                 |
| target_raid_config     | {}                                                   |
| traits                 | []                                                   |
| updated_at             | 2022-02-11T13:14:52+00:00                            |
| uuid                   | e64ad28c-43d6-4b9f-aa34-f8bc58e9e8fe                 |
| vendor_interface       | ipmitool                                             |
+------------------------+------------------------------------------------------+
(overcloud) [stack@undercloud v4]$

(overcloud) [stack@undercloud v4]$ openstack flavor show my-baremetal-flavor --fit-width
+----------------------------+--------------------------------------------------+
| Field                      | Value                                            |
+----------------------------+--------------------------------------------------+
| OS-FLV-DISABLED:disabled   | False                                            |
| OS-FLV-EXT-DATA:ephemeral  | 0                                                |
| access_project_ids         | None                                             |
| description                | None                                             |
| disk                       | 470                                              |
| extra_specs                | {'resources:CUSTOM_BAREMETAL_RESOURCE_CLASS': '1', 'resources:VCPU': '0', 'resources:MEMORY_MB': '0', 'resources:DISK_GB': '0', 'capabilities:boot_option': 'local,boot_mode:uefi'} |
| id                         | 66a13404-4c47-4b67-b954-e3df42ae8103             |
| name                       | my-baremetal-flavor                              |
| os-flavor-access:is_public | True                                             |
| properties                 | capabilities:boot_option='local,boot_mode:uefi', resources:CUSTOM_BAREMETAL_RESOURCE_CLASS='1', resources:DISK_GB='0', resources:MEMORY_MB='0', resources:VCPU='0' |
| ram                        | 63700                                            |
| rxtx_factor                | 1.0                                              |
| swap                       | 0                                                |
| vcpus                      | 20                                               |
+----------------------------+--------------------------------------------------+
(overcloud) [stack@undercloud v4]$

Can you please check and suggest if something is missing. Thanks once again for your support.

-Lokendra

On Thu, Feb 10, 2022 at 10:09 PM Lokendra Rathour <lokendrarathour@gmail.com> wrote:
Hi Harald, Thanks for the response, please find my response inline:
On Thu, Feb 10, 2022 at 8:24 PM Harald Jensas <hjensas@redhat.com> wrote:
On 2/10/22 14:49, Lokendra Rathour wrote:
Hi Harald, Thanks once again for your support, we tried activating the parameters: ServiceNetMap: IronicApiNetwork: provisioning IronicNetwork: provisioning at environments/network-environments.yaml image.png After changing these values the updated or even the fresh deployments are failing.
How did deployment fail?
[Loke] : it failed immediately after when the IP for ctlplane network is assigned, and ssh is established and stack creation is completed, I think at the start of ansible execution.
Error: "enabling ssh admin - COMPLETE. Host 10.0.1.94 not found in /home/stack/.ssh/known_hosts" Although this message is even seen when the deployment is successful. so I do not think this is the culprit.
The command that we are using to deploy the OpenStack overcloud: /openstack overcloud deploy --templates \ -n /home/stack/templates/network_data.yaml \ -r /home/stack/templates/roles_data.yaml \ -e /home/stack/templates/node-info.yaml \ -e /home/stack/templates/environment.yaml \ -e /home/stack/templates/environments/network-isolation.yaml \ -e /home/stack/templates/environments/network-environment.yaml \
What modifications did you do to network-isolation.yaml and
[Loke]: *Network-isolation.yaml:*
# Enable the creation of Neutron networks for isolated Overcloud # traffic and configure each role to assign ports (related # to that role) on these networks. resource_registry: # networks as defined in network_data.yaml OS::TripleO::Network::J3Mgmt: ../network/j3mgmt.yaml OS::TripleO::Network::Tenant: ../network/tenant.yaml OS::TripleO::Network::InternalApi: ../network/internal_api.yaml OS::TripleO::Network::External: ../network/external.yaml
# Port assignments for the VIPs OS::TripleO::Network::Ports::J3MgmtVipPort: ../network/ports/j3mgmt.yaml
OS::TripleO::Network::Ports::InternalApiVipPort: ../network/ports/internal_api.yaml OS::TripleO::Network::Ports::ExternalVipPort: ../network/ports/external.yaml
OS::TripleO::Network::Ports::RedisVipPort: ../network/ports/vip.yaml OS::TripleO::Network::Ports::OVNDBsVipPort: ../network/ports/vip.yaml
# Port assignments by role, edit role definition to assign networks to roles. # Port assignments for the Controller OS::TripleO::Controller::Ports::J3MgmtPort: ../network/ports/j3mgmt.yaml OS::TripleO::Controller::Ports::TenantPort: ../network/ports/tenant.yaml OS::TripleO::Controller::Ports::InternalApiPort: ../network/ports/internal_api.yaml OS::TripleO::Controller::Ports::ExternalPort: ../network/ports/external.yaml # Port assignments for the Compute OS::TripleO::Compute::Ports::J3MgmtPort: ../network/ports/j3mgmt.yaml OS::TripleO::Compute::Ports::TenantPort: ../network/ports/tenant.yaml OS::TripleO::Compute::Ports::InternalApiPort: ../network/ports/internal_api.yaml
~
network-environment.yaml?
resource_registry: # Network Interface templates to use (these files must exist). You can # override these by including one of the net-*.yaml environment files, # such as net-bond-with-vlans.yaml, or modifying the list here. # Port assignments for the Controller OS::TripleO::Controller::Net::SoftwareConfig: ../network/config/bond-with-vlans/controller.yaml # Port assignments for the Compute OS::TripleO::Compute::Net::SoftwareConfig: ../network/config/bond-with-vlans/compute.yaml parameter_defaults:
J3MgmtNetCidr: '80.0.1.0/24' J3MgmtAllocationPools: [{'start': '80.0.1.4', 'end': '80.0.1.250'}] J3MgmtNetworkVlanID: 400
TenantNetCidr: '172.16.0.0/24' TenantAllocationPools: [{'start': '172.16.0.4', 'end': '172.16.0.250'}] TenantNetworkVlanID: 416 TenantNetPhysnetMtu: 1500
InternalApiNetCidr: '172.16.2.0/24' InternalApiAllocationPools: [{'start': '172.16.2.4', 'end': '172.16.2.250'}] InternalApiNetworkVlanID: 418
ExternalNetCidr: '10.0.1.0/24' ExternalAllocationPools: [{'start': '10.0.1.85', 'end': '10.0.1.98'}] ExternalNetworkVlanID: 408
DnsServers: [] NeutronNetworkType: 'geneve,vlan' NeutronNetworkVLANRanges: 'datacentre:1:1000' BondInterfaceOvsOptions: "bond_mode=active-backup"
I typically use: -e
/usr/share/openstack-tripleo-heat-templates/environments/network-isolation.yaml -e
/usr/share/openstack-tripleo-heat-templates/environments/network-environment.yaml -e /home/stack/templates/environments/network-overrides.yaml
The network-isolation.yaml and network-environment.yaml are Jinja2 rendered based on the -n input, so too keep in sync with change in the `-n` file reference the file in /usr/share/opentack-tripleo-heat-templates. Then add overrides in network-overrides.yaml as neede.
[Loke] : we are using this as like only, I do not know what you pass in network-overrides.yaml but I pass other files as per commands as below:
[stack@undercloud templates]$ cat environment.yaml parameter_defaults: ControllerCount: 3 TimeZone: 'Asia/Kolkata' NtpServer: ['30.30.30.3'] NeutronBridgeMappings: datacentre:br-ex,baremetal:br-baremetal NeutronFlatNetworks: datacentre,baremetal [stack@undercloud templates]$ cat ironic-config.yaml parameter_defaults: IronicEnabledHardwareTypes: - ipmi - redfish IronicEnabledPowerInterfaces: - ipmitool - redfish IronicEnabledManagementInterfaces: - ipmitool - redfish IronicCleaningDiskErase: metadata IronicIPXEEnabled: true IronicInspectorSubnets: - ip_range: 172.23.3.100,172.23.3.150 IPAImageURLs: '["http://30.30.30.1:8088/agent.kernel", " http://30.30.30.1:8088/agent.ramdisk"]' IronicInspectorInterface: 'br-baremetal' [stack@undercloud templates]$ [stack@undercloud templates]$ cat node-info.yaml parameter_defaults: OvercloudControllerFlavor: control OvercloudComputeFlavor: compute ControllerCount: 3 ComputeCount: 1 [stack@undercloud templates]$
-e
/usr/share/openstack-tripleo-heat-templates/environments/services/ironic-conductor.yaml
\ -e
/usr/share/openstack-tripleo-heat-templates/environments/services/ironic-inspector.yaml
\ -e
/usr/share/openstack-tripleo-heat-templates/environments/services/ironic-overcloud.yaml
\ -e /home/stack/templates/ironic-config.yaml \ -e /usr/share/openstack-tripleo-heat-templates/environments/docker-ha.yaml \ -e /usr/share/openstack-tripleo-heat-templates/environments/podman.yaml \ -e /home/stack/containers-prepare-parameter.yaml/
**/home/stack/templates/ironic-config.yaml : (overcloud) [stack@undercloud ~]$ cat /home/stack/templates/ironic-config.yaml parameter_defaults: IronicEnabledHardwareTypes: - ipmi - redfish IronicEnabledPowerInterfaces: - ipmitool - redfish IronicEnabledManagementInterfaces: - ipmitool - redfish IronicCleaningDiskErase: metadata IronicIPXEEnabled: true IronicInspectorSubnets: - ip_range: 172.23.3.100,172.23.3.150 IPAImageURLs: '["http://30.30.30.1:8088/agent.kernel <http://30.30.30.1:8088/agent.kernel>", "http://30.30.30.1:8088/agent.ramdisk <http://30.30.30.1:8088/agent.ramdisk>"] > IronicInspectorInterface: 'br-baremetal'
Also the baremetal network(provisioning)(172.23.3.x) is routed with ctlplane/admin network (30.30.30.x)
Unless the network you created in the overcloud is named `provisioning`, these parameters may be relevant.
IronicCleaningNetwork: default: 'provisioning' description: Name or UUID of the *overcloud* network used for cleaning bare metal nodes. The default value of "provisioning" can be left during the initial deployment (when no networks are created yet) and should be changed to an actual UUID in a post-deployment stack update. type: string
IronicProvisioningNetwork: default: 'provisioning' description: Name or UUID of the *overcloud* network used for provisioning of bare metal nodes, if IronicDefaultNetworkInterface is set to "neutron". The default value of "provisioning" can be left during the initial deployment (when no networks are created yet) and should be changed to an actual UUID in a post-deployment stack update. type: string
IronicRescuingNetwork: default: 'provisioning' description: Name or UUID of the *overcloud* network used for resucing of bare metal nodes, if IronicDefaultRescueInterface is not set to "no-rescue". The default value of "provisioning" can be left during the initial deployment (when no networks are created yet) and should be changed to an actual UUID in a post-deployment stack update. type: string
*Query:*
1. any other location/way where we should add these so that they are included without error.
*ServiceNetMap:*
* IronicApiNetwork: provisioning*
* IronicNetwork: provisioning*
`provisioning` network is defined in -n /home/stack/templates/network_data.yaml right?
[Loke]: No it does not have any entry for provisioning in this file, it is network entry for J3Mgmt,Tenant,InternalApi, and External. These n/w's are added as vlan based under the br-ext bridge. provisioning network I am creating after the overcloud is deployed and before the baremetal node is provisioned. in the provisioning network, we are giving the range of the ironic network. (172.23.3.x)
And an entry in 'networks' for the controller role in /home/stack/templates/roles_data.yaml?
[Loke]: we also did not added a similar entry in the roles_data.yaml as well.
Just to add with these two files we have rendered the remaining templates.
2. Also are these commands(mentioned above) configure Baremetal services are fine.
Yes, what you are doing makes sense.
I'm actually not sure why it did'nt work with your previous configuration, it got the information about NBP file and obviously attempted to download it from 30.30.30.220. With routing in place, that should work.
Changeing the ServiceNetMap to move IronicNetwork services to the 172.23.3 would avoid the routing.
[Loke] : we can try this but are somehow not able to do so because of some weird reasons.
What is NeutronBridgeMappings? br-baremetal maps to the physical network of the overcloud `provisioning` neutron network?
[Loke] : yes , we create br-barmetal and then we create provisioning network mapping it to br-baremetal.
Also attaching the complete rendered template folder along with custom yaml files that I am using, maybe referring that you might have a more clear picture of our problem. Any clue would help. Our problem, we are not able to provision the baremetal node after the overcloud is deployed. Do we have any straight-forward documents using which we can test the baremetal provision, please provide that.
Thanks once again for reading all these.
-- Harald
- skype: lokendrarathour
--
On Fri, Feb 11, 2022 at 6:32 AM Lokendra Rathour <lokendrarathour@gmail.com> wrote:
Hi Harald/ Openstack Team, Thank you again for your support.
We have successfully provisioned the baremetal node as per the inputs you shared. The only change we made was to add an entry for the ServiceNetMap.
Further, we were trying to launch a baremetal node instance, and we are facing the issue mentioned below:
[trim'ed picture because of message size]
2022-02-11 18:13:45.840 7 ERROR nova.compute.manager [req-aafdea4d-815f-4504-b7d7-4fd95d1e083e - - - - -] Error updating resources for node 9560bc2d-5f94-4ba0-9711-340cb8ad7d8a.: nova.exception.NoResourceClass: Resource class not found for Ironic node 9560bc2d-5f94-4ba0-9711-340cb8ad7d8a.
2022-02-11 18:13:45.840 7 ERROR nova.compute.manager Traceback (most recent call last):
2022-02-11 18:13:45.840 7 ERROR nova.compute.manager   File "/usr/lib/python3.6/site-packages/nova/compute/manager.py", line 8894, in _update_available_resource_for_node
So this exception can only be raised if the resource_class field is just not populated for the node. It is a required field for nova/ironic integration. Interestingly enough, the UUID in the error doesn't match the baremetal node below; I don't know if that is intentional.
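For background, nova derives the placement resource class from the Ironic node's resource_class by normalizing it (uppercased, non-alphanumerics replaced with underscores, prefixed with CUSTOM_). A minimal sketch of that normalization (an illustration, not nova's actual code path) can be used to check that a flavor's resources: extra-spec key matches the node:

```python
import re


def normalize_resource_class(name: str) -> str:
    """Mimic how an Ironic resource_class becomes a custom placement
    resource class: uppercase, replace anything outside [A-Z0-9_]
    with '_', and prefix with 'CUSTOM_'."""
    return "CUSTOM_" + re.sub(r"[^A-Z0-9_]", "_", name.upper())


# The node in this thread:
print(normalize_resource_class("baremetal-resource-class"))
# -> CUSTOM_BAREMETAL_RESOURCE_CLASS
```

Here the node's baremetal-resource-class maps to CUSTOM_BAREMETAL_RESOURCE_CLASS, which is exactly the key the flavor's extra_specs request, so a populated resource_class on the right node UUID should satisfy the scheduler.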
For your reference, please refer to the following details:

(overcloud) [stack@undercloud v4]$ openstack baremetal node show baremetal-node --fit-width
+------------------------+-------------------------------------------------------------------------------------------------------------------+ | Field | Value |
+------------------------+-------------------------------------------------------------------------------------------------------------------+ | allocation_uuid | None | | automated_clean | None | | bios_interface | no-bios | | boot_interface | ipxe | | chassis_uuid | None | | clean_step | {} | | conductor | overcloud-controller-0.localdomain | | conductor_group | | | console_enabled | False | | console_interface | ipmitool-socat | | created_at | 2022-02-11T13:02:40+00:00 | | deploy_interface | iscsi | | deploy_step | {} | | description | None | | driver | ipmi | | driver_info | {'ipmi_port': 623, 'ipmi_username': 'hsc', 'ipmi_password': '******', 'ipmi_address': '10.0.1.183', | | | 'deploy_kernel': 'bc62f3dc-d091-4dbd-b730-cf7b6cb48625', 'deploy_ramdisk': | | | 'd58bcc08-cb7c-4f21-8158-0a5ed4198108'} | | driver_internal_info | {'agent_erase_devices_iterations': 1, 'agent_erase_devices_zeroize': True, 'agent_continue_if_ata_erase_failed': | | | False, 'agent_enable_ata_secure_erase': True, 'disk_erasure_concurrency': 1, 'last_power_state_change': | | | '2022-02-11T13:14:29.581361', 'agent_version': '5.0.5.dev25', 'agent_last_heartbeat': | | | '2022-02-11T13:14:24.151928', 'hardware_manager_version': {'generic_hardware_manager': '1.1'}, | | | 'agent_cached_clean_steps': {'deploy': [{'step': 'erase_devices', 'priority': 10, 'interface': 'deploy', | | | 'reboot_requested': False, 'abortable': True}, {'step': 'erase_devices_metadata', 'priority': 99, 'interface': | | | 'deploy', 'reboot_requested': False, 'abortable': True}], 'raid': [{'step': 'delete_configuration', 'priority': | | | 0, 'interface': 'raid', 'reboot_requested': False, 'abortable': True}, {'step': 'create_configuration', | | | 'priority': 0, 'interface': 'raid', 'reboot_requested': False, 'abortable': True}]}, | | | 'agent_cached_clean_steps_refreshed': '2022-02-11 13:14:22.580729', 'clean_steps': None} | | extra | {} | | fault | None | | inspect_interface | inspector | | inspection_finished_at | None | 
| inspection_started_at | None | | instance_info | {} | | instance_uuid | None | | last_error | None | | maintenance | False | | maintenance_reason | None | | management_interface | ipmitool | | name | baremetal-node | | network_interface | flat | | owner | None | | power_interface | ipmitool | | power_state | power off |
| properties | {'cpus': 20, 'memory_mb': 63700, 'local_gb': 470, 'cpu_arch': 'x86_64', 'capabilities': || | 'boot_option:local,boot_mode:uefi', 'vendor': 'hewlett-packard'} | | protected | False | | protected_reason | None | | provision_state | available | | provision_updated_at | 2022-02-11T13:14:51+00:00 | | raid_config | {} | | raid_interface | no-raid | | rescue_interface | agent | | reservation | None | | resource_class | baremetal-resource-class | | storage_interface | noop | | target_power_state | None | | target_provision_state | None | | target_raid_config | {} | | traits | [] | | updated_at | 2022-02-11T13:14:52+00:00 | | uuid | e64ad28c-43d6-4b9f-aa34-f8bc58e9e8fe | | vendor_interface | ipmitool |
+------------------------+-------------------------------------------------------------------------------------------------------------------+ (overcloud) [stack@undercloud v4]$
(overcloud) [stack@undercloud v4]$ openstack flavor show my-baremetal-flavor --fit-width
+----------------------------+---------------------------------------------------------------------------------------------------------------+ | Field | Value |
+----------------------------+---------------------------------------------------------------------------------------------------------------+ | OS-FLV-DISABLED:disabled | False | | OS-FLV-EXT-DATA:ephemeral | 0 | | access_project_ids | None | | description | None | | disk | 470 |
| extra_specs | {'resources:CUSTOM_BAREMETAL_RESOURCE_CLASS': '1', 'resources:VCPU': '0', 'resources:MEMORY_MB': '0', || | 'resources:DISK_GB': '0', 'capabilities:boot_option': 'local,boot_mode:uefi'} | | id | 66a13404-4c47-4b67-b954-e3df42ae8103 | | name | my-baremetal-flavor | | os-flavor-access:is_public | True |
| properties | capabilities:boot_option='local,boot_mode:uefi', resources:CUSTOM_BAREMETAL_RESOURCE_CLASS='1', || | resources:DISK_GB='0', resources:MEMORY_MB='0', resources:VCPU='0' | | ram | 63700 | | rxtx_factor | 1.0 | | swap | 0 | | vcpus | 20 |
+----------------------------+---------------------------------------------------------------------------------------------------------------+
However you've set your capabilities field, it is actually unable to be parsed. Then again, it doesn't *have* to be defined to match the baremetal node; the setting can still apply on the baremetal node if that is the operational default as defined on the machine itself. I suspect, depending on the precise nova settings, this would result in an inability to schedule onto the node because nova would parse it incorrectly, possibly looking for a key of "capabilities:boot_option" instead of "capabilities".
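To illustrate the parsing point: an Ironic capabilities string is a comma-separated list of key:value pairs. A toy parser (illustrative only, not the nova/ironic implementation) shows why folding both capabilities into a single flavor key loses information:

```python
def parse_capabilities(caps: str) -> dict:
    """Parse an Ironic-style capabilities string, e.g.
    'boot_option:local,boot_mode:uefi' ->
    {'boot_option': 'local', 'boot_mode': 'uefi'}.
    Fragments without a colon are dropped."""
    result = {}
    for pair in caps.split(","):
        if ":" in pair:
            key, _, value = pair.partition(":")
            result[key.strip()] = value.strip()
    return result


# Correctly formed node property: two separate capabilities.
print(parse_capabilities("boot_option:local,boot_mode:uefi"))
# -> {'boot_option': 'local', 'boot_mode': 'uefi'}

# The flavor above instead stores 'local,boot_mode:uefi' as the value
# of the single key capabilities:boot_option; a scheduler comparing
# boot_option against 'local,boot_mode:uefi' would never match 'local'.
```

The likely fix (my reading of the advice above, so treat it as an assumption) is to give the flavor two separate keys, capabilities:boot_option='local' and capabilities:boot_mode='uefi'.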
Can you please check and suggest if something is missing?
Thanks once again for your support.
-Lokendra
On Thu, Feb 10, 2022 at 10:09 PM Lokendra Rathour < lokendrarathour@gmail.com> wrote:
Hi Harald, Thanks for the response, please find my response inline:
On Thu, Feb 10, 2022 at 8:24 PM Harald Jensas <hjensas@redhat.com> wrote:
On 2/10/22 14:49, Lokendra Rathour wrote:
Hi Harald, Thanks once again for your support. We tried activating the parameters in environments/network-environments.yaml:

ServiceNetMap:
  IronicApiNetwork: provisioning
  IronicNetwork: provisioning

After changing these values, both updated and fresh deployments are failing.
How did deployment fail?
[Loke]: it failed immediately after the IP for the ctlplane network is assigned, ssh is established, and stack creation is completed; I think at the start of ansible execution.
Error: "enabling ssh admin - COMPLETE. Host 10.0.1.94 not found in /home/stack/.ssh/known_hosts". Although this message is seen even when the deployment is successful, so I do not think this is the culprit.
The command that we are using to deploy the OpenStack overcloud:

openstack overcloud deploy --templates \
 -n /home/stack/templates/network_data.yaml \
 -r /home/stack/templates/roles_data.yaml \
 -e /home/stack/templates/node-info.yaml \
 -e /home/stack/templates/environment.yaml \
 -e /home/stack/templates/environments/network-isolation.yaml \
 -e /home/stack/templates/environments/network-environment.yaml \
What modifications did you do to network-isolation.yaml and
[Loke]: *Network-isolation.yaml:*
# Enable the creation of Neutron networks for isolated Overcloud
# traffic and configure each role to assign ports (related
# to that role) on these networks.
resource_registry:
  # networks as defined in network_data.yaml
  OS::TripleO::Network::J3Mgmt: ../network/j3mgmt.yaml
  OS::TripleO::Network::Tenant: ../network/tenant.yaml
  OS::TripleO::Network::InternalApi: ../network/internal_api.yaml
  OS::TripleO::Network::External: ../network/external.yaml
  # Port assignments for the VIPs
  OS::TripleO::Network::Ports::J3MgmtVipPort: ../network/ports/j3mgmt.yaml
  OS::TripleO::Network::Ports::InternalApiVipPort: ../network/ports/internal_api.yaml
  OS::TripleO::Network::Ports::ExternalVipPort: ../network/ports/external.yaml
  OS::TripleO::Network::Ports::RedisVipPort: ../network/ports/vip.yaml
  OS::TripleO::Network::Ports::OVNDBsVipPort: ../network/ports/vip.yaml
  # Port assignments by role, edit role definition to assign networks to roles.
  # Port assignments for the Controller
  OS::TripleO::Controller::Ports::J3MgmtPort: ../network/ports/j3mgmt.yaml
  OS::TripleO::Controller::Ports::TenantPort: ../network/ports/tenant.yaml
  OS::TripleO::Controller::Ports::InternalApiPort: ../network/ports/internal_api.yaml
  OS::TripleO::Controller::Ports::ExternalPort: ../network/ports/external.yaml
  # Port assignments for the Compute
  OS::TripleO::Compute::Ports::J3MgmtPort: ../network/ports/j3mgmt.yaml
  OS::TripleO::Compute::Ports::TenantPort: ../network/ports/tenant.yaml
  OS::TripleO::Compute::Ports::InternalApiPort: ../network/ports/internal_api.yaml
network-environment.yaml?
resource_registry:
  # Network Interface templates to use (these files must exist). You can
  # override these by including one of the net-*.yaml environment files,
  # such as net-bond-with-vlans.yaml, or modifying the list here.
  # Port assignments for the Controller
  OS::TripleO::Controller::Net::SoftwareConfig: ../network/config/bond-with-vlans/controller.yaml
  # Port assignments for the Compute
  OS::TripleO::Compute::Net::SoftwareConfig: ../network/config/bond-with-vlans/compute.yaml
parameter_defaults:
  J3MgmtNetCidr: '80.0.1.0/24'
  J3MgmtAllocationPools: [{'start': '80.0.1.4', 'end': '80.0.1.250'}]
  J3MgmtNetworkVlanID: 400

  TenantNetCidr: '172.16.0.0/24'
  TenantAllocationPools: [{'start': '172.16.0.4', 'end': '172.16.0.250'}]
  TenantNetworkVlanID: 416
  TenantNetPhysnetMtu: 1500

  InternalApiNetCidr: '172.16.2.0/24'
  InternalApiAllocationPools: [{'start': '172.16.2.4', 'end': '172.16.2.250'}]
  InternalApiNetworkVlanID: 418

  ExternalNetCidr: '10.0.1.0/24'
  ExternalAllocationPools: [{'start': '10.0.1.85', 'end': '10.0.1.98'}]
  ExternalNetworkVlanID: 408

  DnsServers: []
  NeutronNetworkType: 'geneve,vlan'
  NeutronNetworkVLANRanges: 'datacentre:1:1000'
  BondInterfaceOvsOptions: "bond_mode=active-backup"
I typically use:
 -e /usr/share/openstack-tripleo-heat-templates/environments/network-isolation.yaml
 -e /usr/share/openstack-tripleo-heat-templates/environments/network-environment.yaml
 -e /home/stack/templates/environments/network-overrides.yaml
The network-isolation.yaml and network-environment.yaml are Jinja2 rendered based on the -n input, so to keep in sync with changes in the `-n` file, reference the files in /usr/share/openstack-tripleo-heat-templates. Then add overrides in network-overrides.yaml as needed.
[Loke]: we are using it like this only; I do not know what you pass in network-overrides.yaml, but I pass the other files as per the commands below:
[stack@undercloud templates]$ cat environment.yaml
parameter_defaults:
  ControllerCount: 3
  TimeZone: 'Asia/Kolkata'
  NtpServer: ['30.30.30.3']
  NeutronBridgeMappings: datacentre:br-ex,baremetal:br-baremetal
  NeutronFlatNetworks: datacentre,baremetal

[stack@undercloud templates]$ cat ironic-config.yaml
parameter_defaults:
  IronicEnabledHardwareTypes:
    - ipmi
    - redfish
  IronicEnabledPowerInterfaces:
    - ipmitool
    - redfish
  IronicEnabledManagementInterfaces:
    - ipmitool
    - redfish
  IronicCleaningDiskErase: metadata
  IronicIPXEEnabled: true
  IronicInspectorSubnets:
    - ip_range: 172.23.3.100,172.23.3.150
  IPAImageURLs: '["http://30.30.30.1:8088/agent.kernel", "http://30.30.30.1:8088/agent.ramdisk"]'
  IronicInspectorInterface: 'br-baremetal'

[stack@undercloud templates]$ cat node-info.yaml
parameter_defaults:
  OvercloudControllerFlavor: control
  OvercloudComputeFlavor: compute
  ControllerCount: 3
  ComputeCount: 1
 -e /usr/share/openstack-tripleo-heat-templates/environments/services/ironic-conductor.yaml \
 -e /usr/share/openstack-tripleo-heat-templates/environments/services/ironic-inspector.yaml \
 -e /usr/share/openstack-tripleo-heat-templates/environments/services/ironic-overcloud.yaml \
 -e /home/stack/templates/ironic-config.yaml \
 -e /usr/share/openstack-tripleo-heat-templates/environments/docker-ha.yaml \
 -e /usr/share/openstack-tripleo-heat-templates/environments/podman.yaml \
 -e /home/stack/containers-prepare-parameter.yaml
/home/stack/templates/ironic-config.yaml:

(overcloud) [stack@undercloud ~]$ cat /home/stack/templates/ironic-config.yaml
parameter_defaults:
  IronicEnabledHardwareTypes:
    - ipmi
    - redfish
  IronicEnabledPowerInterfaces:
    - ipmitool
    - redfish
  IronicEnabledManagementInterfaces:
    - ipmitool
    - redfish
  IronicCleaningDiskErase: metadata
  IronicIPXEEnabled: true
  IronicInspectorSubnets:
    - ip_range: 172.23.3.100,172.23.3.150
  IPAImageURLs: '["http://30.30.30.1:8088/agent.kernel", "http://30.30.30.1:8088/agent.ramdisk"]'
  IronicInspectorInterface: 'br-baremetal'
Also, the baremetal (provisioning) network (172.23.3.x) is routed to the ctlplane/admin network (30.30.30.x).
Unless the network you created in the overcloud is named `provisioning`, these parameters may be relevant.
IronicCleaningNetwork:
  default: 'provisioning'
  description: Name or UUID of the *overcloud* network used for cleaning
    bare metal nodes. The default value of "provisioning" can be left during
    the initial deployment (when no networks are created yet) and should be
    changed to an actual UUID in a post-deployment stack update.
  type: string

IronicProvisioningNetwork:
  default: 'provisioning'
  description: Name or UUID of the *overcloud* network used for provisioning
    of bare metal nodes, if IronicDefaultNetworkInterface is set to
    "neutron". The default value of "provisioning" can be left during the
    initial deployment (when no networks are created yet) and should be
    changed to an actual UUID in a post-deployment stack update.
  type: string

IronicRescuingNetwork:
  default: 'provisioning'
  description: Name or UUID of the *overcloud* network used for rescuing
    bare metal nodes, if IronicDefaultRescueInterface is not set to
    "no-rescue". The default value of "provisioning" can be left during the
    initial deployment (when no networks are created yet) and should be
    changed to an actual UUID in a post-deployment stack update.
  type: string
Query:
1. Is there any other location/way where we should add these so that they are included without error?

ServiceNetMap:
  IronicApiNetwork: provisioning
  IronicNetwork: provisioning
`provisioning` network is defined in -n /home/stack/templates/network_data.yaml right?
[Loke]: No, it does not have any entry for provisioning in this file; it has network entries for J3Mgmt, Tenant, InternalApi, and External. These networks are added as VLAN-based under the br-ext bridge. The provisioning network I am creating after the overcloud is deployed and before the baremetal node is provisioned. In the provisioning network, we are giving the range of the ironic network (172.23.3.x).
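Since the inspector range (172.23.3.100-150) and the overcloud allocation pool (172.23.3.200-240) have to sit on the same subnet without overlapping, a quick sanity script with Python's standard ipaddress module (my own check, not part of TripleO) can confirm the layout:

```python
import ipaddress

subnet = ipaddress.ip_network("172.23.3.0/24")


def ip_range(start, end):
    """Expand an inclusive start,end pair into a set of addresses."""
    a, b = ipaddress.ip_address(start), ipaddress.ip_address(end)
    return {ipaddress.ip_address(i) for i in range(int(a), int(b) + 1)}


inspector = ip_range("172.23.3.100", "172.23.3.150")    # IronicInspectorSubnets
neutron_pool = ip_range("172.23.3.200", "172.23.3.240")  # overcloud subnet pool

assert all(ip in subnet for ip in inspector | neutron_pool)  # same /24
assert not (inspector & neutron_pool)                        # no overlap
print("ranges are on", subnet, "and do not overlap")
```

This matches the guidance in the RHOSP docs quoted earlier in the thread: same subnet as the inspector range, different allocation pool.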
And an entry in 'networks' for the controller role in /home/stack/templates/roles_data.yaml?
[Loke]: we also did not add a similar entry in roles_data.yaml.
Just to add: with these two files we have rendered the remaining templates.
2. Also, are the commands (mentioned above) to configure the Baremetal services fine?
Yes, what you are doing makes sense.
I'm actually not sure why it didn't work with your previous configuration; it got the information about the NBP file and obviously attempted to download it from 30.30.30.220. With routing in place, that should work.
Changing the ServiceNetMap to move the IronicNetwork services to the 172.23.3.x network would avoid the routing.
[Loke]: we can try this but somehow are not able to do so, for some weird reasons.
What is NeutronBridgeMappings? br-baremetal maps to the physical network of the overcloud `provisioning` neutron network?
[Loke]: yes, we create br-baremetal and then we create the provisioning network, mapping it to br-baremetal.
Also attaching the complete rendered template folder along with the custom yaml files that I am using; maybe referring to that will give you a clearer picture of our problem. Any clue would help. Our problem: we are not able to provision the baremetal node after the overcloud is deployed. If there is any straightforward document with which we can test baremetal provisioning, please provide that.
Thanks once again for reading all this.
-- Harald
Hi Julia, Thanks once again. We got your point and understood the issue, but we are still facing the same issue on our TripleO Train HA setup, even with the settings done as per your recommendations. The error that we are seeing is again "No valid host was found":

(overcloud) [stack@undercloud v4]$ openstack server show bm-server --fit-width
+-------------------------------------+----------------------------------------------------------------------------------------+ | Field | Value | +-------------------------------------+----------------------------------------------------------------------------------------+ | OS-DCF:diskConfig | MANUAL | | OS-EXT-AZ:availability_zone | | | OS-EXT-SRV-ATTR:host | None | | OS-EXT-SRV-ATTR:hostname | bm-server | | OS-EXT-SRV-ATTR:hypervisor_hostname | None | | OS-EXT-SRV-ATTR:instance_name | instance-00000014 | | OS-EXT-SRV-ATTR:kernel_id | | | OS-EXT-SRV-ATTR:launch_index | 0 | | OS-EXT-SRV-ATTR:ramdisk_id | | | OS-EXT-SRV-ATTR:reservation_id | r-npd6m9ah | | OS-EXT-SRV-ATTR:root_device_name | None | | OS-EXT-SRV-ATTR:user_data | I2Nsb3VkLWNvbmZpZwpkaXNhYmxlX3Jvb3Q6IGZhbHNlCnBhc3N3b3JkOiBoc2MzMjEKc3NoX3B3YXV0aDogdH | | | J1ZQptYW5hZ2VfZXRjX2hvc3RzOiB0cnVlCmNocGFzc3dkOiB7ZXhwaXJlOiBmYWxzZSB9Cg== | | OS-EXT-STS:power_state | NOSTATE | | OS-EXT-STS:task_state | None | | OS-EXT-STS:vm_state | error | | OS-SRV-USG:launched_at | None | | OS-SRV-USG:terminated_at | None | | accessIPv4 | | | accessIPv6 | | | addresses | | | config_drive | True | | created | 2022-02-14T10:20:48Z | | description | None | | fault | {'code': 500, 'created': '2022-02-14T10:20:49Z', 'message': 'No valid host was found. 
| | | There are not enough hosts available.', 'details': 'Traceback (most recent call | | | last):\n File "/usr/lib/python3.6/site-packages/nova/conductor/manager.py", line | | | 1379, in schedule_and_build_instances\n instance_uuids, return_alternates=True)\n | | | File "/usr/lib/python3.6/site-packages/nova/conductor/manager.py", line 839, in | | | _schedule_instances\n return_alternates=return_alternates)\n File | | | "/usr/lib/python3.6/site-packages/nova/scheduler/client/query.py", line 42, in | | | select_destinations\n instance_uuids, return_objects, return_alternates)\n File | | | "/usr/lib/python3.6/site-packages/nova/scheduler/rpcapi.py", line 160, in | | | select_destinations\n return cctxt.call(ctxt, \'select_destinations\', | | | **msg_args)\n File "/usr/lib/python3.6/site-packages/oslo_messaging/rpc/client.py", | | | line 181, in call\n transport_options=self.transport_options)\n File | | | "/usr/lib/python3.6/site-packages/oslo_messaging/transport.py", line 129, in _send\n | | | transport_options=transport_options)\n File "/usr/lib/python3.6/site- | | | packages/oslo_messaging/_drivers/amqpdriver.py", line 674, in send\n | | | transport_options=transport_options)\n File "/usr/lib/python3.6/site- | | | packages/oslo_messaging/_drivers/amqpdriver.py", line 664, in _send\n raise | | | result\nnova.exception_Remote.NoValidHost_Remote: No valid host was found. 
There are | | | not enough hosts available.\nTraceback (most recent call last):\n\n File | | | "/usr/lib/python3.6/site-packages/oslo_messaging/rpc/server.py", line 235, in inner\n | | | return func(*args, **kwargs)\n\n File "/usr/lib/python3.6/site- | | | packages/nova/scheduler/manager.py", line 214, in select_destinations\n | | | allocation_request_version, return_alternates)\n\n File "/usr/lib/python3.6/site- | | | packages/nova/scheduler/filter_scheduler.py", line 96, in select_destinations\n | | | allocation_request_version, return_alternates)\n\n File "/usr/lib/python3.6/site- | | | packages/nova/scheduler/filter_scheduler.py", line 265, in _schedule\n | | | claimed_instance_uuids)\n\n File "/usr/lib/python3.6/site- | | | packages/nova/scheduler/filter_scheduler.py", line 302, in _ensure_sufficient_hosts\n | | | raise exception.NoValidHost(reason=reason)\n\nnova.exception.NoValidHost: No valid | | | host was found. There are not enough hosts available.\n\n'} | | flavor | disk='470', ephemeral='0', | | | extra_specs.capabilities='boot_mode:uefi,boot_option:local', | | | extra_specs.resources:CUSTOM_BAREMETAL_RESOURCE_CLASS='1', | | | extra_specs.resources:DISK_GB='0', extra_specs.resources:MEMORY_MB='0', | | | extra_specs.resources:VCPU='0', original_name='bm-flavor', ram='63700', swap='0', | | | vcpus='20' | | hostId | | | host_status | | | id | 49944a1f-7758-4522-9ef1-867ede44b3fc | | image | whole-disk-centos (80724772-c760-4136-b453-754456d7c549) | | key_name | None | | locked | False | | locked_reason | None | | name | bm-server | | project_id | 8dde31e24eba41bfb7212ae154d61268 | | properties | | | server_groups | [] | | status | ERROR | | tags | [] | | trusted_image_certificates | None | | updated | 2022-02-14T10:20:49Z | | user_id | f689d147221549f1a6cbd1310078127d | | volumes_attached | | +-------------------------------------+----------------------------------------------------------------------------------------+ (overcloud) [stack@undercloud v4]$ 
(overcloud) [stack@undercloud v4]$ For your reference our update flavor and baremetal node properties are as below: (overcloud) [stack@undercloud v4]$ *openstack flavor show bm-flavor --fit-width* +----------------------------+-------------------------------------------------------------------------------------------------+ | Field | Value | +----------------------------+-------------------------------------------------------------------------------------------------+ | OS-FLV-DISABLED:disabled | False | | OS-FLV-EXT-DATA:ephemeral | 0 | | access_project_ids | None | | description | None | | disk | 470 | | extra_specs | {'resources:CUSTOM_BAREMETAL_RESOURCE_CLASS': '1', 'resources:VCPU': '0', | | | 'resources:MEMORY_MB': '0', 'resources:DISK_GB': '0', 'capabilities': | | | 'boot_mode:uefi,boot_option:local'} | | id | 021c3021-56ec-4eba-bf57-c516ee9b2ee3 | | name | bm-flavor | | os-flavor-access:is_public | True | | properties |* capabilities='boot_mode:uefi,boot_option:local', resources:CUSTOM_BAREMETAL_RESOURCE_CLASS='1', |* | | resources:DISK_GB='0', resources:MEMORY_MB='0', resources:VCPU='0' | | ram | 63700 | | rxtx_factor | 1.0 | | swap | 0 | | vcpus | 20 | +----------------------------+-------------------------------------------------------------------------------------------------+ (overcloud) [stack@undercloud v4]$ (overcloud) [stack@undercloud v4]$ (overcloud) [stack@undercloud v4]$* openstack baremetal node show baremetal-node --fit-width* +------------------------+-----------------------------------------------------------------------------------------------------+ | Field | Value | +------------------------+-----------------------------------------------------------------------------------------------------+ | allocation_uuid | None | | automated_clean | None | | bios_interface | no-bios | | boot_interface | ipxe | | chassis_uuid | None | | clean_step | {} | | conductor | overcloud-controller-0.localdomain | | conductor_group | | | console_enabled | 
False | | console_interface | ipmitool-socat | | created_at | 2022-02-14T10:05:32+00:00 | | deploy_interface | iscsi | | deploy_step | {} | | description | None | | driver | ipmi | | driver_info | {'ipmi_port': 623, 'ipmi_username': 'hsc', 'ipmi_password': '******', 'ipmi_address': '10.0.1.183', | | | 'deploy_kernel': '95a5b644-c04e-4a66-8f2b-e1e9806bed6e', 'deploy_ramdisk': | | | '17644220-e623-4981-ae77-d789657851ba'} | | driver_internal_info | {'agent_erase_devices_iterations': 1, 'agent_erase_devices_zeroize': True, | | | 'agent_continue_if_ata_erase_failed': False, 'agent_enable_ata_secure_erase': True, | | | 'disk_erasure_concurrency': 1, 'last_power_state_change': '2022-02-14T10:15:05.062161', | | | 'agent_version': '5.0.5.dev25', 'agent_last_heartbeat': '2022-02-14T10:14:59.666025', | | | 'hardware_manager_version': {'generic_hardware_manager': '1.1'}, 'agent_cached_clean_steps': | | | {'deploy': [{'step': 'erase_devices', 'priority': 10, 'interface': 'deploy', 'reboot_requested': | | | False, 'abortable': True}, {'step': 'erase_devices_metadata', 'priority': 99, 'interface': | | | 'deploy', 'reboot_requested': False, 'abortable': True}], 'raid': [{'step': 'delete_configuration', | | | 'priority': 0, 'interface': 'raid', 'reboot_requested': False, 'abortable': True}, {'step': | | | 'create_configuration', 'priority': 0, 'interface': 'raid', 'reboot_requested': False, 'abortable': | | | True}]}, 'agent_cached_clean_steps_refreshed': '2022-02-14 10:14:58.093777', 'clean_steps': None} | | extra | {} | | fault | None | | inspect_interface | inspector | | inspection_finished_at | None | | inspection_started_at | None | | instance_info | {} | | instance_uuid | None | | last_error | None | | maintenance | False | | maintenance_reason | None | | management_interface | ipmitool | | name | baremetal-node | | network_interface | flat | | owner | None | | power_interface | ipmitool | | power_state | power off | | *properties | {'cpus': 20, 'memory_mb': 63700, 
'local_gb': 470, 'cpu_arch': 'x86_64', 'capabilities': || | 'boot_mode:uefi,boot_option:local', 'vendor': 'hewlett-packard'} * | | protected | False | | protected_reason | None | | provision_state | available | | provision_updated_at | 2022-02-14T10:15:27+00:00 | | raid_config | {} | | raid_interface | no-raid | | rescue_interface | agent | | reservation | None | | resource_class | baremetal-resource-class | | storage_interface | noop | | target_power_state | None | | target_provision_state | None | | target_raid_config | {} | | traits | [] | | updated_at | 2022-02-14T10:15:27+00:00 | | uuid | cd021878-40eb-407c-87c5-ce6ef92d29eb | | vendor_interface | ipmitool | +------------------------+-----------------------------------------------------------------------------------------------------+ (overcloud) [stack@undercloud v4]$, On further debugging, we found that in the nova-scheduler logs : *2022-02-14 12:58:22.830 7 WARNING keystoneauth.discover [-] Failed to contact the endpoint at http://172.16.2.224:8778/placement <http://172.16.2.224:8778/placement> for discovery. Fallback to using that endpoint as the base url.2022-02-14 12:58:23.438 7 WARNING keystoneauth.discover [req-ad5801e4-efd7-4159-a601-68e72c0d651f - - - - -] Failed to contact the endpoint at http://172.16.2.224:8778/placement <http://172.16.2.224:8778/placement> for discovery. Fallback to using that endpoint as the base url.* where 172.16.2.224 is the internal IP. 
Going by the document Bare Metal Instances in Overcloud — TripleO documentation <https://docs.openstack.org/project-deploy-guide/tripleo-docs/latest/features/baremetal_overcloud.html>, the commands are given as below:

(overcloud) [root@overcloud-controller-0 ~]# endpoint=http://172.16.2.224:8778/placement
(overcloud) [root@overcloud-controller-0 ~]# token=$(openstack token issue -f value -c id)
(overcloud) [root@overcloud-controller-0 ~]# curl -sH "X-Auth-Token: $token" $endpoint/resource_providers/<node id> | jq .inventories
*null*

The result is the same even if we run the curl command on the public endpoint. Please advise.

On Sat, Feb 12, 2022 at 12:45 AM Julia Kreger <juliaashleykreger@gmail.com> wrote:
On Fri, Feb 11, 2022 at 6:32 AM Lokendra Rathour < lokendrarathour@gmail.com> wrote:
Hi Harald/ Openstack Team, Thank you again for your support.
we have successfully provisioned the baremetal node as per the inputs shared by you. The only change that we did was to add an entry for the ServiceNetmap.
Further, we were trying to launch the baremetal node instance in which we are facing ISSUE as mentioned below:
[trim'ed picture because of message size]
*"2022-02-11 18:13:45.840 7 ERROR nova.compute.manager [req-aafdea4d-815f-4504-b7d7-4fd95d1e083e - - - - -] Error updating resources for node 9560bc2d-5f94-4ba0-9711-340cb8ad7d8a.: nova.exception.NoResourceClass: Resource class not found for Ironic node 9560bc2d-5f94-4ba0-9711-340cb8ad7d8a.*
*2022-02-11 18:13:45.840 7 ERROR nova.compute.manager Traceback (most recent call last):2022-02-11 18:13:45.840 7 ERROR nova.compute.manager File "/usr/lib/python3.6/site-packages/nova/compute/manager.py", line 8894, in _update_available_resource_for_node* "
So this exception can only be raised if the resource_class field is just not populated for the node. It is a required field for nova/ironic integration. Also, interestingly enough, the UUID in this error doesn't match the baremetal node below. I don't know if that is intentional?
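For reference, aligning the two usually means setting the node's resource_class and pointing the flavor at the derived CUSTOM_* class. A sketch, using the node and flavor names from this thread:

```shell
# Sketch: make the node's resource_class and the flavor's custom
# resource class line up. Ironic's "baremetal-resource-class" becomes
# "CUSTOM_BAREMETAL_RESOURCE_CLASS" in placement (upper-cased, dashes
# replaced with underscores, CUSTOM_ prefix added).
openstack baremetal node set baremetal-node \
    --resource-class baremetal-resource-class

openstack flavor set bm-flavor \
    --property resources:CUSTOM_BAREMETAL_RESOURCE_CLASS=1 \
    --property resources:VCPU=0 \
    --property resources:MEMORY_MB=0 \
    --property resources:DISK_GB=0
```

Zeroing VCPU/MEMORY_MB/DISK_GB makes the scheduler rely only on the custom resource class.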
For your reference, please find the following details:

(overcloud) [stack@undercloud v4]$ openstack baremetal node show baremetal-node --fit-width
+------------------------+-------------------------------------------------------------------------------------------------------------------+ | Field | Value |
+------------------------+-------------------------------------------------------------------------------------------------------------------+ | allocation_uuid | None | | automated_clean | None | | bios_interface | no-bios | | boot_interface | ipxe | | chassis_uuid | None | | clean_step | {} | | conductor | overcloud-controller-0.localdomain | | conductor_group | | | console_enabled | False | | console_interface | ipmitool-socat | | created_at | 2022-02-11T13:02:40+00:00 | | deploy_interface | iscsi | | deploy_step | {} | | description | None | | driver | ipmi | | driver_info | {'ipmi_port': 623, 'ipmi_username': 'hsc', 'ipmi_password': '******', 'ipmi_address': '10.0.1.183', | | | 'deploy_kernel': 'bc62f3dc-d091-4dbd-b730-cf7b6cb48625', 'deploy_ramdisk': | | | 'd58bcc08-cb7c-4f21-8158-0a5ed4198108'} | | driver_internal_info | {'agent_erase_devices_iterations': 1, 'agent_erase_devices_zeroize': True, 'agent_continue_if_ata_erase_failed': | | | False, 'agent_enable_ata_secure_erase': True, 'disk_erasure_concurrency': 1, 'last_power_state_change': | | | '2022-02-11T13:14:29.581361', 'agent_version': '5.0.5.dev25', 'agent_last_heartbeat': | | | '2022-02-11T13:14:24.151928', 'hardware_manager_version': {'generic_hardware_manager': '1.1'}, | | | 'agent_cached_clean_steps': {'deploy': [{'step': 'erase_devices', 'priority': 10, 'interface': 'deploy', | | | 'reboot_requested': False, 'abortable': True}, {'step': 'erase_devices_metadata', 'priority': 99, 'interface': | | | 'deploy', 'reboot_requested': False, 'abortable': True}], 'raid': [{'step': 'delete_configuration', 'priority': | | | 0, 'interface': 'raid', 'reboot_requested': False, 'abortable': True}, {'step': 'create_configuration', | | | 'priority': 0, 'interface': 'raid', 'reboot_requested': False, 'abortable': True}]}, | | | 'agent_cached_clean_steps_refreshed': '2022-02-11 13:14:22.580729', 'clean_steps': None} | | extra | {} | | fault | None | | inspect_interface | inspector | | inspection_finished_at | None | 
| inspection_started_at | None | | instance_info | {} | | instance_uuid | None | | last_error | None | | maintenance | False | | maintenance_reason | None | | management_interface | ipmitool | | name | baremetal-node | | network_interface | flat | | owner | None | | power_interface | ipmitool | | power_state | power off |
*| properties | {'cpus': 20, 'memory_mb': 63700, 'local_gb': 470, 'cpu_arch': 'x86_64', 'capabilities': || | 'boot_option:local,boot_mode:uefi', 'vendor': 'hewlett-packard'} * | | protected | False | | protected_reason | None | | provision_state | available | | provision_updated_at | 2022-02-11T13:14:51+00:00 | | raid_config | {} | | raid_interface | no-raid | | rescue_interface | agent | | reservation | None | *| resource_class | baremetal-resource-class * | | storage_interface | noop | | target_power_state | None | | target_provision_state | None | | target_raid_config | {} | | traits | [] | | updated_at | 2022-02-11T13:14:52+00:00 | | uuid | e64ad28c-43d6-4b9f-aa34-f8bc58e9e8fe | | vendor_interface | ipmitool |
+------------------------+-------------------------------------------------------------------------------------------------------------------+ (overcloud) [stack@undercloud v4]$
(overcloud) [stack@undercloud v4]$ openstack flavor show my-baremetal-flavor --fit-width
+----------------------------+---------------------------------------------------------------------------------------------------------------+ | Field | Value |
+----------------------------+---------------------------------------------------------------------------------------------------------------+ | OS-FLV-DISABLED:disabled | False | | OS-FLV-EXT-DATA:ephemeral | 0 | | access_project_ids | None | | description | None | | disk | 470 |
*| extra_specs | {'resources:CUSTOM_BAREMETAL_RESOURCE_CLASS': '1', 'resources:VCPU': '0', 'resources:MEMORY_MB': '0', || | 'resources:DISK_GB': '0', 'capabilities:boot_option': 'local,boot_mode:uefi'} * | | id | 66a13404-4c47-4b67-b954-e3df42ae8103 | | name | my-baremetal-flavor | | os-flavor-access:is_public | True |
*| properties | capabilities:boot_option='local,boot_mode:uefi', resources:CUSTOM_BAREMETAL_RESOURCE_CLASS='1', || | resources:DISK_GB='0', resources:MEMORY_MB='0', resources:VCPU='0' * | | ram | 63700 | | rxtx_factor | 1.0 | | swap | 0 | | vcpus | 20 |
+----------------------------+---------------------------------------------------------------------------------------------------------------+
However you've set your capabilities field, it actually cannot be parsed. Then again, it doesn't *have* to be defined to match the baremetal node. The setting can still apply on the baremetal node if that is the operational default for the machine as defined on the machine itself.
I suspect, based upon whatever the precise nova settings are, this would result in an inability to schedule on to the node because it would parse it incorrectly, possibly looking for a key value of "capabilities:boot_option", instead of "capabilities".
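To illustrate the parsing point with a small self-contained sketch (this is illustrative only, not nova's actual matching code): a node advertising `boot_mode:uefi,boot_option:local` can satisfy a flavor that carries one `capabilities:<name>` extra spec per capability, but not one that stuffs everything into a single `capabilities:boot_option` value:

```shell
# Illustrative sketch only -- not nova's actual matching code.
# A node advertises its capabilities as one comma-separated string:
node_caps='boot_mode:uefi,boot_option:local'

# Well-formed flavor extra specs carry one capability per key, e.g.
# capabilities:boot_option='local' -> look for 'boot_option:local'.
case ",$node_caps," in
  *",boot_option:local,"*) echo "well-formed spec: match" ;;
  *)                       echo "well-formed spec: no match" ;;
esac

# The malformed form capabilities:boot_option='local,boot_mode:uefi'
# asks for a single capability whose value no node ever reports.
case ",$node_caps," in
  *",boot_option:local,boot_mode:uefi,"*) echo "malformed spec: match" ;;
  *)                                      echo "malformed spec: no match" ;;
esac
```

Run as-is, this reports a match for the well-formed form and no match for the malformed one.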
Can you please check and suggest if something is missing.
Thanks once again for your support.
-Lokendra
On Thu, Feb 10, 2022 at 10:09 PM Lokendra Rathour < lokendrarathour@gmail.com> wrote:
Hi Harald, Thanks for the response, please find my response inline:
On Thu, Feb 10, 2022 at 8:24 PM Harald Jensas <hjensas@redhat.com> wrote:
Hi Harald, thanks once again for your support. We tried activating the parameters below.

On 2/10/22 14:49, Lokendra Rathour wrote:

ServiceNetMap:
  IronicApiNetwork: provisioning
  IronicNetwork: provisioning

at environments/network-environments.yaml. After changing these values, the updated and even fresh deployments are failing.
How did deployment fail?
[Loke]: it failed immediately after the IP for the ctlplane network is assigned, ssh is established, and stack creation is completed; I think at the start of the ansible execution.
Error: "enabling ssh admin - COMPLETE. Host 10.0.1.94 not found in /home/stack/.ssh/known_hosts". Although this message is seen even when the deployment is successful, so I do not think this is the culprit.
The command that we are using to deploy the OpenStack overcloud:

openstack overcloud deploy --templates \
 -n /home/stack/templates/network_data.yaml \
 -r /home/stack/templates/roles_data.yaml \
 -e /home/stack/templates/node-info.yaml \
 -e /home/stack/templates/environment.yaml \
 -e /home/stack/templates/environments/network-isolation.yaml \
 -e /home/stack/templates/environments/network-environment.yaml \
What modifications did you do to network-isolation.yaml and
[Loke]: *Network-isolation.yaml:*
# Enable the creation of Neutron networks for isolated Overcloud
# traffic and configure each role to assign ports (related
# to that role) on these networks.
resource_registry:
  # networks as defined in network_data.yaml
  OS::TripleO::Network::J3Mgmt: ../network/j3mgmt.yaml
  OS::TripleO::Network::Tenant: ../network/tenant.yaml
  OS::TripleO::Network::InternalApi: ../network/internal_api.yaml
  OS::TripleO::Network::External: ../network/external.yaml
  # Port assignments for the VIPs
  OS::TripleO::Network::Ports::J3MgmtVipPort: ../network/ports/j3mgmt.yaml
  OS::TripleO::Network::Ports::InternalApiVipPort: ../network/ports/internal_api.yaml
  OS::TripleO::Network::Ports::ExternalVipPort: ../network/ports/external.yaml
  OS::TripleO::Network::Ports::RedisVipPort: ../network/ports/vip.yaml
  OS::TripleO::Network::Ports::OVNDBsVipPort: ../network/ports/vip.yaml

  # Port assignments by role, edit role definition to assign networks to roles.
  # Port assignments for the Controller
  OS::TripleO::Controller::Ports::J3MgmtPort: ../network/ports/j3mgmt.yaml
  OS::TripleO::Controller::Ports::TenantPort: ../network/ports/tenant.yaml
  OS::TripleO::Controller::Ports::InternalApiPort: ../network/ports/internal_api.yaml
  OS::TripleO::Controller::Ports::ExternalPort: ../network/ports/external.yaml
  # Port assignments for the Compute
  OS::TripleO::Compute::Ports::J3MgmtPort: ../network/ports/j3mgmt.yaml
  OS::TripleO::Compute::Ports::TenantPort: ../network/ports/tenant.yaml
  OS::TripleO::Compute::Ports::InternalApiPort: ../network/ports/internal_api.yaml
network-environment.yaml?
resource_registry:
  # Network Interface templates to use (these files must exist). You can
  # override these by including one of the net-*.yaml environment files,
  # such as net-bond-with-vlans.yaml, or modifying the list here.
  # Port assignments for the Controller
  OS::TripleO::Controller::Net::SoftwareConfig: ../network/config/bond-with-vlans/controller.yaml
  # Port assignments for the Compute
  OS::TripleO::Compute::Net::SoftwareConfig: ../network/config/bond-with-vlans/compute.yaml

parameter_defaults:
  J3MgmtNetCidr: '80.0.1.0/24'
  J3MgmtAllocationPools: [{'start': '80.0.1.4', 'end': '80.0.1.250'}]
  J3MgmtNetworkVlanID: 400

  TenantNetCidr: '172.16.0.0/24'
  TenantAllocationPools: [{'start': '172.16.0.4', 'end': '172.16.0.250'}]
  TenantNetworkVlanID: 416
  TenantNetPhysnetMtu: 1500

  InternalApiNetCidr: '172.16.2.0/24'
  InternalApiAllocationPools: [{'start': '172.16.2.4', 'end': '172.16.2.250'}]
  InternalApiNetworkVlanID: 418

  ExternalNetCidr: '10.0.1.0/24'
  ExternalAllocationPools: [{'start': '10.0.1.85', 'end': '10.0.1.98'}]
  ExternalNetworkVlanID: 408

  DnsServers: []
  NeutronNetworkType: 'geneve,vlan'
  NeutronNetworkVLANRanges: 'datacentre:1:1000'
  BondInterfaceOvsOptions: "bond_mode=active-backup"
I typically use:

 -e /usr/share/openstack-tripleo-heat-templates/environments/network-isolation.yaml \
 -e /usr/share/openstack-tripleo-heat-templates/environments/network-environment.yaml \
 -e /home/stack/templates/environments/network-overrides.yaml

The network-isolation.yaml and network-environment.yaml are Jinja2 rendered based on the -n input, so to keep in sync with changes in the `-n` file, reference the files in /usr/share/openstack-tripleo-heat-templates. Then add overrides in network-overrides.yaml as needed.
[Loke]: we are using it like this only. I do not know what you pass in network-overrides.yaml, but I pass the other files as per the commands below:
[stack@undercloud templates]$ cat environment.yaml
parameter_defaults:
  ControllerCount: 3
  TimeZone: 'Asia/Kolkata'
  NtpServer: ['30.30.30.3']
  NeutronBridgeMappings: datacentre:br-ex,baremetal:br-baremetal
  NeutronFlatNetworks: datacentre,baremetal

[stack@undercloud templates]$ cat ironic-config.yaml
parameter_defaults:
  IronicEnabledHardwareTypes:
    - ipmi
    - redfish
  IronicEnabledPowerInterfaces:
    - ipmitool
    - redfish
  IronicEnabledManagementInterfaces:
    - ipmitool
    - redfish
  IronicCleaningDiskErase: metadata
  IronicIPXEEnabled: true
  IronicInspectorSubnets:
    - ip_range: 172.23.3.100,172.23.3.150
  IPAImageURLs: '["http://30.30.30.1:8088/agent.kernel", "http://30.30.30.1:8088/agent.ramdisk"]'
  IronicInspectorInterface: 'br-baremetal'

[stack@undercloud templates]$ cat node-info.yaml
parameter_defaults:
  OvercloudControllerFlavor: control
  OvercloudComputeFlavor: compute
  ControllerCount: 3
  ComputeCount: 1
[stack@undercloud templates]$
 -e /usr/share/openstack-tripleo-heat-templates/environments/services/ironic-conductor.yaml \
 -e /usr/share/openstack-tripleo-heat-templates/environments/services/ironic-inspector.yaml \
 -e /usr/share/openstack-tripleo-heat-templates/environments/services/ironic-overcloud.yaml \
 -e /home/stack/templates/ironic-config.yaml \
 -e /usr/share/openstack-tripleo-heat-templates/environments/docker-ha.yaml \
 -e /usr/share/openstack-tripleo-heat-templates/environments/podman.yaml \
 -e /home/stack/containers-prepare-parameter.yaml
/home/stack/templates/ironic-config.yaml:

(overcloud) [stack@undercloud ~]$ cat /home/stack/templates/ironic-config.yaml
parameter_defaults:
  IronicEnabledHardwareTypes:
    - ipmi
    - redfish
  IronicEnabledPowerInterfaces:
    - ipmitool
    - redfish
  IronicEnabledManagementInterfaces:
    - ipmitool
    - redfish
  IronicCleaningDiskErase: metadata
  IronicIPXEEnabled: true
  IronicInspectorSubnets:
    - ip_range: 172.23.3.100,172.23.3.150
  IPAImageURLs: '["http://30.30.30.1:8088/agent.kernel", "http://30.30.30.1:8088/agent.ramdisk"]'
  IronicInspectorInterface: 'br-baremetal'
Also, the baremetal (provisioning) network (172.23.3.x) is routed with the ctlplane/admin network (30.30.30.x).
Unless the network you created in the overcloud is named `provisioning`, these parameters may be relevant.
IronicCleaningNetwork:
  default: 'provisioning'
  description: Name or UUID of the *overcloud* network used for cleaning
    bare metal nodes. The default value of "provisioning" can be left
    during the initial deployment (when no networks are created yet) and
    should be changed to an actual UUID in a post-deployment stack update.
  type: string

IronicProvisioningNetwork:
  default: 'provisioning'
  description: Name or UUID of the *overcloud* network used for
    provisioning of bare metal nodes, if IronicDefaultNetworkInterface is
    set to "neutron". The default value of "provisioning" can be left
    during the initial deployment (when no networks are created yet) and
    should be changed to an actual UUID in a post-deployment stack update.
  type: string

IronicRescuingNetwork:
  default: 'provisioning'
  description: Name or UUID of the *overcloud* network used for rescuing
    of bare metal nodes, if IronicDefaultRescueInterface is not set to
    "no-rescue". The default value of "provisioning" can be left during
    the initial deployment (when no networks are created yet) and should
    be changed to an actual UUID in a post-deployment stack update.
  type: string
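If the overcloud network ends up with a different name (or only exists as a UUID post-deployment), these parameters can be overridden in an environment file. A minimal sketch, assuming the network really is named `provisioning`:

```yaml
parameter_defaults:
  IronicCleaningNetwork: provisioning
  IronicProvisioningNetwork: provisioning
  IronicRescuingNetwork: provisioning
```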
*Query:*
1. Any other location/way where we should add these so that they are included without error?

ServiceNetMap:
  IronicApiNetwork: provisioning
  IronicNetwork: provisioning
`provisioning` network is defined in -n /home/stack/templates/network_data.yaml right?
[Loke]: No, it does not have any entry for provisioning in this file; it has network entries for J3Mgmt, Tenant, InternalApi, and External. These networks are added as VLAN-based under the br-ext bridge. The provisioning network I am creating after the overcloud is deployed and before the baremetal node is provisioned. In the provisioning network, we are giving the range of the ironic network (172.23.3.x).
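For reference, if one wanted the provisioning network managed by TripleO instead, a network_data.yaml entry might look roughly like the sketch below. The name, flags, and ranges are assumptions for illustration, not taken from this setup:

```yaml
# Hypothetical network_data.yaml entry; adjust names/CIDRs to the setup.
- name: Provisioning
  name_lower: provisioning
  vip: false
  enabled: true
  ip_subnet: '172.23.3.0/24'
  allocation_pools: [{'start': '172.23.3.200', 'end': '172.23.3.240'}]
```

The network would then also need to be listed under the controller role's networks in roles_data.yaml.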
And an entry in 'networks' for the controller role in /home/stack/templates/roles_data.yaml?
[Loke]: we also did not add a similar entry in roles_data.yaml.
Just to add: with these two files we have rendered the remaining templates.
2. Also, are these commands (mentioned above) to configure the Baremetal services fine?
Yes, what you are doing makes sense.
I'm actually not sure why it didn't work with your previous configuration; it got the information about the NBP file and obviously attempted to download it from 30.30.30.220. With routing in place, that should work.
Changing the ServiceNetMap to move the IronicNetwork services to the 172.23.3 network would avoid the routing.
[Loke]: we can try this, but somehow we have not been able to do so, for reasons we have not yet pinned down.
What is NeutronBridgeMappings? br-baremetal maps to the physical network of the overcloud `provisioning` neutron network?
[Loke]: yes, we create br-baremetal and then we create the provisioning network, mapping it to br-baremetal.
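For completeness, the post-deployment creation described here typically looks like the following sketch (the names and the 172.23.3.x range are the ones used in this thread; exact flags may vary slightly per release):

```shell
# Create the flat neutron network on the "baremetal" physnet
# (mapped to br-baremetal via NeutronBridgeMappings), then a subnet
# whose allocation pool does not overlap the inspector range.
openstack network create \
    --provider-network-type flat \
    --provider-physical-network baremetal \
    --share provisioning

openstack subnet create \
    --network provisioning \
    --subnet-range 172.23.3.0/24 \
    --allocation-pool start=172.23.3.200,end=172.23.3.240 \
    provisioning-subnet
```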
-- ~ Lokendra www.inertiaspeaks.com www.inertiagroups.com skype: lokendrarathour
From what I understand of baremetal nodes, they will show up as hypervisors from the Nova perspective.
Can you try "openstack hypervisor list"
From the doc:

"Each bare metal node becomes a separate hypervisor in Nova. The hypervisor host name always matches the associated node UUID."

On Mon, Feb 14, 2022 at 10:03 AM Lokendra Rathour <lokendrarathour@gmail.com> wrote:
Hi Julia, Thanks once again. we got your point and understood the issue, but we still are facing the same issue on our TRIPLEO Train HA Setup, even if the settings are done as per your recommendations.
The error that we are seeing is again "*No valid host was found"*
(overcloud) [stack@undercloud v4]$ openstack server show bm-server --fit-width
+-------------------------------------+----------------------------------------------------------------------------------------+ | Field | Value |
+-------------------------------------+----------------------------------------------------------------------------------------+ | OS-DCF:diskConfig | MANUAL | | OS-EXT-AZ:availability_zone | | | OS-EXT-SRV-ATTR:host | None | | OS-EXT-SRV-ATTR:hostname | bm-server | | OS-EXT-SRV-ATTR:hypervisor_hostname | None | | OS-EXT-SRV-ATTR:instance_name | instance-00000014 | | OS-EXT-SRV-ATTR:kernel_id | | | OS-EXT-SRV-ATTR:launch_index | 0 | | OS-EXT-SRV-ATTR:ramdisk_id | | | OS-EXT-SRV-ATTR:reservation_id | r-npd6m9ah | | OS-EXT-SRV-ATTR:root_device_name | None | | OS-EXT-SRV-ATTR:user_data | I2Nsb3VkLWNvbmZpZwpkaXNhYmxlX3Jvb3Q6IGZhbHNlCnBhc3N3b3JkOiBoc2MzMjEKc3NoX3B3YXV0aDogdH | | | J1ZQptYW5hZ2VfZXRjX2hvc3RzOiB0cnVlCmNocGFzc3dkOiB7ZXhwaXJlOiBmYWxzZSB9Cg== | | OS-EXT-STS:power_state | NOSTATE | | OS-EXT-STS:task_state | None | | OS-EXT-STS:vm_state | error | | OS-SRV-USG:launched_at | None | | OS-SRV-USG:terminated_at | None | | accessIPv4 | | | accessIPv6 | | | addresses | | | config_drive | True | | created | 2022-02-14T10:20:48Z | | description | None | | fault | {'code': 500, 'created': '2022-02-14T10:20:49Z', 'message': 'No valid host was found. 
fault                      | No valid host was found. There are not enough hosts available. Details:
  Traceback (most recent call last):
    File "/usr/lib/python3.6/site-packages/nova/conductor/manager.py", line 1379, in schedule_and_build_instances
      instance_uuids, return_alternates=True)
    File "/usr/lib/python3.6/site-packages/nova/conductor/manager.py", line 839, in _schedule_instances
      return_alternates=return_alternates)
    File "/usr/lib/python3.6/site-packages/nova/scheduler/client/query.py", line 42, in select_destinations
      instance_uuids, return_objects, return_alternates)
    File "/usr/lib/python3.6/site-packages/nova/scheduler/rpcapi.py", line 160, in select_destinations
      return cctxt.call(ctxt, 'select_destinations', **msg_args)
    File "/usr/lib/python3.6/site-packages/oslo_messaging/rpc/client.py", line 181, in call
      transport_options=self.transport_options)
    File "/usr/lib/python3.6/site-packages/oslo_messaging/transport.py", line 129, in _send
      transport_options=transport_options)
    File "/usr/lib/python3.6/site-packages/oslo_messaging/_drivers/amqpdriver.py", line 674, in send
      transport_options=transport_options)
    File "/usr/lib/python3.6/site-packages/oslo_messaging/_drivers/amqpdriver.py", line 664, in _send
      raise result
  nova.exception_Remote.NoValidHost_Remote: No valid host was found. There are not enough hosts available.
  Traceback (most recent call last):
    File "/usr/lib/python3.6/site-packages/oslo_messaging/rpc/server.py", line 235, in inner
      return func(*args, **kwargs)
    File "/usr/lib/python3.6/site-packages/nova/scheduler/manager.py", line 214, in select_destinations
      allocation_request_version, return_alternates)
    File "/usr/lib/python3.6/site-packages/nova/scheduler/filter_scheduler.py", line 96, in select_destinations
      allocation_request_version, return_alternates)
    File "/usr/lib/python3.6/site-packages/nova/scheduler/filter_scheduler.py", line 265, in _schedule
      claimed_instance_uuids)
    File "/usr/lib/python3.6/site-packages/nova/scheduler/filter_scheduler.py", line 302, in _ensure_sufficient_hosts
      raise exception.NoValidHost(reason=reason)
  nova.exception.NoValidHost: No valid host was found. There are not enough hosts available.

flavor                     | disk='470', ephemeral='0', extra_specs.capabilities='boot_mode:uefi,boot_option:local', extra_specs.resources:CUSTOM_BAREMETAL_RESOURCE_CLASS='1', extra_specs.resources:DISK_GB='0', extra_specs.resources:MEMORY_MB='0', extra_specs.resources:VCPU='0', original_name='bm-flavor', ram='63700', swap='0', vcpus='20'
hostId                     |
host_status                |
id                         | 49944a1f-7758-4522-9ef1-867ede44b3fc
image                      | whole-disk-centos (80724772-c760-4136-b453-754456d7c549)
key_name                   | None
locked                     | False
locked_reason              | None
name                       | bm-server
project_id                 | 8dde31e24eba41bfb7212ae154d61268
properties                 |
server_groups              | []
status                     | ERROR
tags                       | []
trusted_image_certificates | None
updated                    | 2022-02-14T10:20:49Z
user_id                    | f689d147221549f1a6cbd1310078127d
volumes_attached           |

(overcloud) [stack@undercloud v4]$
For your reference, our updated flavor and baremetal node properties are as below:
(overcloud) [stack@undercloud v4]$ openstack flavor show bm-flavor --fit-width

OS-FLV-DISABLED:disabled   | False
OS-FLV-EXT-DATA:ephemeral  | 0
access_project_ids         | None
description                | None
disk                       | 470
extra_specs                | {'resources:CUSTOM_BAREMETAL_RESOURCE_CLASS': '1', 'resources:VCPU': '0', 'resources:MEMORY_MB': '0', 'resources:DISK_GB': '0', 'capabilities': 'boot_mode:uefi,boot_option:local'}
id                         | 021c3021-56ec-4eba-bf57-c516ee9b2ee3
name                       | bm-flavor
os-flavor-access:is_public | True
properties                 | capabilities='boot_mode:uefi,boot_option:local', resources:CUSTOM_BAREMETAL_RESOURCE_CLASS='1', resources:DISK_GB='0', resources:MEMORY_MB='0', resources:VCPU='0'
ram                        | 63700
rxtx_factor                | 1.0
swap                       | 0
vcpus                      | 20

(overcloud) [stack@undercloud v4]$
(overcloud) [stack@undercloud v4]$
(overcloud) [stack@undercloud v4]$ openstack baremetal node show baremetal-node --fit-width

allocation_uuid        | None
automated_clean        | None
bios_interface         | no-bios
boot_interface         | ipxe
chassis_uuid           | None
clean_step             | {}
conductor              | overcloud-controller-0.localdomain
conductor_group        |
console_enabled        | False
console_interface      | ipmitool-socat
created_at             | 2022-02-14T10:05:32+00:00
deploy_interface       | iscsi
deploy_step            | {}
description            | None
driver                 | ipmi
driver_info            | {'ipmi_port': 623, 'ipmi_username': 'hsc', 'ipmi_password': '******', 'ipmi_address': '10.0.1.183', 'deploy_kernel': '95a5b644-c04e-4a66-8f2b-e1e9806bed6e', 'deploy_ramdisk': '17644220-e623-4981-ae77-d789657851ba'}
driver_internal_info   | {'agent_erase_devices_iterations': 1, 'agent_erase_devices_zeroize': True, 'agent_continue_if_ata_erase_failed': False, 'agent_enable_ata_secure_erase': True, 'disk_erasure_concurrency': 1, 'last_power_state_change': '2022-02-14T10:15:05.062161', 'agent_version': '5.0.5.dev25', 'agent_last_heartbeat': '2022-02-14T10:14:59.666025', 'hardware_manager_version': {'generic_hardware_manager': '1.1'}, 'agent_cached_clean_steps': {'deploy': [{'step': 'erase_devices', 'priority': 10, 'interface': 'deploy', 'reboot_requested': False, 'abortable': True}, {'step': 'erase_devices_metadata', 'priority': 99, 'interface': 'deploy', 'reboot_requested': False, 'abortable': True}], 'raid': [{'step': 'delete_configuration', 'priority': 0, 'interface': 'raid', 'reboot_requested': False, 'abortable': True}, {'step': 'create_configuration', 'priority': 0, 'interface': 'raid', 'reboot_requested': False, 'abortable': True}]}, 'agent_cached_clean_steps_refreshed': '2022-02-14 10:14:58.093777', 'clean_steps': None}
extra                  | {}
fault                  | None
inspect_interface      | inspector
inspection_finished_at | None
inspection_started_at  | None
instance_info          | {}
instance_uuid          | None
last_error             | None
maintenance            | False
maintenance_reason     | None
management_interface   | ipmitool
name                   | baremetal-node
network_interface      | flat
owner                  | None
power_interface        | ipmitool
power_state            | power off
properties             | {'cpus': 20, 'memory_mb': 63700, 'local_gb': 470, 'cpu_arch': 'x86_64', 'capabilities': 'boot_mode:uefi,boot_option:local', 'vendor': 'hewlett-packard'}
protected              | False
protected_reason       | None
provision_state        | available
provision_updated_at   | 2022-02-14T10:15:27+00:00
raid_config            | {}
raid_interface         | no-raid
rescue_interface       | agent
reservation            | None
resource_class         | baremetal-resource-class
storage_interface      | noop
target_power_state     | None
target_provision_state | None
target_raid_config     | {}
traits                 | []
updated_at             | 2022-02-14T10:15:27+00:00
uuid                   | cd021878-40eb-407c-87c5-ce6ef92d29eb
vendor_interface       | ipmitool

(overcloud) [stack@undercloud v4]$
On further debugging, we found the following in the nova-scheduler logs:

2022-02-14 12:58:22.830 7 WARNING keystoneauth.discover [-] Failed to contact the endpoint at http://172.16.2.224:8778/placement for discovery. Fallback to using that endpoint as the base url.
2022-02-14 12:58:23.438 7 WARNING keystoneauth.discover [req-ad5801e4-efd7-4159-a601-68e72c0d651f - - - - -] Failed to contact the endpoint at http://172.16.2.224:8778/placement for discovery. Fallback to using that endpoint as the base url.
where 172.16.2.224 is the internal IP.
going by document : Bare Metal Instances in Overcloud — TripleO 3.0.0 documentation (openstack.org) <https://docs.openstack.org/project-deploy-guide/tripleo-docs/latest/features/baremetal_overcloud.html>
it gives the following commands:
(overcloud) [root@overcloud-controller-0 ~]# endpoint=http://172.16.2.224:8778/placement
(overcloud) [root@overcloud-controller-0 ~]# token=$(openstack token issue -f value -c id)
(overcloud) [root@overcloud-controller-0 ~]# curl -sH "X-Auth-Token: $token" $endpoint/resource_providers/<node id> | jq .inventories
null
result is the same even if we run the curl command on public endpoint.
Please advise.
On Sat, Feb 12, 2022 at 12:45 AM Julia Kreger <juliaashleykreger@gmail.com> wrote:
On Fri, Feb 11, 2022 at 6:32 AM Lokendra Rathour < lokendrarathour@gmail.com> wrote:
Hi Harald/ Openstack Team, Thank you again for your support.
we have successfully provisioned the baremetal node as per the inputs shared by you. The only change we made was to add an entry for the ServiceNetMap.
Further, we were trying to launch a baremetal instance, in which we are facing the issue mentioned below:
[trim'ed picture because of message size]
2022-02-11 18:13:45.840 7 ERROR nova.compute.manager [req-aafdea4d-815f-4504-b7d7-4fd95d1e083e - - - - -] Error updating resources for node 9560bc2d-5f94-4ba0-9711-340cb8ad7d8a.: nova.exception.NoResourceClass: Resource class not found for Ironic node 9560bc2d-5f94-4ba0-9711-340cb8ad7d8a.
2022-02-11 18:13:45.840 7 ERROR nova.compute.manager Traceback (most recent call last):
2022-02-11 18:13:45.840 7 ERROR nova.compute.manager   File "/usr/lib/python3.6/site-packages/nova/compute/manager.py", line 8894, in _update_available_resource_for_node
So this exception can only be raised if the resource_class field is just not populated on the node. It is a required field for nova/ironic integration. Also, interestingly enough, the UUID in the error doesn't match the baremetal node below. I don't know if that is intentional?
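For anyone cross-checking the node and flavor shown below, a rough sketch of the mapping involved (an illustration of the normalization convention, not nova's exact code): a node's resource_class is uppercased, characters outside A-Z/0-9 become underscores, and the result is prefixed with CUSTOM_ to form the placement resource class the flavor must request.

```python
import re

def placement_resource_class(node_resource_class: str) -> str:
    """Sketch: map an Ironic node's resource_class to the custom
    placement resource class name used in flavor extra specs."""
    normalized = re.sub(r"[^A-Z0-9]", "_", node_resource_class.upper())
    return "CUSTOM_" + normalized

print(placement_resource_class("baremetal-resource-class"))
# CUSTOM_BAREMETAL_RESOURCE_CLASS
```

This matches the thread: the node carries resource_class `baremetal-resource-class`, and the flavor requests `resources:CUSTOM_BAREMETAL_RESOURCE_CLASS='1'`.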
For your reference, please see the following details:

(overcloud) [stack@undercloud v4]$ openstack baremetal node show baremetal-node --fit-width
allocation_uuid        | None
automated_clean        | None
bios_interface         | no-bios
boot_interface         | ipxe
chassis_uuid           | None
clean_step             | {}
conductor              | overcloud-controller-0.localdomain
conductor_group        |
console_enabled        | False
console_interface      | ipmitool-socat
created_at             | 2022-02-11T13:02:40+00:00
deploy_interface       | iscsi
deploy_step            | {}
description            | None
driver                 | ipmi
driver_info            | {'ipmi_port': 623, 'ipmi_username': 'hsc', 'ipmi_password': '******', 'ipmi_address': '10.0.1.183', 'deploy_kernel': 'bc62f3dc-d091-4dbd-b730-cf7b6cb48625', 'deploy_ramdisk': 'd58bcc08-cb7c-4f21-8158-0a5ed4198108'}
driver_internal_info   | {'agent_erase_devices_iterations': 1, 'agent_erase_devices_zeroize': True, 'agent_continue_if_ata_erase_failed': False, 'agent_enable_ata_secure_erase': True, 'disk_erasure_concurrency': 1, 'last_power_state_change': '2022-02-11T13:14:29.581361', 'agent_version': '5.0.5.dev25', 'agent_last_heartbeat': '2022-02-11T13:14:24.151928', 'hardware_manager_version': {'generic_hardware_manager': '1.1'}, 'agent_cached_clean_steps': {'deploy': [{'step': 'erase_devices', 'priority': 10, 'interface': 'deploy', 'reboot_requested': False, 'abortable': True}, {'step': 'erase_devices_metadata', 'priority': 99, 'interface': 'deploy', 'reboot_requested': False, 'abortable': True}], 'raid': [{'step': 'delete_configuration', 'priority': 0, 'interface': 'raid', 'reboot_requested': False, 'abortable': True}, {'step': 'create_configuration', 'priority': 0, 'interface': 'raid', 'reboot_requested': False, 'abortable': True}]}, 'agent_cached_clean_steps_refreshed': '2022-02-11 13:14:22.580729', 'clean_steps': None}
extra                  | {}
fault                  | None
inspect_interface      | inspector
inspection_finished_at | None
inspection_started_at  | None
instance_info          | {}
instance_uuid          | None
last_error             | None
maintenance            | False
maintenance_reason     | None
management_interface   | ipmitool
name                   | baremetal-node
network_interface      | flat
owner                  | None
power_interface        | ipmitool
power_state            | power off
properties             | {'cpus': 20, 'memory_mb': 63700, 'local_gb': 470, 'cpu_arch': 'x86_64', 'capabilities': 'boot_option:local,boot_mode:uefi', 'vendor': 'hewlett-packard'}
protected              | False
protected_reason       | None
provision_state        | available
provision_updated_at   | 2022-02-11T13:14:51+00:00
raid_config            | {}
raid_interface         | no-raid
rescue_interface       | agent
reservation            | None
resource_class         | baremetal-resource-class
storage_interface      | noop
target_power_state     | None
target_provision_state | None
target_raid_config     | {}
traits                 | []
updated_at             | 2022-02-11T13:14:52+00:00
uuid                   | e64ad28c-43d6-4b9f-aa34-f8bc58e9e8fe
vendor_interface       | ipmitool

(overcloud) [stack@undercloud v4]$
(overcloud) [stack@undercloud v4]$ openstack flavor show my-baremetal-flavor --fit-width

OS-FLV-DISABLED:disabled   | False
OS-FLV-EXT-DATA:ephemeral  | 0
access_project_ids         | None
description                | None
disk                       | 470
extra_specs                | {'resources:CUSTOM_BAREMETAL_RESOURCE_CLASS': '1', 'resources:VCPU': '0', 'resources:MEMORY_MB': '0', 'resources:DISK_GB': '0', 'capabilities:boot_option': 'local,boot_mode:uefi'}
id                         | 66a13404-4c47-4b67-b954-e3df42ae8103
name                       | my-baremetal-flavor
os-flavor-access:is_public | True
properties                 | capabilities:boot_option='local,boot_mode:uefi', resources:CUSTOM_BAREMETAL_RESOURCE_CLASS='1', resources:DISK_GB='0', resources:MEMORY_MB='0', resources:VCPU='0'
ram                        | 63700
rxtx_factor                | 1.0
swap                       | 0
vcpus                      | 20
However you've set your capabilities field, it actually cannot be parsed. Then again, it doesn't *have* to be defined to match the baremetal node. The setting can still apply on the baremetal node if that is the operational default for the machine as defined on the machine itself.
I suspect, based upon whatever the precise nova settings are, this would result in an inability to schedule on to the node because it would parse it incorrectly, possibly looking for a key value of "capabilities:boot_option", instead of "capabilities".
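To make the parsing point above concrete, here is a minimal hypothetical parser (not nova's actual code) for capabilities-style flavor extra specs, whose keys take the form `capabilities:<name>`. The mis-quoted spec from the flavor above produces one garbled capability instead of two separate ones:

```python
def flavor_capabilities(extra_specs: dict) -> dict:
    """Collect capabilities from extra specs keyed 'capabilities:<name>'."""
    return {
        key.split(":", 1)[1]: value
        for key, value in extra_specs.items()
        if key.startswith("capabilities:")
    }

# Mis-quoted form from the flavor in this thread: a single bogus capability.
print(flavor_capabilities({"capabilities:boot_option": "local,boot_mode:uefi"}))
# {'boot_option': 'local,boot_mode:uefi'}

# Intended form: two separate capabilities that can match the node's
# capabilities 'boot_option:local,boot_mode:uefi'.
print(flavor_capabilities({"capabilities:boot_option": "local",
                           "capabilities:boot_mode": "uefi"}))
# {'boot_option': 'local', 'boot_mode': 'uefi'}
```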
(overcloud) [stack@undercloud v4]$
Can you please check and suggest if something is missing.
Thanks once again for your support.
-Lokendra
On Thu, Feb 10, 2022 at 10:09 PM Lokendra Rathour < lokendrarathour@gmail.com> wrote:
Hi Harald, Thanks for the response, please find my response inline:
On Thu, Feb 10, 2022 at 8:24 PM Harald Jensas <hjensas@redhat.com> wrote:
On 2/10/22 14:49, Lokendra Rathour wrote:

Hi Harald, thanks once again for your support. We tried activating the parameters:

ServiceNetMap:
  IronicApiNetwork: provisioning
  IronicNetwork: provisioning

at environments/network-environments.yaml. After changing these values, the updated or even fresh deployments are failing.
How did deployment fail?
[Loke]: it failed immediately after the IP for the ctlplane network is assigned, ssh is established, and stack creation is completed; I think at the start of the ansible execution.

Error: "enabling ssh admin - COMPLETE. Host 10.0.1.94 not found in /home/stack/.ssh/known_hosts". Although this message is seen even when the deployment is successful, so I do not think this is the culprit.
The command that we are using to deploy the OpenStack overcloud:

openstack overcloud deploy --templates \
  -n /home/stack/templates/network_data.yaml \
  -r /home/stack/templates/roles_data.yaml \
  -e /home/stack/templates/node-info.yaml \
  -e /home/stack/templates/environment.yaml \
  -e /home/stack/templates/environments/network-isolation.yaml \
  -e /home/stack/templates/environments/network-environment.yaml \
What modifications did you do to network-isolation.yaml and
[Loke]: network-isolation.yaml:

# Enable the creation of Neutron networks for isolated Overcloud
# traffic and configure each role to assign ports (related
# to that role) on these networks.
resource_registry:
  # networks as defined in network_data.yaml
  OS::TripleO::Network::J3Mgmt: ../network/j3mgmt.yaml
  OS::TripleO::Network::Tenant: ../network/tenant.yaml
  OS::TripleO::Network::InternalApi: ../network/internal_api.yaml
  OS::TripleO::Network::External: ../network/external.yaml

  # Port assignments for the VIPs
  OS::TripleO::Network::Ports::J3MgmtVipPort: ../network/ports/j3mgmt.yaml
  OS::TripleO::Network::Ports::InternalApiVipPort: ../network/ports/internal_api.yaml
  OS::TripleO::Network::Ports::ExternalVipPort: ../network/ports/external.yaml
  OS::TripleO::Network::Ports::RedisVipPort: ../network/ports/vip.yaml
  OS::TripleO::Network::Ports::OVNDBsVipPort: ../network/ports/vip.yaml

  # Port assignments by role, edit role definition to assign networks to roles.
  # Port assignments for the Controller
  OS::TripleO::Controller::Ports::J3MgmtPort: ../network/ports/j3mgmt.yaml
  OS::TripleO::Controller::Ports::TenantPort: ../network/ports/tenant.yaml
  OS::TripleO::Controller::Ports::InternalApiPort: ../network/ports/internal_api.yaml
  OS::TripleO::Controller::Ports::ExternalPort: ../network/ports/external.yaml
  # Port assignments for the Compute
  OS::TripleO::Compute::Ports::J3MgmtPort: ../network/ports/j3mgmt.yaml
  OS::TripleO::Compute::Ports::TenantPort: ../network/ports/tenant.yaml
  OS::TripleO::Compute::Ports::InternalApiPort: ../network/ports/internal_api.yaml
network-environment.yaml?
resource_registry:
  # Network Interface templates to use (these files must exist). You can
  # override these by including one of the net-*.yaml environment files,
  # such as net-bond-with-vlans.yaml, or modifying the list here.
  # Port assignments for the Controller
  OS::TripleO::Controller::Net::SoftwareConfig: ../network/config/bond-with-vlans/controller.yaml
  # Port assignments for the Compute
  OS::TripleO::Compute::Net::SoftwareConfig: ../network/config/bond-with-vlans/compute.yaml

parameter_defaults:
  J3MgmtNetCidr: '80.0.1.0/24'
  J3MgmtAllocationPools: [{'start': '80.0.1.4', 'end': '80.0.1.250'}]
  J3MgmtNetworkVlanID: 400

  TenantNetCidr: '172.16.0.0/24'
  TenantAllocationPools: [{'start': '172.16.0.4', 'end': '172.16.0.250'}]
  TenantNetworkVlanID: 416
  TenantNetPhysnetMtu: 1500

  InternalApiNetCidr: '172.16.2.0/24'
  InternalApiAllocationPools: [{'start': '172.16.2.4', 'end': '172.16.2.250'}]
  InternalApiNetworkVlanID: 418

  ExternalNetCidr: '10.0.1.0/24'
  ExternalAllocationPools: [{'start': '10.0.1.85', 'end': '10.0.1.98'}]
  ExternalNetworkVlanID: 408

  DnsServers: []
  NeutronNetworkType: 'geneve,vlan'
  NeutronNetworkVLANRanges: 'datacentre:1:1000'
  BondInterfaceOvsOptions: "bond_mode=active-backup"
I typically use:

-e /usr/share/openstack-tripleo-heat-templates/environments/network-isolation.yaml
-e /usr/share/openstack-tripleo-heat-templates/environments/network-environment.yaml
-e /home/stack/templates/environments/network-overrides.yaml
The network-isolation.yaml and network-environment.yaml are Jinja2-rendered based on the -n input, so to keep in sync with changes in the `-n` file, reference the files in /usr/share/openstack-tripleo-heat-templates. Then add overrides in network-overrides.yaml as needed.
[Loke]: we are using it like this only; I do not know what you pass in network-overrides.yaml, but I pass the other files as per the commands below:

[stack@undercloud templates]$ cat environment.yaml
parameter_defaults:
  ControllerCount: 3
  TimeZone: 'Asia/Kolkata'
  NtpServer: ['30.30.30.3']
  NeutronBridgeMappings: datacentre:br-ex,baremetal:br-baremetal
  NeutronFlatNetworks: datacentre,baremetal

[stack@undercloud templates]$ cat ironic-config.yaml
parameter_defaults:
  IronicEnabledHardwareTypes:
    - ipmi
    - redfish
  IronicEnabledPowerInterfaces:
    - ipmitool
    - redfish
  IronicEnabledManagementInterfaces:
    - ipmitool
    - redfish
  IronicCleaningDiskErase: metadata
  IronicIPXEEnabled: true
  IronicInspectorSubnets:
    - ip_range: 172.23.3.100,172.23.3.150
  IPAImageURLs: '["http://30.30.30.1:8088/agent.kernel", "http://30.30.30.1:8088/agent.ramdisk"]'
  IronicInspectorInterface: 'br-baremetal'

[stack@undercloud templates]$ cat node-info.yaml
parameter_defaults:
  OvercloudControllerFlavor: control
  OvercloudComputeFlavor: compute
  ControllerCount: 3
  ComputeCount: 1
[stack@undercloud templates]$
-e /usr/share/openstack-tripleo-heat-templates/environments/services/ironic-conductor.yaml \
-e /usr/share/openstack-tripleo-heat-templates/environments/services/ironic-inspector.yaml \
-e /usr/share/openstack-tripleo-heat-templates/environments/services/ironic-overcloud.yaml \
-e /home/stack/templates/ironic-config.yaml \
-e /usr/share/openstack-tripleo-heat-templates/environments/docker-ha.yaml \
-e /usr/share/openstack-tripleo-heat-templates/environments/podman.yaml \
-e /home/stack/containers-prepare-parameter.yaml
/home/stack/templates/ironic-config.yaml:

(overcloud) [stack@undercloud ~]$ cat /home/stack/templates/ironic-config.yaml
parameter_defaults:
  IronicEnabledHardwareTypes:
    - ipmi
    - redfish
  IronicEnabledPowerInterfaces:
    - ipmitool
    - redfish
  IronicEnabledManagementInterfaces:
    - ipmitool
    - redfish
  IronicCleaningDiskErase: metadata
  IronicIPXEEnabled: true
  IronicInspectorSubnets:
    - ip_range: 172.23.3.100,172.23.3.150
  IPAImageURLs: '["http://30.30.30.1:8088/agent.kernel", "http://30.30.30.1:8088/agent.ramdisk"]'
  IronicInspectorInterface: 'br-baremetal'
Also, the baremetal network (provisioning, 172.23.3.x) is routed with the ctlplane/admin network (30.30.30.x).
Unless the network you created in the overcloud is named `provisioning`, these parameters may be relevant.
IronicCleaningNetwork:
  default: 'provisioning'
  description: Name or UUID of the *overcloud* network used for cleaning bare metal nodes. The default value of "provisioning" can be left during the initial deployment (when no networks are created yet) and should be changed to an actual UUID in a post-deployment stack update.
  type: string

IronicProvisioningNetwork:
  default: 'provisioning'
  description: Name or UUID of the *overcloud* network used for provisioning of bare metal nodes, if IronicDefaultNetworkInterface is set to "neutron". The default value of "provisioning" can be left during the initial deployment (when no networks are created yet) and should be changed to an actual UUID in a post-deployment stack update.
  type: string

IronicRescuingNetwork:
  default: 'provisioning'
  description: Name or UUID of the *overcloud* network used for rescuing of bare metal nodes, if IronicDefaultRescueInterface is not set to "no-rescue". The default value of "provisioning" can be left during the initial deployment (when no networks are created yet) and should be changed to an actual UUID in a post-deployment stack update.
  type: string
Query:

1. Any other location/way where we should add these so that they are included without error?

ServiceNetMap:
  IronicApiNetwork: provisioning
  IronicNetwork: provisioning
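For reference, one common way to carry a ServiceNetMap override like the one discussed above is a small extra environment file passed with `-e` at the end of the deploy command. A sketch (the file name is hypothetical, and this assumes a composable network named `provisioning` exists in the `-n` network data):

```yaml
# service-net-map-overrides.yaml (hypothetical file name)
parameter_defaults:
  ServiceNetMap:
    IronicApiNetwork: provisioning
    IronicNetwork: provisioning
```

Because later `-e` files win, appending this file keeps the override from being lost when the rendered network-environment.yaml is regenerated.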
`provisioning` network is defined in -n /home/stack/templates/network_data.yaml right?
[Loke]: No, it does not have any entry for provisioning in this file; it has network entries for J3Mgmt, Tenant, InternalApi, and External. These networks are added as VLANs under the br-ext bridge. The provisioning network I am creating after the overcloud is deployed and before the baremetal node is provisioned. In the provisioning network, we are giving the range of the ironic network (172.23.3.x).
And an entry in 'networks' for the controller role in /home/stack/templates/roles_data.yaml?
[Loke]: we did not add a similar entry in roles_data.yaml either.
Just to add: with these two files we have rendered the remaining templates.
2. Also, are these commands (mentioned above) for configuring the Baremetal services fine?
Yes, what you are doing makes sense.
I'm actually not sure why it didn't work with your previous configuration; it got the information about the NBP file and obviously attempted to download it from 30.30.30.220. With routing in place, that should work.
Changing the ServiceNetMap to move the IronicNetwork services to 172.23.3 would avoid the routing.
[Loke]: we can try this, but somehow we are not able to do so for some odd reason.
What is NeutronBridgeMappings? br-baremetal maps to the physical network of the overcloud `provisioning` neutron network?
[Loke]: yes, we create br-baremetal and then we create the provisioning network, mapping it to br-baremetal.
Also attaching the complete rendered template folder along with the custom yaml files that I am using; referring to that may give you a clearer picture of our problem. Any clue would help. Our problem: we are not able to provision the baremetal node after the overcloud is deployed. Do we have any straightforward document we can use to test baremetal provisioning? Please provide that.
Thanks once again for reading all these.
-- Harald
--
~ Lokendra
www.inertiaspeaks.com
www.inertiagroups.com
skype: lokendrarathour
On Mon, Feb 14, 2022 at 9:40 PM Laurent Dumont <laurentfdumont@gmail.com> wrote:
From what I understand of baremetal nodes, they will show up as hypervisors from the Nova perspective.
Can you try "openstack hypervisor list"
From the doc
+1. This is a good idea. This will tell us if Nova is at least syncing with Ironic. If it can't push the information to placement, that is obviously going to cause issues.
Each bare metal node becomes a separate hypervisor in Nova. The hypervisor host name always matches the associated node UUID.
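Since each node should appear as a hypervisor whose hostname equals the node UUID, the two listings can be cross-checked mechanically. A sketch using hypothetical sample JSON shaped like `openstack hypervisor list -f json` and `openstack baremetal node list -f json` output (field names assumed from the Train-era CLI; the UUID is the one from this thread):

```python
import json

# Hypothetical sample CLI output (shapes and field names are assumptions).
hypervisors_json = '[{"Hypervisor Hostname": "cd021878-40eb-407c-87c5-ce6ef92d29eb"}]'
nodes_json = '[{"UUID": "cd021878-40eb-407c-87c5-ce6ef92d29eb", "Name": "baremetal-node"}]'

# Set of hypervisor hostnames nova knows about.
hypervisor_names = {h["Hypervisor Hostname"] for h in json.loads(hypervisors_json)}

# Ironic nodes that nova has not (yet) exposed as hypervisors.
missing = [n["Name"] for n in json.loads(nodes_json)
           if n["UUID"] not in hypervisor_names]

print(missing)
# []
```

An empty list suggests nova-compute is syncing with Ironic; a non-empty one points at the resource tracker or placement side instead.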
On Mon, Feb 14, 2022 at 10:03 AM Lokendra Rathour < lokendrarathour@gmail.com> wrote:
Hi Julia, thanks once again. We got your point and understood the issue, but we are still facing the same issue on our TripleO Train HA setup, even with the settings done as per your recommendations.
The error that we are seeing is again "No valid host was found".
So this error is a bit of a generic catch-all error indicating it just doesn't know how to schedule the node. But the next error you mentioned *is* telling, in that a node can't be scheduled if placement is not working. [trim]
On further debugging, we found the following in the nova-scheduler logs:

2022-02-14 12:58:22.830 7 WARNING keystoneauth.discover [-] Failed to contact the endpoint at http://172.16.2.224:8778/placement for discovery. Fallback to using that endpoint as the base url.
2022-02-14 12:58:23.438 7 WARNING keystoneauth.discover [req-ad5801e4-efd7-4159-a601-68e72c0d651f - - - - -] Failed to contact the endpoint at http://172.16.2.224:8778/placement for discovery. Fallback to using that endpoint as the base url.
where 172.16.2.224 is the internal IP.
going by document : Bare Metal Instances in Overcloud — TripleO 3.0.0 documentation (openstack.org) <https://docs.openstack.org/project-deploy-guide/tripleo-docs/latest/features/baremetal_overcloud.html>
it gives the following commands:
(overcloud) [root@overcloud-controller-0 ~]# endpoint=http://172.16.2.224:8778/placement
(overcloud) [root@overcloud-controller-0 ~]# token=$(openstack token issue -f value -c id)
(overcloud) [root@overcloud-controller-0 ~]# curl -sH "X-Auth-Token: $token" $endpoint/resource_providers/<node id> | jq .inventories
null
result is the same even if we run the curl command on public endpoint.
Please advise.
So this sounds like you have placement either not operating or incorrectly configured somehow. I am not a placement expert, but I don't think a node_id is used for resource providers.
Hopefully a placement expert can chime in here. That being said, the note about the service failing to connect to the endpoint for discovery is somewhat telling. You *should* be able to curl the root of the API, without a token, and get back a basic JSON document with information used for API discovery. If that is not working, then there may be several things occurring. I would check to make sure the container(s) running placement are operating, not logging any errors, and responding properly. If they respond when queried directly, then I wonder if there is something going on with load balancing. Possibly consider connecting to placement's port directly instead of through any sort of load balancing such as what is provided by haproxy. I think placement's log indicates the port it starts on, so that would hopefully help. Its configuration should also share that information.
Kudos to everyone! With your valuable suggestions and feedback we were able to deploy the baremetal successfully. A few things we considered to make this run possible:

- Yes, the openstack hypervisor list is showing details for the added Baremetal Node, with the IP allocated from the pool of internalAPI.
- The placement-related error that was highlighted looks like a bug in the Train release, but it is not impacting our process.
- Inclusion of the modified ServiceNetMap in the case of a composable network approach (as suggested by Harald).
- Using the OpenStack open-source document as the primary reference (as suggested by Julia): https://docs.openstack.org/project-deploy-guide/tripleo-docs/latest/features...
- This document focused clearly on the host-aggregate concept and on setting the baremetal flavor property to "true".

Thanks once again, it was really helpful.

Also, can you please share some information on this query:
http://lists.openstack.org/pipermail/openstack-discuss/2022-February/027315.html

Best Regards,
Lokendra

On Thu, Feb 17, 2022 at 5:26 AM Julia Kreger <juliaashleykreger@gmail.com> wrote:
[trim]
--
participants (5)
- Anirudh Gupta
- Harald Jensas
- Julia Kreger
- Laurent Dumont
- Lokendra Rathour