Hi Harald,
Responding on Anirudh's behalf:
Thanks for the response. We now understand that we are getting the IP from the expected DHCP server.
We tried the scenario, and here are our findings. Our admin and internal endpoints are on subnet 30.30.30.x; the public endpoints are on 10.0.1.x.
(overcloud) [stack@undercloud ~]$ openstack endpoint list | grep ironic
| 04c163251e5546769446a4fa4fa20484 | regionOne | ironic | baremetal | True | admin | http://30.30.30.213:6385 |
| 5c8557ae639a4898bdc6121f6e873724 | regionOne | ironic | baremetal | True | internal | http://30.30.30.213:6385 |
| 62e07a3b2f3f4158bb27d8603a8f5138 | regionOne | ironic-inspector | baremetal-introspection | True | public | http://10.0.1.88:5050 |
| af29bd64513546409f44cc5d56ea1082 | regionOne | ironic-inspector | baremetal-introspection | True | internal | http://30.30.30.213:5050 |
| b76cdb5e77c54fc6b10cbfeada0e8bf5 | regionOne | ironic-inspector | baremetal-introspection | True | admin | http://30.30.30.213:5050 |
| bd2954f41e49419f85669990eb59f51a | regionOne | ironic | baremetal | True | public | http://10.0.1.88:6385 |
(overcloud) [stack@undercloud ~]$
We are following the default flat network approach for Ironic provisioning, for which we create a flat network on the baremetal physnet. The node still gets its IP from the neutron range (172.23.3.220 - 172.23.3.240); it was assigned 172.23.3.240.
Further, we found that once the IP (172.23.3.240) is allocated to the baremetal node, the node contacts 30.30.30.220 (the IP of one of the three controllers) for PXE booting.
Checking that controller, we found that the
`/var/lib/ironic/tftpboot/pxelinux.cfg/` directory exists, but it contains no file matching the MAC address of the baremetal node.
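For reference, a minimal sketch of how the expected per-node file name can be derived from a MAC address. It assumes the standard PXELINUX naming convention (the ARP hardware type `01`, then the MAC in lowercase, dash-separated). The MAC below is a made-up example, and with `boot_interface` set to `ipxe` Ironic may write the per-node config somewhere other than the TFTP directory (for example under its HTTP boot directory), so the path is an assumption to verify rather than a statement about this deployment.

```python
import os
import string


def pxelinux_cfg_name(mac: str) -> str:
    """Return the PXELINUX-style config file name for a MAC address."""
    octets = mac.strip().lower().replace("-", ":").split(":")
    # A MAC address is six two-digit hexadecimal octets.
    if len(octets) != 6 or any(len(o) != 2 for o in octets):
        raise ValueError(f"not a MAC address: {mac!r}")
    if not all(c in string.hexdigits for o in octets for c in o):
        raise ValueError(f"not a MAC address: {mac!r}")
    # "01" is the ARP hardware type for Ethernet.
    return "01-" + "-".join(octets)


# Directory taken from the thread; the MAC is hypothetical.
TFTP_CFG_DIR = "/var/lib/ironic/tftpboot/pxelinux.cfg"
expected = os.path.join(TFTP_CFG_DIR, pxelinux_cfg_name("AA:BB:CC:DD:EE:FF"))
print(expected, "exists" if os.path.exists(expected) else "missing")
```

Run on the controller with the node's real MAC, this shows quickly whether the file the firmware would request is actually present.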
Also checking the extra_dhcp_opts we found this:
(overcloud) [stack@undercloud ~]$ openstack port show d7e573bf-1028-437a-8118-a2074c7573b2 | grep "extra_dhcp_opts"
A few observations:
- Although the baremetal network (172.23.3.x) is routable to the admin network (30.30.30.x), the boot request times out at this stage.
- In tcpdump we see only read requests.
- `openstack baremetal node list`
(overcloud) [stack@undercloud ~]$ openstack baremetal node list
+--------------------------------------+------+---------------+-------------+--------------------+-------------+
| UUID | Name | Instance UUID | Power State | Provisioning State | Maintenance |
+--------------------------------------+------+---------------+-------------+--------------------+-------------+
| 7066fbe1-9c29-4702-9cd4-2b55daf19630 | bm1 | None | power on | clean wait | False |
+--------------------------------------+------+---------------+-------------+--------------------+-------------+
- `openstack baremetal node show <node-uuid>`
(overcloud) [stack@undercloud ~]$ openstack baremetal node show bm1
+------------------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Field | Value |
+------------------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| allocation_uuid | None |
| automated_clean | None |
| bios_interface | no-bios |
| boot_interface | ipxe |
| chassis_uuid | None |
| clean_step | {} |
| conductor | overcloud-controller-0.localdomain |
| conductor_group | |
| console_enabled | False |
| console_interface | ipmitool-socat |
| created_at | 2022-02-09T14:21:24+00:00 |
| deploy_interface | iscsi |
| deploy_step | {} |
| description | None |
| driver | ipmi |
| driver_info | {'ipmi_address': '10.0.1.183', 'ipmi_username': 'hsc', 'ipmi_password': '******', 'ipmi_terminal_port': 623, 'deploy_kernel': '9e1365b6-261a-42a2-abfe-40158945de57', 'deploy_ramdisk': 'fe608dd2-ce86-4faf-b4b8-cc5cb143eb56'} |
| driver_internal_info | {'agent_erase_devices_iterations': 1, 'agent_erase_devices_zeroize': True, 'agent_continue_if_ata_erase_failed': False, 'agent_enable_ata_secure_erase': True, 'disk_erasure_concurrency': 1, 'last_power_state_change': '2022-02-09T14:23:39.525629'} |
| extra | {} |
| fault | None |
| inspect_interface | inspector |
| inspection_finished_at | None |
| inspection_started_at | None |
| instance_info | {} |
| instance_uuid | None |
| last_error | None |
| maintenance | False |
| maintenance_reason | None |
| management_interface | ipmitool |
| name | bm1 |
| network_interface | flat |
| owner | None |
| power_interface | ipmitool |
| power_state | power on |
| properties | {'cpus': 20, 'cpu_arch': 'x86_64', 'capabilities': 'boot_option:local,boot_mode:uefi', 'memory_mb': 63700, 'local_gb': 470, 'vendor': 'hewlett-packard'} |
| protected | False |
| protected_reason | None |
| provision_state | clean wait |
| provision_updated_at | 2022-02-09T14:24:05+00:00 |
| raid_config | {} |
| raid_interface | no-raid |
| rescue_interface | agent |
| reservation | None |
| resource_class | bm1 |
| storage_interface | noop |
| target_power_state | None |
| target_provision_state | available |
| target_raid_config | {} |
| traits | [] |
| updated_at | 2022-02-09T14:24:05+00:00 |
| uuid | 7066fbe1-9c29-4702-9cd4-2b55daf19630 |
| vendor_interface | ipmitool |
+------------------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
(overcloud) [stack@undercloud ~]$
Queries:
- Which settings do we need in order to PXE-boot the baremetal node and provision it successfully?
On 2/7/22 13:47, Anirudh Gupta wrote:
> Hi Julia,
>
> Thanks a lot for your responses and support.
> To Update on the ongoing issue, I tried deploying the overcloud with
> your valuable suggestions i.e by passing "*DhcpAgentNotification: true*"
> in ironic-overcloud.yaml
> The setup came up successfully, but with this configuration the IP
> allocated on the system is one which is being configured while creating
> the subnet in openstack.
>
> image.png
>
> The system is still getting the IP (172.23.3.212) from neutron. The
> subnet range was configured as *172.23.3.210-172.23.3.240* while
> creating the provisioning subnet.
The node is supposed to get an IP address from the neutron subnet on the
provisioning network when:
a) provisioning the node
b) cleaning the node.
When you run "baremetal node provide", cleaning is most likely
initiated automatically (cleaning is enabled by default for
Ironic in the overcloud, AFAIK).
The only time you will get an address from the IronicInspectorSubnets
(ip_range: 172.23.3.100,172.23.3.150 in your case) is when you start
ironic node introspection.
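As a rough illustration of the distinction above, the following sketch classifies an address by the range it falls in. The two ranges are the ones quoted in this thread (the inspector `ip_range` and the neutron allocation pool mentioned in the previous mail); the dictionary labels and function names are hypothetical, chosen only for this example.

```python
import ipaddress

# Ranges as quoted in this thread; adjust to the actual deployment.
RANGES = {
    "ironic-inspector (introspection)": ("172.23.3.100", "172.23.3.150"),
    "neutron provisioning subnet (cleaning/deploy)": ("172.23.3.210", "172.23.3.240"),
}


def in_range(addr: str, first: str, last: str) -> bool:
    """True if addr lies between first and last, inclusive."""
    a = ipaddress.ip_address(addr)
    return ipaddress.ip_address(first) <= a <= ipaddress.ip_address(last)


def classify(addr: str) -> str:
    """Name the range an address belongs to, or 'unknown'."""
    for name, (first, last) in RANGES.items():
        if in_range(addr, first, last):
            return name
    return "unknown"


print(classify("172.23.3.212"))
```

An address from the neutron pool during "clean wait" is therefore the expected behaviour, not a sign that the wrong DHCP server answered.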
> The system gets stuck here and no action is performed after this.
>
It seems the system is getting an address from the expected DHCP server,
but it does not boot. I would start looking into the pxe properties in
the DHCP Reply.
What is the status of the node in ironic at this stage?
`openstack baremetal node list`
`openstack baremetal node show <node-uuid>`
Check the `extra_dhcp_opts` on the neutron port; it should set the
nextserver and bootfile parameters. Does the bootfile exist in
/var/lib/ironic/tftpboot? Inspect the
`/var/lib/ironic/tftpboot/pxelinux.cfg/` directory; you should see a
file matching the MAC address of your system. Does the content make sense?
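A small sketch of the `extra_dhcp_opts` check, assuming the port data has been exported as JSON (for example with `openstack port show <port> -f json`). The option names tested here (`bootfile-name`, `tftp-server`, `server-ip-address`) are assumptions based on common dnsmasq option names, and the sample values are invented for illustration; they are not output from this environment.

```python
import json


def missing_boot_opts(extra_dhcp_opts):
    """Return which network-boot parameters are absent from a port's options."""
    names = [opt["opt_name"] for opt in extra_dhcp_opts]
    missing = []
    # The boot file option may carry a tag prefix, so match on the suffix.
    if not any(n.endswith("bootfile-name") for n in names):
        missing.append("bootfile")
    if not any(n in ("tftp-server", "server-ip-address") for n in names):
        missing.append("next server")
    return missing


# Hypothetical sample; real values come from the neutron port.
sample = json.loads("""
[
  {"opt_name": "tftp-server", "opt_value": "30.30.30.220", "ip_version": 4},
  {"opt_name": "bootfile-name", "opt_value": "example.efi", "ip_version": 4}
]
""")

print(missing_boot_opts(sample) or "boot options present")
print(missing_boot_opts([]))
```

An empty `extra_dhcp_opts` list on the port would mean the DHCP reply carries no boot information at all, which would explain a node that gets an address but never boots.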
Can you capture DHCP and TFTP traffic on the provisioning network?
> Is there any way to resolve this and make successful provisioning the
> baremetal node in *TripleO Train Release* (Since RHOSP 16 was on Train,
> so I thought to go with that version for better stability)
> https://access.redhat.com/documentation/en-us/red_hat_openstack_platform/16.2/html-single/release_notes/index
>
> I have some queries:
>
> 1. Is passing "*DhcpAgentNotification: true*" enough or do we have to
> make some other changes as well?
I believe that in Train "DhcpAgentNotification" defaults to True.
The change to default it to false was added more recently, and it was not
backported.
(https://review.opendev.org/c/openstack/tripleo-heat-templates/+/801761)
NOTE: in later releases, the environment for enabling Ironic in the overcloud,
'environments/services/ironic-overcloud.yaml', overrides this to 'true'.
> 2. Although there are some security concerns specified in the document,
> but Currently I am focusing on the default flat bare metal approach
> which has dedicated interface for bare metal Provisioning. There is
> one composable method approach as well. Keeping aside the security
> concerns, which approach is better and functional?
> 1. https://access.redhat.com/documentation/en-us/red_hat_openstack_platform/16.2/html/bare_metal_provisioning/prerequisites-for-bare-metal-provisioning
Both should work. Using the composable network is more secure, since
baremetal nodes do not have access to the control plane network.
> 3. Will moving to upper openstack release version make this deployment
> possible?
> 1. If Yes, which release should I go with as till wallaby the
> ironic-overcloud.yml file has no option of including
> "*DhcpAgentNotification: true*" by default
> 1. https://github.com/openstack/tripleo-heat-templates/blob/stable/wallaby/environments/services/ironic-overcloud.yaml
>
>
> Looking forward for your valuable feedback/response.
>
> Regards
> Anirudh Gupta
>
>
> On Fri, Feb 4, 2022 at 8:54 PM Anirudh Gupta <anyrude10@gmail.com
> <mailto:anyrude10@gmail.com>> wrote:
>
> Hi,
>
> Surely I'll revert the status once it gets deployed.
> By the way, is the suspicion due to the Train release, or is it something else?
>
> Regards
> Anirudh Gupta
>
> On Fri, 4 Feb, 2022, 20:29 Julia Kreger,
> <juliaashleykreger@gmail.com <mailto:juliaashleykreger@gmail.com>>
> wrote:
>
>
>
> On Fri, Feb 4, 2022 at 5:50 AM Anirudh Gupta
> <anyrude10@gmail.com <mailto:anyrude10@gmail.com>> wrote:
>
> Hi Julia
>
> Thanks for your response.
>
> Earlier I was passing both ironic.yaml and
> ironic-overcloud.yaml located at path
> /usr/share/openstack-tripleo-heat-templates/environments/services/
>
> My current understanding now says that since I am using OVN,
> not OVS so I should pass only ironic-overcloud.yaml in my
> deployment.
>
> I am currently on Train Release and my default
> ironic-overcloud.yaml file has no such entry
> DhcpAgentNotification: true
>
>
> I suspect that should work. Let us know if it does.
>
> I would add this there and re deploy the setup.
>
> Would that be enough to make my deployment successful?
>
> Regards
> Anirudh Gupta
>
>
> On Fri, 4 Feb, 2022, 18:40 Julia Kreger,
> <juliaashleykreger@gmail.com
> <mailto:juliaashleykreger@gmail.com>> wrote:
>
> It is not a matter of disabling OVN, but a matter of
> enabling the dnsmasq service and notifications.
>
> https://github.com/openstack/tripleo-heat-templates/blob/master/environments/services/ironic-overcloud.yaml
> may provide some insight.
>
> I suspect if you're using stable/wallaby based branches
> and it is not working, there may need to be a patch
> backported by the TripleO maintainers.
>
> On Thu, Feb 3, 2022 at 8:02 PM Anirudh Gupta
> <anyrude10@gmail.com <mailto:anyrude10@gmail.com>> wrote:
>
> Hi Julia,
>
> Thanks for your response.
> For the overcloud deployment, I am executing the
> following command:
>
> openstack overcloud deploy --templates \
>   -n /home/stack/templates/network_data.yaml \
>   -r /home/stack/templates/roles_data.yaml \
>   -e /home/stack/templates/node-info.yaml \
>   -e /home/stack/templates/environment.yaml \
>   -e /home/stack/templates/environments/network-isolation.yaml \
>   -e /home/stack/templates/environments/network-environment.yaml \
>   -e /usr/share/openstack-tripleo-heat-templates/environments/services/ironic.yaml \
>   -e /usr/share/openstack-tripleo-heat-templates/environments/services/ironic-conductor.yaml \
>   -e /usr/share/openstack-tripleo-heat-templates/environments/services/ironic-inspector.yaml \
>   -e /usr/share/openstack-tripleo-heat-templates/environments/services/ironic-overcloud.yaml \
>   -e /home/stack/templates/ironic-config.yaml \
>   -e /usr/share/openstack-tripleo-heat-templates/environments/docker-ha.yaml \
>   -e /usr/share/openstack-tripleo-heat-templates/environments/podman.yaml \
>   -e /home/stack/containers-prepare-parameter.yaml
>
> I can see some OVN related stuff in my roles_data
> and environments/network-isolation.yaml
>
> [stack@undercloud ~]$ grep -inr "ovn"
> roles_data.yaml:34: OVNCMSOptions: "enable-chassis-as-gw"
> roles_data.yaml:168: - OS::TripleO::Services::OVNDBs
> roles_data.yaml:169: - OS::TripleO::Services::OVNController
> roles_data.yaml:279: - OS::TripleO::Services::OVNController
> roles_data.yaml:280: - OS::TripleO::Services::OVNMetadataAgent
> environments/network-isolation.yaml:16: OS::TripleO::Network::Ports::OVNDBsVipPort: ../network/ports/vip.yaml
>
> What is your recommendation and how to disable
> OVN....should I remove it from roles_data.yaml and
> then render so that it doesn't get generated
> in environments/network-isolation.yaml
> Please suggest some pointers.
>
> Regards
> Anirudh Gupta
>
>
>
>
> It seems OVN is getting installed in ironic
>
>
> On Fri, Feb 4, 2022 at 1:36 AM Julia Kreger
> <juliaashleykreger@gmail.com
> <mailto:juliaashleykreger@gmail.com>> wrote:
>
> My guess: You're running OVN. You need
> neutron-dhcp-agent running as well. OVN disables
> it by default and OVN's integrated DHCP service
> does not support options for network booting.
>
> -Julia
>
> On Thu, Feb 3, 2022 at 9:06 AM Anirudh Gupta
> <anyrude10@gmail.com
> <mailto:anyrude10@gmail.com>> wrote:
>
> Hi Team
>
> I am trying to Provision Bare Metal Node
> from my tripleo Overcloud.
> For this, while deploying the overcloud, I
> have followed the *"default flat"* network
> approach specified in the below link
> https://access.redhat.com/documentation/en-us/red_hat_openstack_platform/15/html/bare_metal_provisioning/sect-planning
>
> Just to highlight the changes, I have
> defined the
>
> *ironic-config.yaml*
>
> parameter_defaults:
>   ...
>   ...
>   IronicIPXEEnabled: true
>   IronicInspectorSubnets:
>     - ip_range: *172.23.3.100,172.23.3.150*
>   IronicInspectorInterface: 'br-baremetal'
>
> Also modified the file
> *~/templates/network-environment.yaml*
>
> parameter_defaults:
>   NeutronBridgeMappings: datacentre:br-ex,baremetal:br-baremetal
>   NeutronFlatNetworks: datacentre,baremetal
>
> With this I have Followed all the steps of
> creating br-baremetal bridge on controller,
> given in the link below:
>
> https://access.redhat.com/documentation/en-us/red_hat_openstack_platform/15/html/bare_metal_provisioning/sect-deploy
>
> - type: ovs_bridge
>   name: br-baremetal
>   use_dhcp: false
>   members:
>     - type: interface
>       name: nic3
>
> Post deployment, I have also created a flat
> network on the "datacentre" physical network and a
> subnet having the range
> *172.23.3.200,172.23.3.240* (as suggested, the
> subnet is the same as the inspector's and the range is
> different) and the router
>
> Also created a baremetal node and ran
> *"openstack baremetal node manage bm1"*, the
> state of which was a success.
>
> Observation:
>
> On executing "openstack baremetal node
> *provide* bm1", the machine gets power on
> and ideally it should take an IP from ironic
> inspector range and PXE Boot.
> But nothing of this sort happens and we see
> an IP from neutron range "*172.23.3.239*"
> (attached the screenshot)
>
> image.png
>
> I have checked overcloud ironic inspector
> podman logs along with the tcpdump.
> In tcpdump, I can only see dhcp discover
> request on br-baremetal and nothing happens
> after that.
>
> I have tried to explain my issue in detail,
> but I would be happy to share more details
> in case still required.
> Can someone please help in resolving my issue.
>
> Regards
> Anirudh Gupta
>