[Tripleo] Issue in Baremetal Provisioning from Overcloud

Lokendra Rathour lokendrarathour at gmail.com
Wed Feb 9 14:58:05 UTC 2022


Hi Harald,
I am responding to your email on behalf of Anirudh.
Thanks for the response. We now understand that we are getting the IP
address from the expected DHCP server.

We tried the scenario and here are our findings. Our endpoints are on the
following subnets:
admin and internal: 30.30.30.x
public: 10.0.1.x

(overcloud) [stack@undercloud ~]$ openstack endpoint list | grep ironic
| 04c163251e5546769446a4fa4fa20484 | regionOne | ironic           | baremetal               | True    | admin     | http://30.30.30.213:6385 |
| 5c8557ae639a4898bdc6121f6e873724 | regionOne | ironic           | baremetal               | True    | internal  | http://30.30.30.213:6385 |
| 62e07a3b2f3f4158bb27d8603a8f5138 | regionOne | ironic-inspector | baremetal-introspection | True    | public    | http://10.0.1.88:5050    |
| af29bd64513546409f44cc5d56ea1082 | regionOne | ironic-inspector | baremetal-introspection | True    | internal  | http://30.30.30.213:5050 |
| b76cdb5e77c54fc6b10cbfeada0e8bf5 | regionOne | ironic-inspector | baremetal-introspection | True    | admin     | http://30.30.30.213:5050 |
| bd2954f41e49419f85669990eb59f51a | regionOne | ironic           | baremetal               | True    | public    | http://10.0.1.88:6385    |
(overcloud) [stack@undercloud ~]$


We are following the default flat network approach for Ironic provisioning,
for which we created a flat network on the baremetal physnet. The node still
gets an IP address (172.23.3.240) from the neutron range
(172.23.3.220 - 172.23.3.240).
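
To make the two address ranges easier to tell apart, here is a minimal Python sketch (standard library only) that reports which range an address falls into. The inspector range is the IronicInspectorSubnets value quoted further down in this thread and the neutron range is the one mentioned above. The function names and the labels are made up for illustration, so adjust the values to your own deployment.

```python
import ipaddress

# Ranges taken from this thread; they are assumptions about one deployment.
INSPECTOR_RANGE = ("172.23.3.100", "172.23.3.150")  # IronicInspectorSubnets ip_range
NEUTRON_POOL = ("172.23.3.220", "172.23.3.240")     # pool of the flat provisioning subnet


def in_range(addr, bounds):
    """Return True if addr lies between the two bounds (inclusive)."""
    ip = ipaddress.ip_address(addr)
    low, high = (ipaddress.ip_address(b) for b in bounds)
    return low <= ip <= high


def lease_source(addr):
    """Guess which DHCP range an address was leased from."""
    if in_range(addr, INSPECTOR_RANGE):
        return "inspector range (introspection)"
    if in_range(addr, NEUTRON_POOL):
        return "neutron range (cleaning/provisioning)"
    return "unknown"


print(lease_source("172.23.3.240"))
```

An address in the neutron range is what Harald describes as expected while a node is being cleaned or provisioned.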

Further, we found that once the IP (172.23.3.240) is allocated to the
baremetal node, the node contacts 30.30.30.220 (the IP of one of the three
controllers) for PXE booting.
Checking that controller, we found that the

*`/var/lib/ironic/tftpboot/pxelinux.cfg/` directory exists,* but it contains
*no file matching the MAC address* of the baremetal node.
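
As a rough aid for that check, the sketch below lists the file names one might look for, given a MAC address. The `01-` prefix followed by the lowercase, dash-separated MAC is the PXELINUX naming convention. The second path is an assumption: with the iPXE boot interface the per-node config may be served over HTTP from a separate root, and both `/var/lib/ironic/httpboot` and the un-prefixed file name should be verified against your deployment. The function name and the MAC address are hypothetical.

```python
import posixpath
import re

MAC_RE = re.compile(r"(?:[0-9A-Fa-f]{2}[:-]){5}[0-9A-Fa-f]{2}")


def pxe_config_candidates(mac,
                          tftp_root="/var/lib/ironic/tftpboot",
                          http_root="/var/lib/ironic/httpboot"):
    """Return per-node boot config paths worth checking for a MAC address."""
    mac = mac.strip()
    if not MAC_RE.fullmatch(mac):
        raise ValueError("not a MAC address: %r" % mac)
    dashed = mac.lower().replace(":", "-")
    return [
        # PXELINUX convention: ARP hardware type 01 (Ethernet) + dashed MAC.
        posixpath.join(tftp_root, "pxelinux.cfg", "01-" + dashed),
        # Assumption: iPXE config under the HTTP root, named by the dashed
        # MAC without the "01-" prefix.
        posixpath.join(http_root, "pxelinux.cfg", dashed),
    ]


for path in pxe_config_candidates("52:54:00:AA:BB:CC"):  # hypothetical MAC
    print(path)
```

If neither file exists on the controller running the conductor for the node, the boot configuration was probably never written for that port.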

Also, checking the *extra_dhcp_opts*, we found this:
(overcloud) [stack@undercloud ~]$ openstack port show d7e573bf-1028-437a-8118-a2074c7573b2 | grep "extra_dhcp_opts"

| extra_dhcp_opts         | ip_version='4', opt_name='tag:ipxe,67', opt_value='http://30.30.30.220:8088/boot.ipxe'
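
The option above points the node at an HTTP URL on the controller, so it may help to confirm which host and port that is and whether a TCP connection to it succeeds from the provisioning network. The following is a minimal sketch using only the Python standard library; it parses the text printed by `openstack port show`, and the helper names are hypothetical.

```python
import re
import socket
from urllib.parse import urlparse

OPT_RE = re.compile(r"opt_name='([^']*)',\s*opt_value='([^']*)'")


def boot_urls(extra_dhcp_opts):
    """Return (opt_name, url) pairs for options whose value is an HTTP(S) URL."""
    return [(name, value)
            for name, value in OPT_RE.findall(extra_dhcp_opts)
            if urlparse(value).scheme in ("http", "https")]


def endpoint(url):
    """Return the (host, port) a URL points at, using default ports if unset."""
    parts = urlparse(url)
    port = parts.port or (443 if parts.scheme == "https" else 80)
    return parts.hostname, port


def can_connect(url, timeout=3.0):
    """Return True if a TCP connection to the URL's host and port succeeds."""
    try:
        with socket.create_connection(endpoint(url), timeout=timeout):
            return True
    except OSError:
        return False
```

Running `can_connect("http://30.30.30.220:8088/boot.ipxe")` from a host on the 172.23.3.x network would show whether the timeout is a routing or firewall problem rather than a missing file. It only tests the TCP connection, not the content that is served.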


                         [image: image.png]
*A few observations:*

   1. Although the baremetal network (172.23.3.x) is routable to the admin
   network (30.30.30.x), the boot times out at the screen shown in the
   screenshot above.
   2. In tcpdump we only see read requests.
   3. `openstack baremetal node list`

      (overcloud) [stack@undercloud ~]$ openstack baremetal node list
      +--------------------------------------+------+---------------+-------------+--------------------+-------------+
      | UUID                                 | Name | Instance UUID | Power State | Provisioning State | Maintenance |
      +--------------------------------------+------+---------------+-------------+--------------------+-------------+
      | 7066fbe1-9c29-4702-9cd4-2b55daf19630 | bm1  | None          | power on    | clean wait         | False       |
      +--------------------------------------+------+---------------+-------------+--------------------+-------------+
   4. `openstack baremetal node show <node-uuid>`

      (overcloud) [stack@undercloud ~]$ openstack baremetal node show bm1
      +------------------------+-----------------------------------------------------------+
      | Field                  | Value                                                     |
      +------------------------+-----------------------------------------------------------+
      | allocation_uuid        | None                                                      |
      | automated_clean        | None                                                      |
      | bios_interface         | no-bios                                                   |
      | boot_interface         | ipxe                                                      |
      | chassis_uuid           | None                                                      |
      | clean_step             | {}                                                        |
      | conductor              | overcloud-controller-0.localdomain                        |
      | conductor_group        |                                                           |
      | console_enabled        | False                                                     |
      | console_interface      | ipmitool-socat                                            |
      | created_at             | 2022-02-09T14:21:24+00:00                                 |
      | deploy_interface       | iscsi                                                     |
      | deploy_step            | {}                                                        |
      | description            | None                                                      |
      | driver                 | ipmi                                                      |
      | driver_info            | {'ipmi_address': '10.0.1.183', 'ipmi_username': 'hsc',    |
      |                        | 'ipmi_password': '******', 'ipmi_terminal_port': 623,     |
      |                        | 'deploy_kernel': '9e1365b6-261a-42a2-abfe-40158945de57',  |
      |                        | 'deploy_ramdisk': 'fe608dd2-ce86-4faf-b4b8-cc5cb143eb56'} |
      | driver_internal_info   | {'agent_erase_devices_iterations': 1,                     |
      |                        | 'agent_erase_devices_zeroize': True,                      |
      |                        | 'agent_continue_if_ata_erase_failed': False,              |
      |                        | 'agent_enable_ata_secure_erase': True,                    |
      |                        | 'disk_erasure_concurrency': 1,                            |
      |                        | 'last_power_state_change': '2022-02-09T14:23:39.525629'}  |
      | extra                  | {}                                                        |
      | fault                  | None                                                      |
      | inspect_interface      | inspector                                                 |
      | inspection_finished_at | None                                                      |
      | inspection_started_at  | None                                                      |
      | instance_info          | {}                                                        |
      | instance_uuid          | None                                                      |
      | last_error             | None                                                      |
      | maintenance            | False                                                     |
      | maintenance_reason     | None                                                      |
      | management_interface   | ipmitool                                                  |
      | name                   | bm1                                                       |
      | network_interface      | flat                                                      |
      | owner                  | None                                                      |
      | power_interface        | ipmitool                                                  |
      | power_state            | power on                                                  |
      | properties             | {'cpus': 20, 'cpu_arch': 'x86_64',                        |
      |                        | 'capabilities': 'boot_option:local,boot_mode:uefi',       |
      |                        | 'memory_mb': 63700, 'local_gb': 470,                      |
      |                        | 'vendor': 'hewlett-packard'}                              |
      | protected              | False                                                     |
      | protected_reason       | None                                                      |
      | provision_state        | clean wait                                                |
      | provision_updated_at   | 2022-02-09T14:24:05+00:00                                 |
      | raid_config            | {}                                                        |
      | raid_interface         | no-raid                                                   |
      | rescue_interface       | agent                                                     |
      | reservation            | None                                                      |
      | resource_class         | bm1                                                       |
      | storage_interface      | noop                                                      |
      | target_power_state     | None                                                      |
      | target_provision_state | available                                                 |
      | target_raid_config     | {}                                                        |
      | traits                 | []                                                        |
      | updated_at             | 2022-02-09T14:24:05+00:00                                 |
      | uuid                   | 7066fbe1-9c29-4702-9cd4-2b55daf19630                      |
      | vendor_interface       | ipmitool                                                  |
      +------------------------+-----------------------------------------------------------+
      (overcloud) [stack@undercloud ~]$



*Queries:*

   - Which settings do we need to change so that the baremetal node
   PXE-boots and is provisioned successfully?





On Tue, Feb 8, 2022 at 6:27 PM Harald Jensas <hjensas at redhat.com> wrote:

> On 2/7/22 13:47, Anirudh Gupta wrote:
> > Hi Julia,
> >
> > Thanks a lot for your responses and support.
> > To Update on the ongoing issue, I tried deploying the overcloud with
> > your valuable suggestions i.e by passing "*DhcpAgentNotification: true*"
> > in ironic-overcloud.yaml
> > The setup came up successfully, but with this configuration the IP
> > allocated on the system is one which is being configured while creating
> > the subnet in openstack.
> >
> > image.png
> >
> > The system is still getting the IP (172.23.3.212) from neutron. The
> > subnet range was configured as *172.23.3.210-172.23.3.240 *while
> > creating the provisioning subnet.
>
>
> The node is supposed to get an IP address from the neutron subnet on the
> provisioning network when:
> a) provisioning node
> b) cleaning node.
>
> When you do "baremetal node provide" cleaning is most likely
> automatically initiated. (Since cleaning is enabled by default for
> Ironic in overcloud AFAIK.)
>
> The only time you will get an address from the IronicInspectorSubnets
> (ip_range: 172.23.3.100,172.23.3.150 in your case) is when you start
> ironic node introspection.
>
> > The system gets stuck here and no action is performed after this.
> >
>
> It seems the system is getting an address from the expected DHCP server,
> but it does not boot. I would start looking into the pxe properties in
> the DHCP Reply.
>
> What is the status of the node in ironic at this stage?
>   `openstack baremetal node list`
>   `openstack baremetal node show <node-uuid>`
>
> Check the `extra_dhcp_opts` on the neutron port, it should set the
> nextserver and bootfile parameters. Does the bootfile exist in
> /var/lib/ironic/tftpboot? Inspect the
> `/var/lib/ironic/tftpboot/pxelinux.cfg/` directory, you should see a
> file matching the MAC address of your system. Does the content make sense?
>
> Can you capture DHCP and TFTP traffic on the provisioning network?
>
> > Is there any way to resolve this and make successful  provisioning the
> > baremetal node in *TripleO Train Release* (Since RHOSP 16 was on Train,
> > so I thought to go with that version for better stability)
> >
> https://access.redhat.com/documentation/en-us/red_hat_openstack_platform/16.2/html-single/release_notes/index
> >
> > I have some queries:
> >
> >  1. Is passing "*DhcpAgentNotification: true" *enough or do we have to
> >     make some other changes as well?
>
> I believe in Train "DhcpAgentNotification" defaults to True.
> The change to default to false was added more recently, and it was not
> backported.
> (https://review.opendev.org/c/openstack/tripleo-heat-templates/+/801761)
>
> NOTE, the environment for enabling ironic for the overcloud
> 'environments/services/ironic-overcloud.yaml' overrides this to 'true'
> in later releases.
>
> >  2. Although there are some security concerns specified in the document,
> >     but Currently I am focusing on the default flat bare metal approach
> >     which has dedicated interface for bare metal Provisioning. There is
> >     one composable method approach as well. Keeping aside the security
> >     concerns, which approach is better and functional?
> >      1.
> https://access.redhat.com/documentation/en-us/red_hat_openstack_platform/16.2/html/bare_metal_provisioning/prerequisites-for-bare-metal-provisioning
>
> Both should work, using the composable network is more secure since
> baremetal nodes do not have access to the control plane network.
>
> >  3. Will moving to upper openstack release version make this deployment
> >     possible?
> >      1. If Yes, which release should I go with as till wallaby the
> >         ironic-overcloud.yml file has no option of including
> >         "*DhcpAgentNotification: true*" by default
> >          1.
> https://github.com/openstack/tripleo-heat-templates/blob/stable/wallaby/environments/services/ironic-overcloud.yaml
> >
> >
> > Looking forward for your valuable feedback/response.
> >
> > Regards
> > Anirudh Gupta
> >
> >
> > On Fri, Feb 4, 2022 at 8:54 PM Anirudh Gupta <anyrude10 at gmail.com
> > <mailto:anyrude10 at gmail.com>> wrote:
> >
> >     Hi,
> >
> >     Surely I'll revert the status once it gets deployed.
> >     Bdw the suspicion is because of Train Release or it is something
> else?
> >
> >     Regards
> >     Anirudh Gupta
> >
> >     On Fri, 4 Feb, 2022, 20:29 Julia Kreger,
> >     <juliaashleykreger at gmail.com <mailto:juliaashleykreger at gmail.com>>
> >     wrote:
> >
> >
> >
> >         On Fri, Feb 4, 2022 at 5:50 AM Anirudh Gupta
> >         <anyrude10 at gmail.com <mailto:anyrude10 at gmail.com>> wrote:
> >
> >             Hi Julia
> >
> >             Thanks for your response.
> >
> >             Earlier I was passing both ironic.yaml and
> >             ironic-overcloud.yaml located at path
> >
>  /usr/share/openstack-tripleo-heat-templates/environments/services/
> >
> >             My current understanding now says that since I am using OVN,
> >             not OVS so I should pass only ironic-overcloud.yaml in my
> >             deployment.
> >
> >             I am currently on Train Release and my default
> >             ironic-overcloud.yaml file has no such entry
> >             DhcpAgentNotification: true
> >
> >
> >         I suspect that should work. Let us know if it does.
> >
> >             I would add this there and re deploy the setup.
> >
> >             Would that be enough to make my deployment successful?
> >
> >             Regards
> >             Anirudh Gupta
> >
> >
> >             On Fri, 4 Feb, 2022, 18:40 Julia Kreger,
> >             <juliaashleykreger at gmail.com
> >             <mailto:juliaashleykreger at gmail.com>> wrote:
> >
> >                 It is not a matter of disabling OVN, but a matter of
> >                 enabling the dnsmasq service and notifications.
> >
> >
> https://github.com/openstack/tripleo-heat-templates/blob/master/environments/services/ironic-overcloud.yaml
> >                 may provide some insight.
> >
> >                 I suspect if you're using stable/wallaby based branches
> >                 and it is not working, there may need to be a patch
> >                 backported by the TripleO maintainers.
> >
> >                 On Thu, Feb 3, 2022 at 8:02 PM Anirudh Gupta
> >                 <anyrude10 at gmail.com <mailto:anyrude10 at gmail.com>>
> wrote:
> >
> >                     Hi Julia,
> >
> >                     Thanks for your response.
> >                     For the overcloud deployment, I am executing the
> >                     following command:
> >
> >                     openstack overcloud deploy --templates \
> >                          -n /home/stack/templates/network_data.yaml \
> >                          -r /home/stack/templates/roles_data.yaml \
> >                          -e /home/stack/templates/node-info.yaml \
> >                          -e /home/stack/templates/environment.yaml \
> >                          -e /home/stack/templates/environments/network-isolation.yaml \
> >                          -e /home/stack/templates/environments/network-environment.yaml \
> >                          -e /usr/share/openstack-tripleo-heat-templates/environments/services/ironic.yaml \
> >                          -e /usr/share/openstack-tripleo-heat-templates/environments/services/ironic-conductor.yaml \
> >                          -e /usr/share/openstack-tripleo-heat-templates/environments/services/ironic-inspector.yaml \
> >                          -e /usr/share/openstack-tripleo-heat-templates/environments/services/ironic-overcloud.yaml \
> >                          -e /home/stack/templates/ironic-config.yaml \
> >                          -e /usr/share/openstack-tripleo-heat-templates/environments/docker-ha.yaml \
> >                          -e /usr/share/openstack-tripleo-heat-templates/environments/podman.yaml \
> >                          -e /home/stack/containers-prepare-parameter.yaml
> >
> >                     I can see some OVN related stuff in my roles_data
> >                     and environments/network-isolation.yaml
> >
> >                     [stack at undercloud ~]$ grep -inr "ovn"
> >                     roles_data.yaml:34: *OVNCMSOptions: "enable-chassis-as-gw"*
> >                     roles_data.yaml:168:    - *OS::TripleO::Services::OVNDBs*
> >                     roles_data.yaml:169:    - *OS::TripleO::Services::OVNController*
> >                     roles_data.yaml:279:    - *OS::TripleO::Services::OVNController*
> >                     roles_data.yaml:280:    - *OS::TripleO::Services::OVNMetadataAgent*
> >                     environments/network-isolation.yaml:16: *OS::TripleO::Network::Ports::OVNDBsVipPort: ../network/ports/vip.yaml*
> >
> >                     What is your recommendation and how to disable
> >                     OVN....should I remove it from roles_data.yaml and
> >                     then render so that it doesn't get generated
> >                     in environments/network-isolation.yaml
> >                     Please suggest some pointers.
> >
> >                     Regards
> >                     Anirudh Gupta
> >
> >
> >
> >
> >                     It seems OVN is getting installed in ironic
> >
> >
> >                     On Fri, Feb 4, 2022 at 1:36 AM Julia Kreger
> >                     <juliaashleykreger at gmail.com
> >                     <mailto:juliaashleykreger at gmail.com>> wrote:
> >
> >                         My guess: You're running OVN. You need
> >                         neutron-dhcp-agent running as well. OVN disables
> >                         it by default and OVN's integrated DHCP service
> >                         does not support options for network booting.
> >
> >                         -Julia
> >
> >                         On Thu, Feb 3, 2022 at 9:06 AM Anirudh Gupta
> >                         <anyrude10 at gmail.com
> >                         <mailto:anyrude10 at gmail.com>> wrote:
> >
> >                             Hi Team
> >
> >                             I am trying to Provision Bare Metal Node
> >                             from my tripleo Overcloud.
> >                             For this, while deploying the overcloud, I
> >                             have followed the *"default flat" *network
> >                             approach specified in the below link
> >
> https://access.redhat.com/documentation/en-us/red_hat_openstack_platform/15/html/bare_metal_provisioning/sect-planning
> >
> >                             Just to highlight the changes, I have
> >                             defined the
> >
> >                             *ironic-config.yaml*
> >
> >                             parameter_defaults:
> >                                  ...
> >                                  ...
> >                                  IronicIPXEEnabled: true
> >                                  IronicInspectorSubnets:
> >                                  - ip_range: *172.23.3.100,172.23.3.150*
> >                                  IronicInspectorInterface: 'br-baremetal'
> >
> >                             Also modified the file
> >                             *~/templates/network-environment.yaml*
> >
> >                             parameter_defaults:
> >                                NeutronBridgeMappings: datacentre:br-ex,baremetal:br-baremetal
> >                                NeutronFlatNetworks: datacentre,baremetal
> >
> >                             With this I have Followed all the steps of
> >                             creating br-baremetal bridge on controller,
> >                             given in the link below:
> >
> >
> https://access.redhat.com/documentation/en-us/red_hat_openstack_platform/15/html/bare_metal_provisioning/sect-deploy
> >
> >                                - type: ovs_bridge
> >                                  name: br-baremetal
> >                                  use_dhcp: false
> >                                  members:
> >                                  - type: interface
> >                                    name: nic3
> >
> >                             Post Deployment, I have also create a flat
> >                             network on "datacentre" physical network and
> >                             subnet having the range
> >                             *172.23.3.200,172.23.3.240 *(as suggested
> >                             subnet is same as of inspector and range is
> >                             different) and the router
> >
> >                             Also created a baremetal node and ran
> >                             *"openstack baremetal node manage bm1", *the
> >                             state of which was a success.
> >
> >                             Observation:
> >
> >                             On executing "openstack baremetal node
> >                             *provide* bm1", the machine gets power on
> >                             and ideally it should take an IP from ironic
> >                             inspector range and PXE Boot.
> >                             But nothing of this sort happens and we see
> >                             an IP from neutron range "*172.23.3.239*"
> >                             (attached the screenshot)
> >
> >                             image.png
> >
> >                             I have checked overcloud ironic inspector
> >                             podman logs alongwith the tcpdump.
> >                             In tcpdump, I can only see dhcp discover
> >                             request on br-baremetal and nothing happens
> >                             after that.
> >
> >                             I have tried to explain my issue in detail,
> >                             but I would be happy to share more details
> >                             in case still required.
> >                             Can someone please help in resolving my
> issue.
> >
> >                             Regards
> >                             Anirudh Gupta
> >
>
>
>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: image.png
Type: image/png
Size: 138084 bytes
Desc: not available
URL: <http://lists.openstack.org/pipermail/openstack-discuss/attachments/20220209/88fba8e0/attachment-0001.png>

