[Tripleo] Issue in Baremetal Provisioning from Overcloud

Julia Kreger juliaashleykreger at gmail.com
Fri Feb 11 19:15:02 UTC 2022


On Fri, Feb 11, 2022 at 6:32 AM Lokendra Rathour <lokendrarathour at gmail.com>
wrote:

> Hi Harald/ Openstack Team,
> Thank you again for your support.
>
> We have successfully provisioned the baremetal node as per the inputs
> you shared. The only change we made was to add an entry for the
> ServiceNetMap.
>
> Further, we were trying to launch an instance on the baremetal node, in
> which we are facing the issue mentioned below:
>
>
> [trim'ed picture because of message size]

>
> "2022-02-11 18:13:45.840 7 ERROR nova.compute.manager
> [req-aafdea4d-815f-4504-b7d7-4fd95d1e083e - - - - -] Error updating
> resources for node 9560bc2d-5f94-4ba0-9711-340cb8ad7d8a.:
> nova.exception.NoResourceClass: Resource class not found for Ironic node
> 9560bc2d-5f94-4ba0-9711-340cb8ad7d8a.
> 2022-02-11 18:13:45.840 7 ERROR nova.compute.manager Traceback (most
> recent call last):
> 2022-02-11 18:13:45.840 7 ERROR nova.compute.manager   File
> "/usr/lib/python3.6/site-packages/nova/compute/manager.py", line 8894,
> in _update_available_resource_for_node"
>

So this exception can only be raised if the resource_class field is not
populated for the node. It is a required field for nova/ironic
integration. Also, interestingly enough, the UUID in the error doesn't
match the baremetal node below. I don't know if that is intentional?
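As background on why the field matters: nova derives a placement custom resource class from the node's resource_class string. The sketch below mirrors the normalization done by the os-resource-classes library that nova uses (the function name here is mine, for illustration):

```python
import re

def normalize_baremetal_resource_class(name: str) -> str:
    """Mirror how nova/placement turn an Ironic node's resource_class
    into a custom placement resource class: uppercase the string,
    replace any character outside [A-Z0-9_] with '_', and prefix
    with CUSTOM_."""
    return 'CUSTOM_' + re.sub(r'[^A-Z0-9_]', '_', name.upper())

# The node's resource_class must line up with the flavor's extra spec:
print(normalize_baremetal_resource_class('baremetal-resource-class'))
# CUSTOM_BAREMETAL_RESOURCE_CLASS
```

If `openstack baremetal node show <node> -f value -c resource_class` comes back empty, the field can be set with `openstack baremetal node set --resource-class baremetal-resource-class <node>`; the flavor's `resources:CUSTOM_BAREMETAL_RESOURCE_CLASS='1'` extra spec then matches it.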


> For your reference, please refer to the following details:
> (overcloud) [stack at undercloud v4]$ openstack baremetal node show
> baremetal-node --fit-width
>
> +------------------------+--------------------------------------------------+
> | Field                  | Value                                            |
> +------------------------+--------------------------------------------------+
> | allocation_uuid        | None |
> | automated_clean        | None |
> | bios_interface         | no-bios |
> | boot_interface         | ipxe |
> | chassis_uuid           | None |
> | clean_step             | {} |
> | conductor              | overcloud-controller-0.localdomain |
> | conductor_group        | |
> | console_enabled        | False |
> | console_interface      | ipmitool-socat |
> | created_at             | 2022-02-11T13:02:40+00:00 |
> | deploy_interface       | iscsi |
> | deploy_step            | {} |
> | description            | None |
> | driver                 | ipmi |
> | driver_info            | {'ipmi_port': 623, 'ipmi_username': 'hsc', 'ipmi_password': '******', 'ipmi_address': '10.0.1.183', 'deploy_kernel': 'bc62f3dc-d091-4dbd-b730-cf7b6cb48625', 'deploy_ramdisk': 'd58bcc08-cb7c-4f21-8158-0a5ed4198108'} |
> | driver_internal_info   | {'agent_erase_devices_iterations': 1, 'agent_erase_devices_zeroize': True, 'agent_continue_if_ata_erase_failed': False, 'agent_enable_ata_secure_erase': True, 'disk_erasure_concurrency': 1, 'last_power_state_change': '2022-02-11T13:14:29.581361', 'agent_version': '5.0.5.dev25', 'agent_last_heartbeat': '2022-02-11T13:14:24.151928', 'hardware_manager_version': {'generic_hardware_manager': '1.1'}, 'agent_cached_clean_steps': {'deploy': [{'step': 'erase_devices', 'priority': 10, 'interface': 'deploy', 'reboot_requested': False, 'abortable': True}, {'step': 'erase_devices_metadata', 'priority': 99, 'interface': 'deploy', 'reboot_requested': False, 'abortable': True}], 'raid': [{'step': 'delete_configuration', 'priority': 0, 'interface': 'raid', 'reboot_requested': False, 'abortable': True}, {'step': 'create_configuration', 'priority': 0, 'interface': 'raid', 'reboot_requested': False, 'abortable': True}]}, 'agent_cached_clean_steps_refreshed': '2022-02-11 13:14:22.580729', 'clean_steps': None} |
> | extra                  | {} |
> | fault                  | None |
> | inspect_interface      | inspector |
> | inspection_finished_at | None |
> | inspection_started_at  | None |
> | instance_info          | {} |
> | instance_uuid          | None |
> | last_error             | None |
> | maintenance            | False |
> | maintenance_reason     | None |
> | management_interface   | ipmitool |
> | name                   | baremetal-node |
> | network_interface      | flat |
> | owner                  | None |
> | power_interface        | ipmitool |
> | power_state            | power off |
> | properties             | {'cpus': 20, 'memory_mb': 63700, 'local_gb': 470, 'cpu_arch': 'x86_64', 'capabilities': 'boot_option:local,boot_mode:uefi', 'vendor': 'hewlett-packard'} |
> | protected              | False |
> | protected_reason       | None |
> | provision_state        | available |
> | provision_updated_at   | 2022-02-11T13:14:51+00:00 |
> | raid_config            | {} |
> | raid_interface         | no-raid |
> | rescue_interface       | agent |
> | reservation            | None |
> | resource_class         | baremetal-resource-class |
> | storage_interface      | noop |
> | target_power_state     | None |
> | target_provision_state | None |
> | target_raid_config     | {} |
> | traits                 | [] |
> | updated_at             | 2022-02-11T13:14:52+00:00 |
> | uuid                   | e64ad28c-43d6-4b9f-aa34-f8bc58e9e8fe |
> | vendor_interface       | ipmitool |
> +------------------------+--------------------------------------------------+
> (overcloud) [stack at undercloud v4]$
>
>
>
>
> (overcloud) [stack at undercloud v4]$ openstack flavor show
> my-baremetal-flavor --fit-width
>
> +----------------------------+----------------------------------------------+
> | Field                      | Value                                        |
> +----------------------------+----------------------------------------------+
> | OS-FLV-DISABLED:disabled   | False |
> | OS-FLV-EXT-DATA:ephemeral  | 0 |
> | access_project_ids         | None |
> | description                | None |
> | disk                       | 470 |
> | extra_specs                | {'resources:CUSTOM_BAREMETAL_RESOURCE_CLASS': '1', 'resources:VCPU': '0', 'resources:MEMORY_MB': '0', 'resources:DISK_GB': '0', 'capabilities:boot_option': 'local,boot_mode:uefi'} |
> | id                         | 66a13404-4c47-4b67-b954-e3df42ae8103 |
> | name                       | my-baremetal-flavor |
> | os-flavor-access:is_public | True |
> | properties                 | capabilities:boot_option='local,boot_mode:uefi', resources:CUSTOM_BAREMETAL_RESOURCE_CLASS='1', resources:DISK_GB='0', resources:MEMORY_MB='0', resources:VCPU='0' |
> | ram                        | 63700 |
> | rxtx_factor                | 1.0 |
> | swap                       | 0 |
> | vcpus                      | 20 |
> +----------------------------+----------------------------------------------+
>

However you've set the capabilities field, it is actually unable to be
parsed. Then again, it doesn't *have* to be defined to match the
baremetal node. The setting can still apply on the baremetal node if it
is the operational default for the machine, as defined on the machine
itself.

I suspect, based upon whatever the precise nova settings are, this would
result in an inability to schedule onto the node, because nova would
parse it incorrectly, possibly looking for a key value of
"capabilities:boot_option" instead of "capabilities".
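To illustrate the parsing problem (an illustrative sketch, not nova's actual code): a well-formed capabilities string is a comma-separated list of key:value pairs, and a flavor should request each capability as its own `capabilities:<name>='<value>'` extra spec. Packing the whole string under one `capabilities:boot_option` key means the scheduler compares `boot_option` against the entire string:

```python
def parse_capabilities(caps: str) -> dict:
    """Parse an Ironic-style capabilities string such as
    'boot_option:local,boot_mode:uefi' into a dict (illustrative)."""
    result = {}
    for pair in caps.split(','):
        if ':' not in pair:
            continue
        key, _, value = pair.partition(':')
        result[key] = value
    return result

# The node's capabilities string from the output above parses cleanly:
node_caps = parse_capabilities('boot_option:local,boot_mode:uefi')

# Well-formed flavor: one extra spec per capability.
good_flavor = {'capabilities:boot_option': 'local',
               'capabilities:boot_mode': 'uefi'}
# The malformed flavor from the output above: everything under one key.
bad_flavor = {'capabilities:boot_option': 'local,boot_mode:uefi'}

def flavor_matches(flavor: dict, caps: dict) -> bool:
    """Toy matcher: every capabilities:<name> spec must equal the
    node's value for <name>."""
    return all(caps.get(k.partition(':')[2]) == v
               for k, v in flavor.items() if k.startswith('capabilities:'))

print(flavor_matches(good_flavor, node_caps))  # True
print(flavor_matches(bad_flavor, node_caps))   # False
```

The likely fix, using the standard CLI syntax, would be to set each capability as a separate flavor property, e.g. `openstack flavor set my-baremetal-flavor --property capabilities:boot_option='local' --property capabilities:boot_mode='uefi'`.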

> (overcloud) [stack at undercloud v4]$
>
> Can you please check and suggest if something is missing.
>
> Thanks once again for your support.
>
> -Lokendra
>
>
>
>
> On Thu, Feb 10, 2022 at 10:09 PM Lokendra Rathour <
> lokendrarathour at gmail.com> wrote:
>
>> Hi Harald,
>> Thanks for the response, please find my response inline:
>>
>>
>> On Thu, Feb 10, 2022 at 8:24 PM Harald Jensas <hjensas at redhat.com> wrote:
>>
>>> On 2/10/22 14:49, Lokendra Rathour wrote:
>>> > Hi Harald,
>>> > Thanks once again for your support, we tried activating the parameters:
>>> > ServiceNetMap:
>>> >      IronicApiNetwork: provisioning
>>> >      IronicNetwork: provisioning
>>> > at environments/network-environments.yaml
>>> > [inline image removed]
>>> > After changing these values the updated or even the fresh deployments
>>> > are failing.
>>> >
>>>
>>> How did deployment fail?
>>>
>>
>> [Loke]: it failed immediately after the IP for the ctlplane network was
>> assigned, ssh was established, and stack creation completed; I think at
>> the start of the ansible execution.
>>
>> Error:
>> "enabling ssh admin - COMPLETE.
>> Host 10.0.1.94 not found in /home/stack/.ssh/known_hosts"
>> Although this message is seen even when the deployment is successful, so
>> I do not think this is the culprit.
>>
>>
>>
>>
>>> > The command that we are using to deploy the OpenStack overcloud:
>>> > openstack overcloud deploy --templates \
>>> >      -n /home/stack/templates/network_data.yaml \
>>> >      -r /home/stack/templates/roles_data.yaml \
>>> >      -e /home/stack/templates/node-info.yaml \
>>> >      -e /home/stack/templates/environment.yaml \
>>> >      -e /home/stack/templates/environments/network-isolation.yaml \
>>> >      -e /home/stack/templates/environments/network-environment.yaml \
>>>
>>> What modifications did you do to network-isolation.yaml and
>>
>> [Loke]:
>> *Network-isolation.yaml:*
>>
>> # Enable the creation of Neutron networks for isolated Overcloud
>> # traffic and configure each role to assign ports (related
>> # to that role) on these networks.
>> resource_registry:
>>   # networks as defined in network_data.yaml
>>   OS::TripleO::Network::J3Mgmt: ../network/j3mgmt.yaml
>>   OS::TripleO::Network::Tenant: ../network/tenant.yaml
>>   OS::TripleO::Network::InternalApi: ../network/internal_api.yaml
>>   OS::TripleO::Network::External: ../network/external.yaml
>>
>>
>>   # Port assignments for the VIPs
>>   OS::TripleO::Network::Ports::J3MgmtVipPort: ../network/ports/j3mgmt.yaml
>>
>>
>>   OS::TripleO::Network::Ports::InternalApiVipPort:
>> ../network/ports/internal_api.yaml
>>   OS::TripleO::Network::Ports::ExternalVipPort:
>> ../network/ports/external.yaml
>>
>>
>>   OS::TripleO::Network::Ports::RedisVipPort: ../network/ports/vip.yaml
>>   OS::TripleO::Network::Ports::OVNDBsVipPort: ../network/ports/vip.yaml
>>
>>
>>
>>   # Port assignments by role, edit role definition to assign networks to
>> roles.
>>   # Port assignments for the Controller
>>   OS::TripleO::Controller::Ports::J3MgmtPort: ../network/ports/j3mgmt.yaml
>>   OS::TripleO::Controller::Ports::TenantPort: ../network/ports/tenant.yaml
>>   OS::TripleO::Controller::Ports::InternalApiPort:
>> ../network/ports/internal_api.yaml
>>   OS::TripleO::Controller::Ports::ExternalPort:
>> ../network/ports/external.yaml
>>   # Port assignments for the Compute
>>   OS::TripleO::Compute::Ports::J3MgmtPort: ../network/ports/j3mgmt.yaml
>>   OS::TripleO::Compute::Ports::TenantPort: ../network/ports/tenant.yaml
>>   OS::TripleO::Compute::Ports::InternalApiPort:
>> ../network/ports/internal_api.yaml
>>
>> ~
>>
>>
>>>
>>> network-environment.yaml?
>>>
>>
>> resource_registry:
>>   # Network Interface templates to use (these files must exist). You can
>>   # override these by including one of the net-*.yaml environment files,
>>   # such as net-bond-with-vlans.yaml, or modifying the list here.
>>   # Port assignments for the Controller
>>   OS::TripleO::Controller::Net::SoftwareConfig:
>>     ../network/config/bond-with-vlans/controller.yaml
>>   # Port assignments for the Compute
>>   OS::TripleO::Compute::Net::SoftwareConfig:
>>     ../network/config/bond-with-vlans/compute.yaml
>> parameter_defaults:
>>
>>   J3MgmtNetCidr: '80.0.1.0/24'
>>   J3MgmtAllocationPools: [{'start': '80.0.1.4', 'end': '80.0.1.250'}]
>>   J3MgmtNetworkVlanID: 400
>>
>>   TenantNetCidr: '172.16.0.0/24'
>>   TenantAllocationPools: [{'start': '172.16.0.4', 'end': '172.16.0.250'}]
>>   TenantNetworkVlanID: 416
>>   TenantNetPhysnetMtu: 1500
>>
>>   InternalApiNetCidr: '172.16.2.0/24'
>>   InternalApiAllocationPools: [{'start': '172.16.2.4', 'end':
>> '172.16.2.250'}]
>>   InternalApiNetworkVlanID: 418
>>
>>   ExternalNetCidr: '10.0.1.0/24'
>>   ExternalAllocationPools: [{'start': '10.0.1.85', 'end': '10.0.1.98'}]
>>   ExternalNetworkVlanID: 408
>>
>>   DnsServers: []
>>   NeutronNetworkType: 'geneve,vlan'
>>   NeutronNetworkVLANRanges: 'datacentre:1:1000'
>>   BondInterfaceOvsOptions: "bond_mode=active-backup"
>>
>>
>>>
>>> I typically use:
>>> -e
>>>
>>> /usr/share/openstack-tripleo-heat-templates/environments/network-isolation.yaml
>>> -e
>>>
>>> /usr/share/openstack-tripleo-heat-templates/environments/network-environment.yaml
>>> -e /home/stack/templates/environments/network-overrides.yaml
>>>
>>> The network-isolation.yaml and network-environment.yaml are Jinja2
>>> rendered based on the -n input, so to keep in sync with changes in the
>>> `-n` file, reference the files in
>>> /usr/share/openstack-tripleo-heat-templates. Then add overrides in
>>> network-overrides.yaml as needed.
>>>
>>
>> [Loke]: we are using it just like this. I do not know what you pass in
>> network-overrides.yaml, but I pass the other files per the commands below:
>>
>> [stack at undercloud templates]$ cat environment.yaml
>> parameter_defaults:
>>   ControllerCount: 3
>>   TimeZone: 'Asia/Kolkata'
>>   NtpServer: ['30.30.30.3']
>>   NeutronBridgeMappings: datacentre:br-ex,baremetal:br-baremetal
>>   NeutronFlatNetworks: datacentre,baremetal
>> [stack at undercloud templates]$ cat ironic-config.yaml
>> parameter_defaults:
>>     IronicEnabledHardwareTypes:
>>         - ipmi
>>         - redfish
>>     IronicEnabledPowerInterfaces:
>>         - ipmitool
>>         - redfish
>>     IronicEnabledManagementInterfaces:
>>         - ipmitool
>>         - redfish
>>     IronicCleaningDiskErase: metadata
>>     IronicIPXEEnabled: true
>>     IronicInspectorSubnets:
>>     - ip_range: 172.23.3.100,172.23.3.150
>>     IPAImageURLs: '["http://30.30.30.1:8088/agent.kernel", "
>> http://30.30.30.1:8088/agent.ramdisk"]'
>>     IronicInspectorInterface: 'br-baremetal'
>> [stack at undercloud templates]$
>> [stack at undercloud templates]$ cat node-info.yaml
>> parameter_defaults:
>>   OvercloudControllerFlavor: control
>>   OvercloudComputeFlavor: compute
>>   ControllerCount: 3
>>   ComputeCount: 1
>> [stack at undercloud templates]$
>>
>>
>>
>>>
>>> >      -e
>>> >
>>> /usr/share/openstack-tripleo-heat-templates/environments/services/ironic-conductor.yaml
>>>
>>> > \
>>> >      -e
>>> >
>>> /usr/share/openstack-tripleo-heat-templates/environments/services/ironic-inspector.yaml
>>>
>>> > \
>>> >      -e
>>> >
>>> /usr/share/openstack-tripleo-heat-templates/environments/services/ironic-overcloud.yaml
>>>
>>> > \
>>> >      -e /home/stack/templates/ironic-config.yaml \
>>> >      -e
>>> >
>>> /usr/share/openstack-tripleo-heat-templates/environments/docker-ha.yaml \
>>> >      -e
>>> > /usr/share/openstack-tripleo-heat-templates/environments/podman.yaml \
>>> >      -e /home/stack/containers-prepare-parameter.yaml
>>> >
>>> > **/home/stack/templates/ironic-config.yaml :
>>> > (overcloud) [stack at undercloud ~]$ cat
>>> > /home/stack/templates/ironic-config.yaml
>>> > parameter_defaults:
>>> >      IronicEnabledHardwareTypes:
>>> >          - ipmi
>>> >          - redfish
>>> >      IronicEnabledPowerInterfaces:
>>> >          - ipmitool
>>> >          - redfish
>>> >      IronicEnabledManagementInterfaces:
>>> >          - ipmitool
>>> >          - redfish
>>> >      IronicCleaningDiskErase: metadata
>>> >      IronicIPXEEnabled: true
>>> >      IronicInspectorSubnets:
>>> >      - ip_range: 172.23.3.100,172.23.3.150
>>> >      IPAImageURLs: '["http://30.30.30.1:8088/agent.kernel",
>>> > "http://30.30.30.1:8088/agent.ramdisk"]'
>>> >      IronicInspectorInterface: 'br-baremetal'
>>> >
>>> > Also the baremetal network(provisioning)(172.23.3.x)  is  routed with
>>> > ctlplane/admin network (30.30.30.x)
>>> >
>>>
>>> Unless the network you created in the overcloud is named `provisioning`,
>>> these parameters may be relevant.
>>>
>>> IronicCleaningNetwork:
>>>    default: 'provisioning'
>>>    description: Name or UUID of the *overcloud* network used for cleaning
>>>                 bare metal nodes. The default value of "provisioning"
>>> can be
>>>                 left during the initial deployment (when no networks are
>>>                 created yet) and should be changed to an actual UUID in
>>>                 a post-deployment stack update.
>>>    type: string
>>>
>>> IronicProvisioningNetwork:
>>>    default: 'provisioning'
>>>    description: Name or UUID of the *overcloud* network used for
>>> provisioning
>>>                 of bare metal nodes, if IronicDefaultNetworkInterface is
>>>                 set to "neutron". The default value of "provisioning"
>>> can be
>>>                 left during the initial deployment (when no networks are
>>>                 created yet) and should be changed to an actual UUID in
>>>                 a post-deployment stack update.
>>>    type: string
>>>
>>> IronicRescuingNetwork:
>>>    default: 'provisioning'
>>>    description: Name or UUID of the *overcloud* network used for rescuing
>>>                 of bare metal nodes, if IronicDefaultRescueInterface is
>>> not
>>>                 set to "no-rescue". The default value of "provisioning"
>>> can be
>>>                 left during the initial deployment (when no networks are
>>>                 created yet) and should be changed to an actual UUID in
>>>                 a post-deployment stack update.
>>>    type: string
>>>
>>> > *Query:*
>>> >
>>> >  1. any other location/way where we should add these so that they are
>>> >     included without error.
>>> >
>>> >         *ServiceNetMap:*
>>> >
>>> >         *    IronicApiNetwork: provisioning*
>>> >
>>> >         *    IronicNetwork: provisioning*
>>> >
>>>
>>> `provisioning` network is defined in -n
>>> /home/stack/templates/network_data.yaml right?
>>
>> [Loke]: No, it does not have any entry for provisioning in this file;
>> it has network entries for J3Mgmt, Tenant, InternalApi, and External.
>> These networks are added as VLAN-based under the br-ext bridge.
>> The provisioning network I am creating after the overcloud is deployed
>> and before the baremetal node is provisioned.
>> In the provisioning network, we are giving the range of the ironic
>> network (172.23.3.x).
>>
>>
>>
>>
>>> And an entry in
>>> 'networks' for the controller role in
>>> /home/stack/templates/roles_data.yaml?
>>>
>> [Loke]: we did not add a similar entry in roles_data.yaml either.
>>
>> Just to add: with these two files we have rendered the remaining
>> templates.
>>
>>
>>
>>
>>>
>>>
>>> >       2. Also, are these commands (mentioned above) to configure
>>> >     Baremetal services fine?
>>> >
>>>
>>> Yes, what you are doing makes sense.
>>>
>>> I'm actually not sure why it didn't work with your previous
>>> configuration; it got the information about the NBP file and obviously
>>> attempted to download it from 30.30.30.220. With routing in place, that
>>> should work.
>>>
>>> Changing the ServiceNetMap to move the IronicNetwork services to the
>>> 172.23.3 network would avoid the routing.
>>>
>> [Loke]: we can try this, but somehow we have not been able to do so.
>>
>>>
>>>
>>> What is NeutronBridgeMappings?
>>>   br-baremetal maps to the physical network of the overcloud
>>> `provisioning` neutron network?
>>>
>>
>>
>> [Loke]: yes, we create br-baremetal and then we create the provisioning
>> network, mapping it to br-baremetal.
>>
>> Also attaching the complete rendered template folder along with the custom
>> yaml files that I am using; referring to them may give you a clearer
>> picture of our problem. Any clue would help.
>> Our problem: we are not able to provision the baremetal node after the
>> overcloud is deployed.
>> If there is a straightforward document we can use to test baremetal
>> provisioning, please provide it.
>>
>> Thanks once again for reading all these.
>>
>>
>>
>>
>>> --
>>> Harald
>>>
>>>
>>
>> -
>> skype: lokendrarathour
>>
>>
>>
>
> --
>
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.openstack.org/pipermail/openstack-discuss/attachments/20220211/91404a84/attachment-0001.htm>

